CERN Computing Resources Administration and LCG

Notes from a second meeting between Maria Dimou, Dave Kelsey (part-time), Wim van Leersum, Ian Neilson and Nick Ziogas on 2004-05-19. This is a follow-up to the 2004-04-22 meeting and to the one with the PIE specialists on 2004-04-30.

In this meeting we discussed, with the people technically responsible for the new CCDB project (now called Computing Resources Admin, CRA), how our LCG User Registration tools could read in, or link to, the necessary personal information from the CERN HR database for new candidate LCG users. The generic name adopted for that database in subsequent discussions is ORGDB, for ORGanisational DB. We explained that the LCG Virtual Organisation Data Base (VODB) must contain personal information, i.e. Authentication (AuthN) data and Grid access Authorization (AuthZ) data, for every user.

The AuthZ data can be entered and maintained only in VODB. The AuthN data can be present only in ORGDB. Open questions at the time of the meeting:

Straight after the meeting, Nick sent by email the following proposal, which keeps the VO manager as the main data validator and minimises automation of the process of joining the VO:
- After discussion with all VOs, a common interface with a minimum number of checks is elaborated and agreed; all VOs will have to use it to enter personal data.
- At data entry, the interface sends a request to the "ORGDB"s of specific or all ORGs, and a best-effort match is done at the "ORG". Data is sent back, and the data entry clerk or manager either chooses from the list(s) or rejects them and enters the person manually.
- This mechanism is also available in batch mode. It can run regularly, but that means someone will have to go through the matched data and sort it out. It could also run once to 'clean' and validate the data.
It is not ideal, as it requires manual intervention, but then again, if people do not want to follow specific procedures they won't automatically get the benefits. If they do not want to end up with unmanageable systems 2 years from now, they must be prepared to do the manual work.
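
The flow proposed above (best-effort match at the ORG, manual choice by the clerk, optional batch mode) could be sketched roughly as follows. This is only an illustration, not an agreed design: the `Person` fields, the 0.8 threshold, the use of e-mail as a strong identifier and the string-similarity scoring are all assumptions made for the example, and the in-memory list stands in for whatever query interface an ORGDB would really expose.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass(frozen=True)
class Person:
    # Hypothetical minimal record; the real common interface is still to be agreed.
    given_name: str
    family_name: str
    email: str = ""


def _norm(text: str) -> str:
    # Lower-case and collapse whitespace so trivial entry differences do not matter.
    return " ".join(text.lower().split())


def similarity(a: Person, b: Person) -> float:
    """Score in [0, 1]; 1.0 means same e-mail or identical normalised full name."""
    # Assumption: an identical, non-empty e-mail address counts as a certain match.
    if a.email and _norm(a.email) == _norm(b.email):
        return 1.0
    name_a = _norm(f"{a.given_name} {a.family_name}")
    name_b = _norm(f"{b.given_name} {b.family_name}")
    return SequenceMatcher(None, name_a, name_b).ratio()


def best_effort_match(entered, orgdb_records, threshold=0.8, limit=5):
    """Return up to `limit` (score, record) candidates, best first.

    An empty list means no plausible match: the clerk enters the person manually.
    """
    scored = [(similarity(entered, rec), rec) for rec in orgdb_records]
    candidates = [pair for pair in scored if pair[0] >= threshold]
    candidates.sort(key=lambda pair: pair[0], reverse=True)
    return candidates[:limit]


def batch_match(entries, orgdb_records, threshold=0.8):
    """Batch mode: collect candidates for every entry, for later human review."""
    return {entry: best_effort_match(entry, orgdb_records, threshold)
            for entry in entries}


# Made-up records standing in for one ORGDB.
orgdb = [Person("Anna", "Muller", "anna.muller@example.org"),
         Person("John", "Smith", "john.smith@example.org")]
for score, record in best_effort_match(Person("Anna", "Mueller"), orgdb):
    print(f"{score:.2f}  {record.given_name} {record.family_name}")
```

Nothing is accepted automatically in this sketch: the function only proposes candidates, which keeps the VO manager as the validator, in line with the proposal.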

In response to Maria's question, "How many records per LHC experiment can be retrieved from ORGDB?", Wim sent the numbers by email.

Wim's comment to these notes sent on 2004-06-15:
I would like to stress that the choice of how to link the databases should depend on the number of overlaps. If 95% of the population already exists in HR, I would ask the experiment secretariats to enter the remaining 5% as well through PIE, thus ensuring coherency. If it's only 50%, it may not be worth forcing them to use PIE, and you could just take a copy of what we have and develop your own tools to add the other half. However, if a person is entered that way and later comes to CERN and gets registered through PIE as well, you may end up with duplicates, because matching two persons with all the possibilities of data entry mistakes is quite difficult. For that reason we do have a duplicate checker which is invoked every time a person is entered in PIE, which is based on statistics, and which is very difficult to maintain and tune.
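
As a rough illustration of the overlap figure Wim refers to, the fraction of a VO's users already present in the HR data could be estimated as below. The exact-match key (for instance a normalised e-mail address) is an assumption of this sketch; as the comment above points out, real matching has to cope with data entry mistakes, so an exact-key count will tend to underestimate the true overlap.

```python
def overlap_fraction(vo_user_keys, hr_keys):
    """Fraction of VO users whose key is already present in the HR data.

    Keys are hypothetical exact-match identifiers, e.g. normalised e-mail
    addresses. Returns 0.0 for an empty VO list.
    """
    vo_user_keys = list(vo_user_keys)
    if not vo_user_keys:
        return 0.0
    hr = set(hr_keys)
    found = sum(1 for key in vo_user_keys if key in hr)
    return found / len(vo_user_keys)


# Made-up example: two of four VO users are already known to HR.
print(overlap_fraction(["a@x.org", "b@x.org", "c@x.org", "d@x.org"],
                       ["a@x.org", "b@x.org", "z@x.org"]))
```

Where to place the cut-off between "enter the rest through PIE" and "copy and maintain separately" is not settled in these notes; the 95% and 50% figures are only the two examples given.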

Maria Dimou, IT/GD, Grid Infrastructure Services