LHC Computing Grid Project

Project Execution Board

Notes of the meeting of Tuesday November 11, 2003 - Draft

 

Present:

Dario Barberis, Ian Bird, Federico Carminati, Philippe Charpentier, Dave Foster, Frédéric Hemmer, Jürgen Knobloch, Bernd Panzer, Les Robertson, David Stickland, Torre Wenaus

 

 

Les started the meeting by announcing that this was an informal meeting because the approval process for a modified LCG management structure is still ongoing. The choice between the options for the role of the SC2 is still open. For the time being we stay with the name “PEB” (instead of PMB). JK has agreed to be the permanent secretary. The current timeslot of Tuesdays at 16:00 was endorsed.

Les asked that proposed agenda items for future meetings should be sent to JK. The next two meetings cannot take place because of the two LCG reviews. The next meeting will take place on December 2.

Conclusions from GDB (10 November)

Les reported that the most crucial item was the question of a POOL catalog and the different approaches in the US and Europe. The GDB did not come to a clear conclusion, but it is important that the LCG project formulates a clear strategy.

CMS (DS) is not asking for anything special. They advocate having ONE working system, if necessary with reduced functionality, to run the productions. If necessary, the catalog could be cross-populated to an alternate system later. It is, however, important that the service is run more reliably, without interruptions. A half-hour interruption can affect a day of work.

Ian emphasized that the plan is to deploy the EDG RLS working with GFAL replica manager and that there is a strong argument not to do anything else. If the experiments think differently they need to make a statement soon – within the next few weeks.

While this strategy covers the 2004 data challenges, a plan needs to be developed for what should happen after mid-2004. Les proposed that LCG-2 be supported and used as a service until something better can replace part or all of it.

The general opinion was in favour of two separate teams: one supporting and maintaining LCG-2 and one developing new functionality.

 

LCG-1 and experiment usage

 

The experiments stated where they stand with LCG-1:

 

LHCb:

Andrei Tsaregorodtsev from Marseille got the LHCb production going with EDG 2.0 without using the resource broker, interfacing directly to the computing element instead. They then plan to target LCG-2 directly, without going through LCG-1 in between. The LHCb production is due to start in April. In the meantime they are not just waiting for LCG-2 but preparing by testing the LCG-2 software installation tools.

 

Atlas:

Oxana Smirnova and the testing team are actively using LCG-1 with a stable but obsolete version of the Atlas software that was used for the previous data challenge. They are not currently using POOL. There is no major production going on right now. Smaller productions are taking place without using LCG-1. The situation is now reasonably stable and the target timeframe for a more extended use in Atlas is LCG-2.

In parallel, discussions are taking place concerning the installation (including re-compilation and using shared libraries) of Atlas software on LCG-1. Dario asked that ordinary users be able to compile and submit jobs other than the standard production jobs, in a kind of sandbox model.

In the discussion, the urgency of converging on the general installation issue was emphasized. Federico proposed that Ian should go ahead with his proposal of software installation tools with rpms, in spite of ongoing discussions on the matter in GAG.

 

Alice:

A couple of FTEs are hammering LCG-1. Most of this activity takes place in Italy.

Alice is trying to use as much as possible of LCG-1 via their interface between Alien and LCG-1. The risk is that they are debugging the interface and LCG-1 at the same time. They hope to be able to use LCG-2 for the data challenge in January.

The system seems much more stable. Alice would like to see a common mailing list for LCG-1 problems. Ian said that there was one. The proposal is that the list be used more widely.

 

CMS:

Most of the work is still going on with LCG-0. LCG-1 is currently used for testing and not yet for production. They set up the jobs with the POOL catalog and RLS and then use the resource broker to run them detached. All bits and pieces are working and have been used for test productions with Oscar using POOL.

 

There was some discussion launched by Philippe on “slicing” the POOL file catalog into XML. (The secretary does not feel competent to elaborate further on the discussion.)

 

LCG-2 status

Ian reported that all the pieces were available. The original request was to have the same functionality as in LCG-1 but using the compiler gcc 3.2.

Now, apart from many bug fixes, a major issue is that the RLS has changed following a request from POOL to provide a more convenient interface for accessing it. A POOL release that goes with LCG-2 and this interface will be ready a couple of weeks in December. This change should be transparent to end users. Another change is that the database schema has changed to make queries more efficient. It is preferable to migrate the database entries now, before the database is more heavily populated. The idea is to have this LCG-2 release ready by the end of this month (November) and to deploy LCG-2 within a few days of the release.

In response to a question from Federico, Ian said that the plan with RGMA was to use it in an isolated way, as a monitoring system that does not interfere with the production system.

 

ARDA/HEPCAL response planning

Les proposed that, following the ARDA discussion in SC2, Frédéric and Torre should now make proposals on a work plan. It would be useful if the views of the experiments could feed into the process of dividing the work between middleware, applications and experiment-specific developments.

Federico proposed that Frédéric should study and make a statement of what could be done in the middleware area. What is left after that should go to the applications area, and what is still left should go to the experiments. He added that as much as possible should be done in the middleware.

Dario, Philippe and David S. seconded this proposal.

Dario said that Atlas has avoided developing its own solution and has always counted on a common development in this area. They do, however, have a strong interest and therefore want to be involved in the discussion on the functionality and modularity of the end product. Atlas people are also interested in other layers, including Atlas's own software. In any case, things should be defined in a coherent way.

David expressed uncertainty about how this relates to the prototype stage.

Philippe stressed the importance of functionality, independent of the work breakdown and the boundaries of responsibility.

Federico stressed that, after the decision on middleware, one should move as quickly as possible to a prototype for analysis.

In response to David's question about where the people could come from apart from EGEE resources, Les remarked that one should also include people from the experiments. Frédéric noted that EGEE manpower would not be available until April 2004.

Torre remarked that it was unfortunate that analysis requirements were being raised only now, when POOL is already well advanced. There are some 3 to 4 people from the application area to work on this. Federico stressed the importance of coming up with one OGSI-compliant implementation soon.

In conclusion, the next step is that Frédéric and Torre will develop ideas and propose a work plan towards a prototype to be available after the first half of 2004. A draft plan will be presented at the next SC2 in December.

In response to Les's question about who the grid experts from the experiments would be, the experiment representatives made initial proposals.


Jürgen Knobloch