LCG
Management Board
|
|
Date/Time: |
Tuesday 10 January 2006 at 16:00 |
Agenda: |
|
Members: |
|
|
(Version 3 - 17.1.2005) |
Participants: |
A.Aimar (notes), D.Barberis, L.Bauerdick, I.Bird, K.Bos, N.Brook, F.Carminati, T.Cass, Ph.Charpentier, I.Fisk, B.Gibbard, J.Gordon, F.Hernandez, E.Laure, P.Mato, H.Marten, M.Mazzucato, G.Merino, B.Panzer, Di Quing, L.Robertson, J.Shiers, M.Schulz |
Action List |
|
Next Meeting: |
Tuesday 17 January 2006 from 16:00 |
Minutes and Matters Arising ( minutes ) |
|
Comment to WLCG High Level Milestones

The changes were approved by the MB.

News from OSG

Announcement: R.Pordes has been elected Executive Director of OSG. There will be a meeting of the OSG Consortium in the week of 23 January 2006; many LCG and EGEE representatives will be present.

LCG Collaboration Board

The Collaboration Board has been formed and there will be representation for each Tier-0, Tier-1 and Tier-2 site or federation. The first meeting will be on 3 February. The main goal of the meeting is the election of the Chair of the CB. The list of members is reachable from the C-RRB page (http://lcg.web.cern.ch/LCG/Boards/crrb.html); the links to the lists of Tier-1 Centers and Tier-2 Centers mention their CB representatives.

Summary of Christmas Operations

The summary was
distributed by J.Shiers to the MB (email).

SC3 ran over Christmas to 9 out of 11 Tier-1 sites (all but FNAL and NDGF) using FTS. CMS was not using FTS to FNAL and there was no news from NDGF. A few problems were encountered and most of them were solved quite rapidly (with ASGC, INFN and IN2P3). Here is the detailed log.

A more significant problem was that the network to SARA was down for most of the period. This highlighted that the procedures for handling such a situation were not well defined: the contact details of the network operators at the site were not available, and the site network tests were only checking the general network, not the SC3 dedicated line.

ATLAS used the grid quite successfully during the period. It was noted that some jobs were scheduled for execution several days after they had been submitted, during which time their input files had (intentionally) been deleted.

The weekly Operations meeting should be attended by all experiments: only LHCb was present at the last meeting. |
|
Action List Review ( list of actions ) |
|
The action list will be reviewed outside this meeting, contacting the MB members involved.
Action: 15 Jan 06 - A.Aimar will contact the people with pending actions and will update the action list.
|
|
Another attempt to agree on the re-run of the SC3 throughput tests. Does this include tape or not? And if it does not, when does the tape path get tested prior to the SC4 throughput tests in April? |
|
SC3 re-run tape tests

In December the MB agreed that the tape tests were important for the SC3 re-run and should be executed. Since then some sites have announced that they would prefer not to participate, but it is too risky to wait until April, when the tapes will have to be working for SC4. An assessment of this risk was requested from each Tier-1 site:
- FZK: Additional tape drives were ordered, and it may be difficult to execute the requested tests during the SC3 re-run. Will send more information to the MB.
- ASGC: Difficult to be ready for the SC3 re-run. Will send more information.
- SARA: Will receive more equipment for SC4 in February. SC4 will therefore use different equipment, but the same HPSS storage system.
- RAL

Action: 17 Jan 06 – Tier-1 sites send updated information and dates about their SC3 tape tests. If they cannot do those tests in time for SC3 they must send the recovery plans in order to perform their tape tests as soon as possible and before the April SC4 throughput run.

Castor 2 recording test at CERN

Other tests with more streams reached 1.2 GB/s for a 48 h period. There were only minor problems; the system behaved well and the results are very positive.

Debugging of SC3 disk/disk transfers

SC3 disk/disk tests are being debugged; experiments are asked not to start data transfers for the coming week, to make it easier for the SC team to understand the problem. LHCb wants to repeat its data transfer tests from Tier-0 to its Tier-1 sites after the SC3 tests (end of February); ATLAS requested the same at the GDB. |
|
SC4 plans and requirements (more information)
Agreement on the process and timetable for defining the services to be provided and building a detailed plan |
|
The plans for SC4 need to be defined in greater detail before the CHEP workshop (10 Feb 06) in order to be final by the end of February. Not all requests from the experiments can be fulfilled for SC4; therefore there must be no ambiguity about what will be available. The list provided by F.Donno is a good starting point for clarifying what can be implemented in time, with priorities and effort.

The plans (see the proposal in the MB agenda) should include:
- an “experiment view” of services, stating exactly which features within each service will be available to each experiment for SC4
- a “site view” with details of the schedule of each service, and when it will be deployed, tested and in production at each site
- a “schedule by services” that summarizes all activities (development, testing, deployment, commissioning, etc) that need to be executed in order to have each service tested and in production.

An initial proposal will be prepared by the SC4 team with the PO. It will then require several iterations with experiments, sites, deployment team and development projects. All parties involved (experiments, sites, services, etc) should give priority to the definition of this SC4 plan.

The plan will cover Tier-1 sites first. Tier-2 sites will be included once the plan is better defined for the Tier-1 sites. In principle their installation schedules should not be too different from the Tier-1 plans.

A first version of the SC4 plan should be finished by the end of January, and therefore be ready for a detailed and conclusive discussion at the CHEP Workshop.

In parallel the discussion at the EGEE TCG will continue, to prioritise the list of features and improvements, also derived from the same list compiled by F.Donno. This (TCG) process will also deal with the prioritisation and planning of longer-term developments that will not make it into SC4.
The MB agreed to endorse the proposal. Experiments and Tier-1 sites will nominate, within 48 h, the people able to help with the definition of the plan.
Action: 13 Jan 06 – Experiments and sites should all urgently send the name of one person with the authority to discuss the details of the plans for SC4. |
|
AOB |
|
The VO-boxes Workshop will take place on 24-25 February 2006. |
|
Summary of New Actions |
|
- 13 Jan 06 – Experiments and sites should all urgently send the name of one person with the authority to discuss the details of the plans for SC4.
- 15 Jan 06 - A.Aimar will contact the people with pending actions and will update the action list.
- 17 Jan 06 – Tier-1 sites send updated information and dates about their SC3 tape tests. If they cannot do those tests in time for SC3 they must send the recovery plans in order to perform their tape tests as soon as possible and before the April SC4 throughput run.

The full Action List, current and past items, will be in this wiki page before the next MB meeting. |
|
|