Minutes of the PTB Meeting on January 28th

Present: Alexia Augier-Bochon, Kors Bos, Olof Barring, Maite Barroso Lopez, Vincent Breton, Federico Carminati, Peter Clarke, Mauro Draoli, François Etienne, Steve Fisher, Fabrizio Gagliardi, Antonia Ghiselli, John Gordon, Frank Harris, Bob Jones (chairman), Dave Kelsey, Peter Kunszt, Julian Linford, Cal Loomis, Robin Middelton, ? Mark Parsons, Laura Perini, Francesco Prelz, Pascale Primet, Roberto Puccinelli, Les Robertson, Markus Schulz, Ben Segal, Massimo Sgaravatto, Heinz Stockinger, David Widegren, Gabriel Zaquine

------------------------------------------------------------------------------
Introduction to EDMS by Gabriel Zaquine
see: (http://edmsoraweb.cern.ch:8001/cedar/doc.info?document_id=335478&version=0.2&p_tab=)

This is only a short summary of the presentation; the transparencies provide much more detail. David Widegren is a new member of the EDMS team.

EDMS is a document management system based on the commercial product CADIM, to which CERN has added a web-based interface. It supports, among other things, versioning, change control and access control. The system can be configured to reflect the project structure, and the release procedures are adjusted to the EU requirements.

A document can be in one of the following states: In Work, Under Review, Under PTB Approval, Approved, Obsolete. The Approved state will be extended to indicate approval by the PTB and by the EU. A document can be passed back and forth between the In Work and Under Review states without changing the version number. From later states a document can be moved back to In Work only by passing through the Obsolete state and creating a new version. The status of a document can be changed by the WP managers and their deputies. Creating documents requires the completion of a CERN form and the approval of the CERN administration.
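The document life cycle described above can be sketched as a small state machine. This is only an illustration and not EDMS code: the class and function names are hypothetical, the forward chain Under Review -> Under PTB Approval -> Approved and the transitions into Obsolete are assumptions, and the integer version counter is a simplification of the real versioning scheme.

```python
from enum import Enum


class State(Enum):
    IN_WORK = "In Work"
    UNDER_REVIEW = "Under Review"
    UNDER_PTB_APPROVAL = "Under PTB Approval"
    APPROVED = "Approved"
    OBSOLETE = "Obsolete"


# Transitions that keep the version number unchanged.
# Only In Work <-> Under Review is stated in the minutes;
# the remaining entries are assumptions made for this sketch.
ALLOWED = {
    State.IN_WORK: {State.UNDER_REVIEW},
    State.UNDER_REVIEW: {State.IN_WORK, State.UNDER_PTB_APPROVAL},
    State.UNDER_PTB_APPROVAL: {State.APPROVED, State.OBSOLETE},
    State.APPROVED: {State.OBSOLETE},
    State.OBSOLETE: set(),
}


class Document:
    def __init__(self, name, version=1):
        self.name = name
        self.version = version  # simplified: a single integer
        self.state = State.IN_WORK

    def move_to(self, new_state):
        """Change state without touching the version number."""
        if new_state not in ALLOWED[self.state]:
            raise ValueError(
                f"{self.state.value} -> {new_state.value} is not allowed"
            )
        self.state = new_state

    def new_version(self):
        """Return a new In Work version of an Obsolete document."""
        if self.state is not State.OBSOLETE:
            raise ValueError("only an Obsolete document can be reopened")
        return Document(self.name, self.version + 1)
```

In this sketch a document that has reached Approved cannot go back to In Work directly; it has to be marked Obsolete first, after which new_version() yields the next version in the In Work state.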
During the approval process all documents that are attached to a document have to be approved too. Documents can be attached to several nodes in the project simultaneously. CERN maintains a support web page and a phone line (78888) for users.

The second part of the presentation was a demonstration of the life cycle of a document inside EDMS. Some discussion arose when it became clear that documents with an identical EDG signature can be created at different places in the hierarchy. To avoid this, every document name should contain the EDMS internal document number. It was pointed out that documents in the In Work state are not required to be stored in EDMS at all times. This mainly affects those who prefer to prepare documents with tools other than Word. Multiple versions of a document can be handled in parallel by marking all but one as Obsolete. The automatic version increase affects the last level of the versioning scheme used; version numbers can, however, also be increased manually. When uploading documents it is important to ensure that the EDMS number is present inside the document. Dialog forms should be attached to documents via a reference to the URL.

----------------------------------------------------------------------------
Introduction Les Robertson

- The second year of the project is starting
- Many deliverables
- A lot of work went into the integration process
- The project is well placed for the next two years
- The next step is to move from small scale testbeds to large scale systems
- Thanks to Mark for helping us to prepare the documents presented to the EU
- Bob will take over to lead the PTB from now on

The audience thanked Les for his contribution to the project.

-------------------------------------------------------------------------------
Minutes of the previous meeting and matters arising Bob Jones

The status of the open action items from the previous meeting was given.
Fabrizio reported on the process of reaching an acceptable agreement between WP6 and the application WP managers. This action item could be closed. Federico reported on the action item to create a merged list of application requirements. There is a first version. The joint RTag will now build a common list in the scope of the LHC-grid. This is a different scope, but it will serve the same purpose. Kors announced that the problems with the project's web pages have been addressed and the action item can be closed. Pascale has sent the letter concerning D7.5 to the Project Office.

-----------------------------------------------------------------------------
Overview of Mode of Work by Bob Jones
(see http://documents.cern.ch/cgi-bin/setlink?base=agenda&categ=a0295&id=a0295s1t2/transparencies)

The meetings of the PTB are scheduled to take place every 2-3 months. The agenda of the meetings will be managed via CERN's Agenda Maker. Speakers are requested to upload their slides and relevant material to the agenda. The agenda for PTB meetings can be found at http://documents.cern.ch/AGE/current/ The password for this meeting is PTB012002.

The new secretary of the PTB is Markus Schulz (markus.schulz@cern.ch). Kors Bos is responsible for overseeing the deliverables process and nominating moderators. The agenda of the PTB will cover items that affect the overall project: testbeds, software release status, status of deliverables, relations with other Grid projects, architecture and cross-WP planning.

------------------------------------------------------------------
Testbed Status Cal Loomis
(see http://documents.cern.ch/cgi-bin/setlink?base=agenda&categ=a0295&id=a0295s1t3/transparencies)

Globus: There is a severe problem with the Globus 2 alpha 15 release that affects file transfers. This is currently the version deployed at most of the EDG sites. The best version of Globus available so far is Beta-21.
This is currently used on some of the sites. Several minor bugs have been discovered. There are indications of a gsincftp problem. The sandboxes are currently transported via gsincftp. For file transfer, however, globus-url-copy performs correctly and could be used as a replacement. Frank H. mentioned that LHCb has done tests and has experienced problems with the integrity of the sandbox. Tests at CNAF do not show this behavior despite the similar setup.

Francesco Prelz made a comment for WP1. The testbed used by WP1 is now based on the Beta-21 release, and WP1 sees no way to support two different testbeds. He wanted to know why WP6 moved to the new version while half of the testbed sites are still using alpha-15. There was a short discussion about the motivation of Globus to move from ftp to URL-Copy. This seems to be motivated by licensing issues with the ftp implementation used. Different versions of the WP1 code are running on different sites.

Cal explained his semi-automatic testing suite for gsincftp. The problem seems to be size related, but he has already observed transfers of 1k files failing. A second site with Beta-21 will be set up and tested.

EDG Software Status

For WP1, 2, 3, 4, 5 and WP6, new versions of their contributions have either been submitted or are expected to arrive soon. Thursday will be the deadline for the EDG 1.1 release. Bob stressed that delivery of RPMs is not sufficient without proper documentation. Cal clarified that a new procedure is in place to manage the acceptance of RPMs released by the WPs.

Testbed at CERN

A Beta-21 resource broker points to CERN's site 1, CNAF-Bologna and Lyon. Site 1 at CERN is currently running Globus alpha-15 and is installed by LCFG. The storage element is set up for multiple VOs. There is a request to have some machines set up as SEs, providing 50-100GB of storage, to test replication. The second site at CERN is installed and configured using LCFG.
Globus Beta-21 and version 1.03 of the WP1 software are used there. A local resource broker has been set up.

Other Sites: The sites at CNAF-Bologna, Lyon, RAL and soon NIKHEF all operate with Globus Beta-21. Apart from these core sites, most places still use the alpha-15 release.

Plans: The 1.0.0 release will be tagged on Friday. All sites are asked to follow the upgrade. After a week of testing, version 1.1.0 will be released on Friday the 8th. After this release all sites have to upgrade to 1.1.0. The main focus is to get ready for the EU demo.

The discussion that followed focused on the problems arising from multiple parallel versions and the frequent change of versions. There was broad consensus that the 1.0.0 release mainly serves as a test of the release procedure and that most sites should follow the 1.1.0 release.

-----------------------------------------------------------------
Future WP6 Tasks Cal Loomis
(see http://documents.cern.ch/cgi-bin/setlink?base=agenda&categ=a0295&id=a0295s1t3/transparencies)

Cal identified several administrative tasks for WP6. Among those are:
- The development of release procedures for the integration team
- Policies to handle daemons
- Packaging rules for applications (RPM as a format is not much liked by the experiments)

The handling of daemons touches the area of interest of WP4 and has to be discussed with them. There are several legal issues, such as the proper placement of copyright statements and a license review, which have to be solved across all WPs. The procedure to auto-build the EDG middleware from the CVS repository has to be put in place.

The experience of the integration with testbed 1 shows the need to start a test group to develop a framework for running a test suite, and it is important to release a first version soon. There was a short discussion about the scope of a test suite. It was agreed that the tests should cover Globus too.
A package to be run on each new site should allow an automatic basic evaluation of a site's status. There are plans to make more detailed (and accurate) information about the status of the testbeds available via a web-based interface. Several tools are planned to help with the package distribution. A script to verify that the correct daemons are running will be provided. Olof pointed out that the functionality to monitor daemons is in WP4's current release.

Documentation: The basic User's Guide has to be expanded into a useful document. Information from separate sources has to be combined to produce a Developer's Guide, and the EDG Installation Guide has to be updated continuously.

Operating Systems: Port to Solaris 7. WP6 needs more than 2 machines for this. The upgrade to Red Hat 7.2 is planned for July. Bob made clear that it might be very difficult for the experiments to follow the upgrade to 7.2, because CMS depends on Objectivity and there is no port to 7.2 available. Kors suggested that we could run different versions on the grid. Cal thinks that a uniform testbed is easier to manage.

Miscellaneous: More information about the configuration of a site has to be published by the information providers (set of RPMs installed...). The User Interface should be repackaged to create a version that can run from a laptop and be deployed within AFS at CERN.

Bob wanted to know how Cal's plans cover the priority list provided by the applications. Cal referred to Federico's presentation. Bob thanked Cal for the presentation and asked what the next steps are. François said the intention is to better define the testing and support role of WP6. Bob agreed and asked for a plan, including the roles of the WP6 individuals, to be prepared before the EU review.

action: François Etienne to prepare plan for WP6 testing and support activities.
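As an illustration of the daemon check mentioned in this section, a minimal sketch could compare a list of required daemon names with the processes reported by ps. This is not the WP6 script or the WP4 monitoring tool: the function names are hypothetical, the list of required daemons would have to come from the site configuration, and ps may truncate long command names on some systems.

```python
import subprocess


def running_process_names():
    """Return the set of command names reported by `ps` on this host."""
    result = subprocess.run(
        ["ps", "-eo", "comm="], capture_output=True, text=True, check=True
    )
    return {line.strip() for line in result.stdout.splitlines() if line.strip()}


def missing_daemons(required, running):
    """Return the required daemon names that are not running, sorted by name."""
    return sorted(set(required) - set(running))
```

A site check would call missing_daemons(required, running_process_names()) with its own list of required names and report any name that is returned.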
------------------------------------------------------------------------------------
TB1 - An Application View Federico Carminati
(see http://documents.cern.ch/cgi-bin/setlink?base=agenda&categ=a0295&id=a0295s1t35/transparencies)

The feedback given by the experiments was summarized.

LHCb: LHCb tried to test several commands; most did not work correctly or failed completely.

ATLAS: ATLAS had only 10 days before Christmas for testing. They intend to use EDG tools in production for DC1, which will start in two months. This will have a major impact on the future of EDG in ATLAS. ATLAS managed to test the WP1 tools successfully. Stress tests could not be performed. For full operation ATLAS needs stable middleware from WP1, 2 and 5 very soon. The integration with MSS is needed to conduct meaningful tests. An extension of the testbed to non-EU sites is important for ATLAS. A procedure has to be defined to install the ATLAS DC1 software. This includes Objectivity and AFS (or a local AFS mirror faking /afs).

ALICE: Many tests were performed, both application dependent and independent. Tests that involve the Replica Catalog and the MSS are still missing. The uniformity of the testbed is very important for the experiments. The correct environment has to be set everywhere. All sites (and nodes) need to be synchronized to a common time server.

CMS: Job submission with resource matching was tested successfully. The Sandbox could be used. The API of the Replica Catalog was tried. The tests suffered from a highly unstable testbed. There were several problems affecting the Resource Broker. Large file transfers suffered from data corruption. The main points that have to improve: The testbed has to provide support for multiple VOs. Instructions are needed on how to use GDMP for data replication. An interface to the SE I/O is needed that updates the RC at the same time.

EO: Except for the retrieval of the output, the basic job submission worked.
Up to now there have been no stress tests. Replication was not tested, mainly because the procedure is not clear. The lack of stability of the testbed is a problem. If the documentation of EDG software provides examples, they have to work to be useful. The integration effort has been underestimated.

Integration: The integration procedure lasted too long (Sep. 1 to Dec. 10). The lack of a high-level architecture might be part of the problem. The applications should be full members, not observers, of the integration team to avoid problems like missing libraries and environment. The integration was terminated before one site was fully operational. Some integration work had to be carried out by the Loose Cannons and the WPs. WP6 needs to elaborate a clear plan for deployment. There is a mismatch between the ITeam's priorities and the needs of the applications. The applications are preparing a report to address this more specifically.

Validation: There should be more support from WP6 to the applications. The ITeam at CERN disintegrated before the two local sites were up and running. There is a lack of uniformity between different sites. A clear distinction between test sites and stable sites on the testbed is needed. The 10-day period that the applications had for testing was too short. Antonia made a request for a distributed wide area testbed with more support from WP6.

Priority List of Applications: Federico gave a list of commands that have to work on all testbed sites. This list includes access to CASTOR and HPSS. The priorities for storage and replication address the problem that the function and interaction of the basic elements involved are not well understood. A high-level model is missing. The documentation of typical usage scenarios is missing. On the practical side, a system with at least two SEs is needed to test replication and publishing of files into the RC. The problem of Sandbox corruption has to be solved.
The user environment should at least provide a unique location that can be used to run scripts to set up the environment for the applications. PATH and library path should be uniform for the system software and core software (CERNLIB). The automatic file replication via GDMP has to be tested. Heinz pointed out that the changes underway will make this easier. Federico replied that the quality of GDMP's documentation is currently a major problem.

Actions:
- More tests are needed.
- A priority list for the ITeam.
- A priority list for the WPs.
- The SC2 will produce a common set of Use-Cases.
- The Loose Cannons will decrease their WP6 related work.

During the discussion following the presentation several questions came up, which were answered by Federico.

How will LHC-Grid and EDG work together? They will use common Use-Cases for the four HEP experiments.

Why only integrate the Use-Cases for HEP? We will first unite the four LHC experiments, and if this works Earth-Observation and Bio-Med will come on board.

There was a concern that the requirements list for the ITeam mixes problem domains of different WPs and that the ITeam is not the right body to discuss new functionality. Federico stated that he agrees, but that the ITeam has the full picture of the system. It was decided that Cal should make a list of people working on the test framework. This should happen between release 1.0 and 1.1.

action: Cal to identify individuals working on the test framework.

----------------------------------------------------------------
Project Review Agenda Fabrizio Gagliardi

The annual report will be available to everyone within one week. Fabrizio urged the WPs to send their contributions on time. Out of 12 only 7 have been received on time. CERN has no additional resources.

Review Agenda: The executive summaries are of great importance due to the amount of material submitted. There was a short discussion about providing a reading list. It was agreed that a reading list should be assembled.
Actions:
- All WP managers are asked to submit a two line statement about the most important component provided and the most pressing open issues.
- A member of the Bio-Med or EO community should present the accomplishments of the project from the users' point of view.

Ben remarked that we start the review with the part that is currently one of the weaker areas. He questioned whether this is a good strategy. Fabrizio said that we have to explain and present the testbed in an enthusiastic way. The demonstration will take place in building 600 at CERN. The site should be set up in a way that is visually appealing too. Ingo will describe the demo and Eric (LHCb) will give the demonstration. For the review we should stress activities that are not mentioned in the technical annex, such as the collaboration with GEANT.

-------------------------------------------------------------------------
Status of Cost Claims Alexia Augier-Bochon

The deadline for claims was the 14th of January. Six of the partners have overspent their money and seven spent less than allocated. Up to now eight partners have failed to submit their claims.

------------------------------------------------------------
Data Grid Conference Agenda Bob Jones
(see http://documents.cern.ch/cgi-bin/setlink?base=agenda&categ=a0295&id=a0295s1t6/transparencies)

On Monday and Tuesday there will be parallel sessions for the WPs and bilateral meetings. In addition, a meeting concerning DataTAG will take place.

On Wednesday there will be a plenary session in the morning, where the feedback from the project review will be presented and the most important topics identified. The afternoon will be organized into parallel sessions. Each session will cover one topic raised by the review.

On Thursday political leaders of France will visit the conference. In addition the Architecture Group will meet.

The topic groups will report on Friday morning during a plenary meeting.
During this plenary meeting planning for Testbed-2 will be covered. In the afternoon the project managers will give their reports. Other grid projects will present their status and plans. In the evening the PMB will convene.

Olof raised the question of the preparation of the next release. His technical people will not be present on Friday. Seen in this light, the current agenda does not have enough technical content. It was agreed to request more rooms and to try to have three days of technical meetings. The agenda will be updated and sent out a.s.a.p.

-------------------------------------------------------------------
Status of Deliverables Kors Bos
(see http://eu-datagrid.web.cern.ch/eu-datagrid/Deliverables/Deliverables.htm)

5.1 (John Gordon) Only a draft has been prepared. It will be sent out in a week. It seems that one of the authors is also a reviewer of the document. The document has to be accepted by Friday in a week. Mark will check the document and release it on Monday.

2.1 There were some comments concerning the review process. Part of the contents has been moved into 5.1. The document has been ready for review by the PTB for one week.

5.2 Was given to the PTB last week and received positive comments from the reviewers; some of the WPs sent their remarks. Mark has only a few small comments on this document; a few technical details are missing. Since this document covers the Storage Element it should be read by all WPs. Comments have to be sent out this week.

6.2 Yannick produced a new document, which has to be read by WP1, 2, 5 and WP7 before being sent out. Some details need clarification. There is an overlap with the software release document. The evaluation should not be part of this document and has to be included in 6.4. The document should be read by Friday to give Yannick one week to incorporate changes. An executive summary needs to be added.

1.3, 2.3, 3.3, 4.3, 5.3, 6.3 These cover the user manuals and should be grouped together.
The resulting document will contain a cover note and the individual documents. The reorganization has to be done by Friday. The document has to be released on the 12th.

6.4 (Jeff and Mario) The document covering the evaluation of the testbed has been submitted very late. Nevertheless some WPs have already responded. The chapter about deployment and operation has to be written. This will be finished during this week. The skeleton of the document should be reviewed. The integration process is still missing. Cal has already written part of the Plan of Deployment chapter and the part covering the experience gained with the deployment. The section about the achievements has to be written. Testing needs broader coverage. Testing Strategy and Global Strategy are missing. Olof suggested that the document should contain a list of the items that went well and of those that went wrong. Antonia suggested that the packaging should be covered too. She offered to ask Flavia to write a chapter describing the installation kit that has been developed. There is a daily phone conference for this deliverable until the release next Thursday.

3.2 (Peter) The comments of the reviewers have been discussed. Concerning the choice of MDS <-> RGMA, the document should describe why MDS has been chosen. The justification has to be based on a test plan addressing scalability and performance. Bob will write a section of text outlining the policy for the future.

7.1 Has been approved.

12.3 The document about the Software Release Procedure is ready and can be released soon.

3.1 Stefano gave the moderator's report. This deliverable has been much improved. Mark has read it. The document claims that MDS is inferior to RGMA, but no arguments are given. FTree is described very positively, but it is not clear which one should be chosen. Steve Fisher and Mark agreed to go through the arguments and remove from the document any statements that are not supported by clear arguments.
There were some comments about why tools like NetSaint are not described or covered by references to 7.2.

7.2 Clear requirements section, the existing tools are listed, the matrix described. The deliverable should be submitted. Mark had no objections. He mentioned that he had the impression that none of the tools addresses the problem of large file transfers. It was stated that this is covered by NetLogger. It was pointed out that the deliverable is the prototype, not the document. Mark wished that the objective were stated more clearly. One or two additional sentences might be enough.

8.2 Was delivered on time on the 15th, in a rather preliminary state. It has been reviewed by four people. Version 2.0 has now been produced and is almost finished. Some recommendations were made by Julian. There are parts that need rewording. Peter still has some open questions and wants the process to continue. Mark has the general impression that the document has arrived almost in its final state. The executive summary does not cover what it should; however, the conclusion section is a correct executive summary and can be used as such. Mark made the more general remark that the authors should stress the success of their work more.

9.2 No problems or open questions about this.

10.2 There have been two iterations and the response to the current version is good. The document was released to the reviewers on the 22nd. 10.2 contains very detailed, maybe too detailed, descriptions of the applications. Despite the missing executive summary the document has been recommended for acceptance. Mark commented on the part that covers parallel computing. He thinks that it is a bit over-optimistic, since there are currently significant problems with the tools for parallel programming like GlobusMPI.

12.4 The Architectural Requirement Document is quite ready. There is some input concerning the interfaces needed. WP5 and WP7 sent their input.
It should be pointed out that this might be only a starting point for an architecture group. There will be a new release soon.

Summary: The x.3 documents will be merged. Six deliverables are still open. Of these, 5.1, 6.2, 12.3, 12.4 and 8.2 will be finished this week. Only 6.4 will require significant work.

----------------------------------------------------------------------------
New Web Report Roberto Puccinelli

The new web site for the EDG project will be used as the default site from 31st January onwards. The handling of confidential content is still an issue. Information that has to be protected is kept on a site in Italy. There was broad consensus that all information should be kept in one location. The result of a short discussion was that we want at least the PMB and PTB minutes to be protected. The calendar is not used widely enough. There were some remarks that it is hard to update the calendar.

Action Items: For the deliverables some additional classes of documents are needed. In some places the email addresses are missing. A how-to for the calendar has to be provided.

-------------------------------------------------------------------
Architecture Group Bob Jones
(see http://documents.cern.ch/cgi-bin/setlink?base=agenda&categ=a0295&id=a0295s1t12/transparencies)

Bob gave a short overview of the current lack of architecture and a list of lessons learned from testbed-1 and D12.4. He stressed the importance of the experience gained with testbed-1. He sees deliverable 12.4 as a starting point for future work on the architecture.

The mandate of the architecture group is to modify D12.4 in a way that clarifies the interfaces and dependencies of the different components, with a focus on the requirements of testbed-2. In particular, the comments of the reviewers, the experience with testbed-1 and the changing requirements of the applications group have to be considered.
In addition the design has to take into account the future development of the toolkits used and has to aim for convergence with the architectures being developed in other Grid projects.

The AG should have about 15 members. The Technical Coordinator will chair this body. The middleware work packages 1-5, networking, integration and the applications will be represented. In addition there will be a person to address the security requirements of the architecture. The AG will invite consultants from related projects to participate in the work. The member list will be finalized during the week and the first meeting is scheduled for Thursday 7th March in Paris. The AG will meet once a month. Participants are expected to dedicate 30% of their time to the work of the AG. This requires considerable effort between the meetings.

There was a question about how to find a person working on security matters. Bob answered that a person from the WP7 security group is going to work on the architecture. Ben asked that the meetings of the AG be opened up and audited. It was agreed that a compromise has to be found between openness and effectiveness.

-------------------------------------------------------------------------
AOB

Peter K. asked about licensing. He wanted clarification about whether there would be an EDG license or whether we wait for the Globus license approach. It was mentioned that the suggestion made by Foster at the GGF could not be accepted by the funding agencies.

The next meeting of the PTB will be on April the 10th.

----------------------------------------------------------------
Action Items:

Provide 2-3 additional SEs with sufficient storage to do meaningful tests.

More than 2 Solaris 7 machines are needed to test EDG software under Solaris.

François Etienne: Prepare a plan for WP6 testing and support activities.

Cal: Create a list of people working on the test framework. This should happen between release 1.0 and 1.1.
WP Managers: All WP managers are asked to submit a two line statement about the most important component provided and the most urgent open issues.

Bob: For the first three days of the conference more rooms should be requested and the agenda should be changed to have three days of technical meetings.

Roberto & Gabriel: For the deliverables some additional classes of documents are needed. In some places the email addresses are missing. A how-to for the calendar has to be provided.