Minutes of dTeam meeting 25 Nov 2005.

Present:
Glasgow: Graeme
Imperial: Olivier
RAL: Chris, Jeremy (chair), Stephen, Jens (mins)

Review of actions

"Follow up on questions about Brunel and submitting SFTs to sites not in the RB used by Polish SFT submission" - new SFT questions with LeSC joining (although jobs run successfully at LeSC). May need to test RB. New action: Olivier to follow up with appropriate lcg-rollout.

"Reconfirm expectations for Tier-2 hardware" - Olivier announced that IC-HEP will get new hw on time.

"Starting with Alessandra review document in Wiki" - it is known not to have reached Graeme, so it must be with either Alessandra or Fraser.

"Investigate methods to remove old transfer files (for SC4)" - should also be raised in storage group (the storage group decided that FTS is also within its scope, but mainly because there are 'delete' issues with some SRMs - see e.g. bug #10120). In fact it has been raised but nothing much has been done yet.

"Find out how sites are to know which experiment software version to have installed" - Olivier reports that he talks to the (Atlas) SGM, but that doesn't scale. Jeremy needs to raise the issue at GDB.

Site name change proposals

Done at IC after shutdown; RHUL interested but need procedure (other admins have also asked). Suggested naming scheme UKI-* where * is decided by Tier 2 coordinator and must be consistent.

Feedback from LHC CRRB

Graeme's talk well received by review board, also good feedback from Les & Ian. Q: about communication between T2 & experiments, A: needs improving. Board also asked about install tools. SC3 throughput tests failed and need redoing in January. VOBox mentioned but no comments.

Update on security challenges

Jeremy mentioned the jobs were running but delayed due to a cleanup at RAL.

Progress with SC4 preparations

SRM deployment OK, sites that said "mid Nov" have started. Only concerns about ~2 sites now. Graeme reports Matt Hodges involved in FTS, setting up channels, documenting.
Performance problems (low xfer rate) to dCache need resolving; getting good performance between DPMs. Testing between Glasgow and Edinburgh because the latter has both. Needs testing with 1.6.6 too - Edinburgh and RAL-Storage (not Tier 1) have 1.6.6, but Tier 1 upgrade imminent (30 Nov).

Question about monitoring GridFTP rate for dCache - storage group had this as bug #10092, but annoyingly it was closed without a comment on what the solution was. I will try to locate the information; IIRC it required increasing log verbosity in the old days but this may have improved. Data needs to be published via R-GMA. ACTION Jens&Graeme: how to do this.

How to get at-a-glance xfer rates between sites? The status page is already getting crowded. May need planning on a different page from the status?

LFC document: pretty much done, Durham tested it, worked fine modulo a certificate problem. Difficult to do more than a nameserver test. ACTION coordinators: need >= 1 LFC in each Tier 2 by end Dec - nominate a site.

Chris didn't need much interaction with Tier 1 to set up FTS client; tuning may be more interesting. Caution: There is more than one page that claims to have the "latest" version. There is only 1 server, at RAL.

Network tests: Graeme advises to (1) let network people know you are doing them (and when), (2) do them overnight. Will try 1 TB overnight.

Jens mentioned that he has asked Pete and Gareth to put SC4 storage/FTS preps on the hepsysman agenda, with speakers tbd.

UKI performance

Glasgow has large MPI jobs clogging up cluster, blocking out FTS. QMUL had PBS problems. Maybe need a queue for short jobs.

AOB

Jeremy says admins can now edit ROC report within a specific time window; site admins can record reasons for failures. See mail to list. Feedback to Jeremy.

Jens reports that he's interacting with ETF where ~3 sites are installing gLite components, and a new gLite report was circulated to ETF (but is not yet circulatable outside). ETF evaluating DPM.
Also asked to give feedback on gLite install (as a non-LCG shop). In general site admins are advised to give the same feedback.

------------------------------------------------------------------------

Summary of new actions

Olivier to follow up on LeSC/SFT/RB problem with LCG/lcg-rollout.

Jens&Graeme: (1) get dCache gridftp xfer performance out of dCache and (2) publish via R-GMA.

Coordinators: nominate a site in each T2 to have LFC installed by end of Dec. That is _installed_ by end Dec, not nominate by end Dec.