SC day-by-day REPORT

25/06
CMS: many import error messages at CNAF (CastorStagerInterface: device or resource busy). Tape server problems also caused a lack of free space on tapes and failures when sourcing data to be transmitted to DESY. 100% of jobs aborted at CNAF (see attached graph).
ATLAS: dataset production expected to restart in the afternoon at 2 pm, followed by the consequent data transfer to CNAF, including both RAW and AOD data. Transfer to T2 sites expected (Milano works fine, as tested last week; Roma1 suffered from a power cut).
RAW: on tape
AOD: on disk only
Roma1: blackout. FTS channel list problem fixed by following the guidelines related to GGUS #9737.

26/06

27/06
ATLAS: start of AOD transfers expected today.
CMS: job submission errors on ce03 at CNAF, long debugging session. The cause was discovered in the evening: an upgrade of the CE from LCG 2.7.0 to InfnGrid 3.0.0. A downgrade solved the problem. The sc pool suffered from many hanging srm processes, probably due to a very high load on the castor1 lhcb stager. This kind of problem will no longer occur after the complete migration to castor2 (already ongoing), where a gridftp server pool is not needed and all file servers run gridftpd. sc pool restored on 28/06 at 11 am.
Roma1: completed upgrade to gLite 3.0.0.

28/06
ATLAS: activities in the T2 sites suspended.

29/06
Roma1: new disk server available, working as a disk pool in DPM (OS SL4 64bit). Disk space will increase by 5 TB. Number of parallel file transfers from CNAF to Roma1 increased from 3 to 6 (CMS transfer jobs had been seen in pending state for too long).
CNAF: planning an upgrade of the LFC server. The Oracle server needs to be decoupled from the LFC server. In addition, two LFC servers need to be made available for load balancing.

30/06
CMS: load transfer tests for import of files to CNAF started failing at 1 am. Cause: kernel panic of the Castor2 CMS file server, with consequent hanging of the CMS stager.
CMS file server back in operation at around 12 am.
ATLAS: very slow migration of data to tape experienced, under investigation. Possible cause: concurrent write/read operations on tape. Consequence: the ATLAS disk pool is full, as garbage collection deletes files only if they have been successfully written to tape. ATLAS activities stopped at around 12.30 am, waiting for a working environment to be restored.
Napoli: SRM DPM configured and operational.

===============================================
FTS CHANNEL STATUS SUMMARY:
Channel list: complete
Functionality: fine, with the only temporary exceptions of Roma1 (CMS and ATLAS T2 site) and the CNAF-Napoli channel (Napoli is an ATLAS T2 site), which was recently set up and is to be tested tomorrow.

DETAILS:
Status of FTS channels configured on the FTS server at CNAF:
 . LHCb: the T1-T1 matrix for LHCb is complete
 . ATLAS: the T1-T2 INFN set of channels needed by ATLAS is complete
 . CMS: the T1-T1 and T1-T2 INFN channels needed are all configured
 . ALICE: the T1-T2 INFN channels are configured

Functionality:
 . T1-T2 INFN channels: all working fine, with the only exceptions of CNAF - Roma1, which is suffering from SRM problems due to a recent power cut (the channel was tested before this event and worked fine), and the CNAF - Napoli channel, which was configured last Friday and will be tested tomorrow, June 28.
 . T1-T1 LHCb matrix: usually fine, but currently suffering from problems with the LHCb stager of Castor at CNAF (the stager is currently used for storing many small log files, as backup of a host at CERN that is currently not functional), not because of FTS problems.
 . T1-T2 INFN for CMS: channels tested in both directions and working fine (with the only exception of Roma1, for the same reason reported above).

MONITORING
A first web monitoring page for FTS (both production and pre-production services) is available at http://tier1.cnaf.infn.it/FTS/
It is based on rrd, parsing channel agent logs.
TODO: aggregate statistics by VO activity (the GridICE team is collecting channel, VO agent and tomcat logs in a MySQL database). More details about the metric definitions are at http://gridice.forge.cnaf.infn.it/dev/Measurements/FTS
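
As an illustration of the per-VO aggregation mentioned in the TODO, here is a minimal Python sketch. It assumes the log entries have already been parsed into simple records; the field names (channel, vo, state, bytes), the state values and the sample data are hypothetical placeholders, not the real FTS log schema or the GridICE database layout.

```python
from collections import defaultdict

# Hypothetical, already-parsed transfer records. Field names, state values
# and numbers are placeholders for illustration only; they are NOT taken
# from the real FTS channel agent logs.
SAMPLE_RECORDS = [
    {"channel": "CNAF-ROMA1", "vo": "cms", "state": "Done", "bytes": 2_000_000_000},
    {"channel": "CNAF-ROMA1", "vo": "cms", "state": "Failed", "bytes": 0},
    {"channel": "CNAF-MILANO", "vo": "atlas", "state": "Done", "bytes": 1_500_000_000},
    {"channel": "CNAF-NAPOLI", "vo": "atlas", "state": "Done", "bytes": 500_000_000},
    {"channel": "CERN-CNAF", "vo": "lhcb", "state": "Failed", "bytes": 0},
]


def aggregate_by_vo(records):
    """Return per-VO counts of done/failed transfers and bytes moved."""
    stats = defaultdict(lambda: {"done": 0, "failed": 0, "bytes_done": 0})
    for rec in records:
        entry = stats[rec["vo"]]
        if rec["state"] == "Done":
            entry["done"] += 1
            entry["bytes_done"] += rec["bytes"]
        else:
            # Any state other than "Done" is counted as a failure here.
            entry["failed"] += 1
    return dict(stats)


if __name__ == "__main__":
    for vo, s in sorted(aggregate_by_vo(SAMPLE_RECORDS).items()):
        total = s["done"] + s["failed"]
        print(f"{vo}: {s['done']}/{total} transfers done, "
              f"{s['bytes_done'] / 1e9:.1f} GB moved")
```

The same grouping could equally be done with a GROUP BY query on the database once the logs are loaded; the in-memory version above is only meant to show which quantities would be aggregated per VO.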