WLCG-OSG-EGEE Operations meeting

Europe/Zurich
28-R-15 (VRVS (plane room))

Description
grid-operations-meeting@cern.ch
Weekly OSG, EGEE, WLCG infrastructure coordination meeting.
We discuss the weekly running of the production grid infrastructure, based on weekly reports from the attendees. Reported issues are discussed, assigned to the relevant teams, followed up and escalated when needed. The meeting is also the forum for sites to get a summary of the weekly WLCG activities and plans.
Attendees:
  • OSG operations team
  • EGEE operations team
  • EGEE ROC managers
  • WLCG coordination representatives
  • WLCG Tier-1 representatives
  • other site representatives (optional)
  • GGUS representatives
  • VO representatives
The VRVS "plane" room will be available from 15:30 until 18:00 CET.

    actionlist
    minutes
      • 16:00 - 17:25
        WLCG-OSG-EGEE Operations Meeting 28-R-15

        • 16:00
          Feedback on last meeting's minutes 5m
          Minutes
        • 16:05
          EGEE Items 20m
          • Grid-Operator-on-Duty handover 5m
            From Russia ROC (backup: Italy) to CERN ROC (backup: DECH ROC)
            Tickets:
            Open 55
            Closed 30
            2-mail 13
            Modified 48
            All 146

            Notes:
            1. No information was available on whether SFT for PPS was being enabled.
            2. The dashboard was very unstable and has not refreshed since Fri, 29 Sep 2006 16:16:19 +0200.
          • Job priorities WG 10m
            Summary of the Job Priorities WG recommendations and deployment plans
            Speakers: Jeff Templon, Dietrich Liko
            transparencies
          • Move to the new version of FCR 5m
            As part of the migration to the new version of FCR, VOs should be reminded to apply their settings in the new version, as the old one will be phased out by 6 October 2006. This is especially important because of the 'dteam' => 'ops' change (currently most VOs do not have a Critical Test set defined for 'ops').
            • VOs need to check that their settings in the new FCR tool are correct.
            • Owners of top-level BDIIs which use FCR need to use the new LDIF.
            Speaker: Judit Novak
      • Update on SLC4 migration 5m
        The goal is to port the gLite components into the ETICS system by the end of October, which automatically means that they will be built on SLC3 (ia32) and SLC4 (ia32 and x86_64). In theory, builds on Debian and ia64 should also be possible, although the build system on those platforms has not been completely tested.
        Components are ported by subsystem, with priority given to those required to have the UI and WN ready first. In any case, components are built as they become available.

        For an example please look at:
        http://etics.cern.ch/rundir/glite-wms-utils_R_3_1_8_x86_slc_3/
        http://etics.cern.ch/rundir/glite-wms-utils_R_3_1_8_x86_slc_4/
        or the general build portal:
        http://etics.cern.ch:8080/reportBrowser/

        The corresponding packages can be found in the ETICS repository:
        http://etics.cern.ch:8080/repositoryBrowser/
        We will also work on a script to provide the package list in a form suitable for populating the gLite APT repository directly.
      • Summary of the status of the request to allow users to pass arguments to the underlying LRMS 5m
        Speaker: Alessandra Forti
      • Savannah bugs to follow up 5m
        • Bugs 17738 and 15746 (both GFAL): work will start in around 4 weeks' time. The delay is because the SRM 2.2 work has to be completed first. Feedback should be given if this is too long (with justification).
          bug #17738: GFAL info system timeout too low
          bug #15746: GFAL should optimize LDAP queries

      • bug 19878: work is currently due to start at beginning of December. Feedback should be given if this is too long (with justification)
        bug #15878: DNs with "." are not properly handled
  • EGEE issues coming from ROC reports 15m
    Reports were not received from these ROCs: AP, SWE, UKI

    1. Item 1 (NE ROC): A major concern for the Netherlands is the possible drop of support for VOMS-enabled Pre-WS GRAM on the gLite-CE. A number of the VOs that we support use Nimrod to submit jobs, which works with Pre-WS (VOMS-enabled) GRAM, at least as long as Globus packages are in their toolkit. See also the remarks made for the SARA-MATRIX site.


    2. Item 2 (SEE ROC): 1) AEGIS: Yet again, a non-official and invalid SFT, sent to our site by Rafal Lichwala from the SFT Admin Tool on 27-09-2006 10:51, is present in our CIC daily report. While we don't mind having regular jobs sent to our site through supported VOs, the CIC daily report should not contain such an SFT failure. To make matters worse, this SFT failure is triplicated.


    3. Item 3 (Italy ROC): We were not able to reproduce, as a 'dteam' user, the errors on the SFT tests for this day that were marked as critical (CT). What is the procedure to test as 'ops'? Should we ask to become members of the 'ops' VO?
      (UKI ROC) The problem of the OPS failure with 3rd party replication is being investigated. It seems to be a very limited problem, affecting only this VO and only lxn1183.cern.ch as a remote SE. As the site has no one in the OPS VO to aid with testing, it is very hard to debug. We suggest that at least one support person in each ROC be a member of the OPS VO to help sites with problems like this.


  • 16:25
    OSG Items 5m
    • Item 1 5m
  • 16:30
    WLCG Items 35m
    • WLCG Service Commissioning report and upcoming activities 15m
      Speaker: Harry Renshall
      document
      more information
  • 17:05
    Review of action items 15m
    actionlist
  • 17:20
    AOB 5m
    • Item 1: Change of day of the operations meeting, back to Mondays at 16:00 5m