LCG Workshop on Operational Issues

Europe/Zurich
CERN

Ian Bird
Description
Over 80 sites are currently connected to the LCG grid, and over 8000 processors are available to run a variety of applications. During the Data Challenges of the LHC experiments, the grid middleware has proven to be stable, though incomplete. This seems a good moment to shift attention from making the software more reliable to making grid operation more reliable. In the end we need an infrastructure that is always operational, where software upgrades can be done regularly and in a controlled way, where bugs can be fixed quickly and efficiently, and where users can get support when and where they need it.

To discuss how to achieve this, a workshop will be organised at CERN from 2 to 4 November. We would like to make this a real workshop, with one plenary session only, followed by many small dedicated meetings, each focused on just one aspect.

The people responsible for the operation of the major LCG centers are invited to this open workshop, as are the people responsible for the EGEE Operational Management Center, the Regional Operation Centers and the Core Infrastructure Centers. People with real hands-on experience of operations should come to propose solutions for the bottlenecks we will identify, together with the managers who can assign the resources and manpower to make these solutions a reality. The format of the workshop is outlined below.

The IT Auditorium can hold approximately 100 people, so we ask you to register by filling in the form at the following web address: http://lcg.web.cern.ch/LCG/SC2/LCGWorkshop/LCGWorshopReg.asp

Although this is an open workshop, the organizers retain the right to make some choices in case of oversubscription.
    • 09:00 18:00
      Plenary Session I 40-SS-C01

      CERN

      • 09:00
        Introduction to the workshop 30m
        Speaker: Ian Bird (CERN (IT-GD))
      • 09:30
        Current LCG/EGEE Operations 1h
        What are the issues?
        Speaker: Stephen Burke (RAL)
      • 10:30
        COFFEE 30m
      • 11:00
        Security incident response 30m
        Summary of OSG/EGEE/LCG incident response procedures and plans + discussion
        Speaker: Dave Kelsey (RAL)
      • 11:30
        Overview of the deployment process 30m
        Speaker: Markus Schulz (CERN (IT-GD))
      • 12:00
        Operations management 30m
        Overview of site testing, problem follow up and escalation. Local vs remote control of sites. Discussion on escalation procedures. How to handle bad sites?
        Speaker: Piotr Nyczyk (CERN (IT-GD))
      • 12:30
        LUNCH 1h
      • 13:30
        Grid3 Operations 30m
        Speaker: Doug Pearson
      • 14:00
        Sharing work and responsibilities 30m
        Summary of what was discussed at CHEP. How to share the work between the CICs and other GOCs.
        Speaker: John Gordon (RAL)
      • 14:30
        Globus Monitoring 30m
        Speaker: Jennifer Schopf (ANL)
      • 15:00
        Monitoring Frameworks 30m
        Overview of R-GMA monitoring framework. What information needs to be published by each site?
        Speaker: Min Tsai
      • 15:30
        Monitoring in LCG 30m
        Speaker: Dave Kant (RAL)
      • 16:00
        COFFEE 30m
      • 16:30
        LCG/EGEE User Support - GGUS 30m
        Speaker: Torsten Antoni/Holger Marten (FZK)
      • 17:00
        User Support in grid.it 30m
        Speaker: Marco Velato
      • 17:30
        Accounting, current status 30m
        Speaker: Luciano Gaido
    • 08:50 12:30
      Working Group 1: Operational Security 14/4-002

      CERN

      12
    • 08:55 12:30
      Working Group 2: Operational Support 40-SS-C01

      CERN

    • 09:00 12:30
      Working Group 3: User Support 13/2-005

      CERN

      90
    • 09:00 12:30
      Working Group 4: Fabric Management 13/3-005

      CERN

      20

      original room was: 13-1-017

    • 09:10 12:30
      Working Group 5: Software Management 13/3-005

      CERN

      20
    • 13:30 17:00
      Plenary Session III 40-SS-C01

      CERN

      The working groups report on what they have achieved. The slides should show an operational plan, with a schedule for when it can be achieved and the names of the people responsible for it.

      • 13:30
        Report from WG 1 30m
        Speaker: Ian Neilson (CERN)
      • 14:00
        Report from WG 2 30m
        Speaker: Ian Bird (CERN)
      • 14:30
        Report from WG 3 30m
        Speaker: Flavia Donno (CERN)
      • 15:00
        COFFEE 30m
      • 15:30
        Report from WG 4 30m
        Speaker: Davide Salomoni (NIKHEF)
      • 16:00
        Report from WG 5 30m
        Speaker: Steve Traylen (RAL)
      • 16:30
        Summary 30m
        Speaker: Ian Bird (CERN)
    • 17:00 18:00
      DRINK 501--

      CERN