Managing a Scientific Computing Facility with OpenNebula
Sara Vallero, on behalf of the INFN-Torino computing team
OpenNebula Conf - December 2-4, 2014 - Berlin
The present work is partially funded under contract 20108T4XTM of Programmi di Ricerca Scientifica di Rilevante Interesse Nazionale (Italy).
The INFN Torino Computing Centre
STORAGE RESOURCES
• 1600 TB (gross) total

COMPUTATIONAL RESOURCES
• 69 hypervisors (KVM)
• 1200 job slots
• 200 virtual machines

LAN/WAN
• 10 Gbps links
Cloud project started in 2009
INFN: Italian National Institute for Nuclear Physics
• fundamental physics studies
• several units in major Italian cities

Torino Unit stakeholders:
• WLCG grid Tier-2 (primarily for the ALICE experiment at CERN)
• grid Tier-2 for the BESIII experiment at IHEP Beijing
• computing for upcoming experiments: PANDA at FAIR Darmstadt, Belle II at KEK Tsukuba
• Virtual Analysis Facility for ALICE (interactive analysis, elasticity)
• Medical Image Processing (local research group)
• Theory (local research group)
• virtual farms on demand
Two clusters for different VM classes:

SERVICES-CLASS VMs (pets)
• provide critical services
• in/out-bound connectivity
• live migration
• server-class hardware
• no particular local disk I/O requirements
• shared image repository
• resiliency-optimized file system for shared system disks (RAID1)

WORKERS-CLASS VMs (cattle)
• computational workforce (e.g. grid worker nodes)
• private IP only
• high storage I/O performance
• lower-class hardware
• locally cached image repository for fast start-up
• performance-optimized file system

[Slide diagram: storage servers export the image-repository datastore; a Gluster replicated volume backs the shared datastore for running services-class VMs; the workers cluster uses a local cache of the image-repository datastore.]
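As a sketch of how these two storage flavours can map onto OpenNebula concepts (the datastore names are invented, and the driver choices shown are one plausible configuration, not necessarily the exact one used in Torino), the shared and cached system datastores could be declared as:

```
# services_shared.ds -- system datastore on the Gluster replicated volume,
# mounted on every hypervisor, so services-class VMs can live-migrate
NAME   = "services_shared"
TYPE   = SYSTEM_DS
TM_MAD = shared

# workers_local.ds -- images are copied to the hypervisor's local disk,
# trading live migration for better local I/O performance
NAME   = "workers_local"
TYPE   = SYSTEM_DS
TM_MAD = ssh
```

The `shared` transfer driver assumes a file system visible to all hosts, while `ssh` copies images to the target host, which matches the cattle-style, locally cached worker setup.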
Current and planned activities

1. Toolkit for virtual-farm on-demand provisioning
• Virtual Routers (OpenWRT appliances)
• elastic public IP
• iSCSI datastore for persistent disk space
• EC2 interface
• CloudInit contextualisation

2. Elasticity
• automatic reallocation of VMs according to the application's needs (where appropriate)
• caveat: this works only in the infinite-resources approximation, while we usually run at saturation
• in place only for the Virtual Analysis Facility so far

3. National federated cloud for scientific computing
• upcoming INFN-wide project, mostly based on OpenStack
• need to interoperate with OpenStack-based geographical services (e.g. Keystone)

4. Monitoring as a service
• based on the ELK stack (Elasticsearch, Logstash, Kibana)
• uniform monitoring interface for applications and infrastructure
5. Move to the new OpenNebula tools
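For point 4, a minimal Logstash pipeline along these lines could feed hypervisor logs into Elasticsearch for a uniform Kibana dashboard (a hypothetical fragment: the port, the pass-through filter, and the Elasticsearch endpoint are placeholders, not the Torino configuration):

```
input {
  # receive syslog from the hypervisors
  syslog { port => 5514 }
}
filter {
  # tag events by source so applications and infrastructure
  # can share one dashboard
  mutate { add_field => { "facility_role" => "hypervisor" } }
}
output {
  elasticsearch { hosts => ["localhost:9200"] }
}
```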
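The elasticity logic of point 2 can be illustrated with a toy scale-up/scale-down rule (hypothetical code, not the actual Virtual Analysis Facility implementation; the function name and parameters are invented for illustration). It captures the caveat above: the number of VMs you would like to start is capped by the free capacity, so at saturation little or no scaling happens.

```python
def rescale(waiting_jobs: int, running_vms: int,
            free_slots: int, slots_per_vm: int = 4) -> int:
    """Return the change in worker-VM count (positive = start, negative = stop).

    Hypothetical rule: size the farm to the job queue, but never request
    more VMs than the cloud's free job slots can host.
    """
    wanted = -(-waiting_jobs // slots_per_vm)  # ceil(waiting_jobs / slots_per_vm)
    if wanted > 0:
        # Infinite-resources approximation would start `wanted` VMs;
        # at saturation we are capped by the free slots.
        return min(wanted, free_slots // slots_per_vm)
    if waiting_jobs == 0 and running_vms > 0:
        return -running_vms  # drain idle workers
    return 0
```

For example, with 10 waiting jobs and plenty of free slots the rule starts 3 four-slot VMs, while with only 4 free slots it starts just 1, and with none it does nothing.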