The Next Generation Root File Server
Andrew Hanushevsky, Stanford Linear Accelerator Center
27-September-2004
http://xrootd.slac.stanford.edu
27-Sep-04 2: xrootd
What is xrootd?
  File Server
    Provides high performance file-based access
    Scalable, extensible, naively usable
  Fault tolerant
    Server failures handled in a natural way
    Servers may be dynamically added and removed
  Secure
    Framework allows use of almost any protocol
  Rootd Compatible
Goals II
  Simplicity
    Can run xrootd out of the box
      No config file needed for non-complicated/small installations
  Generality
    Can configure xrootd for ultimate performance
      Meant for intermediate to large-scale sites
How is high performance achieved?
  Rich but efficient server protocol
    Combines file serving with P2P elements
    Allows client hints for improved performance
      Pre-read, prepare, client access & processing hints
    Multiplexed request stream
      Multiple parallel requests allowed per client
  An extensible base architecture
    Heavily multi-threaded
      Clients are given dedicated threads whenever possible
    Extensive use of OS I/O features
      Async I/O, device polling, etc.
    Load adaptive reconfiguration
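
A multiplexed request stream can be pictured as follows. This is a minimal sketch, not the xrootd wire protocol: the class name RequestMultiplexer, its methods, and the request strings are all hypothetical. It only shows the idea that one client keeps several requests outstanding, each tagged with a stream id, and matches responses that arrive in any order.

```python
from itertools import count


class RequestMultiplexer:
    """Toy bookkeeping for several in-flight requests (hypothetical, illustrative only)."""

    def __init__(self):
        self._next_id = count(1)   # stream ids handed out in order
        self._pending = {}         # stream id -> request still awaiting a response

    def send(self, request):
        """Register a request and return the stream id that tags it."""
        stream_id = next(self._next_id)
        self._pending[stream_id] = request
        return stream_id

    def receive(self, stream_id, data):
        """Match a response to its request, whatever order responses arrive in."""
        request = self._pending.pop(stream_id)
        return request, data

    def outstanding(self):
        """Number of requests that have not been answered yet."""
        return len(self._pending)


mux = RequestMultiplexer()
first = mux.send("read block 0")
second = mux.send("read block 1")
third = mux.send("read block 2")

# Responses come back out of order; the stream id pairs each with its request.
answered = [mux.receive(third, b"cc"), mux.receive(first, b"aa")]
```

After the two responses above, one request (the second) is still outstanding and can be answered later without blocking the others.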
xrootd Server Architecture
[Figure: server architecture diagram. Boxes: Protocol Layer, Filesystem Logical Layer, Filesystem Physical Layer, Filesystem Implementation, Protocol & Thread Manager. Annotations: "(included in distribution)" and "p2p heart".]
Rootd Bilateral Compatibility
[Figure: two panels. "Client-Side Compatibility": an Application with the TXNetFile and TNetFile clients, and xrootd and rootd servers. "Server-Side Compatibility": an Application with TNetFile, and an xrootd server with rootd compatibility.]
How performant is it?
  Can deliver data at disk speeds (streaming mode)
    Assuming good network & proper TCP buffer size
  Low CPU overhead
    75% less CPU than NFS for same data load
    It is memory hungry, however
  General requirements
    Middling speed machine
      The more CPUs the better
    1-2 GB of RAM
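
On "proper TCP buffer size": a common rule of thumb (not stated on the slide) is to size the buffer to the bandwidth-delay product of the path. The sketch below assumes that rule; the function name and the 1 Gbit/s, 150 ms example link are illustrative values, not measurements from this talk.

```python
def tcp_buffer_bytes(bandwidth_mbit_per_s: int, rtt_ms: int) -> int:
    """Bandwidth-delay product in bytes.

    bits in flight = (bandwidth_mbit_per_s * 1_000_000) * (rtt_ms / 1000)
    bytes in flight = bits in flight / 8 = bandwidth_mbit_per_s * rtt_ms * 125
    """
    return bandwidth_mbit_per_s * rtt_ms * 125


# Assumed example links (hypothetical numbers).
wan_buffer = tcp_buffer_bytes(1000, 150)  # long-distance path
lan_buffer = tcp_buffer_bytes(1000, 1)    # local network
```

With these assumed numbers the long-distance path needs a buffer 150 times larger than the LAN path to keep the link full, which is why a default buffer size that works locally can throttle wide-area streaming.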
How is scalability achieved?
  Protocol allows server scalability
    Server-directed I/O segmenting
    Request deferral to pace client load
    Unsolicited responses for ad hoc client steering
    P2P elements for lashing servers together
  Request redirection is the key element
    Integrated with a P2P control network
      olbd servers provide control information
How does it scale?
  xrootd scales in multiple dimensions
    Can run multiple load-balanced xrootd servers
      Provides single uniform name and data space
      Scales from 1 to over 32,000 cooperating data servers
    Architected as self-configuring structured peer-to-peer (SP2) data servers
      Servers can be added & removed at any time
    Client (TXNetFile) understands SP2 configurations
      xrootd informs client when running in this mode
      Client has more recovery options in the event of failure
Load Balancing Implementation
  Control Interface (olbd)
    Load balancing meta operations
      Find files, change status, forwarded requests
  Data Interface (xrootd)
    Data is provided to clients
    Interfaces to olbd via the ofs layer
  Separation is important
    Allows use of any protocol
    Client need not know the control protocol
Entities & Relationships
[Figure: Data Clients, Redirectors, and Data Servers. Each redirector and data server runs an xrootd and an olbd; olbd instances are marked M (manager) or S (server). Links are labeled "data" and "ctl".]
  olbd Control Network
    Managers & Servers (resource info, file location)
  xrootd Data Network
    Redirectors steer clients to data
    Data servers provide data
Typical SP2 Configuration
[Figure: a redirector performing "Dynamic Selection" among four servers, each connected to it by an arrow labeled "subscribe".]
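
The subscribe and dynamic-selection relationship in this configuration can be sketched as below. It is a toy model only: the Redirector class, its methods, the load figures, and the least-loaded selection rule are assumptions made for illustration; the slide does not say which selection policy olbd applies.

```python
class Redirector:
    """Toy redirector that tracks subscribed data servers (hypothetical API)."""

    def __init__(self):
        self._servers = {}  # server name -> {"files": set of paths, "load": float}

    def subscribe(self, name, files, load):
        """A data server announces itself, its files, and its current load."""
        self._servers[name] = {"files": set(files), "load": load}

    def unsubscribe(self, name):
        """A data server leaves; later selections simply ignore it."""
        self._servers.pop(name, None)

    def select(self, path):
        """Pick the least-loaded subscriber holding the file (assumed policy)."""
        candidates = [
            (info["load"], name)
            for name, info in self._servers.items()
            if path in info["files"]
        ]
        if not candidates:
            return None
        return min(candidates)[1]


redirector = Redirector()
redirector.subscribe("serverA", ["/store/run1.root"], load=0.7)
redirector.subscribe("serverB", ["/store/run1.root", "/store/run2.root"], load=0.2)

chosen = redirector.select("/store/run1.root")      # both hold it; serverB is less loaded
redirector.unsubscribe("serverB")                   # serverB goes away
fallback = redirector.select("/store/run1.root")    # selection adapts without reconfiguration
```

Because servers announce themselves, the redirector needs no static list of data servers, which is the self-configuring property the SP2 design relies on.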
Example: SLAC Configuration
[Figure: client machines; hosts kan01, kan02, kan03, kan04, ..., kanxx; hosts bbr-olb03, bbr-olb04, kanolb-a.]
Why do this?
  Can transparently & incrementally scale
    Servers can come and go
      Load balancing effects recovery
    New servers can be added at any time
    Servers may be brought down for maintenance
    Files can be moved around in real time
  Client simply adjusts to the new configuration
    TXNetFile object handles recovery protocol
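
Client-side recovery can be illustrated with a short sketch. It is not the TXNetFile recovery protocol: read_with_recovery, locate, read_from, and the server names are hypothetical, and the only point shown is that a client which loses its server asks again and is steered to another copy.

```python
def read_with_recovery(path, locate, read_from, max_attempts=3):
    """Read a file, asking for another server whenever the current one fails."""
    failed = []
    for _ in range(max_attempts):
        server = locate(path, failed)      # hypothetical lookup, skipping failed servers
        if server is None:
            break
        try:
            return server, read_from(server, path)
        except ConnectionError:
            failed.append(server)          # remember the failure and try elsewhere
    raise OSError(f"no reachable server holds {path}")


# Toy environment: two replicas of one file, the first server is down.
replicas = {"/store/run1.root": ["serverA", "serverB"]}
down = {"serverA"}


def locate(path, failed):
    """Return the first replica holder that has not already failed."""
    for server in replicas.get(path, []):
        if server not in failed:
            return server
    return None


def read_from(server, path):
    """Simulate a read; servers listed in `down` refuse the connection."""
    if server in down:
        raise ConnectionError(server)
    return b"data from " + server.encode()


served_by, data = read_with_recovery("/store/run1.root", locate, read_from)
```

The application code that called read_with_recovery sees only the data; the failed server and the second lookup stay inside the client library.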
What we have seen
  For a single server:
    1,000 simultaneous clients
    2,200 simultaneous open files
  Bottlenecks
    Disk I/O (memory next behind)
What Have We Heard
  The system is too stable
    Users run extra-long jobs (1-2 weeks) now
      Errors are not discovered until weeks later
  The system is too aggressive
    New servers are immediately taken over
      Easy configuration but startling for administrators
Next: Getting remote data
[Figure: xrootd servers at SLAC, IN2P3, and RAL, with an RAL proxy and an IN2P3 proxy.]
  Firewalls require proxy servers
Proxy Service
  Attempts to address competing goals
    Security
      Deal with firewalls
    Scalability
      Administrative
      Configuration
    Performance
      Ad hoc forwarding for near-zero wait time
      Intelligent caching in local domain
Proxy Implementation
  Uses capabilities of olbd and xrootd
    Simply an extension of local load balancing
    Implemented as a special file system type
      Interfaces in the ofs layer
      Functions in the oss layer
  Primary developer is Heinz Stockinger
Proxy Interactions
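[Figure: proxy interactions between RAL and SLAC. One site shows client machines and hosts red01, data02, data03, and proxy01, with a local olb and a proxy olb; the other shows hosts data01, data02, data03, and data04, with a local olb and a proxy olb. Arrows are numbered 1 through 5.]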
Why This Arrangement?
  Minimizes cross-domain knowledge
    Necessary for scalability in all areas
      Security
      Configuration
      Fault tolerance & recovery
Scalable Proxy Security
[Figure: SLAC proxy olbd and RAL proxy olbd, each with its data servers; arrows are numbered to match the steps below.]
  1. Authenticate & develop session key
  2. Distribute session key to authenticated subscribers
  3. Data servers can log into each other using session key
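
The three steps can be sketched with standard-library primitives. This is an assumption-laden illustration, not the scheme xrootd uses: HMAC-SHA256 is swapped in as a stand-in for whatever the security framework negotiates, and new_session_key, login_token, verify_login, and the server names are hypothetical.

```python
import hashlib
import hmac
import secrets


def new_session_key() -> bytes:
    """Step 1 (stand-in): the two proxy olbds agree on a random session key."""
    return secrets.token_bytes(32)


def login_token(session_key: bytes, server_name: str) -> str:
    """Step 3, sender side: prove possession of the key without sending it."""
    return hmac.new(session_key, server_name.encode(), hashlib.sha256).hexdigest()


def verify_login(session_key: bytes, server_name: str, token: str) -> bool:
    """Step 3, receiver side: recompute the token and compare in constant time."""
    return hmac.compare_digest(login_token(session_key, server_name), token)


session_key = new_session_key()

# Step 2 (stand-in): each authenticated subscriber receives a copy of the key.
subscriber_keys = {name: session_key for name in ("dataA", "dataB")}

# dataA logs into dataB; dataB checks the token with its own copy of the key.
token = login_token(subscriber_keys["dataA"], "dataA")
accepted = verify_login(subscriber_keys["dataB"], "dataA", token)
```

The design point this mirrors is that only the two proxy olbds authenticate across domains; data servers need just the shared session key, so adding servers does not add cross-domain credentials.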
Proxy Performance
  Introduces minimal latency overhead
    Virtually undetectable from US/Europe
    Negligible on faster links
      2% slower on fast US/US links
      10% slower on LAN
  Can be further improved
    Parallel streams
    Better window size calculation
    Asynchronous I/O
Conclusion
  xrootd provides high performance file access
    Unique performance, usability, scalability, security, compatibility, and recoverability characteristics
    Should scale to tens of thousands of clients
    Can support tens of thousands of servers
  Distributed as part of the CERN root package
    Open software, supported by SLAC (server) and INFN-Padova (client)