vice president and chief architect beepy@netappnfs: its applications and future - lisa ‘04 nfs,...
TRANSCRIPT
NFS: Its applications and future - LISA ‘04
NFS, its applicationsand future
Brian PawlowskiVice President and Chief [email protected]
NFS: Its applications and future - LISA ‘04
Who am I? Why am I here?
NFS: Its applications and future - LISA ‘04
The arc of the presentation
• What is NFS?
• The evolution of NFS
• NFS Version 4
• Drill down: Linux NFS Version 4
• What about iSCSI?
• Linux compute clusters
• NFS in context
• Challenges for NFS
With occasional sidetracks…
NFS: Its applications and future - LISA ‘04
What is NFS?
NFS: Its applications and future - LISA ‘04
What is NFS?
• NFS is a protocol for a distributed filesystem.– Based on Sun's RPC version 2 protocol
– Can export arbitrary local disk formats
• First revision, NFSv2, was published in 1985.– It exports basic POSIX 32-bit filesystems
– Slow, particularly for writing files
• NFSv3, was published in 1994– Extended to 64-bit files & improved write caching
– It is perhaps the most commonly used protocol forsharing files on *NIX/Linux LANs today
NFS: Its applications and future - LISA ‘04
Remember these file systems?
• Apollo Domain
• AT&T Remote File System (RFS)
• Andrew File System (AFS)
• Distributed File System (DFS)
(NFS Version 4 is influenced by AFS)
NFS: Its applications and future - LISA ‘04
NFS Today
• “It was 20 years ago today.”– SCSI and NFS grew up together
• Transformed from something you turn on in a UNIXrelease to a well-defined storage segment
• Home directories• Large partitionable tasks that may run as parallel
threads– Typical applications include search engines, e-mail,
animation and rendering, scientific simulations, andengineering
• Scalable databases• GRID computing
NFS: Its applications and future - LISA ‘04
NFS Version 4
NFS: Its applications and future - LISA ‘04
NFS Version 4
• Openly specified distributed filesystem– NFSv2/v3 quasi-open with Informational RFC– Windows, AFS, DFS not “open”
• Well-suited for complex WAN deployment and fire-walled architectures– Reduced latency, public key security
• Strong security– Public and Private key– Fine-grained access control
• Improved multi-platform support• Extensible
– Lays groundwork for migration/replication and globalnaming
NFS: Its applications and future - LISA ‘04
The IETF process and NFS
Working Group Draft
Strawman Proposal from Sun
Proposed Standard RFC 3010
Draft Standard
Internet Standard apotheosis
1998
1999
2000
2002
Sun/IETF Agreement
Meetings, writing, e-mailPrototyping by 5 organizations
BOF, working group forms
Additional prototypingSix working group draftsWorking Group Last CallIETF Last CallIESG ReviewAssign RFC number
2001
Proposed Standard RFC 3530Two independent implementations6+ months
NFS: Its applications and future - LISA ‘04
Couple things we did right this time
• Open source reference implementations ofNFS Version 4 were funded early (started bySun)
• Interoperability events held 3 times a year– With focus of non-Connectathon events on NFS
Version 4
• Huge improvements in execution andcoordination over NFS Version 3
NFS: Its applications and future - LISA ‘04
NFS Protocol Stack
RPCSEC_GSS (RFC2203)RPC (RFC1831)XDR (RFC1832)
KerberosV5 (RFC1510)SPKM-3LIPKEY (RFC2847)
TCP*
NFSv4 (RFC3530)
NFS: Its applications and future - LISA ‘04
NFS Version 4 operations• ACCESS• CLOSE• COMMIT• CREATE• DELEGPURGE• DELEGRETURN• GETATTR• GETFH• LINK• LOCK• LOCKT• LOCKU• LOOKUP• LOOKUPP• NVERIFY• OPEN• OPENATTR• OPEN_CONFIRM• OPEN_DOWNGRADE
• PUTFH• PUTPUBFH• PUTROOTFH• READ• READDIR• READLINK• RENAME• RESTOREFH• SAVEFH• SECINFO• SETATTR• SETCLIENTID• SETCLIENTID_CONFIRM• VERIFY• WRITE• RELEASE_LOCKOWNER
NFS: Its applications and future - LISA ‘04
NFS operation aggregation
• The COMPOUND operation– NFS procedures are now groupable
• Potential for reduced latencies and roundtriptimes
• Part of framework for minor versioning
NFS: Its applications and future - LISA ‘04
Example: mount server:/test/dir
• Client generates this COMPOUND
PUTROOFHGETFHLOOKUP(test)GETFHGETATTR
SYMLINK_SUPPORTLINK_SUPPORTFH_EXPIRE_TYPETYPESUPPORTED_ATTRSFSID
SECINFO (dir)LOOKUP (dir)GETFHGETATTR
The operation formerly knownas MOUNT
NFS: Its applications and future - LISA ‘04
NFS Version 4 is secure
• Mandatory to implement– Optional to use– Extensible via GSSAPI RPC– Kerberos V5 available now– Public key flavor emerging
• Security negotiated– Per file system policy– Continuous security– Security negotiation mechanisms
• Levels– Authentication– Integrity– Privacy
• ACLs (based on Windows NT)
NFS: Its applications and future - LISA ‘04
Specification vs. implementation
• RFC 3530 defines required, recommended andoptional features– The required features form core of interoperability
– Recommended and optional features are negotiated
• ACLs, for example, are a “recommended” attribute– Not required for compliance
– Dependent on underlying local file system support onserver (on Linux - that’s a lot of file systems)
– ACLs are ill-defined in *ix environs - mapping issues aretripping us up
NFS: Its applications and future - LISA ‘04
NFSv4 - Stateful
• Protocol is “session” oriented (OPEN call exists)– But in reality NFS Version 3 was also stateful via adjunct
Locking Protocol
• Lease-based recovery of state (simplified errorhandling)
• File locking integrated into protocol– OPEN provides atomicity for Windows integration
• Addition of delegations– Client in absence of sharing allowed to locally cache
data and locking state– This does not mean data sharing is defined in absence
of explicit locking
NFS: Its applications and future - LISA ‘04
Delegations (making lemonade)
• Use stateful design to enhance performance andscalability
• Enables aggressive caching on client– Shared read– Write exclusive
• Reduced roundtrips of the wire– Read, write, locking etc. cacheable locally– The fastest packet is the one never sent
• Server-based policy– Server manages conflicts– Sharing model reverts to NFS Version 3 when conflicted
NFS: Its applications and future - LISA ‘04
/
A
E
G IH
FD
CB
Server Local FS
C
/
D F
I
Pseudofs
The Pseudo file system
A
NFS: Its applications and future - LISA ‘04
Protocol vs. Implementation II
• Administration amongst the remaining *ixesdiffer
• Recommended and optional features requirenegotiation– Security
– Extensions
NFS: Its applications and future - LISA ‘04
Simplified server namespace
• Exported file systems mountable from a single serverroot– Root filehandle for the top of the file tree
– Still a client mount command
• For all shared/exported filesystems, server constructspseudo filesystems to span name space
• Client can still individually mount portions of the namespace
• Differing security policies can cover different parts ofexported space
NFS: Its applications and future - LISA ‘04
Firewall friendly
PORTMAPMOUNT
NFSv2/v3
ACL*
Port 111
DynamicPort 2049
STATUSLOCK/NLM
Dynamic
DynamicDynamic
NFSv4Port 2049
TCP}
NFS: Its applications and future - LISA ‘04
NFS Version 4 availability
• Network Appliance (Feb. ‘03), Hummingbird(late ‘02), Linux (via SuSE mid-’04), in RedHatFedora (May ‘04), RHEL 4.0 Dec. 04)– Must be explicitly enabled
• Solaris 10 imminent (uh, yesterday)– On by default
• IBM AIX 5L V5.3
• BSD (Darwin) (date?)
NFS: Its applications and future - LISA ‘04
The future of NFS Version 4
• Enhanced “Sessions” based NFS (correctness)• CCM - session-based security for IPsec• Directory delegations• Migration/replication completion
– Core protocol defines client/server failover behaviour– Definition of the server-server file system movement– Transparent reconfiguration from client viewpoint
• Proxy NFS file services (NFS caches)• Uniform global name space
– Features exist in the core protocol to support this
• Support for multi-realm security
NFS: Its applications and future - LISA ‘04
Scalability: Attacking the I/O bottleneck
• Remote Direct Memory Access– Bypasses CPU on client and server for networking
– Reduces the memory bottlenecks on high speednetworks such as Infiniband or 10G ethernet.
– See the NFSv4.1 RDMA extensions
• Parallel storage extensions (pNFS)– “Intelligent” clients are given (limited!) direct access to
the storage net using block protocols such as iSCSI etc.
– Bypasses server altogether for block file operations, butnot for metadata
NFS: Its applications and future - LISA ‘04
Drill down: Linux NFS
NFS: Its applications and future - LISA ‘04
Linux 2.6 – Recent NFS (3 and 4) clientchanges
• Support for O_EXCL file creation
• Cached ACCESS permission checks
• “Intents” allow unnecessary lookups andpermissions checks to be optimized away
• Asynchronous read/write improvements– Removed 256 page read/write request limit
– Async support also for r/wsize < PAGE_SIZE
• DIRECT_IO / uncached I/O
• RPCSEC_GSS
NFS: Its applications and future - LISA ‘04
Linux 2.6 NFS Version 4
• Framework added October ‘02 to Linux 2.5• Additional cleanups and features added over
past year– Performance work to get to V3 levels
• Delegations
– Framework for atomic OPEN– Memory management issues– Basic state management
• State recovery
– Bug fixes and code stabilization
NFS: Its applications and future - LISA ‘04
Linux 2.6 NFS Version 4
• Goal is to stabilize basic V4 and add further advancedfeatures– Will remain an EXPERIMENTAL feature dependent on
testing
– No NFS V4 ROOT (for diskless operation)
• Must address the user experience– State recovery under network partition
– NFSv4 perceived as complicated to administer• Strong security, names spaces, migration/replication, etc.
NFS: Its applications and future - LISA ‘04
Linux-2.6 – Generic server changes
• Adds upcall mechanism– Adds RPCSEC_GSS support (version 3 and 4)
– Id mapper
– Improves support for exports to NIS or DNSdomains.
• Zero-copy NIC support– Accelerates NFS READ calls over UDP+TCP
• 32K r/wsize support
NFS: Its applications and future - LISA ‘04
Linux 2.6.10 NFS V4 - What’s in
• All NFS ops and COMPOUND• GSSAPI
– LIPKEY/SPKM and Kerberos (authentication andintegrity)
• Single server mount• Locking• Client side delegations (read and write)
– Does not cache byte-range locks
• ID mapping (UID to names - user@domain)• O_DIRECT bug fixes• AutoFS support
NFS: Its applications and future - LISA ‘04
Linux 2.6.10 NFS V4 - TODO
• Security– RPCSEC_GSS privacy support– SECINFO– Keyrings
• Server side delegation support (going in now)• Migration/replication client support
– “Nohide” mount point crossing - RFC 3530 compliance
• ACL support• Named attribute support• Non-Posix server support (missing attribute
simulation)• Volatile file handle handling• Global name space
NFS: Its applications and future - LISA ‘04
What about iSCSI?
NFS: Its applications and future - LISA ‘04
iSCSI background
SCSI ProtocolSCSI Protocol
IPIP
TCPTCP
iSCSIiSCSI
FibreChannelFibreChannel
FCPFCP Parallel BusParallel Bus
TCP/IP Transport for SCSI command sets
SCSI block protocol access
Internet standard - RFC 3720
iSCSI is a direct replacement for FCP
FCP is the SAN fabric today
NFS: Its applications and future - LISA ‘04
The important thing about iSCSI is SCSI
EthernetHeader CRCIP
ChecksumChecksum
TCP iSCSI SCSI Data…
//
Addressing and routingand security
The important stuffand common with FCP
NFS: Its applications and future - LISA ‘04
“So, iSCSI is a replacement for NFS, right?”
• In the first iSCSI presentation I made to aprospect, this was first thing out IT manager’smouth
• I used to say “No”, but the thoughtsunderlying the question are interesting
NFS: Its applications and future - LISA ‘04
iSCSI value proposition
• Leverage existing Gigabit ! 10 Gigabitnetworking infrastructure
• Leverage existing rich set of managementtools
• Leverage existing base of skilled personnel
Reducing costs
NFS: Its applications and future - LISA ‘04
iSCSI points
• iSCSI software drivers freely and ubiquitouslyavailable– Windows platforms– Linux, and other *ixes
• HBAs and TOEs– Scale performance from software solution ! HW assist! full offload
• Saying “performance” and “iSCSI” in the same breaththough misses the point– Performance is not always the primary issue (else use
FC SAN)– Many application deployments have spare (CPU and I/O)
capacity– Optimize performance as needed
NFS: Its applications and future - LISA ‘04
For some, iSCSI represents the path of leastresistance
• It is semantically equivalent to FC SAN (SCSI)– But more familiar because of TCP/IP and
Ethernet - so friendly outside the data center
• Application migration is trivial– My remote booting desktop from FC to iSCSI
• Provides a path for easily reclaiming FC portcapacity by moving less critical apps to iSCSI
• With some of the important cost benefits ofNAS
NFS: Its applications and future - LISA ‘04
iSCSI is part of a solution
• Cost effective alternative to captured storage• Windows HCL afterburner• A place to move “less critical” SAN
applications freeing up FC capacity– The early adopter approach
• Friendly outside the data center– And manageable!
• Easily envision applications like remote C:drive– Remind me to tell you a story
• An additional tool in the storage toolbox
NFS: Its applications and future - LISA ‘04
Cluster computing and Linux
NFS: Its applications and future - LISA ‘04
The Old Way
Imagine Charlton Heston in a chariot.Imagine Charlton Heston in a chariot.
NFS: Its applications and future - LISA ‘04
The New Way
Imagine an airplane full of chickens.Imagine an airplane full of chickens.
NFS: Its applications and future - LISA ‘04
“The GRID” – what is it?
• Sets of toolkits or technologies to construct pools ofshared computing resources– Screen-savers that use millions of desktop machines to
analyse radio telescope data.
– Vertical applications in the enterprise on a large Linuxcluster.
– Middleware that ties together geographically separatedcomputing centres to satisfy the exponentiallyincreasing demands of scientific research.
• To many, “Grid computing” and “Cluster computing”are synonymous.
NFS: Its applications and future - LISA ‘04
Modern numerology (this is not important)
Number of filers per 1000 Linux nodes in GRID<9
The number of trusted minions to Linus10?
There is only one - Linus Torvalds1
Expected number of nodes in Linux compute cluster innext two years?
10000
Typical number of Linux nodes in a render or ECADsimulation farm today
2000
Maximum expected nodes in a Linux database cluster128
Number of physical processors in commodity pizzaboxes (poor man’s blade)
2
The preferred architecture for commodity computing(and oddly, the number of years it took the Red Sox towin the World Series)
86
NFS: Its applications and future - LISA ‘04
Scalable compute cluster
• Linux is ahead of the game– growing infrastructure, expertise and support
• It's all about choice!
• No! It’s all about freedom!
• Well, no actually, it’s all about cost.
NFS: Its applications and future - LISA ‘04
The rise of the Linux compute cluster
• Driven by cost
• Particularly the cost of cheap commodityCPUs– Monolithic application servers have no chance
competing with the Tflop/price ratio
• Choices exist– *BSD development marches on
– Sun is renewing its investment in Solaris x86
– Even Windows has a market share
NFS: Its applications and future - LISA ‘04
Compute cluster points
• The x86 platform won– Any questions?
• Support costs may still be significant– ...but largely offset by the hardware cost savings - it’s
about leveraging small MP commodity x86 hardware– Some customers choose to pay more for better quality
in order to lower support costs and improveperformance
– Maturation of “free software” - paying for support
• For Unix environments, NFS is the cluster file sharingprotocol of choice– Customers simply want storage solutions that scale as
easily as their compute clusters
NFS: Its applications and future - LISA ‘04
Applications: Batching
• Large partitionable tasks that may run asparallel threads– Clusters may include up to several thousand
nodes– Often uses LSF and other queueing tools to
manage jobs– Read-only data may be shared, but writing is
partitioned to avoid locking issues.– Typical applications include search engines, e-
mail, animation and rendering, scientificsimulations, and engineering
NFS: Its applications and future - LISA ‘04
Other applications
• Scalable databases– Fewer nodes than the batch case: up to a hundred or
so.
– High degree of write sharing necessitates heavier use oflocking
– Data integrity is often supported by specialized fencingand recovery techniques that may again need support inthe underlying filesystem.
• Extremely I/O intensive high performance computing
• GRID computing
NFS: Its applications and future - LISA ‘04
Cluster-friendly features in NFSv4
• Stateful model– ...but recovery remains client driven!– in case of a recoverable server outage, clients just retry
operations until they succeed
• Leases solve NFS Version 3 state leakage problemsdue to client outages– Cures the “lost locks” syndrome– But introduces new issues on network partition
• Delegations allow for aggressive caching– Stops short of a full coherency model, though– Callback mechanism is firewall-unfriendly.
NFS: Its applications and future - LISA ‘04
Cluster-friendly features in NFSv4
• GRID friendly– Obligatory support for strong security makes
NFS deployment over the internet possible• Including public key
• Includes support for data encryption
• Firewall-friendly: only port 2049 needs to beopened
• there are no side-band mount/lock/... protocols
• callbacks/delegations are not mandatory
NFS: Its applications and future - LISA ‘04
NFS in the short term future
• Aim to provide robust global name spaces
• Adaptations to work with GRIDs– GRIDNFS – project to adapt NFS to the Globus
GRID toolkit
– In the long run, we need improved cachingmodels for use with high latency environments
• Performance improvements using hardwareassisted networking– NFSv4.1 includes support for RDMA
NFS: Its applications and future - LISA ‘04
Storage in a clustered environment
• “Scale my storage as easily as I can scale my CPUs.”• Data sharing
– Data must be accessible to all compute nodes
• (high) data availability– No single point of failure
• Reliable data handling– No data corruption
• Security– In particular secure transport of data between compute
and storage nodes
• Support commodity TCP/IP and ethernet networking• Performance
NFS: Its applications and future - LISA ‘04
Scaling yet further
• Virtualization techniques permit horizontalscaling of storage using the current protocol– See NetApp/Spinnaker
• Parallel NFS – NFSv4 extensions to scalebeyond storage network bandwidth limits– Allow for striping files across several storage
units
– Explore SAN and object storage integration(NFSv4 as metadata server)
NFS: Its applications and future - LISA ‘04
Putting NFS in perspective
NFS: Its applications and future - LISA ‘04
Let’s put this in perspective
• “Wow. Michaelangelo, great statue - was that a7 inch chisel you used?”
• “Great flick Welles, what camera did youuse?”
• “Great quarter you guys had! Did you use NFSto access your financial data?”
NFS: Its applications and future - LISA ‘04
It’s about applications
• Applications drive storage choices– What does the application vendor support?
– What do they recommend?
– For example, Exchange is driving iSCSI in theWindows environment
• Mix of applications in a single enterprise– There is no one perfect storage approach
– There’s likely more than one vendor
NFS: Its applications and future - LISA ‘04
It’s about data management
• Integration of applications with datamanagement– Key applications like Exchange - application-
driven backup/restore– Fertile ground for virtualization - blurring line
between client application and storage
• Disaster recovery• Finding data when you need it
– Higher level data organization and grouping?
NFS: Its applications and future - LISA ‘04
Non Disruptive Migration
Vol A Vol B Vol C
NearStore
Vol BVol BVol B
Migration:Fast (on-demand)
Transparent to users
Migration of NameSpace or backendVolumes
Migrate aged data toNearStore
Global Name Space
NFS: Its applications and future - LISA ‘04
It’s about cost
• Ability to (re)provision, expand and managestorage to maintain high utilization will mostaffect overall cost long term
• Leveraging commodity networking– iSCSI and NFS are similar here
• Primary storage and Nearline support for allstorage access - transparently– Migration and replication
• Consolidation to reduce management costs
NFS: Its applications and future - LISA ‘04
Existing Storage Hierarchy
StorageNetwork
StorageNetwork
LAN or WAN
Filers
Servers
TapeLibrary
OpticalLibrary
Primary Storage Archive Targets
HeterogeneousStorage
Price/PerformanceGap
• The traditional two-tier hierarchy creates a large price/performance gap
• Current challenges need a storage solution that fills the gap
SlowVery Fast
$$$$$/ MB $/ MB
NFS: Its applications and future - LISA ‘04
Economics of recovery
• Acceptable downtime is applicationdependent
• Simplest – but most costly - approach is fullmirror online
• ATA drives are cheap
NFS: Its applications and future - LISA ‘04
Emerging Storage Hierarchy
NearStoreTM
Fast
$$/MB
StorageNetwork
StorageNetwork
LAN or WAN
NearStoreTM
Appliance
Backup Target/Reference Data
Filers
Servers
TapeLibrary
OpticalLibrary
Primary Storage Archive Targets
HeterogeneousStorage
SlowVery Fast
$$$$$/ MB $/ MB
NFS: Its applications and future - LISA ‘04
Understanding the context around NFS
• Regardless of data access protocol, similar issuesand solutions in data management– Common management regardless of access method
• That other operating systems (besides Linux) drivefundamental architecture decisions– Blade provisioning via NFS is a non-starter perhaps -
because of multi-OS support– Enter iSCSI - the least common denominator - the
Windows splash effect
• People don’t buy NFS servers– They buy Oracle or other applications– They build application compute clusters– And manage the data around it - with NFS perhaps
NFS: Its applications and future - LISA ‘04
beepy, are you saying there is nodifference between storagearchitectures?
NFS: Its applications and future - LISA ‘04
Differences are important
• NAS protocols define a file view - higher levelorganization and semantics– Enables sharing– Enables large compute clusters (>5,000 nodes)
• iSCSI, like FC SANs, provides simpler SCSI blockinterface– Higher level semantics via explicit file system
encapsulation– Sharing via layered cluster file system (complexity and
cost?)
• Customers will use and continue to explore a varietyof approaches
NFS: Its applications and future - LISA ‘04
The challenge for NFS
NFS: Its applications and future - LISA ‘04
Other than that Mrs. Lincoln…
• NFS = Network File System• NFS = Not For Speed• NFS = Not For Security
But an NFS vendor may sit there and think “My side ofthe boat is dry!”
Exactly.
First impressions last a long time, and hard toturnaround once set.
NFS: Its applications and future - LISA ‘04
So, back to NFS - what do customers want?
• No surprises– Customers really want a better NFS Version 3– Are we prepared to provide support for NFS Version 4?
• Reliability– Testing– Scalability
• Playing well with others– Agreeing on common administration models– Agreeing on common features (else we will drop things
from spec in IETF)
• Security– Administration needs to be simplified simplified
simplified
• Performance is at bottom of list I think
NFS: Its applications and future - LISA ‘04
• NFS– http://www.netapp.com/tech_library/ftp/3183.pdf (Linux NFS)– http://www.netapp.com/tech_library/hitz94.html (NFS Version 3)– http://www.beepy.com/SANEnfs.pdf (NFS and Linux clusters)
• NFS Version 4– http://www.ietf.org/rfc/rfc3530.txt– http://www.nluug.nl/events/sane2000/papers/pawlowski.pdf (OLD)– http://www.redbooks.ibm.com/abstracts/SG247204.html– http://www.nfsv4.org
• SAN and iSCSI case studies– http://www.netapp.com/case_studies/index_prod.html#san-nas
• SAN Performance Technical Report– http://www.netapp.com/tech_library/3321.html
Additional Resources
NFS: Its applications and future - LISA ‘04
Questions