TRANSCRIPT
2013 Storage Developer Conference. © Intel All Rights Reserved.
NFS on Steroids: Building Worldwide
Distributed File System
Gregory Touretsky
Intel IT
Legal Notices
This presentation is for informational purposes only. INTEL MAKES NO WARRANTIES, EXPRESS OR IMPLIED, IN THIS SUMMARY.
Intel and the Intel logo are trademarks of Intel Corporation in the U.S. and/or other countries.
* Other names and brands may be claimed as the property of others.
Copyright © 2013, Intel Corporation. All rights reserved.
Agenda
Intel IT and Design Computing infrastructure
Global Data Access challenges
Considered solutions
Selected direction
Remote NFS access
Kerberos authentication
Implementation details
2013 Intel IT Vital Statistics
6,500 IT employees @ 59 global IT sites
>95,200 Intel employees @ 164 Intel sites / 63 countries
68 Data Centers
>147,000 Devices
Design Site Infrastructure
[Diagram: design-site infrastructure – interactive pool, remote desktop, batch pool, NFS file servers, NIS, cron servers, application servers, configuration management, events monitoring, name space, Samba]
Many sites, many projects
[Diagram: Projects A, B and C, each spanning several sites]
Challenge: provide global data access solutions without compromising security
2013 Storage Developer Conference. © Intel All Rights Reserved.
Cross-site data access: 2012
[Diagram: Sites A, B and C, each with its own NFS servers (NFS1–NFS4) exporting file systems (Export11–Export43) to local batch and interactive clients; cross-site data movement relies on rsync]
• Replication is time-consuming; it is hard to define the specific dataset to copy
• Only a subset of data is available at remote sites
• 16-GID constraint
Global User / Group accounts
A global project requires global user and group accounts.
A user may have 16 groups at site A, a different 16 groups at site B, and no account at site C.
Goals
Every file to be accessible from anywhere
Same path access everywhere
Server downtime shouldn't impact clients that aren't accessing data on that server
WAN-friendly
Every user account and every group available at every site
Solve the 16-GID limitation
Local I/O performance shouldn't be compromised
Leverage the existing NAS infrastructure
Remote access - options
OpenAFS
Cloud storage
NFS client-side caching
WAN optimization
NFS site-level caching
Proprietary
Open source (NFS Ganesha)
In-house development
*Other names and brands may be claimed as the property of others.
Direct NFS mount over WAN-optimized tunnel
Limited optimizer cache size
 Wraps the data store in days
 No data pinning
NFS ops terminate at the remote site (getattr/access)
 220 ms × 100,000 NFS ops = 22,000 sec
Multiple potential routes → cache miss
 The same client going via different routes (experienced during testing) gets a different response from the other side
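The latency arithmetic above can be reproduced with a back-of-the-envelope model (a sketch, not a benchmark; the 0.5 ms LAN round trip is an illustrative assumption):

```python
# Cost of terminating NFS metadata ops (getattr/access) over the WAN,
# assuming every op pays one full round trip: no caching, no pipelining.

def wan_time_seconds(ops: int, rtt_ms: float) -> float:
    """Total wall-clock time if each op waits one round trip."""
    return ops * rtt_ms / 1000.0

# The slide's example: 100,000 metadata ops at a 220 ms WAN RTT.
remote = wan_time_seconds(100_000, 220)   # 22,000 s, i.e. ~6 hours
# The same ops answered by a site-local cache (assumed 0.5 ms RTT).
local = wan_time_seconds(100_000, 0.5)

print(f"over WAN:  {remote:,.0f} s")
print(f"via cache: {local:,.0f} s")
```

This is why terminating getattr/access locally, rather than optimizing the tunnel, is the deciding factor.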
Selected: Site-level NFS caching and Kerberos
[Diagram: Sites A, B and C linked by WAN optimization; each site keeps its local NFS servers and exports, while a local cache appliance re-exports remote exports (e.g. Export11 from Site A is cached at Sites B and C) over Kerberized NFS (NFS+KRB)]
• A read/write cache appliance "mounts" the remote file system and re-exports it
• Kerberized NFS access increased the number of GIDs per user
Cache implementations
First implementation: vendor-specific
Evaluating alternatives
Read Cache vs. Read/Write Cache
Consistency vs. Performance
Attribute cache timeout / Max writeback delay
Optimizations
Delegations
Proactive attribute validation for hot files
Cache pre-population
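The attribute-cache-timeout trade-off above can be illustrated with a toy model (purely a sketch; the real appliance's internals are not described in this talk). A longer timeout means fewer revalidation round trips over the WAN, at the cost of a larger window in which stale attributes may be served:

```python
import time

class AttrCache:
    """Toy attribute cache illustrating the actimeo-style trade-off
    between consistency and performance."""

    def __init__(self, timeout_s: float, fetch):
        self.timeout_s = timeout_s
        self.fetch = fetch          # callable that hits the origin server
        self._cache = {}            # path -> (attrs, fetched_at)

    def getattr(self, path, now=None):
        now = time.monotonic() if now is None else now
        entry = self._cache.get(path)
        if entry and now - entry[1] < self.timeout_s:
            return entry[0]          # served locally, no WAN round trip
        attrs = self.fetch(path)     # miss or expired: revalidate at origin
        self._cache[path] = (attrs, now)
        return attrs
```

Delegations and proactive validation of hot files, mentioned above, are ways to keep entries fresh without the client paying the revalidation latency.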
Where is it problematic?
• Read-once over a high-latency link
• First read of a large file during interactive work
• Large percentage of non-cacheable ops
• Infrequent access – beyond actimeo
• Business continuity requirements
Cache management implementation
Goal: Self-Service management for data caching
Today: requires coordination between the customer and several storage administrators
Use cases:
Cache my disk at sites A, B
Modify cache parameters
Remove cache
Migrate source/cache
Get cache usage statistics
Shared capacity management
…
Expanding our Storage and Replication management framework
Management system – example

> stodstatus areas --cell pdx --fields
project,path,sizegb,usagegb,cachedcells 'cachedcells=~"png"'
-----------------------------------------------------------------
Project Path                 SizeGB  UsageGB CachedCells
-----------------------------------------------------------------
P1      /nfs/pdx/proj/path1  200.000 170.944 png
P2      /nfs/pdx/proj/path2  200.000 137.627 png
P3      /nfs/pdx/disks/path3   8.000   6.582 png
P3      /nfs/pdx/disks/path4   1.000   0.021 png cr
P3      /nfs/pdx/disks/path5   1.000   0.000 png
P4      /nfs/pdx/disks/path6 700.000 388.860 ibw png
-----------------------------------------------------------------

> stodcache create --cell pdx --path /nfs/pdx/proj/path1 --cache iil
Cached area with path /nfs/pdx/proj/path1 (id pdx.213204) successfully created on iil
Cache capacity planning
Goal: every file to be accessible on-demand everywhere
Track cache usage by Org/Project
Shared cache capacity, multi-tenant
Initial "rule of thumb": 7-10% of the source capacity
Seeding capacity at key locations
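The rule of thumb above is easy to turn into a planning helper (a trivial sketch; the 7-10% band is the slide's initial estimate, not a measured result):

```python
# Initial cache sizing: provision 7-10% of the source capacity
# at each caching site, per the rule of thumb above.

def cache_size_gb(source_gb: float, low: float = 0.07, high: float = 0.10):
    """Return the (low, high) cache sizing range for a given source."""
    return source_gb * low, source_gb * high

lo, hi = cache_size_gb(50_000)   # e.g. a 50 TB source
print(f"plan {lo:,.0f}-{hi:,.0f} GB of cache")
```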
Usage models
• Remote validation
• Write once, Read many
• Get results back from remote sites
• Write once, Read once
• Drop box
• Generate in one site, get anywhere
• Single home directory
• Quick remote environment setup
• Data access from branch locations
NFS (RPC) authentication
AUTH_SYS:
• User is authenticated by the Linux client
• Client tells the server the User ID and a list of Group IDs (up to 16)
[On the wire: UID, primary GID, GID1–16]

RPCSEC_GSS (KRB5):
• User is authenticated via the KDC on the client
• Server defines which groups the user belongs to (128/256/1024)
[On the wire: Kerberos ticket]
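The AUTH_SYS half of this comparison can be made concrete: the AUTH_UNIX credential body (RFC 5531) carries a group array that is capped at 16 entries, so membership beyond the 16th group simply cannot be expressed on the wire. A minimal XDR-packing sketch (illustrative, not a full RPC implementation):

```python
import struct

AUTH_SYS_MAX_GIDS = 16  # fixed by the AUTH_UNIX credential format (RFC 5531)

def pack_auth_sys(stamp: int, machine: str, uid: int, gid: int, gids):
    """Pack an AUTH_SYS (AUTH_UNIX) RPC credential body in XDR.
    Groups beyond the 16th are silently dropped - the root of the
    '16 GIDs' constraint."""
    gids = list(gids)[:AUTH_SYS_MAX_GIDS]
    name = machine.encode()
    pad = (4 - len(name) % 4) % 4            # XDR strings are 4-byte aligned
    body = struct.pack(">I", stamp)
    body += struct.pack(">I", len(name)) + name + b"\x00" * pad
    body += struct.pack(">II", uid, gid)
    body += struct.pack(">I", len(gids))
    body += b"".join(struct.pack(">I", g) for g in gids)
    return body

# A user in 30 groups: only the first 16 make it into the credential.
cred = pack_auth_sys(0, "client1", 1000, 100, list(range(30)))
```

With RPCSEC_GSS the credential is a Kerberos ticket instead, and the server resolves group membership itself, which is why the limit disappears.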
Bringing Kerberos?
[Diagram: the same design-site infrastructure as before – interactive pool, remote desktop, batch pool, NFS file servers, NIS, cron servers, application servers, configuration management, events monitoring, name space, Samba – all of which Kerberos touches]
Touching everything
Linux client
NFS file servers
SSH
Batch scheduler
Remote Desktop / interactive servers
Name space and automounter
Trusted hosts vs. regular hosts
Samba
Setuid / sudo
Cron jobs and service accounts
Keytab management system
Transparent Kerberos environment – in the Linux world
Supporting Transition
Can’t jeopardize projects at critical phases
Unknown stability issues
Unknown tools’ issues
Unknown performance issues
The AUTH_SYS → RPCSEC_GSS transition will be gradual
How to allow the same user @ the same NFS client:
 16-GID limit for AUTH_SYS mounts
 >16 GIDs for RPCSEC_GSS mounts
Solution: 2nd NIS domain
2 distinct group views are needed during the transition from non-KRB to KRB:
 NFS file servers: full group membership
 NFS clients: up to 16 GIDs
The site's NIS domain must be partially "mirrored" for the storage, with a slightly different view
On the NIS master, the domain will be mirrored to <domain>.storage
Some maps require data munging, as the domain name may appear in keys and values
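The munging step can be sketched as a simple rewrite over map entries (a hypothetical illustration; the map contents, domain names, and the `mirror_map` helper are invented for the example, not the actual tooling):

```python
# Sketch of mirroring a NIS map into a "<domain>.storage" domain,
# rewriting the domain name wherever it appears in keys and values.

def mirror_map(entries: dict, domain: str, suffix: str = ".storage") -> dict:
    """Return a copy of the map with 'domain' rewritten to 'domain + suffix'."""
    storage = domain + suffix
    return {
        k.replace(domain, storage): v.replace(domain, storage)
        for k, v in entries.items()
    }

# Illustrative netgroup-style entry embedding the domain in key and value.
netgroup = {
    "eng-hosts.corp.example.com": "host1.corp.example.com host2.corp.example.com"
}
mirrored = mirror_map(netgroup, "corp.example.com")
```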
Welcome on board GDA Airlines
Customer 1: Works like magic! It's like running in site A, while it's actually in site B!
Customer 2: It enables transparency across sites that just wasn't possible until now. We see a big gain and I'm more than happy to pioneer this with IT! This is actually one of the biggest methodology advancements I've seen in years from a database perspective!
Customer 3: True cloud computing… Three cheers to you guys who brought this solution… I am sure this will revolutionize remote site usage
Summary
NFSv3 can be accessed over the WAN – using an NFS caching proxy
 Some workloads do better than others
An NFSv3 environment can be Kerberized
 Major effort is required
 The transition is challenging
 It would be as challenging for NFSv4/KRB
Facing similar challenges? Contact
Sharing Intel IT Best Practices with the World – IT@Intel
Learn more about Intel IT’s Initiatives at www.intel.com/IT