Vendor lock-in-free storage at MSU

UPDATED SLIDES AT: HTTP://BIT.LY/CASITZFS

Upload: greg-mason

Post on 09-Aug-2015

TRANSCRIPT

VENDOR LOCK-IN-FREE STORAGE AT MICHIGAN STATE UNIVERSITY

GREG MASON, INSTITUTE FOR CYBER-ENABLED RESEARCH

WHO AM I

• Sysadmin at MSU for over 6 years

• Couple of years in industry before that doing operations.

• Primary engineer for HPC storage

• On the internet: [email protected], @nodoubleg

WHAT IS HPC?

• High Performance Computing

• Built with fast CPUs, low-latency high-bandwidth networks, fast storage, and batch job schedulers

MSU’S SCALE

• ~7,600 cores

• ~50TB RAM

• ~2PB storage

• ~2,000 software titles installed

MSU’S HPC WORKLOAD

• We serve everybody, from Ag Econ to Zoology

• Tuning anything for a specific workload is futile

• Chemistry

• Bioinformatics

MSU’S HPC STORAGE

• Persistent storage is all ZFS. Reasonably fast, reasonably available, cheap. Always safe*

• High-speed parallel storage is Lustre. Currently based on a modified ext4. Fast, only moderately reliable/safe.

• NetApp filer, to support VMware environment

ZFS AT MSU

• Run in production since 2009. OpenSolaris then, OpenZFS now.

• Over 1.5PB in production at iCER

• Even using it in odd places

TOPICS

• Benefits of ZFS

• Overview of ZFS

• Platforms with ZFS

• Build a ZFS-based system

• Potential pitfalls

• ZFS alternatives, if you must

• Storage of the future

(some) BENEFITS OF ZFS

• Checksum ALL THE THINGS!!1

• Integrated raid understands the objects it stores

• Copy-on-write transactions are atomic

• Snapshots

• Reduces hardware costs

• Simplified administration:

    zfs set refquota=3T tank/filesystem
    zfs snapshot tank/filesystem@beforeupgrade
    zpool status
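
As a rough illustration of the checksumming bullet above: a checksum recorded apart from the data lets a later read detect silent corruption. This is a minimal Python sketch of that idea, not ZFS code. The 4 KiB block and the helper names are made up for the example, and SHA-256 is used only because it ships with the standard library (ZFS does offer it as one of its checksum options).

```python
import hashlib


def checksum(block: bytes) -> str:
    """Return a hex digest that can be used to verify the block later."""
    return hashlib.sha256(block).hexdigest()


def flip_bit(block: bytes, bit_index: int) -> bytes:
    """Simulate bit rot by flipping a single bit in a copy of the block."""
    corrupted = bytearray(block)
    corrupted[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(corrupted)


def is_intact(block: bytes, expected: str) -> bool:
    """Compare the block against the checksum recorded at write time."""
    return checksum(block) == expected


# "Write": record the checksum separately from the data itself.
original = bytes(4096)  # hypothetical 4 KiB block of zeros
stored_sum = checksum(original)

# "Read": a clean block verifies, a block with one flipped bit does not.
rotted = flip_bit(original, 12345)
print(is_intact(original, stored_sum))  # True
print(is_intact(rotted, stored_sum))    # False
```

With redundancy (a mirror or raidz), a failed verification like this is what tells the filesystem which copy to distrust and repair.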

OVERVIEW OF ZFS COMPONENTS

• pool: A collection of devices that provides storage for data managed by ZFS

• vdev: A top-level device in a pool. Can be a plain disk, raid group (raidz), or mirror.

• dataset: A zvol or filesystem

• zvol: A block device presented to the OS

• filesystem: A plain ol’ POSIX filesystem

• snapshot: A copy-on-write reference to a dataset at a point in time. Not just a copy.

• zil/log/slog/logzilla: ZFS Intent Log. Synchronous writes not yet committed to the pool are recorded here. Only read when recovering from an unclean shutdown. Not a write buffer.

• ARC/primarycache: Adaptive Replacement Cache. Some of the smarts behind the performance of ZFS. Not just a dumb page cache. Resides in RAM.

• l2arc/cache/secondarycache: A block-device version of the ARC, commonly an SSD. When objects are evicted from the ARC, they might end up on the l2arc.

• For more info: http://bit.ly/zfsdocs
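
To make the pool/vdev relationship above a little more concrete, here is a small Python sketch that models a pool as a list of top-level vdevs and estimates usable space. It is only an approximation under stated assumptions: the `Vdev` class and function names are hypothetical, all disks in a vdev are the same size, and metadata, padding, and reserved space are ignored. Log and cache devices are left out because they do not add capacity.

```python
from dataclasses import dataclass
from typing import List

# Number of parity disks for each raidz level.
RAIDZ_PARITY = {"raidz1": 1, "raidz2": 2, "raidz3": 3}


@dataclass
class Vdev:
    kind: str       # "disk", "mirror", "raidz1", "raidz2", or "raidz3"
    disks: int      # number of member disks
    disk_tb: float  # size of each member disk in TB

    def usable_tb(self) -> float:
        """Rough usable space, ignoring metadata and padding overhead."""
        if self.kind == "disk":
            return self.disk_tb
        if self.kind == "mirror":
            # Every member holds the same data, so only one disk's worth counts.
            return self.disk_tb
        parity = RAIDZ_PARITY[self.kind]
        if self.disks <= parity:
            raise ValueError(f"{self.kind} needs more than {parity} disks")
        return (self.disks - parity) * self.disk_tb


def pool_usable_tb(vdevs: List[Vdev]) -> float:
    """A pool stripes data across its top-level vdevs, so capacities add up."""
    return sum(v.usable_tb() for v in vdevs)


# Hypothetical pool: two 10-disk raidz2 vdevs built from 4 TB disks.
tank = [Vdev("raidz2", 10, 4.0), Vdev("raidz2", 10, 4.0)]
print(pool_usable_tb(tank))                  # 64.0
print(Vdev("mirror", 2, 4.0).usable_tb())    # 4.0
```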

PLATFORMS WITH ZFS

• OpenZFS

    • Illumos

    • FreeBSD

    • Linux

    • Mac OS X

• Oracle ZFS

    • ZFS Storage Appliance

    • Solaris 11

BUILD A ZFS-BASED SYSTEM

• You want trustworthy HBAs, disks, and NICs.

• I use LSI HBAs with the IT firmware.

• My NICs are Mellanox and Intel.

• Hardware spec isn’t scary!

• Illumos HCL: http://illumos.org/hcl/

• FreeBSD & Linux: anything these run on. Tend to have better hardware vendor support

RECOMMENDED CONFIG

• JBODs: Quanta M4600H, Seagate 84-drive JBOD, Sanmina JBODs, or Supermicro SAS JBODs.

• Servers: any decent 2-socket Intel server with lights-out management. Lots of ECC RAM for the cache.

• Network: at least 10-gig. Investigate 40-gig Ethernet or InfiniBand (IB).

• Be sure the number of disks meets the performance requirement
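
One way to sanity-check the last bullet is a back-of-the-envelope estimate. The Python sketch below assumes the commonly cited rule of thumb that a raidz vdev delivers roughly the random IOPS of a single member disk, so random I/O scales with the number of vdevs rather than the number of disks. The function names and all of the figures are hypothetical placeholders, not measurements from MSU's systems.

```python
import math


def raidz_vdevs_needed(target_iops: int, iops_per_disk: int) -> int:
    """Vdevs needed, assuming each raidz vdev gives about one disk's random IOPS."""
    if target_iops <= 0 or iops_per_disk <= 0:
        raise ValueError("IOPS figures must be positive")
    return math.ceil(target_iops / iops_per_disk)


def disks_needed(target_iops: int, iops_per_disk: int, disks_per_vdev: int) -> int:
    """Total disk count for the estimated number of vdevs."""
    return raidz_vdevs_needed(target_iops, iops_per_disk) * disks_per_vdev


# Hypothetical requirement: 1,250 random IOPS from disks that each manage
# about 100 random IOPS, arranged in 10-disk raidz2 vdevs.
print(raidz_vdevs_needed(1250, 100))  # 13
print(disks_needed(1250, 100, 10))    # 130
```

Sequential throughput behaves differently, so an estimate like this only covers the random-I/O side of the requirement.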

POTENTIAL PITFALLS

• Using cheap SATA hard drives

• Using SAS expanders with SATA drives

• Improperly-sized raid stripes

• Picking the wrong SSDs for acceleration

• Using the wrong disk multipathing strategy/algorithm

ZFS ALTERNATIVES (if you must)

• btrfs

• ReFS

• GPFS

• HAMMER

• Ceph

BTRFS

• Default filesystem for some Linux distros. Only very recently considered stable.

• Features checksums, mirroring, integrated double-parity raid that is still maturing.

• Can shrink the “array” or pool of disks, thanks to code reused from Linux MD raid.

• “Mostly works ok” and “typically doesn’t corrupt itself” as of kernel 3.10

• As of kernel 4.0, things are looking better-ish

ReFS

• Proprietary, successor to NTFS

• Works with Storage Spaces in Windows

• Supports most NTFS features

• 64-bit checksums are stored separately for metadata. Same for data, when enabled.

• Keeps running even after checksum failures, allowing for online recovery

• Performance is very low when data checksums are enabled

GPFS

• Proprietary parallel filesystem from IBM

• Similar to Lustre on ZFS: parallel filesystem with integrated raid, checksumming, and compression.

• Better raid implementation (declustered raid)

• Excellent policy-driven data movement, and truly global namespaces

HAMMER

• Default filesystem in DragonFly BSD

• All data is CRC-checked. Smaller checksum than ZFS, designed for bit rot detection, not blind data verification.

• Raid is left to other software/devices. A bit flip on a raid array is not easily recoverable.

• Single-file history is accessible with the undo command

• Smallest maximum filesystem size of the alternatives, at “only” 1 exabyte

CEPH

• A data storage system, not filesystem

• Superb object store and block device provider (RADOS)

• Objects are the way of the future

STORAGE OF THE FUTURE?

• ZFS still plays an important role for persistent data storage, and robust POSIX filesystems.

• Vendors are publicly committing to Lustre on ZFS.

• The future is object stores: Ceph, Amazon S3, Microsoft Azure, even objects on Lustre.

• Networks will unify, bringing unified storage with them. InfiniBand and Ethernet will converge.

MORE INFORMATION

• OpenZFS: http://www.open-zfs.org

• me: [email protected], @nodoubleg

• ZFS docs: http://bit.ly/zfsdocs

QUESTIONS?
