Share and Share Alike: Using System V shared memory constructs in MRI Ruby projects

Upload: awebneck

Post on 29-Nov-2014


DESCRIPTION

Using System V Shared Memory in MRI Ruby Projects

TRANSCRIPT

Page 1: Share and Share Alike

Share and Share Alike

Using System V shared memory constructs in MRI Ruby projects

Page 2: Share and Share Alike

● Jeremy Holland

● Senior Lead Developer at CentreSource in beautiful Nashville, TN

● Math and Algorithms nerd

● Scotch drinker

● @awebneck, github.com/awebneck, freenode: awebneck, etc.

Who Am I?

Page 3: Share and Share Alike

The Problem

● FREAKIN'

● HUGE

● BINARY

● TREE

Page 4: Share and Share Alike

● Huge. Millions of nodes, each node holding ~500 bytes

● i.e. gigabytes of data

● K-d tree of non-negligible dimension (varied, around 6-10)

● No efficient existing implementation that would serve the purposes needed

Fast search

Reasonably fast consistency

How huge?

Page 5: Share and Share Alike

● Index the tree, persist to disk

Loading umpteen gigs of data from disk takes a spell.

Reloading it for each query: WAY TOO SLOW

Things we considered...and discarded

Page 6: Share and Share Alike

● Index once and hold in memory

Issues both with maintaining index consistency and balance

Difficult to share among many processes / threads without duplicating in memory.

Things we considered...and discarded

Page 7: Share and Share Alike

● DRb

Simulates memory shared by multiple processes, but not really

While the interface to search the tree is available to many different processes, actually searching it takes place in the single, server-based process

Things we considered...and discarded

Page 8: Share and Share Alike

● Benefits

Shared segment actually accessible by multiple, wholly separate processes

Built-in access control and permissions

Semaphores built into the same System V IPC family

● Drawbacks

With great power comes great responsibility

Acts like a bytearray – manual serialization

Enter Shared Memory

Page 9: Share and Share Alike

● Ruby:

Everything goes on the heap

Garbage collected - no explicit freeing of memory

● C:

Local vars, functions, etc. on the stack

Explicit allocations on the heap (malloc)

Explicit freeing of heap – no GC

Ruby-level memory paradigm vs C-level memory paradigm

Page 10: Share and Share Alike

● Before start of process

Ruby

Page 11: Share and Share Alike

● Process starts

● Heap begins to grow

Ruby

Page 12: Share and Share Alike

● Process runs

● Heap continues to grow with additional allocations

Ruby

Page 13: Share and Share Alike

● Process runs

● GC frees allocated memory no longer needed...

Ruby

Page 14: Share and Share Alike

● ...so it can be reallocated for new objects

Ruby

Page 15: Share and Share Alike

● Process ends

● Heap freed

Ruby

Page 16: Share and Share Alike

● Process starts

● Stack grows to hold functions, local vars

C

Page 17: Share and Share Alike

● Process runs

● Memory is explicitly allocated from the heap in the form of arrays, structs, etc.

C

Page 18: Share and Share Alike

● Process runs

● A function is called, and goes on the stack

C

Page 19: Share and Share Alike

● Process runs

● The function returns, and is popped off the stack

C

Page 20: Share and Share Alike

● Process runs

● The item in the heap, no longer needed, is explicitly freed

C

Page 21: Share and Share Alike

● Process runs

● A new array is allocated from the heap

C

Page 22: Share and Share Alike

● Process ends (untidily)

● The stack and heap are reclaimed by the OS as free

C

Page 23: Share and Share Alike

Ruby itself has no concept of shared memory.

TRUTH

Page 24: Share and Share Alike

C does.

TRUTH

Page 25: Share and Share Alike

● A running process (as viewed from the C level)

Shared Memory

Page 26: Share and Share Alike

● A shared segment is created with an explicit size – like allocating an array

Shared Memory

Page 27: Share and Share Alike

● The segment is "attached" to the process at a virtual address

Shared Memory

Page 28: Share and Share Alike

● Yielding to the process a pointer to the beginning of the segment

Shared Memory

Page 29: Share and Share Alike

● A new process starts, wishing to attach to the same segment.

Shared Memory

Page 30: Share and Share Alike

● It asks the OS for the identifier of the segment based on an integer key

Shared Memory

Are you there?

Yup!

Page 31: Share and Share Alike

● ...and attaches it to itself in fashion similar to the original.

Shared Memory

Page 32: Share and Share Alike

● Both processes can now – depending on permissions – read from and write to the segment simultaneously!

Shared Memory

Page 33: Share and Share Alike

● The first process finishes with the segment and detaches it.

Shared Memory

Page 34: Share and Share Alike

● And thereafter, ends.

Shared Memory

Page 35: Share and Share Alike

● ...leaving only the second process, still attached

Shared Memory

Page 36: Share and Share Alike

● Now, the second process detaches...

Shared Memory

Page 37: Share and Share Alike

● ...and subsequently ends

Shared Memory

Page 38: Share and Share Alike

● Note that the shared segment still persists in memory

● Can be reattached to another process with permission to do so

Shared Memory

Page 39: Share and Share Alike

● Later, a new process comes along and explicitly destroys the segment, all processes being finished with it.

Shared Memory
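The lifecycle walked through on the preceding slides can be sketched in C roughly as follows. This is a minimal illustration, not the talk's own code: the function name share_between_processes, the 4096-byte size and the use of fork() to obtain a second process are all assumptions made for the example. The first process creates the segment under a caller-supplied key, the second looks it up by that key, attaches, writes and detaches, and the first then attaches read-only, copies the data out and destroys the segment.

```c
#include <string.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: returns 0 if the parent could read back what the
 * child wrote through the shared segment, -1 otherwise. */
static int share_between_processes(key_t key, const char *msg,
                                   char *out, size_t out_len)
{
    /* Process 1 creates the segment; IPC_EXCL makes a key collision an error. */
    int shmid = shmget(key, 4096, IPC_CREAT | IPC_EXCL | 0600);
    if (shmid == -1)
        return -1;

    pid_t pid = fork();
    if (pid == -1) {
        shmctl(shmid, IPC_RMID, NULL);
        return -1;
    }
    if (pid == 0) {
        /* Process 2 asks the OS for the identifier by key, then attaches. */
        int id = shmget(key, 0, 0);
        char *p = (id == -1) ? (void *)-1 : shmat(id, NULL, 0);
        if (p == (void *)-1)
            _exit(1);
        strncpy(p, msg, 4095);   /* new segments start zeroed, so p stays terminated */
        shmdt(p);
        _exit(0);
    }

    int status = 0;
    waitpid(pid, &status, 0);

    /* The segment outlives process 2; process 1 attaches it read-only. */
    int ok = -1;
    const char *p = shmat(shmid, NULL, SHM_RDONLY);
    if (p != (void *)-1) {
        strncpy(out, p, out_len - 1);
        out[out_len - 1] = '\0';
        shmdt(p);
        if (WIFEXITED(status) && WEXITSTATUS(status) == 0)
            ok = 0;
    }

    /* Explicit destruction: nothing is attached any more. */
    shmctl(shmid, IPC_RMID, NULL);
    return ok;
}
```

The key is whatever integer the cooperating processes agree on; a call such as share_between_processes((key_t)(0x5A5A0000 + getpid()), "hi", buf, sizeof buf) uses a made-up, pid-derived key purely to avoid collisions in a demo.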

Page 40: Share and Share Alike

● Precisely how much memory can be drafted into service for sharing purposes is controlled by kernel parameters

kernel.shmall – the maximum number of memory pages available for sharing (should be at least ceil(shmmax / PAGE_SIZE))

kernel.shmmax – the maximum size in bytes of a single shared segment

kernel.shmmni – the maximum number of shared segments allowed.

How it's done: Configuration
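The ceil(shmmax / PAGE_SIZE) rule of thumb above is plain integer arithmetic. A small sketch, with hypothetical helper names pages_for and min_shmall_for, assuming the page size is whatever sysconf(_SC_PAGESIZE) reports on the machine at hand:

```c
#include <stddef.h>
#include <unistd.h>

/* Number of pages needed to hold `bytes`, i.e. ceil(bytes / page_size).
 * Assumes bytes + page_size does not overflow. */
static unsigned long long pages_for(unsigned long long bytes,
                                    unsigned long long page_size)
{
    return (bytes + page_size - 1) / page_size;
}

/* Smallest kernel.shmall able to accommodate a single segment of
 * `shmmax` bytes, using this machine's page size. */
static unsigned long long min_shmall_for(unsigned long long shmmax)
{
    long page = sysconf(_SC_PAGESIZE);
    return pages_for(shmmax, (unsigned long long)page);
}
```

With 4096-byte pages, for instance, a 4097-byte segment needs two pages, since the size is rounded up rather than down.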

Page 41: Share and Share Alike

● To view your current settings:

How it's done: Configuration
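One way to inspect these values from a program, assuming a Linux system that exposes them as files under /proc/sys/kernel, is sketched below. The helper names read_kernel_param and print_shm_limits are invented for the example, and other systems may expose the limits differently or not at all.

```c
#include <stdio.h>

/* Read one unsigned integer from a file such as /proc/sys/kernel/shmmax.
 * Returns 0 on success, -1 if the file is missing or unparsable. */
static int read_kernel_param(const char *path, unsigned long long *out)
{
    FILE *f = fopen(path, "r");
    if (f == NULL)
        return -1;
    int ok = (fscanf(f, "%llu", out) == 1);
    fclose(f);
    return ok ? 0 : -1;
}

/* Print the three limits discussed above (Linux-specific paths). */
static void print_shm_limits(void)
{
    static const char *names[] = { "shmall", "shmmax", "shmmni" };
    char path[64];

    for (int i = 0; i < 3; i++) {
        unsigned long long value;
        snprintf(path, sizeof path, "/proc/sys/kernel/%s", names[i]);
        if (read_kernel_param(path, &value) == 0)
            printf("kernel.%s = %llu\n", names[i], value);
        else
            printf("kernel.%s: unavailable\n", names[i]);
    }
}
```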

Page 42: Share and Share Alike

● Or...

How it's done: Configuration

Page 43: Share and Share Alike

● Setting the values temporarily can be accomplished with sysctl...

How it's done: Configuration

Page 44: Share and Share Alike

● ...or more permanently by editing /etc/sysctl.conf

How it's done: Configuration

Page 45: Share and Share Alike

● int shmget(key_t key, size_t size, int shmflag)

key_t key: integer key identifying the segment or IPC_PRIVATE

size_t size: integer size of segment in bytes (will be rounded up to next multiple of PAGE_SIZE)

int shmflag: mode flag consisting of the standard owner-group-world permission bits, OR'd with IPC_CREAT (to create the segment if it does not already exist) and optionally IPC_EXCL (to fail with an error if it already exists)

How it's done: Creating New and Acquiring Existing Segments

Page 46: Share and Share Alike

● int shmget(key_t key, size_t size, int shmflag)

Returns: valid segment identifier integer on success, or -1 on error

How it's done: Creating New and Acquiring Existing Segments
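A minimal sketch of both uses of shmget, creating a new segment and acquiring an existing one. The helper names create_private_segment and find_segment are assumptions for illustration; the 0600 mode (owner read/write only) is likewise just an example choice.

```c
#include <stddef.h>
#include <stdio.h>
#include <sys/ipc.h>
#include <sys/shm.h>

/* Create a brand-new segment of at least `size` bytes that has no key
 * (IPC_PRIVATE) and is readable/writable by the owner only.
 * Returns the segment identifier, or -1 on error. */
static int create_private_segment(size_t size)
{
    int shmid = shmget(IPC_PRIVATE, size, IPC_CREAT | IPC_EXCL | 0600);
    if (shmid == -1)
        perror("shmget");
    return shmid;
}

/* Look up an existing segment by its integer key without creating one.
 * Size 0 and no flags means "whatever is already there".
 * Returns the identifier, or -1 if no such segment exists. */
static int find_segment(key_t key)
{
    return shmget(key, 0, 0);
}
```

The identifier that comes back is what shmat and shmctl expect; the key is only used for the lookup.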

Page 47: Share and Share Alike

● void * shmat(int shmid, const void *shmaddr, int shmflag)

shmid: integer identifier returned by a call to shmget

shmaddr: Pointer to the address at which to attach the memory. You almost always want to leave this NULL, so that the system will attach the segment wherever there's room for it.

How it's done: Attaching segments

Page 48: Share and Share Alike

● void *shmat(int shmid, const void *shmaddr, int shmflag)

shmflag: several flags for controlling the attachment – most importantly, SHM_RDONLY (attach the segment read-only)

returns: a void pointer to the start of the attached segment, or (void *)-1 on error

How it's done: Attaching segments

Page 49: Share and Share Alike

● int shmdt(const void *shmaddr)

shmaddr: Pointer returned by the call to shmat

returns: 0 on success, or -1 on error

How it's done: Detaching segments
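Attaching and detaching fit together roughly as sketched here. The function write_then_read is hypothetical: it attaches read-write at a system-chosen address, copies a string in, detaches, then attaches again with SHM_RDONLY to show that the data survived the detach. The caller is assumed to guarantee that msg fits in the segment.

```c
#include <string.h>
#include <sys/ipc.h>
#include <sys/shm.h>

/* Returns 0 on success, -1 if any attach or detach fails. */
static int write_then_read(int shmid, const char *msg,
                           char *out, size_t out_len)
{
    /* First attachment: read-write, NULL lets the system pick the address. */
    char *rw = shmat(shmid, NULL, 0);
    if (rw == (void *)-1)
        return -1;
    strcpy(rw, msg);             /* assumes msg fits in the segment */
    if (shmdt(rw) == -1)
        return -1;

    /* Second attachment: read-only; possibly at a different address. */
    const char *ro = shmat(shmid, NULL, SHM_RDONLY);
    if (ro == (void *)-1)
        return -1;
    strncpy(out, ro, out_len - 1);
    out[out_len - 1] = '\0';
    return shmdt(ro);
}
```

Note the error check against (void *)-1 rather than NULL, as described above.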

Page 50: Share and Share Alike

● int shmctl(int shmid, int cmd, struct shmid_ds *buf)

shmid: The identifier returned by shmget

cmd: The command to execute – for this purpose, IPC_STAT

buf: A pointer to a shmid_ds struct to be filled in

How it's done: Getting segment information

Page 51: Share and Share Alike

struct shmid_ds {
  struct ipc_perm shm_perm;    /* permissions/ownership */
  size_t          shm_segsz;   /* size of segment in bytes */
  time_t          shm_atime;   /* last attachment time */
  time_t          shm_dtime;   /* last detachment time */
  time_t          shm_ctime;   /* last change time */
  pid_t           shm_cpid;    /* pid of creator */
  pid_t           shm_lpid;    /* pid of last shmat/shmdt caller */
  shmatt_t        shm_nattch;  /* number of current attachments */
};

How it's done: Getting segment information
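Reading a few of those fields back might look like this. The wrapper names segment_info and print_segment_info are made up for the sketch; the casts are there only because the exact integer types behind size_t, pid_t and shmatt_t vary by platform.

```c
#include <stdio.h>
#include <sys/ipc.h>
#include <sys/shm.h>

/* Fill *info with the kernel's bookkeeping for the segment.
 * Returns 0 on success, -1 on error. */
static int segment_info(int shmid, struct shmid_ds *info)
{
    return shmctl(shmid, IPC_STAT, info);
}

/* Print a handful of the fields listed above. */
static void print_segment_info(int shmid)
{
    struct shmid_ds ds;

    if (segment_info(shmid, &ds) == -1) {
        perror("shmctl(IPC_STAT)");
        return;
    }
    printf("size:        %zu bytes\n", (size_t)ds.shm_segsz);
    printf("creator pid: %ld\n", (long)ds.shm_cpid);
    printf("attached:    %lu\n", (unsigned long)ds.shm_nattch);
}
```

The shm_nattch count is handy for checking whether anything is still attached before destroying a segment.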

Page 52: Share and Share Alike

● int shmctl(int shmid, int cmd, struct shmid_ds *buf)

shmid: The identifier returned by shmget

cmd: IPC_RMID

buf: A shmid_ds struct; its contents are not used by IPC_RMID (Linux documents the argument as ignored, so NULL is accepted there)

How it's done: Destroying segments
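A sketch of destruction, with the assumption stated up front: this passes NULL for buf, which Linux accepts because it documents the argument as ignored for IPC_RMID. On another system, passing the address of a real struct shmid_ds is the conservative choice. The name destroy_segment is hypothetical.

```c
#include <stddef.h>
#include <sys/ipc.h>
#include <sys/shm.h>

/* Mark the segment for destruction. If nothing is attached it goes away
 * immediately; otherwise it is removed once the last attachment is dropped.
 * Returns 0 on success, -1 on error. */
static int destroy_segment(int shmid)
{
    return shmctl(shmid, IPC_RMID, NULL);   /* NULL buf: Linux assumption */
}
```

Once an unattached segment has been destroyed, its identifier is no longer valid, so a later IPC_STAT on it fails.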

Page 53: Share and Share Alike

Examples


Page 54: Share and Share Alike

● Addressing

Segments are attached wherever there is room for them in the attaching process' address space

Challenges and Caveats

Page 55: Share and Share Alike

● Maybe here in one process...

Challenges and Caveats

0x7f195bda2000

Page 56: Share and Share Alike

● ...maybe here in another

Challenges and Caveats

0x73f882c1f000

Page 57: Share and Share Alike

● So if you store an absolute pointer in the segment that points somewhere else in the segment...

Challenges and Caveats

0x7f195bda2004

Page 58: Share and Share Alike

● It's not terribly likely to point where you think it should when referenced in a separate process

Challenges and Caveats

0x7f195bda2004

Page 59: Share and Share Alike

● Addressing

Segments are attached wherever there is room for them in the attaching process' address space

Absolute pointers are effectively useless

Relative pointers – i.e. offsets

BSTs as heaps (the data structure)

Serialization

Challenges and Caveats
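To make the offsets idea concrete, here is a minimal sketch of a binary search tree whose links are byte offsets from the start of the region instead of absolute pointers. The layout (a small header followed by bump-allocated nodes) and all names are assumptions for illustration, not the structure used in the talk. Because nothing stored in the region depends on where it is mapped, the same bytes can be searched from any base address.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NIL ((uint32_t)0)  /* offset 0 holds the header, so it can mean "no child" */

struct header { uint32_t root; uint32_t used; };            /* lives at offset 0 */
struct node   { int32_t key; uint32_t left; uint32_t right; };

/* Turn an offset into a usable pointer for *this* process's mapping. */
static struct node *node_at(void *base, uint32_t off)
{
    return (struct node *)((char *)base + off);
}

/* Lay out an empty tree at the start of the region. */
static void tree_init(void *base)
{
    struct header *h = base;
    h->root = NIL;
    h->used = sizeof *h;
}

/* Append a node and link it in. Returns 0, or -1 if the region is full. */
static int tree_insert(void *base, size_t capacity, int32_t key)
{
    struct header *h = base;
    if (h->used + sizeof(struct node) > capacity)
        return -1;

    uint32_t off = h->used;
    h->used += sizeof(struct node);
    struct node *n = node_at(base, off);
    n->key = key;
    n->left = n->right = NIL;

    uint32_t *link = &h->root;               /* walk down to an empty link */
    while (*link != NIL) {
        struct node *cur = node_at(base, *link);
        link = (key < cur->key) ? &cur->left : &cur->right;
    }
    *link = off;
    return 0;
}

/* Returns 1 if key is present, 0 otherwise. */
static int tree_contains(void *base, int32_t key)
{
    struct header *h = base;
    uint32_t off = h->root;
    while (off != NIL) {
        struct node *cur = node_at(base, off);
        if (key == cur->key)
            return 1;
        off = (key < cur->key) ? cur->left : cur->right;
    }
    return 0;
}
```

Copying the whole region to a different address with memcpy and searching the copy behaves the same as searching the original, which is the property a segment attached at two different virtual addresses needs.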

Page 60: Share and Share Alike

● Duplication and copying

Ruby primitivesques (numerics, strings, etc) are all allocated on the heap

Shared data must be effectively copied

Diminishes the usefulness of the tool for certain applications (large data sharing)

Not everything is a nail

Challenges and Caveats

Page 61: Share and Share Alike

● Duplication and copying

But... fantastic for certain applications

Search

Search the shared structure at C level

Copy and coerce results to Ruby objects

|results| << |data to be searched|

Semaphores, interprocess messaging

Built in to the IPC/SHM lib!

Challenges and Caveats

Page 62: Share and Share Alike

● Tracking resource allocation

Effectively an integer checked when a process allocates some resource

If nonzero, decrement

If zero, the resource isn't available

● Simple, but slightly weird API.

Semaphore
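The "if nonzero, decrement; if zero, unavailable" behaviour can be sketched with the System V semaphore calls semget, semctl and semop. The wrapper names below are invented for the example, and the union sv_semun is defined locally on the assumption of a Linux/glibc system, where the caller has to supply the union passed to semctl.

```c
#include <errno.h>
#include <sys/ipc.h>
#include <sys/sem.h>

/* Caller-defined argument union for semctl (Linux/glibc assumption). */
union sv_semun { int val; struct semid_ds *buf; unsigned short *array; };

/* Create a private set holding one semaphore initialised to `count`.
 * Returns the set identifier, or -1 on error. */
static int sv_sem_create(int count)
{
    int semid = semget(IPC_PRIVATE, 1, IPC_CREAT | 0600);
    if (semid == -1)
        return -1;

    union sv_semun arg;
    arg.val = count;
    if (semctl(semid, 0, SETVAL, arg) == -1) {
        semctl(semid, 0, IPC_RMID);
        return -1;
    }
    return semid;
}

/* Try to take one unit without blocking: returns 0 if the counter was
 * nonzero and has been decremented, -1 with errno == EAGAIN if it was zero. */
static int sv_sem_try_acquire(int semid)
{
    struct sembuf op = { .sem_num = 0, .sem_op = -1, .sem_flg = IPC_NOWAIT };
    return semop(semid, &op, 1);
}

/* Give one unit back. */
static int sv_sem_release(int semid)
{
    struct sembuf op = { .sem_num = 0, .sem_op = 1, .sem_flg = 0 };
    return semop(semid, &op, 1);
}

/* Remove the semaphore set. */
static int sv_sem_destroy(int semid)
{
    return semctl(semid, 0, IPC_RMID);
}
```

Dropping IPC_NOWAIT from the acquire would make the caller block until another process releases, which is the more common usage when guarding a shared segment.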

Page 63: Share and Share Alike

● Push bytearray/string messages into the queue, shift 'em off

● Simple, slightly less bizarre API

Message Queues
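Pushing and shifting string messages might look like the following sketch, built on msgget, msgsnd and msgrcv. The struct text_msg layout, the 128-byte TEXT_MAX limit and the helper names are assumptions for the example; the only fixed requirement is that a message begins with a positive long type field.

```c
#include <string.h>
#include <sys/ipc.h>
#include <sys/msg.h>
#include <sys/types.h>

#define TEXT_MAX 128          /* arbitrary payload limit for this sketch */

struct text_msg {
    long mtype;               /* must be > 0 */
    char mtext[TEXT_MAX];
};

/* Create a private queue. Returns its identifier, or -1 on error. */
static int queue_create(void)
{
    return msgget(IPC_PRIVATE, IPC_CREAT | 0600);
}

/* Push a NUL-terminated string onto the queue. Returns 0 or -1. */
static int queue_push(int qid, const char *text)
{
    struct text_msg m;
    size_t len = strlen(text) + 1;        /* send the terminator too */

    if (len > TEXT_MAX)
        return -1;
    m.mtype = 1;
    memcpy(m.mtext, text, len);
    return msgsnd(qid, &m, len, 0);
}

/* Shift the oldest message off without blocking. Returns 0 on success,
 * -1 if the queue is empty or `out` is too small (the message is then lost). */
static int queue_shift(int qid, char *out, size_t out_len)
{
    struct text_msg m;
    ssize_t n = msgrcv(qid, &m, TEXT_MAX, 0, IPC_NOWAIT);

    if (n == -1 || (size_t)n > out_len)
        return -1;
    memcpy(out, m.mtext, (size_t)n);
    return 0;
}

/* Remove the queue. */
static int queue_destroy(int qid)
{
    return msgctl(qid, IPC_RMID, NULL);
}
```

A message type of 0 in msgrcv asks for the first message of any type, which gives the plain first-in, first-out behaviour described above.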

Page 64: Share and Share Alike

● Quite exciting, this computer magic

● Don't just use it because it's there

Have a NEED

● Don't be afraid to drop to C

Don't know C?

Learn it – a pretty simple language, when all's said and done

Building ruby C extensions is actually pretty painless

In closing...

Page 65: Share and Share Alike

In which I probably get trolled

Questions / Comments

Page 66: Share and Share Alike

Enjoy the rest of the conference!

Thanks for listening!