mull en der 87 process

Upload: arindam-sarkar

Post on 07-Apr-2018

TRANSCRIPT

  • 8/3/2019 Mull en Der 87 Process

    1/14

    Process Management in a Distributed Operating System

    Sape J. Mullender

    Centre for Mathematics & Computer Science, Amsterdam
    and
    Computer Laboratory, Cambridge University

    As part of designing and building the Amoeba distributed operating system, we have come up with a simple set of mechanisms for process management that allows downloading, process migration, checkpointing, remote debugging and emulation of alien operating system interfaces.

    The basic process management facilities are realized by the Amoeba Kernel and can be augmented by user-space services: Debug Service, Load-Balancing Service, Unix-Emulation Service, Checkpoint Service, etc. The Amoeba Kernel can produce a representation of the state of a process which can be given to another Kernel where it is accepted for continued execution. This state consists of the memory contents in the form of a collection of segments, and a Process Descriptor which contains the additional state, program counters, stack pointers, system call state, etc.

    Careful separation of mechanism and policy has resulted in a compact set of Kernel operations for process creation and management. A collection of user-space services provides process management policies and a simple interface for application programs.

    In this paper we shall describe the mechanisms as they are being implemented in the Amoeba Distributed System at the Centre for Mathematics and Computer Science in Amsterdam. We believe that the mechanisms described here can also apply to other distributed systems.

    CR Categories: D.4, C.2.4, D.2.5.

    Key Words & Phrases: Distributed operating system, process management, migration, Amoeba.

    1. INTRODUCTION

    Our goal in designing the process management primitives described in this paper was to provide mechanisms that can do what process management primitives in existing general-purpose operating systems can do and much more. The added functionality has to do with the properties of the kinds of distributed systems we are interested in: personal workstations, shared server machines and guest systems, connected by a fast local-area network.

    The workstations are normally used by a single person, but, when nobody is using them, they are available as a computing resource to users of other workstations. Together with processors dedicated to being allocated for the execution of user programs, the idle workstations form a Processor Pool. Shared server machines provide a distributed file system, name service, gateways to the internet, access to printers, tape drives, etc. By 'guest systems' we mean traditional operating systems that have become connected to the distributed system with some software to allow the sharing of software between the 'new' and the 'old' world. In the case of our system, UNIX† systems are still used because of the enormous body of software available to us there; software that is only slowly replaced by equivalent or better in the distributed system.

    We are building a general-purpose distributed system, so the programming environment we design for is a heterogeneous one: many languages, several file systems, existing software developed on other systems, possibly a wide variety of hardware and different kinds of networks to connect the machines. The designed process abstraction must allow running existing software. There must thus be support for heavy-weight processes and emulation of foreign operating system interfaces (with the possibility of providing binary compatibility: binaries from the foreign system must run without modification).

    In this environment, sufficient protection mechanisms must be implemented to prevent one user's programs from disturbing another's. Programs from different users will frequently share one physical processor, so they must run in separate address spaces.

    Not all machines can be expected to have a local file system, so programs will have to be downloaded over the network. The mechanisms that do this must be fast; some programs are several megabytes in size, so loading takes seconds, even in the best of cases, and the user is often impatiently waiting at the terminal.

    Distributed applications will rely heavily on fast interprocess communication. In many distributed systems, the basic communication mechanism is the message transaction, a message pair: a request message from a client process to a server, followed by a reply message from the server back to the client. On top, remote procedure call is often provided. When carefully designed and implemented, message transactions form one of the most efficient communication protocols for local-area networks, both in terms of delay and of throughput [6, 2]. In many popular implementations, when a client process has sent a request, it blocks until a reply arrives; when a server has asked for a request, it blocks until one arrives.

    Using message transactions has several consequences for the design of the programming environment. First, processes block once on each message transaction. Two process switches thus occur: one when the process blocks to run another process and one after the process has become unblocked again to run the original again. If message transactions are to be very fast, process switching had better be fast too.

    Second, message transactions provide no parallelism: when the client runs, the server waits for a request, and when the server runs, the client waits for a reply. Only one process runs at a time, albeit on different machines. One solution could be to implement non-blocking transactions, thus killing two birds with one stone: process switches need not compete with message transactions in speed any more and parallelism can be obtained by sending requests to many servers simultaneously. This solution, however, introduces a whole new set of problems [7]. One problem is that the interface between a process and the communications substrate becomes more complicated: there must be handles for telling a process when a message has arrived. Another is that the number of process switches does not decrease at all: the communications software (which must reside in a separate address space or in the kernel for protection) is invoked upon requests to send, requests to receive, and upon receipt of a message from the network. A third problem is that a non-blocking message transaction interface is extremely hard to program and debug, because the order of events is no longer specified.

    † UNIX is a Trademark of AT&T Bell Laboratories.

    Parallelism must be provided in some other way, and the way that was chosen in Amoeba, as well as many other modern distributed systems, is to implement light-weight processes, or threads of control. Many threads can share a single address space; since much of the state of the light-weight processes is shared, thread switching can be done blindingly fast. Using light-weight processes makes it possible to implement servers by having one process serve a single client at a time; many clients can be served simultaneously by creating many parallel light-weight processes. Usually, a synchronization mechanism is provided to allow the processes to share common data structures in shared memory (e.g., in the form of semaphores).

    Light-weight processes and blocking message transactions are used in many distributed systems to simplify writing software that exploits parallelism [11, 1, 7].

    Mechanisms for migration of processes in distributed systems have been proposed or implemented several times, but no algorithms have been proposed to use migration for load-balancing. Given the time required to migrate a large process (on the order of ten seconds), migration for load-balancing does not appear to be very useful. It can be useful, however, in an environment of personal workstations, where idle workstations are 'lent out' as a processing resource for others and 'taken back' when their owners return.

    2. THE AMOEBA DISTRIBUTED OPERATING SYSTEM

    Amoeba is a distributed operating system, based on the popular paradigm of client processes communicating with services via message transactions. Amoeba uses capabilities to access services and the objects these services implement.

    A capability is a 256-bit reference to an object; the first 64 bits, known as the port, refer to the service managing the object; the next 64 bits are available to the system for use as a location hint; the remaining 128 bits are allocated by the service to identify the object. A capability is generated in such a way (and contains sufficient bits) that the probability of an unauthorized user guessing an object's capability is negligible.
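The 64/64/128-bit layout described above can be sketched in Python. This is only an illustration under stated assumptions: the function names are hypothetical, the fields are packed most-significant-first, and the private part is drawn at random, which is one plausible way to make guessing infeasible but is not stated in the text.

```python
import secrets

# Field widths taken from the text: port, location hint, private part.
PORT_BITS, HINT_BITS, PRIVATE_BITS = 64, 64, 128


def make_capability(port, hint, private):
    """Pack the three fields into one 256-bit (32-byte) value, port first."""
    if port >> PORT_BITS or hint >> HINT_BITS or private >> PRIVATE_BITS:
        raise ValueError("field does not fit in its bit width")
    value = (port << (HINT_BITS + PRIVATE_BITS)) | (hint << PRIVATE_BITS) | private
    return value.to_bytes(32, "big")


def split_capability(cap):
    """Return (port, hint, private) from a 32-byte capability."""
    value = int.from_bytes(cap, "big")
    private = value & ((1 << PRIVATE_BITS) - 1)
    hint = (value >> PRIVATE_BITS) & ((1 << HINT_BITS) - 1)
    port = value >> (HINT_BITS + PRIVATE_BITS)
    return port, hint, private


def new_object_capability(port):
    """Assumption: the service picks the 128 private bits at random."""
    return make_capability(port, 0, secrets.randbits(PRIVATE_BITS))
```

Under the random-private-part assumption, a single blind guess at an object's private part succeeds with probability 2^-128.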

    These capabilities are used for protection, and also as the primary mechanism for addressing requests to do operations on objects. When a client sends a request, the system uses the port to determine which service should handle the request. A server for that service is then found through a locate operation, e.g., implemented through broadcasting 'where-are-you' packets. The server uses the private part of the capability to identify the object. After carrying out a request, the server returns a reply.

    Most services run in user space. The Amoeba Kernel provides only the bare minimum of service: message-transaction facilities, process management, and access paths to peripherals. File service, for instance, is a user-space service with no special privileges, except knowledge of the capabilities to get to the disks where the files are stored.

    Message transactions are blocking, and the system provides no buffering. When a server calls getrequest(port, capability, requestbuffer) (the port identifies the server to the system), the server is blocked until a request arrives. The server returns a reply with putreply(replybuffer), which doesn't block. When the client calls trans(capability, requestbuffer, replybuffer), it blocks until the server's reply is received.
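The blocking behaviour of the three calls can be illustrated with a small Python model. It is a sketch only: the `Port` class, the per-request reply slot and the use of in-process threads and queues are assumptions made for illustration, and the argument lists differ from the calls named in the text. In particular, the queue buffers pending requests, which the real system is said not to do.

```python
import queue
import threading


class Port:
    """Hypothetical in-process stand-in for a service port."""

    def __init__(self):
        self._requests = queue.Queue()

    def trans(self, request):
        """Client side: send a request and block until the reply arrives."""
        reply_slot = queue.Queue(maxsize=1)
        self._requests.put((request, reply_slot))
        return reply_slot.get()  # blocks the calling thread

    def getrequest(self):
        """Server side: block until a request arrives."""
        return self._requests.get()

    @staticmethod
    def putreply(reply_slot, reply):
        """Server side: hand back the reply; the empty slot never blocks."""
        reply_slot.put(reply)


def server(port, count):
    """Serve `count` requests by upper-casing the request text."""
    for _ in range(count):
        request, reply_slot = port.getrequest()
        Port.putreply(reply_slot, request.upper())


port = Port()
worker = threading.Thread(target=server, args=(port, 2))
worker.start()
replies = [port.trans("read"), port.trans("write")]
worker.join()
```

Each `trans` call returns only after the server thread has produced the matching reply, so client and server never run the same transaction in parallel.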

    In case of a failure, the client is told that the server could not be reached, or that no reply was received. In the former case, the client can safely retry; in the latter case, the client will have to find out whether the failure occurred before, during, or after execution of the request (unless the request was idempotent; in this case the request can always be safely repeated). When a client fails during a transaction, the reply is lost.
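A client-side retry policy that follows this distinction might look like the sketch below. The exception classes, the `trans_with_retry` helper and the `attempts` parameter are hypothetical names introduced for illustration; the text does not describe how failures are reported to the caller.

```python
class ServerUnreachable(Exception):
    """The request never reached a server, so it was not executed."""


class NoReply(Exception):
    """The request was sent but no reply came back; it may have executed."""


def trans_with_retry(do_trans, request, idempotent, attempts=3):
    """Retry a transaction only when repeating it is known to be safe.

    `attempts` is assumed to be at least 1.
    """
    last_error = None
    for _ in range(attempts):
        try:
            return do_trans(request)
        except ServerUnreachable as err:
            last_error = err  # not executed: always safe to retry
        except NoReply as err:
            if not idempotent:
                raise  # caller must find out whether the request ran
            last_error = err  # idempotent: repeating is harmless
    raise last_error
```

For a non-idempotent request the helper surfaces `NoReply` immediately, leaving it to the caller to determine whether the request was carried out.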

    A kernel request is just a request for an operation on an object maintained by the kernel. A kernel request, or system call, is a transaction with the Kernel Service. Thus, in principle, Amoeba could have only the system calls for doing message transactions. In practice, however, it is more efficient to implement some of the very frequent kernel service requests as ordinary system calls.

    Since Amoeba transactions† are blocking, they cannot be used to obtain parallelism. Amoeba uses parallel processes to achieve that. Amoeba implements light-weight parallel processes, called tasks. For efficiency, a number of tasks can share an address space. An address space with a number of tasks in it is a cluster. Because the term process could refer both to a task or a cluster, we have avoided it as much as possible in the remainder of the paper.

    To allow programmers to use separate tasks for small units of work (e.g., use a separate task for each request received by a file server), tasks are cheap to create, destroy and schedule. The current scheme for this is quite efficient, but we believe it can be made more flexible and more efficient still. This paper discusses a new design for task and cluster management.

    For more information about Amoeba, see 'The Design of a Capability-Based Distributed Operating System' [12, 9, 10, 8]. For details of the Amoeba protection mechanism, see 'Protection and Resource Control in Distributed Systems' [5].

    † Not to be confused with database transactions or atomic transactions. In Amoeba, a transaction is a message transaction, a request/reply pair.

    3. THE KERNEL SERVER

    The Amoeba Kernel manipulates three kinds of basic objects to realize the process abstraction in Amoeba. A cluster is a virtual address space consisting of a number of segments and a number of threads of control, called tasks.

    The reason for having tasks share an address space is one of efficiency: Tasks can exchange information among each other more efficiently in shared memory, and, since tasks have little context, task switching can be made very fast. The concept of tasks is used in several modern distributed systems, notably, V [1], Mesa [4], and Topaz.*

    3.1. Segments

    A segment is a named linear section of memory. It is an object, managed by Kernel Service. Segments are created by mapping them into a cluster's address space using a seg_map system call. It is not possible to map in an existing segment; thus, segments cannot be shared between clusters. The calls for segment management are depicted in FIGURE 1.

    System Calls
        sid = seg_map(segment, address, length, how)
        seg_unmap(sid)
        seg_grow(sid, newlength)
        seg_info()

    Transactions
        Seg_Create(kernelcap, incap, outcap)
        Seg_Read(capability, offset, buffer, count)
        Seg_Write(capability, offset, buffer, count)
        length = Seg_Length(capability)
        Seg_Delete(capability)

    FIGURE 1. Segment-management system calls and transactions

    The fundamental difference between system calls and transactions is that system calls are handled by the local kernel and can thus only be used to manipulate local objects. Transactions, however, are addressed to a service that may be either local or remote; the Seg_Read transaction can therefore be used to read the contents of remote segments.

    The seg_map call creates a new segment and initializes it to the contents of the segment indicated by the segment argument, which is a capability. The how parameter specifies mapping options, such as read-only mapping, which end is affected by seg_grow calls, etc. The call returns a small integer called a segment identifier which represents the mapped segment. This integer is used in calls to seg_unmap and seg_grow.

    * Unfortunately, no papers on this very interesting multiprocessor operating system have been published to date; the developers at DEC's Systems Research Center need every possible encouragement to remedy this.

    When a segment is unmapped, it is removed from a cluster's address space but continues to exist as an object in memory. Seg_unmap returns the capability of this memory object for further manipulation with the transactions for segment management.
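The life cycle of a mapped segment (map, grow, inspect, unmap) can be modelled with a toy Python class. Everything here is an illustrative assumption rather than the kernel interface: the `Cluster` class, passing the initial contents as bytes instead of a capability, reducing `how` to a `read_only` flag, growing only at the high end, and returning the surviving bytes where the real call returns a capability.

```python
class Cluster:
    """Toy model of one cluster's memory map (all names hypothetical)."""

    def __init__(self):
        self._next_sid = 0
        self.segments = {}  # sid -> (address, bytearray, read_only)

    def seg_map(self, contents, address, length, read_only=False):
        """Create a segment initialised from `contents`; return its identifier."""
        data = bytearray(contents[:length].ljust(length, b"\0"))
        sid = self._next_sid
        self._next_sid += 1
        self.segments[sid] = (address, data, read_only)
        return sid

    def seg_grow(self, sid, newlength):
        """Resize the segment in place at its high end."""
        _, data, _ = self.segments[sid]
        if newlength < len(data):
            del data[newlength:]
        else:
            data.extend(b"\0" * (newlength - len(data)))

    def seg_unmap(self, sid):
        """Remove the segment from the map; the memory object survives."""
        _, data, _ = self.segments.pop(sid)
        return data  # stands in for the capability of the memory object

    def seg_info(self):
        """Describe the current memory map as sid -> (address, length)."""
        return {sid: (addr, len(data)) for sid, (addr, data, _) in self.segments.items()}
```

A short usage example: `sid = Cluster().seg_map(b"code", 0x1000, 8)` yields segment identifier 0 for an 8-byte segment whose first four bytes are `code`.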

    Seg_info returns information about the calling cluster's current memory map. The transactions for segment management speak for themselves.

    3.2. Clusters

    The Kernel Service also manages clusters, which are created by sending a CreateCluster request to the kernel server. The parameter to the request is a cluster descriptor which describes the initial state of the cluster by describing the state of its tasks, the address space in which these tasks will run, and the processor on which the cluster must run. FIGURE 2 illustrates a cluster descriptor.

        Host Descriptor
        Accounting & Scheduling
        Exception Handler
        Number of Segments
        Mapping Descriptors
        Number of Tasks
        Task Descriptors

    FIGURE 2. Cluster Descriptor

    The host descriptor describes the kind of processor the cluster runs on. Its entries have a type and a value. The type 'instruction set', for instance, assumes values such as VAX, M68000, or NS32000, and describes the instruction set to which the cluster's code belongs. An instruction-set-dependent options type is used to indicate whether the cluster will need instruction-set options like floating point or extended instruction sets. The memory size type has a value that indicates the maximum size that the cluster's address space may need to grow to. There are many more possible types; new types can easily be added. The Kernel Service recognizes a number of useful types and uses their values to determine whether it can or will handle the cluster. Other types may be used by user services that manipulate cluster descriptors (e.g., Emulation Service).

    The accounting & scheduling field contains information about just that. It is not really used at the moment, but we envisage that one of its uses can be to provide scheduling information from a previous host to the next one when a cluster migrates. Another use is as a measuring device for execution times of clusters.

    The exception handler field gives the port of the service to handle exceptions when they occur.

    Then follow the mapping descriptors, one for each segment in the cluster's address space. The kernel creates the cluster's virtual address space by carrying out seg_map calls as specified by each of the mapping descriptors.

    Finally, a list of task descriptors, one for each task in the cluster, gives the state of each task in the cluster. This illustrates one of the advantages of having transactions as the only way to communicate outside a cluster: the Amoeba Kernel maintains very little state for the tasks. The state of a task consists only of whether it is runnable or blocked on a semaphore or condition variable, the value of the program counter, the stack pointer, processor status word, the other registers and, if a transaction is in progress, its state. Note that a task can be involved in only two transactions at a time: It can be doing a transaction with a server while serving a request for a client itself. State has to be maintained for both ongoing transactions. Later, we shall return to the issues of starting and stopping clusters with tasks that are in the middle of a transaction.
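The fields of the cluster descriptor can be collected into a data-structure sketch. The Python class and field names below are assumptions chosen to mirror the description, not the actual Amoeba layout, and `can_host` shows only one plausible way a kernel might use the host descriptor to decide whether it can handle a cluster.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class MappingDescriptor:
    segment_cap: Optional[bytes]  # None models the null capability (empty segment)
    address: int
    length: int
    how: str = "read-write"


@dataclass
class TaskDescriptor:
    runnable: bool  # False: blocked on a semaphore or condition variable
    program_counter: int
    stack_pointer: int
    status_word: int = 0
    registers: List[int] = field(default_factory=list)
    # At most two entries: one transaction as client, one as server.
    transaction_states: List[bytes] = field(default_factory=list)


@dataclass
class ClusterDescriptor:
    host: Dict[str, object]  # typed entries, e.g. {"instruction set": "VAX"}
    accounting: Dict[str, object]
    exception_handler_port: int
    mappings: List[MappingDescriptor]  # "number of segments" is len(mappings)
    tasks: List[TaskDescriptor]  # "number of tasks" is len(tasks)


def can_host(kernel_host, desc):
    """Accept the cluster if the instruction set matches and its memory fits."""
    if desc.host.get("instruction set") != kernel_host["instruction set"]:
        return False
    return desc.host.get("memory size", 0) <= kernel_host["memory size"]
```

Because the descriptor carries the whole task state, the same structure could in principle serve both for starting a fresh cluster and for handing an existing one to another kernel.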

    Thus, the kernel server has all the information it needs to start up the new cluster. It returns a cluster capability to the client issuing the request, so that only this client, as the owner of the new cluster, can exert control over the cluster.

    3.3. The Kernel Server Interface

    TABLE 1 lists the interface with the kernel server for process management. The first argument to a request is the capability of the object the request refers to. The CreateCluster request refers to an object that does not yet exist; its capability is a Kernel Capability that provides protection against unauthorized clients spawning clusters on kernels they have no access to. Ack indicates a generic success or failure reply. In case of failure it gives a reason as well.

    Half the 'system calls' are implemented as transactions with the kernel, the other half as traps into the kernel. The reason for implementing some of the calls as traps is one of efficiency. Actions such as MakeTask or P are executed very very often and need to be implemented in the absolute minimum number of instructions possible. Note that the system calls have effect only within the cluster that issues them. The transaction calls need the protection of the Amoeba protection mechanism.

    CreateCluster creates a new cluster. Its argument is a cluster descriptor that describes the cluster to be started. For each task of the cluster, the descriptor includes a task descriptor giving program counter, stack pointer and register contents, and for each segment, the mapping information is present in the mapping descriptors. In 5.2 we shall return to the problems of creating clusters in an arbitrary state, such as needed for migration.

    Transactions with Kernel Service
      Cluster creation and deletion (cluster may delete self):
        CreateCluster(KernelCap, ClusterDesc); returns ClusterCap
        DeleteCluster(ClusterCap); returns ClusterDesc
      Interrupting clusters:
        Signal(ClusterCap, SignalType, Parameter); returns ack

    Kernel System Calls
      Task management:
        MakeTask(ProgramCounter, StackPointer); returns ack
        ExitTask(); does not return
      Synchronization:
        P(Semaphore);
        V(Semaphore);
        Sleep(Condition);
        Wakeup(Condition);

    TABLE 1. Kernel requests and system calls for process management.

    DeleteCluster deletes a cluster and returns its cluster descriptor. The cluster descriptor returned could be given to a CreateCluster command and the cluster would continue where it was stopped, if it weren't for the fact that other clusters communicating with this one may have been told that it was killed between the DeleteCluster and the CreateCluster. Suspending or migrating a cluster is trickier than this. The details are described in 5.2.

    Segments are created by doing a seg_map operation on every segment described in the segment descriptor. The contents of the segment whose capability is provided in the call form the initial contents. An empty segment is made by specifying the null capability.

    The execution of a cluster can be interrupted by sending a signal. A signal causes a cluster to freeze in its tracks and its state to be sent to a debugger server. To handle the signal, the debugger can inspect and change the state of the cluster before allowing it to continue execution. The signal- and exception-handling mechanism is described in 4.

    The Kernel Service transactions described above are protected by the normal capability-based protection mechanisms of the Amoeba system: An application can only create clusters on those processors for which it has a Kernel Capability. An unauthorized user can thus easily be prevented from running clusters on another user's workstation, for instance. Segments are also protected with the capability mechanism. One user's private segment cannot be mapped by another without explicit permission. Signals can only be sent to clusters by holders of an owner capability for the cluster.

    T h e c a ll s w e a r e a b o u t t o d e sc r ib e d o n o t n e e d t h i s h e a v y - w e i g h t p r o t e c t i o nm e c h a n i s m , b e c a u s e t h e y o n l y a ff e c t t h e c l u s te r f r o m w h i c h t h e y a r e d o n e . T h et a s k m a n a g e m e n t c a ll s a n d t a s k s y n c h r o n i z a t i o n c a ll s c o u l d t h e r e fo r e b e s af e lyi m p l e m e n t e d a s ' r e a l ' s y s t e m ca l ls , w h i c h i s f o r t u n a t e , b e c a u s e t h e i r e ff i ci e n ti m p l e m e n t a t i o n is c r it ic a l t o p e r f o r m a n c e .

    A new task is created with a MakeTask system call. The parameters are a program counter and a stack pointer. The new task will start execution at the address indicated by the program counter. A new task cannot be started in the middle of a transaction; registers are undefined. A task can delete itself by an ExitTask call.

    For synchronization, four calls are provided: P and V, operating on binary semaphores, and Sleep and Wakeup on condition variables [3]. Sleep puts a task to sleep and Wakeup wakes up every task sleeping on the condition. These primitives are essentially the same as those in the Topaz distributed system, and its predecessor, Mesa [4]. In the normal case (no contention for the semaphore), P and V execute completely in user space. A system call on P is only necessary if the semaphore has already been acquired by another task; on V, one is necessary only if another task is blocked waiting for it. We stole the idea for this optimization from Topaz.
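
The fast-path rule for P and V can be shown with a single-threaded simulation. It is only a sketch of the idea, not the Amoeba or Topaz implementation: the class and its `kernel_calls` counter are hypothetical, and a real version would need an atomic test-and-set on the semaphore word, which this model leaves out.

```python
class BinarySemaphore:
    """Simulated binary semaphore with a user-space fast path."""

    def __init__(self):
        self.held = False      # is the semaphore currently acquired?
        self.waiters = []      # tasks blocked in the (simulated) kernel
        self.kernel_calls = 0  # how often a system call was needed

    def P(self, task):
        """Try to acquire; return True if acquired, False if task blocks."""
        if not self.held:
            self.held = True   # fast path: no contention, no system call
            return True
        self.kernel_calls += 1  # slow path: the kernel must block the task
        self.waiters.append(task)
        return False

    def V(self):
        """Release; return the task that was woken up, or None."""
        if not self.waiters:
            self.held = False  # fast path: nobody is waiting
            return None
        self.kernel_calls += 1  # slow path: the kernel wakes a waiter
        return self.waiters.pop(0)  # ownership passes to the woken task
```

In this model an uncontended P/V pair leaves `kernel_calls` at zero; the counter only moves when a second task arrives while the semaphore is held, or when V finds a blocked task.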

    4. THE DEBUG SERVER
    When an Amoeba cluster traps because of an exception, a debugger is automatically invoked. The Debug Server, a user-space cluster with no special privileges, can reside on the same kernel as the faulty cluster, but it can also be remote. For remote debugging, however, some help from the Process Server is desirable. In this section, we shall describe the mechanisms for handling exceptions and signals.

    Exceptions and signals are different, but handled identically. An exception is essentially a synchronous event, caused by a cluster to itself. Typical exceptions are division by zero, addressing non-existent virtual memory, attempting to execute non-instructions, etc. A signal is an asynchronous event, caused by a source external to a cluster. Signals are typically caused by humans hitting the interrupt key on their terminal and they are meant to terminate execution of a cluster, or at least make it interrupt its normal flow of execution. Signals play an important role in migration, as we shall see in the next section.

    Signals and exceptions interrupt the execution of a cluster. Exceptions generally cause a hardware trap, which is handled by the kernel. Similarly, signals also end up in the kernel on which the cluster executes. Both signals and exceptions cause the following things to happen:

    1. All running tasks in the cluster stop execution. On a multiprocessor, it is not possible to stop all tasks atomically; here, we attempt to stop the tasks as quickly as possible.

    2. Active transactions are frozen: the transaction protocol replies to incoming messages with a "try again later, this cluster is frozen" response. This will cause the sending protocol entities to retry sending the same message later, repeatedly, without giving up as long as this reply is given.

    3. A cluster descriptor for the signaled cluster is made and the Kernel sends a PleaseDebug request to the server whose capability was in the signal capability field of the cluster descriptor when the cluster was created.

    4. The Kernel then waits for a reply from the Debug Server, which may contain a modified cluster descriptor. Any modifications are incorporated into the state of the cluster.

    5. The cluster resumes execution, possibly in a modified state.

    Ongoing transactions are only frozen in a few well-defined states: servers can be frozen while waiting for an incoming request (but not after the request has started coming in), or while processing a request (between the completion of getrequest and the call of putreply). Clients can only be frozen between the moment sending the request has completed and the moment the reply starts coming in. Further, clients or servers cannot be frozen while the protocol is waiting for an acknowledgement. Clusters that are neither client nor server (i.e., in between transactions) can always be frozen. Note that transactions thus cannot be frozen if messages may have to be retransmitted (waiting for an acknowledgement). Note also that the times during which transactions may not be frozen are bounded in length (by maximum number of retransmissions, maximum number of packets in a message, retransmission time and maximum packet lifetime) and are generally short.
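
The rules about when a transaction may be frozen amount to a predicate over protocol states. The sketch below encodes them as such; the state names are invented for illustration and do not correspond to identifiers in the Amoeba protocol implementation.

```python
from enum import Enum, auto


class TransState(Enum):
    """Hypothetical transaction states of one cluster."""
    IDLE = auto()                      # in between transactions
    SERVER_AWAITING_REQUEST = auto()   # no request has started coming in
    SERVER_RECEIVING_REQUEST = auto()  # request is coming in
    SERVER_PROCESSING = auto()         # after getrequest, before putreply
    SERVER_SENDING_REPLY = auto()      # putreply in progress
    CLIENT_SENDING_REQUEST = auto()    # request not yet completely sent
    CLIENT_AWAITING_REPLY = auto()     # request sent, reply not yet started
    CLIENT_RECEIVING_REPLY = auto()    # reply is coming in
    AWAITING_ACK = auto()              # a retransmission may still be needed


# The only states in which the text allows a cluster to be frozen.
FREEZABLE = frozenset({
    TransState.IDLE,
    TransState.SERVER_AWAITING_REQUEST,
    TransState.SERVER_PROCESSING,
    TransState.CLIENT_AWAITING_REPLY,
})


def can_freeze(state: TransState) -> bool:
    """True if a cluster in this transaction state may be frozen now."""
    return state in FREEZABLE
```

Because the non-freezable states are bounded in duration, a kernel that finds `can_freeze` false can simply wait for the protocol to move on before freezing.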

    The replies the Debug Server can give to the PleaseDebug request are continue or delete. The former allows the cluster to continue execution; if a modified cluster descriptor accompanies the reply, the state of the cluster is first adapted. The latter does not restart the cluster but deletes it.
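
The kernel's side of the signal-handling sequence, including the two possible replies, can be sketched as follows. The `Cluster` record, the dictionary used as a cluster descriptor, and the `please_debug` callback standing in for the transaction with the Debug Server are assumptions made for this illustration.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple

# A reply is ("continue" | "delete", optional modified descriptor).
Reply = Tuple[str, Optional[dict]]


@dataclass
class Cluster:
    descriptor: dict       # stand-in for the cluster descriptor
    running: bool = True
    frozen: bool = False
    deleted: bool = False


def handle_signal(cluster: Cluster,
                  please_debug: Callable[[dict], Reply]) -> None:
    """Freeze the cluster, consult the debugger, then resume or delete."""
    cluster.running = False              # step 1: stop all tasks
    cluster.frozen = True                # step 2: freeze transactions
    snapshot = dict(cluster.descriptor)  # step 3: make a descriptor
    verdict, modified = please_debug(snapshot)  # steps 3-4: ask and wait
    if verdict == "delete":
        cluster.deleted = True           # the cluster is not restarted
        return
    if verdict != "continue":
        raise ValueError(f"unexpected reply: {verdict!r}")
    if modified is not None:
        cluster.descriptor = modified    # adapt the state first
    cluster.frozen = False
    cluster.running = True               # step 5: resume execution
```

A debugger that returns `("continue", None)` lets the cluster resume untouched, while returning a changed descriptor makes it resume in the modified state.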

    5. PROCESS MANAGEMENT
    Most processes will be created and managed by a command interpreter, but any other process may also create new ones. All that is required is the capability that allows communication with an Amoeba Kernel. Most users will have access to the cluster creation capability for the Amoeba Kernel running on their own workstation; that is, users can create new processes on their own workstation.

    The capabilities for creating processes on pool processors will typically be kept by a "Processor Pool" service that will act as an agent for running programs on behalf of user processes. Load balancing can be achieved by the Processor Pool service when it allocates pool processors judiciously.

    5.1. Migration
    Although clusters rarely move to a new host after being started up, migration is a central concept in the Amoeba process management mechanisms. This is because loading new clusters into memory, taking core dumps, making checkpoints, and doing remote debugging are all similar to migrating a cluster. In fact, if we can migrate a cluster from one machine to another, downloading, checkpointing, debugging, etc., should be simple.

    Load balancing by migrating clusters is a poorly understood area and it is dubious whether it is very useful with the current sort of workstations and networks. Migrating a five megabyte cluster, for instance, will take at least seven seconds, because that is how long it takes a fast transport protocol to copy the memory contents over a 10 Mbit Ethernet; five megabyte programs are not at all uncommon, especially as candidates for migration: long-lived clusters are usually large too. Migration is thus rather expensive, and the gain of a migrate operation must be big in order to merit one.
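
The seven-second figure can be checked with a little arithmetic. The sketch below assumes a megabyte of 10^6 bytes and a 10 Mbit/s Ethernet with no overhead at all, neither of which is stated explicitly in the text; it compares the raw wire time with the effective throughput the quoted figure implies.

```python
MB = 1_000_000  # assumption: one megabyte is 10**6 bytes


def transfer_seconds(size_bytes: float, bytes_per_second: float) -> float:
    """Time needed to copy size_bytes at the given sustained throughput."""
    return size_bytes / bytes_per_second


cluster_size = 5 * MB
wire_rate = 10_000_000 / 8  # 10 Mbit/s expressed in bytes per second

# Lower bound: the raw wire with no protocol overhead whatsoever.
ideal = transfer_seconds(cluster_size, wire_rate)

# Throughput implied by the "at least seven seconds" quoted in the text.
implied_rate = cluster_size / 7.0

print(f"ideal wire time: {ideal:.1f} s")
print(f"implied throughput: {implied_rate / MB:.2f} MB/s")
```

Under these assumptions the wire alone needs four seconds, so seven seconds corresponds to an effective rate of roughly 0.7 megabyte per second.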

    In spite of this, we believe that migration can be useful. When a workstation's owner logs off in the evening, the workstation can turn itself into a Pool Processor and provide process-execution service to the rest of the system. When the owner returns in the morning, however, and logs back on, the guest clusters running there could be nudged off by migrating them away to some other workstation [10].

    The kernels implement the cluster migration mechanism. They do not implement a policy; the decision to migrate a cluster and where to migrate it is made in a higher level of service. We shall not go into how this decision is made. The process that orders the migration will be called the Process Server.

    When a cluster moves from one machine to another, the kernel at the old machine makes the memory contents and cluster descriptor of the cluster available to the kernel on the new machine. The kernel on the new machine loads the cluster into memory and starts it off. We will call these kernels Old Host and New Host. We will examine the migration of a cluster from the point where Process Server has decided to migrate the cluster to New Host. Process Server has to set things up to handle the cluster's signals as the cluster's debugger.

    First, Process Server sends a signal to the cluster, which causes Old Host to freeze it in its tracks and send a cluster descriptor to Process Server (Process Server acts as the debugger for this cluster). Then, Process Server sends the cluster descriptor to New Host in a RunCluster request.

    New Host, when it receives the RunCluster request, creates the necessary segments, initializes them by sending Seg_Read requests to Old Host and maps them into the new cluster's address space. Both client (New Host) and server (Old Host) can send and receive directly out of mapped memory; a cluster's memory contents can thus be copied over an Ethernet at speeds well above half a megabyte per second.

    When all the segment contents have been copied, New Host starts the cluster and sends a reply to Process Server containing the new cluster's capability. Process Server then deletes the old cluster with a DeleteCluster request to Old Host.
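
The migration sequence just described (signal, RunCluster, Seg_Read, DeleteCluster) can be traced in a toy simulation. Everything here is a simplified stand-in: the `Kernel` class, the dictionary layout of a cluster, and the capability strings are hypothetical, and New Host is handed a direct reference to Old Host instead of reaching it through capabilities and network transactions.

```python
class Kernel:
    """Toy model of an Amoeba kernel that holds clusters in a dict."""

    def __init__(self, name):
        self.name = name
        self.clusters = {}  # capability -> cluster record
        self._counter = 0

    def signal(self, cap):
        """Freeze the cluster and hand its descriptor to the debugger."""
        cluster = self.clusters[cap]
        cluster["frozen"] = True
        return dict(cluster["descriptor"])

    def seg_read(self, cap, segment_name):
        """Serve the contents of one segment of a (frozen) cluster."""
        return bytes(self.clusters[cap]["segments"][segment_name])

    def run_cluster(self, descriptor, old_host, old_cap):
        """Create segments, fill them from Old Host, start the cluster."""
        segments = {name: bytearray(old_host.seg_read(old_cap, name))
                    for name in descriptor["segments"]}
        self._counter += 1
        new_cap = f"{self.name}:{self._counter}"
        self.clusters[new_cap] = {"descriptor": descriptor,
                                  "segments": segments,
                                  "frozen": False}
        return new_cap

    def delete_cluster(self, cap):
        """Delete a cluster and return its descriptor."""
        return self.clusters.pop(cap)["descriptor"]


def migrate(old_host, new_host, cap):
    """What Process Server does, acting as the cluster's debugger."""
    descriptor = old_host.signal(cap)  # freeze and obtain descriptor
    new_cap = new_host.run_cluster(descriptor, old_host, cap)
    old_host.delete_cluster(cap)       # only after New Host has replied
    return new_cap
```

Note the ordering in `migrate`: the old copy stays frozen on Old Host until New Host has replied, so at no point are two running copies of the cluster visible.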

    Note that, while migration was in progress, the cluster existed on its old host in 'frozen' condition. The kernel thus replied to all messages for the frozen cluster with a "try again later, this cluster is frozen" message. After the cluster has been deleted, those messages will come in again at some point, and the kernel will then reply with something like "this port is unknown at this address." The sender will then do a locate operation to find the new whereabouts of the cluster, and communication will be re-established.
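
Seen from the sender, the behaviour above is a retry loop with two special replies. The following sketch models it; the reply constants, the `hosts` table of callables and the `locate` function are invented for illustration, and the `max_attempts` bound exists only to keep the sketch terminating (the text says the sender does not give up while it keeps receiving the "frozen" reply).

```python
FROZEN = "try again later, this cluster is frozen"
UNKNOWN_PORT = "this port is unknown at this address"


def send_with_relocation(port, message, host, hosts, locate,
                         max_attempts=10):
    """Send a message, retrying while frozen and relocating when moved.

    hosts maps a host name to a callable (port, message) -> reply;
    locate maps a port to the name of the host currently serving it.
    """
    for _ in range(max_attempts):
        reply = hosts[host](port, message)
        if reply == FROZEN:
            continue             # same message, same host, a bit later
        if reply == UNKNOWN_PORT:
            host = locate(port)  # find the new whereabouts of the port
            continue
        return reply
    raise TimeoutError("no reply within the attempt bound of this sketch")
```

A sender whose peer is migrated mid-conversation would first see the frozen reply a few times, then the unknown-port reply once, and finally the real answer from the new host.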

    The protocol for dealing with message transactions during migration is more subtle than described here, but would take too much space to describe fully. To preserve the at-most-once semantics of Amoeba message transactions, client and server need to use unique communication ports so that the locate operation cannot yield the address of the wrong server, for instance.

    6. EMULATION SERVICE
    One of the most important applications of the rather general mechanisms for handling signals, traps and exceptions in the previous section is that it allows the emulation of any operating system environment. Amoeba was developed in a UNIX environment, which is why we have concentrated on UNIX emulation, but there is no reason why any other operating system interface could not be emulated.

    We have implemented two forms of UNIX emulation: by intercepting the system calls at the level of the C source code, or at the level of the system call. The former is simpler to realize and, combined with tailored supporting services, gives adequate performance. The latter is more complicated, but it can be used to provide binary compatibility: binaries that run under ordinary UNIX can be made to run under Amoeba without changing a single bit.

    We have done both under a previous version of the Amoeba Kernel. The library for UNIX emulation at the source-code level will remain practically unchanged under the new process management dominion. The Kernel version that UNIX emulation runs on now maintains a table of {task capability, emulator capability} pairs. When a task traps, and an entry is found in the table, the registers (PC, SP, PSW and general purpose registers) and the address of the interrupt vector are sent to the emulator. The emulator uses transactions with the Segment Server (a server for reading and writing memory which will be replaced by the Process Server under the new process management regime) to get at the memory contents of the cluster. It returns new values for the registers to the kernel. The emulator itself runs on UNIX, which was modified to allow doing transactions. The emulator interprets the system calls given to it by doing them on UNIX and passing the results back.
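
The table-driven dispatch of traps to an emulator can be sketched in a few lines. The capability strings, the register names and the `toy_emulator` (which pretends every trap is a system call returning 42 and skips a two-byte instruction) are made up for illustration; the real emulator performs the call on UNIX and also reads the cluster's memory through the Segment Server.

```python
from typing import Callable, Dict, Optional

Registers = Dict[str, int]
Emulator = Callable[[Registers, int], Registers]


def on_trap(task_cap: str, registers: Registers, vector_address: int,
            emulator_table: Dict[str, str],
            emulators: Dict[str, Emulator]) -> Optional[Registers]:
    """Send a trapping task's registers to its emulator, if it has one."""
    emulator_cap = emulator_table.get(task_cap)
    if emulator_cap is None:
        return None  # no table entry: the trap is not emulated
    # The emulator gets a copy and returns new values for the registers.
    return emulators[emulator_cap](dict(registers), vector_address)


def toy_emulator(registers: Registers, vector_address: int) -> Registers:
    """Hypothetical emulator: fake a result and step past the trap."""
    registers["r0"] = 42  # made-up system call result
    registers["pc"] += 2  # made-up instruction width
    return registers
```

The kernel side (`on_trap`) contains no knowledge of the emulated system calls; all interpretation is left to whatever emulator the table entry names.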

    Both in this scheme and the new one, the Amoeba Kernel has no knowledge whatsoever of UNIX system calls. It merely invokes the debugger when a user task traps. The differences between the working system and the one we are implementing are the following: In the old one, processes to be emulated are created through the emulator, which keeps track of most of their state; the state given to the emulator consists of just the registers. In the new scheme, the state will be the cluster descriptor. Clusters to be emulated need not be created by the emulator. In the old scheme, memory is read through transactions with the Segment Server. In the new one, memory can be read and written directly by the emulator, because it is mapped into its own address space.

    When we have some experience with this arrangement, we will decide if this new path through the Kernel to the UNIX emulator is too long. If so, we shall have to construct a representation of a light-weight state that can be given to the emulator instead of the current, rather heavy-weight, cluster descriptor. In any case, the emulator will have the emulated cluster's memory mapped into its own address space as well, providing very efficient memory access.

    7. CONCLUSIONS
    This paper reveals the tip of an iceberg. Building a coherent set of primitives for process management that includes migration of clusters, checkpointing, debugging and emulation of arbitrary operating system interfaces involves very careful design, not only of the mechanisms that deal with process management directly, but also of all of the surrounding environment. In this section, we shall attempt to lift out some of the design considerations that made it possible for us to design the system as we did.

    The Amoeba Kernel provides a minimum of functions: process management and interprocess communication. There is thus also a minimum amount of state that has to migrate when a cluster migrates. We believe that this was one of the essential choices that made our mechanisms work. Things would have been much more difficult if we had to deal with things like 'open file state,' 'controlling terminals' or the complicated connection state of a sliding-window protocol.

    The Amoeba interprocess communication mechanism has also been vital to the success of our design. First, the communicating entities are named using a location-independent naming mechanism that uses an underlying locate service to find out dynamically where the packets have to be sent. None of the migration apparatus has to worry about rerouting messages, no forwarding addresses have to be left behind [9]; ex-hosts can forget about the existence of a cluster immediately after migration is complete.

    Second, the simplicity of the Amoeba protocols contributes enormously to the portability of clusters. The protocol has only a few states in which it can stay for arbitrary lengths of time and it is relatively easy to migrate a cluster in these states using the "I'm frozen, don't bother me" messages described earlier. When the protocol is in any of the other states, the Amoeba Kernel can wait until the protocol reaches a 'migratable' one.

    The most important conclusion we have drawn from this design (which is still being implemented) is that it is possible to build a simple mechanism that is sufficient to realize downloading, migration, exception handling, checkpointing, emulation and debugging. Although the implementation is not complete at the time of writing this paper, we expect to finish soon enough to present performance information at the SOSP conference.

    8. ACKNOWLEDGEMENTS
    Jack Jansen (who is implementing the system discussed in this paper), Robbert van Renesse and I have had many discussions about the Amoeba process management design which improved it considerably.

    Dave Redell and Lucille Glassman read the draft of this paper and suggested numerous changes to make it readable.

    REFERENCES
    1. D. R. CHERITON AND W. ZWAENEPOEL (October 1983). The Distributed V Kernel and its Performance for Diskless Workstations, Proc. Ninth ACM Symp. on Operating Systems Principles, 128-140.
    2. D. R. CHERITON (January 1987). VMTP: Versatile Message Transaction Protocol, Stanford University Computer Science Dept. Report.
    3. C. A. R. HOARE (August 1978). Communicating Sequential Processes, Communications of the ACM, 21.8, 666-677.
    4. B. W. LAMPSON AND D. D. REDELL (February 1980). Experience with Processes and Monitors in Mesa, Communications of the ACM, 23.2, 105-117.
    5. S. J. MULLENDER AND A. S. TANENBAUM (1984). Protection and Resource Control in Distributed Operating Systems, Computer Networks, 8.5,6, 421-432.
    6. S. J. MULLENDER AND R. VAN RENESSE (September 1984). A Secure High-Speed Transaction Protocol, Proceedings of the Cambridge EUUG Conference.
    7. S. J. MULLENDER (October 1985). Principles of Distributed Operating System Design: SiC, Amsterdam.
    8. S. J. MULLENDER AND A. S. TANENBAUM (1986). The Design of a Capability-Based Distributed Operating System, The Computer Journal, 29.4, 289-300.
    9. M. L. POWELL AND B. P. MILLER (1983). Process Migration in DEMOS/MP, Proc. Ninth Symp. Operating Syst. Prin., 110-119, ACM.
    10. M. M. THEIMER, K. A. LANTZ, AND D. R. CHERITON (December 1985). Preemptable Remote Execution Facilities for the V-System, Proceedings of the 10th Symposium on Operating Systems Principles, 2-12.
    11. R. W. WATSON AND J. G. FLETCHER (February 1980). An Architecture for Support of Network Operating System Services, Computer Networks, 4.1, 33-49.
    12. E. ZAYAS (November 1987). Attacking the Process Migration Bottleneck, Proc. 11th SOSP, 13-24.