Trustless Grid Computing in ConCert (Progress Report)
Robert Harper
Carnegie Mellon University
Acknowledgements
• Co-PIs: Karl Crary, Frank Pfenning, Peter Lee.
• Support: NSF ITR program.
• Students (who do the real work): Chang, Delap, Dreyer, Kliger, Magill, Moody, Murphy, Petersen, Sarkar, Vanderwaart, Watkins.
• Thanks to FGC Organizers for the invitation!
Grid Computing
• “The network is a computer.”
  – Exploit idle resources on the network.
  – Many ad hoc grids.
    • SETI@HOME
    • FOLDING@HOME
• But what is a general grid model?
  – Trust model, programming model, participation model?
Application Model
• What is the (a?) grid computer?
  – Parallelism?
  – Dependencies?
  – Sharing resources?
  – Failures?
• Centralized vs. distributed.
  – Bottlenecks (e.g., SETI traffic at UCB).
  – Reliability, robustness.
Application Model
• Most grid apps are massively parallel.
  – Depth = 1, no dependencies.
  – Ray tracing, GIMPS, SETI.
• Is a grid useful for depth > 1?
  – Game-tree search.
  – Theorem proving.
• Is parallelism the only benefit?
  – What about data locality?
Host Model
• Active intervention required.
  – Must download code, apply upgrades.
  – Must decide which grids to participate in.
• Motivation to participate?
  – At scale, largely altruism, coolness.
  – Ad hoc grids on an intranet.
  – Economic models? (Cf. Lillibridge et al.)
Trust Relationships
• Hosts trust applications.
  – Denial of service attacks.
  – Privacy/secrecy attacks.
  – Accidental misbehavior (e.g., SETI).
• Applications trust hosts.
  – Spoofed answers.
  – Collusion among participants.
• Can we minimize these?
The ConCert Approach
• One computer, many keyboards.
  – Decentralized scheduling.
  – Emphasis on code mobility.
• Policy-based participation.
  – Declarative statement of participation criteria.
  – Applications must prove compliance.
• Dependency-based scheduling.
  – Arbitrary depth.
  – And/or dependencies.
  – Inspired by CILK/NOW.
The ConCert Network
[Diagram of the network, with labels: Client, Hosts.]
Host Setup
• Locator: peer-to-peer discovery protocol.
• Scheduler: distributed scheduler.
• Worker: loader/verifier/runner.
Scheduler
• Maintain ready and waiting queues.
  – Ready queue: available for “stealing”.
  – Wait queue: awaiting satisfying assignment.
• Work-stealing model.
  – Who has work to do?
  – Grab work, compute result, deliver to owner.
• Dependencies.
  – Supports depth > 1 parallelism.
  – Don’t care and don’t know parallelism.
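The two queues can be sketched in a few lines of Python. This is an illustrative toy, not the ConCert scheduler: the class name `Scheduler` and its methods `submit`, `steal`, and `deliver` are hypothetical, everything runs on one host, and only and-dependencies are modelled.

```python
from collections import deque


class Scheduler:
    """Toy per-host scheduler (hypothetical names, and-dependencies only)."""

    def __init__(self):
        self.ready = deque()   # cords that any idle worker may steal
        self.waiting = {}      # cord name -> set of results still missing
        self.results = {}      # cord name -> delivered result

    def submit(self, name, needs=()):
        """Queue a cord; it waits until every result in `needs` has arrived."""
        missing = {d for d in needs if d not in self.results}
        if missing:
            self.waiting[name] = missing
        else:
            self.ready.append(name)

    def steal(self):
        """Called by an idle worker; returns a cord name, or None if no work."""
        return self.ready.popleft() if self.ready else None

    def deliver(self, name, result):
        """Record a result and promote every cord that it unblocks."""
        self.results[name] = result
        for waiter in list(self.waiting):
            self.waiting[waiter].discard(name)
            if not self.waiting[waiter]:
                del self.waiting[waiter]
                self.ready.append(waiter)
```

A cord submitted with `needs=("F", "G")` stays in the wait queue until both results have been delivered, and only then becomes stealable.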
Scheduler
• The unit of work on the grid is a cord.
Scheduler
• Cord structure:
  – Code: cached using MD5 fingerprints.
  – Certificate of compliance: (more later).
  – Dependencies: positive boolean formula.
• Assumptions:
  – Idempotent: can always be re-run.
  – Non-blocking: runs to completion (but may create more cords, often as continuations).
  – Communication only via dependencies. Satisfying assignment passed on activation.
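A minimal Python sketch of this structure follows. The encoding of formulas as nested tuples, the `Cord` field names, and the function `satisfy` are all assumptions made for illustration; the slides do not show ConCert's actual representation. The point is that a positive formula (and/or, no negation) is either unsatisfied or yields a satisfying assignment that can be handed to the cord when it is activated.

```python
import hashlib
from dataclasses import dataclass

# Assumed encoding of a positive boolean formula over cord names:
#   "F"                    -> needs the result of cord F
#   ("and", f1, f2, ...)   -> needs every subformula
#   ("or",  f1, f2, ...)   -> needs any one subformula


def satisfy(formula, results):
    """Return a satisfying assignment drawn from `results`, or None."""
    if isinstance(formula, str):
        return {formula: results[formula]} if formula in results else None
    op, *parts = formula
    if op == "and":
        witness = {}
        for part in parts:
            sub = satisfy(part, results)
            if sub is None:
                return None
            witness.update(sub)
        return witness
    if op == "or":
        for part in parts:
            sub = satisfy(part, results)
            if sub is not None:
                return sub
        return None
    raise ValueError(f"unknown connective: {op!r}")


@dataclass(frozen=True)
class Cord:
    code: bytes                  # object code
    certificate: bytes           # certificate of compliance (opaque here)
    argument: str
    depends: object = ("and",)   # empty conjunction: no dependencies

    @property
    def fingerprint(self):
        """MD5 digest of the code, usable as a cache key."""
        return hashlib.md5(self.code).hexdigest()
```

With this encoding, `("or", "F", "G")` is satisfied as soon as either result arrives, while `("and", "F", "G")` needs both.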
Worker
• Steal work from (self or) neighbor.
• Obtain cord from host.
  – Typically arguments + dependencies.
  – Code shipped at most once.
• Verify certificate of compliance.
• Load and execute as a DLL.
  – Currently combined with verification.
  – Should verify at most once (cache result).
• Deliver result to owner.
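The "ship at most once" and "verify at most once" points amount to two caches keyed by the code fingerprint. The Python sketch below assumes this reading. The class `Worker` and the callables `fetch_code`, `verify`, and `execute` are hypothetical stand-ins for the network, the certificate checker, and the loader; none of them is a ConCert API.

```python
import hashlib


class Worker:
    """Toy worker; fetch_code, verify and execute are caller-supplied stand-ins."""

    def __init__(self, fetch_code, verify, execute):
        self.fetch_code = fetch_code
        self.verify = verify
        self.execute = execute
        self.code_cache = {}    # fingerprint -> code bytes
        self.verified = set()   # fingerprints whose certificate was accepted

    def run(self, fingerprint, argument, witness):
        # Code is shipped at most once: later cords reuse the cached copy.
        if fingerprint not in self.code_cache:
            code = self.fetch_code(fingerprint)
            if hashlib.md5(code).hexdigest() != fingerprint:
                raise ValueError("code does not match its fingerprint")
            self.code_cache[fingerprint] = code
        code = self.code_cache[fingerprint]
        # Verify at most once per fingerprint and remember the verdict.
        if fingerprint not in self.verified:
            if not self.verify(code):
                raise PermissionError("certificate of compliance rejected")
            self.verified.add(fingerprint)
        return self.execute(code, argument, witness)
```

Running two cords that share the same code triggers one fetch and one verification; only the arguments and witness differ between the runs.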
Control
• Client.
  – Submit a job to the grid.
  – “One per keyboard.”
• Monitor.
  – Web server interface.
  – Displays cord status.
  – [Change policy.]
Moving Cords Around
A client submits work, broken into cords, to the local conductor.
Moving Cords Around
Idle peers steal cords to work on.
Cords have destinations for their answers, shown by color here.
Moving Cords Around
Some cords spawn new cords. They might depend on other cords before they can run.
The destination of F and G is the green node, since they will be used to fill H’s dependencies.
Moving Cords Around
When a cord finishes, the result is sent to its destination. The client interprets and displays the results.
Simultaneously, unfinished cords continue to be stolen...
Moving Cords Around
When the green node has answers for F and G, H is then ready to be stolen.
Popcorn/Grid Model
• my_cord : string × witness → string.
  – Marshals argument and result itself.
  – Witness is the satisfying assignment for its dependencies.
• Typical structure:
  – Input = entry point + arguments.
  – Dispatch on entry point.
• Cords as distributed continuations.
  – Perform some work, spawn new cords.
  – Supports various higher-level parallelism models.
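The "string and witness in, string out" shape with dispatch on an entry point can be illustrated in Python. This is a sketch only: the JSON wire format, the entry-point names `square` and `sum`, and the representation of the witness as a dict from cord names to result strings are all assumptions, and the real cords are written in Popcorn.

```python
import json


def my_cord(argument: str, witness: dict) -> str:
    """Toy cord: unmarshal the request, dispatch on its entry point,
    and marshal the answer back to a string."""
    request = json.loads(argument)
    entry = request["entry"]
    if entry == "square":
        # Leaf work: no dependencies, so the witness is ignored.
        return json.dumps(request["args"][0] ** 2)
    if entry == "sum":
        # Continuation: combine the results that satisfied the dependencies.
        return json.dumps(sum(json.loads(v) for v in witness.values()))
    raise ValueError(f"unknown entry point: {entry!r}")
```

Here `sum` plays the role of a continuation: it would be created with dependencies on the `square` cords and activated with their results as its witness.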
ML/Grid Model
• One program for client and its cords.
  – Compiler separates client from cords.
• Compiler handles marshalling.
• Run-time checks enforce distinctions (more later).
  – Cord cannot perform I/O.
  – Client cannot submit itself as a cord.
• Compiles to TAL/Grid.
ML/Grid Model
• Primitives:
  – spawn : (unit → α) → α task
  – sync : α task → α
  – relax : α task list → α × α task list
• Must be provided as primitives.
  – Requires access to representations.
• Further higher-level libraries.
  – E.g., parallelism models.
Examples
• GML ray-tracer (ICFP01 Contest).
  – Depth = 1.
  – Written in Popcorn/Grid, compiles to TALx86/Grid.
• Chess player.
  – Depth > 1, and-or dependencies.
  – Written in Popcorn/Grid, compiles to TALx86/Grid.
• Theorem prover for MLL.
  – Depth > 1, and-or dependencies.
  – Written in SML, runs on simulator.
  – Being ported to ML/Grid.
Some Problems
• Failures.
  – Fail-stop model is easily supported.
  – Demonic failures require result certification.
• Abandoning cords.
  – Or-dependencies are satisfied by first cord to deliver answer.
  – Parent must be prepared to receive result long after it is no longer needed.
• Sharing results.
  – Grid-wide cache of answers?
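The abandonment problem can be made concrete with a small Python sketch of an or-join. The class `OrJoin` and its `deliver` method are hypothetical names; the only behaviour illustrated is that the first delivery wins and later deliveries from the same set of children are accepted but discarded.

```python
class OrJoin:
    """Toy or-dependency: the first child to answer satisfies the parent."""

    def __init__(self, children):
        self.children = set(children)
        self.answer = None   # (child, result) once satisfied
        self.done = False

    def deliver(self, child, result):
        """Return True if this delivery satisfied the join, False if it was late."""
        if child not in self.children:
            raise KeyError(f"unexpected child: {child!r}")
        if self.done:
            return False     # result arrived after it was no longer needed
        self.answer = (child, result)
        self.done = True
        return True
```

Because cords are assumed idempotent, discarding a late result is harmless; the cost is only the wasted work on the abandoned branch.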
Result Certification
• Main idea: make host prove validity of answer.
  – Avoid need for application to trust hosts.
• Some applications admit native certification.
  – For theorem prover: the proof.
  – For factoring: the factors.
• Are there general result certification methods?
  – Work-stealing model precludes random allocation / redundancy methods (SETI, Bayanihan).
  – Centralized methods are not robust or scalable.
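Factoring shows why native certification is attractive: checking the answer is far cheaper than finding it. A minimal Python sketch, with the hypothetical function name `check_factoring`, accepts a claimed factorization only if every factor is non-trivial and the product is the original number. It does not check that the factors are prime.

```python
def check_factoring(n, factors):
    """Accept a host's claimed factorization of n into non-trivial factors."""
    if not factors or any(f <= 1 or f >= n for f in factors):
        return False         # reject empty or trivial answers such as [1, n]
    product = 1
    for f in factors:
        product *= f
    return product == n
```

The application runs this check on whatever a host returns, so a spoofed answer is rejected without the application having to trust the host.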
Result Certification
• A crazy idea: use the PCP theorem.
  – Use interactive dialog to spot-check a proof.
• Host proves that it ran given code on given data.
  – Execution trace is a proof that it did.
  – But traces can be huge!
• Engage in a dialog with O(1) rounds to check proof with high probability.
  – Avoids need to transmit trace itself.
  – But the representation is enormous!
Two Foundational Questions
• What is a type system for a GPL?
  – Enforce mobility constraints.
  – Clean type system to support development, compilation, certification.
• What policies can we support?
  – How to state policies?
  – How to prove compliance?
  – How to support multiple policies?
A Type System for GPL
• Main idea: modalities for mobility.
  – Cf. related ideas by Cardelli, Gordon, et al.
  – Cf. recent work by Walker.
  – Here: Curry-Howard applied to modal logic.
• Necessity (□ A): a computation of A anywhere.
  – Classifies mobile code of type A.
  – Enforces marshalling and access restrictions.
• Possibility (◇ A): a computation of A somewhere.
  – Classifies remote code of type A.
  – Ensures that access is limited to remote values.
Necessity for Mobility
• Truth (local) typing judgement:
[Judgement figure; its callouts read: Valid (Mobile) Bindings, True (Local) Bindings.]
Necessity for Mobility
• Validity (mobile) typing judgement:
• Mobile = does not use local resources.
Necessity for Mobility
• Box = marshal value and bindings.
• Values of boxed type are mobile.
Necessity for Mobility
• Unboxing = unbox and run mobile code.
• Implicit un-marshalling:
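The formulas on the "Necessity for Mobility" slides did not survive in this transcript. For orientation, the block below gives the standard judgmental presentation of S4 necessity in the style of Pfenning and Davies, which matches the slides' vocabulary of valid (mobile) and true (local) bindings. It is a plausible reading, not a recovery of the slides: the term syntax (box, let box) and the metavariables Δ, Γ, u are assumptions.

```latex
% Requires amssymb for \Box.
% \Delta holds valid (mobile) bindings u :: A; \Gamma holds true (local) bindings x : A.

% Truth (local) judgement:
\[ \Delta;\ \Gamma \vdash M : A \]

% Validity (mobile) judgement: M uses no local bindings.
\[ \Delta;\ \cdot \vdash M : A \]

% Box: mobile code may be packaged as a value of boxed type.
\[ \frac{\Delta;\ \cdot \vdash M : A}
        {\Delta;\ \Gamma \vdash \mathsf{box}\ M : \Box A} \]

% Unboxing: bind the mobile code to a valid variable u.
\[ \frac{\Delta;\ \Gamma \vdash M : \Box A \qquad \Delta,\, u :: A;\ \Gamma \vdash N : C}
        {\Delta;\ \Gamma \vdash \mathsf{let\ box}\ u = M\ \mathsf{in}\ N : C} \]

% Using a valid variable at the current location.
\[ \frac{}{\Delta,\, u :: A,\, \Delta';\ \Gamma \vdash u : A} \]
```

In this reading, the empty local context in the premise of the box rule is what "does not use local resources" means, and the last rule is one candidate for where un-marshalling would happen implicitly.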
Necessity for Mobility
• Marshalling = cast into network form.
  – Base types, structured types: fairly typical.
  – Function types: certified binary.
• Code mobility is a form of semantic linking.
  – Import object from the network.
  – Un-marshall, verify, load, execute.
  – (More later.)
Possibility for Locality
• Possible (somewhere) typing judgement:
• What is here is somewhere:
Possibility for Locality
• Create a local reference to something somewhere:
Possibility for Locality
• Move to remote entity:
• May be useful for managing data locality.
  – Return call has type ◇□ (A → B).
  – Cf. “upcalls”.
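The rules for the "Possibility for Locality" slides are likewise missing from the transcript. The block below gives the standard judgmental rules for S4 possibility in the style of Pfenning and Davies, paired with the slide captions they most plausibly correspond to. The pairing, the term syntax (dia, let dia), and the symbol ÷ for the possibility judgement are assumptions.

```latex
% Requires amssymb for \Diamond.

% Possible (somewhere) judgement:
\[ \Delta;\ \Gamma \vdash E \div A \]

% "What is here is somewhere":
\[ \frac{\Delta;\ \Gamma \vdash M : A}
        {\Delta;\ \Gamma \vdash M \div A} \]

% "Create a local reference to something somewhere":
\[ \frac{\Delta;\ \Gamma \vdash E \div A}
        {\Delta;\ \Gamma \vdash \mathsf{dia}\ E : \Diamond A} \]

% "Move to remote entity": continue at the place where x lives,
% keeping only the mobile bindings in \Delta.
\[ \frac{\Delta;\ \Gamma \vdash M : \Diamond A \qquad \Delta;\ x : A \vdash E \div C}
        {\Delta;\ \Gamma \vdash \mathsf{let\ dia}\ x = M\ \mathsf{in}\ E \div C} \]
```

The second premise of the last rule drops the local context Γ, which is the type-theoretic counterpart of "access is limited to remote values": after the move, only mobile bindings and the remote value x are in scope.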
Modalities for Mobility
• These rules are for S4 modal logic.
  – Accessibility is reflexive and transitive.
• Is this the right notion of accessibility?
  – Symmetry = S5. “You can go home again.”
  – Judgmental form requires three contexts.
  – Explicit-world form uses a record of contexts.
• Other varieties of modal logic are also under consideration.
Policies and Certification
• Current certification methods are uniform.
  – ∃ sec. policy ∀ problems safety is assured.
    • E.g., PCC for Java.
    • E.g., TAL for Popcorn.
  – Safety means memory and type safety.
    • Baseline requirement.
    • But not adequate for all applications.
• Recall: policies should be per-host.
Foundational Certification
• Non-uniform setup: ∀ probs ∃ type system.
  – Shift the type system for object code out of the TCB (untrusted, problem-specific).
  – Must provide a proof that type system is safe.
• Compare Appel, et al.
  – Their goal: minimize TCB.
  – Our goal: support multiple safety policies.
  – Could be consolidated, but it’s a lot of work.
Foundational Safety
• Host specifies target architecture.
  – Fully realistic, e.g., IA-32 + OS + RTS.
  – No unsafe transitions.
• Safety policy: target does not get stuck.
  – Any type system must come with a proof of progress relative to the target machine.
  – Experience shows that progress proofs are readily mechanizable.
Foundational Certification (I)
Foundational Certification (I)
• Object code is essentially a DLL.
• Type system is specified in LF.
  – Using typical LF representations.
• Safety proof: well-typed ⇒ safe.
  – Represented as an LF term.
  – Obtained with Twelf proof search engine.
• Derivation: type annotations for code.
  – Makes mechanical checking feasible.
Foundational Certification (I)
• May cache type system and safety proof.
  – Reduces certificate size.
  – Many cords for one type system is typical.
• May use oracle strings for derivation.
  – Relies on details of operational behavior of host-side checker.
  – Therefore not completely declarative.
  – But significantly reduces certificate size.
Foundational Certification (II)
Foundational Certification (II)
• Object code is a DLL as before.
• Type checker is a program.
  – Currently, a Twelf logic program.
  – Could be ML code.
• Safety proof shows partial correctness of the checker.
  – Checking succeeds ⇒ safety.
• Annotations support mechanical checking.
• Time limit precludes looping.
  – Can refuse if limit is too large.
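Since the safety proof only establishes partial correctness, the host has to bound the checker's running time itself. The Python sketch below models that with a step budget rather than wall-clock time. The names `run_checker` and `toy_checker`, the use of a generator to expose steps, and the four verdict strings are all assumptions for illustration; the slides describe the checker as a Twelf logic program.

```python
def run_checker(checker, code, requested_steps, host_max_steps):
    """Run an untrusted checker under a step budget.

    `checker(code)` must be a generator that yields once per step and
    returns True (accept) or False (reject) when it finishes.
    """
    if requested_steps > host_max_steps:
        return "refused"                 # host declines overly large limits
    steps = checker(code)
    for _ in range(requested_steps):
        try:
            next(steps)
        except StopIteration as stop:
            return "accepted" if stop.value else "rejected"
    return "timeout"                     # budget exhausted: treat as failure


def toy_checker(code):
    """Hypothetical checker: accepts code built from a fixed instruction set."""
    for instr in code:
        yield                            # one step per instruction
        if instr not in {"add", "load", "ret"}:
            return False
    return True
```

A checker that loops forever simply exhausts its budget and the code is not loaded, so non-termination costs the host bounded time and never compromises safety.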
Examples
• TALT
  – Essentially TALx86 with a safety proof.
  – Proof is mechanically derived and checked.
  – Structured as a safety proof for an abstract machine plus a simulation lemma for target.
• TALT + Resource Bounds
  – Goal: ensure that object code yields processor at set intervals.
  – Precludes denial of CPU service.
Resource Bound Certification
• Type system enforces upper bound on yield interval.
  – Specified as a parameter of the type system.
• Basic method:
  – Conservative instruction counting (join points).
  – Yield processor at start of every basic block.
  – Prove that block can complete before next yield (else split block).
Resource Bound Certification
• Smarter techniques are under development.
  – Better analysis of code behavior across calls.
  – Fewer yields overall.
• Run-time checks reduce overhead.
  – Use static analysis to insert minor yields that check true interval.
  – Minor yields re-calibrate, possibly incurring a major yield (system call).
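One way to read the minor/major distinction is that a minor yield is a cheap check of the real elapsed interval, and the expensive system call happens only when the interval has actually run out. The Python sketch below follows that interpretation, which is an assumption; the class `YieldMeter` and its parameters are hypothetical names.

```python
import time


class YieldMeter:
    """Cheap minor yields that fall back to a major yield when time is up."""

    def __init__(self, interval, major_yield, clock=time.monotonic):
        self.interval = interval        # promised maximum time between yields
        self.major_yield = major_yield  # expensive operation, e.g. a system call
        self.clock = clock
        self.last = clock()             # time of the most recent major yield

    def minor_yield(self):
        """Check the true interval; return True if a major yield was needed."""
        if self.clock() - self.last >= self.interval:
            self.major_yield()
            self.last = self.clock()    # re-calibrate from the actual yield time
            return True
        return False
```

Static instruction counts are worst-case estimates, so checking the true interval lets most statically inserted yields cost only a comparison.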
A Meta-Grid?
• ConCert Conductor represents one model of grid computing.
  – Compute-intensive, distributed scheduling.
  – Not much reason to believe this is canonical.
• Can we support a variety of models inside of a single meta-grid?
  – Applications choose grid model.
  – Hosts are indifferent to programming model.
A Meta-Grid?
• The ur-grid:
  – A TCP port.
  – Foundational code certification.
• A grid framework:
  – Scheduler, recovery model, host policy.
  – Runs application cords.
A Meta-Grid?
• Key capability: safe dynamic loading and linking.
  – Current ConCert framework must be certified against host safety policy.
  – It must be able to load application policies and application code.
• Requires a fairly sophisticated theory of safe linking.
Semantic Linking
• Marshalling is meta-programming.
  – Create values of a grid type system.
  – Cast grid values as local values.
• Certification is how we marshal code.
  – Functions are marshalled as closures plus proof of compliance with host type system.
  – Ensures that cast will succeed, safely.
• The ur-grid is just an unmarshaller.
• Grid frameworks are meta-programs.
Summary
• Declarative approach to safe grids.
  – Passive, policy-based participation model.
  – Logic and proof technology for specifying policies and proving compliance.
• Close interplay between systems building and foundational theory.
  – Type systems for mobile code.
  – Type systems for various safety policies.
Thanks!
• Web site: http://www.cs.cmu.edu/~concert.
• Demonstration available after talk.
• Questions or comments?