

  • Languages and Compilers (SProg og Oversættere)

    Concurrency and distribution

    Bent Thomsen
    Department of Computer Science
    Aalborg University

    With acknowledgement to John Mitchell, whose slides this lecture is based on.

  • Concurrency, distributed computing, the Internet

    Traditional view:
    - Let the OS deal with this
    - => It is not a programming language issue!
    - End of lecture

    Wait a minute! Maybe the traditional view is getting out of date?

  • Languages with concurrency constructs

    Maybe the traditional view was always out of date?

    Simula, Modula-3, Occam, Concurrent Pascal, Ada, Linda, CML, Facile, JoCaml, Java, C#

  • Categories of Concurrency

    Physical concurrency: multiple independent processors (multiple threads of control)
    - Uni-processor with I/O channels (multi-programming)
    - Multiple CPUs (parallel programming)
    - Network of uni- or multi-CPU machines (distributed programming)

    Logical concurrency: the appearance of physical concurrency is presented by time-sharing one processor (software can be designed as if there were multiple threads of control)

    Concurrency as a programming abstraction

    Def: A thread of control in a program is the sequence of program points reached as control flows through the program

  • Introduction: Reasons to Study Concurrency

    1. It involves a different way of designing software that can be very useful; many real-world situations involve concurrency
       - Control programs
       - Simulations
       - Client/servers
       - Mobile computing
       - Games
    2. Computers capable of physical concurrency are now widely used
       - High-end servers
       - Game consoles
       - Grid computing

  • The promise of concurrency

    - Speed: if a task takes time t on one processor, shouldn't it take time t/n on n processors?
    - Availability: if one process is busy, another may be ready to help
    - Distribution: processors in different locations can collaborate to solve a problem or work together
    - Humans do it, so why can't computers? Vision and cognition appear to be highly parallel activities

  • Challenges

    - Concurrent programs are harder to get right
      - Folklore: need an order of magnitude speedup (or more) to be worth the effort
    - Some problems are inherently sequential
      - Theory: circuit evaluation is P-complete
      - Practice: many problems need coordination and communication among sub-problems
    - Specific issues
      - Communication: send or receive information
      - Synchronization: wait for another process to act
      - Atomicity: do not stop in the middle and leave a mess

  • Why is concurrent programming hard?

    Nondeterminism
    - Deterministic: two executions on the same input always produce the same output
    - Nondeterministic: two executions on the same input may produce different output

    Why does this cause difficulty?
    - There may be many possible executions of one system
    - Hard to think of all the possibilities
    - Hard to test the program, since some executions may occur infrequently

  • Traditional C Library for concurrency

    System calls
    - fork()
    - wait()
    - pipe()
    - write()
    - read()

    Examples

  • Process Creation: fork()

    NAME
        fork() - create a new process

    SYNOPSIS
        #include <sys/types.h>
        #include <unistd.h>

        pid_t fork(void);

    RETURN VALUE
        success: the parent receives the child's pid, the child receives 0
        failure: -1

  • fork() - program structure

        #include <sys/types.h>
        #include <unistd.h>
        #include <stdlib.h>

        int main(void)
        {
            pid_t pid;
            if ((pid = fork()) > 0) {
                /* parent */
            } else if (pid == 0) {
                /* child */
            } else {
                /* cannot fork */
            }
            exit(0);
        }

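  • wait() system call

    wait() - suspend the caller until one of its child processes finishes executing

    SYNOPSIS
        #include <sys/types.h>
        #include <sys/wait.h>

        pid_t wait(int *stat_loc);

    PARAMETER
        stat_loc points to an int that receives the child's termination status; pass 0 (NULL) if the status is not needed

    RETURN VALUE
        success: the pid of the child that terminated
        failure: -1, and errno is set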

  • wait() - program structure

        #include <sys/types.h>
        #include <sys/wait.h>
        #include <unistd.h>
        #include <stdlib.h>

        int main(int argc, char *argv[])
        {
            pid_t childPID;
            if ((childPID = fork()) == 0) {
                /* child */
            } else {
                /* parent */
                wait(0);
            }
            exit(0);
        }

  • pipe() system call

    pipe() - create a read-write pipe that may later be used to communicate with a process we'll fork off

    SYNOPSIS
        int pipe(int pfd[2]);

    PARAMETER
        pfd is an array of 2 integers that will be used to save the two file descriptors used to access the pipe

    RETURN VALUE
        0 on success; -1 on error

  • pipe() - structure

        /* first, define an array to store the two file descriptors */
        int pipes[2];

        /* now, create the pipe */
        int rc = pipe(pipes);
        if (rc == -1) {
            /* pipe() failed */
            perror("pipe");
            exit(1);
        }

    If the call to pipe() succeeds, a pipe is created: pipes[0] contains the number of its read file descriptor, and pipes[1] contains the number of its write file descriptor.

  • write() system call

    write() - write data to a file or other object identified by a file descriptor

    SYNOPSIS
        #include <unistd.h>

        ssize_t write(int fildes, const void *buf, size_t nbyte);

    PARAMETERS
        fildes is the file descriptor
        buf is the base address of the area of memory that data is copied from
        nbyte is the amount of data to copy

    RETURN VALUE
        The actual amount of data written, or -1 on error. If the value differs from nbyte, not all of the data was written and the caller must handle the remainder.

  • read() system call

    read() - read data from a file or other object identified by a file descriptor

    SYNOPSIS
        #include <unistd.h>

        ssize_t read(int fildes, void *buf, size_t nbyte);

    ARGUMENTS
        fildes is the file descriptor
        buf is the base address of the memory area into which the data is read
        nbyte is the maximum amount of data to read

    RETURN VALUE
        The actual amount of data read from the file (0 at end of file), or -1 on error. The file offset is advanced by the amount of data read.

  • Solaris 2 Synchronization

    - Implements a variety of locks to support multitasking, multithreading (including real-time threads), and multiprocessing.
    - Uses adaptive mutexes for efficiency when protecting data from short code segments.
    - Uses condition variables and readers-writers locks when longer sections of code need access to data.
    - Uses turnstiles to order the list of threads waiting to acquire either an adaptive mutex or a readers-writers lock.

  • Windows 2000 Synchronization

    - Uses interrupt masks to protect access to global resources on uniprocessor systems.
    - Uses spinlocks on multiprocessor systems.
    - Also provides dispatcher objects, which may act as either mutexes or semaphores.
    - Dispatcher objects may also provide events. An event acts much like a condition variable.

  • Basic question

    Maybe the library approach is not such a good idea?

    How can programming languages make concurrent and distributed programming easier?

  • Language support for concurrency

    - Helps promote good software engineering
    - Allows the programmer to express solutions closer to the problem domain
    - No need to juggle several programming models (hardware, OS, library, ...)
    - Makes invariants and intentions more apparent (part of the interface and/or type system)
    - Allows the compiler much more freedom to choose different implementations
    - Base the programming language constructs on a well-understood formal model => formal reasoning may be less hard and the use of tools may be possible

  • What could languages provide?

    Abstract model of system
    - abstract machine => abstract system

    Example high-level constructs
    - Communication abstractions
      - Synchronous communication
      - Buffered asynchronous channels that preserve message order
    - Mutual exclusion, atomicity primitives
      - Most concurrent languages provide some form of locking
      - Atomicity is more complicated, less commonly provided
    - Process as the value of an expression
      - Pass processes to functions
      - Create processes as the result of a function call

  • Basic issue: conflict between processes

    Critical section
    - Two processes may access a shared resource
    - Inconsistent behavior if two actions are interleaved
    - Allow only one process in the critical section

    Deadlock
    - A process may hold some locks while awaiting others
    - Deadlock occurs when no process can proceed

  • Concurrency

    Def: A task is disjoint if it does not communicate with or affect the execution of any other task in the program in any way.

    Task communication is necessary for synchronization. Task communication can be through:
    1. Shared nonlocal variables
    2. Parameters
    3. Message passing

  • Synchronization

    Kinds of synchronization:
    1. Cooperation: task A must wait for task B to complete some specific activity before task A can continue its execution, e.g., the producer-consumer problem
    2. Competition: two or more tasks must use some resource that cannot be used simultaneously, e.g., a shared counter

    Competition synchronization is usually provided by mutually exclusive access (approaches are discussed later)

  • Design Issues for Concurrency

    - How is cooperation synchronization provided?
    - How is competition synchronization provided?
    - How and when do tasks begin and end execution?
    - Are tasks statically or dynamically created?
    - Are there any syntactic constructs in the language?
    - Are concurrency constructs reflected in the type system?

  • Concurrent Pascal: cobegin/coend

    Limited concurrency primitive

    Example

        x := 0;
        cobegin
            begin x := 1; x := x+1 end;
            begin x := 2; x := x+1 end;
        coend;
        print(x);

    (Figure: the two sequential blocks execute in parallel. After x := 0, one branch runs x := 1; x := x+1 and the other runs x := 2; x := x+1; both finish before print(x).)

    Atomicity at level of assignment statement
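    As a rough illustration of why print(x) is nondeterministic here, the example can be sketched in Java. This is only an approximation, not Concurrent Pascal: the class name CobeginDemo is hypothetical, two threads stand in for the cobegin branches, and an AtomicInteger is assumed as the way to get atomicity at the level of each assignment.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical Java approximation of the cobegin/coend example.
class CobeginDemo {
    static int runOnce() {
        AtomicInteger x = new AtomicInteger(0);          // x := 0
        // Each set(...) and incrementAndGet() is atomic, like one assignment statement.
        Thread a = new Thread(() -> { x.set(1); x.incrementAndGet(); });
        Thread b = new Thread(() -> { x.set(2); x.incrementAndGet(); });
        a.start();                                       // cobegin
        b.start();
        try {
            a.join();                                    // coend: wait for both branches
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return x.get();                                  // print(x)
    }

    public static void main(String[] args) {
        // Depending on the interleaving, this prints 2, 3 or 4.
        System.out.println(runOnce());
    }
}
```

    For example, the order x := 1; x := 2; x := x+1; x := x+1 yields 4, while running the second block entirely before the first yields 2.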

  • Mutual exclusion

    Sample action

        procedure sign_up(person)
        begin
            number := number + 1;
            list[number] := person;
        end;

    Problem with parallel execution

        cobegin
            sign_up(fred);
            sign_up(bill);
        end;

    (Figure: the shared list, with entries bob and fred.)

  • Locks and Waiting

        cobegin
            begin
                sign_up(fred);   // critical section
            end;
            begin
                sign_up(bill);   // critical section
            end;
        end;

    Need atomic operations to implement wait
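    A minimal sketch of the same sign-up sheet in Java, assuming the object's built-in lock is used to protect the critical section. The class SignUpSheet and its helper methods are hypothetical names chosen for illustration; index 0 is left unused so that the indexing matches the pseudocode.

```java
// Hypothetical Java version of sign_up with the critical section protected by a lock.
class SignUpSheet {
    private int number = 0;
    private final String[] list = new String[8];

    // synchronized: at most one thread at a time runs the two updates below,
    // so no other thread can observe number incremented but list not yet written.
    synchronized void signUp(String person) {
        number = number + 1;
        list[number] = person;
    }

    synchronized int count() { return number; }

    synchronized String entry(int i) { return list[i]; }

    // Runs sign_up(fred) and sign_up(bill) in parallel and returns the sheet.
    static SignUpSheet demo() {
        SignUpSheet sheet = new SignUpSheet();
        Thread a = new Thread(() -> sheet.signUp("fred"));
        Thread b = new Thread(() -> sheet.signUp("bill"));
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return sheet;
    }

    public static void main(String[] args) {
        SignUpSheet s = demo();
        System.out.println(s.count() + ": " + s.entry(1) + ", " + s.entry(2));
    }
}
```

    Both names always end up on the sheet; only their order depends on which thread enters the critical section first.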

  • Mutual exclusion primitives

    Atomic test-and-set
    - Instruction atomically reads and writes some location
    - Common hardware instruction
    - Combine with busy-waiting loop to implement mutex

    Semaphore
    - Avoid busy-waiting loop
    - Keep queue of waiting processes
    - Scheduler has access to semaphore; process sleeps
    - Disable interrupts during semaphore operations
      - OK since operations are short
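    The test-and-set approach can be sketched as a spin lock. This is an illustration under assumptions, not how any particular system implements its mutexes: the class names SpinLock and SpinLockDemo are hypothetical, and AtomicBoolean.getAndSet stands in for the hardware test-and-set instruction.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical spin lock: test-and-set combined with a busy-waiting loop.
class SpinLock {
    private final AtomicBoolean held = new AtomicBoolean(false);

    void lock() {
        // getAndSet atomically reads the old value and writes true.
        // If the old value was already true, another thread holds the lock: keep spinning.
        while (held.getAndSet(true)) {
            Thread.onSpinWait();   // hint to the runtime that we are busy-waiting (Java 9+)
        }
    }

    void unlock() {
        held.set(false);
    }
}

class SpinLockDemo {
    private static int counter;

    // Several threads increment a shared counter under the spin lock.
    static int run(int threads, int perThread) {
        counter = 0;
        SpinLock lock = new SpinLock();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    lock.lock();
                    try {
                        counter++;          // critical section
                    } finally {
                        lock.unlock();
                    }
                }
            });
            workers[t].start();
        }
        try {
            for (Thread w : workers) w.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter;
    }

    public static void main(String[] args) {
        System.out.println(run(4, 10000));
    }
}
```

    A waiting thread burns CPU time while it spins, which is the cost a semaphore avoids by putting the process to sleep on a queue.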

  • Monitor (Brinch-Hansen, Dahl, Dijkstra, Hoare)

    Synchronized access to private data. Combines:
    - private data
    - set of procedures (methods)
    - synchronization policy

    At most one process may execute a monitor procedure at a time; this process is said to be in the monitor. If one process is in the monitor, any other process that calls a monitor procedure will be delayed.

    Modern terminology: synchronized object

  • Java Concurrency

    Threads
    - Create process by creating thread object

    Communication
    - shared variables
    - method calls

    Mutual exclusion and synchronization
    - Every object has a lock (inherited from class Object)
      - synchronized methods and blocks
    - Synchronization operations (inherited from class Object)
      - wait: pause current thread until another thread calls notify
      - notify: wake up a waiting thread (notifyAll wakes up all waiting threads)
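    A small sketch of cooperation synchronization with wait and notifyAll, assuming a one-element buffer shared by a producer and a consumer. The class name Slot and its methods are hypothetical.

```java
// Hypothetical one-element buffer built from synchronized, wait and notifyAll.
class Slot {
    private Integer value;                 // null means the slot is empty

    synchronized void put(int v) throws InterruptedException {
        while (value != null) {            // wait until the slot is empty
            wait();                        // releases the object's lock while waiting
        }
        value = v;
        notifyAll();                       // wake up threads waiting in take()
    }

    synchronized int take() throws InterruptedException {
        while (value == null) {            // wait until the slot is full
            wait();
        }
        int v = value;
        value = null;
        notifyAll();                       // wake up threads waiting in put()
        return v;
    }

    // A producer thread puts 1..5; the calling thread takes five values and sums them.
    static int demo() {
        Slot slot = new Slot();
        Thread producer = new Thread(() -> {
            try {
                for (int i = 1; i <= 5; i++) slot.put(i);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();
        try {
            int sum = 0;
            for (int i = 0; i < 5; i++) sum += slot.take();
            producer.join();
            return sum;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());        // prints 15
    }
}
```

    The condition is re-checked in a while loop because a woken thread only knows that something changed, not that its own condition now holds.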

  • Java Threads

    Thread
    - Set of instructions to be executed one at a time, in a specified order

    Java thread objects
    - Object of class Thread
    - Methods inherited from Thread:
      - start: method called to spawn a new thread of control; causes the VM to call the run method
      - suspend: freeze execution (deprecated in current Java)
      - interrupt: interrupt the thread; a thread blocked in sleep or wait receives an InterruptedException
      - stop: forcibly cause thread to halt (deprecated in current Java)

  • Example subclass of Thread

        class PrintMany extends Thread {
            private String msg;

            public PrintMany(String m) { msg = m; }

            public void run() {
                try {
                    for (;;) {
                        System.out.print(msg + " ");
                        sleep(10);
                    }
                } catch (InterruptedException e) {
                    return;
                }
            }
        }

    (inherits start from Thread)

  • Interaction between threads

    Shared variables
    - Two threads may assign/read the same variable
    - Programmer responsibility: avoid race conditions by explicit synchronization!

    Method calls
    - Two threads may call methods on the same object

    Synchronization primitives
    - Each object has an internal lock, inherited from Object
    - Synchronization primitives are based on object locking

  • Synchronization example

    - Objects may have synchronized methods
    - Can be used for mutual exclusion
      - Two threads may share an object.
      - If one calls a synchronized method, this locks the object.
      - If the other calls a synchronized method on the same object, this thread blocks until the object is unlocked.

  • Synchronized methods

    Marked by keyword

        public synchronized void commitTransaction() {}

    Provides mutual exclusion
    - At most one synchronized method can be active on an object at a time
    - Unsynchronized methods can still be called, so the programmer must be careful

    Not part of the method signature
    - A synchronized method is equivalent to an unsynchronized method whose body consists of a synchronized block
    - A subclass may replace a synchronized method with an unsynchronized method

  • Join, another form of synchronization

    Wait for thread to terminate

        class Future extends Thread {
            private int result;
            public void run() { result = f(); }
            public int getResult() { return result; }
        }

        Future t = new Future();
        t.start();             // start new thread
        t.join();              // wait for the thread to terminate
        x = t.getResult();     // and get the result

  • Aspects of Java Threads

    Portable since part of the language
    - Easier to use in basic libraries than C system calls
    - Example: the garbage collector is a separate thread

    General difficulty combining serial/concurrent code
    - Serial to concurrent: code for serial execution may not work in a concurrent system
    - Concurrent to serial: code with synchronization may be inefficient in serial programs (10-20% unnecessary overhead)

    Abstract memory model
    - Shared variables can be problematic on some implementations

  • C# Threads

    Basic thread operations
    - Any method can run in its own thread
    - A thread is created by creating a Thread object
    - Creating a thread does not start its concurrent execution; it must be requested through the Start method
    - A thread can be made to wait for another thread to finish with Join
    - A thread can be suspended with Sleep
    - A thread can be terminated with Abort

  • C# Threads

    Synchronizing threads
    - The Interlocked class
    - The lock statement
    - The Monitor class

    Evaluation
    - An advance over Java threads, e.g., any method can run in its own thread
    - Thread termination is cleaner than in Java
    - Synchronization is more sophisticated

  • Polyphonic C#

    An extension of the C# language with new concurrency constructs

    Based on the join calculus
    - A foundational process calculus like the π-calculus, but better suited to asynchronous, distributed systems

    A single model which works both for
    - local concurrency (multiple threads on a single machine)
    - distributed concurrency (asynchronous messaging over LAN or WAN)

    It is different
    - But it's also simple: if Mort can do any kind of concurrency, he can do this

  • In one slide:

    - Objects have both synchronous and asynchronous methods.
    - Values are passed by ordinary method calls:
      - If the method is synchronous, the caller blocks until the method returns some result (as usual).
      - If the method is async, the call completes at once and returns void.
    - A class defines a collection of chords (synchronization patterns), which define what happens once a particular set of methods have been invoked. One method may appear in several chords.
      - When pending method calls match a pattern, its body runs.
      - If there is no match, the invocations are queued up.
      - If there are several matches, an unspecified pattern is selected.
      - If a pattern containing only async methods fires, the body runs in a new thread.

  • Extending C# with chords

    Interesting well-formedness conditions:
    - At most one header can have a return type (i.e. be synchronous).
    - Inheritance restriction.
    - ref and out parameters cannot appear in async headers.

    Classes can declare methods using generalized chord-declarations instead of method-declarations.

        chord-declaration ::= method-header [ & method-header ]* body

        method-header ::= attributes modifiers [return-type | async] name (parms)

  • A Simple Buffer

        class Buffer {
            String get() & async put(String s) {
                return s;
            }
        }

    - Calls to put() return immediately (but are internally queued if there's no waiting get()).
    - Calls to get() block until/unless there's a matching put().
    - When there's a match the body runs, returning the argument of the put() to the caller of get().
    - Exactly which pairs of calls are matched up is unspecified.
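    For readers without a Polyphonic C# compiler, the observable behaviour of this buffer can be approximated in Java with an unbounded blocking queue. This is a sketch of the behaviour only, not of how chords are compiled; the class name ChordBuffer is hypothetical, and the queue happens to pair calls in FIFO order, which is just one of the matchings the chord semantics allows.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical Java approximation of the Polyphonic C# Buffer class.
class ChordBuffer {
    // Pending put() calls that have not yet been matched with a get().
    private final BlockingQueue<String> pending = new LinkedBlockingQueue<>();

    // Plays the role of "async put": queues the argument and returns at once.
    void put(String s) {
        pending.add(s);
    }

    // Plays the role of the synchronous get(): blocks until a put() is available.
    String get() {
        try {
            return pending.take();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    // get() is called on this thread while another thread supplies the put().
    static String demo() {
        ChordBuffer buffer = new ChordBuffer();
        new Thread(() -> buffer.put("hello")).start();
        return buffer.get();
    }

    public static void main(String[] args) {
        System.out.println(demo());   // prints hello
    }
}
```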

  • OCCAM

    - Program consists of processes and channels
    - Process is code containing channel operations
    - Channel is a data object
    - All synchronization is via channels
    - Formal foundation based on CSP

  • Channel Operations in OCCAM

    Read data item D from channel C

        C ? D

    Write data item Q to channel C

        C ! Q

    - If the reader accesses the channel first, it waits for the writer, and then both proceed after the transfer.
    - If the writer accesses the channel first, it waits for the reader, and both proceed after the transfer.
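    The rendezvous behaviour of such a channel can be imitated in Java with a SynchronousQueue, which has no capacity: put waits for a matching take and vice versa. This is an analogy rather than OCCAM itself, and the class name RendezvousDemo is hypothetical.

```java
import java.util.concurrent.SynchronousQueue;

// Hypothetical Java analogy for an OCCAM channel: a queue with no buffer space.
class RendezvousDemo {
    static int demo() {
        SynchronousQueue<Integer> c = new SynchronousQueue<>();

        // Writer process: roughly "C ! 42". put() blocks until a reader arrives.
        Thread writer = new Thread(() -> {
            try {
                c.put(42);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        writer.start();

        try {
            int d = c.take();      // Reader: roughly "C ? D". Blocks until a writer arrives.
            writer.join();
            return d;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());   // prints 42
    }
}
```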

  • Concurrent ML

    - Threads: new type of entity
    - Communication: synchronous channels
    - Synchronization: channels, events
    - Atomicity: no specific language support

  • Threads

    Thread creation

        spawn : (unit -> unit) -> thread_id

    Example code

        CIO.print "begin parent\n";
        spawn (fn () => (CIO.print "child 1\n"));
        spawn (fn () => (CIO.print "child 2\n"));
        CIO.print "end parent\n"

    Result: "begin parent" is printed first; "child 1", "child 2" and "end parent" may then appear in any order.

  • Channels

    Channel creation

        channel : unit -> 'a chan

    Communication

        recv : 'a chan -> 'a
        send : ('a chan * 'a) -> unit

    Example

        val ch = channel();
        spawn (fn () => (send(ch, 0)));
        spawn (fn () => (recv ch));

    Result: the send in one thread and the recv in the other synchronize (send/recv).

  • CML programming

    Functions
    - Can write functions : channels -> threads
    - Build a concurrent system by declaring channels and wiring together sets of threads

    Events
    - Delayed action that can be used for synchronization
    - Powerful concept for concurrent programming

    Sample application
    - eXene: concurrent uniprocessor window system

  • A CML implementation (simplified)

    Use queues with side-effecting functions

        datatype 'a queue = Q of {front: 'a list ref, rear: 'a list ref}
        fun queueIns (Q(...)) = (* insert into queue *)
        fun queueRem (Q(...)) = (* remove from queue *)

    And continuations

        val enqueue = queueIns rdyQ
        fun dispatch () = throw (queueRem rdyQ) ()
        fun spawn f = callcc (fn parent_k =>
                        (enqueue parent_k; f (); dispatch()))

    Source: Appel, Reppy

  • Language issues in client/server programming

    - Communication mechanisms: RPC, remote objects, SOAP
    - Data representation languages: XDR, ASN.1, XML
    - Parsing and deparsing between internal and external representation
    - Stub generation

  • Client/server example

    (Figure: the basic organization of the X Window System.)

    A major task of most clients is to interact with a human user and a remote server.

  • Client-Side Software for Distribution Transparency

    (Figure: a possible approach to transparent replication of a remote object using a client-side solution.)

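  • The Stub Generation Process

    (Figure: the interface specification is fed to the stub generator, which produces the client stub, the server stub and a common header. The client source, client stub and RPC library go through the compiler/linker to form the client program; the server source, server stub and RPC library go through the compiler/linker to form the server program.)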

  • RPC and the OSI Reference Model

  • Representation

    Data must be represented in a meaningful format.

    Methods:
    - Sender or receiver makes right (NDR)
      - Network Data Representation (NDR)
      - Transmit architecture tag with data
    - Represent data in a canonical (or standard) form
      - XDR
      - ASN.1

    Note that these are languages, but traditional DS programmers don't like programming languages, except C

  • XDR - eXternal Data Representation

    - XDR is a widely used standard from Sun Microsystems for representing data in a network canonical (standard) form.
    - A set of conversion functions is used to encode and decode data; for example, xdr_int() is used to encode and decode integers.
    - Conversion functions exist for all standard data types (integers, chars, arrays, ...)
    - For complex structures, RPCGEN can be used to generate conversion routines.

  • RPC Example

  • XDR Example

        #include <rpc/xdr.h>
        ..
        XDR sptr;            // XDR stream
        XDR *xdrs;           // pointer to the XDR stream
        char buf[BUFSIZE];   // buffer to hold XDR data
        xdrs = (&sptr);
        xdrmem_create(xdrs, buf, BUFSIZE, XDR_ENCODE);
        ..
        int i = 256;
        xdr_int(xdrs, &i);
        printf("position = %u. \n", xdr_getpos(xdrs));
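    To make the canonical form concrete: XDR encodes an integer as four bytes, most significant byte first. The sketch below reproduces that layout in Java for the value 256 used above. It does not call any XDR library; the class name XdrIntDemo is hypothetical, and DataOutputStream.writeInt is assumed here only because it produces the same big-endian four-byte layout.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical illustration of the XDR wire format for a 32-bit integer.
class XdrIntDemo {
    static byte[] encodeInt(int i) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeInt(i);   // four bytes, most significant byte first
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }

    public static void main(String[] args) {
        byte[] bytes = encodeInt(256);
        StringBuilder hex = new StringBuilder();
        for (byte b : bytes) hex.append(String.format("%02x ", b));
        System.out.println(hex.toString().trim());   // prints 00 00 01 00
    }
}
```

    Because every sender uses this one layout, a receiver on a little-endian machine converts on decode instead of negotiating a format with each peer.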

  • Abstract Syntax Notation 1 (ASN.1)

    ASN.1 is a formal language that has two features:
    - a notation used in documents that humans read
    - a compact encoded representation of the same information used in communication protocols

    ASN.1 uses a tagged message format: < tag (data type), data length, data value >

    Simple Network Management Protocol (SNMP) messages are encoded using ASN.1.
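    The tag, length, value layout can be illustrated by encoding a single INTEGER under the Basic Encoding Rules (BER), where the universal tag for INTEGER is 0x02 and the value is a minimal two's-complement byte string. The sketch below handles only this one type with short-form lengths; the class name BerIntegerDemo is hypothetical.

```java
import java.math.BigInteger;

// Hypothetical encoder for one ASN.1 INTEGER in tag-length-value form (BER).
class BerIntegerDemo {
    static byte[] encodeInteger(long v) {
        // Minimal two's-complement representation, most significant byte first.
        byte[] value = BigInteger.valueOf(v).toByteArray();
        byte[] tlv = new byte[2 + value.length];
        tlv[0] = 0x02;                      // tag: universal INTEGER
        tlv[1] = (byte) value.length;       // length: short form (a long needs at most 8 bytes)
        System.arraycopy(value, 0, tlv, 2, value.length);
        return tlv;
    }

    public static void main(String[] args) {
        StringBuilder hex = new StringBuilder();
        for (byte b : encodeInteger(256)) hex.append(String.format("%02x ", b));
        System.out.println(hex.toString().trim());   // prints 02 02 01 00
    }
}
```

    Unlike XDR, the receiver can tell from the tag and length what it is looking at without knowing the message layout in advance.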

  • Distributed Objects

    - CORBA
    - Java RMI
    - SOAP and XML

  • Distributed Objects

    (Figure: proxy and skeleton in Remote Method Invocation.)

  • CORBA

    Common Object Request Broker Architecture
    - An industry standard developed by OMG to help in distributed programming
    - A specification for creating and using distributed objects
    - A tool for enabling multi-language, multi-platform communication
    - A CORBA-based system is a collection of objects that isolates the requestors of services (clients) from the providers of services (servers) by an encapsulating interface

  • CORBA objects

    They are different from typical programming objects in three ways:
    - CORBA objects can run on any platform
    - CORBA objects can be located anywhere on the network
    - CORBA objects can be written in any language that has an IDL mapping

  • (Figure: a request from a client to an object implementation within a network. Clients and object implementations each reach their ORB through IDL, and the ORBs communicate over the network.)

  • IDL (Interface Definition Language)

    - CORBA objects have to be specified with interfaces (as with RMI) defined in a special definition language, IDL.
    - The IDL defines the types of objects by defining their interfaces; it describes interfaces only, not implementations.
    - From IDL definitions an object implementation tells its clients what operations are available and how they should be invoked.
    - Some programming languages have an IDL mapping (C, C++, Smalltalk, Java, Lisp)

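  • (Figure: the IDL file is processed by the IDL compiler, which produces the client stub file and the server skeleton file. The client implementation uses the stub, the object implementation uses the skeleton, and both communicate through the ORB.)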

  • The IDL compiler

    - It accepts as input an IDL file written using any text editor (fileName.idl)
    - It generates the stub and the skeleton code in the target programming language (e.g. a Java stub and a C++ skeleton)
    - The stub is given to the client as a tool to describe the server functionality; the skeleton file is implemented at the server.

  • IDL Example

        module katytrail {
          module weather {
            struct WeatherData {
              float temp;
              string wind_direction_and_speed;
              float rain_expected;
              float humidity;
            };
            typedef sequence<WeatherData> WeatherDataSeq;

            interface WeatherInfo {
              WeatherData get_weather(in string site);
              WeatherDataSeq find_by_temp(in float temperature);
            };

  • IDL Example Cont.

            interface WeatherCenter {
              void register_weather_for_site(in string site, in WeatherData site_data);
            };
          };
        };

    Both interfaces will have Object Implementations. A different type of Client will talk to each of the interfaces.

    The Object Implementations can be done in one of two ways: through Inheritance or through a Tie.

  • Stubs and Skeletons

    - In terms of CORBA development, the stub and skeleton files are standard in terms of their target language.
    - Each file exposes the same operations specified in the IDL file.
    - Invoking an operation on the stub file will cause the method to be executed in the skeleton file.
    - The stub file allows the client to manipulate the remote object with the same ease with which a local object is manipulated.

  • Java RMI

    Overview
    - Supports remote invocation of Java objects
    - Key: Java Object Serialization (stream objects over the wire)
    - Language specific

    History
    - Goal: RPC for Java
    - First release in JDK 1.0.2, used in Netscape 3.01
    - Full support in JDK 1.1, intended for applets
    - JDK 1.2 added persistent references, custom protocols, more support for user control

  • Java RMI

    Advantages
    - True object-orientation: objects as arguments and values
    - Mobile behavior: returned objects can execute on caller
    - Integrated security
    - Built-in concurrency (through Java threads)

    Disadvantages
    - Java only
      - Advertises support for non-Java
      - But this is external to RMI, which requires Java on both sides

  • Java RMI Components

    - Base RMI classes: extend these to get RMI functionality
    - Java compiler (javac): recognizes RMI as integral part of language
    - Interface compiler (rmic): generates stubs from class files
    - RMI Registry (rmiregistry): directory service
    - RMI run-time activation system (rmid): supports activatable objects that run only on demand

  • RMI Implementation

    (Figure: the stub on the client host and the skeleton on the server host.)

  • Java RMI Object Serialization

    Java can send an object to be invoked at a remote site
    - Allows objects as arguments/results

    Mechanism: Object Serialization
    - The object passed must implement java.io.Serializable
    - Provides methods to translate the object to/from a byte stream

    Security issues:
    - Ensure the object is not tampered with during transmission
    - Solution: class-specific serialization (throw it on the programmer)

  • Building a Java RMI Application

    1. Define remote interface
       - Extend java.rmi.Remote
    2. Create server code
       - Implements interface
       - Creates security manager, registers with registry
    3. Create client code
       - Define object as instance of interface
       - Look up object in registry
       - Call object
    4. Compile and run
       - Run rmic on compiled classes to create stubs
       - Start registry
       - Run server, then client

  • Parameter Passing

    - Primitive types: call-by-value
    - Remote objects: call-by-reference
    - Non-remote objects: call-by-value, using Java Object Serialization

  • Java Serialization

    - Writes an object as a sequence of bytes
    - Writes it to a stream
    - Recreates it on the other end
    - Creates a brand new object with the old data
    - Objects can be transmitted using any byte stream (including sockets and TCP)
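    A minimal round trip through Java Object Serialization, using an in-memory byte array in place of a network connection. The class names SerializationDemo and WeatherData are hypothetical (the latter loosely modelled on the IDL struct shown earlier in the lecture).

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical demonstration: serialize an object to bytes and recreate it.
class SerializationDemo {
    static class WeatherData implements Serializable {
        private static final long serialVersionUID = 1L;
        final String site;
        final float temp;

        WeatherData(String site, float temp) {
            this.site = site;
            this.temp = temp;
        }
    }

    static WeatherData roundTrip(WeatherData original) {
        try {
            // Write the object as a sequence of bytes.
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            // Recreate it "on the other end" from those bytes.
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (WeatherData) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        WeatherData original = new WeatherData("Aalborg", 12.5f);
        WeatherData copy = roundTrip(original);
        // A brand new object with the old data:
        System.out.println(copy != original);                       // prints true
        System.out.println(copy.site + " " + copy.temp);            // prints Aalborg 12.5
    }
}
```

    This is why non-remote objects are effectively passed by value in RMI: the receiver works on the copy, and changes to it are not seen by the sender.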

  • Codebase Property

    - Stub classpaths can be confusing
      - 3 VMs, each with its own classpath
      - Server vs. registry vs. client
    - The RMI class loader always loads stubs from the CLASSPATH first
    - Next, it tries downloading classes from a web server (but only if a security manager is in force)
    - java.rmi.server.codebase specifies which web server

  • CORBA vs. RMI

    - CORBA was designed for language independence, whereas RMI was designed for a single language where objects run in a homogeneous environment
    - CORBA interfaces are defined in IDL, while RMI interfaces are defined in Java
    - CORBA objects are not garbage collected: they are language independent and have to be consistent with languages that do not support garbage collection. RMI objects, on the other hand, are garbage collected automatically

  • SOAP Introduction

    - SOAP is a simple, lightweight, text-based protocol
    - SOAP is an XML-based protocol (XML encoding)
    - SOAP is a remote procedure call protocol, not completely object oriented
    - SOAP can be carried over any transport protocol
    - SOAP is a simple lightweight protocol with a minimum set of rules for invoking remote services using XML data representation and HTTP as the wire protocol
    - Main goal of the SOAP protocol: interoperability

    SOAP does not specify any advanced distributed services.

  • Why SOAP?

    What's wrong with existing distributed technologies?
    - Platform and vendor dependent solutions (DCOM: Windows) (CORBA: ORB vendors) (RMI: Java)
    - Different data representation schemes (CDR, NDR)
    - Complex client-side deployment
    - Difficulties with firewalls: firewalls allow only specific ports (port 80), but DCOM and CORBA assign port numbers dynamically

    In short, these distributed technologies do not communicate easily with each other because of the lack of standards between them.

  • Base Technologies: HTTP and XML

    - SOAP uses existing technologies and invents no new technology.
    - XML and HTTP are accepted and deployed on all platforms.

    Hypertext Transfer Protocol (HTTP)
    - HTTP is a very simple, text-based protocol.
    - HTTP layers request/response communication over TCP/IP.
    - HTTP supports a fixed set of methods like GET and POST.
    - Client/server interaction:
      1. Client requests to open a connection to the server on the default port number
      2. Server accepts the connection
      3. Client sends a request message to the server
      4. Server processes the request
      5. Server sends a reply message to the client
      6. Connection is closed
    - HTTP servers are scalable, reliable and easy to administer.

    SOAP can be bound to any protocol: HTTP, SMTP, FTP

  • Extensible Markup Language (XML)

    - XML is a platform-neutral data representation protocol.
    - HTML combines data and representation, but XML contains just structured data.
    - XML contains no fixed set of tags, and users can build their own customized tags.
    - XML is platform and language independent.
    - XML is text-based and easy to handle, and it can be easily extended.

  • Architecture diagram

  • Parsing XML Documents

    Remember: XML is just text

    Simple API for XML (SAX) parsing
    - SAX is typically most efficient
    - No in-memory representation is built; that is left to the developer

    Document Object Model (DOM) parsing
    - Parsing is not the fundamental emphasis
    - A DOM object is a representation of the XML document in a tree format

  • Parsing: Examples

    SaxParseExample
    - Callback functions to process nodes

    DomParseExample
    - Use of JAXP (Java API for XML Parsing)
    - Implementations can be swapped, such as replacing Apache Xerces with Sun Crimson
    - JAXP does not include some advanced features that may be useful
    - SAX is used behind the scenes to create the object model

  • Web-based applications today

    - Presentation: HTML, CSS, JavaScript, Flash, Java applets, ActiveX controls
    - Business logic: C#, Java, VB, PHP, Perl, Python, Ruby
    - Database: SQL
    - File system
    - Application server
    - Web server
    - Content management system
    - Operating system
    - Sockets, HTTP, email, SMS, XML, SOAP, REST, Rails, reliable messaging, AJAX
    - Replication, distribution, load-balancing, security, concurrency
    - Beans, servlets, CGI, ASP.NET

  • Languages for distributed computing

    Motivation: why all the fuss about language and platform independence?
    - It is extremely inefficient to parse/deparse to/from external/internal representation
    - 95% of all computers run Windows anyway
    - There is a JVM for almost any processor you can think of
    - Few programmers master more than one programming language anyway

    Develop a coherent programming model for all aspects of an application

  • Facile Programming Language

    Integration of multiple paradigms
    - Functions
    - Types/complex data types
    - Concurrency
    - Distribution/soft real-time
    - Dynamic connectivity

    Implemented as an extension to SML

    Syntax for concurrency similar to CML

  • Facile implementation

    - Pre-emptive scheduler implemented at the lowest level
      - Exploiting CPS translation => state characterised by the set of registers
    - Garbage collector used for linearizing data structures
    - Lambda-level code used as intermediate language when shipping data (including code) in heterogeneous networks
    - Native representation is shipped when possible
      - i.e. same architecture and within the same trust domain
    - Possibility to mix between interpretation and JIT depending on usage

  • Conclusion

    - Concurrency may be an order of magnitude more difficult to handle
    - Programming language support for concurrency may help make the task easier
    - Which concurrency constructs to add to the language is still a very active research area
    - If you add concurrency constructs, be sure you base them on a formal model!

  • The guiding principle

    - Provide a better level of abstraction
    - Make invariants and intentions more apparent (part of the interface)
    - Give stronger compile-time guarantees (types)
    - Enable different implementations and optimizations
    - Expose structure for other tools to exploit (e.g. static analysis)
    - Put important features in the language itself, rather than in libraries

    One important thing to remember: XML is simple text, delimited by a tag language that follows standard rules. In the early days of developing the SAX API (a community effort managed from the XML-DEV mailing list), the commonly stated goal was to create an API that could be implemented by a Desperate Perl Hacker in a weekend.

    While the goal likely fell far short of that (ask the developers of Xerces Perl!), simple text processing forms the basis of all XML parsing and handling techniques.

    (As an aside, one of the advantages of using XML over other data formats, especially formats that are designed to be system-portable, is the elimination of the newline problem. Anyone who has dealt with files on a UNIX system created by a Windows text editor knows that Windows text processing uses the carriage return, line feed sequence \r\n, while UNIX (and now, Macintosh) systems use the simple newline character \n. This has led to hours upon hours of wasted developer time.)

    SAX is typically the most efficient method of processing XML, in terms of machine and memory utilization. The advantage gained here is largely due to the fact that in-memory data representation is left to developers, and can be tuned to whatever the task is at hand. SAX parsing is likely the best choice for large datasets, and applications where memory is a limited asset. (Such as mobile devices!)

    Example: SaxParseExample
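    The SaxParseExample code itself is not part of this transcript. As a stand-in, here is a small sketch of callback-style SAX parsing with the JAXP classes that ship with the JDK; the class name SaxCountDemo and the sample document are hypothetical. The handler keeps only a counter, which illustrates how the in-memory representation is left to the developer.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical SAX example: count elements with a given name using callbacks.
class SaxCountDemo {
    static int countElements(String xml, String name) {
        final int[] count = {0};
        DefaultHandler handler = new DefaultHandler() {
            // Called by the parser each time it reads a start tag.
            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                if (qName.equals(name)) {
                    count[0]++;
                }
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), handler);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return count[0];
    }

    public static void main(String[] args) {
        String xml = "<sites><site/><site/><other/></sites>";
        System.out.println(countElements(xml, "site"));   // prints 2
    }
}
```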

    DOM parsing methods are simpler to implement, as the parsing steps are left to the implementation. In DOM processing, parsing is not the fundamental emphasis; rather, the parser creates a memory representation of a document that generically represents an XML document. DOM methods are best suited to typical XML development; relatively small datasets where parsing performance is not the largest hot spot in the program. (A good example: using XML documents as data formats, or for servlet configuration documents.)
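    For comparison, a DOM sketch of the configuration-document case, again using the JDK's JAXP classes. The class name DomReadDemo and the sample document are hypothetical. Here the parser builds the whole tree first, and the program then queries it.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

// Hypothetical DOM example: read one setting out of a small configuration document.
class DomReadDemo {
    static String firstText(String xml, String tag) {
        try {
            // The parser builds an in-memory tree for the whole document.
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList nodes = doc.getElementsByTagName(tag);
            return nodes.getLength() == 0 ? null : nodes.item(0).getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<config><port>8080</port><host>localhost</host></config>";
        System.out.println(firstText(xml, "port"));   // prints 8080
    }
}
```

    No handler code is needed, at the cost of holding the entire document in memory.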