Space-based programming
Presentation on space-based programming given at Skills Matter, 2 September 2009
Why should you care?
It helps us build applications that:
• can scale out to lots of machines easily
• can grow and shrink dynamically
• have massive throughput
• handle massive amounts of data
So what are spaces?
Data spaces are “network attached memory”, allowing us to read, put or take objects
The space takes care of redundancy, failover, transactions …
Alternatively, send a task to the object and let the object execute it.
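The read, put and take operations above can be sketched in a few lines. This is a minimal, single-process illustration in plain Java, not the API of any product: TinySpace and its method names are hypothetical, and a real space adds the networking, redundancy and transactions.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical in-memory stand-in for a data space.
// A real space is remote, partitioned and replicated.
class TinySpace<K, V> {
    private final Map<K, V> entries = new ConcurrentHashMap<>();

    // put: store an object in the space under an ID
    void put(K id, V value) {
        entries.put(id, value);
    }

    // read: return the object and leave it in the space (null if absent)
    V read(K id) {
        return entries.get(id);
    }

    // take: return the object and remove it from the space (null if absent)
    V take(K id) {
        return entries.remove(id);
    }
}
```

The difference between read and take is the important part: read leaves the object for others, take claims it.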
The idea has been around for a while, but somehow has not caught on…
However, it’s coming back with a bang!
Another language named after a Lovelace
David Gelernter invents Linda in the 1980s
• Distributed processing based on tuples
• Orthogonal process coordination
• Data coupling rather than process coupling
Sun Jini in the 90’s
Evolvable architectures, autodiscovery and lots of other flux capacitors nobody needed or knew how to use at the time…
Grid computing in 00's
Great for computations, but what about transaction processing?
Space-based systems will be key for cloud scalability
Products
• http://coherence.oracle.com
• http://www.gigaspaces.com
• http://virtuoso.openlinksw.com/
• http://www.almaden.ibm.com/cs/TSpaces/
• http://www.jboss.org/infinispan
Command pattern (GoF)
“is a design pattern in which an object is used to represent and encapsulate all the information needed to call a method at a later time”... (wikipedia)
You need a recipient, probably an entity by ID
A “command” with all the information required to run it
And an “invoker” to do the job
And the command gets executed...
So what does it have to do with spaces?
“...makes it easier to construct general components that need to delegate, sequence or execute method calls” (also wikipedia)
You can use many invokers
And do loads of work in parallel
And you can do something more productive with your time...
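The three roles above (recipient, command, invoker) can be sketched as follows. This is a plain-Java illustration under stated assumptions: Account, Deposit and Invokers are hypothetical names, and a thread pool stands in for "many invokers".

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// Recipient: an entity looked up by ID (hypothetical example type).
class Account {
    final String id;
    final AtomicLong balance = new AtomicLong();

    Account(String id) {
        this.id = id;
    }
}

// Command: carries all the information required to run it later.
class Deposit {
    final String recipientId;
    final long amount;

    Deposit(String recipientId, long amount) {
        this.recipientId = recipientId;
        this.amount = amount;
    }

    void execute(Account recipient) {
        recipient.balance.addAndGet(amount);
    }
}

// Invokers: a pool of workers that resolve the recipient and run the command.
class Invokers {
    // Assumes every command targets an ID that exists in recipients.
    // Returns true if all commands finished within the timeout.
    static boolean runAll(Map<String, Account> recipients, List<Deposit> commands, int workers) {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        for (Deposit command : commands) {
            pool.execute(() -> command.execute(recipients.get(command.recipientId)));
        }
        pool.shutdown();
        try {
            return pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

Because each command names its recipient and carries its own data, any worker can pick it up, which is what lets the work run in parallel.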
So what does that have to do with spaces?
Space is where recipients reside and where you send commands
Lots of different processors run in the space, but from the outside appear as a single “mind”
This scales really well and it is virtually indestructible....
Space: all your objects
Processing units (=partitions)
GigaSpaces data objects
[SpaceClass]
public class Message
{
    [SpaceID(AutoGenerate = true)]
    public String MessageId { get; set; }

    [SpaceRouting]
    public String MessageType { get; set; }
    ...
}
Space Data Properties
• [SpaceID] is unique for the class in the space
• [SpaceRouting] determines the partition (defaults to the space ID)
• Indexes speed up queries: [SpaceProperty(Index=SpaceIndexType.Basic)]
• [SpaceVersion] for optimistic locking
• [SpaceExclude] properties are not serialized
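How a routing value "determines the partition" can be illustrated with simple hash-based routing. This is a sketch of the general idea only: PartitionRouter is a hypothetical name, and hash-modulo routing is an assumption, not a statement of how GigaSpaces does it.

```java
// Hypothetical router: maps a routing value to one of N partitions.
class PartitionRouter {
    private final int partitionCount;

    PartitionRouter(int partitionCount) {
        if (partitionCount <= 0) {
            throw new IllegalArgumentException("partitionCount must be positive");
        }
        this.partitionCount = partitionCount;
    }

    // floorMod keeps the result in [0, partitionCount) even for negative hash codes.
    int partitionFor(Object routingValue) {
        return Math.floorMod(routingValue.hashCode(), partitionCount);
    }
}
```

Equal routing values always map to the same partition, so objects that share a routing value end up together.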
Recipient (Command context)
• Space object
• Space ID is the entity ID
• Routing ID is the same field
Commands
• Space object
• Space ID is a GUID (can be auto-generated)
• Target recipient ID is the Routing ID
Processing Units
• Worker thread pool
• Template matches the command: class matching, property matching (if not null)
• Works inside a PU container
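Template matching as described above (same class, then every non-null property must be equal) can be sketched like this. It is a plain-Java illustration: SketchMessage and TemplateMatcher are hypothetical names, and the real matching happens inside the space.

```java
// Hypothetical plain-Java counterpart of the Message data object.
class SketchMessage {
    final String messageId;
    final String messageType;

    SketchMessage(String messageId, String messageType) {
        this.messageId = messageId;
        this.messageType = messageType;
    }
}

class TemplateMatcher {
    // A null property in the template acts as a wildcard.
    private static boolean propertyMatches(Object templateValue, Object candidateValue) {
        return templateValue == null || templateValue.equals(candidateValue);
    }

    static boolean matches(SketchMessage template, Object candidate) {
        // Class matching: the candidate must be of the template's type.
        if (!(candidate instanceof SketchMessage)) {
            return false;
        }
        // Property matching: only non-null template properties are compared.
        SketchMessage message = (SketchMessage) candidate;
        return propertyMatches(template.messageId, message.messageId)
                && propertyMatches(template.messageType, message.messageType);
    }
}
```

A template with only the message type set would match every message of that type, whatever its ID.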
Example processor
[PollingEventDriven(MinConcurrentConsumers = 1,
                    MaxConcurrentConsumers = 4)]
internal class MessageProcessor
{
    [EventTemplate]
    public Message TemplateForThisProcessor { get { ... } }

    [DataEventHandler]
    public Message ProcessMessage(Message message)
    { .... }
}
Processes
• Contain one or more processing unit containers
• Own a space partition
• Run on the network, balanced, clustered, backed up
Coherence - distributed HashMaps
Works on POCO objects, but you can implement PortableObject for .NET/Java interop
void IPortableObject.ReadExternal(IPofReader reader)
{
    firstName = reader.ReadString(0);
    addrHome = (Address)reader.ReadObject(1);
    ....
}

void IPortableObject.WriteExternal(IPofWriter writer)
{
    writer.WriteString(0, firstName);
    writer.WriteObject(1, addrHome);
}
Works as a hashmap
INamedCache cache = CacheFactory.GetCache("my map");
cache.Add(key, value);
cache.Remove(key);
Also supports queries, notifications etc
Entry Processors – push code to objects
cache.Insert("BGD", new Temperature(25, 'c', 12));
IValueUpdater updater = new ReflectionUpdater("setDegree");
IEntryProcessor processor = new UpdaterProcessor(updater, 26);
object result = cache.Invoke("BGD", processor);
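The idea of pushing code to the object, instead of fetching it, changing it and writing it back, has a close in-process analogy in the JDK. The sketch below is that analogy only, not the Coherence API: LocalInvocableCache and its methods are hypothetical names.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.UnaryOperator;

// Hypothetical local stand-in for a cache that runs code where the entry lives.
class LocalInvocableCache<K, V> {
    private final ConcurrentHashMap<K, V> entries = new ConcurrentHashMap<>();

    void insert(K key, V value) {
        entries.put(key, value);
    }

    V get(K key) {
        return entries.get(key);
    }

    // Applies the processor atomically to the entry for key and returns the new value.
    // Returns null and changes nothing if the key is absent.
    // If the processor itself returns null, the entry is removed.
    V invoke(K key, UnaryOperator<V> processor) {
        return entries.computeIfPresent(key, (k, current) -> processor.apply(current));
    }
}
```

In a distributed cache the same shape means only the small processor travels over the network, and the update runs on the node that owns the entry.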
Key ideas to do it efficiently
• Forget about n-tier systems
• Group data together with all processes
• Ensure that invokers have all the information needed to run (so no unnecessary serialization)
• Ensure that the recipients are the correct aggregates for execution (so low contention during execution)
• Use asynchronous persistence
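Asynchronous persistence from the list above is often done as write-behind: update memory first, persist later in the background. A minimal plain-Java sketch, assuming a single writer thread and a caller-supplied persist function; WriteBehindStore is a hypothetical name, and products ship their own mechanisms for this.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.BiConsumer;

// Hypothetical write-behind store: callers update memory and return immediately,
// while a single background thread pushes the changes to slower storage.
class WriteBehindStore<K, V> {
    private final Map<K, V> memory = new ConcurrentHashMap<>();
    private final ExecutorService writer = Executors.newSingleThreadExecutor();
    private final BiConsumer<K, V> persist; // assumed to wrap e.g. a database write

    WriteBehindStore(BiConsumer<K, V> persist) {
        this.persist = persist;
    }

    void put(K key, V value) {
        memory.put(key, value);                            // visible to readers at once
        writer.execute(() -> persist.accept(key, value));  // persisted later, in submission order
    }

    V get(K key) {
        return memory.get(key);
    }

    // Stops accepting writes and waits for the queued ones to reach storage.
    boolean flushAndClose() {
        writer.shutdown();
        try {
            return writer.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

The trade-off is that writes still in the queue are lost if the process dies, which is why the space's own redundancy matters.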
That's it for now...
• http://gojko.net
• http://ukdotnet.ning.com
• October 1st, Mike Hadlow on MassTransit