490dp Part I: Challenges Intermezzo: Applications Part II: Java Object Serialization Robert Grimm

Post on 19-Dec-2015


490dp

Part I: Challenges

Intermezzo: Applications

Part II: Java Object Serialization

Robert Grimm

Challenges

• Pervasive computing
  – Vision: Focus on users and their tasks
  – Enabled by ubiquitous smart devices

• Central question
  – How can devices get users’ tasks done?

• They need to work together!

Distributed State

“Information retained in one place that describes something, or is determined by something, somewhere else in the system”

• Examples
  – Association between addresses and names
  – Sequence number to identify most recent data
  – File block cached in memory of client
  – List of clients caching a file

Why is Distributed State Good?

• Performance
  – Not going over the network saves time
  – Example: Local cache of files

• Coherency
  – Easier to coordinate based on knowledge
  – Example: Server notification when cache expires

• Reliability
  – Replication makes it possible to tolerate failures
  – Example: Same files stored on two servers

Why is Distributed State Bad?

• Consistency
• Crash sensitivity
• Time and space overheads
• Complexity

Consistency

• Problem: Keep copies consistent
• Approaches
  – Detect stale data on use
    • Treat copy as hint
    • Example: name-to-address map
  – Prevent inconsistency
    • Require exclusive ownership before modifying
    • Example: all operations go through one node
  – Tolerate inconsistency
    • Make window of inconsistency small
    • Example: delays in network games

Crash Sensitivity

• Problem: Mask failures
• Approaches
  – Reconstruct state
    • Example: reopening files in Sprite FS
  – Limit degree of distribution / affected state
    • Example: partition files according to usage
  – Fully replicate state
    • Example: Coda file system

Time and Space Overheads

• Time
  – Go across the network

• Space
  – Distributed copies
  – Tracking distributed copies

• Overheads depend on
  – Degree of sharing
  – Degree of modification

Complexity

• Distributed state requires
  – Maintaining consistency
  – Masking of failures

• Distributed state makes it harder to
  – Debug
  – Tune

Trade-offs

• Consistency
• Availability
• Scalability
• Complexity

No Perfect Solution

• Solution needs to be “good enough”

                NFS        Sprite FS
Consistency     Limited
Availability    Limited    Limited
Scalability     Limited    Limited    (Baseline OK)
Complexity      Low        High

What Does “Good Enough” Mean?

• Depends on application domain
  – Make an informed trade-off

• Examples
  – Cluster-based services
    • Porcupine
    • Distributed Data Structures
  – Disconnected storage services
    • Epidemic replication
    • Two-tier replication

Porcupine

• Cluster-based email server
• Assumptions
  – Email typically doesn’t get modified
  – Deleted emails may reappear (temporarily)
• Eventual consistency
• But, availability and scalability

[Saito et al. 99]

Distributed Data Structures

• Cluster-based hash table
• Assumptions
  – Network is fast and doesn’t partition
  – Nodes fail infrequently
  – OK to return failure at storage layer
• Consistency, availability, and scalability

[Gribble et al. 00]

Bayou

• Epidemic replication [Demers et al. 87]

  – Two nodes periodically synchronize state
  – Only pair-wise connectivity
• Structured storage (database)
• Eventually consistent
• But, always available

[Petersen et al. 97]

Coda

• First-tier nodes

– Fully connected

– Store all data

• Second-tier nodes

– Often disconnected

– Store subset of data

• Limited consistency, but greater availability [Kistler & Satya 92, Mummert et al. 95]

Conflicts

• Caused by competing updates
• Detected “after the fact”
• Need to be resolved automatically

Conflict Resolution Techniques

• Based on data
  – Timestamps
  – Heuristics
  – Programs [Kumar & Satya 95, Reiher et al. 94]

• Part of update: Bayou [Terry et al. 95]
  – Dependency check
  – Merge procedure
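A minimal sketch of the "part of update" idea, assuming a replica can be modeled as a simple map of time slots to names. The `Write` class, the `reserve` helper, and the room-booking scenario are hypothetical names chosen for illustration; they are not Bayou's actual interfaces. Each write carries its own dependency check and merge procedure, so every replica resolves a conflict the same way.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Consumer;
import java.util.function.Predicate;

class BayouStyleWriteDemo {
    /** Hypothetical write record: the update travels with its own conflict handling. */
    static final class Write {
        private final Predicate<Map<String, String>> dependencyCheck;
        private final Consumer<Map<String, String>> update;
        private final Consumer<Map<String, String>> mergeProcedure;

        Write(Predicate<Map<String, String>> dependencyCheck,
              Consumer<Map<String, String>> update,
              Consumer<Map<String, String>> mergeProcedure) {
            this.dependencyCheck = dependencyCheck;
            this.update = update;
            this.mergeProcedure = mergeProcedure;
        }

        /** Applies the update if the dependency check holds, otherwise runs the merge procedure. */
        void applyTo(Map<String, String> replica) {
            if (dependencyCheck.test(replica)) {
                update.accept(replica);
            } else {
                mergeProcedure.accept(replica);
            }
        }
    }

    /** Illustrative write: book the 10:00 slot, or fall back to 11:00 on conflict. */
    static Write reserve(String who) {
        return new Write(
            slots -> !slots.containsKey("10:00"),      // dependency check
            slots -> slots.put("10:00", who),          // intended update
            slots -> slots.putIfAbsent("11:00", who)); // merge procedure
    }

    public static void main(String[] args) {
        Map<String, String> replica = new TreeMap<>();
        reserve("alice").applyTo(replica);
        reserve("bob").applyTo(replica);   // conflicts with alice, so the merge procedure runs
        System.out.println(replica);       // prints {10:00=alice, 11:00=bob}
    }
}
```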

Morals

• No perfect solution
  – Need to exploit application domain

• Complexity grows very quickly
  – Beware of special case code (recovery)

Intermezzo: Applications

• Team 1: Cluster-based application
  – Scalable Napster / Gnutella repository
  – Scalable document repository
    • Leased storage
    • Customizable actions when leases expire

• Team 2: Roving application
  – Personal jukebox
  – PIM on steroids
  – Universal inbox

Break

Java Object Serialization

• Problem
  – Turn graph of objects into byte string
  – Turn byte string back into graph of objects

[Figure: two copies of an object graph in which A references B and C, and both B and C reference D]

The Basic Idea

• Write a description of each object

• Keep track of each written object

• <1:A <2:B <3:D>> <4:C ref(3)>>

[Figure: the object graph A, B, C, D next to a copy in which D appears twice]
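The effect of tracking written objects can be checked with a short round trip. This is a sketch only: `Node`, `roundTrip`, and the field names are hypothetical, and the stream classes are the standard `java.io` ones. Because D is written once and referenced afterwards, the restored graph still contains a single shared D.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class SharedReferenceDemo {
    /** Hypothetical graph node with up to two outgoing references. */
    static class Node implements Serializable {
        private static final long serialVersionUID = 1L;
        final String name;
        Node first;
        Node second;
        Node(String name) { this.name = name; }
    }

    /** Turns the graph into a byte string and back into a graph. */
    static Node roundTrip(Node root) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(root);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (Node) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Node d = new Node("D");
        Node b = new Node("B"); b.first = d;
        Node c = new Node("C"); c.first = d;
        Node a = new Node("A"); a.first = b; a.second = c;

        Node copy = roundTrip(a);
        System.out.println(copy != a);                             // prints true: a new graph
        System.out.println(copy.first.first == copy.second.first); // prints true: D is still shared
    }
}
```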

All Things Serializable

• Not everything is serializable
  – java.lang.Object
  – java.lang.Thread

• Serializable objects implement java.io.Serializable
  – An empty marker interface

Default Serialization

• Writes out all non-static, non-transient fields
  – Independent of their access controls (private, package private, protected, public)

• Good style to document invariants
  – Use @serial tag
      @serial Must not be <code>null</code>

Default Deserialization

• Allocates memory for new object
  – No constructor of the serializable class invoked
  – Fields initialized to their default values

• Reads in all serialized fields
  – Independent of their access controls
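A small sketch that makes the missing constructor call visible. `Counter`, `constructorCalls`, and `roundTrip` are hypothetical names for this illustration. The static counter is not part of the serialized state, so it only changes when a constructor of `Counter` actually runs. The no-argument constructor of the nearest non-serializable superclass (here `Object`) does still run.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class ConstructorSkippedDemo {
    static class Counter implements Serializable {
        private static final long serialVersionUID = 1L;
        static int constructorCalls = 0;   // static: never serialized
        private final int value;           // private, yet written and read by default

        Counter(int value) {
            constructorCalls++;
            this.value = value;
        }

        int value() { return value; }
    }

    @SuppressWarnings("unchecked")
    static <T extends Serializable> T roundTrip(T obj)
            throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (T) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Counter original = new Counter(42);
        Counter copy = roundTrip(original);
        System.out.println(copy.value());             // prints 42
        System.out.println(Counter.constructorCalls); // prints 1: only the explicit "new"
    }
}
```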

Transient Fields

• Some fields shouldn’t or can’t be serialized
    private Object lock;

• How to prevent default serialization from trying to write them out?
  – Declare such fields as “transient”
      private transient Object lock;
  – Restored to defaults during deserialization
    • null in above example
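A sketch of the transient rule in action, with a hypothetical `Account` class. The field initializer for `lock` does not run during deserialization (no constructor of the class is invoked), so the restored object ends up with `null`.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class TransientFieldDemo {
    static class Account implements Serializable {
        private static final long serialVersionUID = 1L;
        private final String owner;
        private transient Object lock = new Object(); // skipped by default serialization

        Account(String owner) { this.owner = owner; }

        String owner() { return owner; }
        Object lock() { return lock; }
    }

    @SuppressWarnings("unchecked")
    static <T extends Serializable> T roundTrip(T obj)
            throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (T) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Account copy = roundTrip(new Account("alice"));
        System.out.println(copy.owner());        // prints alice
        System.out.println(copy.lock() == null); // prints true: transient field got its default
    }
}
```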

Overriding Serialization

• Customize serialization by implementing
    private void writeObject(ObjectOutputStream)
        throws IOException;

• Good style to document customization
  – Use @serialData tag

Overriding Serialization

• Example: Thread-safe serialization
    private void writeObject(ObjectOutputStream out)
        throws IOException {
      synchronized (lock) {
        out.defaultWriteObject();
      }
    }

Overriding Serialization

• Example: Filter elements from a list
  – Declare list to be transient
  – In writeObject()
    • Invoke default serialization
    • Iterate over list, writing filtered elements
        out.writeObject(el);
    • Write end-of-list marker
        out.writeObject(Boolean.FALSE);
    • Alternatively, write length & elements

Overriding Deserialization

• Customize deserialization by implementing
    private void readObject(ObjectInputStream)
        throws IOException, ClassNotFoundException;

Overriding Deserialization

• Example: Restore lock
    private void readObject(ObjectInputStream in)
        throws IOException, ClassNotFoundException {
      in.defaultReadObject();
      lock = new Object();
    }
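The thread-safe `writeObject()` and the lock-restoring `readObject()` fit together in one class. The sketch below assumes a hypothetical `LockedCounter`; the lock is transient and therefore cannot be final, since `readObject()` has to assign a fresh one.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class LockedCounterDemo {
    static class LockedCounter implements Serializable {
        private static final long serialVersionUID = 1L;
        private transient Object lock = new Object();
        private int count;

        void increment() { synchronized (lock) { count++; } }
        int count() { synchronized (lock) { return count; } }

        /** @serialData Default fields, written while holding the lock. */
        private void writeObject(ObjectOutputStream out) throws IOException {
            synchronized (lock) {          // no concurrent increment during the write
                out.defaultWriteObject();
            }
        }

        private void readObject(ObjectInputStream in)
                throws IOException, ClassNotFoundException {
            in.defaultReadObject();
            lock = new Object();           // transient field would otherwise stay null
        }
    }

    static LockedCounter roundTrip(LockedCounter c)
            throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(c);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (LockedCounter) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        LockedCounter counter = new LockedCounter();
        counter.increment();
        LockedCounter copy = roundTrip(counter);
        copy.increment();                  // works because the lock was restored
        System.out.println(copy.count());  // prints 2
    }
}
```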

Overriding Deserialization

• Example: Restore list
• In readObject()
  – Invoke default deserialization
  – Read filtered elements until end-of-list marker
  – Alternatively, read length & elements
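One way the filtered list could look end to end, as a sketch under assumptions: the `Inbox` class and the "skip empty messages" filter are hypothetical. The marker is compared with `equals()`, because the `Boolean` read back from the stream need not be the same instance as `Boolean.FALSE`.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

class FilteredListDemo {
    static class Inbox implements Serializable {
        private static final long serialVersionUID = 1L;
        private final String owner;
        private transient List<String> messages = new ArrayList<>();

        Inbox(String owner) { this.owner = owner; }

        void add(String message) { messages.add(message); }
        List<String> messages() { return messages; }
        String owner() { return owner; }

        /**
         * @serialData Default fields, then each non-empty message as a String,
         *             then Boolean.FALSE as the end-of-list marker.
         */
        private void writeObject(ObjectOutputStream out) throws IOException {
            out.defaultWriteObject();
            for (String el : messages) {
                if (!el.isEmpty()) {       // hypothetical filter: drop empty messages
                    out.writeObject(el);
                }
            }
            out.writeObject(Boolean.FALSE);
        }

        private void readObject(ObjectInputStream in)
                throws IOException, ClassNotFoundException {
            in.defaultReadObject();
            messages = new ArrayList<>();  // transient, so rebuild it here
            for (Object el = in.readObject(); !Boolean.FALSE.equals(el); el = in.readObject()) {
                messages.add((String) el);
            }
        }
    }

    static Inbox roundTrip(Inbox inbox) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(inbox);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (Inbox) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Inbox inbox = new Inbox("alice");
        inbox.add("hello");
        inbox.add("");
        inbox.add("world");
        System.out.println(roundTrip(inbox).messages()); // prints [hello, world]
    }
}
```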

Notes on Customization

• Don’t perform operations that take a long time
  – No I/O besides accessing object stream

• Swing UI elements are serializable
  – But are not designed for long-term storage
  – Declare them transient
  – Restore UI in application logic

The Replacements

• Example: Symbols (there can only be one)
    private Object readResolve()
        throws ObjectStreamException {
      return intern(name);
    }

• Done after object graph has been restored
  – Embedded self references are not replaced!
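A sketch of the symbol example filled out into a complete class. The intern table and the `Symbol` class are assumptions made for illustration; only `readResolve()` and `intern(name)` come from the slide. After deserialization the freshly allocated object is replaced by the canonical instance, so identity comparison keeps working.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamException;
import java.io.Serializable;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class SymbolDemo {
    static final class Symbol implements Serializable {
        private static final long serialVersionUID = 1L;
        // Hypothetical intern table: one canonical Symbol per name.
        private static final Map<String, Symbol> TABLE = new ConcurrentHashMap<>();
        private final String name;

        private Symbol(String name) { this.name = name; }

        static Symbol intern(String name) {
            return TABLE.computeIfAbsent(name, Symbol::new);
        }

        String name() { return name; }

        /** Replaces the deserialized copy with the canonical instance. */
        private Object readResolve() throws ObjectStreamException {
            return intern(name);
        }
    }

    static Object roundTrip(Serializable obj) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Symbol lambda = Symbol.intern("lambda");
        Object copy = roundTrip(lambda);
        System.out.println(copy == lambda); // prints true: same canonical instance
    }
}
```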

Inheritance

• If a superclass implements Serializable, all subclasses are also serializable
  – Each class in such a hierarchy serializes only its own state
  – Classes can control all state by implementing java.io.Externalizable

• If superclass is not Serializable, a serializable subclass must handle the superclass’s state
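A sketch of the last case, with hypothetical classes `Point` (not serializable) and `LabeledPoint` (serializable). The subclass writes and reads the inherited fields itself. The superclass also needs a no-argument constructor that is accessible to the subclass, because deserialization runs it to initialize the superclass part.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class SuperclassStateDemo {
    /** Not Serializable; its no-arg constructor runs during deserialization. */
    static class Point {
        protected int x;
        protected int y;
        Point() { }
        Point(int x, int y) { this.x = x; this.y = y; }
    }

    static class LabeledPoint extends Point implements Serializable {
        private static final long serialVersionUID = 1L;
        private final String label;

        LabeledPoint(int x, int y, String label) {
            super(x, y);
            this.label = label;
        }

        String describe() { return label + "@" + x + "," + y; }

        /** @serialData Default fields (label), then x and y of the superclass as ints. */
        private void writeObject(ObjectOutputStream out) throws IOException {
            out.defaultWriteObject();
            out.writeInt(x);
            out.writeInt(y);
        }

        private void readObject(ObjectInputStream in)
                throws IOException, ClassNotFoundException {
            in.defaultReadObject();
            x = in.readInt();
            y = in.readInt();
        }
    }

    static LabeledPoint roundTrip(LabeledPoint p) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(p);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (LabeledPoint) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        LabeledPoint copy = roundTrip(new LabeledPoint(3, 4, "home"));
        System.out.println(copy.describe()); // prints home@3,4
    }
}
```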

Inheritance

• To make a subclass of a serializable class not serializable
    private void writeObject(ObjectOutputStream o)
        throws IOException {
      throw new NotSerializableException(getClass().getName());
    }

• This indicates a semantic problem!

Versioning

• Problem
  – Classes can change
  – While instance is in serialized form

• Solution
  – Let classes declare their version
  – Define what are compatible changes

Stream Unique Identifier (SUID)

• Hash of the class
  – Determined by serialver tool
  – Accessible in Java through ObjectStreamClass.getSerialVersionUID()

• Modified version declares same SUID as original version
    private static final long serialVersionUID = …;
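A short sketch of reading the SUID at run time. `Declared` and `Computed` are hypothetical classes and the value 42 is arbitrary. `ObjectStreamClass.lookup()` returns the class descriptor, or `null` for a class that is not serializable.

```java
import java.io.ObjectStreamClass;
import java.io.Serializable;

class SuidDemo {
    /** Declares its SUID explicitly, so later compatible versions can keep it. */
    static class Declared implements Serializable {
        private static final long serialVersionUID = 42L;
        int value;
    }

    /** Declares nothing, so the SUID is computed as a hash of the class. */
    static class Computed implements Serializable {
        int value;
    }

    static long suidOf(Class<?> type) {
        ObjectStreamClass descriptor = ObjectStreamClass.lookup(type);
        if (descriptor == null) {
            throw new IllegalArgumentException(type.getName() + " is not serializable");
        }
        return descriptor.getSerialVersionUID();
    }

    public static void main(String[] args) {
        System.out.println(suidOf(Declared.class));                   // prints 42
        System.out.println(suidOf(Computed.class));                   // computed hash
        System.out.println(ObjectStreamClass.lookup(Object.class) == null); // prints true
    }
}
```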

Incompatible Changes

• Deleting fields
• Moving classes up or down in the hierarchy
• Changing non-static fields to static
• Changing non-transient fields to transient
• Changing the declared type of a field
• Adding / removing access to default fields from writeObject() / readObject()
• See specification!

Compatible Changes

• Adding fields
• Adding / removing classes
• Adding Serializable
• Adding / removing writeObject() / readObject()
• Changing static fields to non-static
• Changing transient fields to non-transient

Security

• Serialized objects expose their internal state
• If that state is sensitive it must be protected
  – Don’t serialize sensitive state
  – Encrypt sensitive state
  – Encrypt serialized objects
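One possible way to encrypt a serialized object is `javax.crypto.SealedObject`, which serializes its argument and encrypts the resulting bytes. This is a sketch, not a vetted security design: the `Credentials` class, the helper names, and the choice of AES in CBC mode with a freshly generated 128-bit key are assumptions made for illustration, and key management is left out entirely.

```java
import java.io.IOException;
import java.io.Serializable;
import java.security.GeneralSecurityException;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SealedObject;
import javax.crypto.SecretKey;

class SealedStateDemo {
    static class Credentials implements Serializable {
        private static final long serialVersionUID = 1L;
        private final String user;
        private final String secret;

        Credentials(String user, String secret) {
            this.user = user;
            this.secret = secret;
        }

        String user() { return user; }
        String secret() { return secret; }
    }

    static SecretKey newKey() throws GeneralSecurityException {
        KeyGenerator generator = KeyGenerator.getInstance("AES");
        generator.init(128);
        return generator.generateKey();
    }

    /** Serializes obj and encrypts the bytes; the result is itself Serializable. */
    static SealedObject seal(Serializable obj, SecretKey key)
            throws GeneralSecurityException, IOException {
        Cipher cipher = Cipher.getInstance("AES/CBC/PKCS5Padding");
        cipher.init(Cipher.ENCRYPT_MODE, key); // no IV given, so the provider generates one
        return new SealedObject(obj, cipher);
    }

    static Credentials unseal(SealedObject sealed, SecretKey key)
            throws GeneralSecurityException, IOException, ClassNotFoundException {
        return (Credentials) sealed.getObject(key);
    }

    public static void main(String[] args) throws Exception {
        SecretKey key = newKey();
        SealedObject sealed = seal(new Credentials("alice", "s3cret"), key);
        Credentials restored = unseal(sealed, key);
        System.out.println(restored.user());
    }
}
```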

There’s More

• serialPersistentFields to declare the serialized format
  – Useful for backwards compatibility

• ObjectInputValidation to validate deserialized objects

• Class descriptors
  – Serialized form of a class

• Two versions of serialization protocol
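A sketch of the ObjectInputValidation hook, using a hypothetical `Range` class whose invariant is `low <= high`. Since deserialization bypasses the constructor, the invariant is re-checked in `validateObject()`, which the stream calls once the whole object graph has been read. The transient `validated` flag exists only to make the callback observable here.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InvalidObjectException;
import java.io.ObjectInputStream;
import java.io.ObjectInputValidation;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class ValidationDemo {
    static class Range implements Serializable, ObjectInputValidation {
        private static final long serialVersionUID = 1L;
        private final int low;
        private final int high;
        private transient boolean validated; // demo only: records that the callback ran

        Range(int low, int high) {
            if (low > high) {
                throw new IllegalArgumentException("low > high");
            }
            this.low = low;
            this.high = high;
        }

        int low() { return low; }
        int high() { return high; }
        boolean wasValidated() { return validated; }

        private void readObject(ObjectInputStream in)
                throws IOException, ClassNotFoundException {
            in.defaultReadObject();
            in.registerValidation(this, 0);  // priority 0; runs after the graph is restored
        }

        @Override
        public void validateObject() throws InvalidObjectException {
            if (low > high) {
                throw new InvalidObjectException("low > high");
            }
            validated = true;
        }
    }

    static Range roundTrip(Range range) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(range);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (Range) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Range copy = roundTrip(new Range(1, 10));
        System.out.println(copy.wasValidated()); // prints true
    }
}
```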