The History of Java: A case study in software engineering
Dan S. Wallach and Mack Joyner, Rice University
Copyright © 2016 Dan Wallach, All Rights Reserved
The Beginnings
1990: Sun Microsystems (popular Unix system vendor) research project
Design goal: programming for small devices
• Set-top boxes, interactive television, PDAs
• Single-digit MHz CPUs, single-digit MBytes of RAM, etc.
1991-1993: James Gosling and Patrick Naughton develop the “Oak” programming language
1993: The Web is taking off, Oak ➜ Java for web applets
NCSA Mosaic
Written by UIUC undergrads (released in 1993), who founded Mosaic Communications Corp in 1994, later renamed Netscape
They started over and created Netscape Navigator, the basis of today’s Firefox.
HotJava (1995)
Web browser, written in Java.
Supported Java “applets” that ran inside the browser
• “Mobile code”, not just HTML
• This happened before JavaScript!
Late 1995, Netscape announced it would license Java.
Java language principles
Familiar C / C++ syntax (curly braces, etc.)
But really, closer to languages like Modula-2, etc.
• No pointer arithmetic: references to objects are opaque
• No operator overloading
• No multiple inheritance of classes
• No preprocessor
• No explicit memory deallocation
• No global functions - everything in a class
Nice features of Java, all there on day 1
Garbage collector
“Everything is an object” hierarchy
Type safety (vs. C which lets you pretend any memory has any type)
Bounds checking on arrays (vs. C which doesn’t care)
Real strings (vs. C’s array of characters)
Package system
Interfaces
Exceptions for error handling (vs. no standard error handling in C)
Multithreading with language support for locking and synchronization
JavaDoc documentation system
Java is C++ without the guns, knives, and clubs. - James Gosling
Java underlying design
Java compiler: emits machine-independent “bytecode” (.class files)
Stack-machine approach (push, pop, etc., just like your RPN calculator)
Standardized types
Portable: write once, run anywhere
Bytecode is executed on a Java “virtual machine”
All objects are allocated on the heap, not on the stack
Interpreter was small, performance was “good enough”
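To make the stack-machine idea concrete, here is a minimal sketch of a toy RPN evaluator. It is not the real JVM: the class name TinyStackMachine and its three-operator instruction set are made up for illustration. The real thing works on the same push/pop principle, though: for a method like static int add(int a, int b) { return a + b; }, javac emits iload_0, iload_1, iadd, ireturn.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical toy interpreter: NOT the real JVM, just the same push/pop idea.
class TinyStackMachine {
    // Evaluates a space-separated RPN program such as "2 3 + 4 *".
    static int eval(String program) {
        Deque<Integer> stack = new ArrayDeque<>();
        for (String token : program.trim().split("\\s+")) {
            switch (token) {
                case "+":
                    stack.push(stack.pop() + stack.pop());
                    break;
                case "*":
                    stack.push(stack.pop() * stack.pop());
                    break;
                case "-": {
                    // Operand order matters here: the right operand is on top.
                    int right = stack.pop();
                    int left = stack.pop();
                    stack.push(left - right);
                    break;
                }
                default:
                    // Anything else is treated as an integer literal to push.
                    stack.push(Integer.parseInt(token));
            }
        }
        return stack.pop();
    }

    public static void main(String[] args) {
        System.out.println(eval("2 3 + 4 *")); // (2 + 3) * 4
    }
}
```

There are no registers anywhere in the program text, which is part of why bytecode stays machine-independent: operands are named only by their position on the stack.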
Java security?
Java wasn’t initially meant to be “secure” for running untrusted code
Security properties were grafted on with HotJava
Bytecode verifier: reject “malicious” bytecode
Example: verify that you never call a “private” method from outside
Dean, Felten, Wallach (’96): security was easy to break
Today, Java in the browser for untrusted code is dead. Java is
widely used for server apps, Android apps, and other places where
malicious code isn’t a problem.
1996
JDK (Java Development Kit) 1.0 was released
Compiler and runtime for Solaris, Linux, Windows, Mac OS
Java was a huge hit with industrial programmers and in CS education
Sun created JavaSoft (30-40 employees in 1996)
Budimlić (co-taught Comp215 last year) interned at JavaSoft
Netscape was shipping Java to millions of users
Wallach interned at Netscape
Engineering challenges for security
Proposal #1 (Mark Miller and others, “E” language)
Let’s redo all the Java libraries to improve security!
“Capability” style libraries have useful security properties
• No public constructors or static methods
• Instead, somebody passes you a “filesystem” or a “network” at the beginning, and you query it
Rejected: Too many people already using the Java libraries as-is
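A rough sketch of what “capability” style could look like in Java. Every name here (FileSystemCap, InMemoryFs, Plugin, CapabilityDemo) is hypothetical and invented for illustration; this is not the E language's API or any real Java library. The point is only that the untrusted code has no static method to reach for: it can use what it was handed and nothing else.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical capability: holding a reference to it IS the permission.
interface FileSystemCap {
    String read(String path);
    // Returns a weaker capability that only sees paths under the given prefix.
    FileSystemCap restrictTo(String prefix);
}

// Toy in-memory implementation, standing in for a real filesystem.
class InMemoryFs implements FileSystemCap {
    private final Map<String, String> files;
    private final String prefix;

    InMemoryFs(Map<String, String> files, String prefix) {
        this.files = files;
        this.prefix = prefix;
    }

    @Override
    public String read(String path) {
        if (!path.startsWith(prefix)) {
            throw new SecurityException(path + " is outside " + prefix);
        }
        String contents = files.get(path);
        if (contents == null) {
            throw new IllegalArgumentException("no such file: " + path);
        }
        return contents;
    }

    @Override
    public FileSystemCap restrictTo(String narrower) {
        // A capability can be narrowed but never widened.
        if (!narrower.startsWith(prefix)) {
            throw new SecurityException("cannot widen " + prefix + " to " + narrower);
        }
        return new InMemoryFs(files, narrower);
    }
}

// Untrusted code: it can only touch what it was handed at construction.
class Plugin {
    private final FileSystemCap fs;

    Plugin(FileSystemCap fs) {
        this.fs = fs;
    }

    String readNotes() {
        return fs.read("/tmp/notes.txt");
    }

    boolean canRead(String path) {
        try {
            fs.read(path);
            return true;
        } catch (SecurityException e) {
            return false;
        }
    }
}

class CapabilityDemo {
    // The "beginning": the only place where full authority exists.
    static Plugin sandboxedPlugin() {
        Map<String, String> files = new HashMap<>();
        files.put("/tmp/notes.txt", "hello");
        files.put("/etc/passwd", "secret");
        FileSystemCap everything = new InMemoryFs(files, "/");
        return new Plugin(everything.restrictTo("/tmp/"));
    }

    public static void main(String[] args) {
        Plugin plugin = sandboxedPlugin();
        System.out.println(plugin.readNotes());
        System.out.println(plugin.canRead("/etc/passwd"));
    }
}
```

Contrast this with stock Java, where any code can call new FileInputStream(path): that ambient authority is what forced the stack-based checks that Java actually adopted.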
Adding security to Java
Original hack: look at the Java call stack
If the stack is “thick” enough with system code, then it must be fine
Netscape improvement: stack annotations
System code “enables” security-critical APIs only when necessary
If security isn’t enabled, APIs will fail
But the compiler writers hate stack annotations
Restricts their ability to do performance optimizations
Modern solutions don’t do this (take Comp427, learn more later!)
Engineering challenge: startup latency
Java 1.0: each .java file becomes a separate .class file
Each file has to be downloaded as a separate web request
Modestly complex Java applets took over a minute to load
Solution: put everything in a zip file, extend the <applet> tag
Implemented on the side by one Netscape engineer (Warren Harris)
Didn’t ask anybody for permission, just did it (shipped in Netscape 3.0)
No backward compatibility issues
Engineering challenge: adding signatures
Goal: Add a “digital signature” scheme to Java
“Signed applets” could have more security privileges
Challenge: Where to add the crypto without breaking things?
Meeting at JavaSoft: three Netscape engineers + three JavaSoft engineers (including Wallach!)
Failed idea: shoehorn crypto into the .class files
Backward compatibility? Makes it hard to evolve the class file spec
Final idea: add a signatures subdirectory to the .zip file
Netscape’s hack became a real standard and grew additional features
• A “jar” file is just a “zip” file with a particular layout on the inside
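The “jar is just a zip” claim is easy to check with the standard library. The sketch below writes a jar in memory with java.util.jar.JarOutputStream, then lists it with the plain java.util.zip.ZipInputStream, which knows nothing about jars. The class name JarIsZip and the three placeholder bytes standing in for a class file are illustrative assumptions.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.jar.Attributes;
import java.util.jar.JarOutputStream;
import java.util.jar.Manifest;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

class JarIsZip {
    // Builds a tiny jar in memory using the jar-specific writer.
    static byte[] buildJar() {
        Manifest manifest = new Manifest();
        manifest.getMainAttributes().put(Attributes.Name.MANIFEST_VERSION, "1.0");
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (JarOutputStream jar = new JarOutputStream(out, manifest)) {
            jar.putNextEntry(new ZipEntry("Hello.class"));
            jar.write(new byte[] {1, 2, 3}); // placeholder bytes, not a real class file
            jar.closeEntry();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    // Reads the same bytes back with the generic zip reader.
    static List<String> listEntries(byte[] archive) {
        List<String> names = new ArrayList<>();
        try (ZipInputStream zip = new ZipInputStream(new ByteArrayInputStream(archive))) {
            for (ZipEntry entry = zip.getNextEntry(); entry != null; entry = zip.getNextEntry()) {
                names.add(entry.getName());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(listEntries(buildJar()));
    }
}
```

The “particular layout” is the META-INF directory: the manifest lives at META-INF/MANIFEST.MF, and in a signed jar the signature files sit alongside it, so the class files themselves never change.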
Why is the Jar standard better?
Simplicity / Extensibility: crypto code only sees arrays of bytes
Crypto can work on other file types (images, etc.)
Orthogonality: Java .class file format is unrelated to the crypto
If the class file format changes, crypto doesn’t care
If you don’t care about crypto, you can just ignore the crypto
Performance: faster downloads (single HTTP get)
Fun fact: The Jar spec defined two different crypto algorithms
And, of course, Netscape implemented one, JavaSoft the other one
Other Java features trickle in
JDK 1.1 (1997): Java gets lots more useful for general programming!
inner/anonymous classes, RMI, reflection, JIT compiler
JDK 1.2 (aka Java2) (1998): New libraries!
updated security model (joint with Netscape)
java.util “collections” classes
“strictfp” updates (useful for scientific computing)
Java3 (2000): Performance!
HotSpot JVM (radically improved performance)
Java4 (2002): More new libraries!
Meanwhile at Microsoft
Internet Explorer 3 (1996) had its own version of Java
Reimplemented from scratch inside Microsoft
Tweaked the language
New libraries
• Windows-only!
“Embrace and extend”
Sun sued Microsoft (1997)
Microsoft’s solution (in 2000): C#
C# is sort of Java with reliability, productivity, and security deleted. - James Gosling
Actually, C# learned a lot of lessons from Java
Arrays of objects are contiguous in memory: better for performance
Common Language Runtime (CLR) supported VisualBasic, others
Built from the beginning to be compiled
Better integration with Microsoft COM
Much simpler for writing a new Windows program
Microsoft Research got involved in C# security before it shipped
C#, F#, Spec#, and others: Microsoft is still innovating around C#
Java5 (2004): Added generics!
Java5 was a huge change from earlier Java
No generics prior to 2004, even though it was widely requested
Lots of other features
Autoboxing/unboxing of primitive types (int/Integer, etc.)
Static imports
Annotations (e.g., @Override)
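A small sketch that exercises the Java5 features listed above in one place. The class name Java5Features is made up for illustration.

```java
import static java.lang.Math.max; // static import: call max() without the Math. prefix

import java.util.List;

class Java5Features {
    // Generics: the element type is checked at compile time.
    static int largest(List<Integer> values) {
        int best = Integer.MIN_VALUE;
        for (Integer boxed : values) {   // the for-each loop also arrived in Java5
            best = max(best, boxed);     // auto-unboxing: Integer -> int
        }
        return best;
    }

    // Annotation: the compiler rejects this if it does not actually override anything.
    @Override
    public String toString() {
        return "Java5Features demo";
    }
}
```

Before Java5, the loop body would have needed an explicit cast from Object plus a call to intValue().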
Java generic engineering
Challenge: should List<Integer> and List<String> be different?
In Java 4, there was only java.util.List (i.e., List of Object)
With C++ templates, the compiler specializes the code
• Benefit: performance, Cost: code-size bloat
Complication: Should the JVM have to know about generics?
Many JVMs now besides Sun (e.g., IBM)
Solution: Type erasure
Type parameters only exist in the source code, not the compiled code!
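Erasure can be observed directly: at runtime, a List<Integer> and a List<String> are instances of the very same class. A minimal sketch (the class name ErasureDemo is made up for illustration):

```java
import java.util.ArrayList;
import java.util.List;

class ErasureDemo {
    // The two lists have different static types but one shared runtime class.
    static boolean sameRuntimeClass() {
        List<Integer> ints = new ArrayList<>();
        List<String> strings = new ArrayList<>();
        return ints.getClass() == strings.getClass();
    }

    // The runtime class name carries no trace of the type parameter.
    static String runtimeName() {
        List<Integer> ints = new ArrayList<>();
        return ints.getClass().getName();
    }

    public static void main(String[] args) {
        System.out.println(sameRuntimeClass());
        System.out.println(runtimeName());
    }
}
```

This is why the JVM did not need to change: as far as it can tell, there is still only one ArrayList.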
Type erasure: good, bad, ugly
Good: we can implement singleton empty lists
    static <T> IList<T> makeEmpty() {
        @SuppressWarnings("unchecked")
        IList<T> typedEmptyList = (IList<T>) Empty.SINGLETON;
        return typedEmptyList;
    }
Bad: you can’t say new T() (because T isn’t there at runtime)
T isn’t a “reified type” (C# supports reified types)
Ugly: you instead end up passing around Class<T> instances
    public class NamedMatcher<PatternT extends Enum<PatternT> & NamedMatcher.TokenPatterns> {
        public NamedMatcher(@NotNull Class<PatternT> enumPatternsClazz) { …
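A sketch of the Class<T> workaround in isolation. Since new T() is impossible, the caller passes the class object explicitly and reflection does the rest. The names TypeTokens and Color are hypothetical, and this is not the NamedMatcher code from the slide, only the same pattern.

```java
// Hypothetical enum, used only to have something to enumerate.
enum Color { RED, GREEN, BLUE }

class TypeTokens {
    // Stand-in for "new T()": requires T to have an accessible no-argument constructor.
    static <T> T create(Class<T> type) {
        try {
            return type.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("cannot instantiate " + type.getName(), e);
        }
    }

    // With the class object in hand, an enum's constants can be listed at runtime.
    static <E extends Enum<E>> int countConstants(Class<E> enumClass) {
        return enumClass.getEnumConstants().length;
    }

    public static void main(String[] args) {
        StringBuilder sb = create(StringBuilder.class);
        System.out.println(sb.length());
        System.out.println(countConstants(Color.class));
    }
}
```

The cost is visible at every call site: the type gets written twice, once for the compiler and once as a runtime value.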
Remember: we want to find bugs early!
    // create an array of strings
    String[] strings = new String[10];
    // cast it to an array of objects
    Object[] objects = strings;
    // insert an object into the array
    objects[0] = new Object();    // runtime exception
Similar code with generic Vectors:
    // create a vector of strings
    Vector<String> strings = new Vector<String>(10);
    // cast it to a vector of objects
    Vector<Object> objects = (Vector<Object>)strings;    // compiler error
    // insert an object into the vector
    objects.add(new Object());
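The array half of this comparison can be run as-is; the generic half has to stay commented out because it does not compile. A minimal sketch (the class name CovarianceDemo is made up for illustration) that also shows the wildcard form that is allowed:

```java
import java.util.ArrayList;
import java.util.List;

class CovarianceDemo {
    // Arrays are covariant, so the mistake is only caught when the store happens.
    static boolean arrayStoreFailsAtRuntime() {
        String[] strings = new String[10];
        Object[] objects = strings;        // legal: String[] is a subtype of Object[]
        try {
            objects[0] = new Object();     // the JVM checks the array's real element type
            return false;
        } catch (ArrayStoreException e) {
            return true;
        }
    }

    // Generics are invariant, so the equivalent mistake never gets past the compiler.
    static int readOnlyViewSize() {
        List<String> strings = new ArrayList<>();
        strings.add("hello");
        // List<Object> objects = strings;         // does not compile
        List<? extends Object> view = strings;     // allowed: reads only
        // view.add(new Object());                 // does not compile either
        return view.size();
    }

    public static void main(String[] args) {
        System.out.println(arrayStoreFailsAtRuntime());
        System.out.println(readOnlyViewSize());
    }
}
```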
Java6 (2006)
Mostly, lots of library support
Notable new features
Pluggable annotations (e.g., @Contract and @NotNull)
JavaScript integration (more on this in a few weeks)
Meanwhile at Yahoo!
Hadoop: Started in 2002 at Yahoo!
Now: Open source (Apache Software Foundation)
Google’s MapReduce programming model
Decompose your programs into “maps” and “folds” (sound familiar?)
Java virtual machines on all nodes
Hadoop Distributed filesystem (HDFS)
Widely used across industry (newer alternatives are better…)
Meanwhile, Android!
Android, Inc. started in 2003, acquired by Google in 2005
All Android apps (and half of Android itself) are written in Java.
Android doesn’t use the JVM, or any of the Java graphics libraries
All built from scratch. But Sun seemed okay with this.
Meanwhile, Apple!
Mac OS X (circa 2000) supported Java Apple integrated updates from Sun.
Stock Java language, custom APIs for access to Mac features
2009: Oracle merges with Sun
2011: Java7
Support for dynamic languages (new JVM byte code instructions)
Used by a JavaScript engine that they also added
Better support for external compilers/debuggers (like IntelliJ)
Type inference (the diamond <> type)
Better support for 64-bit computers
Oracle decides to “monetize” Java
Legal risk? What do you do?
Apple’s solution: Swift (a whole new programming language)
Lessons learned from Java, C#, others
Special language syntax to deal with null
Compatible with existing Objective-C code
• Reference-counted memory (not GC!)
Microsoft looks pretty smart
Owning their core technologies…
What did Google do?
Android supported Java APIs, but reimplemented the libraries and VM from scratch
Oracle’s lawsuit: copyright on APIs?
Outcome: Google won (so far)
Recently, Google has partial Java8 support with their own toolchain
Meanwhile, Java8 (2014)
First serious changes to Java since Java5 (ten years earlier!)
Lambdas!
• Using the invokedynamic instruction introduced in Java7.
• Labeling old interfaces “functional” allows lambdas to replace many existing uses of anonymous inner classes.
Default methods on interfaces!
Type inference!
Java streams (parallel execution on functional list-like things)!
Better garbage collection!
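A side-by-side sketch of the same Comparator written as an anonymous inner class and as a lambda. Comparator is one of the old interfaces that counts as “functional” because it has a single abstract method. The class name LambdaDemo is made up for illustration.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class LambdaDemo {
    // Pre-Java8 style: an anonymous inner class implementing Comparator.
    static List<String> byLengthOldStyle(List<String> words) {
        List<String> copy = new ArrayList<>(words);
        copy.sort(new Comparator<String>() {
            @Override
            public int compare(String a, String b) {
                return Integer.compare(a.length(), b.length());
            }
        });
        return copy;
    }

    // Java8 style: the lambda is converted to a Comparator by the compiler.
    static List<String> byLengthWithLambda(List<String> words) {
        List<String> copy = new ArrayList<>(words);
        copy.sort((a, b) -> Integer.compare(a.length(), b.length()));
        return copy;
    }

    public static void main(String[] args) {
        List<String> words = new ArrayList<>();
        words.add("ccc");
        words.add("a");
        words.add("bb");
        System.out.println(byLengthOldStyle(words));
        System.out.println(byLengthWithLambda(words));
    }
}
```

Both versions behave the same; the parameter types of the lambda are inferred from the Comparator<String> that sort expects.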
Challenge: adding new features for old code
Java collections classes (HashMap, ArrayList, etc.) are roughly the same as a decade earlier, used everywhere
You can’t change any API without breaking everything!
Third-party libraries also implement the same interfaces!
Default methods on interfaces are a clean and clever solution
Add a new default method to the interface:
• It works on every implementation of the interface
• Support new features in terms of old methods
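A minimal sketch of the default-method trick, using made-up names (Shape, Square). Imagine that area() is the original interface and that Square was written by a third party long ago; the two default methods are added later, and Square gets them without being touched.

```java
// Hypothetical interface: area() is the "old" API, the defaults are the "new" API.
interface Shape {
    double area();

    // Added later: implemented purely in terms of the old method.
    default boolean isLargerThan(Shape other) {
        return area() > other.area();
    }

    default String describe() {
        return "shape with area " + area();
    }
}

// An "old" implementation that only knows about area().
class Square implements Shape {
    private final double side;

    Square(double side) {
        this.side = side;
    }

    @Override
    public double area() {
        return side * side;
    }
}

class DefaultMethodDemo {
    public static void main(String[] args) {
        Shape big = new Square(2);
        Shape small = new Square(1);
        System.out.println(big.isLargerThan(small));
        System.out.println(big.describe());
    }
}
```

The JDK itself did this in Java8: Iterable.forEach, Collection.stream, and List.sort are all default methods added to existing interfaces.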
Challenge: adding functional programming
1) Immutable “wrappers” around existing mutating classes
(They couldn’t do what we do in Comp215 and start over.)
2) Totally new “streams” which are a lot like our IList
But they can run in parallel (more on this later in the semester)
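A sketch of both approaches. The slide does not say which wrapper classes it has in mind; Collections.unmodifiableList is used here as one existing example, and that choice is an assumption. The class name FunctionalStyleDemo is also made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class FunctionalStyleDemo {
    // 1) A read-only wrapper: same List interface, but mutation fails at runtime.
    static boolean wrapperRejectsWrites() {
        List<String> mutable = new ArrayList<>(Arrays.asList("a", "b"));
        List<String> readOnly = Collections.unmodifiableList(mutable);
        try {
            readOnly.add("c");
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    // 2) A stream pipeline: filter, map, and reduce with no mutation in user code.
    //    parallelStream() lets the library split the work across threads.
    static int sumOfEvenSquares(List<Integer> numbers) {
        return numbers.parallelStream()
                .filter(n -> n % 2 == 0)
                .mapToInt(n -> n * n)
                .sum();
    }

    public static void main(String[] args) {
        System.out.println(wrapperRejectsWrites());
        System.out.println(sumOfEvenSquares(Arrays.asList(1, 2, 3, 4, 5, 6)));
    }
}
```

The wrapper shows the compromise: the type still advertises add(), so the error surfaces only at runtime rather than at compile time.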
Meanwhile, other JVM languages
JRuby, Jython: Run Ruby or Python code in the JVM!
Groovy: dynamically typed (your “gradle” files are Groovy programs)
Scala: functional, object-oriented, and lots more (used in Comp311)
Clojure: a LISP-family language
Kotlin: like Scala, but simpler (used internally by IntelliJ)
Wallach’s temptation: teach Comp215 in Kotlin rather than Java8.
And lots of other random projects
Java Card: programming for tiny not-so-smartcards
Still used in some places
Java Micro Edition: flip phone programming
Apple iOS and Google Android won; Oracle (and Microsoft) lost
Java TV: back to Java’s set-top box roots!
Most “smart” TVs today are just a corner-to-corner web browser
Java is part of the Blu-ray standard (BD-J)
Twenty years later… How’s Java doing?
Most Java code written in 1995 will still compile and run just fine today
Really old C code doesn’t always work properly with modern C compilers.
Java has huge tooling and library support
And for CS education, high school and college, it’s still the standard.
Java gave up on running untrusted code in the browser
JavaScript now has that particular honor, and it’s got its own issues.
Java’s type safety makes it more resilient to security attacks than C or C++.
Are Java’s days numbered?
Twenty years of engineering decisions piled on top of one another.
But new JVM languages (like Kotlin) are very pleasant to use.