Java Collections The Force Awakens - JAX London
Java Collections
The Force Awakens
Darth @RaoulUK
Darth @RichardWarburto
Collection Problems
Java Episode 8 & 9
Persistent & Immutable Collections
HashMaps
Collection bugs
1. Element access (off-by-one error, ArrayIndexOutOfBoundsException)
2. Concurrent modification
3. Check-then-Act
Scenario 1
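List<String> jedis = new ArrayList<>(asList("Luke", "yoda"));
for (String jedi: jedis) {
    if (Character.isLowerCase(jedi.charAt(0))) {
        // Structural change during for-each iteration: the next
        // iterator step throws ConcurrentModificationException
        jedis.remove(jedi);
    }
}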
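One way to avoid the exception in Scenario 1, sketched under the assumption that the goal is simply to drop names starting with a lower-case letter. The class and method names are made up for illustration; `removeIf` and `Iterator.remove` are standard JDK calls.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class SafeRemoval {

    // Java 8+: let the collection perform the removal itself.
    static List<String> removeLowerCaseWithRemoveIf(List<String> names) {
        List<String> copy = new ArrayList<>(names);
        copy.removeIf(name -> Character.isLowerCase(name.charAt(0)));
        return copy;
    }

    // Before Java 8: remove through the iterator that drives the traversal.
    static List<String> removeLowerCaseWithIterator(List<String> names) {
        List<String> copy = new ArrayList<>(names);
        for (Iterator<String> it = copy.iterator(); it.hasNext(); ) {
            if (Character.isLowerCase(it.next().charAt(0))) {
                it.remove();
            }
        }
        return copy;
    }

    public static void main(String[] args) {
        List<String> jedis = Arrays.asList("Luke", "yoda");
        System.out.println(removeLowerCaseWithRemoveIf(jedis));
        System.out.println(removeLowerCaseWithIterator(jedis));
    }
}
```

Both variants keep the removal and the traversal in agreement about the list's state, which is what the plain for-each loop fails to do.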
Scenario 2
Map<String, BigDecimal> movieViews = new HashMap<>();
BigDecimal views = movieViews.get(MOVIE);
if (views != null) {
    movieViews.put(MOVIE, views.add(BigDecimal.ONE));
}
Check: movieViews.get, views != null
Then
Act: movieViews.put
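A possible atomic version of Scenario 2, assuming the map is shared between threads and can be a `ConcurrentHashMap`. The `AtomicViewCounter` class is hypothetical; `computeIfPresent` and `merge` are standard `ConcurrentMap` methods that fold the check and the act into one step.

```java
import java.math.BigDecimal;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

class AtomicViewCounter {
    private final ConcurrentMap<String, BigDecimal> movieViews = new ConcurrentHashMap<>();

    // Same intent as Scenario 2 (only update an existing entry),
    // but the null check and the update run as one atomic operation.
    void incrementIfTracked(String movie) {
        movieViews.computeIfPresent(movie, (key, views) -> views.add(BigDecimal.ONE));
    }

    // Variant that also starts counting for a movie not seen before.
    void increment(String movie) {
        movieViews.merge(movie, BigDecimal.ONE, BigDecimal::add);
    }

    BigDecimal views(String movie) {
        return movieViews.get(movie);
    }

    public static void main(String[] args) {
        AtomicViewCounter counter = new AtomicViewCounter();
        counter.increment("The Force Awakens");
        counter.incrementIfTracked("The Force Awakens");
        System.out.println(counter.views("The Force Awakens"));
    }
}
```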
Reducing scope for bugs
● ~280 bugs in 28 projects including Cassandra, Lucene
● ~80% of the check-then-act bugs discovered are put-if-absent
● Library designers can help by updating APIs as new idioms emerge
● Different data structures can provide alternatives by restricting reads & updates to reduce scope for bugs
CHECK-THEN-ACT Misuse of Java Concurrent Collections
http://dig.cs.illinois.edu/papers/checkThenAct.pdf
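To make the put-if-absent pattern concrete, here is a small sketch. The `PutIfAbsentIdiom` class and its cast-list example are invented for illustration and are not taken from the paper; the racy method shows the shape of the bug, the second one a possible fix using `computeIfAbsent`.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

class PutIfAbsentIdiom {
    private final Map<String, List<String>> castByMovie = new ConcurrentHashMap<>();

    // Racy check-then-act: two threads can both observe "absent",
    // both put a fresh list, and one thread's actor is lost.
    void addActorRacy(String movie, String actor) {
        if (!castByMovie.containsKey(movie)) {
            castByMovie.put(movie, new CopyOnWriteArrayList<>());
        }
        castByMovie.get(movie).add(actor);
    }

    // Atomic alternative: the map creates the list at most once per key.
    void addActor(String movie, String actor) {
        castByMovie.computeIfAbsent(movie, key -> new CopyOnWriteArrayList<>()).add(actor);
    }

    List<String> cast(String movie) {
        return castByMovie.get(movie);
    }

    public static void main(String[] args) {
        PutIfAbsentIdiom idiom = new PutIfAbsentIdiom();
        idiom.addActor("Episode IV", "Mark Hamill");
        idiom.addActor("Episode IV", "Carrie Fisher");
        System.out.println(idiom.cast("Episode IV"));
    }
}
```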
Collection Problems
Java Episode 8 & 9
Persistent & Immutable Collections
HashMaps
Java 8 Lazy Collection Initialization
Many allocated HashMaps and ArrayLists are never written to, e.g. with the Null Object pattern
Java 8 adds Lazy Initialization for the default initialization case
Typically 1-2% reduction in memory consumption
http://www.javamagazine.mozaicreader.com/MarApr2016/Twitter#&pageSet=28&page=0
Java 9 API updates
Collection factory methods
● Non-goal to provide persistent immutable collections
● http://openjdk.java.net/jeps/269
java.util.Optional
● ifPresentOrElse(), or(), stream(), getWhenPresent()
● Optional.get() will be deprecated in future
java.util.stream.Stream & java.util.stream.Collectors
● takeWhile, dropWhile
● filtering, flatMapping
java.util.concurrent.CompletableFuture
● orTimeout, completeOnTimeout
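A short tour of some of the additions listed above, as they exist in the JDK from Java 9 onwards. The `Java9ApiTour` class, its method names and the sample values are illustrative only.

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;
import java.util.stream.Collectors;

class Java9ApiTour {

    // takeWhile: keep the leading elements that match, stop at the first miss.
    static List<Integer> leadingSmall(List<Integer> numbers) {
        return numbers.stream().takeWhile(n -> n < 3).collect(Collectors.toList());
    }

    // dropWhile: skip the leading elements that match, keep everything after.
    static List<Integer> afterLeadingSmall(List<Integer> numbers) {
        return numbers.stream().dropWhile(n -> n < 3).collect(Collectors.toList());
    }

    // Collectors.filtering: filter inside each group, so every key seen keeps an entry.
    static Map<Boolean, List<Integer>> bigNumbersByParity(List<Integer> numbers) {
        return numbers.stream().collect(
                Collectors.groupingBy(n -> n % 2 == 0,
                        Collectors.filtering(n -> n > 2, Collectors.toList())));
    }

    // Optional.or: fall back to another Optional without unwrapping the first.
    static String firstPresent(Optional<String> primary, Optional<String> fallback) {
        return primary.or(() -> fallback).orElse("nobody");
    }

    // completeOnTimeout: supply a default if the future is still pending after the delay.
    static String withDefault(CompletableFuture<String> future) {
        return future.completeOnTimeout("timed out", 50, TimeUnit.MILLISECONDS).join();
    }

    public static void main(String[] args) {
        List<Integer> numbers = List.of(1, 2, 3, 1); // collection factory method
        System.out.println(leadingSmall(numbers));
        System.out.println(afterLeadingSmall(numbers));
        System.out.println(bigNumbersByParity(List.of(1, 2, 3, 4)));
        System.out.println(firstPresent(Optional.empty(), Optional.of("Rey")));
        System.out.println(withDefault(new CompletableFuture<>()));
    }
}
```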
Collection Problems
Java Episode 8 & 9
Persistent & Immutable Collections
HashMaps
Categorising Collections
Mutable: Unsynchronized, Concurrent
Immutable: Non-Persistent, Persistent
Unmodifiable View
Legend: Available in Core Library
Mutable
● Popular friends include ArrayList, HashMap, TreeSet
● Memory-efficient modification operations
● State can be accidentally modified
● Can be thread-safe, but requires careful design
Unmodifiable
List<String> jedis = new ArrayList<>();
jedis.add("Luke Skywalker");
List<String> cantChangeMe = Collections.unmodifiableList(jedis);
// java.lang.UnsupportedOperationException
//cantChangeMe.add("Darth Vader");
System.out.println(cantChangeMe); // [Luke Skywalker]
jedis.add("Darth Vader");
System.out.println(cantChangeMe); // [Luke Skywalker, Darth Vader]
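The example above shows that an unmodifiable wrapper is only a view. A minimal sketch of the usual workaround, a defensive copy taken before wrapping, follows; `ViewVersusSnapshot` and its method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class ViewVersusSnapshot {

    // A view rejects writes itself but still reflects changes to the backing list.
    static List<String> view(List<String> source) {
        return Collections.unmodifiableList(source);
    }

    // A snapshot copies first, so later changes to the source are not visible.
    static List<String> snapshot(List<String> source) {
        return Collections.unmodifiableList(new ArrayList<>(source));
    }

    public static void main(String[] args) {
        List<String> jedis = new ArrayList<>();
        jedis.add("Luke Skywalker");
        List<String> view = view(jedis);
        List<String> snapshot = snapshot(jedis);
        jedis.add("Darth Vader");
        System.out.println(view);
        System.out.println(snapshot);
    }
}
```

The copy costs time and memory proportional to the list size, which is the trade-off the following slides on immutable and persistent collections address.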
Immutable & Non-persistent
● No updates
● Flexibility to convert the source into a more efficient representation
● No locking in context of concurrency
● Satisfies co-variant subtyping requirements
● Can be copied with modifications to create a new version (can be expensive)
Immutable vs. Mutable hierarchy
ImmutableList MutableList
+ ImmutableList<T> toImmutable()
java.util.List
+ MutableList<T> toList()
Eclipse Collections (formerly GS Collections) https://projects.eclipse.org/projects/technology.collections/
ListIterable
Immutable and Persistent
● Changing the source produces a new version of the collection
● The resulting collection shares structure with the source to avoid full copying on updates
Persistent List (aka Cons)
public final class Cons<T> implements ConsList<T> {
    private final T head;
    private final ConsList<T> tail;

    public Cons(T head, ConsList<T> tail) {
        this.head = head;
        this.tail = tail;
    }

    // Prepending allocates one node and shares the whole existing list as its tail
    @Override
    public ConsList<T> add(T e) {
        return new Cons<>(e, this);
    }
}
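The slide only shows the `Cons` cell. A self-contained sketch of what the rest might look like is given here: the `ConsList` interface shape, the `Nil` class and the `head()`, `tail()`, `isEmpty()` and `empty()` members are all assumptions made for this example, not the speakers' code.

```java
import java.util.NoSuchElementException;

interface ConsList<T> {
    ConsList<T> add(T e);   // prepend, returning a new list
    boolean isEmpty();
    T head();
    ConsList<T> tail();

    static <T> ConsList<T> empty() {
        return new Nil<>();
    }
}

// Hypothetical end-of-list marker.
final class Nil<T> implements ConsList<T> {
    @Override public ConsList<T> add(T e) { return new Cons<>(e, this); }
    @Override public boolean isEmpty() { return true; }
    @Override public T head() { throw new NoSuchElementException("head of empty list"); }
    @Override public ConsList<T> tail() { throw new NoSuchElementException("tail of empty list"); }
}

final class Cons<T> implements ConsList<T> {
    private final T head;
    private final ConsList<T> tail;

    Cons(T head, ConsList<T> tail) {
        this.head = head;
        this.tail = tail;
    }

    @Override public ConsList<T> add(T e) { return new Cons<>(e, this); }
    @Override public boolean isEmpty() { return false; }
    @Override public T head() { return head; }
    @Override public ConsList<T> tail() { return tail; }
}

class ConsDemo {
    public static void main(String[] args) {
        ConsList<String> base = ConsList.<String>empty().add("Z").add("Y");
        ConsList<String> withA = base.add("A");
        ConsList<String> withB = base.add("B");
        // Both new lists reuse the very same 'base' nodes as their tail.
        System.out.println(withA.tail() == base && withB.tail() == base);
    }
}
```

Because nothing is ever mutated, two versions can share the same tail safely, with no locking and no copying.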
Updating Persistent List
A B C X Y Z
Before
Updating Persistent List
A B C X Y Z
Before
A B D
After
Blue nodes indicate new copies
Purple nodes indicate nodes we wish to update
Concatenating Two Persistent Lists
A B C
X Y Z
Before
Concatenating Two Persistent Lists
- Poor locality due to pointer chasing
- Copying of nodes
A B C
X Y Z
Before
A B C
After
Persistent List
● Structural sharing: no need to copy full structure
● Poor locality due to pointer chasing
● Copying becomes more expensive with larger lists
● Poor Random Access and thus Data Decomposition
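The update and concatenation pictures above can be sketched in code. This is a deliberately bare list (the `PList` name, the use of `null` as the end marker and the helper methods are all assumptions for illustration) that shows which nodes get copied and which are shared.

```java
import java.util.ArrayList;
import java.util.List;

final class PList<T> {
    final T head;
    final PList<T> tail; // null marks the end of the list in this sketch

    PList(T head, PList<T> tail) {
        this.head = head;
        this.tail = tail;
    }

    // Replace the element at index: nodes before it are copied, nodes after it are shared.
    static <T> PList<T> set(PList<T> list, int index, T value) {
        if (list == null) throw new IndexOutOfBoundsException("index outside the list");
        if (index == 0) return new PList<>(value, list.tail);
        return new PList<>(list.head, set(list.tail, index - 1, value));
    }

    // Concatenate: every node of 'first' is copied, 'second' is shared as it is.
    static <T> PList<T> concat(PList<T> first, PList<T> second) {
        if (first == null) return second;
        return new PList<>(first.head, concat(first.tail, second));
    }

    @SafeVarargs
    static <T> PList<T> of(T... values) {
        PList<T> result = null;
        for (int i = values.length - 1; i >= 0; i--) {
            result = new PList<>(values[i], result);
        }
        return result;
    }

    static <T> List<T> toJavaList(PList<T> list) {
        List<T> out = new ArrayList<>();
        for (PList<T> node = list; node != null; node = node.tail) {
            out.add(node.head);
        }
        return out;
    }

    public static void main(String[] args) {
        PList<String> before = of("A", "B", "C", "X", "Y", "Z");
        PList<String> after = set(before, 2, "D");
        System.out.println(toJavaList(before));
        System.out.println(toJavaList(after));
        // The X node is the same object in both versions.
        System.out.println(after.tail.tail.tail == before.tail.tail.tail);
    }
}
```

The cost of both operations grows with the number of nodes in front of the change, which is why copying gets more expensive for larger lists.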
Updating Persistent Binary Tree
Before
Updating Persistent Binary Tree
After
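A sketch of the kind of update the before/after pictures describe, assuming an unbalanced binary search tree over `int` keys. `PTree` is a hypothetical class; the technique shown is path copying, where only the nodes between the root and the change are rebuilt.

```java
final class PTree {
    final int key;
    final PTree left;
    final PTree right;

    PTree(int key, PTree left, PTree right) {
        this.key = key;
        this.left = left;
        this.right = right;
    }

    // Returns the root of a new version. Nodes on the search path are
    // copied; every subtree off that path is shared with the old version.
    static PTree insert(PTree node, int key) {
        if (node == null) return new PTree(key, null, null);
        if (key < node.key) return new PTree(node.key, insert(node.left, key), node.right);
        if (key > node.key) return new PTree(node.key, node.left, insert(node.right, key));
        return node; // key already present: this subtree is reused unchanged
    }

    static boolean contains(PTree node, int key) {
        while (node != null) {
            if (key == node.key) return true;
            node = key < node.key ? node.left : node.right;
        }
        return false;
    }

    public static void main(String[] args) {
        PTree v1 = null;
        for (int key : new int[] {50, 30, 70, 20, 40}) {
            v1 = insert(v1, key);
        }
        PTree v2 = insert(v1, 60);
        System.out.println(contains(v1, 60)); // old version is untouched
        System.out.println(contains(v2, 60));
        System.out.println(v2.left == v1.left); // left subtree is shared
    }
}
```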
Persistent Array
How do we get the immutability benefits with the performance of mutable variants?
Trie
1. Branch factor not limited to binary
2. Leaf nodes contain actual values
3. Picking the right branch is done by using parts of the key as a lookup
[Diagram: trie with a root node, edges labelled a, b, c, e, f, and leaf values 10, 45 and 20]
Persistent Array (Bitmapped Vector Trie)
[Diagram: trie of 32-slot nodes (slots 0, 1, ... 31), from Level 1 (root) through Level 2 down to the leaf nodes]
Trade-offs
● Large branching factor facilitates iteration but hinders updates
● Small branching factor facilitates updates but hinders traversal
Java Persistent Collections
- Not available as part of Java Core Library
- Existing projects include
  - PCollections: https://github.com/hrldcpr/pcollections
  - Port of Clojure DS: https://github.com/krukow/clj-ds
  - Port of Scala DS: https://github.com/andrewoma/dexx
  - Coming soon to Javaslang
Memory usage survey
10,000,000 elements, heap < 32GB
int[]: 40MB
Integer[]: 160MB
ArrayList<Integer>: 215MB
PersistentVector<Integer>: 214MB (Clojure-DS)
Vector<Integer>: 206MB (Dexx, port of Scala-DS)
Data collected using Java Object Layout: http://openjdk.java.net/projects/code-tools/jol/
Primitive specialised collections
● Collections often hold boxed representations of primitive values
● Java 8 introduced IntStream, LongStream, DoubleStream and primitive specialised functional interfaces
● Other libraries, e.g. Agrona, Koloboke and Eclipse Collections, provide primitive specialised collections today.
● Valhalla investigates primitive specialised generics
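A small illustration of the boxed versus primitive specialised distinction using only JDK types. The `PrimitiveStreams` class and its method names are made up for this sketch; it says nothing about relative speed, only about which types flow through the pipeline.

```java
import java.util.function.IntBinaryOperator;
import java.util.stream.IntStream;
import java.util.stream.Stream;

class PrimitiveStreams {

    // Boxed: every element travelling through the pipeline is an Integer object.
    static int sumBoxed(int n) {
        return Stream.iterate(1, i -> i + 1).limit(n).reduce(0, Integer::sum);
    }

    // Primitive specialised: IntStream works on int values directly.
    static int sumPrimitive(int n) {
        return IntStream.rangeClosed(1, n).sum();
    }

    // Primitive specialised functional interface: no Integer in the signature.
    static int sumWithOperator(int n) {
        IntBinaryOperator add = (a, b) -> a + b;
        return IntStream.rangeClosed(1, n).reduce(0, add);
    }

    public static void main(String[] args) {
        System.out.println(sumBoxed(100));
        System.out.println(sumPrimitive(100));
        System.out.println(sumWithOperator(100));
    }
}
```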
Takeaways
● Immutable collections reduce the scope for bugs
● Always a compromise between programming safety and performance
● Performance of persistent data structures is improving
Collection Problems
Java Episode 8 & 9
Persistent & Immutable Collections
HashMaps
HashMaps Basics
...
Han Solo: hash = 72309
Chewbacca: hash = 72309
HashMaps
Chaining: a separate data structure for collision lookups
Probing: store inline and have a probing sequence
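To make "store inline and have a probing sequence" concrete, here is a toy linear-probing map. `ProbingMap` is a hypothetical teaching sketch with a fixed capacity, no resizing and no removal; it is not how any particular library implements probing.

```java
final class ProbingMap<K, V> {
    private final Object[] keys;
    private final Object[] values;
    private int size;

    ProbingMap(int capacity) {
        keys = new Object[capacity];
        values = new Object[capacity];
    }

    // Linear probing: start at the hashed slot and step forward one slot
    // at a time until the key or an empty slot is found.
    private int findSlot(Object key) {
        int index = Math.floorMod(key.hashCode(), keys.length);
        for (int probes = 0; probes < keys.length; probes++) {
            if (keys[index] == null || keys[index].equals(key)) {
                return index;
            }
            index = (index + 1) % keys.length;
        }
        return -1; // every slot is taken by other keys
    }

    void put(K key, V value) {
        int slot = findSlot(key);
        if (slot < 0) throw new IllegalStateException("table full (no resizing in this sketch)");
        if (keys[slot] == null) size++;
        keys[slot] = key;
        values[slot] = value;
    }

    @SuppressWarnings("unchecked")
    V get(K key) {
        int slot = findSlot(key);
        return slot < 0 ? null : (V) values[slot];
    }

    int size() {
        return size;
    }

    public static void main(String[] args) {
        ProbingMap<Integer, String> map = new ProbingMap<>(8);
        map.put(1, "Han Solo");
        map.put(9, "Chewbacca"); // 9 and 1 land on the same slot in a table of 8
        System.out.println(map.get(1) + ", " + map.get(9));
    }
}
```

A chaining map would instead hang both colliding entries off slot 1 in a separate list or tree.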
Aliases: Palpatine vs Darth Sidious
HashMaps
Chaining: aka Closed Addressing, aka Open Hashing
Probing: aka Open Addressing, aka Closed Hashing
HashMaps
Chaining: Linked List Based, Tree Based
Probing
java.util.HashMap
Chaining Based HashMap
Historically maintained a LinkedList in the case of a collision
Problem: with high collision rates the HashMap approaches O(N) lookup
java.util.HashMap in Java 8
Starts by using a List to store colliding values.
Trees are used when a bucket holds more than 8 elements
Tree-based nodes use about twice the memory
Makes the heavy-collision lookup case O(log(N)) rather than O(N)
Relies on keys being Comparable
https://github.com/RichardWarburton/map-visualiser
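A sketch of the kind of key that exercises the heavy-collision case: every instance reports the same hash code but the keys are `Comparable`. `CollidingKey` and `CollisionDemo` are hypothetical names; whether a given bucket is actually stored as a tree is an internal detail of `java.util.HashMap` that this code cannot observe.

```java
import java.util.HashMap;
import java.util.Map;

final class CollidingKey implements Comparable<CollidingKey> {
    final int id;

    CollidingKey(int id) {
        this.id = id;
    }

    @Override
    public int hashCode() {
        return 42; // deliberately terrible: every key collides
    }

    @Override
    public boolean equals(Object other) {
        return other instanceof CollidingKey && ((CollidingKey) other).id == id;
    }

    // Gives a tree-based bucket an ordering to search by.
    @Override
    public int compareTo(CollidingKey other) {
        return Integer.compare(id, other.id);
    }
}

class CollisionDemo {
    static Map<CollidingKey, Integer> build(int entries) {
        Map<CollidingKey, Integer> map = new HashMap<>();
        for (int i = 0; i < entries; i++) {
            map.put(new CollidingKey(i), i);
        }
        return map;
    }

    public static void main(String[] args) {
        Map<CollidingKey, Integer> map = build(1_000);
        System.out.println(map.get(new CollidingKey(777)));
    }
}
```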
So which HashMap is best?
Benchmarking is about building a mental model of the performance tradeoffs
Example Jar-Jar Benchmark
call get() on a single value for a map of size 1
No model of the different factors that affect things!
Benchmarking HashMaps
Load Factor
Nonlinear key access
Successful vs Failed get()
Hash Collisions
Comparable vs Incomparable keys
Different Keys and Values
Cost of hashCode/Equals
Tree Optimization - 60% Collisions
Tree Optimization - 10% Collisions
Probing vs Chaining
Probing Maps usually have lower memory consumption
Small Maps: Probing never has long clusters, can be up to 91% faster.
In large maps with high collision rates, probing scales poorly and can be significantly slower.
Takeaways
There’s no clearcut “winner”.
JDK Implementations try to minimise worst case.
Linear Probing requires a good hashCode() distribution. Often hashmaps “precondition” their hashes.
IdentityHashMap has low memory consumption and is fast, use it!
3rd Party libraries offer probing HashMaps, eg Koloboke & Eclipse-Collections.
Conclusions
Interface Popularity
List 1576210
Set 980763
Map 803171
Queue 62024
Deque 3464
SortedSet 9121
NavigableSet 1735
SortedMap 8677
NavigableMap 1484
Implementation Popularity
ArrayList 225029
LinkedList 26850
ArrayDeque 1086
HashSet 68940
TreeSet 10108
EnumSet 10512
HashMap 137610
TreeMap 7734
WeakHashMap 3473
IdentityHashMap 2443
EnumMap 1904
Evolution can be interesting ...
From Java 1.2 to Java 10?
Any Questions?
www.iteratrlearning.com
● Modern Development with Java 8
● Reactive and Asynchronous Java
● Java Software Development Bootcamp
Further reading
Fast Functional Lists, Hash-Lists, Deques and Variable Length Arrays
https://infoscience.epfl.ch/record/64410/files/techlists.pdf
Smaller Footprint for Java Collections
http://www.lirmm.fr/~ducour/Doc-objets/ECOOP2012/ECOOP/ecoop/356.pdf
Optimizing Hash-Array Mapped Tries for Fast and Lean Immutable JVM Collections
http://michael.steindorfer.name/publications/oopsla15.pdf
RRB-Trees: Efficient Immutable Vectors
https://infoscience.epfl.ch/record/169879/files/RMTrees.pdf
Further reading
Doug Lea’s Analysis of the HashMap implementation tradeoffs
http://www.mail-archive.com/[email protected]/msg02147.html
Java Specialists HashMap article
http://www.javaspecialists.eu/archive/Issue235.html
Sample and Benchmark Code
https://github.com/RichardWarburton/Java-Collections-The-Force-Awakens
Further reading
Debian code search used for popularity
https://codesearch.debian.net/
Small HashMaps
Many HashMaps are small or empty
Lazy Initialization In Java 8+
Specialised Implementations
● Collections.singleton*/Collections.empty*
● Collectors.partitioningBy()
● Specialised Eclipse Collections (eg Doubleton)
Probing Sequence
Linear - Cache Locality
Quadratic - Tree
Clever ideas
Implementing Persistent Collections
Fat node
● Nodes store updated values in an internal list
● Different versions accessible using an order (e.g. timestamp)
Path copying
● Copy path leading to updated node
● Share rest with previous version
Benchmarking HashMaps
Test different Assumptions + Behaviours
Understand costs, don’t just measure them
Be Scientific
Use a framework
Peer Review - Wisdom of crowds
Preconditioning
int h = key.hashCode();
h = h ^ (h >>> 16);
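A worked example of why the spreading step quoted above matters, assuming a power-of-two table that picks buckets by masking the low bits. `HashSpreading`, `spread` and `bucket` are names invented for this sketch.

```java
class HashSpreading {

    // Fold the high 16 bits of the hash code into the low 16 bits.
    static int spread(int hashCode) {
        return hashCode ^ (hashCode >>> 16);
    }

    // A power-of-two table selects a bucket from the low bits only.
    static int bucket(int hash, int tableSize) {
        return hash & (tableSize - 1);
    }

    public static void main(String[] args) {
        int a = 0x10000; // these two hash codes differ only in their high bits
        int b = 0x20000;
        System.out.println(bucket(a, 16) + " " + bucket(b, 16));                 // same bucket
        System.out.println(bucket(spread(a), 16) + " " + bucket(spread(b), 16)); // different buckets
    }
}
```

Without the spreading step, hash codes that differ only above the mask would always collide in small tables.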
CopyOnWrite
public boolean add(E e) {
    final ReentrantLock lock = this.lock;
    lock.lock();
    try {
        Object[] elements = getArray();
        int len = elements.length;
        Object[] newElements = Arrays.copyOf(elements, len + 1);
        newElements[len] = e;
        setArray(newElements);
        return true;
    } finally {
        lock.unlock();
    }
}
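Because every write installs a fresh array, an iterator keeps reading the array it started with. A small sketch of that behaviour using `java.util.concurrent.CopyOnWriteArrayList`; the `SnapshotIteration` class and the sample names are illustrative.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

class SnapshotIteration {

    // Returns {elements visited, final list size}.
    static int[] iterateAndAppend() {
        List<String> jedis = new CopyOnWriteArrayList<>(new String[] {"Luke", "Yoda"});
        int visited = 0;
        for (String jedi : jedis) {
            // Appending mid-iteration is allowed: the loop reads the old
            // array, so no ConcurrentModificationException is thrown.
            jedis.add(jedi + " (hologram)");
            visited++;
        }
        return new int[] {visited, jedis.size()};
    }

    public static void main(String[] args) {
        int[] result = iterateAndAppend();
        System.out.println("visited=" + result[0] + ", size=" + result[1]);
    }
}
```

The price is a full array copy per write, so this fits read-mostly collections best.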
Persistent Array (Bitmapped Vector Trie)
● Uses bit pattern (representing index number) for efficient arithmetic / lookup of elements
● A branching factor of 32 and a depth of 5 can store about 33 million elements (32^5 = 33,554,432) and requires 5 lookups to find an element: “practically constant”
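The index arithmetic can be sketched as follows, assuming a read-only trie made of nested 32-slot `Object[]` arrays. `VectorTrieIndex` and its methods are hypothetical and leave out everything a real persistent vector needs for updates.

```java
class VectorTrieIndex {
    static final int BITS = 5;          // 2^5 = 32-way branching
    static final int WIDTH = 1 << BITS; // 32 slots per node
    static final int MASK = WIDTH - 1;  // low five bits set

    // Slot to follow at a given level; level 0 is the leaf array.
    static int slotAt(int index, int level) {
        return (index >>> (level * BITS)) & MASK;
    }

    // Walk from the root down to the leaf, consuming five index bits per level.
    static Object get(Object[] root, int depth, int index) {
        Object[] node = root;
        for (int level = depth - 1; level > 0; level--) {
            node = (Object[]) node[slotAt(index, level)];
        }
        return node[slotAt(index, 0)];
    }

    // Number of elements a trie of the given depth can address: 32^depth.
    static long capacity(int depth) {
        return 1L << (BITS * depth);
    }

    public static void main(String[] args) {
        Object[] leaf = new Object[WIDTH];
        leaf[3] = "R2-D2";
        Object[] root = new Object[WIDTH];
        root[2] = leaf;
        System.out.println(get(root, 2, 2 * WIDTH + 3)); // index 67
        System.out.println(capacity(5));
    }
}
```

Each level costs one shift, one mask and one array read, which is where the "5 lookups" figure comes from.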