data structures and other

Upload: igor-petruk

Post on 03-Apr-2018

    Data Structures and beyond

    Igor Petruk

    EPAM Systems.

    London, 2013

    Igor Petruk Data Structures and beyond

    1 Data structures
        Goals
        What is a data structure?

    2 Examples

    Goals of the presentation

        Find data structures you like (maybe one reminds you of a part of your system)
        Make a note of their names
        Become aware of the availability of the trickiest data structures
        Discover some cool ways to use them
        Get inspired to practice data structure driven development

    Not a goal of the presentation

        Trying to remember what I say

    What is a data structure?

    A data structure is a particular way of storing and organizing data in a computer so that it can be used efficiently.

    Built of

        Data
        Behaviour

    They can

        Be hard to imagine
        Be complex to understand
        Solve your problem in one go

    What kinds are out there?

    Some examples (from Java)

        HashMap
        LinkedList
        Array
        TreeSet

    Anything in common? Eager, finite, mutable

    They can be

        Eager vs Lazy
        Mutable vs Immutable
        Persistent
        Finite vs Infinite
        Computed
        Probabilistic
        Optimized for a specific environment
            On disk

    MultiSet

    What is it

        A set
        Each value contains a counter

    Usage

        Statistics
            Put values in
            Get element occurrence count
        Save memory
            Still using a set
        Uniqueness sanity check
            Put values in
            Filter entries with count > 1
            Collect your errors
        Maintain leaders
            Put values in real time
            Use max-first iterator
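
The usages above can be sketched with a plain map of counters. This is a minimal illustration, not the implementation the talk refers to: the class name `CountingMultiSet` and its methods are made up for this example. Guava ships a ready-made `Multiset`; the sketch avoids the dependency.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical minimal multiset: a map from each element to its counter.
class CountingMultiSet<T> {
    private final Map<T, Integer> counts = new HashMap<>();

    // Put a value in; a repeated value only increments its counter.
    void add(T value) {
        counts.merge(value, 1, Integer::sum);
    }

    // Occurrence count of an element; 0 if it was never added.
    int count(T value) {
        return counts.getOrDefault(value, 0);
    }

    // Uniqueness sanity check: all elements with count > 1.
    List<T> duplicates() {
        List<T> result = new ArrayList<>();
        for (Map.Entry<T, Integer> entry : counts.entrySet()) {
            if (entry.getValue() > 1) {
                result.add(entry.getKey());
            }
        }
        return result;
    }

    // Leaders: distinct elements ordered by descending count (max first).
    List<T> leaders() {
        List<T> result = new ArrayList<>(counts.keySet());
        result.sort((a, b) -> Integer.compare(counts.get(b), counts.get(a)));
        return result;
    }

    public static void main(String[] args) {
        CountingMultiSet<String> words = new CountingMultiSet<>();
        for (String word : new String[] {"a", "b", "a", "c", "a", "b"}) {
            words.add(word);
        }
        System.out.println(words.count("a"));  // 3
        System.out.println(words.leaders());   // [a, b, c]
    }
}
```

Sorting on demand keeps the sketch short; a structure that maintains leaders in real time would keep the ordering up to date on every insert instead.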

    Trie - Prefix tree

    Branch keys can be anything

        String
            Chars
        Custom objects
        Bits
            Bit chunks of hash, Hash Array Mapped Trie

    Usage

        As a map
            No collisions with linear search in bucket
            No rebalancing
        Prefix search
            Best support for this operation
        Compression
            If prefixes are shared
        Indexing, Full Text Search
            Suffix Trie
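
A minimal sketch of the character-keyed case, assuming string keys and set semantics only (no values attached). The class and method names are illustrative. Prefix search walks down the prefix and then collects the subtree below it, which is why the trie supports this operation so well.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Minimal sketch of a character-keyed trie (names are illustrative).
class Trie {
    private final Map<Character, Trie> children = new TreeMap<>();
    private boolean terminal; // true if a stored word ends at this node

    void insert(String word) {
        Trie node = this;
        for (char c : word.toCharArray()) {
            node = node.children.computeIfAbsent(c, k -> new Trie());
        }
        node.terminal = true;
    }

    boolean contains(String word) {
        Trie node = find(word);
        return node != null && node.terminal;
    }

    // Prefix search: walk down the prefix, then collect the whole subtree.
    List<String> withPrefix(String prefix) {
        List<String> out = new ArrayList<>();
        Trie node = find(prefix);
        if (node != null) {
            node.collect(new StringBuilder(prefix), out);
        }
        return out;
    }

    // Follow the path character by character; null if it does not exist.
    private Trie find(String path) {
        Trie node = this;
        for (char c : path.toCharArray()) {
            node = node.children.get(c);
            if (node == null) {
                return null;
            }
        }
        return node;
    }

    // Depth-first traversal in key order, so results come out sorted.
    private void collect(StringBuilder path, List<String> out) {
        if (terminal) {
            out.add(path.toString());
        }
        for (Map.Entry<Character, Trie> e : children.entrySet()) {
            path.append(e.getKey());
            e.getValue().collect(path, out);
            path.setLength(path.length() - 1);
        }
    }

    public static void main(String[] args) {
        Trie trie = new Trie();
        for (String w : new String[] {"car", "cat", "cart", "dog"}) {
            trie.insert(w);
        }
        System.out.println(trie.withPrefix("ca")); // [car, cart, cat]
    }
}
```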

    Splay tree

    Splay tree
        Rebalances on access: the most recently accessed element moves to the root

    Usage
        LRU Cache

    Why not talk about caches in general?

    Cache

    Cache is not a BIG THING

        Caches that are as lightweight and simple as a map are available (Guava)

    Data

        Simple trees, splay trees
        Databases
        Files
        Off-heap

    Behaviour

        Population strategy
            Manual
            Loading
        Write strategy
            Read-only
            Write through
            Write behind
        Eviction strategy
            LRU
            MRU
            LFU
            Random
            Garbage collector driven
            Multi Queue
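
As one concrete case, LRU eviction with manual population can be sketched on top of the JDK's `LinkedHashMap` in access order. This is a swapped-in technique for illustration, not a splay tree and not the Guava cache; the class name `LruCache` is an assumption.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch of LRU eviction on top of the JDK's LinkedHashMap (access order).
class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int capacity;

    LruCache(int capacity) {
        super(16, 0.75f, true); // true = keep entries in access order
        this.capacity = capacity;
    }

    // Called by LinkedHashMap after each insertion; returning true evicts
    // the least recently used entry.
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > capacity;
    }

    public static void main(String[] args) {
        LruCache<String, Integer> cache = new LruCache<>(2);
        cache.put("a", 1);
        cache.put("b", 2);
        cache.get("a");     // "a" becomes the most recently used entry
        cache.put("c", 3);  // capacity exceeded: "b" is evicted
        System.out.println(cache.keySet()); // [a, c]
    }
}
```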

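    Complex?

    // Guava cache (com.google.common.cache); Key, Graph, AnyException and
    // createExpensiveGraph are placeholders.
    LoadingCache<Key, Graph> graphs = CacheBuilder.newBuilder()
        .maximumSize(1000)
        .expireAfterWrite(10, TimeUnit.MINUTES)
        .build(new CacheLoader<Key, Graph>() {
            public Graph load(Key key) throws AnyException {
                return createExpensiveGraph(key);
            }
        });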

    Usage other than the obvious

        Protecting against eventual consistency issues
        Batching
        Memoization
        Dynamic programming
        Concurrent synchronization
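
Memoization is the easiest of these to show in a few lines. The sketch below uses a plain `HashMap` as the cache; the class name `MemoFib` and the `computations` counter are assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Memoization sketch: a plain map used as a cache of already computed results.
class MemoFib {
    private final Map<Integer, Long> cache = new HashMap<>();
    int computations = 0; // counts real computations, for illustration only

    long fib(int n) {
        if (n < 2) {
            return n;
        }
        Long cached = cache.get(n);
        if (cached != null) {
            return cached; // cache hit: no recomputation
        }
        computations++;
        long value = fib(n - 1) + fib(n - 2);
        cache.put(n, value);
        return value;
    }

    public static void main(String[] args) {
        MemoFib memo = new MemoFib();
        System.out.println(memo.fib(50));      // 12586269025
        System.out.println(memo.computations); // 49, one per n in 2..50
    }
}
```

Without the cache the same call would make billions of recursive calls; with it, each value is computed once.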

    View

    Is this a data structure?

    YES/NO

    We have the data (somewhere)

    We have the behaviour

    Why copy?

    Usage

        See a list as a map
        Process only the needed part of the data
        Perform multiple operations together
            Each operation produces a wrapper
            Final operation - copy
            Does anyone know how to concat two immutable lists with O(1) complexity?
        Combine data structures
        Use listeners to build a change-aware view
            Indexing
            Live display
            Reactive programming
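
One possible answer to the concatenation question is a view: a wrapper that holds references to both lists and delegates reads to them, so creating it copies nothing. The sketch below assumes the underlying lists are not modified; the class name `ConcatView` is made up for this example.

```java
import java.util.AbstractList;
import java.util.Arrays;
import java.util.List;

// A read-only view over two lists: "concatenation" without copying.
class ConcatView<T> extends AbstractList<T> {
    private final List<T> first;
    private final List<T> second;

    // O(1): only two references are stored.
    ConcatView(List<T> first, List<T> second) {
        this.first = first;
        this.second = second;
    }

    // Delegate to whichever underlying list holds the index.
    @Override
    public T get(int index) {
        return index < first.size()
                ? first.get(index)
                : second.get(index - first.size());
    }

    @Override
    public int size() {
        return first.size() + second.size();
    }

    public static void main(String[] args) {
        List<Integer> joined = new ConcatView<>(Arrays.asList(1, 2), Arrays.asList(3, 4));
        System.out.println(joined);        // [1, 2, 3, 4]
        System.out.println(joined.get(2)); // 3
    }
}
```

Because `AbstractList` leaves the mutating methods unsupported, the view is read-only. Stacking many such views makes each read slower, which is where a final copy becomes worthwhile.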

    Persistent data structures

        Any mutation produces a new version of the structure
        The previous version remains unless deleted (or garbage collected)
        Usually designed to reuse most of the previous version

    Examples

        Immutable linked list (cons list)
        Immutable trees
        Immutable functional data structures are usually persistent

    Usage

        With a single writer they are lock free
        Any instance of a PDS is already a snapshot
        Put a journal in front of the writer, store the PDS to disk from time to time, and there you go: a durable, high-performance in-memory database with O(1) snapshot support
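
The cons list is the smallest example of version reuse. In the sketch below (illustrative names, `null` used as the end-of-list marker for brevity), prepending creates one new node and shares the entire previous version as its tail.

```java
// Sketch of an immutable singly linked (cons) list; names are illustrative.
final class ConsList<T> {
    final T head;
    final ConsList<T> tail; // null marks the end of the list

    private ConsList(T head, ConsList<T> tail) {
        this.head = head;
        this.tail = tail;
    }

    static <T> ConsList<T> of(T value) {
        return new ConsList<>(value, null);
    }

    // "Mutation" returns a new version that reuses this one as its tail.
    ConsList<T> prepend(T value) {
        return new ConsList<>(value, this);
    }

    int size() {
        int n = 0;
        for (ConsList<T> node = this; node != null; node = node.tail) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        ConsList<String> v1 = ConsList.of("b").prepend("a"); // a -> b
        ConsList<String> v2 = v1.prepend("x");               // x -> a -> b
        System.out.println(v1.size());     // 2: the old version is untouched
        System.out.println(v2.tail == v1); // true: the new version shares it
    }
}
```

Any reference to `v1` keeps working as a snapshot, no matter how many newer versions are built on top of it.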

    Bloom filter

        A probabilistic set
        Can represent a set of elements from an infinite universe
        Strict test that the set does NOT contain an element
        Probabilistic test that the set contains an element

    Internals

        An array of m bits
        k different hash functions that produce values from 0 to m-1
        Extendable to a counting Bloom filter, which supports deletion

    Notable usages

        Apache Cassandra and Google BigTable
            Test if the record can be on the disk
            If it can - fetch it from the disk
            If not, then we don't have to trigger a disk operation
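
The internals above fit in a short sketch. The way the k index functions are derived here (combining `String.hashCode()` with a 32-bit FNV-1a hash) is an assumption chosen for brevity, not what Cassandra or BigTable do; the class and method names are also illustrative.

```java
import java.nio.charset.StandardCharsets;
import java.util.BitSet;

// Bloom filter sketch: m bits and k index functions derived from two hashes.
class BloomFilter {
    private final BitSet bits;
    private final int m; // number of bits
    private final int k; // number of hash functions

    BloomFilter(int m, int k) {
        this.bits = new BitSet(m);
        this.m = m;
        this.k = k;
    }

    // Set the k bits selected by the item.
    void add(String item) {
        for (int i = 0; i < k; i++) {
            bits.set(index(item, i));
        }
    }

    // false: definitely absent. true: possibly present (false positives happen).
    boolean mightContain(String item) {
        for (int i = 0; i < k; i++) {
            if (!bits.get(index(item, i))) {
                return false;
            }
        }
        return true;
    }

    // i-th index in [0, m-1], built as h1 + i * h2 (an assumed, simple scheme).
    private int index(String item, int i) {
        int h1 = item.hashCode();
        int h2 = fnv1a(item);
        return Math.floorMod(h1 + i * h2, m);
    }

    // 32-bit FNV-1a, used here only as a second hash.
    private static int fnv1a(String s) {
        int hash = 0x811c9dc5;
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            hash ^= (b & 0xff);
            hash *= 16777619;
        }
        return hash;
    }

    public static void main(String[] args) {
        BloomFilter filter = new BloomFilter(1024, 3);
        filter.add("row-42");
        System.out.println(filter.mightContain("row-42")); // true
        // Usually false; a true here would be a false positive.
        System.out.println(filter.mightContain("row-7"));
    }
}
```

A `false` answer is the one that saves the disk operation; a `true` answer only means the record is worth looking up.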

    Stream

        List
        Evaluator in tail
        Can be infinite

    Usage

        Good abstraction for an infinite computation that can be consumed by list processing routines
        If the internal list is a linked list, then the GC can collect already processed data
        Change the definition of int fib(int) to Stream fib()
        You can still represent input, but via a generic Stream
        Convert one stream to another via wrappers
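
A minimal sketch of such a stream: an evaluated head plus a `Supplier` that computes the tail on first access. The class `LazyStream` and its methods are made up for this example and are unrelated to `java.util.stream.Stream`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;
import java.util.function.Supplier;

// Lazy stream sketch: an evaluated head plus a tail computed on demand.
final class LazyStream<T> {
    final T head;
    private Supplier<LazyStream<T>> evaluator; // the "evaluator in tail"
    private LazyStream<T> tail;                // cached once evaluated

    LazyStream(T head, Supplier<LazyStream<T>> evaluator) {
        this.head = head;
        this.evaluator = evaluator;
    }

    // Evaluate the tail at most once, then reuse the cached result.
    LazyStream<T> tail() {
        if (evaluator != null) {
            tail = evaluator.get();
            evaluator = null;
        }
        return tail;
    }

    // Wrapper: converts this stream to another one without evaluating it.
    <R> LazyStream<R> map(Function<T, R> f) {
        return new LazyStream<R>(f.apply(head), () -> {
            LazyStream<T> rest = tail();
            return rest == null ? null : rest.map(f);
        });
    }

    // Copy the first n elements into an ordinary list.
    List<T> take(int n) {
        List<T> out = new ArrayList<>();
        LazyStream<T> s = this;
        for (int i = 0; i < n && s != null; i++) {
            out.add(s.head);
            if (i + 1 < n) {
                s = s.tail();
            }
        }
        return out;
    }

    // Stream fib() instead of int fib(int): the infinite Fibonacci sequence.
    static LazyStream<Long> fib() {
        return fibFrom(0L, 1L);
    }

    private static LazyStream<Long> fibFrom(long a, long b) {
        return new LazyStream<Long>(a, () -> fibFrom(b, a + b));
    }

    public static void main(String[] args) {
        System.out.println(fib().take(10));                // [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
        System.out.println(fib().map(x -> x * 2).take(5)); // [0, 2, 2, 4, 6]
    }
}
```

The consumer decides how much of the infinite computation to run; `take` is the only place where elements are actually forced.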

    Option

        One or zero element collection
        Null is no more!
            JK, easy conversion from null to Option and back
        Forces you to unbox the value before usage

    Usage

        Do you have an element or not?
        Is the computation successful or not?
        Is the value populated by the user or not?
        Null safe operations within Option
        Available in Guava as Optional
        Monad
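
The slide mentions Guava's `Optional`; the sketch below uses `java.util.Optional` from Java 8 instead, so it runs without dependencies. The class `OptionDemo`, the user "alice" and her address are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

class OptionDemo {
    // Hypothetical data, used only for this example.
    private static final Map<String, String> EMAILS = new HashMap<>();
    static {
        EMAILS.put("alice", "alice@example.com");
    }

    // Null to Option: a missing user becomes Optional.empty().
    static Optional<String> findEmail(String user) {
        return Optional.ofNullable(EMAILS.get(user));
    }

    // Null safe operation within the Option, then an explicit unboxing step.
    static String domainOf(String user) {
        return findEmail(user)
                .map(email -> email.substring(email.indexOf('@') + 1))
                .orElse("unknown");
    }

    public static void main(String[] args) {
        System.out.println(domainOf("alice"));            // example.com
        System.out.println(domainOf("bob"));              // unknown
        System.out.println(findEmail("bob").orElse(null)); // null: Option back to null
    }
}
```

The `map` step is skipped for an empty Option, so no null check is needed in the middle of the chain.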
