14 numbers every developer should know

Upload: ebelibala

Post on 04-Jun-2018

220 views

Category:

Documents


0 download

TRANSCRIPT

  • 8/13/2019 14 Numbers Every Developer Should Know

    1/3

    28/13 14 numbers every developer should know

    ww.infoarena.ro/blog/numbers-everyone-should-know

    123

    Like

    62

    TweetTweet

    97

    14 numbers every developer should know

    Cosmin Negruseri

    26 martie 2013

    Jeff Dean , a famous Google engineer, popularized a list of latency numbers everyone should

    know . The list is a great resource for designin g large scale infrastructure systems.

    Algorithms and their complexity often occur in critical parts of computer systems, but I find that

    few engineers have a good understanding of how a O(n!) algorithm compares to a O(n5) one.

    In the coding contest world, competitors think about these tradeoffs all the time. No wonder,

    there's a set of numbers every algorithm designer should know.

    The table below shows the limits that can be reached in a few seconds by algorithms of different

    complexities, n being the input size. I've added some algorithms and data structure examples for

    each complexity class.

    maximum n complexity algorithms data structures

    1,000,000,000

    and higherlog n, sqrt n

    binary search, ternary search, fast

    exponentiation, euclid algorithm

    Despre Blog

    Ultimele insemnari 14 numbers every developer

    should know Interactive problems shortlist

    Code Pandas

    Combinatorics shortlist

    Algoritmiada 2013

    Infoarena i-a ales noua

    conducere

    Haideti sa imbunatatim Infoarena

    impreuna!

    Rolling hash, Rabin Karp,

    palindromes, rsync and others

    Count (distinct) problem

    Infoarena lovete din nou!

    Categorii algoritmiada( 7)

    algoritmica (1)

    concursuri(11)

    doi la suta( 1)

    Evenimente( 7)

    facebook (1)

    Features( 2)

    girl camp( 2)

    Girls (2)

    Girls camp( 2)

    girls programming camp( 3)

    google (1)

    infoarena (6)

    internships( 1)

    interviu(12) life the universe and everything

    (7)

    organizare (3)

    potw(31)

    preoni 2008( 3)

    probleme (9)

    Selectie Girls Programming

    Camp( 1)

    stiri(76)

    video( 6)

    Blogroll Vivi's blog

    WebDiarios de Motocicleta Catalin Fr ancu

    Florin Manea

    Stevey's Blog Rants

    Joel on Software

    Developing for developers

  • 8/13/2019 14 Numbers Every Developer Should Know

    2/3

    28/13 14 numbers every developer should know

    ww.infoarena.ro/blog/numbers-everyone-should-know

    10,000,000n, n log log

    n, n log* n

    set intersection, Eratosthenes sieve, radix

    sort, KMP, topological sort, Euler tour,

    strongly connected components, 2sat

    disjoint sets, tries,

    hash_map, rolling hash

    deque

    1,000 ,000 n lo g nsorting, divide and conquer, sweep line,

    Kruskal, Dijkstra

    segment trees, range

    trees, heaps, treaps,

    binary indexed trees,

    suffix arrays

    100,000 n log2n divide and conquer 2d range trees

    50,000n1.585, n

    sqrt nKaratsuba, square root trick two level tree

    1000 - 10,000 n2largest empty rectangle, Dijkstra, Prim (on

    dense graphs)

    300-500 n3

    all pairs shortest paths, largest sum

    submatrix, naive matrix multiplication,

    matrix chain multiplication, gaussian

    elimination, network flow

    30-50 n4, n5, n6

    25 - 40 3n/2, 2n/2 meet in the middlehash tables (for set

    intersection)

    15 - 24 2nsubset enumeration, brute force, dynamic

    programming with exponential states

    15 - 20 n22ndynamic programming with exponential

    statesbitsets, hash_map

    13-17 3ndynamic programming with exponential

    states

    hash_map (to store the

    states)

    11 n!brute force, backtracking,

    next_permutation

    8 nn brute force, cartesian product

    These numbers aren't very precise, they assume in memory operations and some varying constant

    factors, but they do give a good starting point in your search for a solution that fits your problem and

    your data size.

    Let's go through an example.

    Suppose you work for a GPS company and your project is to improve their directions feature. In

    school you've learned about using Dijkstra's algorithm to find the shortest path between two nodes

    in a graph. Knowing these numbers you will understand that it will take seconds to process a graph

    with millio ns of ed ges given that Di jkstra i mplementatio ns h ave m l og n time co mplexity (where m is

    the number of edges and n the number of nodes).

    Now you face a few questions:

    How fast do you want your code to be? seconds? hundreds of milliseconds?

  • 8/13/2019 14 Numbers Every Developer Should Know

    3/3

    28/13 14 numbers every developer should know

    ww.infoarena.ro/blog/numbers-everyone-should-know

    2004-2013 Asociatia infoarena

    A response on the web feels fast if it takes less then 500 milliseconds. So let's pick half a second.

    How big is the graph? Do you want to solve the problem for a city, a coutry or even a continent?

    Each is about a magnitude larger than the other and will be solved by different approaches. Let's

    say we want to solve the problem for the whole of Europe.

    Here are sizes for a few input sets:

    input Europe USA/CAN USA (Tiger)

    #nodes 18 029 721 18 741 705 24 278 285

    #directed edges 42 199 587 47 244 849 58 213 192

    #road categories 13 13 4

    Since we chose half a second to be our execution time and the size of our problem to be about 40

    million edges it's clear from our table that m log n is too slow. So pure Dijkstra won't do. We need to

    look at how other algorithms like A star search or one based on Highway hierarchies behave for

    this problem.

    Categorii:

    Cu exceptia cazurilor in care se specifica altfel, continutul site-ului infoarena

    este publicat sub licenta Creative Commons Attribution-NonCommercial 2.5.