why json in postgresql is awesome

Upload: teo-tokis

Post on 02-Jun-2018

214 views

Category:

Documents


0 download

TRANSCRIPT

  • 8/10/2019 Why JSON in PostgreSQL is Awesome

    1/7

    21/10/2014 Why JSON in PostgreSQL is Awesome

    https://functionwhatwhat.com/json-in-postgresql/ 1/7

    HOME SUBSCRIBE

    Why JSON inPostgreSQL isAwesome10 NOVEMBER 2013on PostgreSQL, JSON

    Recently a commenter on Hacker News questioned the

    usefulness of the JSON data type in PostgreSQL. At the

    time I posted a brief response, but I have decided to take

    the time to write up a more detailed explanation here as

    I believe the JSON data type provides a significant

    benefit to developers.

    Too long; didnt read:The JSON data type is useful for

    storing multi-level object graphs. It provides better

    performance and the code itself is easier (and therefore

    cheaper) to write and maintain. The developer will be

    happier and the development will be cheaper.

    Multi-Level Object Graph: Names

    To demonstrate the difficulties of handling multi-level

    object graphs in relational databases, consider names as an

    example.

    Handling names in computer systems correctly is a

    https://news.ycombinator.com/item?id=6573119https://news.ycombinator.com/item?id=6573119https://news.ycombinator.com/item?id=6573119https://functionwhatwhat.com/tag/json/https://functionwhatwhat.com/tag/postgresql/https://functionwhatwhat.com/rss/https://functionwhatwhat.com/
  • 8/10/2019 Why JSON in PostgreSQL is Awesome

    2/7

    21/10/2014 Why JSON in PostgreSQL is Awesome

    https://functionwhatwhat.com/json-in-postgresql/ 2/7

    difficult task. Programmers tend to believe falsehoods

    about names. They often assume that everybody has a first

    name and a last name (and in that order), but, realistically,

    one cannot make such brosumptions about peoples

    names. Different cultures have different namingconventions. Over the years I have come to the conclusion

    that the best way to handle names while preserving

    practicality is to:

    1. store the full name,

    2. keep an ordered list of name typename pairs.

    For example:

    {

    "fullName": "Charles John Huffam Dickens",

    "names":

    [

    { "type": "firstName", "value": "Charles" },

    { "type": "middleName", "value": "John" },

    { "type": "middleName", "value": "Huffam" },

    { "type": "lastName", "value": "Dickens" }

    ]

    }

    Book Store

    Imagine an application that manages books, such as that

    used by a library or a book store. Books have authors,authors have names and these names have name parts

    (e.g. first name, last name, etc.). The application must

    http://www.kalzumeus.com/2010/06/17/falsehoods-programmers-believe-about-names/
  • 8/10/2019 Why JSON in PostgreSQL is Awesome

    3/7

    21/10/2014 Why JSON in PostgreSQL is Awesome

    https://functionwhatwhat.com/json-in-postgresql/ 3/7

    keep track all of this data. If one follows the requirements

    of the third normal form, this will result in database tables

    such as the ones below.

    Note: for the sake of simplicity, well ignore the fact that theexample below wouldnt meet the requirements of the third

    normal form if one person could author multiple titles.

    Book Table

    id title

    1 A Christmas Carol

    Author Table

    id book_id full_name

    101 1 Charles John Huffam Dickens

    Author Name Part Table

    id author_id type value

    201 101 first_name Charles

    202 101 middle_name John

    203 101 middle_name Huffam

    204 101 last_name Dickens

    To fetch a book from the database there are 3 options. An

    ORM will probably attempt option 2.

    1. Everything is joined in a single query.This will yield a

    row for every name part within the name of each

    author. The columns derived from the book table will

    http://en.wikipedia.org/wiki/Third_normal_form
  • 8/10/2019 Why JSON in PostgreSQL is Awesome

    4/7

    21/10/2014 Why JSON in PostgreSQL is Awesome

    https://functionwhatwhat.com/json-in-postgresql/ 4/7

    have the same data repeated in every row. The book

    data structure must be assembled from multiple rows.

    Things will get very convoluted if one attempts to

    retrieve multiple books at the same time as every row

    will need to be carefully matched to the correct author

    and book.

    2. A separate query is run for each table.First the book is

    returned, then the authors, then the name parts. To

    retrieve multiple books, either

    the author and name part queries must be repeated

    for each book (and the name part query for each

    author in each book) OR

    the author and name part queries must be

    constructed in a manner that retrieves all of the

    authors of all the books (and all the name parts ofall the authors of the books) in a single result at

    which point the matching exercise described in the

    previous section must be executed.

    The first variant results in a separate query for each

    piece of data, which will have a significantly

    detrimental impact on performance. While the secondvariant will be much faster, it is still not a very elegant

    solution as it requires the database to repeat the

    (possibly expensive) filter that was used to select the

    books 3 times: once for the books, once for the authors

    and once for the name parts themselves.

    3. There is no third option!Everybody loses. Always. This

    is why objectrelational mapping is hard.

  • 8/10/2019 Why JSON in PostgreSQL is Awesome

    5/7

    21/10/2014 Why JSON in PostgreSQL is Awesome

    https://functionwhatwhat.com/json-in-postgresql/ 5/7

    Serialisation to JSON

    Fortunately, it is possible to serialise the entirety of this

    madness into a single JSON string (or XML or something

    similar but this post is about JSON). The serialised object

    can then be stored in a text column. When reading from

    the database, the text can be deserialised back to an object

    graph. While this was possible with earlier versions of the

    software, PostgreSQL 9.2 and 9.3 have introduced a

    number of features that make working with JSON data a

    very pleasant experience:

    The json data type is basically the same as text in

    terms of behaviour and storage, but the database

    checks that the value is valid JSON. This was introduced

    in PostgreSQL 9.2.

    WithJSON functions and operatorsintroduced in

    PostgreSQL 9.3 one can select authors->1->'fullName'

    from book to get the full name of the first author of every

    book. It is even possible to create indexes that traverse

    JSON values with indexes on expressions.

    Without JSON columns, functions and operators, such

    queries become complex and possibly inefficient. With

    JSON, only a single table is required:

    Book Table

    id

    title authors

    1 A ChristmasCarol [

    {

    "fullName": "Charles John Huffam

    http://www.postgresql.org/docs/9.3/static/functions-json.htmlhttp://www.postgresql.org/docs/9.3/static/functions-json.html
  • 8/10/2019 Why JSON in PostgreSQL is Awesome

    6/7

    21/10/2014 Why JSON in PostgreSQL is Awesome

    https://functionwhatwhat.com/json-in-postgresql/ 6/7

    Dickens",

    "names":

    [

    { "type": "firstName", "value":

    "Charles" },

    { "type": "middleName", "value":

    "John" },

    { "type": "middleName", "value":"Huffam" },

    { "type": "lastName", "value":

    "Dickens" }

    ]

    }

    ]

    Less code = faster queries. Its hard not to be excited about

    this.

    The Downsides

    No loops in the object graph.Circular references are notpermitted in JSON. An author cannot belong to two books

    and authors cannot reference each other. The first problem

    can be tackled by repeating an authors data for each book

    he/she has authored. This means, however, that the

    database will be larger and updating (renaming) authors a

    very expensive operation. It is up to the programmer todecide if this is compatible with the requirements of the

    application.

    No referential integrity.While it is possible to generate

    pseudo unique keys by creating a unique index on a JSON

    expression, foreign keys cannot be created on (orreference) JSON expressions. For a field to reference

    another column or be referenced by another column, it

  • 8/10/2019 Why JSON in PostgreSQL is Awesome

    7/7

    21/10/2014 Why JSON in PostgreSQL is Awesome

    https://functionwhatwhat.com/json-in-postgresql/ 7/7

    needs to be promoted into an actual column or perhaps

    even a table.

    Not portable.This is a non-standard extension to SQL

    specific to PostgreSQL only. It cannot be used with otherdatabases even if they have some JSON support as the

    actual implementation is probably very different. This

    means that the application will be locked in to PostgreSQL.

    In my view, there are worse things than being locked in to

    an open source technology. However again, it is up to the

    judgement of the stakeholders to determine whether thismeets the needs of the business.

    Follow me on Twitter: @tamasczinege

    Tamas Czinege

    Founder of 44/55 Software Limited, London.

    London, UK

    https://twitter.com/tamasczinege

    Share this post

    Function What What 2014 Proudly published with Ghost

    https://functionwhatwhat.com/author/tamas-czinege/https://ghost.org/https://functionwhatwhat.com/https://plus.google.com/share?url=https://functionwhatwhat.com/json-in-postgresql/https://www.facebook.com/sharer/sharer.php?u=https://functionwhatwhat.com/json-in-postgresql/https://twitter.com/share?text=Why%20JSON%20in%20PostgreSQL%20is%20Awesome&url=https://functionwhatwhat.com/json-in-postgresql/https://functionwhatwhat.com/author/tamas-czinege/https://twitter.com/tamasczinegehttps://twitter.com/tamasczinege