postgresql and json - thebuild.com · greetings! • christophe pettus • consultant with...
TRANSCRIPT
![Page 1: PostgreSQL and JSON - thebuild.com · Greetings! • Christophe Pettus • Consultant with PostgreSQL Experts, Inc. • thebuild.com — personal blog. • pgexperts.com — company](https://reader031.vdocuments.us/reader031/viewer/2022022710/5bf9fecd09d3f2712b8b915d/html5/thumbnails/1.jpg)
PostgreSQL and JSON: 2015
Christophe PettusPostgreSQL Experts, Inc.
FOSDEM 2015
![Page 2: PostgreSQL and JSON - thebuild.com · Greetings! • Christophe Pettus • Consultant with PostgreSQL Experts, Inc. • thebuild.com — personal blog. • pgexperts.com — company](https://reader031.vdocuments.us/reader031/viewer/2022022710/5bf9fecd09d3f2712b8b915d/html5/thumbnails/2.jpg)
Greetings!
• Christophe Pettus
• Consultant with PostgreSQL Experts, Inc.
• thebuild.com — personal blog.
• pgexperts.com — company website.
• Twitter @Xof
![Page 3: PostgreSQL and JSON - thebuild.com · Greetings! • Christophe Pettus • Consultant with PostgreSQL Experts, Inc. • thebuild.com — personal blog. • pgexperts.com — company](https://reader031.vdocuments.us/reader031/viewer/2022022710/5bf9fecd09d3f2712b8b915d/html5/thumbnails/3.jpg)
Well, here we go again.
• My third year in a row talking about JSON at FOSDEM.
• A lot has changed in the last year.
• PostgreSQL 9.4 has a ton of new JSON features.
• Let’s take a look.
![Page 4: PostgreSQL and JSON - thebuild.com · Greetings! • Christophe Pettus • Consultant with PostgreSQL Experts, Inc. • thebuild.com — personal blog. • pgexperts.com — company](https://reader031.vdocuments.us/reader031/viewer/2022022710/5bf9fecd09d3f2712b8b915d/html5/thumbnails/4.jpg)
JSON, what is?
• JavaScript Object Notation.
• A text format for serializing nested data structures.
• Based on JavaScript’s declaration syntax.
• Intended to be passed directly into JavaScript’s eval() function (don’t do this!)
![Page 5: PostgreSQL and JSON - thebuild.com · Greetings! • Christophe Pettus • Consultant with PostgreSQL Experts, Inc. • thebuild.com — personal blog. • pgexperts.com — company](https://reader031.vdocuments.us/reader031/viewer/2022022710/5bf9fecd09d3f2712b8b915d/html5/thumbnails/5.jpg)
JSON Primitive Types.
• Strings, always Unicode.
• De facto, always UTF-8 in flight.
• Numbers, integer and float.
• Boolean: true and false.
•null
![Page 6: PostgreSQL and JSON - thebuild.com · Greetings! • Christophe Pettus • Consultant with PostgreSQL Experts, Inc. • thebuild.com — personal blog. • pgexperts.com — company](https://reader031.vdocuments.us/reader031/viewer/2022022710/5bf9fecd09d3f2712b8b915d/html5/thumbnails/6.jpg)
JSON Structured Types.
• Arrays, using [ ].
• Hash / dictionaries / whatever you want to call them (the JSON spec calls them Objects), using { }
• { ‘string’ : value }
• Keys have to be strings; values can be anything.
![Page 7: PostgreSQL and JSON - thebuild.com · Greetings! • Christophe Pettus • Consultant with PostgreSQL Experts, Inc. • thebuild.com — personal blog. • pgexperts.com — company](https://reader031.vdocuments.us/reader031/viewer/2022022710/5bf9fecd09d3f2712b8b915d/html5/thumbnails/7.jpg)
More complex types.
• Everything else is built out of those.
• There’s no type declaration mechanism.
• “Object” is unfortunate terminology.
• There’s no “schema” or similar validation method.
• Everything is delegated to the application.
![Page 8: PostgreSQL and JSON - thebuild.com · Greetings! • Christophe Pettus • Consultant with PostgreSQL Experts, Inc. • thebuild.com — personal blog. • pgexperts.com — company](https://reader031.vdocuments.us/reader031/viewer/2022022710/5bf9fecd09d3f2712b8b915d/html5/thumbnails/8.jpg)
The good…
• It’s super-simple to generate and parse.
• The operational part of the spec is five pages, with diagrams.
• It’s the de facto standard for data interchange in web APIs.
• POST format is still used, but apps that do that are wrong.
![Page 9: PostgreSQL and JSON - thebuild.com · Greetings! • Christophe Pettus • Consultant with PostgreSQL Experts, Inc. • thebuild.com — personal blog. • pgexperts.com — company](https://reader031.vdocuments.us/reader031/viewer/2022022710/5bf9fecd09d3f2712b8b915d/html5/thumbnails/9.jpg)
The bad…
• No higher-level standards.
• How is a datetime represented? I dunno, you figure it out.
• Remember SQL injection attacks? Now we have JSON injection attacks.
• Don’t use eval(). Just. Don’t.
![Page 10: PostgreSQL and JSON - thebuild.com · Greetings! • Christophe Pettus • Consultant with PostgreSQL Experts, Inc. • thebuild.com — personal blog. • pgexperts.com — company](https://reader031.vdocuments.us/reader031/viewer/2022022710/5bf9fecd09d3f2712b8b915d/html5/thumbnails/10.jpg)
And PostgreSQL has JSON!
• It’s a core type.
• Not a contrib/ or extension module.
• Introduced in 9.2.
• Enhanced in 9.3.
• And really enhanced in 9.4.
![Page 11: PostgreSQL and JSON - thebuild.com · Greetings! • Christophe Pettus • Consultant with PostgreSQL Experts, Inc. • thebuild.com — personal blog. • pgexperts.com — company](https://reader031.vdocuments.us/reader031/viewer/2022022710/5bf9fecd09d3f2712b8b915d/html5/thumbnails/11.jpg)
We liked JSON so much…
• … we created two types.
• json
• jsonb
• json is a pure text representation.
• jsonb is a parsed binary representation.
• Each can be casted to the other, of course.
![Page 12: PostgreSQL and JSON - thebuild.com · Greetings! • Christophe Pettus • Consultant with PostgreSQL Experts, Inc. • thebuild.com — personal blog. • pgexperts.com — company](https://reader031.vdocuments.us/reader031/viewer/2022022710/5bf9fecd09d3f2712b8b915d/html5/thumbnails/12.jpg)
json type.
• Stores the actual json text.
• Whitespace included.
• What you get out is what you put in.
• Checked for correctness, but not otherwise processed.
![Page 13: PostgreSQL and JSON - thebuild.com · Greetings! • Christophe Pettus • Consultant with PostgreSQL Experts, Inc. • thebuild.com — personal blog. • pgexperts.com — company](https://reader031.vdocuments.us/reader031/viewer/2022022710/5bf9fecd09d3f2712b8b915d/html5/thumbnails/13.jpg)
Why use json?
• You are storing the json and never processing it.
• You need to support two JSON “features”:
• Order-preserved fields in objects.
• Duplicate keys in objects.
• For some reason, you need the exact JSON text back out.
![Page 14: PostgreSQL and JSON - thebuild.com · Greetings! • Christophe Pettus • Consultant with PostgreSQL Experts, Inc. • thebuild.com — personal blog. • pgexperts.com — company](https://reader031.vdocuments.us/reader031/viewer/2022022710/5bf9fecd09d3f2712b8b915d/html5/thumbnails/14.jpg)
Oh, and…
• jsonb wasn’t introduced until 9.4.
• So, if you are on 9.2-9.3, json is what you’ve got.
• Otherwise, you want to use jsonb.
![Page 15: PostgreSQL and JSON - thebuild.com · Greetings! • Christophe Pettus • Consultant with PostgreSQL Experts, Inc. • thebuild.com — personal blog. • pgexperts.com — company](https://reader031.vdocuments.us/reader031/viewer/2022022710/5bf9fecd09d3f2712b8b915d/html5/thumbnails/15.jpg)
jsonb
• Parsed and encoded on the way in.
• Stored in a compact, parsed format.
• Considerably more operator and function support.
• Has indexing support.
![Page 16: PostgreSQL and JSON - thebuild.com · Greetings! • Christophe Pettus • Consultant with PostgreSQL Experts, Inc. • thebuild.com — personal blog. • pgexperts.com — company](https://reader031.vdocuments.us/reader031/viewer/2022022710/5bf9fecd09d3f2712b8b915d/html5/thumbnails/16.jpg)
They’re just types.
• Fully transactional, can have multiple json/jsonb fields in a single table, etc.
• Uses the TOAST mechanism.
• Can be up to 1GB.
• Can be a NULLable field if you like.
![Page 17: PostgreSQL and JSON - thebuild.com · Greetings! • Christophe Pettus • Consultant with PostgreSQL Experts, Inc. • thebuild.com — personal blog. • pgexperts.com — company](https://reader031.vdocuments.us/reader031/viewer/2022022710/5bf9fecd09d3f2712b8b915d/html5/thumbnails/17.jpg)
Basic Operators(both json and jsonb)• -> gets a JSON array element or object
field, as JSON.
• ->> gets the array element or object field cast to TEXT.
• #> gets the array element or object field at a path.
• #>> … cast to TEXT.
![Page 18: PostgreSQL and JSON - thebuild.com · Greetings! • Christophe Pettus • Consultant with PostgreSQL Experts, Inc. • thebuild.com — personal blog. • pgexperts.com — company](https://reader031.vdocuments.us/reader031/viewer/2022022710/5bf9fecd09d3f2712b8b915d/html5/thumbnails/18.jpg)
jsonb only!
• @> — Does the left-hand value contain the right-hand value?
• <@ — Does the right-hand value contain the left hand value?
![Page 19: PostgreSQL and JSON - thebuild.com · Greetings! • Christophe Pettus • Consultant with PostgreSQL Experts, Inc. • thebuild.com — personal blog. • pgexperts.com — company](https://reader031.vdocuments.us/reader031/viewer/2022022710/5bf9fecd09d3f2712b8b915d/html5/thumbnails/19.jpg)
Containment
• Containment work at the top level of the json object only, and on full JSON structures.
• It does not apply to individual keys.
• It does not apply to nested elements.
![Page 20: PostgreSQL and JSON - thebuild.com · Greetings! • Christophe Pettus • Consultant with PostgreSQL Experts, Inc. • thebuild.com — personal blog. • pgexperts.com — company](https://reader031.vdocuments.us/reader031/viewer/2022022710/5bf9fecd09d3f2712b8b915d/html5/thumbnails/20.jpg)
@>postgres=# select '{"a": 1, "b": 2}'::jsonb @> '{"a": 1}'::jsonb; ?column? ---------- t(1 row)
postgres=# select '[1, 2, 3]'::jsonb @> '[1, 3]'::jsonb; ?column? ---------- t(1 row)
postgres=# select '{"a": {"b": 7, "c": 8}}'::jsonb @> '{"a": {"c": 8}}'::jsonb; ?column? ---------- t(1 row)
![Page 21: PostgreSQL and JSON - thebuild.com · Greetings! • Christophe Pettus • Consultant with PostgreSQL Experts, Inc. • thebuild.com — personal blog. • pgexperts.com — company](https://reader031.vdocuments.us/reader031/viewer/2022022710/5bf9fecd09d3f2712b8b915d/html5/thumbnails/21.jpg)
but.
postgres=# select '{"a": {"b": 7}}'::jsonb @> '{"b": 7}'::jsonb; ?column? ---------- f(1 row)
postgres=# select '{"a": 1, "b": 2}'::jsonb @> '"a"'::jsonb; ?column? ---------- f(1 row)
![Page 22: PostgreSQL and JSON - thebuild.com · Greetings! • Christophe Pettus • Consultant with PostgreSQL Experts, Inc. • thebuild.com — personal blog. • pgexperts.com — company](https://reader031.vdocuments.us/reader031/viewer/2022022710/5bf9fecd09d3f2712b8b915d/html5/thumbnails/22.jpg)
?, ?|, ?&
• True if:
• ? — The key on the right-hand side appears in the left-hand side.
• ?| ?& — Any of the array of keys on the right-hand side appear on the left-hand side.
• PostgreSQL array type, not JSON array.
![Page 23: PostgreSQL and JSON - thebuild.com · Greetings! • Christophe Pettus • Consultant with PostgreSQL Experts, Inc. • thebuild.com — personal blog. • pgexperts.com — company](https://reader031.vdocuments.us/reader031/viewer/2022022710/5bf9fecd09d3f2712b8b915d/html5/thumbnails/23.jpg)
?, ?|, ?&postgres=# select '{"a": 7, "b": 4}'::jsonb ? 'a'; ?column? ---------- t(1 row)
postgres=# select '{"a": 7, "b": 4}'::jsonb ?& ARRAY['a', 'b']; ?column? ---------- t(1 row)
postgres=# select '{"a": 7, "b": 4}'::jsonb ?| ARRAY['a', 'q']; ?column? ---------- t(1 row)
![Page 24: PostgreSQL and JSON - thebuild.com · Greetings! • Christophe Pettus • Consultant with PostgreSQL Experts, Inc. • thebuild.com — personal blog. • pgexperts.com — company](https://reader031.vdocuments.us/reader031/viewer/2022022710/5bf9fecd09d3f2712b8b915d/html5/thumbnails/24.jpg)
but.postgres=# select '{"a": {"b": 7, "c": 8}}'::jsonb ? 'b'; ?column? ---------- f(1 row)
postgres=# select '[1, 2, 3, 4]'::jsonb ?| ARRAY[1, 100];ERROR: operator does not exist: jsonb ?| integer[]LINE 1: select '[1, 2, 3, 4]'::jsonb ?| ARRAY[1, 100]; ^HINT: No operator matches the given name and argument type(s). You might need to add explicit type casts.
postgres=# select '[1, 2, 3, 4]'::jsonb ?| '[1, 2]'::jsonb;ERROR: operator does not exist: jsonb ?| jsonbLINE 1: select '[1, 2, 3, 4]'::jsonb ?| '[1, 2]'::jsonb; ^HINT: No operator matches the given name and argument type(s). You might need to add explicit type casts.
![Page 25: PostgreSQL and JSON - thebuild.com · Greetings! • Christophe Pettus • Consultant with PostgreSQL Experts, Inc. • thebuild.com — personal blog. • pgexperts.com — company](https://reader031.vdocuments.us/reader031/viewer/2022022710/5bf9fecd09d3f2712b8b915d/html5/thumbnails/25.jpg)
JSON functions
• Lots and lots and lots.
• Create JSON from records, arrays, etc.
• Expand JSON into records, arrays, rowsets, etc.
• Many have both json and jsonb versions.
![Page 26: PostgreSQL and JSON - thebuild.com · Greetings! • Christophe Pettus • Consultant with PostgreSQL Experts, Inc. • thebuild.com — personal blog. • pgexperts.com — company](https://reader031.vdocuments.us/reader031/viewer/2022022710/5bf9fecd09d3f2712b8b915d/html5/thumbnails/26.jpg)
Example: row_to_json
• Accepts an arbitrary row.
• Returns a json (not jsonb) object.
• For non-string/int/NULL types, uses the output function to create a string.
• Properly handles composite/array types.
![Page 27: PostgreSQL and JSON - thebuild.com · Greetings! • Christophe Pettus • Consultant with PostgreSQL Experts, Inc. • thebuild.com — personal blog. • pgexperts.com — company](https://reader031.vdocuments.us/reader031/viewer/2022022710/5bf9fecd09d3f2712b8b915d/html5/thumbnails/27.jpg)
Behold!
xof=# select row_to_json(rel.*) from rel where array_length(tags, 1) > 2 order by id limit 3; row_to_json ----------------------------------------------------------------------------------------------------------------------------- {"id":636572,"first_name":"OLENE","last_name":"OGRAM","tags":["female","square","violet"]} {"id":636744,"first_name":"SHAYNE","last_name":"GALPIN","tags":["female","square","silver","aquamarine","green","octogon"]} {"id":636769,"first_name":"YASMIN","last_name":"AKEN","tags":["female","red","green"]}(3 rows)
![Page 28: PostgreSQL and JSON - thebuild.com · Greetings! • Christophe Pettus • Consultant with PostgreSQL Experts, Inc. • thebuild.com — personal blog. • pgexperts.com — company](https://reader031.vdocuments.us/reader031/viewer/2022022710/5bf9fecd09d3f2712b8b915d/html5/thumbnails/28.jpg)
But seriously…
• … can be used in a trigger to append to an audit table regardless of the schema.
• Extremely useful for shared triggers.
![Page 29: PostgreSQL and JSON - thebuild.com · Greetings! • Christophe Pettus • Consultant with PostgreSQL Experts, Inc. • thebuild.com — personal blog. • pgexperts.com — company](https://reader031.vdocuments.us/reader031/viewer/2022022710/5bf9fecd09d3f2712b8b915d/html5/thumbnails/29.jpg)
Example: jsonb_each_text
• Takes a jsonb object, and returns a rowset of key/value pairs.
• Returns each as text object.
• Can be used to write the world’s most expensive EAV query!
![Page 30: PostgreSQL and JSON - thebuild.com · Greetings! • Christophe Pettus • Consultant with PostgreSQL Experts, Inc. • thebuild.com — personal blog. • pgexperts.com — company](https://reader031.vdocuments.us/reader031/viewer/2022022710/5bf9fecd09d3f2712b8b915d/html5/thumbnails/30.jpg)
Behold!
xof=# WITH s AS (xof(# SELECT row_to_json(rel.*)::jsonb AS j FROM rel ORDER BY id LIMIT 3xof(# ) SELECT (s.j->>'id')::bigint AS entity, key as attribute, value FROM s, LATERAL jsonb_each_text(s.j) WHERE key <> 'id'; entity | attribute | value --------+------------+------------ 636526 | tags | ["female"] 636526 | last_name | EILTS 636526 | first_name | REGENA 636527 | tags | ["male"] 636527 | last_name | POTO 636527 | first_name | ANTONIO 636528 | tags | ["female"] 636528 | last_name | LUFSEY 636528 | first_name | ROXY(9 rows)
![Page 31: PostgreSQL and JSON - thebuild.com · Greetings! • Christophe Pettus • Consultant with PostgreSQL Experts, Inc. • thebuild.com — personal blog. • pgexperts.com — company](https://reader031.vdocuments.us/reader031/viewer/2022022710/5bf9fecd09d3f2712b8b915d/html5/thumbnails/31.jpg)
But that would be wrong.
![Page 32: PostgreSQL and JSON - thebuild.com · Greetings! • Christophe Pettus • Consultant with PostgreSQL Experts, Inc. • thebuild.com — personal blog. • pgexperts.com — company](https://reader031.vdocuments.us/reader031/viewer/2022022710/5bf9fecd09d3f2712b8b915d/html5/thumbnails/32.jpg)
But seriously…
• … it can be used to expand jsonb into relational data for JOINs and the like.
• Often more efficient than using the extraction operators.
![Page 33: PostgreSQL and JSON - thebuild.com · Greetings! • Christophe Pettus • Consultant with PostgreSQL Experts, Inc. • thebuild.com — personal blog. • pgexperts.com — company](https://reader031.vdocuments.us/reader031/viewer/2022022710/5bf9fecd09d3f2712b8b915d/html5/thumbnails/33.jpg)
Indexing.
![Page 34: PostgreSQL and JSON - thebuild.com · Greetings! • Christophe Pettus • Consultant with PostgreSQL Experts, Inc. • thebuild.com — personal blog. • pgexperts.com — company](https://reader031.vdocuments.us/reader031/viewer/2022022710/5bf9fecd09d3f2712b8b915d/html5/thumbnails/34.jpg)
Indexing json
• The textual json type has no inherent indexing (that you’d ever use).
• Can do an expression index on extracted values…
• … but that requires knowing exactly which fields / elements you are going to query on.
![Page 35: PostgreSQL and JSON - thebuild.com · Greetings! • Christophe Pettus • Consultant with PostgreSQL Experts, Inc. • thebuild.com — personal blog. • pgexperts.com — company](https://reader031.vdocuments.us/reader031/viewer/2022022710/5bf9fecd09d3f2712b8b915d/html5/thumbnails/35.jpg)
jsonb indexing.
• jsonb has GIN indexing.
• Default type supports queries with the @>, ?, ?& and ?| operators.
• The query must be against the top-level object for the index to be useful.
• Can query nested objects, but only in paths rooted at the top level.
![Page 36: PostgreSQL and JSON - thebuild.com · Greetings! • Christophe Pettus • Consultant with PostgreSQL Experts, Inc. • thebuild.com — personal blog. • pgexperts.com — company](https://reader031.vdocuments.us/reader031/viewer/2022022710/5bf9fecd09d3f2712b8b915d/html5/thumbnails/36.jpg)
jsonb_path_ops
• Optional GIN index type for jsonb.
• Only supports @>.
• Hashes paths for each item, rather than just storing the key itself.
• Faster for @> operations with nesting.
![Page 37: PostgreSQL and JSON - thebuild.com · Greetings! • Christophe Pettus • Consultant with PostgreSQL Experts, Inc. • thebuild.com — personal blog. • pgexperts.com — company](https://reader031.vdocuments.us/reader031/viewer/2022022710/5bf9fecd09d3f2712b8b915d/html5/thumbnails/37.jpg)
jdoc @> '{"tags": ["qui"]}'
• Both index types support this.
• jsonb_ops (the default) will seach for everything that has “tags”, has “qui”, AND them, and then do a recheck for the path structure.
• jsonb_path_ops will go directly to entries for that path.
![Page 38: PostgreSQL and JSON - thebuild.com · Greetings! • Christophe Pettus • Consultant with PostgreSQL Experts, Inc. • thebuild.com — personal blog. • pgexperts.com — company](https://reader031.vdocuments.us/reader031/viewer/2022022710/5bf9fecd09d3f2712b8b915d/html5/thumbnails/38.jpg)
Which to use?
• If you just need @>, jsonb_path_ops will probably be faster.
• If you need the other supported operators, you need jsonb_ops.
• But let’s find out!
![Page 39: PostgreSQL and JSON - thebuild.com · Greetings! • Christophe Pettus • Consultant with PostgreSQL Experts, Inc. • thebuild.com — personal blog. • pgexperts.com — company](https://reader031.vdocuments.us/reader031/viewer/2022022710/5bf9fecd09d3f2712b8b915d/html5/thumbnails/39.jpg)
![Page 40: PostgreSQL and JSON - thebuild.com · Greetings! • Christophe Pettus • Consultant with PostgreSQL Experts, Inc. • thebuild.com — personal blog. • pgexperts.com — company](https://reader031.vdocuments.us/reader031/viewer/2022022710/5bf9fecd09d3f2712b8b915d/html5/thumbnails/40.jpg)
Test results.
![Page 41: PostgreSQL and JSON - thebuild.com · Greetings! • Christophe Pettus • Consultant with PostgreSQL Experts, Inc. • thebuild.com — personal blog. • pgexperts.com — company](https://reader031.vdocuments.us/reader031/viewer/2022022710/5bf9fecd09d3f2712b8b915d/html5/thumbnails/41.jpg)
The Usual Caveats
• The universe of possible workloads and schemas is infinite.
• Always build and test using data that simulates your real application.
• Don’t take these results as being applicable to every situation.
• Relative, not absolute results.
![Page 42: PostgreSQL and JSON - thebuild.com · Greetings! • Christophe Pettus • Consultant with PostgreSQL Experts, Inc. • thebuild.com — personal blog. • pgexperts.com — company](https://reader031.vdocuments.us/reader031/viewer/2022022710/5bf9fecd09d3f2712b8b915d/html5/thumbnails/42.jpg)
That said…
• Four column schema:
• id — Primary key, bigint.
• first_name, last_name — Text.
• tags — Array of short text tags. Two extremely common ones (one per record), a diminishing number of rare ones.
![Page 43: PostgreSQL and JSON - thebuild.com · Greetings! • Christophe Pettus • Consultant with PostgreSQL Experts, Inc. • thebuild.com — personal blog. • pgexperts.com — company](https://reader031.vdocuments.us/reader031/viewer/2022022710/5bf9fecd09d3f2712b8b915d/html5/thumbnails/43.jpg)
The test setup.
• Amazon i2.2xlarge instance.
• Ubuntu.
• PostgreSQL 9.4.0.
• Basic tuning for instance size.
![Page 44: PostgreSQL and JSON - thebuild.com · Greetings! • Christophe Pettus • Consultant with PostgreSQL Experts, Inc. • thebuild.com — personal blog. • pgexperts.com — company](https://reader031.vdocuments.us/reader031/viewer/2022022710/5bf9fecd09d3f2712b8b915d/html5/thumbnails/44.jpg)
Test data.
• 10,000,000 records generated at random.
• Schemas:
• Pure relational data.
• hybrid (names in relational, tags jsonb).
• json and jsonb for non-ID.
![Page 45: PostgreSQL and JSON - thebuild.com · Greetings! • Christophe Pettus • Consultant with PostgreSQL Experts, Inc. • thebuild.com — personal blog. • pgexperts.com — company](https://reader031.vdocuments.us/reader031/viewer/2022022710/5bf9fecd09d3f2712b8b915d/html5/thumbnails/45.jpg)
Methodology
• 100 iterations per test, top and bottom 10 rejected.
• Query execution time only; does not include time to return results.
• Python test harness can distort considerably if objects need to be created.
![Page 46: PostgreSQL and JSON - thebuild.com · Greetings! • Christophe Pettus • Consultant with PostgreSQL Experts, Inc. • thebuild.com — personal blog. • pgexperts.com — company](https://reader031.vdocuments.us/reader031/viewer/2022022710/5bf9fecd09d3f2712b8b915d/html5/thumbnails/46.jpg)
Test #1: Load
• Load 10,000,000 records using COPY.
• No index rebuilds.
• Relational, “hybrid,” all json, all jsonb.
![Page 47: PostgreSQL and JSON - thebuild.com · Greetings! • Christophe Pettus • Consultant with PostgreSQL Experts, Inc. • thebuild.com — personal blog. • pgexperts.com — company](https://reader031.vdocuments.us/reader031/viewer/2022022710/5bf9fecd09d3f2712b8b915d/html5/thumbnails/47.jpg)
0
7.5
15
22.5
30
Relational Hybrid json jsonb
Load Time (sec)
![Page 48: PostgreSQL and JSON - thebuild.com · Greetings! • Christophe Pettus • Consultant with PostgreSQL Experts, Inc. • thebuild.com — personal blog. • pgexperts.com — company](https://reader031.vdocuments.us/reader031/viewer/2022022710/5bf9fecd09d3f2712b8b915d/html5/thumbnails/48.jpg)
Test #1: Results
• Relational beats everything (no surprise).
• jsonb is slower to load than json.
• Parsing and conversion time.
• The same order of magnitude.
![Page 49: PostgreSQL and JSON - thebuild.com · Greetings! • Christophe Pettus • Consultant with PostgreSQL Experts, Inc. • thebuild.com — personal blog. • pgexperts.com — company](https://reader031.vdocuments.us/reader031/viewer/2022022710/5bf9fecd09d3f2712b8b915d/html5/thumbnails/49.jpg)
Test #2: Sequential scan for a single last name.• Scan table sequentially (no index) for a
single last name.
• Uses a relational field for relational and hybrid.
• Uses ->> operator for json and jsonb.
• Also tried with @> operator for jsonb.
![Page 50: PostgreSQL and JSON - thebuild.com · Greetings! • Christophe Pettus • Consultant with PostgreSQL Experts, Inc. • thebuild.com — personal blog. • pgexperts.com — company](https://reader031.vdocuments.us/reader031/viewer/2022022710/5bf9fecd09d3f2712b8b915d/html5/thumbnails/50.jpg)
0
2500
5000
7500
10000
Relational Hybrid json jsonb jsonb @>
Query time (ms)
![Page 51: PostgreSQL and JSON - thebuild.com · Greetings! • Christophe Pettus • Consultant with PostgreSQL Experts, Inc. • thebuild.com — personal blog. • pgexperts.com — company](https://reader031.vdocuments.us/reader031/viewer/2022022710/5bf9fecd09d3f2712b8b915d/html5/thumbnails/51.jpg)
Test #2: Results.
• json dramatically slower than jsonb.
• Relational faster than jsonb by about 2x.
• ->> and @> operators roughly same speed in this application.
![Page 52: PostgreSQL and JSON - thebuild.com · Greetings! • Christophe Pettus • Consultant with PostgreSQL Experts, Inc. • thebuild.com — personal blog. • pgexperts.com — company](https://reader031.vdocuments.us/reader031/viewer/2022022710/5bf9fecd09d3f2712b8b915d/html5/thumbnails/52.jpg)
Test #3: b-tree index lookup by name.• Create a traditional b-tree index.
• Directly on last_name for relational and hybrid.
• Expression index on (jdoc->>‘last_name’) for json and jsonb.
• Also tried GIN index on jsonb field, using @>.
![Page 53: PostgreSQL and JSON - thebuild.com · Greetings! • Christophe Pettus • Consultant with PostgreSQL Experts, Inc. • thebuild.com — personal blog. • pgexperts.com — company](https://reader031.vdocuments.us/reader031/viewer/2022022710/5bf9fecd09d3f2712b8b915d/html5/thumbnails/53.jpg)
0
0.15
0.3
0.45
0.6
Relational Hybrid json jsonb jsonb GIN
Query time (ms)
![Page 54: PostgreSQL and JSON - thebuild.com · Greetings! • Christophe Pettus • Consultant with PostgreSQL Experts, Inc. • thebuild.com — personal blog. • pgexperts.com — company](https://reader031.vdocuments.us/reader031/viewer/2022022710/5bf9fecd09d3f2712b8b915d/html5/thumbnails/54.jpg)
Test #3: Results.
• All of comparable speed.
• jsonb actually faster than anything else!
• json somewhat slower due to extraction overhead.
• Always the fastest way to look up a highly selective field.
![Page 55: PostgreSQL and JSON - thebuild.com · Greetings! • Christophe Pettus • Consultant with PostgreSQL Experts, Inc. • thebuild.com — personal blog. • pgexperts.com — company](https://reader031.vdocuments.us/reader031/viewer/2022022710/5bf9fecd09d3f2712b8b915d/html5/thumbnails/55.jpg)
Test #3: Results, 2
• jsonb w/GIN very comparable to b-tree index.
• Didn’t have to specify a particular field in advance.
• Huge improvement over 9.3 days.
![Page 56: PostgreSQL and JSON - thebuild.com · Greetings! • Christophe Pettus • Consultant with PostgreSQL Experts, Inc. • thebuild.com — personal blog. • pgexperts.com — company](https://reader031.vdocuments.us/reader031/viewer/2022022710/5bf9fecd09d3f2712b8b915d/html5/thumbnails/56.jpg)
Test #4: Common tag lookup by seq scan.• Every record has a ‘male’ or ‘female’ tag,
50%/50%.
• Scan looking for all of one.
• Uses @> operator for tag array.
• Uses @> operator for jsonb.
• Also tried with a secondary table of tags to which we join.
![Page 57: PostgreSQL and JSON - thebuild.com · Greetings! • Christophe Pettus • Consultant with PostgreSQL Experts, Inc. • thebuild.com — personal blog. • pgexperts.com — company](https://reader031.vdocuments.us/reader031/viewer/2022022710/5bf9fecd09d3f2712b8b915d/html5/thumbnails/57.jpg)
0
3750
7500
11250
15000
Relational Relation w/JOIN jsonb
Query time (ms)
![Page 58: PostgreSQL and JSON - thebuild.com · Greetings! • Christophe Pettus • Consultant with PostgreSQL Experts, Inc. • thebuild.com — personal blog. • pgexperts.com — company](https://reader031.vdocuments.us/reader031/viewer/2022022710/5bf9fecd09d3f2712b8b915d/html5/thumbnails/58.jpg)
Test #4: Results.
• Secondary join table a huge loss in this scenario.
• jsonb slower than relational, but within the same general range.
![Page 59: PostgreSQL and JSON - thebuild.com · Greetings! • Christophe Pettus • Consultant with PostgreSQL Experts, Inc. • thebuild.com — personal blog. • pgexperts.com — company](https://reader031.vdocuments.us/reader031/viewer/2022022710/5bf9fecd09d3f2712b8b915d/html5/thumbnails/59.jpg)
Test #5: Rare tag lookup by seq scan.• Scan for a rare tag (0.075% of records).
• Uses @> operator for relational.
• Uses @> operator for jsonb.
• Also tried with JOIN table.
• In both cases, JOIN table indexed on tag, but didn’t use in seq scan case.
![Page 60: PostgreSQL and JSON - thebuild.com · Greetings! • Christophe Pettus • Consultant with PostgreSQL Experts, Inc. • thebuild.com — personal blog. • pgexperts.com — company](https://reader031.vdocuments.us/reader031/viewer/2022022710/5bf9fecd09d3f2712b8b915d/html5/thumbnails/60.jpg)
0
1000
2000
3000
4000
Relational Relation w/JOIN jsonb
Query time (ms)
![Page 61: PostgreSQL and JSON - thebuild.com · Greetings! • Christophe Pettus • Consultant with PostgreSQL Experts, Inc. • thebuild.com — personal blog. • pgexperts.com — company](https://reader031.vdocuments.us/reader031/viewer/2022022710/5bf9fecd09d3f2712b8b915d/html5/thumbnails/61.jpg)
Test #5: Results.
• Secondary join table a huge win in this scenario.
• Unsurprising, since it can isolate the rare tag faster.
• jsonb remains slower but comparable.
![Page 62: PostgreSQL and JSON - thebuild.com · Greetings! • Christophe Pettus • Consultant with PostgreSQL Experts, Inc. • thebuild.com — personal blog. • pgexperts.com — company](https://reader031.vdocuments.us/reader031/viewer/2022022710/5bf9fecd09d3f2712b8b915d/html5/thumbnails/62.jpg)
Test #6: Rare tag lookup by index.• Create a GIN index on relational array field
and jsonb document
• Use @> operator for tag array.
• Use @> operator for jsonb.
• Also tried with JOIN table.
![Page 63: PostgreSQL and JSON - thebuild.com · Greetings! • Christophe Pettus • Consultant with PostgreSQL Experts, Inc. • thebuild.com — personal blog. • pgexperts.com — company](https://reader031.vdocuments.us/reader031/viewer/2022022710/5bf9fecd09d3f2712b8b915d/html5/thumbnails/63.jpg)
0
100
200
300
400
Relational Relation w/JOIN jsonb
Query time (ms)
![Page 64: PostgreSQL and JSON - thebuild.com · Greetings! • Christophe Pettus • Consultant with PostgreSQL Experts, Inc. • thebuild.com — personal blog. • pgexperts.com — company](https://reader031.vdocuments.us/reader031/viewer/2022022710/5bf9fecd09d3f2712b8b915d/html5/thumbnails/64.jpg)
Test #6: Results.
• Relational fastest in this situation…
• … but jsonb performs comparably.
• If you are storing rare tags and don’t need full JSON, consider an array field.
![Page 65: PostgreSQL and JSON - thebuild.com · Greetings! • Christophe Pettus • Consultant with PostgreSQL Experts, Inc. • thebuild.com — personal blog. • pgexperts.com — company](https://reader031.vdocuments.us/reader031/viewer/2022022710/5bf9fecd09d3f2712b8b915d/html5/thumbnails/65.jpg)
Note: GIN indexes and selectivity.• GIN indexes on jsonb fields have hard-
wired selectivity calculations (as of 9.4).
• Will almost always use the index even if selectivity is very low.
• This can result in bad performance in cases of low selectivity.
• An area that definitely needs attention.
![Page 66: PostgreSQL and JSON - thebuild.com · Greetings! • Christophe Pettus • Consultant with PostgreSQL Experts, Inc. • thebuild.com — personal blog. • pgexperts.com — company](https://reader031.vdocuments.us/reader031/viewer/2022022710/5bf9fecd09d3f2712b8b915d/html5/thumbnails/66.jpg)
Test #7: Index Creation.
• Timed index creation for the various index types.
• last_name b-tree on relational.
• GIN on relational array.
• GIN json_ops and json_path_ops on jsonb.
![Page 67: PostgreSQL and JSON - thebuild.com · Greetings! • Christophe Pettus • Consultant with PostgreSQL Experts, Inc. • thebuild.com — personal blog. • pgexperts.com — company](https://reader031.vdocuments.us/reader031/viewer/2022022710/5bf9fecd09d3f2712b8b915d/html5/thumbnails/67.jpg)
0
25
50
75
100
B-Tree GIN Array GIN json_ops GIN json_path_ops
Build time (sec)
![Page 68: PostgreSQL and JSON - thebuild.com · Greetings! • Christophe Pettus • Consultant with PostgreSQL Experts, Inc. • thebuild.com — personal blog. • pgexperts.com — company](https://reader031.vdocuments.us/reader031/viewer/2022022710/5bf9fecd09d3f2712b8b915d/html5/thumbnails/68.jpg)
Test #7: Results.
• GIN build time is very fast.
• json_path_ops build time is very fast.
• GIN indexing on arrays, too.
![Page 69: PostgreSQL and JSON - thebuild.com · Greetings! • Christophe Pettus • Consultant with PostgreSQL Experts, Inc. • thebuild.com — personal blog. • pgexperts.com — company](https://reader031.vdocuments.us/reader031/viewer/2022022710/5bf9fecd09d3f2712b8b915d/html5/thumbnails/69.jpg)
Test #8: Relation size.
• Total size, excluding indexes.
• For relation + JOIN table, includes JOIN table as well.
![Page 70: PostgreSQL and JSON - thebuild.com · Greetings! • Christophe Pettus • Consultant with PostgreSQL Experts, Inc. • thebuild.com — personal blog. • pgexperts.com — company](https://reader031.vdocuments.us/reader031/viewer/2022022710/5bf9fecd09d3f2712b8b915d/html5/thumbnails/70.jpg)
0
375
750
1125
1500
Relational Relational+JOIN Hybrid JSON JSONB
MB
![Page 71: PostgreSQL and JSON - thebuild.com · Greetings! • Christophe Pettus • Consultant with PostgreSQL Experts, Inc. • thebuild.com — personal blog. • pgexperts.com — company](https://reader031.vdocuments.us/reader031/viewer/2022022710/5bf9fecd09d3f2712b8b915d/html5/thumbnails/71.jpg)
Test #8: Results.
• Generally comparable size.
• hybrid is the most compact by a significant margin.
• jsonb slightly larger than json due to internal structure overhead.
![Page 72: PostgreSQL and JSON - thebuild.com · Greetings! • Christophe Pettus • Consultant with PostgreSQL Experts, Inc. • thebuild.com — personal blog. • pgexperts.com — company](https://reader031.vdocuments.us/reader031/viewer/2022022710/5bf9fecd09d3f2712b8b915d/html5/thumbnails/72.jpg)
Test #9: Index size.
• Size of various indexes.
• Primary key index (same for all tables).
• GIN index on relational tags.
• json_ops
• json_path_ops
![Page 73: PostgreSQL and JSON - thebuild.com · Greetings! • Christophe Pettus • Consultant with PostgreSQL Experts, Inc. • thebuild.com — personal blog. • pgexperts.com — company](https://reader031.vdocuments.us/reader031/viewer/2022022710/5bf9fecd09d3f2712b8b915d/html5/thumbnails/73.jpg)
0
75
150
225
300
Primary Key Relational Tags hybrid GIN json_ops json_path_ops
MB
![Page 74: PostgreSQL and JSON - thebuild.com · Greetings! • Christophe Pettus • Consultant with PostgreSQL Experts, Inc. • thebuild.com — personal blog. • pgexperts.com — company](https://reader031.vdocuments.us/reader031/viewer/2022022710/5bf9fecd09d3f2712b8b915d/html5/thumbnails/74.jpg)
Test #9: Results.
• Indexes on just the tags are very compact.
• json_path_ops indexes are (as expected) somewhat smaller than json_ops indexes.
![Page 75: PostgreSQL and JSON - thebuild.com · Greetings! • Christophe Pettus • Consultant with PostgreSQL Experts, Inc. • thebuild.com — personal blog. • pgexperts.com — company](https://reader031.vdocuments.us/reader031/viewer/2022022710/5bf9fecd09d3f2712b8b915d/html5/thumbnails/75.jpg)
Now that we know this, what do we
know?
![Page 76: PostgreSQL and JSON - thebuild.com · Greetings! • Christophe Pettus • Consultant with PostgreSQL Experts, Inc. • thebuild.com — personal blog. • pgexperts.com — company](https://reader031.vdocuments.us/reader031/viewer/2022022710/5bf9fecd09d3f2712b8b915d/html5/thumbnails/76.jpg)
The One-Slide Oversimplification.• Use relational data for the basic set of
attributes.
• Use either array fields or jsonb for extended attributes.
• Use file-system storage for really big stuff.
• Always use jsonb. No reason to use json.
![Page 77: PostgreSQL and JSON - thebuild.com · Greetings! • Christophe Pettus • Consultant with PostgreSQL Experts, Inc. • thebuild.com — personal blog. • pgexperts.com — company](https://reader031.vdocuments.us/reader031/viewer/2022022710/5bf9fecd09d3f2712b8b915d/html5/thumbnails/77.jpg)
The Future
![Page 78: PostgreSQL and JSON - thebuild.com · Greetings! • Christophe Pettus • Consultant with PostgreSQL Experts, Inc. • thebuild.com — personal blog. • pgexperts.com — company](https://reader031.vdocuments.us/reader031/viewer/2022022710/5bf9fecd09d3f2712b8b915d/html5/thumbnails/78.jpg)
PL/V8
• Technically, not the future: Available now!
• Google’s V8 Javascript engine as a PostgreSQL PL.
• Very, very fast.
• A somewhat tedious build and install process.
![Page 79: PostgreSQL and JSON - thebuild.com · Greetings! • Christophe Pettus • Consultant with PostgreSQL Experts, Inc. • thebuild.com — personal blog. • pgexperts.com — company](https://reader031.vdocuments.us/reader031/viewer/2022022710/5bf9fecd09d3f2712b8b915d/html5/thumbnails/79.jpg)
Jsquery
• A query language for handling nested structures.
• select count(*) from jb where jb @@ '*.term = "NYC"';
• Available now as an extension for 9.4
• https://github.com/akorotkov/jsquery
![Page 80: PostgreSQL and JSON - thebuild.com · Greetings! • Christophe Pettus • Consultant with PostgreSQL Experts, Inc. • thebuild.com — personal blog. • pgexperts.com — company](https://reader031.vdocuments.us/reader031/viewer/2022022710/5bf9fecd09d3f2712b8b915d/html5/thumbnails/80.jpg)
VODKA
• New indexing architecture to support nested structures.
• An indexing infrastructure, not a set of query operators.
• An improvement on GIN (depending on the brand, I guess).
• Work-in-progress.
![Page 81: PostgreSQL and JSON - thebuild.com · Greetings! • Christophe Pettus • Consultant with PostgreSQL Experts, Inc. • thebuild.com — personal blog. • pgexperts.com — company](https://reader031.vdocuments.us/reader031/viewer/2022022710/5bf9fecd09d3f2712b8b915d/html5/thumbnails/81.jpg)
ToroDB
• A document-oriented database layered on PostgreSQL.
• Sorts documents out into relations for speed.
• Speaks the MongoDB wire protocol.
• https://github.com/torodb/torodb
![Page 82: PostgreSQL and JSON - thebuild.com · Greetings! • Christophe Pettus • Consultant with PostgreSQL Experts, Inc. • thebuild.com — personal blog. • pgexperts.com — company](https://reader031.vdocuments.us/reader031/viewer/2022022710/5bf9fecd09d3f2712b8b915d/html5/thumbnails/82.jpg)
And here we are!
• PostgreSQL 9.4 has world-beating JSON support.
• Mix and match! Use JSON for what is good for, relational data for speed.
• We’re much faster than MongoDB, by the way. Just saying.
![Page 83: PostgreSQL and JSON - thebuild.com · Greetings! • Christophe Pettus • Consultant with PostgreSQL Experts, Inc. • thebuild.com — personal blog. • pgexperts.com — company](https://reader031.vdocuments.us/reader031/viewer/2022022710/5bf9fecd09d3f2712b8b915d/html5/thumbnails/83.jpg)
Thank you!
![Page 84: PostgreSQL and JSON - thebuild.com · Greetings! • Christophe Pettus • Consultant with PostgreSQL Experts, Inc. • thebuild.com — personal blog. • pgexperts.com — company](https://reader031.vdocuments.us/reader031/viewer/2022022710/5bf9fecd09d3f2712b8b915d/html5/thumbnails/84.jpg)
Questions?
![Page 85: PostgreSQL and JSON - thebuild.com · Greetings! • Christophe Pettus • Consultant with PostgreSQL Experts, Inc. • thebuild.com — personal blog. • pgexperts.com — company](https://reader031.vdocuments.us/reader031/viewer/2022022710/5bf9fecd09d3f2712b8b915d/html5/thumbnails/85.jpg)
• thebuild.com — personal blog.
• pgexperts.com — company website.
• Twitter @Xof