GraphConnect Europe 2016 - How the ICIJ Used Neo4j to Unravel the Panama Papers - Mar Cabra

Post on 08-Jan-2017

TRANSCRIPT

Leaks, journalism & graphs

How ICIJ Used Neo4j to Unravel the Panama Papers

Mar Cabra

Editor, Data & Research Unit

The International Consortium of Investigative Journalists (ICIJ)

@cabralens | @ICIJorg

icij.org

Almost 200 journalists

Based in 65 countries

“Our aim is to bring journalists from different countries together in teams - eliminating rivalry and promoting collaboration.

Together, we aim to be the world’s best cross-border investigative team.”

icij.org/about

You may remember us from...

+370 journalists

+100 media organizations

76 countries

Nearly one in 10 of the 31,000 tax haven companies that own British property are linked to Mossack Fonseca

#panamapapers

INSIDE THE 2.6 TB

Redis queue

35 x g2.xlarge Amazon instances with Ubuntu + Tesseract + Extract
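The slides name a Redis queue feeding a pool of OCR machines but give no further detail. A minimal sketch of that queue-and-workers pattern follows, assuming Python's in-memory `queue.Queue` as a stand-in for Redis and a hypothetical `fake_ocr` function in place of Tesseract; it is an illustration of the idea, not ICIJ's pipeline.

```python
import queue
import threading


def fake_ocr(doc_id: str) -> str:
    # Hypothetical placeholder for the real OCR/extraction step.
    return f"text of {doc_id}"


def run_workers(doc_ids, n_workers=4):
    """Process every document id once, using n_workers parallel workers."""
    jobs = queue.Queue()  # stand-in for the Redis queue named on the slide
    for doc_id in doc_ids:
        jobs.put(doc_id)

    results = {}
    lock = threading.Lock()

    def worker():
        # Each worker keeps pulling jobs until the queue is empty.
        while True:
            try:
                doc_id = jobs.get_nowait()
            except queue.Empty:
                return
            text = fake_ocr(doc_id)
            with lock:
                results[doc_id] = text

    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


extracted = run_workers(["a.pdf", "b.tif", "c.png"], n_workers=2)
```

Because workers pull jobs rather than being assigned them, adding machines speeds up processing without changing the producer.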

Lucene syntax queries with proximity matching!
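In Lucene query syntax, a proximity search is a quoted phrase followed by `~N`, which matches documents where the words appear within N positions of each other. The helper below, whose name `proximity_query` is made up for this sketch, only builds such a query string; it does not escape special characters and does not talk to any search server.

```python
def proximity_query(words, max_distance):
    """Build a Lucene proximity query such as "tax haven"~3."""
    if max_distance < 0:
        raise ValueError("max_distance must be non-negative")
    phrase = " ".join(words)
    # Quoted phrase + ~N is Lucene's proximity-search syntax.
    return f'"{phrase}"~{max_distance}'


example = proximity_query(["tax", "haven"], 3)
```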

400 users

INSIDE THE 2.6 TB

offshoreleaks.icij.org

MAGIC!!

● 950,000 nodes, 1.2 million edges (4GB)

Small, I know!!

● Find shortest path

Wow!
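To show what "find shortest path" means on a graph of entities, here is a minimal breadth-first search sketch. The node names are invented for illustration, the graph is treated as undirected, and this is plain Python rather than the Neo4j feature used in the talk.

```python
from collections import deque


def shortest_path(edges, start, goal):
    """Return the shortest list of nodes from start to goal, or None."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    if start not in graph or goal not in graph:
        return None

    previous = {start: None}  # node -> node we reached it from
    frontier = deque([start])
    while frontier:
        node = frontier.popleft()
        if node == goal:
            # Walk back through 'previous' to rebuild the path.
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbor in graph[node]:
            if neighbor not in previous:
                previous[neighbor] = node
                frontier.append(neighbor)
    return None


# Hypothetical entities, not taken from the leaked data.
edges = [
    ("Officer A", "Company X"),
    ("Company X", "Intermediary Y"),
    ("Intermediary Y", "Company Z"),
]
path = shortest_path(edges, "Officer A", "Company Z")
```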

● Fuzzy searching

● Public widgets

● API
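The slide does not say how fuzzy searching is implemented. As a rough illustration of the idea (tolerating misspelled names), the sketch below uses Python's standard `difflib.get_close_matches`; the function name `fuzzy_search` and every name in the list except Mossack Fonseca are made up for the example.

```python
import difflib


def fuzzy_search(query, names, limit=3, cutoff=0.6):
    """Return up to 'limit' names that approximately match the query."""
    # Compare case-insensitively, but return the original spelling.
    lowered = {name.lower(): name for name in names}
    hits = difflib.get_close_matches(
        query.lower(), list(lowered), n=limit, cutoff=cutoff
    )
    return [lowered[hit] for hit in hits]


names = ["Mossack Fonseca", "Example Holdings Ltd", "Sample Trading Corp"]
matches = fuzzy_search("Mosack Fonseka", names)
```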

HELP US OUT SOME?

THE END…or is it?

Mar Cabra

mcabra@icij.org | @cabralens

icij.org/support

bit.ly/icijgraphconnect2016
