School of Data - Mapping OpenCorporates Networks Using OpenRefine and Gephi
Mapping Corporate Networks - Intro
A two-part recipe for downloading company ownership data from
OpenCorporates using OpenRefine, and then
visualising it with Gephi
http://opencorporates.com/companies/gb/04366849/network.json?depth=2
How to grab the data using OpenRefine
Visit openrefine.org to download the application
Add /network.json?depth=2 to the end of the web address
Where’s the data?
URL of the form:
http://opencorporates.com/companies/JURISDICTION/COMPANY_ID/network.json?depth=2
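The URL pattern above can also be assembled programmatically. The following is a minimal sketch in Python; the function name `network_url` and the constant `BASE` are illustrative only and are not part of OpenCorporates or OpenRefine. It only builds the address and does not download anything.

```python
from urllib.parse import quote

# Base address taken from the URL pattern shown on the slide.
BASE = "http://opencorporates.com/companies"


def network_url(jurisdiction, company_id, depth=2):
    """Build the network.json address for one company (hypothetical helper)."""
    return (
        f"{BASE}/{quote(jurisdiction)}/{quote(company_id)}"
        f"/network.json?depth={int(depth)}"
    )


# The company used in the example URL earlier in the slides.
example_url = network_url("gb", "04366849")
print(example_url)
```

The resulting address is what gets pasted into OpenRefine's web address import box.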
What data block makes a row?
Toggle selection and preview
Create project
Nicely tabulated data
What Gephi Expects…
Child Parent
Parent -> Source
Child -> Target
(You may find the network analyses work better if you use the parent as the Target and the child as the Source…)
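As a rough illustration of the column mapping, here is a sketch that turns parent/child pairs into an edge list with Source and Target headers, with a switch for swapping the direction. The company names and the function `to_gephi_edges` are made up for the example; in the recipe itself the renaming is done in OpenRefine.

```python
import csv
import io


def to_gephi_edges(rows, parent_as_source=True):
    """Turn (parent, child) pairs into CSV text with Source,Target headers.

    Hypothetical helper: set parent_as_source=False to make the child the
    Source and the parent the Target instead.
    """
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["Source", "Target"])
    for parent, child in rows:
        if parent_as_source:
            writer.writerow([parent, child])
        else:
            writer.writerow([child, parent])
    return out.getvalue()


# Invented sample data, not taken from OpenCorporates.
rows = [("Parent Co", "Child A"), ("Parent Co", "Child B")]
print(to_gephi_edges(rows))
print(to_gephi_edges(rows, parent_as_source=False))
```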
How to visualise the data using Gephi
Visit gephi.org to download the application
Getting Started with Gephi
Import as Edges table
View
Layout
Colour/Size
Stats/Filters
Label tools
“Spacing”
Turn labels on
Label size
Label display selector
Degree 2, In-degree 2, Out-degree 0
A matter of degree…
Degree 3, In-degree 0, Out-degree 3
Degree 3, In-degree 1, Out-degree 2
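The three degree measures can be checked by hand on a small directed edge list. The sketch below uses an invented four-node graph (nodes A to D) chosen so that it reproduces the three combinations shown above; the function `degrees` is illustrative and is not how Gephi computes the values internally.

```python
from collections import Counter


def degrees(edges):
    """Count in-degree, out-degree and total degree for each node.

    edges is a list of (source, target) pairs in a directed graph.
    """
    in_deg = Counter()
    out_deg = Counter()
    for source, target in edges:
        out_deg[source] += 1  # an edge leaves the source
        in_deg[target] += 1   # and arrives at the target
    nodes = set(in_deg) | set(out_deg)
    return {
        n: {"degree": in_deg[n] + out_deg[n], "in": in_deg[n], "out": out_deg[n]}
        for n in nodes
    }


# Invented example graph.
edges = [("A", "B"), ("A", "C"), ("A", "D"), ("C", "B"), ("C", "D")]
result = degrees(edges)
for node in sorted(result):
    print(node, result[node])
```

Here B is only pointed to, A only points to others, and C does both.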
Size by degree…
Calculate in-degree and out-degree
Set node size
The colour wheel/palette is used to colour the nodes.
Label Sizing
Tweaking the layout
“Expand” the layout (stretch it in two dimensions)
“Adjust” the labels so that they don’t overlap - may change relative position of nodes
Network Stats
HITS – Authority and Hub values: authoritative nodes are pointed to, hub nodes point to others
Measure the ‘influence’of a node in the network
Note: some of these stats are more meaningful if
we set the parent company as the Target and the child
company as the Source in the original data…
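To see why the edge direction matters, here is a minimal sketch of the HITS iteration in plain Python. It is a textbook-style version with a fixed iteration count and Euclidean normalisation, both of which are assumptions; Gephi's own implementation may scale the scores differently. The company names are invented.

```python
import math


def hits(edges, iterations=50):
    """Return (hub, authority) score dicts for a directed edge list."""
    nodes = sorted({n for edge in edges for n in edge})
    hub = {n: 1.0 for n in nodes}
    auth = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        # Authority: sum of hub scores of the nodes pointing at this node.
        auth = {n: 0.0 for n in nodes}
        for source, target in edges:
            auth[target] += hub[source]
        norm = math.sqrt(sum(v * v for v in auth.values())) or 1.0
        auth = {n: v / norm for n, v in auth.items()}
        # Hub: sum of authority scores of the nodes this node points to.
        hub = {n: 0.0 for n in nodes}
        for source, target in edges:
            hub[source] += auth[target]
        norm = math.sqrt(sum(v * v for v in hub.values())) or 1.0
        hub = {n: v / norm for n, v in hub.items()}
    return hub, auth


# Child as Source, parent as Target: the parent collects authority.
child_to_parent = [("Child A", "Parent Co"), ("Child B", "Parent Co")]
hub_scores, auth_scores = hits(child_to_parent)
print(max(auth_scores, key=auth_scores.get))

# Parent as Source, child as Target: the parent becomes the hub instead.
parent_to_child = [(t, s) for s, t in child_to_parent]
hub_flipped, auth_flipped = hits(parent_to_child)
print(max(hub_flipped, key=hub_flipped.get))
```

With the child as Source, the parent company ends up with the highest Authority score, which is the reading the note above suggests.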
Use the tools in concert…
Colour based on Authority (HITS statistic)
Label adjust tweaks the layout so we can read the labels
Fine tune label sizing using text-size slider
SchoolOfData.org