seminar feb 2016
TRANSCRIPT
Mathematical Modelling and Analysis of Legislation Networks
Neda Sakhaee
University of Auckland
February 25, 2016
Neda Sakhaee (UOA) Legislation Network February 25, 2016 1 / 27
Overview
1 Introduction
2 Research Scope
    Questions
    Objectives and Methodologies
3 Progress To-Date and Initial Results
    Building the Network
    Network General Measures
    Centrality, Important Nodes
    Initial Community Detection
    Modelling Example
    Academic outcomes
4 Future Studies
    Overview
    Generative Models, Pattern Prediction
    Community Detection Method
5 Requirements and Limitations
    Data
    Software
6 Time-table for my PhD project
Introduction
This research looks at Legislation Networks within the wider class of citation networks.
The main case study is the New Zealand Legislation Network, but comparison studies with other countries are included.
The Legislation Network is a multi-layer graph with some novel features that make it an excellent test case for new network science tools.
Legislation Networks involve legal documents, but differ substantially from citation networks involving case law, Supreme Court opinions, etc.
Legislation Network
The concept of a Legislation Network was introduced in 2015 as a novel approach to show the interdependence of European Union laws.
In a Legislation Network:
Nodes: Laws (Acts, Regulations, etc.)
Edges: Any time one law references another law (definitions, amendments, etc.). We classify these as amendment edges or citation edges.
Research Scope: Questions
How can a Legislation Network be built? Is the Legislation Network a meaningful network, or just a set of random relationships?
What are the differences and similarities between the Legislation Network and other citation networks?
Can we measure the importance of legal documents using network science tools? How does this relate to human expert opinion?
Is there any meaningful relationship between the Legislation Network measures and political or social processes?
Do the legal documents tend to cluster?
Can we find a good generative model for the Legislation Network?
How can we study the time evolution of the Legislation Network?
Is it possible to predict the attributes of the legal documents in the Legislation Network?
Is it possible to predict the historical missing edges in the Legislation Network?
Research Scope: Objectives and Methodologies
Stage I: Investigate how the Legislation Network can be built and compared to other networks.
Stage II: Implement appropriate centrality measures to determine and compare the importance of legal documents.
Stage III: Develop a generative model of the Legislation Network.
Stage IV: Contribute to community detection algorithms for directed networks.
Stage V: Investigate relationships between the network science properties and political or social processes.
Stage VI: Propose link prediction and attribute prediction models for Legislation Networks.
Progress To-Date and Initial Results: Building the Network
We downloaded 8900 XML files of laws, dated from 1267 to October 2015, from Legislation.govt.nz. Substantial manual cross-checking and data cleaning is required.
All types of references are extracted from the XML files by a C# program. We have six different Acts networks, based on the node and edge type:
Binary Whole Network (BWN)
Weighted Whole Network (WWN)
Binary Citation Network (BCN)
Weighted Citation Network (WCN)
Binary Amendment Network (BAN)
Weighted Amendment Network (WAN)
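As a minimal sketch of how the six networks could be derived from one list of extracted references (this is not the C# program mentioned above), the Python below filters by edge type and either counts or flattens repeated references. The record layout, the `build_network` helper and the rule "weight = number of references between the same pair of Acts" are assumptions made for illustration; the Act names come from the Marriage Act example in this talk, and the duplicated citation row is invented to show a weight above 1.

```python
from collections import Counter

# Hypothetical reference records: (citing Act, referenced Act, edge kind).
# The duplicated citation row is invented to show how a weight above 1 arises.
references = [
    ("Marriage Act 1955", "Child Welfare Amendment Act 1948", "citation"),
    ("Marriage Act 1955", "Child Welfare Amendment Act 1948", "citation"),
    ("Marriage (Definition of Marriage) Amendment Act 2013",
     "Marriage Act 1955", "amendment"),
]


def build_network(refs, kinds, weighted):
    """Return {(source, target): weight} using only references of the given kinds."""
    counts = Counter((src, dst) for src, dst, kind in refs if kind in kinds)
    if weighted:
        return dict(counts)              # weight = number of references (assumption)
    return {edge: 1 for edge in counts}  # binary: an edge either exists or not


# W = whole (both edge kinds), C = citation only, A = amendment only.
EDGE_KINDS = {"W": {"citation", "amendment"}, "C": {"citation"}, "A": {"amendment"}}
networks = {
    ("W" if weighted else "B") + layer + "N": build_network(references, kinds, weighted)
    for layer, kinds in EDGE_KINDS.items()
    for weighted in (False, True)
}

for name, edges in sorted(networks.items()):
    print(name, len(edges), "edges, total weight", sum(edges.values()))
```

Keeping one reference list and deriving every network from it means the binary and weighted variants always share the same edge set, which matches the identical edge counts reported for each binary/weighted pair.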
Figure: Dataset Building Process (data extraction)
a) Citation Link
Act Name: Marriage Act 1955
Type: Public
Date: 27/11/1955
Terminated: 0
Year: 1955
Reprint: 1
Date reprinted: 19/08/2013
Cites: Child Welfare Amendment Act 1948
b) Amendment Link
Act Name: Marriage (Definition of Marriage) Amendment Act 2013
Type: Public
Date: 19/04/2013
Terminated: 0
Year: 2013
Reprint: 0
Date reprinted: -
Amends: Marriage Act 1955
Progress To-Date and Initial Results: Data sets and Visualizations
Data set and visualisation are available at: https://dataverse.harvard.edu/dataverse/LN
Progress To-Date and Initial Results: Network General Measures
Measure                 RN***    BWN      WWN      RN***    BCN      WCN      RN***    BAN      WAN
Nodes                   3856     3856     3856     2142     2142     2142     3856     3856     3856
Edges                   33884    33884    33884    20124    20124    20124    9030     9030     9030
Average Degree          9.712    8.878    13.233   10.112   9.395    17.257   3.207    2.342    3.648
Diameter                15       15       15       15       15       15       15       15       15
CCcyc                   0.003    0.223    0.223    0.004    0.492    0.492    0.001    0.031    0.031
CCmid                   0.003    0.305    0.305    0.004    0.655    0.655    0.001    0.066    0.066
CCin                    0.003    0.528    0.528    0.004    0.414    0.414    0.001    0.03     0.030
CCout                   0.003    0.506    0.506    0.004    0.374    0.374    0.001    0.033    0.033
Average CC*             0.003    0.446    0.446    0.004    0.484    0.484    0.001    0.004    0.004
Average Path length**   6.124    3.569    3.569    7.254    3.346    3.346    1.817    4.43     4.43
Small world             No       Yes      Yes      No       Yes      Yes      No       No       No
*Based on the directed clustering coefficient as proposed by B. M. Tabak in 2014.
**Based on the average path length proposed by S. H. Strogatz in 1995.
***The Random Network (RN) is a graph with a specified number of vertices n and connection probability p. The indexes are calculated from a sample of 100 graphs.
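The random-network baseline in the footnote can be sketched as follows: sample many directed G(n, p) graphs and average a measure over the sample. This is an illustration under stated assumptions, not the procedure actually used for the table. The toy size (n = 60, p = 0.05), the seed, the function names and the choice to average path lengths over reachable ordered pairs only are all hypothetical.

```python
import random
from collections import deque


def random_digraph(n, p, rng):
    """Directed G(n, p): each ordered pair (u, v), u != v, is an edge with probability p."""
    return {u: [v for v in range(n) if v != u and rng.random() < p] for u in range(n)}


def average_path_length(adj):
    """Mean shortest-path length over ordered pairs (u, v) where v is reachable from u."""
    total, pairs = 0, 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:                      # breadth-first search from `source`
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1            # exclude the source itself
    return total / pairs if pairs else float("nan")


rng = random.Random(2016)                 # fixed seed so the sketch is repeatable
n, p, samples = 60, 0.05, 100             # toy size; 100 samples as in the footnote
edge_counts, path_lengths = [], []
for _ in range(samples):
    graph = random_digraph(n, p, rng)
    edge_counts.append(sum(len(targets) for targets in graph.values()))
    path_lengths.append(average_path_length(graph))

mean_edges = sum(edge_counts) / samples
mean_apl = sum(path_lengths) / samples
print(f"mean edges {mean_edges:.1f} (expected {n * (n - 1) * p:.1f}), mean path length {mean_apl:.3f}")
```

To compare against a real network, one plausible choice is to set n to its node count and p to its edge count divided by n(n - 1), so that the random graphs have the same expected density.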
Progress To-Date and Initial Results: In-degree Out-degree correlation
In regular citation networks there is no meaningful correlation between out-degree and in-degree, but in this network the two are strongly correlated:
R result for Pearson's product-moment correlation
X = In-Degree, Y = Out-Degree
t = 57.332, df = 2140, p-value < 2.2e-16
Alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval: 0.761 0.794
Sample estimate of correlation: 0.778
Figure: plot of X (In-Degree) against Y (Out-Degree); both axes run from 0 to 300.
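The reported test statistic follows from the correlation and the degrees of freedom through t = r * sqrt(df / (1 - r^2)). The sketch below computes Pearson's r for two degree sequences and then applies that formula to the reported r = 0.778 with n = df + 2 = 2142 nodes. The toy degree lists and function names are hypothetical; only r and df come from the R output.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic and degrees of freedom for testing correlation = 0."""
    df = n - 2
    return r * math.sqrt(df / (1 - r * r)), df


# Hypothetical in-degree and out-degree values for five nodes.
in_degree = [0, 1, 2, 3, 4]
out_degree = [1, 1, 3, 3, 5]
r_toy = pearson_r(in_degree, out_degree)

# Reported values: r = 0.778 on 2142 nodes (df = 2140).
t_reported, df_reported = t_statistic(0.778, 2142)
print(f"toy r = {r_toy:.3f}; t from reported r = {t_reported:.2f}, df = {df_reported}")
```

Using the rounded r = 0.778 gives t of about 57.3, which agrees with the reported 57.332 up to rounding of r.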
Progress To-Date and Initial Results: Centrality, Important Nodes
As presented intuitively by Borgatti in 2005, centrality measures describe the importance of nodes. Each measure depends on an implicit model of how traffic flows in the network.
We chose a standard measure (similar to PageRank), Eigenvector Centrality, as our key measure; it was initially proposed by Bonacich in 1972.
Later, in the community detection part, we use Betweenness centrality, as discussed by Borgatti in 2005, to label the communities.
Progress To-Date and Initial Results: Eigenvector Centrality Result
Table: Top ten Acts

BWN & WWN
Criminal Procedure Act 2011             1
Public Finance Act 1989                 0.86
Summary Proceedings Act 1957            0.84
State Sector Act 1988                   0.77
Companies Act 1993                      0.67
Local Government Act 2002               0.62
Privacy Act 1993                        0.62
Crimes Act 1961                         0.61
Regulations (Disallowance) Act 1989     0.55
Official Information Act 1982           0.54
Average EC                              0.05

BCN & WCN
Public Finance Act 1989                 1
Criminal Procedure Act 2011             0.94
Summary Proceedings Act 1957            0.93
State Sector Act 1988                   0.85
District Courts Act 1947                0.82
Judicature Act 1908                     0.74
Crimes Act 1961                         0.72
Privacy Act 1993                        0.69
Companies Act 1993                      0.67
Local Government Act 1974               0.65
Average EC                              0.05

BAN & WAN
Public Finance Act 1989                 1
State Sector Act 1988                   0.95
Companies Act 1993                      0.87
Summary Proceedings Act 1957            0.83
Local Government Act 2002               0.83
Criminal Procedure Act 2011             0.79
District Courts Act 1947                0.69
Official Information Act 1982           0.68
Education Act 1989                      0.63
Land Transfer Act 1952                  0.6
Average EC                              0.05
Eigenvector centrality values are normalised.
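A minimal sketch of eigenvector centrality by power iteration, scaled so that the top node scores 1 as in the table. It assumes that an Act gains importance from the Acts that reference it (scores flow along incoming edges) and that "normalised" means division by the maximum; neither detail is stated in the talk. The four-node graph and the function name are hypothetical, and plain power iteration is only guaranteed to settle on graphs like this toy one (strongly connected, with cycles of coprime lengths).

```python
def eigenvector_centrality(edges, iterations=200):
    """Power iteration: a node's score is the sum of the scores of the nodes citing it."""
    nodes = sorted({node for edge in edges for node in edge})
    score = {node: 1.0 for node in nodes}
    for _ in range(iterations):
        new = {node: 0.0 for node in nodes}
        for src, dst in edges:
            new[dst] += score[src]        # importance flows from citing to cited node
        top = max(new.values())
        score = {node: value / top for node, value in new.items()}  # max score = 1
    return score


# Hypothetical toy network: (citing, cited) pairs.
edges = [("a", "b"), ("b", "c"), ("c", "a"), ("a", "c"), ("d", "c"), ("c", "d")]
ec = eigenvector_centrality(edges)
for node, value in sorted(ec.items(), key=lambda item: -item[1]):
    print(node, round(value, 3))
```

In this toy graph node c is cited by three others and ends with score 1, while b, cited only by a, ends lowest.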
Progress To-Date and Initial Results: Initial Community Detection
Community detection algorithms provide a clearer picture of the network.
Network clustering is a specific type of data clustering problem which includes network measures as variables in the objective functions. The most famous network clustering models are spectral clustering and modularity clustering.
We use modularity clustering, solved with the Louvain algorithm.
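The objective that the Louvain algorithm greedily improves is the modularity Q of a partition. The sketch below evaluates Q for a given assignment of nodes to communities, using the standard undirected form Q = sum over communities of (internal edges / m) - (degree sum / 2m)^2. It does not implement Louvain itself, and whether the talk's results used this undirected form or a directed variant is not stated. The six-node graph and the names are hypothetical.

```python
def modularity(edges, community):
    """Newman modularity Q of an undirected graph for a node -> community mapping."""
    m = len(edges)
    internal = {}   # number of edges with both ends inside each community
    degree = {}     # sum of node degrees inside each community
    for u, v in edges:
        cu, cv = community[u], community[v]
        degree[cu] = degree.get(cu, 0) + 1
        degree[cv] = degree.get(cv, 0) + 1
        if cu == cv:
            internal[cu] = internal.get(cu, 0) + 1
    return sum(internal.get(c, 0) / m - (d / (2 * m)) ** 2 for c, d in degree.items())


# Hypothetical toy graph: two triangles joined by a single bridge edge.
edges = [(1, 2), (2, 3), (1, 3), (4, 5), (5, 6), (4, 6), (3, 4)]
two_groups = {1: "x", 2: "x", 3: "x", 4: "y", 5: "y", 6: "y"}
one_group = {node: "x" for node in range(1, 7)}

q_two = modularity(edges, two_groups)
q_one = modularity(edges, one_group)
print(f"Q(two triangles) = {q_two:.4f}, Q(single community) = {q_one:.4f}")
```

Splitting the toy graph into its two triangles scores higher than lumping everything together, which is the kind of improvement a modularity optimiser searches for.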
Progress To-Date and Initial Results: BCN clusters based on the Modularity method and Louvain algorithm
Figure: BCN clusters. Cluster labels:
Criminal Procedure Act 2011
Resource Management Act 1991
Public Finance Act 1989
Local Government Act 1974
Companies Act 1993
Judicature Act 1908
Decimal Currency Act 1964
Patents Act 1953
Remainder
Progress To-Date and Initial Results: WAN clusters based on the Modularity method and Louvain algorithm
Figure: WAN clusters. Cluster labels:
Summary Proceedings Act 1957
Decimal Currency Act 1964
Public Finance Act 1989
Reserve Bank of New Zealand Act 1989
Income Tax Act 2004
Official Information Act 1982
Local Government Act 1974
Criminal Procedure Act 2011
Employment Relations Act 2000
Public Works Act 1981
Government Superannuation Fund Act 1956
Remainder
Progress To-Date and Initial Results: Modelling Example
Hypothesis: Major Party in the government impacts the creation of important laws.
$H_0: C_{National} = C_{Labour}$
$H_1: C_{National} \neq C_{Labour}$
t-test result from R:
Parameter                   Value
t statistic                 −3.0250
df                          885
P-value                     0.0026
95% confidence interval     [−0.0308, −0.0007]
$\mu_{C_{National}}$        0.0430
$\mu_{C_{Labour}}$          0.0680
Result: H0 is rejected. On average, Labour governments produce more important legislation.
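A two-sample comparison of mean centrality can be sketched as below. The talk does not say which variant of the t-test was run in R, so this uses Welch's unequal-variance form as an assumption. The two lists of centrality values are invented for illustration and are not the data behind the reported result; the function name is hypothetical.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic and approximate degrees of freedom for mean(a) - mean(b)."""
    va, vb = variance(a) / len(a), variance(b) / len(b)   # squared standard errors
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Invented centrality scores of Acts passed under each government (illustration only).
national = [0.02, 0.03, 0.05, 0.04, 0.06]
labour = [0.05, 0.07, 0.06, 0.09, 0.08]

t_stat, dof = welch_t(national, labour)
print(f"t = {t_stat:.3f}, df = {dof:.1f}")
```

A negative t here means the first group (National in this toy data) has the lower mean, which is the same sign convention as the reported statistic.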
Progress To-Date and Initial Results: Academic outcomes
Presentation at INFORMS 2015, Philadelphia: Modelling of New Zealand Acts Network.
Poster presentation at INFORMS 2015, Philadelphia: Legislation Network.
Presentation at CMSS 2016, Auckland: Mathematical analysis of New Zealand Legislation Network.
Presentation at Pitch on the Plains 2015, Christchurch: Law Sense.
Journal paper under preparation for Cambridge Network Science, 85 percent completed.
Future Studies: Overview
Generative model, degree distribution, pattern studies.
Contribute to community detection.
Relation of network properties with political and legal processes.
Improving historical data and studying time evolution.
Comparative studies (Canada, Australia, etc.).
Studying death of nodes (repeals, replacements).
Link and attribute prediction.
Legislative drafting tool, identifying dependency between Acts.
Future Studies: Generative Models, Pattern Prediction
Following Clauset (2014), a network generative model:
Distinguishes the structure of the network from noise and random graphs.
Helps to describe the network succinctly and capture the most relevant patterns.
Helps us to generalise from one part of the network to another, from one network to others of the same type, from small scale to large scale, or from past to future.
Consider a graph G; a generative model can then be proposed as a probability distribution P(G | θ) with parameter θ.
The aim of the research is to find this probability distribution for the Legislation Network using Bayesian inference.
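To make P(G | θ) concrete, the sketch below uses the simplest possible generative model, a directed random graph G(n, p) with θ = p, and a Beta prior on p so that the posterior is available in closed form. This is only an illustration of the likelihood-plus-Bayesian-inference idea. It is not proposed as the model for the Legislation Network, whose clustering is far above the random baseline. The uniform Beta(1, 1) prior and the function names are assumptions; n and m are the BWN node and edge counts from the general measures table, treated here as distinct directed edges without self-loops.

```python
import math


def log_likelihood(n, m, p):
    """log P(G | p) for a simple directed graph with n nodes and m edges under G(n, p)."""
    pairs = n * (n - 1)                   # ordered pairs, no self-loops
    return m * math.log(p) + (pairs - m) * math.log(1 - p)


def posterior(n, m, a=1.0, b=1.0):
    """A Beta(a, b) prior on p gives a Beta(a + m, b + pairs - m) posterior."""
    pairs = n * (n - 1)
    return a + m, b + pairs - m


n, m = 3856, 33884                        # BWN nodes and edges
alpha, beta = posterior(n, m)
posterior_mean = alpha / (alpha + beta)
mle = m / (n * (n - 1))                   # maximum-likelihood estimate of p

print(f"MLE p = {mle:.6f}, posterior mean = {posterior_mean:.6f}")
print(f"log-likelihood at MLE = {log_likelihood(n, m, mle):.1f}")
```

Richer generative models replace the single parameter p with a structured θ, but the same two ingredients remain: a likelihood P(G | θ) and a prior over θ.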
Future Studies: Community Detection Method
Community detection is an optimisation problem:
An objective function captures the notion of community structure as groups of nodes with better internal connectivity than external. A quality measure needs to be defined.
An algorithmic technique assigns the nodes of the network to specific communities, optimising the objective function.
It is a computationally hard problem and needs heuristic methods.
It is well studied for undirected graphs, but a gap exists in the literature for directed networks. This research aims to contribute to the latest methods and algorithms for the directed graph clustering problem. The results can then be examined using the context of the documents.
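One existing quality measure for directed graphs is the directed generalisation of modularity, Q = (1/m) * sum over pairs (i, j) in the same community of [A_ij - k_i^out * k_j^in / m]. The sketch below evaluates it for a given partition. It is shown only as an example of what a quality measure looks like, not as the method this research will develop. The toy graph and the function name are hypothetical.

```python
def directed_modularity(edges, community):
    """Directed modularity: per community, internal/m - (out-degree * in-degree) / m^2."""
    m = len(edges)
    internal, out_deg, in_deg = {}, {}, {}
    for src, dst in edges:
        cs, cd = community[src], community[dst]
        out_deg[cs] = out_deg.get(cs, 0) + 1   # edge leaves a node of community cs
        in_deg[cd] = in_deg.get(cd, 0) + 1     # edge enters a node of community cd
        if cs == cd:
            internal[cs] = internal.get(cs, 0) + 1
    labels = set(out_deg) | set(in_deg)
    return sum(
        internal.get(c, 0) / m - out_deg.get(c, 0) * in_deg.get(c, 0) / m ** 2
        for c in labels
    )


# Hypothetical toy graph: two directed 3-cycles joined by one edge c -> d.
edges = [("a", "b"), ("b", "c"), ("c", "a"),
         ("d", "e"), ("e", "f"), ("f", "d"),
         ("c", "d")]
partition = {"a": 1, "b": 1, "c": 1, "d": 2, "e": 2, "f": 2}

q_directed = directed_modularity(edges, partition)
print(f"directed Q = {q_directed:.4f}")
```

Unlike the undirected form, the expected number of edges between two nodes here depends on the out-degree of the source and the in-degree of the target, so reversing an edge can change Q.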
Requirements and Limitations
Data
In the dataset, some historical documents are cited by other documents but are themselves missing from the set of XML files. This missing historical data is required to complete the time evolution and generative model studies.
Software
The software required for this research is:
R
MATLAB
Gephi
LaTeX
The time-table of PhD project completion
ID      Start     End       %     Name/Title
1       01/02/15  31/01/18  34.7  Mathematical Modelling and Analysis of Legislation Network
1.1     01/02/15  25/02/16  100   Provisional Year
1.1.1   01/02/15  14/05/15  100   Data Preparation
1.1.2   14/05/15  15/06/15  100   Build the Network
1.1.3   5/29/15   5/29/15   100   Network
1.1.4   15/07/15  15/11/15  100   Literature of general measures
1.1.5   16/07/15  31/07/15  100   Initial results
1.1.6   01/08/15  04/11/15  100   First talk (INFORMS 2015)
1.1.7   04/11/15  04/11/15  100   INFORMS
1.1.8   01/09/15  10/02/16  100   Thesis Proposal
1.1.9   01/01/16  20/02/16  100   CMSS Workshop preparation
1.1.10  01/01/16  25/02/16  100   Departmental Seminar
1.2     01/08/15  15/03/16  85    First Paper: network structure, general measures, initial modelings
1.2.1   01/09/15  30/09/15  100   Outline
1.2.2   01/08/15  31/08/15  100   Network Building
1.2.3   01/08/15  30/09/15  100   General Measures
1.2.4   01/08/15  29/02/16  80    Centrality
1.2.5   30/9/15   29/02/16  80    Community Detection
1.2.6   15/10/15  29/02/16  100   Initial Modelling
1.2.7   22/11/15  15/03/16  65    Writing
1.2.8   15/03/16  15/03/16  0     Submit the paper
1.3     15/03/16  20/10/16  11    Second Paper: develop community detection method for directed networks
1.3.1   15/03/16  31/03/16  0     Outline
1.3.2   15/03/16  15/08/16  25    Analysis and results
1.3.3   20/04/16  20/10/16  0     Writing
1.3.4   20/10/16  20/10/16  0     Submit the paper to the Cambridge Network Science Journal
1.4     01/08/16  01/03/17  5     Third Paper: generative models and prediction studies
1.4.1   01/08/16  31/08/16  0     Outline
1.4.2   01/09/16  28/02/17  10    Analysis and Result
1.4.3   15/09/16  01/03/17  0     Writing
1.4.4   01/03/17  01/03/17  0     Submit the paper to the Cambridge Network Science Journal
1.5     01/01/17  30/06/17  7     Fourth Paper: modelling and comparison studies
1.5.1   01/03/17  31/03/17  0     Outline
1.5.2   01/01/17  01/06/17  15    Analysis and Results
1.5.3   01/02/17  30/06/17  0     Writing
1.5.4   30/06/17  30/06/17  0     Submit the paper to the World Politics Journal
1.6     01/02/17  31/01/18  5     Thesis
1.6.1   01/02/17  31/01/18  5     Write the thesis
Conference talks: SUNBELT, JURIX, etc.
References
[1] A. Barabási. Emergence of Scaling in Random Networks. Science 286.5439 (Oct. 15, 1999), 509-512.
[2] Michael J. Bommarito, Daniel Katz, and Jon Zelner. Law as a seamless web: comparison of various network representations of the United States Supreme Court corpus (1791-2005). In: ACM Press, 2009, p. 234.
[3] Stephen P. Borgatti. Centrality and network flow. Social networks 27.1 (2005), 55-71.
[4] U. Brandes, D. Delling, M. Gaertler, R. Gorke, M. Hoefer, Z. Nikoloski, and D. Wagner. On Modularity Clustering. IEEE Transactions on Knowledge and Data Engineering 20.2 (Feb. 2008), 172-188.
[5] Aaron Clauset, Cristopher Moore, and Mark EJ Newman. Hierarchical structure and the prediction of missing links in networks. Nature 453.7191 (2008), 98-101.
[6] Duncan J. Watts and Steven H. Strogatz. Collective dynamics of 'small-world' networks (1998).
[7] P. Erdős and A. Rényi. On the strength of connectedness of a random graph. Acta Mathematica Academiae Scientiarum Hungarica 12.1 (Sept. 29, 2013), 261-267.
[8] Giorgio Fagiolo. Clustering in complex directed networks. Physical Review E 76.2 (2007), 026107.
[9] James H. Fowler, Timothy R. Johnson, James F. Spriggs, Sangick Jeon, and Paul J. Wahlbeck. Network Analysis and the Law: Measuring the Legal Importance of Precedents at the U.S. Supreme Court. Political Analysis (2007).
[10] Michel Grabisch. Social networks: Prestige, centrality, and influence (Invited paper). 2011.
[11] Marios Koniaris, Ioannis Anagnostopoulos, and Yannis Vassiliou. Network Analysis in the Legal Domain: A complex model for European Union legal sources. (2015).
[12] Elizabeth A. Leicht, Gavin Clarkson, Kerby Shedden, and Mark EJ Newman. Large-scale structure of time evolving citation networks. The European Physical Journal B 59.1 (2007), 75-83.
[13] Linyuan Lü and Tao Zhou. Link prediction in complex networks: A survey. Physica A: Statistical Mechanics and its Applications 390.6 (2011), 1150-1170.
[14] Pierre Mazzega, Danièle Bourcier, and Romain Boulet. The network of French legal codes. In: Proceedings of the 12th international conference on artificial intelligence and law. ACM, 2009, pp. 236-237.
[15] Mark EJ Newman. Modularity and community structure in networks. Proceedings of the National Academy of Sciences 103.23 (2006), 8577-8582.
[16] Mark EJ Newman. The structure and function of complex networks. SIAM review 45.2 (2003), 167-256.
[17] Benjamin M. Tabak, Marcelo Takami, Jadson M. C. Rocha, Daniel O. Cajueiro, and Sergio R. S. Souza. Directed clustering coefficient as a measure of systemic risk in complex banking networks. Physica A: Statistical Mechanics and its Applications 394 (Jan. 15, 2014), 211-216.
[18] Duncan J. Watts and Steven H. Strogatz. Collective dynamics of 'small-world' networks. Nature 393.6684 (June 4, 1998), 440.
[19] Paul Zhang and Lavanya Koppaka. Semantics-based Legal Citation Network. In: Proceedings of the 11th International Conference on Artificial Intelligence and Law.