The Open Government Data Heart Beat of Cities

Upload: mahdi-sanaei

Post on 01-Nov-2015



The Open Government Data Heart Beat of Cities

Karine Nahon, Alon Peled, Jennifer Shkabatur

University of Washington; Hebrew University; Interdisciplinary Center (IDC)

Abstract: This paper develops a unique index of the "Open Government Data Heart Beat of Cities," which consists of three theoretical archetypes that assess the commitment of cities to open government data (OGD). The three archetypes are: 1. "Way of Life," which reflects a high-level commitment to OGD. 2. "On the Fence," which represents either a low or an erratic commitment to OGD. 3. "Lip Service," which refers to either scarce or no commitment to OGD. These archetypes draw on four main dimensions: 1) Rhythm; 2) Span of Issues; 3) Disclosure; and 4) Feedback. We empirically examine this theoretical framework using longitudinal quantitative analysis of the OGD behavior of 16 different cities in the US, using a large novel corpus of municipal OGD metadata, as well as primary qualitative and secondary quantitative indicators. This methodology allows us to represent, for the first time, the evolving OGD commitment behavior (the "heart beat") of cities. Finally, we examine the impact of some socio-demographic and political factors on the OGD heart beat of cities in order to provide the basis for further research on the topic.

Keywords: open government data, cities, municipal, transparency, disclosure, access.

Acknowledgement: We acknowledge Alex Troitsky, who helped us with the statistical analysis of the data.

1. Introduction: Commitment of Cities to Open Government Data

Open government data (OGD) policies are often perceived as a remedy to governance problems such as corruption or poor service delivery, and as a powerful vehicle to spur innovation and economic development (Manyika et al. 2013; Janssen et al. 2012; Jetzek et al. 2013; Mossberger et al. 2013; Martin et al. 2013). Indeed, OGD policies have rapidly diffused across sectors, countries, and political regimes, and have become widely recognized as international norms of good governance.
Since the introduction of the first national OGD portal in the United States in May 2009, more than 70 countries have launched OGD initiatives (Davies 2013), and in 2013 the G8 countries issued an Open Data Charter, committing to release open data by default in all regulatory activities. Since mid-2013, both the EU and the US have adopted new guidelines that mandate agencies to release public sector information on their OGD portals free of charge and in downloadable format (Nahon and Peled 2015). Hundreds of cities around the world have followed suit and launched their own OGD portals.

While OGD is becoming an integral part of the activities that governmental authorities routinely perform, there is a dearth of measures to assess the OGD behavior or commitment of an agency or city, or to compare one agency or city to another. What are the components of the OGD behavior of a governmental authority? How can we assess the commitment of governmental entities toward the publication of OGD? Can we factor in the evolution of OGD behavior and commitment over time? Can we compare the OGD behavior and commitment of different governmental entities?

To address these questions while focusing on the municipal level of OGD, we develop the first-ever theoretical, quantitative, and longitudinal index that measures the OGD Heart Beat of cities over time. We developed this index using different metrics that measure the rhythm, span of issues, disclosure, and feedback dimensions of the OGD that cities upload to their municipal OGD portals. Based on this index, we propose three archetypes of municipal OGD commitment that we name OGD as a Way of Life, OGD on the Fence, and OGD Lip Service. To test our OGD Heart Beat model, we extracted information about the OGD uploads of sixteen US cities over a period of four years from a unique corpus of OGD metadata, generated using software developed by one of us.
This information enabled us to measure the daily, evolving OGD heart beat of our sixteen cities and to evaluate their actual day-to-day OGD behavior against our theoretical model. We also attempt to strengthen our theoretical work by creating and testing additional measurements, including one that codes the spectrum of municipal issues about which cities publish information and another that codes the legal strength of the municipal OGD policy. We also examine some socio-economic and political indicators and qualitative data in order to deepen our understanding of cities' OGD behavior. We believe that our theoretical model, including the three archetypes of municipal OGD behavior (i.e., way-of-life, on-the-fence, and lip-service), holds well against the empirical data and contributes to the theoretical understanding of the implementation of OGD policies. In future research we plan to include many more cities worldwide and to add measurements that strengthen the validity of our findings. Still, the novelty of this paper lies in its pioneering attempt to lay the foundation for a new theory that describes and explains how and why distinct cities adopt different OGD behavior.

2. Literature Review

The OGD phenomenon has drawn considerable scholarly attention in both developed and developing countries. Scholars have studied barriers and opportunities for the introduction of OGD initiatives (Janssen et al. 2012; Agrawal et al. 2014; Nahon & Peled 2015) and examined triggers that lead to the emergence of OGD policies in specific countries (Davies 2014; Davies 2013; Peled 2013; Davies et al. 2013). Studies have also explored the political consequences of OGD policies and the resulting power shifts (Bates 2012), the legal design of national OGD policies and agencies' behavior vis-à-vis OGD mandates (Peled 2013; Shkabatur 2012; Worthy 2013), and the central role of OGD intermediaries (Roberts 2014; Shkabatur 2014).
As OGD portals have mushroomed around the world, scholarly attention has turned to identifying the emerging political, social, and economic impacts of OGD, discovering largely mixed results (Davies et al. 2013; Peled 2013; Worthy 2014; Manyika et al. 2013).

Despite the abundance of national-level studies of OGD, an overarching analytic framework for local-level OGD has not yet been developed. While local e-government has been the subject of a vast body of literature (e.g., Ho 2002; Norris & Moon 2005; Pina, Torres, and Royo 2010; Tolbert et al. 2008; Scott 2006; Mossberger et al. 2012), this literature typically delineates general modalities of online service provision and assesses cities' performance, but does not offer a targeted analysis of OGD. The burgeoning literature on smart cities, that is, cities that employ information and communication technologies (ICT) to develop a citizen-centered system of service provision and spur local innovation and co-creation (Schaffers et al. 2011; Alawadhi et al. 2012; Townsend 2013; Goldsmith 2014), typically lacks in-depth analyses of city-level OGD policies, strategies, and practices. Specific case studies of OGD in selected cities have recently been developed (e.g., Gurstein 2012; Canares et al. 2014; Fumega 2014), but these typically do not offer a comparative investigation of municipal OGD practices.

A distinct and structured local perspective on OGD is, however, of preeminent importance. First, cities are central actors in any OGD endeavor. By virtue of their responsibility for critical government services and their unmediated contact with citizens, cities typically possess a wealth of data that is unavailable on national OGD portals (Evans & Campos 2013) but that can be valuable for political, social, and economic development purposes.
This puts cities under pressure from both national authorities and residents to enhance transparency and release OGD, at times as part of a larger decentralization reform (Davies & Lithwick 2013; Local Government Association 2012). Accordingly, hundreds of cities around the world, and dozens of cities in the United States alone, have launched OGD portals in one way or another. However, a theoretical understanding of the current practices and potential of municipal OGD is still to be developed (Davies & Bawa 2012).

Second, municipal OGD requires a different and more nuanced theoretical treatment than national OGD. On the rhetorical level, municipal OGD initiatives are more focused on improved service delivery than on government accountability, which is often at the core of national OGD declarations (Yu & Robinson 2012). The political and socio-economic diversity of local governments complicates the comparative task, which requires carefully weighing a plethora of factors that affect the OGD capacity and potential of cities. At the same time, the simultaneous emergence of thousands of municipal OGD web portals worldwide represents a concrete and exciting opportunity to collect and analyze data about these portals and to explain why some cities do better than others in the OGD domain.

3. Theoretical Framework: OGD Heart Beat of Cities

We propose to fill the theoretical gap about OGD at the municipal level by providing an innovative model to assess the evolving OGD heart beat of cities, that is, their evolving day-by-day OGD behavior and commitment. The OGD heart beat model consists of three theoretical archetypes: 1. Way of Life, reflecting a high-level commitment to OGD; 2. On the Fence, representing either a low or an erratic commitment to OGD; and 3. Lip Service, referring to either scarce or no commitment to OGD.
These archetypes are constructed out of four theoretical dimensions, which represent the main components of the information supply behavior of a city: 1) Rhythm: the city's rhythm of uploading OGD datasets (measured by the number of assets that the city has uploaded since the inception of its OGD initiative and the regularity of these uploads, taking into account the time factor); 2) Span of Issues: the extent to which the OGD disclosed by the city encompasses a variety of aspects in the life of its residents; 3) Disclosure: the provision of metadata keywords and categories to identify and define each of the disclosed information assets; 4) Feedback: the inclusion of contact details of the unit or person responsible for the disclosed information asset or for queries about it.

The OGD heart beat of cities is meant to identify the daily OGD behavior of cities and their evolving commitment to release meaningful OGD, thus revealing the trajectory the city takes regarding its OGD initiative. It also provides the basis for a conversation about the future interventions that need to be taken in order to reach a high-level commitment to OGD. Further, while we employ this model to assess the evolving, longitudinal behavior of cities with regard to their OGD initiatives, it can be adapted to analyze any information supply intervention by government agencies. Table 1 provides an ideal depiction of each of the theoretical archetypes along the four suggested dimensions.

Table 1: Ideal Archetypes of OGD Cities (OGD Heart Beat Archetypes)

| Dimension | Way of Life | On the Fence | Lip Service |
|---|---|---|---|
| Rhythm | The city regularly releases a significant volume of OGD information. | The city provides erratic OGD information, or a consistent but low volume over time. | The city provides scarce or no OGD. |
| Span of Issues | The city covers most or all of the spectrum of municipal life aspects in the OGD it discloses. | The city covers a partial spectrum of municipal life topics in the OGD it discloses. | The city focuses on a small number of topics in the spectrum of municipal life in the OGD it discloses. |
| Feedback | The city consistently provides contact details of the unit or person responsible for the disclosed information or for queries about the information asset. | The city sporadically provides contact details of the unit or person responsible for the disclosed information or for queries about the information asset. | The city scarcely provides contact details of the unit or person responsible for the disclosed information or for queries about the information asset. |
| Disclosure | The city consistently provides keyword and category metadata to describe its OGD. | The city sporadically provides keyword and category metadata to describe its OGD. | The city scarcely provides keyword and category metadata to describe its OGD. |
| The Heart Beat of OGD | High-level commitment to OGD. | Low or erratic commitment to OGD. | Scarce or no commitment to OGD. |

A few remarks are worth making. One, these are ideal archetypes. A city can in principle exhibit high commitment to OGD on one dimension (e.g., span of issues) and low commitment on another (e.g., rhythm). Two, the position of a city on the scale of each of the four dimensions is determined in relative terms, compared to other cities. Therefore, the archetype that represents a city is determined not by an absolute number but by a relative one. Three, cities may migrate over time from one archetype to another, based on their evolving OGD behavior and commitment. The remainder of this paper relies on longitudinal quantitative and qualitative data to assess the OGD heart beat of sixteen US cities.

4. Method

To the best of our knowledge, our paper is the first-ever theoretical, quantitative, longitudinal, and comparative study that examines the OGD commitment behavior of 16 cities in the US over time. We use three types of data: 1) A large corpus of metadata about the OGD uploaded by the cities. 2) Primary qualitative data, which we coded to represent the legal mechanism of OGD in each of the cities (e.g., policy announcement, executive ordinance or resolution, state law, etc.) and the spectrum of topics that covers various aspects of life in a city. 3) Secondary data, which is used to measure the impact of socio-demographic and political indicators on the OGD heart beat of cities.

The metadata corpus is built using the Public Sector Information Exchange (PSIE) software that one of us developed. The software crawls an OGD portal and performs an initial indexing of all the information assets published by the city. Thereafter, the software returns to the portal once a week to check whether it can find new information assets or glean new metadata about information assets that were previously indexed.
The most important and finest-grained information in our corpus is the rich metadata descriptions that cities publish along with the data on their OGD portals. To the best of our knowledge, no other central repository exists today for scholars studying the governmental release of datasets on individual OGD portals. This research technique can also be applied to other levels of government (e.g., state, federal, international).

We relied on this software to extract the metadata that the cities in our sample assigned to each of the OGD information assets they released. The corpus contains 5,006 OGD information assets uploaded by the 16 cities during the last four years (see Table 2). The cities chosen for the sample are those that were featured on the federal US OGD portal (www.data.gov) and that have OGD portals in JSON, CKAN, or Socrata standards.

Table 2: Sixteen USA Cities and Their OGD Uploads (2011-2014)

| City | State | OGD Beginning | Last OGD Upload | Total OGD Assets |
|---|---|---|---|---|
| Austin | Texas | Oct 2012 | Sep 2014 | 289 |
| Baltimore | Maryland | Nov 2013 | Sep 2014 | 316 |
| Boston | Massachusetts | Oct 2012 | Sep 2014 | 317 |
| Burlington | Vermont | Dec 2013 | Sep 2014 | 33 |
| Chicago | Illinois | Oct 2011 | Sep 2014 | 523 |
| Honolulu | Hawaii | Nov 2012 | Sep 2014 | 68 |
| Kansas City | Missouri | Oct 2012 | Sep 2014 | 2792 |
| Las Vegas | Nevada | Nov 2013 | Jul 2014 | 26 |
| Los Angeles | California | Oct 2013 | Sep 2014 | 55 |
| Madison | Wisconsin | Jan 2013 | Sep 2014 | 48 |
| New Orleans | Louisiana | Oct 2011 | Sep 2014 | 88 |
| Santa Cruz | California | Nov 2012 | May 2014 | 52 |
| Seattle | Washington | Jan 2011 | Sep 2014 | 313 |
| Somerville | Massachusetts | Jun 2012 | Sep 2014 | 16 |
| South Bend | Indiana | Oct 2013 | Sep 2014 | 50 |
| Wellington | Florida | Oct 2012 | Oct 2014 | 20 |
| Total | | | | 5006 |

Figure 1 represents the geographic dispersion of the 16 cities:

Figure 1: The Location of Sixteen OGD Cities

There are two types of analyses done in this paper: 1. Measuring and testing the theoretical framework of the OGD heart beat of cities, as proposed in section 3; 2. Analyzing through regression socio-demographic and political independent indicators that may impact the OGD heart beat.

4.1 A Compound Index Measuring the Heart Beat of OGD

The analysis of the OGD Heart Beat theoretical framework consists of operationalizing the four theoretical dimensions (rhythm, span, disclosure, and feedback) that we discussed in Section 3. One variable, distinct metacategories, which represents the spectrum of municipal issues that a city can address, was created using content analysis of the metadata of the released information assets (see below for additional analysis of metacategories). Table 3 presents the full list of variables, measurement items, and weights that compose the OGD Heart Beat index. The variables were accumulated on a daily basis and represent a longitudinal analysis. We created a composite index variable to measure the daily OGD heart beat of a city based on our four analytical categories (rhythm, span of issues, feedback, and disclosure) [1]. There was no collinearity in our sample. However, we believe that our analysis below must be tested against a larger sample (currently N=895, which refers to the cumulative number of days on which one of our sixteen cities released at least one OGD asset).

[1] Cronbach's alpha for our compound heart beat index is 0.8270, which is considered a good value.
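As an illustration of the reliability statistic reported for the compound index, the following is a minimal sketch of the standard Cronbach's alpha formula. The function name `cronbach_alpha` and the toy component scores are assumptions made for this example; the paper does not state how its value was computed or which components entered the calculation.

```python
from statistics import variance


def cronbach_alpha(items):
    """Standard Cronbach's alpha for k components observed on the same n rows.

    `items` is a list of k equal-length lists (one list per index component).
    The layout is an assumption for illustration; the paper's data are not used.
    """
    k = len(items)
    if k < 2:
        raise ValueError("need at least two components")
    n = len(items[0])
    if n < 2 or any(len(col) != n for col in items):
        raise ValueError("components must have the same length (at least 2)")
    # Row totals: the sum of all components for each observation.
    totals = [sum(col[i] for col in items) for i in range(n)]
    sum_item_var = sum(variance(col) for col in items)
    total_var = variance(totals)
    return k / (k - 1) * (1 - sum_item_var / total_var)


# Hypothetical daily scores for three components (not from the corpus).
toy = [[0.1, 0.4, 0.5, 0.9], [0.2, 0.3, 0.6, 0.8], [0.1, 0.5, 0.4, 0.9]]
print(round(cronbach_alpha(toy), 3))
```

Values close to 1 indicate that the components move together, which is the sense in which a value such as 0.8270 is usually read as good internal consistency.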

Table 3: Operationalization of the OGD Heart Beat

| Analytical Category | Operationalization | Description |
|---|---|---|
| Disclosure (10%) | 1. Assets without category and without keywords; 2. Assets with category and without keywords; 3. Assets with keywords and without category; 4. Assets with category and with keywords | The provision of metadata keywords and categories to identify and define each of the disclosed information assets. We assume that a city committed to OGD will consistently assign descriptive keywords and categories to its information assets. |
| Span of Issues (40%) | | The extent to which the OGD disclosed by the city encompasses a variety of aspects in the life of its residents. |
| | 1. Distinct Metacategories (70%) [2] | A measurement created by content analysis, which measures the variety of areas covered by the OGD. |
| | 2. Distinct Categories (20%) | An automatically extracted metadata measure, which examines the variety of categories used by the city to describe its OGD assets. |
| | 3. Distinct Keywords (10%) | An automatically extracted metadata measure, which examines the variety of keywords used by the city to describe its OGD assets. |
| Feedback (10%) | | The inclusion of contact details of the unit or person responsible for the disclosed information asset or for queries about it. We assume that a city committed to OGD will consistently provide this information. |
| | 1. Assets with Feedback (50%) | Did the OGD asset provide contact details? |
| | 2. Assets with distinct feedback (50%) | The distinct number of contact people or units. |
| Rhythm (40%) | | The city's rhythm of uploading OGD datasets. We assume that a city committed to OGD regularly releases a significant volume of OGD. |
| | 1. Normalized daily upload periods ratio (30%) | The normalized number of periods where, in each day, at least one OGD asset was uploaded. |
| | 2. Normalized monthly periods ratio (30%) | The normalized number of monthly periods where, in each month, at least one OGD asset was uploaded. |
| | 3. Current uploaded assets (20%) | The total number of assets uploaded on the current day. |
| | 4. Accumulated uploaded assets (10%) | The total number of assets uploaded, including all assets uploaded on previous days. |
| | 5. Daily upload periods (5%) | The number of daily periods where, in each day, at least one OGD asset was uploaded. |
| | 6. Monthly upload periods (5%) | The number of monthly periods where, in each month, at least one OGD asset was uploaded. |

[2] These fifteen metacategories are: Animals, Community and leisure, Demographics, Education, Environment, Financial regulation, Health, Land regulation, Legal and political system, Municipal services general, Open government, Private sector regulation, Public safety, Traffic, and Transportation.
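The weighting scheme of Table 3 can be sketched as a two-level weighted sum. The weights below are copied from Table 3; everything else is an assumption made for illustration: the names `CATEGORY_WEIGHTS`, `SUB_WEIGHTS`, and `heart_beat` are hypothetical, every sub-score is assumed to be already normalized to the range 0 to 1, and Disclosure is collapsed into a single score because Table 3 lists its four asset states without weights. The paper does not spell out its exact aggregation formula, so this is one plausible reading rather than the authors' implementation.

```python
# Category weights and sub-item weights as listed in Table 3.
CATEGORY_WEIGHTS = {"rhythm": 0.40, "span": 0.40, "feedback": 0.10, "disclosure": 0.10}

SUB_WEIGHTS = {
    "rhythm": {
        "norm_daily_ratio": 0.30,
        "norm_monthly_ratio": 0.30,
        "current_assets": 0.20,
        "accumulated_assets": 0.10,
        "daily_periods": 0.05,
        "monthly_periods": 0.05,
    },
    "span": {"metacategories": 0.70, "categories": 0.20, "keywords": 0.10},
    "feedback": {"assets_with_feedback": 0.50, "distinct_feedback": 0.50},
    # Assumption: Table 3 gives no weights for the four disclosure states,
    # so a single pre-computed disclosure score is used here.
    "disclosure": {"disclosure_score": 1.0},
}


def heart_beat(scores):
    """Weighted sum of sub-scores within each category, then across categories.

    `scores` maps category -> {sub-item name -> value in [0, 1]} (assumed layout).
    """
    total = 0.0
    for category, category_weight in CATEGORY_WEIGHTS.items():
        sub_total = sum(
            weight * scores[category][name]
            for name, weight in SUB_WEIGHTS[category].items()
        )
        total += category_weight * sub_total
    return total


# Hypothetical city-day: perfect on everything except a half-strength rhythm.
example = {cat: {name: 1.0 for name in subs} for cat, subs in SUB_WEIGHTS.items()}
example["rhythm"] = {name: 0.5 for name in SUB_WEIGHTS["rhythm"]}
print(round(heart_beat(example), 3))
```

Because Rhythm and Span of Issues each carry 40% of the index, a city that uploads erratically or covers few topics cannot reach a high score through good metadata and contact details alone.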

4.2 A Regression to Measure What Impacts the Level of the Heart Beat of OGD

After we tested the theoretical framework, we ran a regression examining how different socio-demographic and political indicators impact the Heart Beat of OGD (acting now as the dependent variable). The independent variables were: 1. The average income per household in the city; 2. The poverty rate in the city; 3. The percentage of political support for the Democratic or Republican party; 4. The legal mechanisms of OGD in the city (a qualitative historical coding of five categories, ranging from no designated open data policy to a concrete mandate to municipal departments to release data and the monitoring of open data plans in the city); 5. The percentage of elderly population (above 65); and 6. The percentage of white and non-white population in the city. The goal of this analysis is to develop an initial understanding of what might explain the OGD behavior of cities.

5. Results

Figure 2 presents the empirical results for the four dimensions and the archetype into which each city falls. The results show a clear distinction between the three archetypes for each of the dimensions (Rhythm, Span of Issues, Feedback, and Disclosure), and strengthen the theoretical argument for the existence of these archetypes. As we explained in Section 3, the Span of Issues dimension is based mainly on a manual qualitative coding of the OGD assets. This variable represents the topics covered by each city as part of its OGD. Figure 3 displays the distribution of the 5,006 OGD assets into fifteen distinct metacategories, which were assigned to each of the assets through content analysis.

Figure 2: Testing the Four Dimensions which Constitute the Heart Beat of OGD (panels: Rhythm, Span of Issues, Feedback, Disclosure)

Figure 3: Distinct Metacategory Distribution

The OGD Heart Beat of cities is a compound index. It reflects both the current behavior and the commitment of cities regarding OGD. The first (the OGD Heart Beat behavior) is a static picture, which relies on the last data entry point of a city and is presented in Figure 4. It is more important, however, to understand the commitment of a city to OGD, the Heart Beat of OGD itself, as it represents the trajectory that the city takes with regard to OGD (see Table 4).

Figure 4: OGD Heart Beat Stance

The coefficients in Table 4 represent the best-fitting slope of the regression for each city's improvement per day [3]. The slope by date shows the actual trajectory of every city (note that the slopes for Santa Cruz and Wellington are not statistically significant, since these cities behave erratically). The 95% interval represents the margin of error of the slope and the range of trajectories the city can take regarding OGD.

[3] The maximum OGD heart beat in our sample is 0.945, with a mean of 0.351 and a standard deviation of 0.130.

Table 4: The OGD Heart Beat Trajectory [4]

| City | Slope by date | P | Interval 95% (lower) | Interval 95% (upper) |
|---|---|---|---|---|
| Austin | 0.0002011 | 0.000 | 0.000153 | 0.0002492 |
| Boston | 0.0003097 | 0.000 | 0.0002345 | 0.0003849 |
| Santa Cruz | 0.0001101 | 0.116 (not significant) | -0.0000273 | 0.0002476 |
| Somerville | 0.0001515 | 0.000 | 0.0000933 | 0.0002097 |
| Baltimore | 0.0009622 | 0.000 | 0.0007721 | 0.0011524 |
| Burlington | 0.0006927 | 0.000 | 0.0005201 | 0.0008653 |
| Chicago | 0.0001333 | 0.000 | 0.0001274 | 0.0001392 |
| Honolulu | 0.000088 | 0.000 | 0.0000534 | 0.0001226 |
| Kansas City | 0.0008385 | 0.000 | 0.0007922 | 0.0008849 |
| Las Vegas | 0.0007573 | 0.000 | 0.0005597 | 0.0009549 |
| Los Angeles | 0.0005417 | 0.000 | 0.0004427 | 0.0006407 |
| Madison | 0.0002978 | 0.000 | 0.0002713 | 0.0003242 |
| New Orleans | 0.0001081 | 0.000 | 0.0000883 | 0.0001279 |
| Seattle | 0.000107 | 0.000 | 0.0000751 | 0.000139 |
| South Bend | 0.0004587 | 0.000 | 0.000391 | 0.0005265 |
| Wellington | 0.0000763 | 0.066 (not significant) | -4.91e-06 | 0.0001576 |
| N | 895 | - | - | |
| R-square | 0.9371 | - | - | |

[4] To support this table, we created a statistical model that contains 16 variables for our 16 cities. Each of the 895 per-city, per-day data rows received a dummy value of 1 for the city that uploaded it and 0 for the other cities. Next, we computed the OGD heart beat of the city by multiplying the dummy values by the overall OGD behavior of all sixteen cities (see Table 3). We repeated this process sixteen times to register the OGD Heart Beat of our sixteen cities. For two cities with low numbers of daily uploads.
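To make the columns of Table 4 concrete, the following is a minimal sketch of how a slope by date and its 95% interval can be obtained for a single city with ordinary least squares. It is a simplification and not the authors' model: the paper fits one pooled regression with sixteen city dummy variables, whereas this sketch fits one city's series on its own and uses a normal approximation for the interval. The function name `slope_with_interval` and the toy series are assumptions for illustration.

```python
from statistics import NormalDist, mean


def slope_with_interval(days, index, level=0.95):
    """OLS slope of `index` on `days`, with a normal-approximation interval.

    `days` are day offsets since a city's first upload and `index` the
    heart beat value on each of those days (hypothetical data layout).
    """
    n = len(days)
    if n < 3 or n != len(index):
        raise ValueError("need at least three paired observations")
    x_bar, y_bar = mean(days), mean(index)
    sxx = sum((x - x_bar) ** 2 for x in days)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(days, index))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # Residual sum of squares and the standard error of the slope.
    rss = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(days, index))
    std_err = (rss / (n - 2) / sxx) ** 0.5
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return slope, slope - z * std_err, slope + z * std_err


# Hypothetical city: index drifting upward by roughly 0.0002 per day.
toy_days = [0, 30, 60, 90, 120, 150]
toy_index = [0.300, 0.307, 0.311, 0.319, 0.323, 0.331]
slope, lower, upper = slope_with_interval(toy_days, toy_index)
print(slope, lower, upper)
```

An interval that excludes zero corresponds to the significant rows of Table 4, while an interval that straddles zero, as for Santa Cruz and Wellington, means the data do not establish a rising or falling trajectory.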

Finally, Table 5 below summarizes the main findings for the regression phase, which explores some common socio-demographic and political independent variables against our compound OGD heart beat index.

Table 5: Explaining the OGD Behavior of Sixteen USA Cities

| Variable\prediction | Coefficients | P |
|---|---|---|
| Date (in days) | .0002323 | P |