![Page 1: Lecy ∙ Data-Driven DM Lecture 09 Working with Maps Files and Census Data](https://reader035.vdocuments.us/reader035/viewer/2022062315/5697bfa31a28abf838c96c0a/html5/thumbnails/1.jpg)
Lecy ∙ Data-Driven DM
Lecture 09: Working with Maps Files and Census Data
![Page 2: Lecy ∙ Data-Driven DM Lecture 09 Working with Maps Files and Census Data](https://reader035.vdocuments.us/reader035/viewer/2022062315/5697bfa31a28abf838c96c0a/html5/thumbnails/2.jpg)
census data
![Page 3: Lecy ∙ Data-Driven DM Lecture 09 Working with Maps Files and Census Data](https://reader035.vdocuments.us/reader035/viewer/2022062315/5697bfa31a28abf838c96c0a/html5/thumbnails/3.jpg)
Census Topics by Table
From: http://censusreporter.org/topics/
Each link below brings you to a page describing the concepts and tables covered by the Census and American Community Survey on that particular topic.
Getting Started: The Census is a big subject and there’s a lot to learn, but you don’t have to learn it all at once. Here’s some help knowing the lay of the land.
Table Codes: While Census Reporter hopes to save you from the details, you may be interested to understand some of the rationale behind American Community Survey table identifiers.
Age and Sex: How the Census approaches the topics of age and sex.
Children: Tables concerning Children. Helpful to consider in relation to Families.
Commute: Commute data from the American Community Survey.
Employment: While the ACS is not always the best source for employment data, it provides interesting information for small geographies that other sources don’t cover.
Families: Families are an important topic in the ACS and a key framework for considering many kinds of data.
![Page 4: Lecy ∙ Data-Driven DM Lecture 09 Working with Maps Files and Census Data](https://reader035.vdocuments.us/reader035/viewer/2022062315/5697bfa31a28abf838c96c0a/html5/thumbnails/4.jpg)
Geography: Geography is fundamental to the Census Bureau’s process of tabulating data. Here are the key concepts you need to understand.
Health Insurance: The ACS has a number of questions that deal with health insurance and many corresponding tables.
Income: How the Census approaches the topic of income.
Migration: How the Census deals with migration data.
Poverty: Poverty data and how it is used within the ACS.
Public Assistance: Public assistance data from the ACS.
Race and Hispanic Origin: Race is a complex issue, and no less so with Census data. A large proportion of Census tables are broken down by race.
Same-Sex Couples: Although Census does not ask about them directly, there are a number of ways to get at data about same-sex couples using ACS data.
Seniors: In addition to basic Census data about age, there are a small number of Census tables which focus directly on data about older Americans, and on grandparents as caregivers.
Veterans and Military: Data collected about past and present members of the U.S. Armed Forces.
![Page 5: Lecy ∙ Data-Driven DM Lecture 09 Working with Maps Files and Census Data](https://reader035.vdocuments.us/reader035/viewer/2022062315/5697bfa31a28abf838c96c0a/html5/thumbnails/5.jpg)
American Community Survey
The American Community Survey (ACS) is an ongoing statistical survey by the U.S. Census Bureau, sent to approximately 250,000 addresses monthly (or 3 million per year). It regularly gathers information previously contained only in the long form of the decennial census. It is the largest survey other than the decennial census that the Census Bureau administers.
Note: What is the difference between the 1-year, 3-year, and 5-year estimates?
![Page 6: Lecy ∙ Data-Driven DM Lecture 09 Working with Maps Files and Census Data](https://reader035.vdocuments.us/reader035/viewer/2022062315/5697bfa31a28abf838c96c0a/html5/thumbnails/6.jpg)
http://factfinder.census.gov/faces/nav/jsf/pages/index.xhtml
![Page 7: Lecy ∙ Data-Driven DM Lecture 09 Working with Maps Files and Census Data](https://reader035.vdocuments.us/reader035/viewer/2022062315/5697bfa31a28abf838c96c0a/html5/thumbnails/7.jpg)
http://factfinder.census.gov/faces/nav/jsf/pages/index.xhtml
STEP 01: GO TO THE “AMERICAN FACTFINDER” HOMEPAGE
Click on “Advanced Search”
![Page 8: Lecy ∙ Data-Driven DM Lecture 09 Working with Maps Files and Census Data](https://reader035.vdocuments.us/reader035/viewer/2022062315/5697bfa31a28abf838c96c0a/html5/thumbnails/8.jpg)
STEP 02: SEARCH BY TABLE NAME
Enter your topic or table name.
For this lab we will use poverty measures contained in the table: S1701
![Page 9: Lecy ∙ Data-Driven DM Lecture 09 Working with Maps Files and Census Data](https://reader035.vdocuments.us/reader035/viewer/2022062315/5697bfa31a28abf838c96c0a/html5/thumbnails/9.jpg)
STEP 03: SELECT YOUR UNIT OF ANALYSIS (GEOGRAPHY)
Click on “Geographies” and select “Census Tract”
![Page 10: Lecy ∙ Data-Driven DM Lecture 09 Working with Maps Files and Census Data](https://reader035.vdocuments.us/reader035/viewer/2022062315/5697bfa31a28abf838c96c0a/html5/thumbnails/10.jpg)
STEP 03: SELECT YOUR UNIT OF ANALYSIS (GEOGRAPHY)
Follow the prompts to select your state, then select “all counties”
![Page 11: Lecy ∙ Data-Driven DM Lecture 09 Working with Maps Files and Census Data](https://reader035.vdocuments.us/reader035/viewer/2022062315/5697bfa31a28abf838c96c0a/html5/thumbnails/11.jpg)
STEP 04: MATCH DATA YEAR TO SHAPEFILES YEAR
Select the version of the data that matches your shapefiles.
Recall that census tract boundaries may change over time, so matching data to shapefiles ensures accuracy.
![Page 12: Lecy ∙ Data-Driven DM Lecture 09 Working with Maps Files and Census Data](https://reader035.vdocuments.us/reader035/viewer/2022062315/5697bfa31a28abf838c96c0a/html5/thumbnails/12.jpg)
STEP 05: DOWNLOAD IN CSV FORMAT
![Page 13: Lecy ∙ Data-Driven DM Lecture 09 Working with Maps Files and Census Data](https://reader035.vdocuments.us/reader035/viewer/2022062315/5697bfa31a28abf838c96c0a/html5/thumbnails/13.jpg)
STEP 05: DOWNLOAD IN CSV FORMAT
CONTENTS
Text File: contains information on the table that you downloaded (ignore for now).
Metadata File: contains variable names and definitions.
With Ann File: your actual data, with annotations. The first line contains the variable names. This is the largest file.
Readme File: disclaimers (ignore for now).
Read the With Ann file into R.
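The lab reads the downloaded data file into R. As a rough illustration of the same step, here is a minimal sketch in Python (an assumption of this example, not the course's tool). The column names and values are made up for illustration and are not the real S1701 headers. The main point is that geographic IDs should stay as text so leading zeros are not lost.

```python
import csv
import io

# Hypothetical stand-in for the downloaded data file.
# Column names and values are illustrative only, not real Census output.
sample = io.StringIO(
    "geo_id,tract_name,pop_total,pop_below_poverty\n"
    "06037101110,Tract A,6600,1320\n"
    "42003000102,Tract B,3200,160\n"
)

def read_tracts(handle):
    """Read tract rows, keeping the geo ID as text and converting counts to integers."""
    rows = []
    for row in csv.DictReader(handle):
        rows.append({
            "geo_id": row["geo_id"],  # stays a string, so the leading zero survives
            "tract_name": row["tract_name"],
            "pop_total": int(row["pop_total"]),
            "pop_below_poverty": int(row["pop_below_poverty"]),
        })
    return rows

tracts = read_tracts(sample)
print(tracts[0]["geo_id"])
```

With a real download, the `io.StringIO` object would be replaced by a file opened with `open(path, newline="")`, and the column names would need to match the headers in that file.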
![Page 14: Lecy ∙ Data-Driven DM Lecture 09 Working with Maps Files and Census Data](https://reader035.vdocuments.us/reader035/viewer/2022062315/5697bfa31a28abf838c96c0a/html5/thumbnails/14.jpg)
geographic units
![Page 15: Lecy ∙ Data-Driven DM Lecture 09 Working with Maps Files and Census Data](https://reader035.vdocuments.us/reader035/viewer/2022062315/5697bfa31a28abf838c96c0a/html5/thumbnails/15.jpg)
In the U.S., Census tracts are "Designed to be relatively homogeneous units with respect to population characteristics, economic status, and living conditions, census tracts average about 4,000 inhabitants."
Why are the tracts different sizes?
![Page 16: Lecy ∙ Data-Driven DM Lecture 09 Working with Maps Files and Census Data](https://reader035.vdocuments.us/reader035/viewer/2022062315/5697bfa31a28abf838c96c0a/html5/thumbnails/16.jpg)
Nested Administrative Units
• States
• Counties
• Census Tracts
• Census Blocks
The census tract is usually the most granular level of data provided by the Census.
County
Census Tract
Block Group
Block
![Page 17: Lecy ∙ Data-Driven DM Lecture 09 Working with Maps Files and Census Data](https://reader035.vdocuments.us/reader035/viewer/2022062315/5697bfa31a28abf838c96c0a/html5/thumbnails/17.jpg)
Example: Household Income by Census Tract
![Page 18: Lecy ∙ Data-Driven DM Lecture 09 Working with Maps Files and Census Data](https://reader035.vdocuments.us/reader035/viewer/2022062315/5697bfa31a28abf838c96c0a/html5/thumbnails/18.jpg)
FIPS Code Structure
11-digit Geo ID Code: 42003000102
State-County-Census Tract: 42-003-0001.02
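The structure above (2 digits for state, 3 for county, 6 for tract) can be sketched as a small Python helper. The function name `split_geoid` is hypothetical, and Python is used here only for illustration.

```python
def split_geoid(geoid):
    """Split an 11-digit tract Geo ID into state, county, and tract parts."""
    if len(geoid) != 11 or not geoid.isdigit():
        raise ValueError("expected an 11-digit string, got %r" % geoid)
    state, county, tract = geoid[:2], geoid[2:5], geoid[5:]
    return {
        "state": state,
        "county": county,
        "tract": tract,
        # Tract codes are usually displayed with a decimal point before the last two digits.
        "tract_label": tract[:4] + "." + tract[4:],
    }

parts = split_geoid("42003000102")
formatted = "-".join([parts["state"], parts["county"], parts["tract_label"]])
print(formatted)  # 42-003-0001.02
```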
![Page 19: Lecy ∙ Data-Driven DM Lecture 09 Working with Maps Files and Census Data](https://reader035.vdocuments.us/reader035/viewer/2022062315/5697bfa31a28abf838c96c0a/html5/thumbnails/19.jpg)
Which counties are in my MSA?
http://www.bea.gov/faq/index.cfm?faq_id=470
Pittsburgh, PA (Metropolitan Statistical Area)
• Allegheny, PA [42003]
• Beaver, PA [42007]
• Butler, PA [42019]
• Fayette, PA [42051]
• Washington, PA [42125]
• Westmoreland, PA [42129]
FIPS Code for Allegheny County (42003)
42 = Pennsylvania
003 = Allegheny County
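Because the first five digits of a tract Geo ID are the state and county codes, a list of county FIPS codes can be used to keep only the tracts in a metro area. A minimal Python sketch, using the county codes listed on the slide. The second and third tract IDs are made up for illustration.

```python
# County FIPS codes as listed on the slide for the Pittsburgh MSA.
PITTSBURGH_MSA_COUNTIES = {"42003", "42007", "42019", "42051", "42125", "42129"}

def in_msa(geoid, counties=PITTSBURGH_MSA_COUNTIES):
    """Return True if the tract's state+county prefix is in the county set."""
    return geoid[:5] in counties

# The first ID comes from the slide; the other two are hypothetical examples.
geoids = ["42003000102", "42007601100", "42101000100"]
msa_tracts = [g for g in geoids if in_msa(g)]
print(msa_tracts)  # ['42003000102', '42007601100']
```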
![Page 20: Lecy ∙ Data-Driven DM Lecture 09 Working with Maps Files and Census Data](https://reader035.vdocuments.us/reader035/viewer/2022062315/5697bfa31a28abf838c96c0a/html5/thumbnails/20.jpg)
visual narrative
![Page 21: Lecy ∙ Data-Driven DM Lecture 09 Working with Maps Files and Census Data](https://reader035.vdocuments.us/reader035/viewer/2022062315/5697bfa31a28abf838c96c0a/html5/thumbnails/21.jpg)
Your data tells a story
![Page 22: Lecy ∙ Data-Driven DM Lecture 09 Working with Maps Files and Census Data](https://reader035.vdocuments.us/reader035/viewer/2022062315/5697bfa31a28abf838c96c0a/html5/thumbnails/22.jpg)
WHAT PATTERN DO YOU SEE?
![Page 23: Lecy ∙ Data-Driven DM Lecture 09 Working with Maps Files and Census Data](https://reader035.vdocuments.us/reader035/viewer/2022062315/5697bfa31a28abf838c96c0a/html5/thumbnails/23.jpg)
WHAT PATTERN DO YOU SEE NOW?
Equal Intervals, Quantiles, Geometric
These maps are all made with the same data using different intervals for the break points.
![Page 24: Lecy ∙ Data-Driven DM Lecture 09 Working with Maps Files and Census Data](https://reader035.vdocuments.us/reader035/viewer/2022062315/5697bfa31a28abf838c96c0a/html5/thumbnails/24.jpg)
BREAK POINTS FOR NORMAL DISTRIBUTIONS
http://uxblog.idvsolutions.com/2010/03/crazy-world-of-range-breaks.html
![Page 25: Lecy ∙ Data-Driven DM Lecture 09 Working with Maps Files and Census Data](https://reader035.vdocuments.us/reader035/viewer/2022062315/5697bfa31a28abf838c96c0a/html5/thumbnails/25.jpg)
http://uxblog.idvsolutions.com/2010/03/crazy-world-of-range-breaks.html
![Page 26: Lecy ∙ Data-Driven DM Lecture 09 Working with Maps Files and Census Data](https://reader035.vdocuments.us/reader035/viewer/2022062315/5697bfa31a28abf838c96c0a/html5/thumbnails/26.jpg)
BREAK POINTS FOR SKEWED DISTRIBUTIONS
![Page 27: Lecy ∙ Data-Driven DM Lecture 09 Working with Maps Files and Census Data](https://reader035.vdocuments.us/reader035/viewer/2022062315/5697bfa31a28abf838c96c0a/html5/thumbnails/27.jpg)
![Page 28: Lecy ∙ Data-Driven DM Lecture 09 Working with Maps Files and Census Data](https://reader035.vdocuments.us/reader035/viewer/2022062315/5697bfa31a28abf838c96c0a/html5/thumbnails/28.jpg)
HIGHLY-SKEWED DISTRIBUTIONS:
Think about skewed distributions like you think about classes of wine.
The first level of quality is a bottle for $3, the next level is a bottle at $6, the next level is at $10-12, the next level at $25, the next level at $50, then $100.
To move up one class, you basically double the price.
If you want a scale that translates wine prices into class, it would look something like:
$0 - $4: Cheap wine
$4 - $8: Low quality
$8 - $20: Medium quality
$20 - $80: High quality
$80+: Excellent quality
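The wine scale can be written as a list of break points plus a lookup. This Python sketch uses the breaks from the slide. Since the slide's ranges share their endpoints, the sketch assumes that a price exactly on a break goes into the higher class.

```python
from bisect import bisect_right

# Break points and labels taken from the wine scale on the slide.
BREAKS = [4, 8, 20, 80]
LABELS = ["Cheap wine", "Low quality", "Medium quality", "High quality", "Excellent quality"]

def wine_class(price):
    """Map a price to its class; a price equal to a break falls in the higher class."""
    return LABELS[bisect_right(BREAKS, price)]

for price in [3, 6, 11, 25, 100]:
    print(price, wine_class(price))
```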
![Page 29: Lecy ∙ Data-Driven DM Lecture 09 Working with Maps Files and Census Data](https://reader035.vdocuments.us/reader035/viewer/2022062315/5697bfa31a28abf838c96c0a/html5/thumbnails/29.jpg)
http://www.medpagetoday.com/PublicHealthPolicy/GeneralProfessionalIssues/50497
![Page 30: Lecy ∙ Data-Driven DM Lecture 09 Working with Maps Files and Census Data](https://reader035.vdocuments.us/reader035/viewer/2022062315/5697bfa31a28abf838c96c0a/html5/thumbnails/30.jpg)
Interval Size Increases
![Page 31: Lecy ∙ Data-Driven DM Lecture 09 Working with Maps Files and Census Data](https://reader035.vdocuments.us/reader035/viewer/2022062315/5697bfa31a28abf838c96c0a/html5/thumbnails/31.jpg)
CLASSIFYING DATA
Process of placing data into groups (classes or bins) that have a similar characteristic or value.
Break points
• Break the total attribute range up into intervals
• Keep the number of intervals as small as possible (5-7)
• Use a mathematical progression or formula instead of picking arbitrary values
![Page 32: Lecy ∙ Data-Driven DM Lecture 09 Working with Maps Files and Census Data](https://reader035.vdocuments.us/reader035/viewer/2022062315/5697bfa31a28abf838c96c0a/html5/thumbnails/32.jpg)
CLASSIFICATIONS
Quantiles
• Places the same number of data values in each class
• Will never have empty classes or classes with too few or too many values
• Attractive in that this method produces distinct map patterns
• Analysts use them because they provide information about the shape of the distribution.
• Example: 0–25%, 25%–50%, 50%–75%, 75%–100%
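One simple way to get the "same number of values in each class" property is to rank the values and cut the ranks into equal groups. The Python sketch below does that. It is an illustration under that assumption, not the method of any particular mapping package, and tied values may land in different classes.

```python
def quantile_classes(values, k):
    """Assign each value a class 0..k-1 so that classes hold (nearly) equal counts."""
    n = len(values)
    # Indices of the values, ordered from smallest to largest value.
    order = sorted(range(n), key=lambda i: values[i])
    classes = [0] * n
    for rank, i in enumerate(order):
        classes[i] = rank * k // n
    return classes

# Hypothetical, right-skewed data.
values = [1, 2, 3, 5, 8, 13, 40, 200]
classes = quantile_classes(values, 4)
print(classes)  # [0, 0, 1, 1, 2, 2, 3, 3]
```

Each of the four classes holds two values, even though the top class spans 40 to 200 while the bottom class spans 1 to 2.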
![Page 33: Lecy ∙ Data-Driven DM Lecture 09 Working with Maps Files and Census Data](https://reader035.vdocuments.us/reader035/viewer/2022062315/5697bfa31a28abf838c96c0a/html5/thumbnails/33.jpg)
CLASSIFICATIONS
Equal intervals
• Divides a set of attribute values into groups that contain an equal range of values
• Works best with a continuous set of data
• Easy to accomplish and read
• Not good for clustered data
• With clustered data, produces a map with many features in one or two classes and some classes with no features
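A short Python sketch of equal-interval breaks, using made-up clustered data to show the problem described above: most values fall into one class and some classes stay empty. The function names are hypothetical.

```python
from bisect import bisect_right

def equal_interval_breaks(values, k):
    """Return k+1 break points that split the data range into k equal-width classes."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    return [lo + width * i for i in range(k + 1)]

def count_per_class(values, breaks):
    """Count the values in each class; the maximum value goes in the last class."""
    counts = [0] * (len(breaks) - 1)
    for v in values:
        idx = min(bisect_right(breaks, v) - 1, len(counts) - 1)
        counts[idx] += 1
    return counts

# Hypothetical clustered data: most values are small, one is large.
values = [0, 1, 2, 3, 5, 8, 13, 100]
breaks = equal_interval_breaks(values, 4)
counts = count_per_class(values, breaks)
print(breaks)  # [0.0, 25.0, 50.0, 75.0, 100.0]
print(counts)  # [7, 0, 0, 1]
```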
![Page 34: Lecy ∙ Data-Driven DM Lecture 09 Working with Maps Files and Census Data](https://reader035.vdocuments.us/reader035/viewer/2022062315/5697bfa31a28abf838c96c0a/html5/thumbnails/34.jpg)
CLASSIFICATIONS
Increasing interval widths
• Suited to long-tailed distributions
• These distributions deviate from a bell-shaped curve and are most often skewed to the right, with the right tail elongated
• Example: keep doubling the width of each category. The classes 0–5, 5–15, 15–35, and 35–75 have interval widths of 5, 10, 20, and 40.
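The doubling rule can be sketched in a few lines of Python. The helper name `doubling_breaks` is hypothetical. It reproduces the 0–5, 5–15, 15–35, 35–75 example when started at 0 with a first width of 5.

```python
def doubling_breaks(start, first_width, k):
    """Return k+1 break points where each class is twice as wide as the previous one."""
    breaks = [start]
    width = first_width
    for _ in range(k):
        breaks.append(breaks[-1] + width)
        width *= 2
    return breaks

print(doubling_breaks(0, 5, 4))  # [0, 5, 15, 35, 75]
print(doubling_breaks(0, 2, 4))  # [0, 2, 6, 14, 30]
```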
![Page 35: Lecy ∙ Data-Driven DM Lecture 09 Working with Maps Files and Census Data](https://reader035.vdocuments.us/reader035/viewer/2022062315/5697bfa31a28abf838c96c0a/html5/thumbnails/35.jpg)
CLASSIFICATIONS
Exponential scales
• Popular method of increasing intervals
• Use break values that are powers, such as 2^n or 3^n
• Generally start out with zero as an additional class if that value appears in your data
• Example: 0, 1–2, 3–4, 5–8, 9–16, and so forth
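A minimal Python sketch of an exponential scale for whole-number counts, where each class ends at a power of the base and zero gets its own class. The function name is hypothetical, and the sketch assumes integer data.

```python
def exponential_class_labels(max_value, base=2):
    """Build class labels whose upper bounds are powers of `base`, with 0 as its own class."""
    labels = ["0"]
    lower, upper = 1, base
    while lower <= max_value:
        labels.append("%d-%d" % (lower, upper))
        # The next class starts just above the old upper bound and ends at the next power.
        lower, upper = upper + 1, upper * base
    return labels

print(exponential_class_labels(16))          # ['0', '1-2', '3-4', '5-8', '9-16']
print(exponential_class_labels(27, base=3))  # ['0', '1-3', '4-9', '10-27']
```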
![Page 36: Lecy ∙ Data-Driven DM Lecture 09 Working with Maps Files and Census Data](https://reader035.vdocuments.us/reader035/viewer/2022062315/5697bfa31a28abf838c96c0a/html5/thumbnails/36.jpg)
CHOROPLETH MAP EXAMPLE
Percentage of vacant housing units by county
![Page 37: Lecy ∙ Data-Driven DM Lecture 09 Working with Maps Files and Census Data](https://reader035.vdocuments.us/reader035/viewer/2022062315/5697bfa31a28abf838c96c0a/html5/thumbnails/37.jpg)
RULES OF THUMB:
1. First Rule: use common sense! What do your groups represent, and are they meaningful? Are you misleading your audience with unreasonable breaks?
2. Binning by quantiles is typically a safe way to create breaks to show low, medium, and high values.
3. If a lot of your data is bunched together (for example, half of your values are close to zero), quantiles will not be meaningful because they will imply differences that do not exist.
4. If your distribution is skewed, consider increasing-interval or exponential scales. For example, define the first group as 0-2, the second as 2-6, the third as 6-14, and the next as 14-30 (the interval size doubles at each break).
![Page 38: Lecy ∙ Data-Driven DM Lecture 09 Working with Maps Files and Census Data](https://reader035.vdocuments.us/reader035/viewer/2022062315/5697bfa31a28abf838c96c0a/html5/thumbnails/38.jpg)
U.S. population by state, 2000
Original map (natural breaks)
![Page 39: Lecy ∙ Data-Driven DM Lecture 09 Working with Maps Files and Census Data](https://reader035.vdocuments.us/reader035/viewer/2022062315/5697bfa31a28abf838c96c0a/html5/thumbnails/39.jpg)
Equal interval scale
Not good because too many values fall into the low classes
![Page 40: Lecy ∙ Data-Driven DM Lecture 09 Working with Maps Files and Census Data](https://reader035.vdocuments.us/reader035/viewer/2022062315/5697bfa31a28abf838c96c0a/html5/thumbnails/40.jpg)
Quantile scale
Shows that an increasing width (geometric) scale is needed
![Page 41: Lecy ∙ Data-Driven DM Lecture 09 Working with Maps Files and Census Data](https://reader035.vdocuments.us/reader035/viewer/2022062315/5697bfa31a28abf838c96c0a/html5/thumbnails/41.jpg)
Custom geometric scale
• Experiment with exponential scales with powers of 2 or 3.
![Page 42: Lecy ∙ Data-Driven DM Lecture 09 Working with Maps Files and Census Data](https://reader035.vdocuments.us/reader035/viewer/2022062315/5697bfa31a28abf838c96c0a/html5/thumbnails/42.jpg)
Normalizing data
![Page 43: Lecy ∙ Data-Driven DM Lecture 09 Working with Maps Files and Census Data](https://reader035.vdocuments.us/reader035/viewer/2022062315/5697bfa31a28abf838c96c0a/html5/thumbnails/43.jpg)
NORMALIZING DATA
Divides one numeric attribute by another in order to minimize differences in values based on the size of areas or number of features in each area.
Examples:
• Dividing the number of vacant housing units by the total number of housing units yields the percentage of vacant units
• Dividing the population by area of the feature yields a population density
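Both normalizations are simple ratios. The Python sketch below uses made-up numbers for two hypothetical areas to show how the ranking can flip once counts are normalized: area A has more vacant units, but area B has the higher vacancy rate.

```python
def percent(part, whole):
    """Return part as a percentage of whole, or None when whole is zero."""
    if whole == 0:
        return None
    return 100.0 * part / whole

def density(population, area):
    """Return population per unit of area."""
    return population / area

# Hypothetical areas; none of these figures are real Census data.
areas = [
    {"name": "A", "vacant": 500, "units": 10000, "population": 12000, "area_sq_mi": 4},
    {"name": "B", "vacant": 200, "units": 1000, "population": 1500, "area_sq_mi": 10},
]

for a in areas:
    a["pct_vacant"] = percent(a["vacant"], a["units"])
    a["pop_density"] = density(a["population"], a["area_sq_mi"])
    print(a["name"], a["pct_vacant"], a["pop_density"])
```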
![Page 44: Lecy ∙ Data-Driven DM Lecture 09 Working with Maps Files and Census Data](https://reader035.vdocuments.us/reader035/viewer/2022062315/5697bfa31a28abf838c96c0a/html5/thumbnails/44.jpg)
Number of vacant housing units by state, 2000
NON-NORMALIZED DATA
![Page 45: Lecy ∙ Data-Driven DM Lecture 09 Working with Maps Files and Census Data](https://reader035.vdocuments.us/reader035/viewer/2022062315/5697bfa31a28abf838c96c0a/html5/thumbnails/45.jpg)
Percentage vacant housing units by state, 2000
NORMALIZED DATA
![Page 46: Lecy ∙ Data-Driven DM Lecture 09 Working with Maps Files and Census Data](https://reader035.vdocuments.us/reader035/viewer/2022062315/5697bfa31a28abf838c96c0a/html5/thumbnails/46.jpg)
California population by county, 2007
NON-NORMALIZED DATA
![Page 47: Lecy ∙ Data-Driven DM Lecture 09 Working with Maps Files and Census Data](https://reader035.vdocuments.us/reader035/viewer/2022062315/5697bfa31a28abf838c96c0a/html5/thumbnails/47.jpg)
California population density, 2007
NORMALIZED DATA
![Page 48: Lecy ∙ Data-Driven DM Lecture 09 Working with Maps Files and Census Data](https://reader035.vdocuments.us/reader035/viewer/2022062315/5697bfa31a28abf838c96c0a/html5/thumbnails/48.jpg)
Selecting colors
![Page 49: Lecy ∙ Data-Driven DM Lecture 09 Working with Maps Files and Census Data](https://reader035.vdocuments.us/reader035/viewer/2022062315/5697bfa31a28abf838c96c0a/html5/thumbnails/49.jpg)
SEQUENTIAL SCALE
http://colorbrewer2.org/
![Page 50: Lecy ∙ Data-Driven DM Lecture 09 Working with Maps Files and Census Data](https://reader035.vdocuments.us/reader035/viewer/2022062315/5697bfa31a28abf838c96c0a/html5/thumbnails/50.jpg)
DIVERGENT SCALE
![Page 51: Lecy ∙ Data-Driven DM Lecture 09 Working with Maps Files and Census Data](https://reader035.vdocuments.us/reader035/viewer/2022062315/5697bfa31a28abf838c96c0a/html5/thumbnails/51.jpg)
RULES OF THUMB:
1. If you are highlighting a class of individuals, like those in poverty, a single color might be sufficient (for example, red for in poverty, gray for not).
2. When you are highlighting performance along a single dimension, use a sequential scale with white or gray at the bottom of your color scale and a dark color at the top.
3. If your comparison is relative to the average, put white or gray in the middle to represent “average.” Consider red for low and blue or green for high (depending on the context: this assumes high is good and low is bad).
4. Five to seven categories is typical.
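As a rough illustration of a sequential scale, the Python sketch below blends linearly from a light color to a dark color in RGB. This is a simple interpolation chosen for the example, not how ColorBrewer builds its palettes, and the dark blue used here is an arbitrary choice.

```python
def hex_to_rgb(h):
    """Convert '#rrggbb' to an (r, g, b) tuple of integers."""
    h = h.lstrip("#")
    return tuple(int(h[i:i + 2], 16) for i in (0, 2, 4))

def rgb_to_hex(rgb):
    """Convert an (r, g, b) tuple of integers to '#rrggbb'."""
    return "#{:02x}{:02x}{:02x}".format(*rgb)

def sequential_ramp(light, dark, k):
    """Return k colors evenly spaced between light and dark (k must be at least 2)."""
    if k < 2:
        raise ValueError("k must be at least 2")
    a, b = hex_to_rgb(light), hex_to_rgb(dark)
    ramp = []
    for i in range(k):
        t = i / (k - 1)  # 0.0 at the light end, 1.0 at the dark end
        ramp.append(rgb_to_hex(tuple(round(x + (y - x) * t) for x, y in zip(a, b))))
    return ramp

# Five classes from white to an arbitrary dark blue.
ramp = sequential_ramp("#ffffff", "#08306b", 5)
print(ramp)
```

A divergent scale could be built the same way by joining two ramps that share a white or gray midpoint.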