GIS for SS Librarians

Friday, June 1, 2012

Tufts University

GIS for Social Sciences Librarians

Software

ArcGIS version 10 (ESRI)

Campus-wide License $25,000 annually

Tech Support limited to one contact person on campus

GIS (Geographic Information Systems)
In this lesson, we will learn how to use a GIS (Geographic Information System). This is a software package that allows you to store, manipulate, analyze, and display geographic data. We will be using ArcGIS 10, which is produced by ESRI. This software is available in many labs throughout Tufts in addition to the GIS Lab.
Layers / Tables

GIS in a nutshell: the most common format for GIS data is the shapefile, which is composed of layers and tables. Layers and tables are the building blocks in your use of a GIS, and everything you do will extend from these two things.

  • A layer is an overlay that represents a feature. A feature is a group of similar objects. For example, towns can be one feature and roads can be another feature. So, one layer contains towns. Another contains roads. Yet another represents vegetation. Each one is laid one atop another to make the final map.
  • A table is the information that is attached to each layer. The information is data in the form of a spreadsheet, where the rows represent the cases and the columns represent the variables. So, for the towns layer, we have a table in which each row is a town. The columns are the variables for the towns. They could be size, location, population, etc. These data about the feature are called attributes.

With these two elements we can make maps and do calculations. For example, we can select towns that have over 200,000 people and make a new layer. Or, we can find the best-suited locations for an upscale store by locating areas with certain demographics. Or, we can map out poor neighborhoods in urban areas in an attempt to allocate services. All this is done through layers and tables.
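
The town selection above can be sketched outside of ArcGIS in a few lines of Python. This is only an illustration of the idea, not how ArcGIS stores data: the town names, coordinates, and populations are made up, and select_by_attribute is a hypothetical helper.

```python
# A toy "layer": each feature carries a geometry plus a row of attributes.
# All names and numbers below are made up for illustration.
towns = [
    {"name": "Town A", "geometry": (-71.1, 42.4), "population": 250_000},
    {"name": "Town B", "geometry": (-71.3, 42.2), "population": 80_000},
    {"name": "Town C", "geometry": (-70.9, 42.5), "population": 610_000},
]


def select_by_attribute(layer, field, minimum):
    """Return a new layer with only the features whose field exceeds minimum."""
    return [feature for feature in layer if feature[field] > minimum]


# "Select towns that have over 200,000 people and make a new layer."
large_towns = select_by_attribute(towns, "population", 200_000)
print([town["name"] for town in large_towns])
```

The original layer is left untouched; the selection becomes its own layer, just as it would in the GIS.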

Interesting Examples

1. Geocoding Substance Abuse Centers

A student had addresses of substance abuse centers in Rhode Island and wanted to map them for a presentation and pamphlet. She geocoded the addresses: she used a GIS function to find the addresses in a TIGER street shapefile and made point data from them.

Data: TIGER file, Excel spreadsheet with addresses

Difficulties: Geocoding is sloppy. Need to do a lot of cleaning.
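
Much of that cleaning can be done before the addresses ever reach the geocoder. A minimal sketch in Python, assuming the problems are stray spaces, inconsistent capitalization, and abbreviated street suffixes; the SUFFIXES table and clean_address function are hypothetical and would need extending for real data.

```python
# Hypothetical abbreviation table; a real project would need a longer one.
SUFFIXES = {"st": "Street", "ave": "Avenue", "rd": "Road", "blvd": "Boulevard"}


def clean_address(raw):
    """Normalize spacing, capitalization, and common street-suffix abbreviations."""
    words = []
    for word in raw.split():                # split() also collapses extra spaces
        key = word.strip(".,").lower()      # drop trailing punctuation
        words.append(SUFFIXES.get(key, key.capitalize()))
    return " ".join(words)


print(clean_address("  12  MAIN st. "))
print(clean_address("45 elm AVE,"))
```

Addresses that still fail to match after a pass like this usually have to be fixed by hand.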

2. Mapping out historical ethnic data

A student wanted to see if Japanese-Americans had relocated from their original homes after being interned during WWII. She mapped out the census data from 1940 and 1950 at the county level to see differences.

Data: County shapefile from census.gov, census data 1940 & 1950

Difficulties: Data from 1940 were available only in paper format, data from 1950 came from ICPSR, and the data needed cleanup (e.g., text in all capitals).
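
The cleanup matters because the two census years can only be compared if their county names match exactly. A rough sketch in Python of that step, using made-up county names and counts (these are not real census figures) and hypothetical helper names:

```python
# Made-up counts for illustration only; these are not real census figures.
counts_1940 = {"ALPHA COUNTY": 100, "BETA  COUNTY ": 40}
counts_1950 = {"Alpha County": 90, "Beta County": 55}


def normalize(name):
    """Make county names comparable across sources (case and stray spaces)."""
    return " ".join(name.split()).title()


def change_by_county(earlier, later):
    """Return later minus earlier for every county found in both years."""
    earlier = {normalize(name): count for name, count in earlier.items()}
    later = {normalize(name): count for name, count in later.items()}
    return {name: later[name] - earlier[name] for name in earlier if name in later}


change = change_by_county(counts_1940, counts_1950)
print(change)
```

Counties that appear in only one year are dropped here; in practice those are worth listing separately, since they often point to a spelling or boundary change.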

3. Mapping political wards and selecting census data that fall within the wards

A student wanted to see if demographic variables could predict election results in Dayton and Toledo, Ohio. He had maps of election wards in the cities. He needed to georeference the maps, that is, give them coordinates corresponding to the census data, and create a new layer for the wards. Then he could select which census block groups intersected with the wards. With the census data and the election results, he performed a regression.

Data: PDFs of wards, block group shapefiles, TIGER shapefiles, S4 census data (Brown spatial initiative)

Difficulties: Very time consuming to georeference and create a new layer. Calculations had to use estimates because the boundaries of the two layers did not line up, i.e., the same block group could fall into more than one ward.
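
These notes do not record which estimation method the student used. One common approach is areal weighting, where each block group's count is split among wards in proportion to the share of its area inside each ward. A minimal sketch in Python, assuming the area shares have already been computed by a GIS overlay; all IDs and numbers are hypothetical.

```python
# Hypothetical inputs: block group populations, and the share of each block
# group's area that falls inside each ward (as produced by a GIS overlay).
block_group_population = {"BG1": 1000, "BG2": 600}
area_share = {
    ("BG1", "Ward 1"): 1.0,    # BG1 lies entirely inside Ward 1
    ("BG2", "Ward 1"): 0.25,   # BG2 straddles the ward boundary
    ("BG2", "Ward 2"): 0.75,
}


def estimate_ward_totals(population, shares):
    """Allocate each block group's population to wards by area share."""
    totals = {}
    for (block_group, ward), share in shares.items():
        totals[ward] = totals.get(ward, 0.0) + population[block_group] * share
    return totals


ward_population = estimate_ward_totals(block_group_population, area_share)
print(ward_population)
```

Areal weighting assumes people are spread evenly across each block group, which is why the results are estimates rather than counts.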

Consulting

Reference and consulting work depends on your skills and comfort level with spatial data and GIS. The most essential questions that a librarian gets are about obtaining shapefiles and variables. If you feel more comfortable with GIS, you can also answer some of the GIS process questions.

Definitely make a data subject guide pointing users to all the data that are available to them as a member of your institution.

Example of data subject page:

Online GIS Data Sources

Geodata@Tufts

Some good data sources are:

American Factfinder has a wealth of data from the 2000 census and sample data from 1990. This site also provides online mapping of census data.

Spatial Data on the Web by State

Geolytics Census CD has census CDs going back to 1970. Shapefiles are included with the data.

National Historic GIS Project is a project to digitize and distribute aggregated census data for US Censuses 1790-2000. Also included are boundary GIS shapefiles for previous censuses. An invaluable site for historic research in the US. Easy-to-use interface.

IPUMS USA, the Integrated Public Use Microdata Series (Minnesota Population Center, University of Minnesota), is a very good site with census samples from different years, information about previous censuses, including enumeration forms (the questions asked), and lists of variables for all US censuses. Also has an international section.

It is good to get local sources, too. Cambridge is a good example.

Numeric data can be joined to boundary files

ICPSR has a lot of census data online. One study, Number 2896, has county and state data from 1790 onward.
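
Joining numeric data to a boundary file comes down to matching rows on a shared ID field. A small sketch in Python of that matching, outside of any GIS; the field name area_id, the records, and the join_table helper are all hypothetical.

```python
# Hypothetical boundary records and a separate table of numeric data,
# both keyed by the same ID field ("area_id" is an assumed field name).
boundaries = [
    {"area_id": "A001", "name": "Area One"},
    {"area_id": "A002", "name": "Area Two"},
    {"area_id": "A003", "name": "Area Three"},
]
census_table = [
    {"area_id": "A001", "population": 500},
    {"area_id": "A002", "population": 1200},
]


def join_table(features, table, key):
    """Attach matching table rows to each feature; unmatched features stay as-is."""
    lookup = {row[key]: row for row in table}
    return [{**lookup.get(feature[key], {}), **feature} for feature in features]


joined = join_table(boundaries, census_table, "area_id")
for row in joined:
    print(row)
```

The third area has no row in the table, so it comes through without a population value. Checking for such gaps is the first thing to do after any join.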

Also good to create references to GIS and Google Earth tutorials:

ArcGIS Tips and Tutorials

Google Earth

Learn GIS if you’re up to it. Students often need to ask the data librarians for help with:

Common functions (zoom, pan, select, find), changing symbology, labeling, reading metadata, and mapping variables.

GIS Functions/Metadata

Good metadata or a codebook is critical for understanding your data. GIS data is mostly documented in FGDC (Federal Geographic Data Committee) format. Shapefile metadata comes in XML format and can be read with any appropriate reader.
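
Because the metadata is XML, it can also be read with a short script. A minimal sketch using Python's standard library, assuming the file follows the FGDC element names for entity and attribute information (eainfo, detailed, attr, attrlabl, attrdef). The sample fragment is hand-written with placeholder field names and definitions; it is not the metadata of any real dataset.

```python
import xml.etree.ElementTree as ET

# Hand-written fragment using FGDC element names. The field names and
# definitions are placeholders, not real metadata.
SAMPLE = """<metadata>
  <eainfo>
    <detailed>
      <attr>
        <attrlabl>FIELD_A</attrlabl>
        <attrdef>Placeholder definition of FIELD_A</attrdef>
      </attr>
      <attr>
        <attrlabl>FIELD_B</attrlabl>
        <attrdef>Placeholder definition of FIELD_B</attrdef>
      </attr>
    </detailed>
  </eainfo>
</metadata>"""


def attribute_definitions(xml_text):
    """Map each attribute label to its definition text."""
    root = ET.fromstring(xml_text)
    return {
        attr.findtext("attrlabl"): attr.findtext("attrdef", default="")
        for attr in root.iter("attr")
    }


for label, definition in attribute_definitions(SAMPLE).items():
    print(label, "=", definition)
```

For a real shapefile, the same function could be pointed at the text of its .xml metadata file.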

Practice question: I need to map out the Uygur population of China.

  1. Open up ArcMap
  2. Add Data through the Catalog
  3. Navigate to S:\classes\Bootcamp\2012\Examples\ChinaCensusCty00.shp
  4. Drag and drop that shapefile into ArcMap.
  5. Right-click on layer name and click on Open Attribute Table. These are all the attributes (variables) for this layer.
  6. Use metadata to find out what the attributes are. Right-click on layer name and click on Data > View Item Description.
  7. In Data Source Item Description, go to FGDC > Entities and Attributes > A106016
  8. Map out the variable. Right-click on layer name and go to Properties.
  9. In Layer Properties go to the Symbology tab.
  10. Under “Show” select Quantities.
  11. Under “Fields” (variables) select A106016. This gives you 5 categories for your data, with the darker color meaning more. Click OK.
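
To see what sorting a variable into 5 categories means, here is a rough sketch in Python. ArcMap has its own classification methods; equal-interval classing is used here only as a simple stand-in, and the values are made up.

```python
def equal_interval_breaks(values, classes=5):
    """Return the upper bound of each class, splitting the range evenly."""
    low, high = min(values), max(values)
    width = (high - low) / classes
    return [low + width * i for i in range(1, classes + 1)]


def classify(value, breaks):
    """Return the index (0 = lightest color) of the class a value falls in."""
    for index, upper in enumerate(breaks):
        if value <= upper:
            return index
    return len(breaks) - 1


# Made-up values standing in for one attribute column.
values = [0, 10, 20, 30, 40, 50, 100]
breaks = equal_interval_breaks(values)
print(breaks)
print([classify(v, breaks) for v in values])
```

Notice that most of the made-up values land in the two lowest classes because one large value stretches the range. The choice of classification method can change how a map reads, so it is worth checking which one the map is using.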

Where are the Uygur people located in China?

If you have time, where are people with university education (P University) located and why?

If metadata doesn’t explain it, then you need to contact the author of the data.

Thomas Stieve

617.627.6075
