
IPUMS (census and survey data)

IPUMS Tips

  • Start early and plan time for your research.
  • Things that may take longer than you expect:
    • Account approval. My IPUMS International account took 8 days to be approved, even though I had already been approved for, and was using data from, other IPUMS projects.
    • Extract creation. An extract can take anywhere from minutes to hours to create. One extract I created, with 32 variables from 20 samples, took about 15 minutes; your mileage may vary.
    • Reading the data documentation. The IPUMS data documentation is extensive.
  • Download your data immediately once you're notified it's available. Data files older than 72 hours are subject to deletion.
  • Download the additional files offered with the extract (Stata and R command files, data codebook, etc.).
  • Read all of the documentation, not just the codebook. Each IPUMS project will have its own FAQ and user guide, but also read the documentation for each sample you're using.
  • Some samples are weighted.  Did I mention you should read the documentation? 
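
Why weights matter can be shown with a small calculation. The following is a minimal sketch, not an IPUMS procedure: the values are made up, and the name perwt is used only for illustration. Check each sample's documentation for the weight variable it actually provides and how to apply it.

```python
def weighted_mean(values, weights):
    """Return the weighted mean of values, using one weight per record."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights sum to zero")
    return sum(v * w for v, w in zip(values, weights)) / total_weight


# Hypothetical records: ages of three sampled persons and their person weights.
ages = [20, 40, 60]
perwt = [10, 10, 80]  # illustrative weights, not taken from any real sample

unweighted = sum(ages) / len(ages)     # treats every record equally
weighted = weighted_mean(ages, perwt)  # each record counts in proportion to its weight

print(unweighted, weighted)
```

In this made-up example the unweighted mean is 40.0 and the weighted mean is 54.0, because the oldest person carries most of the weight. Ignoring weights in a weighted sample can shift results in the same way.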

Building Your Extract

screenshot of IPUMS data extract selector

Using the data sample extractor

Samples

Samples are different datasets, such as census or survey microdata. Some of the samples have existed for years, while others were created specifically for IPUMS projects.

  • Select Samples: limits the display of variable information to selected samples

Variables

Use the "Variables" menu to browse or search variables:

  • Household: household variables by group
  • Person: person variables by group
  • A-Z: integrated variables by letter, allows browsing alphabetically
  • Search: displays only variables that contain specified text in particular fields. This is especially helpful if you know the type of variables you want to use.

Data Cart

The data cart will give you an overview of the contents of your data extract.

screenshot of IPUMS data extract selector with search results and variable selected.

Once variables are displayed, the extract selector will indicate which dataset they are in.

Click the plus symbol to add the variable to the data cart. 

Downloading Your Data Extract

screenshot of my data, which allows download or revision of extracts

  • Create a meaningful description of your dataset when sending the extract to be processed.
  • Put all of your project files in a folder with a meaningful name. 
    • Don't clutter this folder with items not related to your project. 
  • Right-click and "save link as..." to save command files and codebooks.
    • Contrary to the usual best practice, don't rename the files. The code in the command files and codebooks already refers to the names the system gives them.
  • If the data has been deleted from your account, resubmit or revise the extract.
  • Click through the help IPUMS provides on this page.
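
One way to check that nothing in the project folder has been renamed is to group the files by their base name. This is a rough sketch under an assumption: it presumes the data file, command files, and codebook from one extract share the same base name and differ only in their extensions. The function names and the file names in the comments are hypothetical.

```python
from pathlib import Path


def extract_stem(path):
    """Return the file name with every extension removed, e.g. 'a.dat.gz' -> 'a'."""
    return Path(path).name.split(".", 1)[0]


def group_by_extract(folder):
    """Map each base name in the folder to the sorted list of files that share it."""
    groups = {}
    for p in sorted(Path(folder).iterdir()):
        if p.is_file():
            groups.setdefault(extract_stem(p), []).append(p.name)
    return groups


# Usage (hypothetical folder name):
# for stem, files in group_by_extract("my_ipums_project").items():
#     print(stem, files)
```

A base name that appears with only one file, when the others appear with several, is a hint that a file was renamed or is missing.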

Opening Files

The data files produced by the extract system are gzipped (the file has a .gz extension). Unless your software can read the compressed file directly, you must use a data compression utility to uncompress the file before you can analyze it.

R/RStudio can load the data directly from the .dat.gz file, without uncompressing it first.
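
If no compression utility is at hand, the file can also be uncompressed with a few lines of Python's standard library. This is a minimal sketch: the function name gunzip and the file name in the usage comment are illustrative only.

```python
import gzip
import shutil
from pathlib import Path


def gunzip(src, dest=None):
    """Decompress a .gz file and return the path of the uncompressed copy."""
    src = Path(src)
    # By default drop only the final ".gz", so "name.dat.gz" becomes "name.dat".
    dest = Path(dest) if dest is not None else src.with_suffix("")
    with gzip.open(src, "rb") as f_in, open(dest, "wb") as f_out:
        # Stream the contents in chunks so large extracts need not fit in memory.
        shutil.copyfileobj(f_in, f_out)
    return dest


# Usage (hypothetical file name):
# gunzip("my_extract.dat.gz")
```

The original .gz file is left in place, so the names that the command files and codebooks refer to are unchanged.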