
Statistics and Data Science Subject Guide

Finding Datasets: Questions to Consider

When looking for data, begin by considering some key questions. Doing so will help you decide where to search and make your research process more efficient.

  1. What is your research topic or question?
  2. What are the characteristics of the data you need?
    • Unit of analysis: individuals, households, companies, players, teams, countries, states, nations, etc.
    • Geography: parcels in a city, countries in a region, birds in a forest, etc.
    • Time period: e.g. 1980-2006
    • Frequency: annual, quarterly, etc.
  3. Who is likely to collect data on this topic? Consider:
    • Specific researchers
    • Government agencies
    • NGOs/IGOs
    • Think tanks and research organizations
  4. Where are these data likely to be indexed or published?
    • Compendia, portals, and indexes: When data are likely to be compiled or reported, these tools allow you to search by topic and discover data and data producers. Examples include: Data Planet, Social Explorer, Data.gov
    • Repositories: When data are likely to be shared by the researchers who produced them, those researchers often deposit the data in repositories. Examples include: ICPSR, Dryad, Figshare
  5. What existing publications might use the data you need? Finding books, articles, or other research publications addressing your topic of interest can help you look "backwards" to find data. After all, researchers need to cite the datasets they used... and you might be able to use those datasets, too!

"Finding Datasets: Questions to Consider" is adapted from the "Data Reference Worksheet" created by Kristin Partlo and Danya Leebaw, used under CC-BY-SA 4.0. "Finding Datasets: Questions to Consider" is licensed under CC-BY-SA 4.0 by Audrey Gunn.

Places to Find Data and Datasets

This is just a small sample of the many places you can search for data and datasets. If you're not finding what you need, consider exploring our complete list of St. Olaf's paid databases for finding data/statistics. Another option is to search for relevant articles using one of our research databases, then look at the datasets cited in those articles.

Finding Government Data

One of the best tricks for finding government data is searching Google by domain suffix. Here's how it works:

Let's say you want to find data from the Minnesota state government about wolves. You know that websites run by the state of Minnesota typically end in the domain suffix .mn.us. Therefore, your Google search will look like this:

site:.mn.us wolf data

By typing "site:" followed by the domain (or domain suffix) you want to search, you limit your results to the site(s) you specified. This works for all kinds of sites: site:.gov returns results from websites ending in .gov (often U.S. federal government websites, though other levels of U.S. government use it, too), while site:northfieldmn.gov returns results from the City of Northfield's website.
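If you run many of these searches, it can help to build the search links in a short script. The following is a minimal sketch in Python, not an official tool. The function name build_site_search_url is made up for this illustration, and the sketch assumes that Google accepts the search text in the "q" parameter of its search URL.

```python
from urllib.parse import urlencode


def build_site_search_url(domain: str, terms: str) -> str:
    """Return a Google search URL limited to one domain or domain suffix.

    `domain` is what follows "site:", e.g. ".mn.us" or "northfieldmn.gov".
    `terms` is the rest of the search, e.g. "wolf data".
    (Hypothetical helper; assumes the search text goes in the "q" parameter.)
    """
    query = f"site:{domain} {terms}"
    # urlencode escapes the colon and turns spaces into "+" for use in a URL
    return "https://www.google.com/search?" + urlencode({"q": query})


# The Minnesota wolf example from above
print(build_site_search_url(".mn.us", "wolf data"))
# A single city website instead of a whole suffix
print(build_site_search_url("northfieldmn.gov", "budget data"))
```

Pasting either printed link into a browser should give the same results as typing the search into Google by hand. The search terms "budget data" are only a placeholder; substitute your own topic.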

This trick is especially useful for finding government data, for two reasons. First, many different government agencies publish data, and it can be messy to try to figure out which agency will have the data you need. This strategy lets you search across many different agencies within the same government, all at once. Second, government domain suffixes are generally restricted, meaning not just anybody can get a website with that suffix. You can be fairly confident that the results you find will be from the government in question, though it's always a good idea to double-check.