Universiteitsbibliotheek – LibGuides

Searching quantitative data for historians: Methods

Step-by-step

You can search for historical data in several ways, depending on your research question, your knowledge of the topic and the data available. This LibGuide provides several starting points for a data search. Because publishing and archiving historical data online is a fairly recent development, the infrastructure that supports it is still being built. For that reason, one of the most successful strategies is to begin a data search with secondary sources, although this LibGuide also offers other starting points.

A data search can be direct or indirect (see the table below). A direct search often takes fewer steps to reach relevant data, but it can also mean that relevant data is missed or that important data archives are not examined. It is therefore important to use both methods, and to know at which level (data archives or datasets) the search engine you are using searches.

Concepts

Data

Data may be thought of as unprocessed atomic statements of fact. The term very often refers to systematic collections of numerical information in tables of numbers, such as spreadsheets or databases. When data is structured and presented so as to be useful and relevant for a particular purpose, it becomes information available for human apprehension.

Dataset (also: data set)
Any organised collection of data. ‘Dataset’ is a flexible term and may refer to an entire database, a spreadsheet or other data file, or a related collection of data resources. 

Data collection
Datasets are created by collecting data in different ways: from manual or automatic measurements (e.g. weather data), surveys (census data), records of decisions (budget data) or ongoing transactions (spending data), aggregation of many records (crime data), mathematical modelling (population projections), etc. 

Database (also: data base; synonym: databank, also data bank)
1. Any organised collection of data may be considered a database. In this sense the word is synonymous with dataset.
2. A software system for processing and managing data, including features to extend or update, transform and query the data.
Note: In this LibGuide, the word database is used exclusively in the second sense.
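The difference between the two senses can be illustrated with a minimal Python sketch that uses the standard-library sqlite3 module. The list of rows is a dataset (an organised collection of data); SQLite is a database in the second sense, because it can extend and query that data. The table name, column names and population figures are invented for illustration only.

```python
import sqlite3

# A dataset: an organised collection of data (invented example figures).
dataset = [
    ("Town A", 1850, 12000),
    ("Town A", 1900, 30000),
    ("Town B", 1850, 8000),
]


def build_database(rows):
    """Load the dataset into an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE population (place TEXT, year INTEGER, inhabitants INTEGER)"
    )
    conn.executemany("INSERT INTO population VALUES (?, ?, ?)", rows)
    return conn


conn = build_database(dataset)

# Extend the data with a new record.
conn.execute("INSERT INTO population VALUES (?, ?, ?)", ("Town B", 1900, 15000))

# Query the data: inhabitants per place in 1900.
result = conn.execute(
    "SELECT place, inhabitants FROM population WHERE year = 1900 ORDER BY place"
).fetchall()
print(result)
```

The same rows could just as well live in a spreadsheet; what makes SQLite a database here is the software layer for inserting and querying, not the data itself.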

Metadata

Information about a dataset such as its title and description, method of collection, author or publisher, area and time period covered, licence, date and frequency of release, etc. It is essential to publish data with adequate metadata to aid both discoverability and usability of the data. 
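As a rough illustration, the metadata elements listed above could be recorded as follows in Python. This is a sketch, not a metadata standard: the field names are assumptions that simply mirror the list in the definition, and the example record is hypothetical. The helper function reports which elements are still missing, since incomplete metadata makes a dataset harder to find and reuse.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class DatasetMetadata:
    """Hypothetical metadata record; field names mirror the definition above."""
    title: Optional[str] = None
    description: Optional[str] = None
    collection_method: Optional[str] = None
    publisher: Optional[str] = None
    area_covered: Optional[str] = None
    period_covered: Optional[str] = None
    licence: Optional[str] = None
    release_date: Optional[str] = None
    release_frequency: Optional[str] = None


def missing_fields(record):
    """Return the names of metadata elements that are empty or not filled in."""
    return [f.name for f in fields(record) if getattr(record, f.name) in (None, "")]


# Invented example record with two elements left blank.
record = DatasetMetadata(
    title="Example parish register counts",
    description="Yearly counts of baptisms and burials (invented example)",
    collection_method="Manual transcription of registers",
    publisher="Example Historical Institute",
    area_covered="Example province",
    period_covered="1800-1850",
    release_date="2020-01-01",
)

missing = missing_fields(record)
print(missing)
```

When you evaluate a dataset you have found, a check like this corresponds to asking whether the archive tells you who collected the data, how, for which area and period, and under which licence.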

Visualization
A visual representation of data is often the most compelling way of communicating the data, bringing out its key features, correlations and outliers. Though many tools exist, creating a visualisation for a dataset is not an automatic process, but requires careful attention to the meaning of the variables, the relations between them and the stories inherent in the data, to design a visual representation that lets the message of the data shine through.

Credits and more concepts: Open Data Handbook.
Also see the glossary in BlackBoard for this course.

Methods

Source

Direct search: find datasets in just one step

Indirect search: find datasets through metadata in databases
1. Via secondary sources
  • Link to the dataset
  • Google "dataset" OR "data set"
  • Citation of the (published) dataset
  • Data journal
  • Data supplement to a scholarly journal
  • Multidisciplinary databases 
  • Databases by discipline

2. Via data archives

  • National data research repositories 
  • Data repositories by discipline 
     
  • Dataset Search (Google)
  • DataSearch (Elsevier)
  • DataCite 
  • Registry of data repositories (Re3Data) 
  • Google "data archive" OR "data repository"

3. Via statistical databases

  • Statistical databases
  • Google "statistics"

4. Via internet
  • Google "dataset" OR "data set"
  • Google 
  • Zanran
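
The Google searches in the table combine your own topic with quoted alternatives joined by OR, such as "dataset" OR "data set". A minimal Python sketch of how such a query string can be assembled is shown below. The function names and the example topic are hypothetical, and the search URL pattern (https://www.google.com/search?q=...) is an assumption based on Google's publicly visible URLs, not a documented API.

```python
from urllib.parse import urlencode


def build_query(topic, alternatives):
    """Combine a topic with quoted alternative phrases joined by OR."""
    quoted = " OR ".join(f'"{term}"' for term in alternatives)
    return f"{topic} {quoted}"


def build_search_url(query):
    """Turn the query into a search URL (URL pattern is an assumption)."""
    return "https://www.google.com/search?" + urlencode({"q": query})


# Hypothetical topic, combined with the search terms from the table above.
dataset_query = build_query("grain prices Amsterdam", ["dataset", "data set"])
archive_query = build_query("grain prices", ["data archive", "data repository"])

print(dataset_query)
print(archive_query)
print(build_search_url(dataset_query))
```

The quotation marks keep "data set" together as a phrase, and OR makes sure that both spellings are covered in a single search.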