This website contains a collection of datasets for visualization construction.
Most datasets are collected from their original sources and processed. Unless otherwise stated, all derived work is shared under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license. Please attribute the original sources when using these datasets.
The primary purpose of this collection is to demonstrate and evaluate visualization construction tools. Although considerable effort went into making these datasets correct and accurate, please be cautious when using them for serious analysis.
For questions or complaints, please email “visualization.datasets [at] gmail.com”.
Character Co-occurrence in Les Misérables
Character co-occurrence graph in Victor Hugo’s novel Les Misérables.
Original dataset compiled by Donald Knuth for the Stanford GraphBase; retrieved from Mike Bostock’s D3 example Force-Directed Graph.
Gapminder Dataset
Statistical data of countries from Gapminder World.
Retrieved from Gapminder.
Caltrain Schedule
Caltrain’s schedule.
Timetable data from Caltrain’s Website; distance information from Wikipedia - List of Caltrain stations; parsed and processed by the authors.
Weather Data for Seattle and Boston
Boston daily weather data in 2015 including temperature and precipitation.
Data collected from the National Centers for Environmental Information; accessed May 3rd, 2018; aggregated by the authors.
UCI Mushroom Dataset
The “Mushroom” dataset from the UCI Machine Learning Repository. We took a sample of 200 mushrooms from the original dataset.
UCI Machine Learning Repository: Mushroom Dataset. Mushroom records drawn from The Audubon Society Field Guide to North American Mushrooms (1981). G. H. Lincoff (Pres.), New York: Alfred A. Knopf.
Nightingale
Florence Nightingale’s data behind the “Diagram of the Causes of Mortality in the Army of the East”.
Nightingale, F., Farr, W., & Smith, A. (1859). A contribution to the sanitary history of the British army during the late war with Russia. John W. Parker and Son.
Polio Cases in the United States
The number of polio cases in the United States by year and state.
Retrieved from Project Tycho; aggregated into yearly values.
Auto MPG Dataset
The “Auto MPG Dataset” from the UCI Machine Learning Repository.
UCI Machine Learning Repository: Auto MPG Dataset. This dataset was taken from the StatLib library which is maintained at Carnegie Mellon University. The dataset was used in the 1983 American Statistical Association Exposition.
Best Books Selected by the New York Times from 2013 to 2017
Best books selected by the New York Times.
Retrieved from the source code of Tanyoung Kim’s Best Book Shelf.
Global Trade of Natural Resources in 2016
Global trade of natural resources in 2016; processed to contain only trades of more than $1,000,000,000 in value.
Chatham House (2018), ‘resourcetrade.earth’, https://resourcetrade.earth/.
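The value threshold described for this dataset can be sketched as a simple filter. This is only an illustration, not the authors' actual processing script: the field names (`exporter`, `importer`, `resource`, `value_usd`) and the sample rows are hypothetical, and the real column names in the source data may differ.

```python
# Minimal sketch of keeping only trades worth more than $1,000,000,000.
# Field names and sample rows are hypothetical, for illustration only.

THRESHOLD_USD = 1_000_000_000


def filter_large_trades(trades, threshold=THRESHOLD_USD):
    """Return the trades whose value is strictly greater than the threshold."""
    return [t for t in trades if t["value_usd"] > threshold]


# Made-up rows; these are not values from the dataset.
sample = [
    {"exporter": "A", "importer": "B", "resource": "x", "value_usd": 2_500_000_000},
    {"exporter": "A", "importer": "C", "resource": "y", "value_usd": 1_000_000_000},
    {"exporter": "B", "importer": "C", "resource": "z", "value_usd": 40_000_000},
]

large = filter_large_trades(sample)
```

The comparison is strict because the description says "more than", so a trade of exactly $1,000,000,000 is dropped in this sketch.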
World Greenhouse Gas Emissions in 2005
Greenhouse gas emissions by industry sectors and end user area.
Data extracted from the original chart from the World Resources Institute. See the working paper for more information.
Higher Education vs. Obesity
Obesity and higher education rates (BA degree) in the United States in 2016.
Obesity data is from “Prevalence of Self-Reported Obesity Among U.S. Adults by State and Territory, BRFSS, 2016”; Education data is from the U.S. Census Bureau, 2012-2016 American Community Survey 5-Year Estimates.
Microsoft Stock Price
Stock price of Microsoft from 1987 to 2018.
Retrieved from Macrotrends.
World Population 2017
World population in 2017, grouped by age and gender.
United Nations, Department of Economic and Social Affairs, Population Division (2017). World Population Prospects: The 2017 Revision, custom data acquired via website.
China PM2.5 Air Quality Index
China PM2.5 Air Quality Index. We averaged and then converted all values to Air Quality Index (AQI) using the Chinese scale (HJ 633-2012).
Berman, Lex, 2017, “China AQI Archive (Feb 2014 - Feb 2016)”, doi:10.7910/DVN/GHOXXO, Harvard Dataverse.
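For readers unfamiliar with the Chinese AQI scale, the conversion from a PM2.5 concentration to an index value is a piecewise-linear interpolation between breakpoints. The sketch below is an illustration under stated assumptions, not the authors' processing code: it assumes the 24-hour PM2.5 breakpoints commonly cited for HJ 633-2012, takes a concentration that has already been averaged, and leaves the result unrounded. The function name `pm25_to_aqi` is hypothetical, and the breakpoints should be checked against the standard before any serious use.

```python
# Sketch of a PM2.5 -> AQI conversion on the Chinese scale.
# Assumption: 24-hour PM2.5 breakpoints as commonly cited for HJ 633-2012.
# Each pair is (concentration in micrograms per cubic metre, sub-index value).
BREAKPOINTS = [
    (0, 0),
    (35, 50),
    (75, 100),
    (115, 150),
    (150, 200),
    (250, 300),
    (350, 400),
    (500, 500),
]


def pm25_to_aqi(concentration):
    """Linearly interpolate the index between the two surrounding breakpoints."""
    if concentration < 0:
        raise ValueError("concentration must be non-negative")
    # Values beyond the last breakpoint are capped at the top of the scale.
    if concentration >= BREAKPOINTS[-1][0]:
        return float(BREAKPOINTS[-1][1])
    for (c_lo, i_lo), (c_hi, i_hi) in zip(BREAKPOINTS, BREAKPOINTS[1:]):
        if concentration <= c_hi:
            return i_lo + (i_hi - i_lo) * (concentration - c_lo) / (c_hi - c_lo)
    raise AssertionError("unreachable: breakpoints cover the whole range")


# 55 lies halfway between the 35 and 75 breakpoints, so the index is
# halfway between 50 and 100.
example = pm25_to_aqi(55)
```

Rounding is deliberately left out, since the rounding rule applied to this dataset is not stated here.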
Per Capita Food Supply in 2013
Food supply in kcal/capita/day, with grand total and percentages.
Data is collected from FAOSTAT’s “Food Balance Sheets”.
Ranking of Carbon Dioxide Emissions
Carbon dioxide emission values of selected countries and the ranking among these countries.
Millennium Development Goals Indicators from the United Nations Statistics Division. Accessed May 4th, 2018.