This website contains a collection of datasets for visualization construction.

Most datasets are collected from their original sources and processed. Unless otherwise stated, all derived work is shared under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license. Please attribute the original sources when using these datasets.

The primary purpose of this collection is to demonstrate and evaluate visualization construction tools. Although considerable effort went into making these datasets correct and accurate, please be cautious when using them for serious analysis.

For questions or complaints, please email “visualization.datasets [at] gmail.com”.

Mobile OS Market Share

Description

Market share of well-known mobile operating systems from 2009 to 2016.

Source

Data retrieved from StatCounter Global Stats; aggregated by year by averaging the monthly percentages.
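
The yearly aggregation described above can be sketched as follows. This is only an illustration: the row layout (month string, OS name, share) and the sample figures are assumptions, not the actual StatCounter export or the authors' script.

```python
from collections import defaultdict
from statistics import mean

def yearly_average(monthly_rows):
    """Average monthly market-share values into one value per (year, OS)."""
    buckets = defaultdict(list)
    for month, os_name, share in monthly_rows:
        year = int(month[:4])  # "2012-03" -> 2012 (assumed date format)
        buckets[(year, os_name)].append(share)
    return {key: mean(values) for key, values in buckets.items()}

# Hypothetical sample rows, not real StatCounter figures.
sample = [
    ("2012-01", "Android", 20.0),
    ("2012-02", "Android", 30.0),
    ("2012-01", "iOS", 25.0),
    ("2012-02", "iOS", 35.0),
]
yearly = yearly_average(sample)
```

Note that averaging monthly shares weights every month equally, regardless of how much traffic each month represents.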

Character Co-occurrence in Les Misérables

Description

Character co-occurrence graph in Victor Hugo’s novel Les Misérables.

Source

Original dataset compiled by Donald Knuth for the Stanford GraphBase; retrieved from Mike Bostock’s D3 example Force-Directed Graph.

Gapminder Dataset

Description

Statistical data of countries from Gapminder World.

Source

Retrieved from Gapminder.

Caltrain Schedule

Description

Caltrain’s schedule.

Source

Timetable data from Caltrain’s website; distance information from Wikipedia’s List of Caltrain stations; parsed and processed by the authors.

Weather Data for Seattle and Boston

Description

Daily weather data for Seattle and Boston in 2015, including temperature and precipitation.

Source

Data collected from the National Centers for Environmental Information; accessed May 3rd, 2018; aggregated by the authors.

UCI Mushroom Dataset

Description

The “Mushroom” dataset from the UCI Machine Learning Repository. We took a sample of 200 mushrooms from the original dataset.

Source

UCI Machine Learning Repository: Mushroom Dataset. Mushroom records drawn from The Audubon Society Field Guide to North American Mushrooms (1981). G. H. Lincoff (Pres.), New York: Alfred A. Knopf.

Nightingale

Description

Florence Nightingale’s data on the Diagram of the Causes of Mortality in the Army of the East.

Source

Nightingale, F., Farr, W., & Smith, A. (1859). A contribution to the sanitary history of the British army during the late war with Russia. John W. Parker and Son.

Polio Cases in the United States

Description

The number of polio cases in the United States by year and state.

Source

Retrieved from Project Tycho; aggregated into yearly values.

Auto MPG Dataset

Description

The “Auto MPG Dataset” from the UCI Machine Learning Repository.

Source

UCI Machine Learning Repository: Auto MPG Dataset. This dataset was taken from the StatLib library which is maintained at Carnegie Mellon University. The dataset was used in the 1983 American Statistical Association Exposition.

Best books selected by the New York Times from 2013 to 2017

Description

Best books selected by the New York Times.

Source

Retrieved from the source code of Tanyoung Kim’s Best Book Shelf.

Global Trade of Natural Resources in 2016

Description

Global trade of natural resources in 2016; processed to contain only trades of more than $1,000,000,000 in value.
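
A minimal sketch of the threshold filter mentioned above, assuming each trade flow is a record with a value in US dollars. The field names and sample figures are hypothetical and are not taken from the resourcetrade.earth data.

```python
THRESHOLD_USD = 1_000_000_000  # $1,000,000,000, as stated in the description

def large_trades(trades, threshold=THRESHOLD_USD):
    """Keep only trade flows whose value is strictly greater than the threshold."""
    return [t for t in trades if t["value_usd"] > threshold]

# Hypothetical records; field names and figures are illustrative only.
trades = [
    {"exporter": "A", "importer": "B", "value_usd": 2_500_000_000},
    {"exporter": "A", "importer": "C", "value_usd": 1_000_000_000},
    {"exporter": "B", "importer": "C", "value_usd": 400_000_000},
]
kept = large_trades(trades)
```

The comparison is strict, matching the wording "more than"; a flow of exactly $1,000,000,000 is dropped.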

Source

Chatham House (2018), ‘resourcetrade.earth’, https://resourcetrade.earth/.

World Greenhouse Gas Emissions in 2005

Description

Greenhouse gas emissions by industry sector and end-use area.

Source

Data extracted from the original chart from the World Resources Institute. See the working paper for more information.

Higher Education vs. Obesity

Description

Obesity and higher education rates (BA degree) in the United States in 2016.

Microsoft Stock Price

Description

Stock price of Microsoft from 1987 to 2018.

Source

Retrieved from Macrotrends.

World Population 2017

Description

World population in 2017, grouped by age and gender.

Source

United Nations, Department of Economic and Social Affairs, Population Division (2017). World Population Prospects: The 2017 Revision, custom data acquired via website.

China PM2.5 Air Quality Index

Description

China PM2.5 Air Quality Index. We averaged and then converted all values to Air Quality Index (AQI) using the Chinese scale (HJ 633-2012).
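
The average-then-convert step might look like the sketch below. It rests on several assumptions: that the inputs are PM2.5 concentrations in µg/m³, that the 24-hour breakpoints listed are the ones in HJ 633-2012 (they are reproduced from memory and should be checked against the standard), and that the result is left unrounded. It is not the authors' actual processing code.

```python
from statistics import mean

# PM2.5 24-hour breakpoints (µg/m³) and matching individual AQI values,
# as recalled from HJ 633-2012; verify against the standard before use.
PM25_BREAKPOINTS = [0, 35, 75, 115, 150, 250, 350, 500]
IAQI_LEVELS = [0, 50, 100, 150, 200, 300, 400, 500]

def pm25_to_iaqi(concentration):
    """Linearly interpolate a PM2.5 concentration to an individual AQI."""
    if concentration < 0:
        raise ValueError("concentration must be non-negative")
    if concentration >= PM25_BREAKPOINTS[-1]:
        return float(IAQI_LEVELS[-1])  # the scale is capped at 500
    for i in range(len(PM25_BREAKPOINTS) - 1):
        bp_lo, bp_hi = PM25_BREAKPOINTS[i], PM25_BREAKPOINTS[i + 1]
        if bp_lo <= concentration < bp_hi:
            iaqi_lo, iaqi_hi = IAQI_LEVELS[i], IAQI_LEVELS[i + 1]
            slope = (iaqi_hi - iaqi_lo) / (bp_hi - bp_lo)
            return slope * (concentration - bp_lo) + iaqi_lo

def averaged_iaqi(readings):
    """Average the readings first, then convert the mean to an index."""
    return pm25_to_iaqi(mean(readings))

# Hypothetical readings, not values from the archive.
readings = [45.0, 55.0, 65.0]
index = averaged_iaqi(readings)
```

Because the scale is piecewise linear, averaging before converting can give a different result than converting each reading and then averaging.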

Source

Berman, Lex, 2017, “China AQI Archive (Feb 2014 - Feb 2016)”, doi:10.7910/DVN/GHOXXO, Harvard Dataverse.

Per Capita Food Supply in 2013

Description

Food supply in kcal/capita/day, with grand totals and percentages.
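
As an illustration of how a grand total and percentage shares can be derived from per-category supply values, here is a small sketch. The category names and numbers are made up and do not come from FAOSTAT.

```python
def total_and_shares(supply):
    """Return the grand total and each category's share of it in percent."""
    total = sum(supply.values())
    shares = {name: 100 * value / total for name, value in supply.items()}
    return total, shares

# Hypothetical kcal/capita/day values for one country.
supply = {"Cereals": 1200, "Meat": 300, "Other": 1500}
total, shares = total_and_shares(supply)
```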

Source

Data collected from FAOSTAT’s “Food Balance Sheets”.

Ranking of Carbon Dioxide Emissions

Description

Carbon dioxide emission values of selected countries and the ranking among these countries.
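
One way to derive such a ranking is sketched below, assuming rank 1 goes to the highest emitter and ties are broken by sort order. The country labels and values are placeholders, and the authors' actual ranking rule is not stated.

```python
def rank_by_emissions(emissions):
    """Map each country to its rank, with 1 being the largest emitter."""
    ordered = sorted(emissions.items(), key=lambda item: item[1], reverse=True)
    return {country: rank for rank, (country, _) in enumerate(ordered, start=1)}

# Placeholder values, not real emission figures.
emissions = {"A": 5.0, "B": 9.0, "C": 1.0}
ranks = rank_by_emissions(emissions)
```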

Source

Millennium Development Goals Indicators from the United Nations Statistics Division. Accessed May 4th, 2018.