Pantheon 1.0, A Manually Verified Dataset of Globally Famous Biographies (doi:10.7910/DVN/28201)

Document Description

Citation

Title:

Pantheon 1.0, A Manually Verified Dataset of Globally Famous Biographies

Identification Number:

doi:10.7910/DVN/28201

Distributor:

Harvard Dataverse

Date of Distribution:

2016-01-04

Version:

1

Bibliographic Citation:

Yu, Amy Zhao; Ronen, Shahar; Hu, Kevin; Lu, Tiffany; Hidalgo, Cesar, 2016, "Pantheon 1.0, A Manually Verified Dataset of Globally Famous Biographies", https://doi.org/10.7910/DVN/28201, Harvard Dataverse, V1

Study Description

Citation

Title:

Pantheon 1.0, A Manually Verified Dataset of Globally Famous Biographies

Identification Number:

doi:10.7910/DVN/28201

Authoring Entity:

Yu, Amy Zhao (Macro Connections, MIT Media Lab)

Ronen, Shahar (Macro Connections, MIT Media Lab)

Hu, Kevin (Macro Connections, MIT Media Lab)

Lu, Tiffany (Macro Connections, MIT Media Lab)

Hidalgo, Cesar (Macro Connections, MIT Media Lab)

Producer:

Yu, Amy Zhao

Date of Production:

2013

Distributor:

Harvard Dataverse

Distributor:

Macro Connections

Access Authority:

Amy Yu

Date of Deposit:

2014-12-14

Date of Distribution:

2014-12

Holdings Information:

https://doi.org/10.7910/DVN/28201

Study Scope

Keywords:

Social Sciences

Abstract:

We present the Pantheon 1.0 dataset: a manually verified dataset of individuals who have transcended linguistic, temporal, and geographic boundaries. The Pantheon 1.0 dataset includes the 11,341 biographies present in more than 25 languages in Wikipedia and is enriched with: (i) manually verified demographic information (place and date of birth, gender); (ii) a taxonomy of occupations classifying each biography at three levels of aggregation; and (iii) two measures of global popularity: the number of languages in which a biography is present in Wikipedia (L), and the Historical Popularity Index (HPI), a metric that combines information on L, time since birth, and page-views (2008-2013). We compare the Pantheon 1.0 dataset to data from the 2003 book Human Accomplishment, and also to external measures of accomplishment in individual games and sports: tennis, swimming, car racing, and chess. In all of these cases we find that measures of popularity (L and HPI) correlate highly with individual accomplishment, suggesting that measures of global popularity proxy the historical impact of individuals.

Time Period:

6000 BC-2013

Unit of Analysis:

Individuals

Kind of Data:

TSV

Methodology and Processing

Sources Statement

Data Access

Notes:

CC0 1.0 (http://creativecommons.org/publicdomain/zero/1.0)

Other Study Description Materials

Related Publications

Citation

Title:

Pantheon 1.0, a manually verified dataset of globally famous biographies

Identification Number:

10.1038/sdata.2015.75

Bibliographic Citation:

Yu AZ, Ronen S, Hu K, Lu T, Hidalgo CA (2016). Pantheon 1.0, a manually verified dataset of globally famous biographies. Scientific Data 2:150075.

Other Study-Related Materials

Label:

pageviews_2008-2013.tsv

Text:

Monthly pageviews for all individuals, across all languages, 2008-2013

Notes:

text/plain; charset=US-ASCII

Other Study-Related Materials

Label:

pantheon.tsv

Text:

Flattened data file with all individuals in Pantheon

Notes:

text/tab-separated-values

Other Study-Related Materials

Label:

wikilangs.tsv

Text:

Language table linking Pantheon biographies to language editions of Wikipedia

Notes:

text/plain; charset=UTF-8
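
Example (illustrative only):

The three files listed above are tab-separated text, so they can be read with a standard TSV parser. The following is a minimal sketch, not part of the dataset's own documentation. It assumes that each file has a header row, and the column name "L" used for the number of language editions is hypothetical: this codebook does not list the actual column headers, so check the first line of pantheon.tsv before relying on it. The in-memory sample rows are made up for demonstration.

```python
import csv
import io


def read_tsv(source, encoding="utf-8"):
    """Read a tab-separated file into a list of dicts keyed by header name.

    `source` may be a file path (str) or an already-open text file object.
    Quoting is disabled so that quote characters in names are kept literally.
    """
    if isinstance(source, str):
        with open(source, newline="", encoding=encoding) as handle:
            return list(csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE))
    return list(csv.DictReader(source, delimiter="\t", quoting=csv.QUOTE_NONE))


def filter_min_languages(rows, column, threshold=25):
    """Keep rows whose integer value in `column` is strictly greater than `threshold`."""
    return [row for row in rows if int(row[column]) > threshold]


# Tiny in-memory stand-in for pantheon.tsv; names, values, and the
# "L" column header are assumptions, not taken from the real file.
SAMPLE = "name\tL\nPerson A\t30\nPerson B\t12\n"
sample_rows = read_tsv(io.StringIO(SAMPLE))
kept = filter_min_languages(sample_rows, column="L")
print([row["name"] for row in kept])
```

For the real files, a call such as read_tsv("pantheon.tsv") would replace the in-memory sample. The default UTF-8 encoding also covers pageviews_2008-2013.tsv, since US-ASCII is a subset of UTF-8.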