Replication Data for: Not Just Conspiracy Theories: Vaccine Opponents and Proponents add to the COVID-19 ‘Infodemic’ on Twitter (doi:10.7910/DVN/9ICICY)

Document Description

Citation

Title:

Replication Data for: Not Just Conspiracy Theories: Vaccine Opponents and Proponents add to the COVID-19 ‘Infodemic’ on Twitter

Identification Number:

doi:10.7910/DVN/9ICICY

Distributor:

Harvard Dataverse

Date of Distribution:

2020-08-26

Version:

1

Bibliographic Citation:

Broniatowski, David, 2020, "Replication Data for: Not Just Conspiracy Theories: Vaccine Opponents and Proponents add to the COVID-19 ‘Infodemic’ on Twitter", https://doi.org/10.7910/DVN/9ICICY, Harvard Dataverse, V1

Study Description

Citation

Title:

Replication Data for: Not Just Conspiracy Theories: Vaccine Opponents and Proponents add to the COVID-19 ‘Infodemic’ on Twitter

Identification Number:

doi:10.7910/DVN/9ICICY

Authoring Entity:

Broniatowski, David (The George Washington University)

Distributor:

Harvard Dataverse

Access Authority:

Broniatowski, David

Depositor:

Broniatowski, David

Date of Deposit:

2020-08-25

Holdings Information:

https://doi.org/10.7910/DVN/9ICICY

Study Scope

Keywords:

Computer and Information Science, Medicine, Health and Life Sciences, Social Sciences

Abstract:

These files contain the data required to replicate all findings in the referenced paper. Files include:

1) 2000_Account_IDs.txt -- a tab-separated text file listing the top 2000 accounts mentioning vaccine-related keywords in calendar year 2019.

2) users_ids.csv -- a comma-separated file listing all tweet IDs containing coronavirus-related keywords generated by each of the 2000 accounts. The first entry on each line is a username, followed by a list of tweet IDs.

3) users_botscores.txt -- a tab-separated text file listing the bot scores generated from querying Botometer on March 2, 2020. The first entry is the raw (English) bot score and the second entry is the CAP score.

4) corona_topic_keys.txt -- the top 20 words for each of 35 topics generated using the LDA algorithm fit to all tweets listed in users_ids.csv.

5) corona_doc_topics.txt -- LDA model topic results fit to each tweet in users_ids.csv. The second column corresponds to the tweet ID, and the following 35 columns are topic proportions for topics 0-34, respectively.

Methodology and Processing

Sources Statement

Data Access

Notes:

CC0 1.0 (http://creativecommons.org/publicdomain/zero/1.0)

Other Study Description Materials

Other Study-Related Materials

Label:

2000_Account_IDs.txt

Text:

User ID, retweet count, and annotations for the 2000 most prolific accounts in the vaccine stream Twitter archive for calendar year 2019.

Notes:

text/plain

Other Study-Related Materials

Label:

corona_doc_topics.txt

Text:

LDA model topic results fit to each tweet in users_ids.csv. The second column corresponds to the tweet ID, and the following 35 columns are topic proportions for topics 0-34, respectively.

Notes:

text/plain
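
The column layout described for corona_doc_topics.txt can be read with a short parser. The following is a minimal sketch, not the author's code: it assumes fields are separated by whitespace (the delimiter is not stated in this record) and that the first column can be ignored. The names parse_doc_topics_line and dominant_topic are hypothetical helpers introduced for illustration.

```python
from typing import List, Tuple

NUM_TOPICS = 35  # topics 0-34, as stated in the file description


def parse_doc_topics_line(line: str) -> Tuple[str, List[float]]:
    """Return (tweet_id, topic_proportions) for one line of the file.

    Assumption: fields are whitespace-separated; the second field is the
    tweet ID and the next 35 fields are the topic proportions.
    """
    fields = line.split()
    if len(fields) < 2 + NUM_TOPICS:
        raise ValueError(f"expected at least {2 + NUM_TOPICS} fields, got {len(fields)}")
    tweet_id = fields[1]  # kept as a string so long IDs are not altered
    proportions = [float(value) for value in fields[2:2 + NUM_TOPICS]]
    return tweet_id, proportions


def dominant_topic(proportions: List[float]) -> int:
    """Return the index (0-34) of the topic with the largest proportion."""
    return max(range(len(proportions)), key=proportions.__getitem__)
```

Keeping the tweet ID as a string is a deliberate choice here: tweet IDs are long integers, and tools that convert them to floating point can silently change the trailing digits.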

Other Study-Related Materials

Label:

corona_topic_keys.txt

Text:

The top 20 words for each of 35 topics generated using the LDA algorithm fit to all tweets listed in users_ids.csv

Notes:

text/plain

Other Study-Related Materials

Label:

users_botscores.txt

Text:

A tab-separated text file listing the bot scores generated from querying Botometer on March 2, 2020. The first entry is the raw (English) bot score and the second entry is the CAP score.

Notes:

text/plain

Other Study-Related Materials

Label:

users_ids.csv

Text:

A comma-separated file listing all tweet IDs containing coronavirus-related keywords generated by each of the 2000 accounts. The first entry on each line is a username, followed by a list of tweet IDs.

Notes:

text/csv
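
Because each line of users_ids.csv holds a username followed by a variable number of tweet IDs, the file is ragged and is easier to read row by row than as a fixed-width table. The following is a minimal sketch under that description; it assumes there is no header row, and read_user_tweet_ids is a hypothetical helper name, not part of the deposited materials.

```python
import csv
import io
from typing import Dict, List, TextIO


def read_user_tweet_ids(handle: TextIO) -> Dict[str, List[str]]:
    """Map each username to its list of tweet IDs (kept as strings).

    Assumption: no header row; the first entry on each line is the
    username and every remaining entry is a tweet ID.
    """
    tweets_by_user: Dict[str, List[str]] = {}
    for row in csv.reader(handle):
        if not row:
            continue  # skip blank lines
        username, *tweet_ids = row
        cleaned = [tid.strip() for tid in tweet_ids if tid.strip()]
        tweets_by_user.setdefault(username.strip(), []).extend(cleaned)
    return tweets_by_user


# Usage with made-up data; with the real file, pass
# open("users_ids.csv", newline="") instead of the StringIO object.
sample = io.StringIO("example_user,111,222\n")
print(read_user_tweet_ids(sample))
```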