|
Part 1: Document Description
|
|
Citation |
|
|---|---|
|
Title: |
Replication Data for: Not Just Conspiracy Theories: Vaccine Opponents and Proponents add to the COVID-19 ‘Infodemic’ on Twitter |
|
Identification Number: |
doi:10.7910/DVN/9ICICY |
|
Distributor: |
Harvard Dataverse |
|
Date of Distribution: |
2020-08-26 |
|
Version: |
1 |
|
Bibliographic Citation: |
Broniatowski, David, 2020, "Replication Data for: Not Just Conspiracy Theories: Vaccine Opponents and Proponents add to the COVID-19 ‘Infodemic’ on Twitter", https://doi.org/10.7910/DVN/9ICICY, Harvard Dataverse, V1 |
|
Citation |
|
|
Title: |
Replication Data for: Not Just Conspiracy Theories: Vaccine Opponents and Proponents add to the COVID-19 ‘Infodemic’ on Twitter |
|
Identification Number: |
doi:10.7910/DVN/9ICICY |
|
Authoring Entity: |
Broniatowski, David (The George Washington University) |
|
Distributor: |
Harvard Dataverse |
|
Access Authority: |
Broniatowski, David |
|
Depositor: |
Broniatowski, David |
|
Date of Deposit: |
2020-08-25 |
|
Holdings Information: |
https://doi.org/10.7910/DVN/9ICICY |
|
Study Scope |
|
|
Keywords: |
Computer and Information Science, Medicine, Health and Life Sciences, Social Sciences |
|
Abstract: |
These files contain the data required to replicate all findings in the referenced paper. Files include: 1) 2000_Account_IDs.txt -- a tab-separated text file listing the top 2000 accounts mentioning vaccine-related keywords in calendar year 2019. 2) users_ids.csv -- a comma-separated file listing all tweet IDs containing coronavirus-related keywords generated by each of the 2000 accounts. The first entry on each line is a username, followed by a list of tweet IDs. 3) users_botscores.txt -- a tab-separated text file listing the bot scores generated from querying Botometer on March 2, 2020. The first entry is the raw (English) bot score and the second entry is the CAP score. 4) corona_topic_keys.txt -- the top 20 words for each of 35 topics generated using the LDA algorithm fit to all tweets listed in users_ids.csv. 5) corona_doc_topics.txt -- LDA model topic results fit to each tweet in users_ids.csv. The second column corresponds to the tweet ID, and the following 35 columns are topic proportions for topics 0-34, respectively. |
|
Methodology and Processing |
|
|
Sources Statement |
|
|
Data Access |
|
|
Notes: |
CC0 1.0 (http://creativecommons.org/publicdomain/zero/1.0) |
|
Other Study Description Materials |
|
|
Label: |
2000_Account_IDs.txt |
|
Text: |
User ID, retweet count, and annotations for the 2000 most prolific accounts in the vaccine stream Twitter archive for calendar year 2019. |
|
Notes: |
text/plain |
|
Label: |
corona_doc_topics.txt |
|
Text: |
LDA model topic results fit to each tweet in users_ids.csv. The second column corresponds to the tweet ID, and the following 35 columns are topic proportions for topics 0-34, respectively. |
|
Notes: |
text/plain |
|
Label: |
corona_topic_keys.txt |
|
Text: |
The top 20 words for each of 35 topics generated using the LDA algorithm fit to all tweets listed in users_ids.csv.
|
Notes: |
text/plain |
|
Label: |
users_botscores.txt |
|
Text: |
A tab-separated text file listing the bot scores generated from querying Botometer on March 2, 2020. The first entry is the raw (English) bot score and the second entry is the CAP score. |
|
Notes: |
text/plain |
|
Label: |
users_ids.csv |
|
Text: |
A comma-separated file listing all tweet IDs containing coronavirus-related keywords generated by each of the 2000 accounts. The first entry on each line is a username, followed by a list of tweet IDs. |
|
Notes: |
text/csv |
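
The file layouts described above for users_ids.csv and corona_doc_topics.txt could be read along the following lines. This is only a minimal sketch, not the depositor's own code: the function names are hypothetical, and it assumes that corona_doc_topics.txt is tab-separated and that its first column is a row index (the record states only that the second column is the tweet ID and that 35 topic proportions follow).

```python
import csv

N_TOPICS = 35  # the record states 35 topics, numbered 0-34


def parse_users_ids(lines):
    """Map each username to its tweet IDs, as described for users_ids.csv."""
    users = {}
    for row in csv.reader(lines):
        if not row:
            continue  # skip blank lines
        username, *tweet_ids = row
        # Keep IDs as strings so no downstream tool rounds them as floats.
        users[username.strip()] = [t.strip() for t in tweet_ids if t.strip()]
    return users


def parse_doc_topics(lines, delimiter="\t"):
    """Yield (tweet_id, proportions) for rows shaped like corona_doc_topics.txt.

    Assumption: fields are tab-separated and column 1 is a row index; the
    record only says column 2 is the tweet ID, followed by 35 proportions.
    """
    for line in lines:
        fields = line.rstrip("\n").split(delimiter)
        if len(fields) < 2 + N_TOPICS:
            continue  # skip short rows such as comments or blank lines
        tweet_id = fields[1]
        proportions = [float(x) for x in fields[2:2 + N_TOPICS]]
        yield tweet_id, proportions


def dominant_topic(proportions):
    """Return the index (0-34) of the topic with the largest proportion."""
    return max(range(len(proportions)), key=proportions.__getitem__)
```

Both parsers accept any iterable of text lines, so an open file handle can be passed directly, for example `parse_users_ids(open("users_ids.csv", newline=""))`. If the actual delimiter differs from a tab, the `delimiter` argument would need to change.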