Open e-commerce 1.0: Five years of crowdsourced U.S. Amazon purchase histories with user demographics (doi:10.7910/DVN/YGLYDY)

Document Description

Citation

Title:

Open e-commerce 1.0: Five years of crowdsourced U.S. Amazon purchase histories with user demographics

Identification Number:

doi:10.7910/DVN/YGLYDY

Distributor:

Harvard Dataverse

Date of Distribution:

2023-12-02

Version:

1

Bibliographic Citation:

Alex Berke; Dan Calacci; Robert Mahari; Takahiro Yabe; Kent Larson; Sandy Pentland, 2023, "Open e-commerce 1.0: Five years of crowdsourced U.S. Amazon purchase histories with user demographics", https://doi.org/10.7910/DVN/YGLYDY, Harvard Dataverse, V1, UNF:6:mV4isMgPXhqWeiQ3gZmmNQ== [fileUNF]

Study Description

Citation

Title:

Open e-commerce 1.0: Five years of crowdsourced U.S. Amazon purchase histories with user demographics

Identification Number:

doi:10.7910/DVN/YGLYDY

Authoring Entity:

Alex Berke (MIT Media Lab)

Dan Calacci (Princeton University & MIT Media Lab)

Robert Mahari (MIT Media Lab & Harvard Law School)

Takahiro Yabe (MIT Institute of Data, Systems, and Society (IDSS) & New York University Center for Urban Science and Progress)

Kent Larson (MIT Media Lab)

Sandy Pentland (MIT Media Lab)

Distributor:

Harvard Dataverse

Access Authority:

Alex Berke

Depositor:

Alex Berke

Date of Deposit:

2023-12-02

Holdings Information:

https://doi.org/10.7910/DVN/YGLYDY

Study Scope

Keywords:

Social Sciences, Other, e-commerce, purchase histories, crowdsourced

Abstract:

This dataset contains longitudinal purchase data from 5027 Amazon.com users in the US, spanning 2018 through 2022, in amazon-purchases.csv.

It also includes demographic data and other consumer-level variables for each user in the dataset. These variables were collected through an online survey and are included in survey.csv.

fields.csv describes the columns in survey.csv, where each field (survey column) corresponds to a survey question.

The dataset also contains the survey instrument used to collect the data. More details about the survey questions, the possible responses, and the format in which they were presented can be found in the survey instrument.

A 'Survey ResponseID' column is present in both amazon-purchases.csv and survey.csv. It links a user's survey responses to their Amazon.com purchases. The 'Survey ResponseID' was randomly generated at the time of data collection.

amazon-purchases.csv

Each row in this file corresponds to an Amazon order and has the following columns:

  • Survey ResponseID

  • Order date

  • Shipping address state

  • Purchase price per unit

  • Quantity

  • ASIN/ISBN (Product Code)

  • Title

  • Category

The data were exported by the Amazon users from Amazon.com and shared with their informed consent. PII and other information not listed above were stripped from the data. This processing occurred on users' machines before the data were shared with researchers.
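The link between the two files through 'Survey ResponseID' can be illustrated with a minimal sketch. It uses only the Python standard library and a few made-up rows, not values from the dataset. The column names come from the abstract and the variable list; the helper names (`load_rows`, `spend_by_respondent`, `attach_survey`), the `total_spend` field, the sample IDs, and the date format are assumptions for illustration.

```python
import csv
import io
from collections import defaultdict

ID_COLUMN = "Survey ResponseID"  # present in both files per the abstract


def load_rows(handle):
    """Read an open CSV file object into a list of dicts keyed by column name."""
    return list(csv.DictReader(handle))


def spend_by_respondent(purchase_rows):
    """Sum price-per-unit times quantity for each Survey ResponseID."""
    totals = defaultdict(float)
    for row in purchase_rows:
        try:
            amount = float(row["Purchase price per unit"]) * float(row["Quantity"])
        except (TypeError, ValueError):
            continue  # skip rows with blank or non-numeric values
        totals[row[ID_COLUMN]] += amount
    return dict(totals)


def attach_survey(totals, survey_rows):
    """Pair each respondent's total with their survey row (matched IDs only)."""
    by_id = {row[ID_COLUMN]: row for row in survey_rows}
    return [
        {**by_id[rid], "total_spend": total}
        for rid, total in totals.items()
        if rid in by_id
    ]


# Made-up sample rows standing in for the real files (hypothetical values).
purchases = load_rows(io.StringIO(
    "Survey ResponseID,Order date,Purchase price per unit,Quantity\n"
    "R_a,2019-01-05,10.00,2\n"
    "R_a,2020-03-11,5.50,1\n"
    "R_b,2021-07-30,3.00,1\n"
))
survey = load_rows(io.StringIO(
    "Survey ResponseID,Q-demos-state\n"
    "R_a,Example State A\n"
    "R_b,Example State B\n"
))

linked = attach_survey(spend_by_respondent(purchases), survey)
for row in linked:
    print(row)
```

With the downloaded files, the same helpers could be fed from `open("amazon-purchases.csv", newline="", encoding="utf-8")` and the corresponding call for survey.csv, assuming both files are comma-separated and UTF-8 encoded.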

Notes:

The dataset is provided for research purposes and should not be used to re-identify study participants.

The Amazon.com purchases data were crowdsourced and shared through an online survey. Survey participants were recruited via the online research platforms Prolific and CloudResearch. They were offered $0.35 for an estimated 1 minute prescreen and $1.50 for the main survey, with an estimated 4-7 minute completion time. To be eligible for the survey, participants had to be 18 years or older, be a U.S. resident and English speaker, and have an active Amazon account that they could sign into during the survey and had been using since 2018.

The survey prompted participants to share their Amazon data with informed consent, with the option to consent or decline to share. Participants were paid for completing the survey whether or not they chose to share their data.

The survey tool also embedded an experiment designed to test the impact of various data transparency levels and incentives on participants' likelihood to share their Amazon data. In addition, the survey tool enabled an empirical study of the privacy paradox. More information about the survey tool, data collection process, experiment design, and experiment results can be found in our related publication.

All software used in the data collection process is available via an open source repository: https://github.com/aberke/amazon-study

This data collection and publication was approved by the MIT Institutional Review Board (protocol #2205000649).

Methodology and Processing

Sources Statement

Data Access

Other Study Description Materials

Related Publications

Citation

Title:

Berke, A., Mahari, R., Pentland, S., Larson, K., Calacci, D. Insights from an experiment crowdsourcing data from thousands of US Amazon users: The importance of transparency, money, and data use. (In review).

Bibliographic Citation:

Berke, A., Mahari, R., Pentland, S., Larson, K., Calacci, D. Insights from an experiment crowdsourcing data from thousands of US Amazon users: The importance of transparency, money, and data use. (In review).

File Description--f7616231

File: survey.csv

  • Number of cases: 5027

  • No. of variables per record: 23

  • Type of File: text/tab-separated-values

Notes:

UNF:6:mV4isMgPXhqWeiQ3gZmmNQ==

Variable Description

List of Variables:

Variables

Survey ResponseID

f7616231 Location:

Variable Format: character

Notes: UNF:6:iz2A5v9x7adZNdEPWLCm2Q==

Q-demos-age

f7616231 Location:

Variable Format: character

Notes: UNF:6:dLUBf2tOG8/JsQRLSkMKZQ==

Q-demos-hispanic

f7616231 Location:

Variable Format: character

Notes: UNF:6:BLxin5ByfSPzCrpTY2BalQ==

Q-demos-race

f7616231 Location:

Variable Format: character

Notes: UNF:6:sRU0N9Bmj7ODj0GAIv8YrA==

Q-demos-education

f7616231 Location:

Variable Format: character

Notes: UNF:6:Dy85SedHQmVhbJ802JE1/A==

Q-demos-income

f7616231 Location:

Variable Format: character

Notes: UNF:6:4fDoYjBlQeNq22Qvd7KG5Q==

Q-demos-gender

f7616231 Location:

Variable Format: character

Notes: UNF:6:vnpt1jm20rWh3wrzffkGpg==

Q-sexual-orientation

f7616231 Location:

Variable Format: character

Notes: UNF:6:fkX0UbYjqdeFWX2UxelKdw==

Q-demos-state

f7616231 Location:

Variable Format: character

Notes: UNF:6:S3CNaQVpJhjFtvKEBkP85Q==

Q-amazon-use-howmany

f7616231 Location:

Variable Format: character

Notes: UNF:6:aNqb7RcK8awCBKZBfwsp7A==

Q-amazon-use-hh-size

f7616231 Location:

Variable Format: character

Notes: UNF:6:UqCXolcN5Wi82ozhI8u3lg==

Q-amazon-use-how-oft

f7616231 Location:

Variable Format: character

Notes: UNF:6:hrnSS1ocB/NDqHGL+ibJ1g==

Q-substance-use-cigarettes

f7616231 Location:

Variable Format: character

Notes: UNF:6:g96p55a8iPlmB0Fx4q+N2w==

Q-substance-use-marijuana

f7616231 Location:

Variable Format: character

Notes: UNF:6:1Mj8LvdkinBqhiavZAtSEw==

Q-substance-use-alcohol

f7616231 Location:

Variable Format: character

Notes: UNF:6:X5i/cwJ5iFdwr5IBf+n5Mw==

Q-personal-diabetes

f7616231 Location:

Variable Format: character

Notes: UNF:6:368f0rkAv8tNlzayKVrwbg==

Q-personal-wheelchair

f7616231 Location:

Variable Format: character

Notes: UNF:6:QImnKtVfuomhxAwRuRjCkw==

Q-life-changes

f7616231 Location:

Variable Format: character

Notes: UNF:6:zpnoYT+sa6I173rcsBipBQ==

Q-sell-YOUR-data

f7616231 Location:

Variable Format: character

Notes: UNF:6:mzyywvGaJDSTQZBPBco2+w==

Q-sell-consumer-data

f7616231 Location:

Variable Format: character

Notes: UNF:6:yDbCN1ySU/lNJTKNEPcUrw==

Q-small-biz-use

f7616231 Location:

Variable Format: character

Notes: UNF:6:lI2+2goCVkQ2IT0U3F1oYQ==

Q-census-use

f7616231 Location:

Variable Format: character

Notes: UNF:6:SVUC8pNogySI/We8UABzqA==

Q-research-society

f7616231 Location:

Variable Format: character

Notes: UNF:6:ATmCZMPJdmbiz/6LgdBlEA==
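Because every variable listed above has the character format, a first look at survey.csv usually means counting how often each response string occurs. The following is a minimal sketch using the Python standard library. The `tabulate` helper and the sample response values are hypothetical. The delimiter is left as a parameter because the file description lists the type as text/tab-separated-values while the file name ends in .csv.

```python
import csv
import io
from collections import Counter


def tabulate(handle, column, delimiter=","):
    """Count how often each response string occurs in one survey column."""
    reader = csv.DictReader(handle, delimiter=delimiter)
    if column not in (reader.fieldnames or []):
        raise KeyError(f"column not found: {column}")
    return Counter(row[column] for row in reader)


# Made-up sample standing in for survey.csv (hypothetical response values).
sample = io.StringIO(
    "Survey ResponseID\tQ-demos-age\n"
    "R_a\tgroup 1\n"
    "R_b\tgroup 2\n"
    "R_c\tgroup 1\n"
)

counts = tabulate(sample, "Q-demos-age", delimiter="\t")
for response, n in counts.most_common():
    print(response, n)
```

The actual response options for each question are described in fields.csv and the survey instrument, so those should be consulted before interpreting any counts.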

Other Study-Related Materials

Label:

amazon-purchases.csv

Text:

Amazon purchases data

Notes:

text/csv

Other Study-Related Materials

Label:

fields.csv

Text:

Names and descriptions of columns in survey.csv

Notes:

text/csv

Other Study-Related Materials

Label:

prescreen-survey-instrument.pdf

Notes:

application/pdf

Other Study-Related Materials

Label:

survey-instrument.pdf

Notes:

application/pdf