Part 1: Document Description
|
|
Citation |
|
|---|---|
|
Title: |
Open e-commerce 1.0: Five years of crowdsourced U.S. Amazon purchase histories with user demographics |
|
Identification Number: |
doi:10.7910/DVN/YGLYDY |
|
Distributor: |
Harvard Dataverse |
|
Date of Distribution: |
2023-12-02 |
|
Version: |
1 |
|
Bibliographic Citation: |
Alex Berke; Dan Calacci; Robert Mahari; Takahiro Yabe; Kent Larson; Sandy Pentland, 2023, "Open e-commerce 1.0: Five years of crowdsourced U.S. Amazon purchase histories with user demographics", https://doi.org/10.7910/DVN/YGLYDY, Harvard Dataverse, V1, UNF:6:mV4isMgPXhqWeiQ3gZmmNQ== [fileUNF] |
|
Citation |
|
|
Authoring Entity: |
Alex Berke (MIT Media Lab) |
|
Dan Calacci (Princeton University & MIT Media Lab) |
|
|
Robert Mahari (MIT Media Lab & Harvard Law School) |
|
|
Takahiro Yabe (MIT Institute of Data, Systems, and Society (IDSS) & New York University Center for Urban Science and Progress) |
|
|
Kent Larson (MIT Media Lab) |
|
|
Sandy Pentland (MIT Media Lab) |
|
|
Distributor: |
Harvard Dataverse |
|
Access Authority: |
Alex Berke |
|
Depositor: |
Alex Berke |
|
Date of Deposit: |
2023-12-02 |
|
Holdings Information: |
https://doi.org/10.7910/DVN/YGLYDY |
|
Study Scope |
|
|
Keywords: |
Social Sciences, Other, e-commerce, purchase histories, crowdsourced |
|
Abstract: |
This dataset contains longitudinal purchases data from 5027 Amazon.com users in the US, spanning 2018 through 2022: amazon-purchases.csv

It also includes demographic data and other consumer level variables for each user with data in the dataset. These consumer level variables were collected through an online survey and are included in survey.csv

fields.csv describes the columns in the survey.csv file, where fields/survey columns correspond to survey questions.

The dataset also contains the survey instrument used to collect the data. More details about the survey questions and possible responses, and the format in which they were presented can be found by viewing the survey instrument.

A 'Survey ResponseID' column is present in both the amazon-purchases.csv and survey.csv files. It links a user's survey responses to their Amazon.com purchases. The 'Survey ResponseID' was randomly generated at the time of data collection.

amazon-purchases.csv

Each row in this file corresponds to an Amazon order. Each such row has the following columns:

- Survey ResponseID
- Order date
- Shipping address state
- Purchase price per unit
- Quantity
- ASIN/ISBN (Product Code)
- Title
- Category

The data were exported by the Amazon users from Amazon.com and shared by users with their informed consent. PII and other information not listed above were stripped from the data. This processing occurred on users' machines before sharing with researchers. |
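The link between the two files through 'Survey ResponseID' can be illustrated with a minimal sketch. It uses only the Python standard library and tiny made-up rows in place of the real files. The purchase column names are the ones listed above; the survey column `Q-example`, the ID values, the date format, and the assumption that prices and quantities parse as plain numbers are all hypothetical and should be checked against fields.csv and the actual data.

```python
import csv
import io
from collections import defaultdict

# Made-up stand-ins for amazon-purchases.csv and survey.csv. Every value,
# and the "Q-example" survey column, is invented for illustration only.
PURCHASES_CSV = """Survey ResponseID,Order date,Shipping address state,Purchase price per unit,Quantity,ASIN/ISBN (Product Code),Title,Category
R_aaa,2019-03-01,MA,10.00,2,B000000001,Example item A,EXAMPLE_CATEGORY
R_aaa,2021-07-15,MA,4.50,1,B000000002,Example item B,EXAMPLE_CATEGORY
R_bbb,2020-11-30,NY,20.00,1,B000000003,Example item C,EXAMPLE_CATEGORY
"""

SURVEY_CSV = """Survey ResponseID,Q-example
R_aaa,answer 1
R_bbb,answer 2
"""


def read_rows(text):
    """Parse CSV text into a list of dicts keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


def link_purchases_to_survey(purchase_rows, survey_rows):
    """Attach each purchase row to the survey row sharing its 'Survey ResponseID'."""
    survey_by_id = {row["Survey ResponseID"]: row for row in survey_rows}
    linked = []
    for purchase in purchase_rows:
        survey_row = survey_by_id.get(purchase["Survey ResponseID"])
        if survey_row is None:
            continue  # skip purchases with no matching survey response
        linked.append({**survey_row, **purchase})
    return linked


def spend_per_user(purchase_rows):
    """Sum 'Purchase price per unit' times 'Quantity' for each user."""
    totals = defaultdict(float)
    for purchase in purchase_rows:
        # Assumes both fields are plain numeric strings; real data may need cleaning.
        price = float(purchase["Purchase price per unit"])
        quantity = float(purchase["Quantity"])
        totals[purchase["Survey ResponseID"]] += price * quantity
    return dict(totals)


purchases = read_rows(PURCHASES_CSV)
survey = read_rows(SURVEY_CSV)
linked = link_purchases_to_survey(purchases, survey)
totals = spend_per_user(purchases)
```

With the real files, the in-memory strings would be replaced by reading amazon-purchases.csv and survey.csv from disk; a dataframe library would work equally well for the same join.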
|
Notes: |
The dataset is provided for research purposes and should not be used to re-identify study participants.

The Amazon.com purchases data were crowdsourced and shared through an online survey. Survey participants were recruited via the online research platforms Prolific and CloudResearch. They were offered $0.35 for an estimated 1 minute prescreen and $1.50 for the main survey, with an estimated 4-7 minute completion time. To be eligible for the survey, participants had to meet the following requirements: be 18 years or older, be a U.S. resident and English speaker, and have an active Amazon account that they could sign into during the survey and that they had been using since 2018.

The survey prompted participants to share their Amazon data with informed consent, with the option to consent or decline to share. Participants were paid for completing the survey whether or not they chose to share their data.

The survey tool also embedded an experiment designed to test the impact of various data transparency levels and incentives on participants' likelihood to share their Amazon data. In addition, the survey tool enabled an empirical study of the privacy paradox. More information about the survey tool, data collection process, experiment design, and experiment results can be found in our related publication.

All software used in the data collection process is available via an open source repository: https://github.com/aberke/amazon-study

This data collection and publication was approved by the MIT Institutional Review Board (protocol #2205000649). |
|
Methodology and Processing |
|
|
Sources Statement |
|
|
Data Access |
|
|
Other Study Description Materials |
|
|
Related Publications |
|
|
Citation |
|
|
Title: |
Insights from an experiment crowdsourcing data from thousands of US Amazon users: The importance of transparency, money, and data use |
|
Bibliographic Citation: |
Berke, A., Mahari, R., Pentland, S., Larson, K., Calacci, D. Insights from an experiment crowdsourcing data from thousands of US Amazon users: The importance of transparency, money, and data use. (In review). |
|
File Description--f7616231 |
|
|
File: survey.csv |
|
|
|
|
Notes: |
UNF:6:mV4isMgPXhqWeiQ3gZmmNQ== |
|
List of Variables: |
|
|
Variables |
|
|
| Location | Variable Format | Notes |
|---|---|---|
| f7616231 | character | UNF:6:iz2A5v9x7adZNdEPWLCm2Q== |
| f7616231 | character | UNF:6:dLUBf2tOG8/JsQRLSkMKZQ== |
| f7616231 | character | UNF:6:BLxin5ByfSPzCrpTY2BalQ== |
| f7616231 | character | UNF:6:sRU0N9Bmj7ODj0GAIv8YrA== |
| f7616231 | character | UNF:6:Dy85SedHQmVhbJ802JE1/A== |
| f7616231 | character | UNF:6:4fDoYjBlQeNq22Qvd7KG5Q== |
| f7616231 | character | UNF:6:vnpt1jm20rWh3wrzffkGpg== |
| f7616231 | character | UNF:6:fkX0UbYjqdeFWX2UxelKdw== |
| f7616231 | character | UNF:6:S3CNaQVpJhjFtvKEBkP85Q== |
| f7616231 | character | UNF:6:aNqb7RcK8awCBKZBfwsp7A== |
| f7616231 | character | UNF:6:UqCXolcN5Wi82ozhI8u3lg== |
| f7616231 | character | UNF:6:hrnSS1ocB/NDqHGL+ibJ1g== |
| f7616231 | character | UNF:6:g96p55a8iPlmB0Fx4q+N2w== |
| f7616231 | character | UNF:6:1Mj8LvdkinBqhiavZAtSEw== |
| f7616231 | character | UNF:6:X5i/cwJ5iFdwr5IBf+n5Mw== |
| f7616231 | character | UNF:6:368f0rkAv8tNlzayKVrwbg== |
| f7616231 | character | UNF:6:QImnKtVfuomhxAwRuRjCkw== |
| f7616231 | character | UNF:6:zpnoYT+sa6I173rcsBipBQ== |
| f7616231 | character | UNF:6:mzyywvGaJDSTQZBPBco2+w== |
| f7616231 | character | UNF:6:yDbCN1ySU/lNJTKNEPcUrw== |
| f7616231 | character | UNF:6:lI2+2goCVkQ2IT0U3F1oYQ== |
| f7616231 | character | UNF:6:SVUC8pNogySI/We8UABzqA== |
| f7616231 | character | UNF:6:ATmCZMPJdmbiz/6LgdBlEA== |
|
Label: |
amazon-purchases.csv |
|
Text: |
Amazon purchases data |
|
Notes: |
text/csv |
|
Label: |
fields.csv |
|
Text: |
Names and descriptions of columns in survey.csv |
|
Notes: |
text/csv |
|
Label: |
prescreen-survey-instrument.pdf |
|
Notes: |
application/pdf |
|
Label: |
survey-instrument.pdf |
|
Notes: |
application/pdf |