<?xml version="1.0" encoding="UTF-8"?>
<resource xmlns="http://datacite.org/schema/kernel-4" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://datacite.org/schema/kernel-4 http://schema.datacite.org/meta/kernel-4.5/metadata.xsd">
  <identifier identifierType="DOI">10.7910/DVN/YGLYDY</identifier>
  <creators>
    <creator>
      <creatorName nameType="Personal">Alex Berke</creatorName>
      <givenName>Alex</givenName>
      <familyName>Berke</familyName>
      <nameIdentifier nameIdentifierScheme="ORCID" schemeURI="https://orcid.org">https://orcid.org/0000-0001-5996-0557</nameIdentifier>
      <affiliation>MIT Media Lab</affiliation>
    </creator>
    <creator>
      <creatorName nameType="Personal">Dan Calacci</creatorName>
      <givenName>Dan</givenName>
      <familyName>Calacci</familyName>
      <nameIdentifier nameIdentifierScheme="ORCID" schemeURI="https://orcid.org">https://orcid.org/0000-0002-9552-1137</nameIdentifier>
      <affiliation>Princeton University &amp; MIT Media Lab</affiliation>
    </creator>
    <creator>
      <creatorName nameType="Personal">Robert Mahari</creatorName>
      <givenName>Robert</givenName>
      <familyName>Mahari</familyName>
      <nameIdentifier nameIdentifierScheme="ORCID" schemeURI="https://orcid.org">https://orcid.org/0000-0003-2372-2746</nameIdentifier>
      <affiliation>MIT Media Lab &amp; Harvard Law School</affiliation>
    </creator>
    <creator>
      <creatorName nameType="Personal">Takahiro Yabe</creatorName>
      <givenName>Takahiro</givenName>
      <familyName>Yabe</familyName>
      <nameIdentifier nameIdentifierScheme="ORCID" schemeURI="https://orcid.org">https://orcid.org/0000-0001-8967-1967</nameIdentifier>
      <affiliation>MIT Institute of Data, Systems, and Society (IDSS) &amp; New York University Center for Urban Science and Progress</affiliation>
    </creator>
    <creator>
      <creatorName nameType="Personal">Kent Larson</creatorName>
      <givenName>Kent</givenName>
      <familyName>Larson</familyName>
      <affiliation>MIT Media Lab</affiliation>
    </creator>
    <creator>
      <creatorName nameType="Personal">Sandy Pentland</creatorName>
      <givenName>Sandy</givenName>
      <familyName>Pentland</familyName>
      <affiliation>MIT Media Lab</affiliation>
    </creator>
  </creators>
  <titles>
    <title>Open e-commerce 1.0: Five years of crowdsourced U.S. Amazon purchase histories with user demographics</title>
  </titles>
  <publisher>Harvard Dataverse</publisher>
  <publicationYear>2023</publicationYear>
  <subjects>
    <subject>Social Sciences</subject>
    <subject>Other</subject>
    <subject>e-commerce</subject>
    <subject>purchase histories</subject>
    <subject>crowdsourced</subject>
  </subjects>
  <contributors>
    <contributor contributorType="ContactPerson">
      <contributorName nameType="Personal">Alex Berke</contributorName>
      <givenName>Alex</givenName>
      <familyName>Berke</familyName>
      <affiliation>Massachusetts Institute of Technology</affiliation>
    </contributor>
  </contributors>
  <dates>
    <date dateType="Submitted">2023-12-02</date>
    <date dateType="Available">2023-12-02</date>
  </dates>
  <resourceType resourceTypeGeneral="Dataset"/>
  <relatedIdentifiers>
    <relatedIdentifier relationType="IsSupplementTo" schemeURI="https://github.com" relatedIdentifierType="URL">https://github.com/aberke/amazon-study/blob/master/data-collection-survey-experiment.pdf</relatedIdentifier>
  </relatedIdentifiers>
  <sizes>
    <size>1560714</size>
    <size>393144</size>
    <size>2532</size>
    <size>1333077</size>
    <size>313070173</size>
  </sizes>
  <formats>
    <format>text/tab-separated-values</format>
    <format>application/pdf</format>
    <format>text/csv</format>
    <format>application/pdf</format>
    <format>text/csv</format>
  </formats>
  <version>1.0</version>
  <rightsList>
    <rights rightsURI="info:eu-repo/semantics/openAccess"/>
    <rights rightsURI="http://creativecommons.org/publicdomain/zero/1.0" rightsIdentifier="CC0-1.0" rightsIdentifierScheme="SPDX" schemeURI="https://spdx.org/licenses/" xml:lang="en">Creative Commons CC0 1.0 Universal Public Domain Dedication.</rights>
  </rightsList>
  <descriptions>
    <description descriptionType="Abstract">This dataset contains longitudinal purchase data from 5027 Amazon.com users in the US, spanning 2018 through 2022, in amazon-purchases.csv.&lt;br&gt;
It also includes demographic data and other consumer-level variables for each user with data in the dataset. These consumer-level variables were collected through an online survey and are included in survey.csv.
&lt;br&gt;
fields.csv describes the columns in the survey.csv file, where fields/survey columns correspond to survey questions.
&lt;br&gt;
&lt;br&gt;
The dataset also contains the survey instrument used to collect the data.
More details about the survey questions, the possible responses, and the format in which they were presented can be found in the survey instrument.
&lt;br&gt;
&lt;br&gt;
A 'Survey ResponseID' column is present in both the amazon-purchases.csv and survey.csv files. It links a user's survey responses to their Amazon.com purchases. The 'Survey ResponseID' was randomly generated at the time of data collection.
&lt;br&gt;&lt;br&gt;
&lt;b&gt;amazon-purchases.csv&lt;/b&gt;
&lt;br&gt;
Each row in this file corresponds to an Amazon order and has the following columns:
&lt;ul&gt;
&lt;li&gt;Survey ResponseID&lt;/li&gt;
&lt;li&gt;Order date&lt;/li&gt;
&lt;li&gt;Shipping address state&lt;/li&gt;
&lt;li&gt;Purchase price per unit&lt;/li&gt;
&lt;li&gt;Quantity&lt;/li&gt;
&lt;li&gt;ASIN/ISBN (Product Code)&lt;/li&gt;
&lt;li&gt;Title&lt;/li&gt;
&lt;li&gt;Category&lt;/li&gt;
&lt;/ul&gt;

&lt;br&gt;
The data were exported by the Amazon users from Amazon.com and shared by users with their informed consent. PII and other information not listed above were stripped from the data. This processing occurred on users' machines before the data were shared with researchers.</description>
    <description descriptionType="Other">The dataset is provided for research purposes and should not be used to re-identify study participants.
&lt;br&gt;&lt;br&gt;
The Amazon.com purchases data were crowdsourced and shared through an online survey. Survey participants were recruited via the online research platforms Prolific and CloudResearch.
They were offered $0.35 for an estimated 1 minute prescreen and $1.50 for the main survey, with an estimated 4-7 minute completion time.
To be eligible for the survey, participants had to be 18 years or older, be a U.S. resident and English speaker, and have an active Amazon account that they could sign into during the survey and had been using since 2018.
&lt;br&gt;
The survey prompted participants to share their Amazon data with informed consent, with the option to consent or decline to share. Participants were paid for completing the survey whether or not they chose to share their data.
&lt;br&gt;&lt;br&gt;
The survey tool also embedded an experiment designed to test the impact of various data transparency levels and incentives on participants' likelihood to share their Amazon data. In addition, the survey tool enabled an empirical study of the privacy paradox. More information about the survey tool, data collection process, experiment design, and experiment results can be found in our related publication.
&lt;br&gt;&lt;br&gt;
All software used in the data collection process is available via an open source repository: https://github.com/aberke/amazon-study 
&lt;br&gt;&lt;br&gt;
This data collection and publication was approved by the MIT Institutional Review Board (protocol #2205000649).</description>
  </descriptions>
</resource>
