Large Dataset of Generalization Patterns in the Number Game (doi:10.7910/DVN/A8ZWLF)

Document Description

Citation

Title:

Large Dataset of Generalization Patterns in the Number Game

Identification Number:

doi:10.7910/DVN/A8ZWLF

Distributor:

Harvard Dataverse

Date of Distribution:

2018-08-10

Version:

1

Bibliographic Citation:

Bigelow, Eric J.; Piantadosi, Steven T., 2018, "Large Dataset of Generalization Patterns in the Number Game", https://doi.org/10.7910/DVN/A8ZWLF, Harvard Dataverse, V1, UNF:6:zUgVtjc9CKvWc4pB//Qp6A== [fileUNF]

Study Description

Citation

Title:

Large Dataset of Generalization Patterns in the Number Game

Identification Number:

doi:10.7910/DVN/A8ZWLF

Authoring Entity:

Bigelow, Eric J. (University of Rochester)

Piantadosi, Steven T. (University of Rochester)

Distributor:

Harvard Dataverse

Access Authority:

Bigelow, Eric J.

Access Authority:

Piantadosi, Steven T.

Depositor:

Bigelow, Eric

Date of Deposit:

2015-05-19

Holdings Information:

https://doi.org/10.7910/DVN/A8ZWLF

Study Scope

Keywords:

Social Sciences, Generalization; Bayesian inference; Structured cognitive model; Numerical cognition; Concept learning

Abstract:

272,700 two-alternative forced-choice responses in a simple numerical task modeled after Tenenbaum (1999, 2000), collected from 606 Amazon Mechanical Turk workers. Subjects were shown sets of 1 to 4 numbers drawn from the range 1 to 100 (e.g. {12, 16}) and asked which other numbers were likely to belong to that set (e.g. 1, 5, 2, 98). Their generalization patterns reflect both rule-like (e.g. “even numbers,” “powers of two”) and distance-based (e.g. numbers near 50) generalization. This dataset is available for further analysis of these simple and intuitive inferences, for the development of hands-on modeling instruction, and for attempts to understand how probability and rules interact in human cognition.

Date of Collection:

2015-03-27 to 2015-04-14

Notes:

A technical report describing this dataset is to be reviewed by the Journal of Open Psychology Data (JOPD).

Methodology and Processing

Sources Statement

Data Access

Notes:

CC0 1.0 (http://creativecommons.org/publicdomain/zero/1.0)

Other Study Description Materials

Related Publications

Citation

Title:

Tenenbaum, J. B. (2000). Rules and similarity in concept learning. Advances in neural information processing systems, 12, 59-65.

Bibliographic Citation:

Tenenbaum, J. B. (2000). Rules and similarity in concept learning. Advances in neural information processing systems, 12, 59-65.

Citation

Title:

Tenenbaum, J. B. (1999). A Bayesian framework for concept learning (Doctoral dissertation, Massachusetts Institute of Technology).

Bibliographic Citation:

Tenenbaum, J. B. (1999). A Bayesian framework for concept learning (Doctoral dissertation, Massachusetts Institute of Technology).

Citation

Title:

Tenenbaum, J. B. & Griffiths, T. L. (2001). Generalization, similarity, and Bayesian inference. Behavioral and Brain Sciences, 24(4), 629-640.

Bibliographic Citation:

Tenenbaum, J. B. & Griffiths, T. L. (2001). Generalization, similarity, and Bayesian inference. Behavioral and Brain Sciences, 24(4), 629-640.

File Description--f2677703

File: instructions_rt.tab

  • Number of cases: 1848

  • No. of variables per record: 3

  • Type of File: text/tab-separated-values

Notes:

UNF:6:iESvM+aEgU47IjH6BHkjKQ==

Time spent looking at each instruction page

File Description--f2696204

File: numbergame_data.tab

  • Number of cases: 272700

  • No. of variables per record: 14

  • Type of File: text/tab-separated-values

Notes:

UNF:6:WXjlwG01u+JA91yxbdD1lQ==

Primary dataset, with one row per response. Includes rating, reaction time, demographic information, and values calculated from the ratings. See README.txt for more information.
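
As a rough illustration of how the response rows might be aggregated, the sketch below computes, for each (set, target) pair, the fraction of responses with rating 1. It relies only on column names listed in this codebook (set, target, rating). The sample rows and the set label "A" are hypothetical; the actual formatting of the set column is documented in README.txt, not here.

```python
import csv
import io
from collections import defaultdict


def predictive_distribution(tsv_file):
    """Return {(set, target): fraction of responses with rating == 1}."""
    counts = defaultdict(lambda: [0, 0])  # key -> [yes responses, total responses]
    for row in csv.DictReader(tsv_file, delimiter="\t"):
        key = (row["set"], int(float(row["target"])))
        counts[key][0] += int(float(row["rating"]))
        counts[key][1] += 1
    return {key: yes / total for key, (yes, total) in counts.items()}


# Hypothetical rows for illustration only; not taken from the dataset.
sample = io.StringIO(
    "set\tid\trating\ttarget\n"
    "A\t0\t1\t8\n"
    "A\t1\t1\t8\n"
    "A\t2\t0\t8\n"
    "A\t0\t0\t51\n"
)
dist = predictive_distribution(sample)
```

To run this against the real file, pass an open file handle instead of the sample, for example `open("numbergame_data.tab", newline="")`, assuming the file has been downloaded to the working directory.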

File Description--f2696205

File: set_descriptions.tab

  • Number of cases: 3030

  • No. of variables per record: 3

  • Type of File: text/tab-separated-values

Notes:

UNF:6:V6ifJ0pY1z91leZAIj9tpw==

Survey

Subjective concept descriptions collected during post-experiment questionnaire

File Description--f2696206

File: show_set_rt.tab

  • Number of cases: 9090

  • No. of variables per record: 3

  • Type of File: text/tab-separated-values

Notes:

UNF:6:sQfwTes+du4DHwtVfMxRyw==

Amount of time spent on each page initially displaying the set to the subject, before they rate corresponding targets
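
The case counts above can be cross-checked against the 606 subjects reported in the abstract. The sketch below simply divides each file's case count by the number of subjects. Any reading of the results (for example, that each subject saw 15 sets) is an inference from these counts, not something this codebook states.

```python
SUBJECTS = 606  # number of workers reported in the abstract

# Case counts as listed in the file descriptions of this codebook.
cases = {
    "instructions_rt.tab": 1848,
    "numbergame_data.tab": 272700,
    "set_descriptions.tab": 3030,
    "show_set_rt.tab": 9090,
}

# Average number of rows per subject in each file.
per_subject = {name: n / SUBJECTS for name, n in cases.items()}

for name, value in per_subject.items():
    print(f"{name}: {value:.2f} rows per subject")
```

Three of the four files divide evenly (450, 5, and 15 rows per subject). instructions_rt.tab does not (1848 is 30 more than 3 × 606), so the number of rows per subject evidently varies in that file.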

Variable Description

List of Variables:

Variables

id

f2677703 Location:

Summary Statistics: Min. 0.0; Valid 1848.0; Max. 605.0; StDev 175.50800268732442; Mean 304.21699134199145

Variable Format: numeric

Notes: UNF:6:563FkXPiuPpJ16iXMvNZ/g==

instruct

f2677703 Location:

Summary Statistics: Valid 1848.0; Mean 0.9962121212121213; Max. 2.0; Min. 0.0; StDev 0.8150497923242312

Variable Format: numeric

Notes: UNF:6:wmb5Z7lhwTku1Vyk+RbWdA==

rt

f2677703 Location:

Summary Statistics: Mean 20164.270021644974; Max. 1203980.0; StDev 56602.40861969358; Valid 1848.0; Min. 765.0

Variable Format: numeric

Notes: UNF:6:bdHoOszOBY3JYFyjshlEyw==

set

f2696204 Location:

Variable Format: character

Notes: UNF:6:USC536CdDHy2F7oU1YPDSQ==

id

f2696204 Location:

Summary Statistics: StDev 174.93721413408534; Mean 302.5; Max. 605.0; Min. 0.0; Valid 272700.0

Variable Format: numeric

Notes: UNF:6:/sxlEdvcJbaG1ejjKgSUxA==

rating

f2696204 Location:

Summary Statistics: Max. 1.0; Valid 272700.0; Mean 0.31780344701137037; StDev 0.4656234649488847; Min. 0.0

Variable Format: numeric

Notes: UNF:6:Wkpy7yRk+7ox2U8ql37JQg==

rt

f2696204 Location:

Summary Statistics: Mean 1402.7500018362568; Min. 0.0; Max. 29871.0; StDev 1782.78759187141; Valid 272294.0

Variable Format: numeric

Notes: UNF:6:+nKd8thj4gecSinq+LIgoQ==

target

f2696204 Location:

Summary Statistics: Valid 272700.0; Max. 100.0; Min. 1.0; StDev 28.856082102559842; Mean 50.4583425009167

Variable Format: numeric

Notes: UNF:6:6eVstPE/hILQMmEU336Hcw==

trial

f2696204 Location:

Summary Statistics: StDev 129.90372799799133; Valid 272700.0; Max. 449.0; Mean 224.5; Min. 0.0

Variable Format: numeric

Notes: UNF:6:CBf4il/XNiwkXf51ESDJkQ==

hits

f2696204 Location:

Summary Statistics: Max. 22.0; Mean 10.993465346534757; Valid 272700.0; Min. 9.0; StDev 1.9703060035263393

Variable Format: numeric

Notes: UNF:6:+mJgExQooUKyVxmhqXlxVw==

p

f2696204 Location:

Summary Statistics: Valid 272700.0; StDev 0.2563681780648187; Min. 0.0; Max. 1.0; Mean 0.3178034028236157

Variable Format: numeric

Notes: UNF:6:e4D8iciWm0Onh9LkfOuT4g==

H

f2696204 Location:

Summary Statistics: Min. 0.0; StDev 0.23534174308806563; Mean 0.4513916883755037; Valid 272700.0; Max. 0.69315

Variable Format: numeric

Notes: UNF:6:H0Qx+3EZ857j06enIf+5xQ==
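
The reported maximum of H, 0.69315, equals ln 2 to five decimal places, which is the entropy (in nats) of a binary response at probability 0.5. That suggests, but does not confirm, that H is the binary entropy of p; README.txt and recompute_columns.py are the authoritative definitions. Under that assumption, a minimal sketch of the calculation is:

```python
import math


def binary_entropy(p):
    """Entropy in nats of a yes/no response with P(yes) = p.

    Assumption: this mirrors how the 'H' column relates to 'p';
    the codebook itself does not state the formula.
    """
    if p <= 0.0 or p >= 1.0:
        return 0.0  # a certain outcome carries no uncertainty
    return -(p * math.log(p) + (1.0 - p) * math.log(1.0 - p))


# The entropy peaks at p = 0.5, where it equals ln 2.
h_max = binary_entropy(0.5)
```

Here `binary_entropy` is a hypothetical helper name introduced for illustration; it is not taken from the dataset's scripts.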

age

f2696204 Location:

Summary Statistics: Valid 271800.0; Mean 34.177152317882026; Min. 18.0; StDev 10.941166469455093; Max. 68.0

Variable Format: numeric

Notes: UNF:6:8pqsbuLvaI9FiPIawy2FoA==

firstlang

f2696204 Location:

Variable Format: character

Notes: UNF:6:HwHuFcvw6voC79Xf+PXWGw==

zipcode

f2696204 Location:

Summary Statistics: Valid 266400.0; StDev 29748.983836944077; Max. 99508.0; Min. 1035.0; Mean 50552.793918917916

Variable Format: numeric

Notes: UNF:6:C1SwBmTKAc0HBUrk3Pc3Sg==

gender

f2696204 Location:

Variable Format: character

Notes: UNF:6:8hSsSl740PyYelv/cfwBxw==

education

f2696204 Location:

Variable Format: character

Notes: UNF:6:g7BMn0+wAcBtOxyAQcH/8Q==

id

f2696205 Location:

Summary Statistics: Max. 605.0; Mean 302.5; Valid 3030.0; Min. 0.0; StDev 174.96576800502618

Variable Format: numeric

Notes: UNF:6:cSbo6FYgYg/fVGtYtX0Wxw==

set

f2696205 Location:

Variable Format: character

Notes: UNF:6:Hkg1VFApr6VcCR2Fy5kKVw==

descr

f2696205 Location:

Variable Format: character

Notes: UNF:6:M1m5GdaCew2PYjyDr4ed2g==

set

f2696206 Location:

Variable Format: character

Notes: UNF:6:wD9LySYhDKWvxye1rgc1ew==

id

f2696206 Location:

Summary Statistics: StDev 174.9465166688833; Min. 0.0; Max. 605.0; Mean 302.5; Valid 9090.0

Variable Format: numeric

Notes: UNF:6:xtdR3yB9gQdMXZ69sPidQg==

rt

f2696206 Location:

Summary Statistics: Min. 678.0; Max. 1510936.0; StDev 27034.445131966153; Mean 9310.921782178262; Valid 9090.0

Variable Format: numeric

Notes: UNF:6:GULEgoD7u4CuQEEJnV3OqQ==

Other Study-Related Materials

Label:

plot_all.R

Text:

Generates a large plot of predictive distributions across all subjects for every set in the dataset (see predictive_all.pdf)

Notes:

text/plain; charset=US-ASCII

Other Study-Related Materials

Label:

plot_compare.R

Text:

Plots the predictive distributions of multiple specified sets for comparison. The file also includes a short list of set lists with interesting patterns to compare.

Notes:

text/plain; charset=US-ASCII

Other Study-Related Materials

Label:

plot_focus.R

Text:

Same as `plot_compare.R`, but highlights certain targets to show how the predictive distributions of multiple concepts may reflect common patterns.

Notes:

text/plain; charset=US-ASCII

Other Study-Related Materials

Label:

predictive_all.pdf

Text:

Large plot of predictive distributions across all subjects for every concept in the dataset

Notes:

application/pdf

Other Study-Related Materials

Label:

README.txt

Text:

Describes each file in detail, describes each column for .csv files

Notes:

text/plain

Other Study-Related Materials

Label:

recompute_columns.py

Text:

Loads the dataset into pandas and recomputes the 'p', 'H', 'hits', and 'typicality' columns

Notes:

text/x-python-script