Duolingo Dataverse (Duolingo)
Public research data sets from Duolingo
Apr 24, 2018
Settles, Burr, 2018, "Data for the 2018 Duolingo Shared Task on Second Language Acquisition Modeling (SLAM)", Harvard Dataverse, V3
This repository contains gzipped files containing more than 2 million tokens (words) from answers submitted by more than 6,000 students over the course of their first 30 days of using Duolingo. It also contains baseline starter code written in Python. There are three data sets, c...
Dec 14, 2017
Settles, Burr, 2017, "Replication Data for: A Trainable Spaced Repetition Model for Language Learning", Harvard Dataverse, V1
This is a gzipped CSV file containing the 13 million Duolingo student learning traces used in experiments by Settles & Meeder (2016). For more details and replication source code, visit: