<resource xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://datacite.org/schema/kernel-4" xsi:schemaLocation="http://datacite.org/schema/kernel-4 http://schema.datacite.org/meta/kernel-4.1/metadata.xsd"><identifier identifierType="DOI">10.7910/DVN/YDEPUT</identifier><creators><creator><creatorName nameType="Personal">Karol J. Piczak</creatorName><givenName>Karol</givenName><familyName>J. Piczak</familyName><nameIdentifier nameIdentifierScheme="ORCID">0000-0002-6115-0833</nameIdentifier><affiliation>Warsaw University of Technology</affiliation></creator></creators><titles><title>ESC: Dataset for Environmental Sound Classification</title></titles><publisher>Harvard Dataverse</publisher><publicationYear>2015</publicationYear><subjects><subject>Computer and Information Science</subject><subject>environmental sound</subject><subject>classification</subject><subject>dataset</subject></subjects><contributors><contributor contributorType="ContactPerson"><contributorName nameType="Personal">Karol J. Piczak</contributorName><givenName>Karol</givenName><familyName>J. 
Piczak</familyName><affiliation>Warsaw University of Technology</affiliation></contributor></contributors><dates><date dateType="Submitted">2015-04-16</date><date dateType="Updated">2015-10-18</date></dates><resourceType resourceTypeGeneral="Dataset"/><sizes><size>47241189</size><size>245122607</size><size>2219994</size><size>1125882634</size><size>1142718053</size><size>1136834867</size><size>1147758000</size><size>1143071997</size><size>1113169852</size><size>1110860210</size><size>1127506884</size><size>1140682675</size><size>1152263374</size><size>1125104912</size><size>1152944407</size><size>1141572080</size><size>1141156970</size><size>1141841010</size><size>1157872865</size><size>1145451715</size><size>1165537521</size><size>1149383565</size><size>1137901493</size><size>1148515733</size><size>1146274692</size><size>1153732692</size><size>1165665432</size><size>1153893991</size></sizes><formats><format>application/zip</format><format>application/zip</format><format>application/x-gzip</format><format>application/x-gzip</format><format>application/x-gzip</format><format>application/x-gzip</format><format>application/x-gzip</format><format>application/x-gzip</format><format>application/x-gzip</format><format>application/x-gzip</format><format>application/x-gzip</format><format>application/x-gzip</format><format>application/x-gzip</format><format>application/x-gzip</format><format>application/x-gzip</format><format>application/x-gzip</format><format>application/x-gzip</format><format>application/x-gzip</format><format>application/x-gzip</format><format>application/x-gzip</format><format>application/x-gzip</format><format>application/x-gzip</format><format>application/x-gzip</format><format>application/x-gzip</format><format>application/x-gzip</format><format>application/x-gzip</format><format>application/x-gzip</format><format>application/x-gzip</format></formats><version>2.0</version><rightsList><rights 
rightsURI="info:eu-repo/semantics/openAccess"/><rights/></rightsList><descriptions><description descriptionType="Abstract">&lt;p>The &lt;strong>ESC dataset&lt;/strong> is a collection of short environmental recordings available in a unified format (5-second-long clips, 44.1 kHz, single channel, Ogg Vorbis compressed @ 192 kbit/s). All clips have been extracted from public field recordings available through the &lt;a href="http://freesound.org">Freesound.org project&lt;/a>. Please see the README files for a detailed attribution list. The dataset is available under the terms of the &lt;a href="http://creativecommons.org/licenses/by-nc/3.0/">Creative Commons license - Attribution-NonCommercial&lt;/a>.&lt;/p>&#xd;
&#xd;
&lt;p>The dataset consists of three parts:&lt;/p>&#xd;
&lt;ul>&#xd;
&lt;li>&lt;strong>&lt;a href="https://github.com/karoldvl/ESC-50">ESC-50&lt;/a>&lt;/strong>: a labeled set of 2 000 environmental recordings (50 classes, 40 clips per class),&lt;/li>&#xd;
&lt;li>&lt;strong>&lt;a href="https://github.com/karoldvl/ESC-10">ESC-10&lt;/a>&lt;/strong>: a labeled set of 400 environmental recordings (10 classes, 40 clips per class); this is a subset of ESC-50, created initially as a proof-of-concept/standardized selection of easy recordings,&lt;/li>&#xd;
&lt;li>&lt;strong>ESC-US&lt;/strong>: an unlabeled dataset of 250 000 environmental recordings (5-second-long clips), suitable for unsupervised pre-training.&lt;/li>&#xd;
&lt;/ul>&#xd;
&#xd;
&lt;p>The ESC-US dataset, although not hand-annotated, includes the labels (tags) submitted by the original uploading users, which could potentially be used for weakly supervised learning (with noisy and/or missing labels). The ESC-10 and ESC-50 datasets have been prearranged into 5 uniformly sized folds so that clips extracted from the same original source recording are always contained in a single fold.&lt;/p>&#xd;
&#xd;
&lt;p>The labeled datasets are also available as GitHub projects: &lt;a href="https://github.com/karoldvl/ESC-50">ESC-50&lt;/a> | &lt;a href="https://github.com/karoldvl/ESC-10">ESC-10&lt;/a>.&lt;/p>&#xd;
&#xd;
&lt;p>For a more thorough description and analysis, please see the &lt;a href="http://karol.piczak.com/papers/Piczak2015-ESC-Dataset.pdf">original paper&lt;/a> and the &lt;a href="https://github.com/karoldvl/paper-2015-esc">supplementary IPython notebook&lt;/a>.&lt;/p>&#xd;
&#xd;
&lt;hr>&#xd;
&#xd;
&lt;p>The goal of this project is to facilitate open research initiatives in the field of environmental sound classification, as publicly available datasets in this domain are still quite scarce.&lt;/p>&#xd;
&#xd;
&lt;p>&lt;strong>Acknowledgments&lt;/strong>&lt;br/>&#xd;
I would like to thank &lt;a href="http://www.dtic.upf.edu/~ffont/">Frederic Font Corbera&lt;/a> for his help in using the Freesound API.&lt;/p></description><description descriptionType="Other">Notes</description></descriptions><geoLocations/></resource>