Stanford NLP Model Output for Biofuel Patent Classification (doi:10.7910/DVN/29374)

Document Description
Citation
Title:	Stanford NLP Model Output for Biofuel Patent Classification
Identification Number:	doi:10.7910/DVN/29374
Distributor:	Harvard Dataverse
Date of Distribution:	2015-03-06
Version:	1
Bibliographic Citation:	Kessler, Jeff, 2015, "Stanford NLP Model Output for Biofuel Patent Classification", https://doi.org/10.7910/DVN/29374, Harvard Dataverse, V1
Study Description
Citation
Title:	Stanford NLP Model Output for Biofuel Patent Classification
Identification Number:	doi:10.7910/DVN/29374
Authoring Entity:	Kessler, Jeff (University of California, Davis)
Distributor:	Harvard Dataverse
Distributor:	Harvard Dataverse Network
Access Authority:	Jeff Kessler
Date of Deposit:	2015-03-06
Date of Distribution:	2015
Holdings Information:	https://doi.org/10.7910/DVN/29374
Study Scope
Keywords:	Biofuel Classifier
Topic Classification:	Natural Language Processing
Abstract:	This model was generated with the Stanford NLP Classifier (available from http://nlp.stanford.edu/software/classifier.shtml). It was trained on a random selection of 700 manually classified biofuel patents from 1976 through 2013 and validated against 300 manually classified biofuel patents on January 3, 2014. The deposit includes the classification results and associated patent numbers for both the manually classified patents and the automatically categorized patents.
Time Period:	1976-2013
Geographic Coverage:	United States
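The abstract describes a random selection of 700 manually classified patents for training and 300 for validation. The Python sketch below shows one way such a split could be made. It is an illustration only, not the author's procedure: the function name `split_train_validation`, the fixed seed, the placeholder patent IDs, and the assumption that the 300 validation patents are the remainder of the same 1000-patent list are all hypothetical.

```python
import random


def split_train_validation(patent_ids, n_train=700, seed=0):
    """Randomly split a list of patent IDs into training and validation sets.

    The seed is an assumption added for reproducibility; the deposit does not
    state how the random selection was made.
    """
    ids = list(patent_ids)
    if n_train > len(ids):
        raise ValueError("n_train exceeds the number of patents")
    rng = random.Random(seed)
    rng.shuffle(ids)
    # First n_train shuffled IDs train the model; the rest validate it.
    return ids[:n_train], ids[n_train:]


# Hypothetical placeholder IDs standing in for the 1000 real patent numbers.
patent_ids = [f"P{i:04d}" for i in range(1000)]
train_ids, validation_ids = split_train_validation(patent_ids)
```

With 1000 input IDs and the default `n_train=700`, the remaining 300 IDs form the validation set, matching the counts given in the abstract.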
Methodology and Processing
Sources Statement
Data Access
Notes:	CC0 1.0 (http://creativecommons.org/publicdomain/zero/1.0)
Other Study Description Materials
Other Study-Related Materials
Label:	Manual Classification.csv
Text:	This is the initial list of 1000 manually classified patents used to train and validate the NLP model
Notes:	text/plain; charset=US-ASCII
Other Study-Related Materials
Label:	ner-model.ser.gz
Text:	This is the model generated by the Stanford NLP Classifier
Notes:	application/x-gzip
Other Study-Related Materials
Label:	NLP Classification.csv
Text:	This is the list of patents and associated classifications based on the NLP model that was trained using the manually classified patents
Notes:	text/plain; charset=US-ASCII
Other Study-Related Materials
Label:	patents_test.prop
Text:	This is the property file used for parameterizing the model
Notes:	text/plain; charset=US-ASCII
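The two CSV files listed above pair patent numbers with classifications, so the manual and model-assigned labels can be compared patent by patent. The Python sketch below shows one way to compute an agreement rate. The column names `patent_number` and `label`, the helper names, and the inline sample rows are all assumptions made for illustration; the actual column layout of the deposited files is not documented in this record.

```python
import csv
import io


def load_labels(csv_file, id_column, label_column):
    """Read a CSV file object into a {patent ID: label} dictionary."""
    reader = csv.DictReader(csv_file)
    return {row[id_column]: row[label_column] for row in reader}


def agreement_rate(manual, predicted):
    """Fraction of patents present in both mappings whose labels match."""
    shared = manual.keys() & predicted.keys()
    if not shared:
        raise ValueError("no patents in common")
    matches = sum(manual[p] == predicted[p] for p in shared)
    return matches / len(shared)


# Made-up rows with hypothetical column names, standing in for the real files.
manual_csv = io.StringIO("patent_number,label\n1,biofuel\n2,other\n3,biofuel\n")
nlp_csv = io.StringIO(
    "patent_number,label\n1,biofuel\n2,biofuel\n3,biofuel\n4,other\n"
)

manual = load_labels(manual_csv, "patent_number", "label")
predicted = load_labels(nlp_csv, "patent_number", "label")
rate = agreement_rate(manual, predicted)
```

To run this against the deposited files, replace the `io.StringIO` objects with `open("Manual Classification.csv", newline="")` and `open("NLP Classification.csv", newline="")`, and set the column names to whatever the files actually use.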