{"dcterms:modified":"2026-05-10","dcterms:creator":"Harvard Dataverse","@type":"ore:ResourceMap","schema:additionalType":"Dataverse OREMap Format v1.0.2","dvcore:generatedBy":{"@type":"schema:SoftwareApplication","schema:name":"Dataverse","schema:version":"6.10.1 build iqss-3","schema:url":"https://github.com/iqss/dataverse"},"@id":"https://dataverse.harvard.edu/api/datasets/export?exporter=OAI_ORE&persistentId=https://doi.org/10.7910/DVN/GM8T8Q","ore:describes":{"author":{"citation:authorName":"zhao, zhilong","citation:authorAffiliation":{"scheme":"http://www.grid.ac/ontology/","termName":"South China University of Technology","@type":"https://schema.org/Organization","@id":"https://ror.org/0530pts50"}},"citation:dsDescription":{"citation:dsDescriptionValue":"This replication package contains all code and data necessary to reproduce the results presented in \"Cross-Domain Quality Assessment for Complex Qualitative Analysis: Validating Confidence-Entropy Signals Across Legal, Political, and Medical Tasks\".\n\nResearch Context: This study extends beyond accessible coding tasks to validate automated quality assessment for complex qualitative analysis requiring domain expertise and interpretive judgment across legal, political, and medical domains.\n\nPackage Contents:\n- Core Scripts: reproduce_all_results.py (main reproduction script), generate_synthetic_data.py (data generator), validate_reproduction.py (result validation)\n- Data Files: Synthetic datasets matching paper statistics for SCOTUS legal reasoning (390 cases), Hyperpartisan political analysis (644 cases), and MTSamples medical classification (1,000 cases)\n- Expected Outputs: All LaTeX tables (Table 1-5), validation reports, and cross-domain statistical analyses\n\nKey Findings Reproduced:\n- Cross-domain signal effectiveness (Table 1): Perfect correlation reproduction across all domains (±0.005 accuracy)\n- Dual-signal weight optimization (Table 2): 6.6-113.7% improvements over single-signal baselines\n- Cross-domain transferability (Table 3): 88.9% success rate for weight transfer across domains\n- Intelligent triage efficiency (Table 5): 45.4% vs 44.6% effort reduction (0.8% difference)\n- Domain-specific patterns: Confidence signals are stronger in legal contexts, and entropy signals are more reliable in political/medical domains\n\nValidation Status: Successfully reproduces all core findings with statistical significance maintained across complex analytical tasks. Demonstrates automated quality assessment viability for scaling complex qualitative research beyond accessible coding tasks.\n\nUsage: Run ./run_complete_reproduction.sh for complete reproduction, or python3 reproduce_all_results.py for individual table generation. All dependencies included."},"citation:datasetContact":{"citation:datasetContactName":"zhao, zhilong","citation:datasetContactEmail":"yb87315@umac.mo"},"citation:depositor":"zhao, zhilong","dateOfDeposit":"2025-08-26","subject":["Computer and Information Science","Social Sciences"],"title":"Replication Data for: Automated Quality Assessment for LLM-Based Complex Qualitative Coding: A Confidence-Diversity Framework","@id":"https://doi.org/10.7910/DVN/GM8T8Q","@type":["ore:Aggregation","schema:Dataset"],"schema:version":"1.1","schema:name":"Replication Data for: Automated Quality Assessment for LLM-Based Complex Qualitative Coding: A Confidence-Diversity Framework","schema:dateModified":"Wed Aug 27 21:38:22 EDT 2025","schema:datePublished":"2025-08-26","schema:creativeWorkStatus":"RELEASED","schema:license":"http://creativecommons.org/publicdomain/zero/1.0","dvcore:fileTermsOfAccess":{"dvcore:fileRequestAccess":true},"schema:includedInDataCatalog":"Harvard Dataverse","schema:isPartOf":{"schema:name":"Harvard Dataverse","@id":"https://dataverse.harvard.edu/dataverse/harvard","schema:description":"<span><span><span><h3>Share, archive, and get credit for your data. Find and cite data across all research fields.</h3></span></span></span>"},"ore:aggregates":[{"schema:name":"reproduction_package.zip","dvcore:restricted":false,"schema:version":1,"dvcore:datasetVersionId":501743,"@id":"https://dataverse.harvard.edu/file.xhtml?fileId=12013838","schema:sameAs":"https://dataverse.harvard.edu/api/access/datafile/12013838","@type":"ore:AggregatedResource","schema:fileFormat":"application/zip","dvcore:filesize":124689,"dvcore:storageIdentifier":"s3://dvn-cloud:198e5504712-ccab87a56914","dvcore:rootDataFileId":-1,"dvcore:checksum":{"@type":"http://www.w3.org/2001/04/xmldsig-more#md5","@value":"cf2c64c662be72d7ea11152c7c4845ad"}}],"schema:hasPart":["https://dataverse.harvard.edu/file.xhtml?fileId=12013838"]},"@context":{"author":"http://purl.org/dc/terms/creator","citation":"https://dataverse.org/schema/citation/","content":"@value","dateOfDeposit":"http://purl.org/dc/terms/dateSubmitted","dcterms":"http://purl.org/dc/terms/","dvcore":"https://dataverse.org/schema/core#","lang":"@language","ore":"http://www.openarchives.org/ore/terms/","schema":"http://schema.org/","scheme":"http://www.w3.org/2004/02/skos/core#inScheme","subject":"http://purl.org/dc/terms/subject","termName":"https://schema.org/name","title":"http://purl.org/dc/terms/title"}}