{"id":12068842,"identifier":"DVN/OV2WAM","persistentUrl":"https://doi.org/10.7910/DVN/OV2WAM","protocol":"doi","authority":"10.7910","separator":"/","publisher":"Harvard Dataverse","publicationDate":"2025-09-12","storageIdentifier":"s3://10.7910/DVN/OV2WAM","datasetType":"dataset","datasetVersion":{"id":503127,"datasetId":12068842,"datasetPersistentId":"doi:10.7910/DVN/OV2WAM","datasetType":"dataset","storageIdentifier":"s3://10.7910/DVN/OV2WAM","versionNumber":1,"internalVersionNumber":5,"versionMinorNumber":0,"versionState":"RELEASED","latestVersionPublishingState":"RELEASED","lastUpdateTime":"2025-09-12T10:47:47Z","releaseTime":"2025-09-12T10:47:47Z","createTime":"2025-09-12T10:46:52Z","publicationDate":"2025-09-12","citationDate":"2025-09-12","license":{"name":"CC0 1.0","uri":"http://creativecommons.org/publicdomain/zero/1.0","iconUri":"https://licensebuttons.net/p/zero/1.0/88x31.png","rightsIdentifier":"CC0-1.0","rightsIdentifierScheme":"SPDX","schemeUri":"https://spdx.org/licenses/","languageCode":"en"},"fileAccessRequest":true,"metadataBlocks":{"citation":{"displayName":"Citation Metadata","name":"citation","fields":[{"typeName":"title","multiple":false,"typeClass":"primitive","value":"Vulnerability of LLMs in Educational Assessment"},{"typeName":"author","multiple":true,"typeClass":"compound","value":[{"authorName":{"typeName":"authorName","multiple":false,"typeClass":"primitive","value":"Milani, Alfredo"},"authorAffiliation":{"typeName":"authorAffiliation","multiple":false,"typeClass":"primitive","value":"https://ror.org/035mh1293","expandedvalue":{"scheme":"http://www.grid.ac/ontology/","termName":"Link Campus 
University","@type":"https://schema.org/Organization"}},"authorIdentifierScheme":{"typeName":"authorIdentifierScheme","multiple":false,"typeClass":"controlledVocabulary","value":"ORCID"},"authorIdentifier":{"typeName":"authorIdentifier","multiple":false,"typeClass":"primitive","value":"https://orcid.org/0000-0003-4534-1805","expandedvalue":{"personName":"MILANI, Alfredo","@id":"https://orcid.org/0000-0003-4534-1805","scheme":"ORCID","@type":"https://schema.org/Person"}}}]},{"typeName":"datasetContact","multiple":true,"typeClass":"compound","value":[{"datasetContactName":{"typeName":"datasetContactName","multiple":false,"typeClass":"primitive","value":"Milani, Alfredo"},"datasetContactAffiliation":{"typeName":"datasetContactAffiliation","multiple":false,"typeClass":"primitive","value":"Link Campus University, Rome, Italy"},"datasetContactEmail":{"typeName":"datasetContactEmail","multiple":false,"typeClass":"primitive","value":"a.milani@unilink.it"}},{"datasetContactName":{"typeName":"datasetContactName","multiple":false,"typeClass":"primitive","value":"Valentina Franzoni"},"datasetContactAffiliation":{"typeName":"datasetContactAffiliation","multiple":false,"typeClass":"primitive","value":"University of Perugia, Italy"},"datasetContactEmail":{"typeName":"datasetContactEmail","multiple":false,"typeClass":"primitive","value":"valentina.franzoni@unipg.it"}},{"datasetContactName":{"typeName":"datasetContactName","multiple":false,"typeClass":"primitive","value":"Florindi Emanuele"},"datasetContactAffiliation":{"typeName":"datasetContactAffiliation","multiple":false,"typeClass":"primitive","value":"University of Modena-Reggio Emilia"},"datasetContactEmail":{"typeName":"datasetContactEmail","multiple":false,"typeClass":"primitive","value":"emanuele.florindi@unimore.it"}}]},{"typeName":"dsDescription","multiple":true,"typeClass":"compound","value":[{"dsDescriptionValue":{"typeName":"dsDescriptionValue","multiple":false,"typeClass":"primitive","value":"The dataset contains 
the output of experiments on a research project on \nVulnerability of LLMs in Educational Assessment.\n\nThe Dataset contains:\n-the students' assignment data in normal form and the injected form\n-the output produced by the experimented LLMs: ChatGPT, Gemini, DeepSeek, Grok, Perplexity and Copilot for the experiments evaluating the assignments, as a single document and collectively as a group of documents, denominated:\n \n-User Legitimate LLMs Prompts\n-Normal (no injection) providing the reference base evaluation\n -Prompt Injection Pass, one type of injection experiment, called Fail-To-Top, to move an assignment evaluated FAIL by reference base evaluation to PASS, i.e. above 35% of total points.\n -Prompt Injection to Top25, a type of injection experiment to move to the top 25% an assignment with a lower reference base evaluation. This latter type of experiment comes in 3 versions, Fail-To-Top, Sat-To-Top, Good-To-Top, where assignments with reference base evaluation respectively: Fail (below 35%), Satisfactory (greater than 25% and below 50%) and Good (above 50% and below 75%) are considered for injection.\n\nThe names of the folders and output result files are accordingly self-explanatory."},"dsDescriptionDate":{"typeName":"dsDescriptionDate","multiple":false,"typeClass":"primitive","value":"2025-08-31"}}]},{"typeName":"subject","multiple":true,"typeClass":"controlledVocabulary","value":["Computer and Information Science","Social Sciences"]},{"typeName":"keyword","multiple":true,"typeClass":"compound","value":[{"keywordValue":{"typeName":"keywordValue","multiple":false,"typeClass":"primitive","value":"Large Language Models"}},{"keywordValue":{"typeName":"keywordValue","multiple":false,"typeClass":"primitive","value":"Prompt Injection"},"keywordVocabulary":{"typeName":"keywordVocabulary","multiple":false,"typeClass":"primitive","value":"Generative AI"}},{"keywordValue":{"typeName":"keywordValue","multiple":false,"typeClass":"primitive","value":"Education 
Sciences"}},{"keywordValue":{"typeName":"keywordValue","multiple":false,"typeClass":"primitive","value":"Education Evaluation"}},{"keywordValue":{"typeName":"keywordValue","multiple":false,"typeClass":"primitive","value":"Trustworthy AI"}},{"keywordValue":{"typeName":"keywordValue","multiple":false,"typeClass":"primitive","value":"Human-in-the-Loop AI"}}]},{"typeName":"publication","multiple":true,"typeClass":"compound","value":[{"publicationRelationType":{"typeName":"publicationRelationType","multiple":false,"typeClass":"controlledVocabulary","value":"IsSupplementTo"},"publicationCitation":{"typeName":"publicationCitation","multiple":false,"typeClass":"primitive","value":"\"When AI is Fooled: Hidden Risks in LLM-assisted Grading\"\nAuthors:\nAlfredo Milani, Valentina Franzoni, Emanuele Florindi, Assel Omarbekova, Gulmira\nBekmanova, Banu Yergesh\nin\nEducation Sciences, ISSN 2227-7102"},"publicationIDType":{"typeName":"publicationIDType","multiple":false,"typeClass":"controlledVocabulary","value":"issn"},"publicationIDNumber":{"typeName":"publicationIDNumber","multiple":false,"typeClass":"primitive","value":"2227-7102"}}]},{"typeName":"depositor","multiple":false,"typeClass":"primitive","value":"Milani, Alfredo"},{"typeName":"dateOfDeposit","multiple":false,"typeClass":"primitive","value":"2025-09-12"}]}},"files":[{"description":"The dataset contains the output of experiments on a research project on \nVulnerability of LLMs in Educational Assessment.\n\nThe Dataset contains:\n-the students' assignment data in normal form and the injected form\n-the output produced by the experimented LLMs: ChatGPT, Gemini, DeepSeek, Grok, Perplexity and Copilot for the experiments evaluating the assignments, as a single document and collectively as a group of documents, denominated:\n  \n-Normal (no injection) providing the reference base evaluation\n -Prompt Injection Pass, one type of injection experiment, called Fail-To-Top, to move an assignment evaluated FAIL by reference 
base evaluation to PASS, i.e. above 35% of total points.\n -Prompt Injection to Top25, a type of injection experiment to move to the top 25% an assignment with a lower reference base evaluation. This latter type of experiment comes in 3 versions, Fail-To-Top, Sat-To-Top, Good-To-Top, where assignments with reference base evaluation respectively: Fail (below 35%), Satisfactory (greater than 25% and below 50%) and Good (above 50% and below 75%) are considered for injection.\n\nThe names of the folders and output result files are accordingly self-explanatory.","label":"Normal_and_Injected_Assignment_Experiments.zip","restricted":false,"version":1,"datasetVersionId":503127,"dataFile":{"id":12068843,"persistentId":"","filename":"Normal_and_Injected_Assignment_Experiments.zip","contentType":"application/zip","friendlyType":"ZIP Archive","filesize":4804924,"description":"The dataset contains the output of experiments on a research project on \nVulnerability of LLMs in Educational Assessment.\n\nThe Dataset contains:\n-the students' assignment data in normal form and the injected form\n-the output produced by the experimented LLMs: ChatGPT, Gemini, DeepSeek, Grok, Perplexity and Copilot for the experiments evaluating the assignments, as a single document and collectively as a group of documents, denominated:\n  \n-Normal (no injection) providing the reference base evaluation\n -Prompt Injection Pass, one type of injection experiment, called Fail-To-Top, to move an assignment evaluated FAIL by reference base evaluation to PASS, i.e. above 35% of total points.\n -Prompt Injection to Top25, a type of injection experiment to move to the top 25% an assignment with a lower reference base evaluation. 
This latter type of experiment comes in 3 versions, Fail-To-Top, Sat-To-Top, Good-To-Top, where assignments with reference base evaluation respectively: Fail (below 35%), Satisfactory (greater than 25% and below 50%) and Good (above 50% and below 75%) are considered for injection.\n\nThe names of the folders and output result files are accordingly self-explanatory.","storageIdentifier":"s3://dvn-cloud:1993d867500-4f795b148d7e","rootDataFileId":-1,"md5":"d6580deb1f5fd647a0b3f3ccbb31fbda","checksum":{"type":"MD5","value":"d6580deb1f5fd647a0b3f3ccbb31fbda"},"tabularData":false,"creationDate":"2025-09-12","publicationDate":"2025-09-12","fileAccessRequest":true}}],"citation":"Milani, Alfredo, 2025, \"Vulnerability of LLMs in Educational Assessment\", https://doi.org/10.7910/DVN/OV2WAM, Harvard Dataverse, V1"}}