<codeBook xmlns="ddi:codebook:2_5" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="ddi:codebook:2_5 https://ddialliance.org/Specification/DDI-Codebook/2.5/XMLSchema/codebook.xsd" version="2.5"><docDscr><citation><titlStmt><titl>Vulnerability of LLMs in Educational Assessment</titl><IDNo agency="DOI">doi:10.7910/DVN/OV2WAM</IDNo></titlStmt><distStmt><distrbtr source="archive">Harvard Dataverse</distrbtr><distDate>2025-09-12</distDate></distStmt><verStmt source="archive"><version date="2025-09-12" type="RELEASED">1</version></verStmt><biblCit>Milani, Alfredo, 2025, "Vulnerability of LLMs in Educational Assessment", https://doi.org/10.7910/DVN/OV2WAM, Harvard Dataverse, V1</biblCit></citation></docDscr><stdyDscr><citation><titlStmt><titl>Vulnerability of LLMs in Educational Assessment</titl><IDNo agency="DOI">doi:10.7910/DVN/OV2WAM</IDNo></titlStmt><rspStmt><AuthEnty affiliation="https://ror.org/035mh1293">Milani, Alfredo</AuthEnty></rspStmt><prodStmt/><distStmt><distrbtr source="archive">Harvard Dataverse</distrbtr><contact affiliation="Link Campus University, Rome, Italy" email="a.milani@unilink.it">Milani, Alfredo</contact><contact affiliation="University of Perugia, Italy" email="valentina.franzoni@unipg.it">Valentina Franzoni</contact><contact affiliation="University of Modena-Reggio Emilia" email="emanuele.florindi@unimore.it">Florindi Emanuele</contact><depositr>Milani, Alfredo</depositr><depDate>2025-09-12</depDate></distStmt><holdings URI="https://doi.org/10.7910/DVN/OV2WAM"/></citation><stdyInfo><subject><keyword xml:lang="en">Computer and Information Science</keyword><keyword xml:lang="en">Social Sciences</keyword><keyword>Large Language Models</keyword><keyword vocab="Generative AI">Prompt Injection</keyword><keyword>Education Sciences</keyword><keyword>Education Evaluation</keyword><keyword>Trustworthy AI</keyword><keyword>Human-in-the-Loop AI</keyword></subject><abstract date="2025-08-31">The dataset contains the output of 
experiments from a research project on the
Vulnerability of LLMs in Educational Assessment.

The Dataset contains:
- the student assignments, in both normal (uninjected) and injected form
- the output produced by the tested LLMs (ChatGPT, Gemini, DeepSeek, Grok, Perplexity and Copilot) for the experiments evaluating the assignments, both individually as single documents and collectively as groups of documents. The experiment materials are denominated:
 
- User Legitimate LLM Prompts
- Normal (no injection), providing the reference base evaluation
- Prompt Injection Pass, a type of injection experiment, called Fail-To-Pass, which aims to move an assignment evaluated FAIL by the reference base evaluation to PASS, i.e. above 35% of the total points.
- Prompt Injection to Top25, a type of injection experiment which aims to move an assignment with a low reference base evaluation into the top 25%. This latter type of experiment comes in 3 versions, Fail-To-Top, Sat-To-Top and Good-To-Top, in which assignments whose reference base evaluation is, respectively, Fail (below 35%), Satisfactory (greater than 25% and below 50%) or Good (above 50% and below 75%) are considered for injection.

The names of the folders and output result files are accordingly self-explanatory.</abstract><sumDscr/></stdyInfo><method><dataColl><sources/></dataColl><anlyInfo/></method><dataAccs><setAvail/><useStmt/><notes type="DVN:TOU" level="dv">&lt;a href="http://creativecommons.org/publicdomain/zero/1.0">CC0 1.0&lt;/a></notes></dataAccs><othrStdyMat><relPubl><citation><titlStmt><titl>When AI is Fooled: Hidden Risks in LLM-assisted Grading</titl><IDNo agency="issn">2227-7102</IDNo></titlStmt><biblCit>Milani, Alfredo; Franzoni, Valentina; Florindi, Emanuele; Omarbekova, Assel; Bekmanova, Gulmira; Yergesh, Banu. "When AI is Fooled: Hidden Risks in LLM-assisted Grading". Education Sciences, ISSN 2227-7102.</biblCit></citation></relPubl></othrStdyMat></stdyDscr><otherMat ID="f12068843" URI="https://dataverse.harvard.edu/api/access/datafile/12068843" level="datafile"><labl>Normal_and_Injected_Assignment_Experiments.zip</labl><txt>The dataset contains the output of experiments from a research project on the 
Vulnerability of LLMs in Educational Assessment.

The Dataset contains:
- the student assignments, in both normal (uninjected) and injected form
- the output produced by the tested LLMs (ChatGPT, Gemini, DeepSeek, Grok, Perplexity and Copilot) for the experiments evaluating the assignments, both individually as single documents and collectively as groups of documents. The experiment materials are denominated:
  
- Normal (no injection), providing the reference base evaluation
- Prompt Injection Pass, a type of injection experiment, called Fail-To-Pass, which aims to move an assignment evaluated FAIL by the reference base evaluation to PASS, i.e. above 35% of the total points.
- Prompt Injection to Top25, a type of injection experiment which aims to move an assignment with a low reference base evaluation into the top 25%. This latter type of experiment comes in 3 versions, Fail-To-Top, Sat-To-Top and Good-To-Top, in which assignments whose reference base evaluation is, respectively, Fail (below 35%), Satisfactory (greater than 25% and below 50%) or Good (above 50% and below 75%) are considered for injection.

The names of the folders and output result files are accordingly self-explanatory.</txt><notes level="file" type="DATAVERSE:CONTENTTYPE" subject="Content/MIME Type">application/zip</notes></otherMat></codeBook>