<?xml version='1.0' encoding='UTF-8'?><metadata xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:dcterms="http://purl.org/dc/terms/" xmlns="http://dublincore.org/documents/dcmi-terms/"><dcterms:title>Vulnerability of LLMs in Educational Assessment</dcterms:title><dcterms:identifier>https://doi.org/10.7910/DVN/OV2WAM</dcterms:identifier><dcterms:creator>Milani, Alfredo</dcterms:creator><dcterms:publisher>Harvard Dataverse</dcterms:publisher><dcterms:issued>2025-09-12</dcterms:issued><dcterms:modified>2025-09-12T10:47:47Z</dcterms:modified><dcterms:description>The dataset contains the output of experiments from a research project on 
Vulnerability of LLMs in Educational Assessment.

The dataset contains:
- the student assignment data, in both normal form and injected form
- the output produced by the tested LLMs (ChatGPT, Gemini, DeepSeek, Grok, Perplexity, and Copilot) for the experiments evaluating the assignments, both as single documents and collectively as groups of documents. The experiments are denominated:

- User Legitimate LLM Prompts
- Normal (no injection), providing the reference base evaluation
- Prompt Injection Pass, one type of injection experiment, called Fail-To-Pass, which aims to move an assignment evaluated FAIL by the reference base evaluation to PASS, i.e. above 35% of total points.
- Prompt Injection to Top25, a type of injection experiment that aims to move an assignment with a low reference base evaluation into the top 25%. This latter type of experiment comes in three versions, Fail-To-Top, Sat-To-Top, and Good-To-Top, where assignments with a reference base evaluation of, respectively, Fail (below 35%), Satisfactory (above 35% and below 50%), or Good (above 50% and below 75%) are considered for injection.

The names of the folders and output result files are accordingly self-explanatory.</dcterms:description><dcterms:subject>Computer and Information Science</dcterms:subject><dcterms:subject>Social Sciences</dcterms:subject><dcterms:subject>Large Language Models</dcterms:subject><dcterms:subject>Prompt Injection</dcterms:subject><dcterms:subject>Education Sciences</dcterms:subject><dcterms:subject>Education Evaluation</dcterms:subject><dcterms:subject>Trustworthy AI</dcterms:subject><dcterms:subject>Human-in-the-Loop AI</dcterms:subject><dcterms:IsSupplementTo>"When AI is Fooled: Hidden Risks in LLM-assisted Grading"
Authors:
Alfredo Milani, Valentina Franzoni, Emanuele Florindi, Assel Omarbekova, Gulmira
Bekmanova, Banu Yergesh
in
Education Sciences, ISSN 2227-7102</dcterms:IsSupplementTo><dcterms:date>2025-09-12</dcterms:date><dcterms:contributor>Milani, Alfredo</dcterms:contributor><dcterms:dateSubmitted>2025-09-12</dcterms:dateSubmitted><dcterms:license>CC0 1.0</dcterms:license></metadata>