<?xml version='1.0' encoding='UTF-8'?><metadata xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:dcterms="http://purl.org/dc/terms/" xmlns="http://dublincore.org/documents/dcmi-terms/"><dcterms:title>FitzDerm-CF: A Controlled Multimodal Dermatology Resource for Skin-Tone Counterfactual and Leakage Evaluation</dcterms:title><dcterms:identifier>https://doi.org/10.7910/DVN/OZKS1L</dcterms:identifier><dcterms:creator>Jangid, Shivam</dcterms:creator><dcterms:publisher>Harvard Dataverse</dcterms:publisher><dcterms:issued>2026-06-05</dcterms:issued><dcterms:modified>2026-06-05T11:07:10Z</dcterms:modified><dcterms:description>Multimodal dermatology models increasingly rely on paired visual and textual information, yet existing resources provide limited control over how clinical text encodes diagnostic evidence, protected attributes, and spurious correlations. We introduce FitzDerm-CF, a controlled multimodal dermatology resource that augments images from Fitzpatrick17k and DDI with synthetic clinical notes designed to support systematic evaluation of semantic capacity, diagnostic leakage, and skin-tone counterfactual robustness. For each image, we generate dermatology-style textual companions conditioned on structured metadata, including Fitzpatrick skin type, fine-grained disease label, and malignancy status, while enforcing strict constraints that prevent explicit disease naming, patient-specific demographic attributes, management recommendations, and diagnostic certainty language. To enable controlled studies of text informativeness, notes are generated across multiple temperature settings, producing variants with different levels of morphological specificity. We further construct a counterfactual split by perturbing only the Fitzpatrick skin-type attribute and rewriting skin-tone-dependent appearance descriptors while preserving all other clinical content. Candidate counterfactuals are filtered through a four-stage validation pipeline that rejects unchanged rewrites, enforces bounded semantic similarity, checks causal and epidemiological invariance, and verifies skin-tone-specific lexical intervention. The resulting resource supports benchmark tasks for cross-modal retrieval, diagnostic leakage probing, multimodal classification, and counterfactual consistency evaluation. By separating disease-relevant morphology from label leakage and protected-attribute confounding, FitzDerm-CF provides a reusable testbed for developing and auditing multimodal dermatology systems under controlled distributional and counterfactual conditions.</dcterms:description><dcterms:subject>Computer and Information Science</dcterms:subject><dcterms:subject>Medicine, Health and Life Sciences</dcterms:subject><dcterms:date>2026-06-05</dcterms:date><dcterms:contributor>Jangid, Shivam</dcterms:contributor><dcterms:dateSubmitted>2026-06-04</dcterms:dateSubmitted><dcterms:license>CC0 1.0</dcterms:license></metadata>