BD-KDD: A Clinical Dataset of Kidney Disease Diagnosis (doi:10.7910/DVN/MB1LES)


Document Description

Citation

Title:

BD-KDD: A Clinical Dataset of Kidney Disease Diagnosis

Identification Number:

doi:10.7910/DVN/MB1LES

Distributor:

Harvard Dataverse

Date of Distribution:

2026-03-13

Version:

1

Bibliographic Citation:

Islam, Md. Masudul, 2026, "BD-KDD: A Clinical Dataset of Kidney Disease Diagnosis", https://doi.org/10.7910/DVN/MB1LES, Harvard Dataverse, V1, UNF:6:oNpkl6IW60/B7imqNEI2eQ== [fileUNF]

Study Description

Citation

Title:

BD-KDD: A Clinical Dataset of Kidney Disease Diagnosis

Identification Number:

doi:10.7910/DVN/MB1LES

Authoring Entity:

Islam, Md. Masudul (Bangladesh University of Business and Technology)

Producer:

Dr. Muhammad Towfiqur Rahman

Distributor:

Harvard Dataverse

Access Authority:

Islam, Md. Masudul

Depositor:

Islam, Md. Masudul

Date of Deposit:

2026-03-10

Holdings Information:

https://doi.org/10.7910/DVN/MB1LES

Study Scope

Keywords:

Computer and Information Science, Medicine, Health and Life Sciences

Abstract:

This data article introduces BD-KDD (Clinical Dataset for Kidney Disease Diagnosis and Healthy Classification), a clinical and laboratory dataset containing 988 patient records. It was developed to address the limited sample size of commonly used renal research benchmarks and provides more than twice as many instances as the standard UCI Chronic Kidney Disease dataset. The dataset includes 26 variables, combining demographic information (Age), physical examination data (Blood Pressure), and 24 laboratory biomarkers, such as Serum Creatinine (Sc), Blood Urea (Bu), Hemoglobin (Hemo), and Specific Gravity (Sg). Each record is annotated with a binary target variable (Class) that categorizes patients as either Healthy (n = 481) or Kidney Disease (n = 507), supporting the development and evaluation of diagnostic machine learning models.

Notes:

This study involves retrospective clinical data obtained from diagnostic laboratory records. The dataset was collected with institutional authorization for academic research purposes. All patient records were fully anonymized prior to dataset preparation, and no personally identifiable information was included in the dataset. The research procedures followed the ethical principles outlined in the Declaration of Helsinki for research involving human subjects. Informed consent for clinical testing was obtained from patients by the diagnostic center as part of routine medical procedures. The dataset used in this study contains only anonymized laboratory and clinical measurements and therefore does not allow identification of individual participants.

Methodology and Processing

Sources Statement

Data Access

Notes:

CC0 1.0 (http://creativecommons.org/publicdomain/zero/1.0)

Other Study Description Materials

File Description--f13595783

File: BD-KDD Dataset.tab

  • Number of cases: 988

  • No. of variables per record: 26

  • Type of File: text/tab-separated-values

Notes:

UNF:6:oNpkl6IW60/B7imqNEI2eQ==
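The file description above (tab-separated text, 988 cases, 26 variables per record) can be checked after download with a short script. The following is a minimal sketch using only the Python standard library. It assumes the header row of the file uses exactly the variable names listed in this codebook, which has not been verified against the file itself; `load_records` and `check_shape` are hypothetical helper names, and the in-memory sample is a placeholder, not real data.

```python
import csv
import io

# Column names as listed in the codebook's variable description
# (assumed, not verified, to match the header row of the file).
EXPECTED_COLUMNS = [
    "Sl. No.", "Age", "Bp", "Sg", "Al", "Su", "Rbc", "Pc", "Pcc", "Ba",
    "Bgr", "Bu", "Sc", "Sod", "Pot", "Hemo", "Pcv", "Wbcc", "Rbcc",
    "Htn", "Dm", "Cad", "Appet", "Pe", "Ane", "Class",
]


def load_records(fileobj):
    """Read tab-separated text; return (header, rows as dicts of floats)."""
    reader = csv.DictReader(fileobj, delimiter="\t")
    header = list(reader.fieldnames or [])
    rows = [{name: float(value) for name, value in row.items()} for row in reader]
    return header, rows


def check_shape(header, rows, expected_cases=None):
    """Raise ValueError if columns or case count differ from the codebook."""
    if header != EXPECTED_COLUMNS:
        raise ValueError(f"unexpected columns: {header}")
    if expected_cases is not None and len(rows) != expected_cases:
        raise ValueError(f"expected {expected_cases} cases, found {len(rows)}")


# Placeholder input (NOT real data): one all-zero row to exercise the code.
sample = "\t".join(EXPECTED_COLUMNS) + "\n" + "\t".join(["0"] * 26) + "\n"
header, rows = load_records(io.StringIO(sample))
check_shape(header, rows, expected_cases=1)
print(len(header), len(rows))
```

With the downloaded file, the same functions would be called on `open("BD-KDD Dataset.tab", newline="", encoding="utf-8")` with `expected_cases=988`; the encoding is also an assumption.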

Variable Description

List of Variables:

Variables

Sl. No.

f13595783 Location:

Summary Statistics: StDev 285.3553340427802; Min. 1.0; Mean 494.5; Valid 988.0; Max. 988.0

Variable Format: numeric

Notes: UNF:6:f6WRZqxHetOvua75CN4Dkw==

Age

f13595783 Location:

Summary Statistics: Mean 50.27024291497976; Max. 79.0; Min. 20.0; StDev 17.336027140697464; Valid 988.0

Variable Format: numeric

Notes: UNF:6:MoQNiTtjMVO7Zd/z9ui/aw==

Bp

f13595783 Location:

Summary Statistics: StDev 32.192748614972; Max. 179.0; Valid 988.0; Mean 123.75607287449391; Min. 70.0

Variable Format: numeric

Notes: UNF:6:QWOngz92giG5kX3lqA5Sww==

Sg

f13595783 Location:

Summary Statistics: Valid 988.0; StDev 0.007141043869453485; Mean 1.0152631578947369; Max. 1.025; Min. 1.005

Variable Format: numeric

Notes: UNF:6:yLyFTpVMfRgoamHzjRUHPg==

Al

f13595783 Location:

Summary Statistics: Max. 4.0; Mean 2.0303643724696356; Valid 988.0; Min. 0.0; StDev 1.4188944527020813

Variable Format: numeric

Notes: UNF:6:/X02X+NS7ffxTgukRSV5fg==

Su

f13595783 Location:

Summary Statistics: Mean 2.0; Valid 988.0; Max. 4.0; Min. 0.0; StDev 1.4199333574555515

Variable Format: numeric

Notes: UNF:6:huTGwEu3HjWrCqqFRsBQsw==

Rbc

f13595783 Location:

Summary Statistics: Min. 0.0; StDev 0.5000225599839361; Mean 0.5151821862348179; Max. 1.0; Valid 988.0

Variable Format: numeric

Notes: UNF:6:Dk95OE9GEwRaj1DtinzIEQ==

Pc

f13595783 Location:

Summary Statistics: Valid 988.0; Mean 0.5111336032388665; Min. 0.0; Max. 1.0; StDev 0.5001291934046781

Variable Format: numeric

Notes: UNF:6:icp3FNZuZdhxv8IRF6/I2g==

Pcc

f13595783 Location:

Summary Statistics: Max. 1.0; StDev 0.49975690286679014; Min. 0.0; Mean 0.4777327935222671; Valid 988.0

Variable Format: numeric

Notes: UNF:6:nSC5pyC9w/2c6WjbwnArJw==

Ba

f13595783 Location:

Summary Statistics: Mean 0.5242914979757078; StDev 0.4996625041802432; Valid 988.0; Max. 1.0; Min. 0.0

Variable Format: numeric

Notes: UNF:6:0A9Mhej/tgcQg3Hd+I7/Bg==

Bgr

f13595783 Location:

Summary Statistics: Min. 70.0; Mean 232.09412955465586; Max. 399.0; StDev 97.01377212392937; Valid 988.0

Variable Format: numeric

Notes: UNF:6:pkbYWuBqJHNqTj9FC/PpAA==

Bu

f13595783 Location:

Summary Statistics: Mean 104.24291497975706; Max. 199.0; StDev 52.95705377457415; Min. 10.0; Valid 988.0

Variable Format: numeric

Notes: UNF:6:xlI67HlBdwm7OC6AX0JQDA==

Sc

f13595783 Location:

Summary Statistics: Min. 0.5; Max. 14.98; Mean 7.535506072874494; Valid 988.0; StDev 4.1420512916547345

Variable Format: numeric

Notes: UNF:6:q6Mr/U+LuQYt6y7AaAbMRQ==

Sod

f13595783 Location:

Summary Statistics: Min. 130.0; Max. 149.0; Valid 988.0; Mean 139.72064777327935; StDev 5.717328109765646

Variable Format: numeric

Notes: UNF:6:eZagqkUk2dWreMynQbJRaQ==

Pot

f13595783 Location:

Summary Statistics: Max. 6.5; Min. 3.51; Mean 4.969838056680162; Valid 988.0; StDev 0.8530505329949845

Variable Format: numeric

Notes: UNF:6:h5tGfUbSpO8wcy6vP22bXA==

Hemo

f13595783 Location:

Summary Statistics: Max. 17.0; StDev 2.8576249580444304; Valid 988.0; Mean 12.078238866396761; Min. 7.0

Variable Format: numeric

Notes: UNF:6:Nvg9ywVboJJsbfAu2LFlbQ==

Pcv

f13595783 Location:

Summary Statistics: Max. 54.0; Valid 988.0; Min. 20.0; Mean 36.72165991902834; StDev 9.878720310463985

Variable Format: numeric

Notes: UNF:6:qvIibO65+V43GXfDlDj79Q==

Wbcc

f13595783 Location:

Summary Statistics: Max. 14987.0; StDev 3156.5699941743806; Valid 988.0; Mean 9579.932186234819; Min. 4012.0

Variable Format: numeric

Notes: UNF:6:NEB1rIrLoXf6eWQklt7M5g==

Rbcc

f13595783 Location:

Summary Statistics: StDev 0.7343805391637633; Min. 3.5; Max. 6.0; Mean 4.734210526315789; Valid 988.0

Variable Format: numeric

Notes: UNF:6:A8BFqrn4kvtaNdLJ2sMcAw==

Htn

f13595783 Location:

Summary Statistics: StDev 0.4993904951959949; Min. 0.0; Valid 988.0; Mean 0.5293522267206477; Max. 1.0

Variable Format: numeric

Notes: UNF:6:fq3ELreWsbU2VBcgpN3KQA==

Dm

f13595783 Location:

Summary Statistics: Min. 0.0; Mean 0.49089068825910925; Valid 988.0; Max. 1.0; StDev 0.5001702002054779

Variable Format: numeric

Notes: UNF:6:2NNYCjMOgtEioO8gpbPcgA==

Cad

f13595783 Location:

Summary Statistics: Valid 988.0; StDev 0.5002522037228156; Min. 0.0; Max. 1.0; Mean 0.49898785425101183

Variable Format: numeric

Notes: UNF:6:XGi33W+gmZnEpWQUzErrtA==

Appet

f13595783 Location:

Summary Statistics: Mean 0.5040485829959515; StDev 0.5002368290872782; Max. 1.0; Min. 0.0; Valid 988.0

Variable Format: numeric

Notes: UNF:6:E1qU7h9atjtHiYYiCbgR4w==

Pe

f13595783 Location:

Summary Statistics: Valid 988.0; StDev 0.500052296589483; Mean 0.48582995951416996; Min. 0.0; Max. 1.0

Variable Format: numeric

Notes: UNF:6:+6GSrb31a87eam5zSZPJ6w==

Ane

f13595783 Location:

Summary Statistics: Max. 1.0; StDev 0.4999210320389626; Min. 0.0; Mean 0.5182186234817814; Valid 988.0

Variable Format: numeric

Notes: UNF:6:83G8q1i5AOXoSNAUQggHDg==

Class

f13595783 Location:

Summary Statistics: Min. 0.0; Max. 1.0; StDev 0.5000799808051165; Mean 0.5131578947368419; Valid 988.0

Variable Format: numeric

Notes: UNF:6:pa+Xy7Uw0K2AwUWhXiX2EQ==
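The summary statistics listed for the variables can be cross-checked against the counts in the abstract with simple arithmetic. The sketch below uses only values printed in this codebook. It rests on two assumptions: that the reported StDev is the sample standard deviation (n - 1 denominator), and that Sl. No. is the running index 1..988. Which class code stands for Kidney Disease is not stated in the codebook; the match between 507 records coded 1 and "Kidney Disease (n = 507)" suggests 1 = Kidney Disease, but that mapping should be confirmed in BD-KDD Dictionary.md.

```python
import math

N = 988  # number of valid cases reported for every variable

# Class is coded 0/1, so mean * N is the number of records coded 1.
class_mean = 0.5131578947368419
ones = round(class_mean * N)
zeros = N - ones
print(ones, zeros)  # 507 481

# Sample standard deviation (n - 1 denominator) of a 0/1 variable.
p = ones / N
class_sd = math.sqrt(p * (1 - p) * N / (N - 1))
print(round(class_sd, 6))  # 0.50008

# Sl. No. taken as the integers 1..N:
# mean (N + 1) / 2, sample variance N * (N + 1) / 12.
sl_mean = (N + 1) / 2
sl_sd = math.sqrt(N * (N + 1) / 12)
print(sl_mean, round(sl_sd, 4))  # 494.5 285.3553
```

Both results agree with the reported figures (Class StDev 0.5000799808051165; Sl. No. Mean 494.5 and StDev 285.3553340427802), which supports the sample-standard-deviation reading.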

Other Study-Related Materials

Label:

Administrative Permission Letter.pdf

Text:

This document is a formal Administrative Approval and Data Collection Permission Letter issued by Popular Diagnostic Centre Ltd. It provides the necessary ethical and institutional authorization to utilize 988 clinical patient records for the development and publication of the BD-KDD dataset.

Notes:

application/pdf

Other Study-Related Materials

Label:

BD-KDD Dictionary.md

Notes:

text/markdown