Published: 14 March 2026. The English Chronicle Desk. The English Chronicle Online
Confidential health data from UK Biobank, one of the country’s largest medical research projects, has been exposed online on multiple occasions, according to a Guardian investigation. UK Biobank, which holds the medical records of 500,000 volunteers, has been instrumental in research on cancer, dementia, and diabetes.
Researchers with access to the data sometimes inadvertently posted it online, including on GitHub, while sharing analysis code for academic publications. Although the datasets do not contain names or addresses, they include sensitive information such as hospital diagnoses, dates of medical procedures, and sex, which experts warn could enable re-identification of participants.
One dataset discovered by the Guardian contained hospital diagnoses for more than 400,000 participants. Using details from a volunteer, including month and year of birth and a major surgery, the Guardian was able to locate that individual’s records. A data expert described the exposure as “a gross invasion of privacy.”
UK Biobank maintains that no participant has been re-identified and that identifying details such as names and addresses were never shared. Prof Sir Rory Collins, chief executive of UK Biobank, said: “We have never seen any evidence of any UK Biobank participant being re-identified by others.”
Founded in 2003, UK Biobank collects genomic data, scans, blood samples, and lifestyle information from volunteers. Until late 2024, researchers could download data directly to their own systems, which contributed to accidental leaks. UK Biobank now controls access more tightly and provides training to approved users.
Between July and December 2025, UK Biobank issued 80 legal notices to GitHub to remove inadvertently posted datasets, resulting in approximately 500 repositories being taken down. Nevertheless, some data remains accessible on archive websites.
Experts warn that anonymisation is not foolproof. Dr Luc Rocher of the Oxford Internet Institute said that knowing details like a birthday and surgery date can be sufficient to identify participants and reveal sensitive information such as psychiatric diagnoses or HIV status. Prof Niels Peek of the University of Cambridge called the scale of the leaks “shocking” and highlighted the tension between enabling large-scale health research and protecting individual privacy.
While some volunteers expressed concern about the privacy implications, they also emphasised the importance of UK Biobank’s research. The organisation continues to take measures to safeguard data and ensure researchers comply with strict privacy standards, but questions remain about whether full control over inadvertently released datasets can be regained.
The incident underscores the challenges of balancing the ambitions of medical research with the legal and ethical imperative to protect participant privacy in an age of AI, social media, and data sharing.