UK Biobank, which holds the medical records, genome sequences, scans, blood samples, and lifestyle information of 500,000 British volunteers, has suffered two data exposure incidents, according to multiple reports. The first involved researchers accidentally publishing partial or entire Biobank datasets on GitHub when they intended to upload code. A Guardian investigation found that one dataset contained millions of hospital diagnoses and dates for over 400,000 participants. Between July and December 2025, UK Biobank issued 80 legal notices to GitHub to remove data, but much of the leaked data remains available online. Until late 2024, researchers were free to download data directly onto their own computer systems. Data had already been inadvertently published online before that policy changed, and Biobank is still grappling with the problem.
The exposed files do not include names or addresses but may still pose privacy concerns. The data involved could include gender, age, month and year of birth, socioeconomic status, lifestyle habits, and measures from biological samples. With a volunteer's consent, the Guardian was able to identify extensive hospital diagnosis records for that volunteer using only month/year of birth and details of a major surgery. Technology minister Ian Murray said he could not give a complete guarantee that nobody could be identified, but re-identification would likely require a 'very advanced way'. A data expert described the scale and persistence of the problem as 'shocking'.
UK Biobank rejected the concerns, stating that no identifying data such as names and addresses were provided to researchers. Chief Executive Sir Rory Collins said the organization has never seen evidence of any participant being re-identified. UK Biobank prohibits researchers from sharing data outside their systems and has introduced further training. The organization temporarily closed access to the research platform. Sir Rory Collins apologized to participants and said additional security measures will be put in place.
The UK Biobank charity informed the government that its data had been advertised for sale by several sellers on Alibaba e-commerce platforms in China. Biobank reported that three listings that appeared to sell... Biobank participation data had been identified. At least one of these three datasets appears to contain data from all 500,000 UK Biobank volunteers.
In a separate incident, details of 500,000 UK Biobank members were offered for sale online in China on Alibaba. Technology minister Ian Murray confirmed the data was listed for sale on Alibaba and called it an 'unacceptable abuse' of data. The Biobank charity informed the government about the data breach on Monday. The information did not include names, addresses, contact details, or telephone numbers. The data had been legitimately downloaded by three research institutions in China, which have since had their access revoked. No purchases were made from the three listings on Alibaba. The listings have been taken down, and the Chinese government cooperated.
UK Biobank was founded in 2003 by the Department of Health and medical research charities. It is one of the world's most comprehensive health information stores and has driven breakthroughs in cancer, dementia, and diabetes research. UK Biobank data has been cited in more than 18,000 peer-reviewed scientific papers. In late 2024, the government extended Biobank's access to volunteers' GP records. Until late 2024, researchers were free to download data directly onto their own computer systems, a policy that may have contributed to the exposures.
Scientists approved to access Biobank data have sometimes been careless about security, according to the Guardian investigation. The exact number of distinct datasets exposed on GitHub, and how many participants' data are still available online, remain unclear. It is also unknown how the data files came to be listed on Alibaba: whether the Chinese research institutions that downloaded them then sold them, or whether there was another breach. UK Biobank has not detailed the additional security measures it is implementing beyond training and legal notices. The exact timeline of the Alibaba listings, including when they were taken down, is also unclear. It remains to be seen whether UK Biobank will change its data access policies to prevent future leaks, for example by requiring researchers to use a secure analysis platform instead of downloading data.
