Machine learning helps construct diagnostic models for IPF
New study reveals lipid-related gene activity changes in patient lungs

Researchers in China used machine learning — a type of artificial intelligence in which computers detect patterns in datasets and then make predictions — to create diagnostic and prognostic models for use in idiopathic pulmonary fibrosis (IPF).
Using the new models, the team observed gene activity changes suggestive of altered lipid (fat) metabolism in the lungs of people with IPF. According to the scientists, these gene activity data could be used to develop algorithms for identifying IPF and predicting its prognosis.
In lab studies, reducing the activity of the KLF4 gene promoted processes related to scarring, medically known as fibrosis.
“This study successfully constructed lipid-related diagnostic and prognostic models for IPF and identified KLF4 as a potential causative gene,” the researchers wrote. “These findings provide a foundation for further exploration of lipid metabolism in IPF pathogenesis [disease development] and potential therapeutic strategies targeting KLF4.”
Titled “Machine learning identifies lipid-associated genes and constructs diagnostic and prognostic models for idiopathic pulmonary fibrosis,” the study was published in the Orphanet Journal of Rare Diseases.
IPF, a lung disease without a known cause, is the most common type of pulmonary fibrosis (PF), in which inflammation and fibrosis in the lungs make it harder to breathe.
Fatty compounds, or lipids, are abundant in the lungs, where processes related to lipid metabolism (the chemical reactions through which lipids are made, stored, and broken down) are highly active.
Research suggests that dysregulation of lipid metabolism may be involved in the development of IPF, potentially leading to increased cell death and production of fibrosis-promoting molecules.
Researchers sought to ID lipid-related genes implicated in IPF
In this study, scientists from Sichuan Provincial People’s Hospital, University of Electronic Science and Technology of China, aimed to identify lipid-related genes that might be implicated in IPF. To that end, the team examined large gene activity datasets obtained from lung tissue, lung fluid, and blood samples of people with and without the lung disease.
Clusters of lipid-related genes had different activity levels in IPF versus healthy lungs, the data showed.
Through a series of analyses, the researchers found that one group of genes in particular had the strongest correlation with measures of lung function and other IPF-related genes. These genes also were linked to other heart and lung diseases, including congestive heart failure, pulmonary hypertension, chronic obstructive pulmonary disease, and asthma.
Genes in this group were commonly involved in biological processes related to immune regulation, metabolism, and signaling of the extracellular matrix, the network of proteins and other components that gives cells structural support.
The scientists then used artificial intelligence to create a diagnostic algorithm based on the activity of 15 different hub genes in lung tissue. According to the team, “in both the training and three validation sets,” the model was found to have “excellent predictive performance,” with accuracy of about 95% in distinguishing people with IPF from people without it.
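For readers unfamiliar with the term, accuracy here can be read as the share of samples a model labels correctly. The short Python sketch below illustrates that calculation only. The labels are invented for illustration and are not study data, and the function name `accuracy` is a hypothetical helper, not part of the researchers' method.

```python
def accuracy(true_labels, predicted_labels):
    """Return the fraction of samples whose predicted label matches the true one."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    if not true_labels:
        raise ValueError("at least one sample is required")
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


# Invented example: 1 = IPF, 0 = no IPF. These are NOT data from the study.
true_labels = [1] * 10 + [0] * 10
predicted_labels = [1] * 10 + [0] * 9 + [1]  # one person without IPF is misclassified

example_accuracy = accuracy(true_labels, predicted_labels)  # 19 of 20 correct
```

In this made-up case, 19 of 20 samples are labeled correctly, which corresponds to an accuracy of 95%.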
A prognostic model was also built, which used the activity of 10 genes to calculate a lipid-related risk score. The data showed that people who had a high risk score exhibited a significantly worse prognosis than those with a low risk score.
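The article does not give the formula behind the risk score. A common approach for gene-based scores of this kind is a weighted sum of each gene's activity, with patients then split into high- and low-risk groups at the median score. The Python sketch below assumes that approach purely for illustration. The gene names, weights, and patient values are hypothetical placeholders, not the 10 genes or coefficients from the study.

```python
from statistics import median


def risk_score(expression, weights):
    """Weighted sum of gene activity values (assumed form of the score)."""
    return sum(weights[gene] * expression[gene] for gene in weights)


def split_by_median(scores):
    """Label each patient 'high' if above the median score, otherwise 'low'."""
    cutoff = median(scores.values())
    return {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}


# Hypothetical weights and activity values; not taken from the study.
weights = {"GENE_A": 0.5, "GENE_B": -0.25}
patients = {
    "p1": {"GENE_A": 2.0, "GENE_B": 4.0},
    "p2": {"GENE_A": 4.0, "GENE_B": 0.0},
    "p3": {"GENE_A": 0.0, "GENE_B": 4.0},
    "p4": {"GENE_A": 6.0, "GENE_B": 2.0},
}

scores = {pid: risk_score(expr, weights) for pid, expr in patients.items()}
groups = split_by_median(scores)
```

Under these assumptions, a positive weight means higher activity of that gene raises the score, while a negative weight means it lowers the score. The median cutoff is one conventional choice and may differ from the threshold the researchers used.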
Additional analyses further showed a pattern of alterations in lipid metabolism within the IPF lungs.
KLF4 gene may be potential therapeutic target in IPF
Looking specifically at gene activity in individual cell types, the scientists noticed that people with IPF had more alveolar type 2 epithelial (AT2) cells with low activity of the KLF4 gene than people without the disease.
AT2 cells line the alveoli (the tiny air sacs where the lungs and blood exchange oxygen and carbon dioxide) and secrete surfactant, a substance composed of about 90% lipids that protects and supports alveolar function.
In lab-grown cells, experimentally reducing KLF4 activity promoted fibrosis, suggesting that the reduced activity of this gene in IPF may contribute to lung scarring.
Previous studies have found that levels of the KLF4 protein are reduced in the lungs of people with IPF and in animal models of the disease. This reduction correlates with worse fibrosis and impaired function of cells lining the alveoli.
The researchers indicated that while the data highlight KLF4’s potential as a therapeutic target, additional mechanistic experiments “are needed to clarify how KLF4 regulates fibrosis-related signaling pathways at cellular and molecular levels.”
Moreover, while the diagnostic and predictive models that were created “provide a valuable framework for future studies and clinical applications,” more work will be needed to further validate and improve them, the team concluded.