DOI: 10.1055/s-0044-1800725
Best Paper Selection
Appendix: Content Summaries of Selected Best Papers for the 2024 IMIA Yearbook, Section Bioinformatics and Translational Informatics
Bhattacharya A, Vo DD, Jops C, Kim M, Wen C, Hervoso JL, Pasaniuc B, Gandal MJ.
Isoform-level transcriptome-wide association uncovers genetic risk mechanisms for neuropsychiatric disorders in the human brain.
Nat Genet. 2023 Dec;55(12):2117-2128.
doi: 10.1038/s41588-023-01560-2
This study introduces the isoform-level transcriptome-wide association study (isoTWAS), a multivariate, stepwise approach that uses genetic data to impute isoform-level expression and then tests the imputed expression for association with phenotypes. Unlike traditional gene-level methods, isoTWAS leverages the distinct expression profiles of the transcript isoforms produced by splicing in brain tissue, improving both prediction accuracy and the power to discover trait associations within genetic loci identified by genome-wide association studies (GWAS). The researchers demonstrated the efficacy of isoTWAS across 15 neuropsychiatric traits and showed that it significantly outperforms gene-level models, providing a more detailed view of the transcriptomic mechanisms underlying genetic associations. Notably, isoTWAS identified multiple associations undetectable at the gene level, such as isoforms of the genes AKT3, CUL3, HSPD1, and PCLO, emphasizing the importance of considering isoform-level variation in complex trait mapping. Key findings include improved prediction of isoform and gene expression, which translates directly into increased power for identifying trait associations. The isoTWAS framework adjusts for multiple testing and local linkage disequilibrium structure, enhancing the robustness of its genetic associations. This work underscores the value of incorporating isoform-level resolution in genetic studies of brain-related traits, offering new avenues for understanding the molecular basis of neuropsychiatric disorders and potentially guiding more targeted therapeutic strategies.
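The impute-then-associate logic summarized above can be illustrated with a small simulation. This is a minimal sketch, not the authors' implementation: it substitutes a closed-form multivariate ridge regression for the prediction models used in isoTWAS, uses a simple Bonferroni correction in place of the framework's stepwise testing, and runs on simulated data. All names (`fit_ridge`, `association_t`, `W_true`) and all sample sizes and effect sizes are assumptions made for illustration.

```python
import math
import numpy as np

rng = np.random.default_rng(0)
n_ref, n_gwas, n_snps, n_iso = 300, 2000, 20, 3  # hypothetical sizes

def simulate_genotypes(n):
    """Simulated, column-centered allele counts (0/1/2) for n individuals."""
    g = rng.binomial(2, 0.3, size=(n, n_snps)).astype(float)
    return g - g.mean(axis=0)

# Hypothetical truth: each isoform is driven by its own block of 5 SNPs.
W_true = np.zeros((n_snps, n_iso))
for k in range(n_iso):
    W_true[5 * k:5 * (k + 1), k] = 0.5

# Reference panel with genotypes and measured isoform expression.
G_ref = simulate_genotypes(n_ref)
E_ref = G_ref @ W_true + rng.normal(scale=0.5, size=(n_ref, n_iso))

# GWAS cohort with genotypes and a phenotype that depends on isoform 0 only.
G_gwas = simulate_genotypes(n_gwas)
phenotype = 0.8 * (G_gwas @ W_true[:, 0]) + rng.normal(size=n_gwas)

def fit_ridge(G, E, lam=1.0):
    """Closed-form multivariate ridge: SNP weights for all isoforms at once."""
    return np.linalg.solve(G.T @ G + lam * np.eye(G.shape[1]), G.T @ E)

def association_t(imputed, y):
    """t statistic of the correlation between each imputed isoform and y."""
    n = len(y)
    x = (imputed - imputed.mean(axis=0)) / imputed.std(axis=0)
    z = (y - y.mean()) / y.std()
    r = x.T @ z / n
    return r * np.sqrt(n - 2) / np.sqrt(1 - r ** 2)

# Step 1: learn SNP-to-isoform weights in the reference panel.
W_hat = fit_ridge(G_ref, E_ref)
# Step 2: impute isoform expression in the GWAS cohort and test each isoform.
t_stats = association_t(G_gwas @ W_hat, phenotype)
# Step 3: two-sided normal-approximation p-values, Bonferroni-corrected.
p_values = np.array([math.erfc(abs(t) / math.sqrt(2)) for t in t_stats])
significant = p_values < 0.05 / n_iso

print("t statistics:", np.round(t_stats, 2))
print("significant after Bonferroni:", significant)
```

In this toy setting only the first isoform influences the phenotype, so its statistic should stand out from the other two. A gene-level model that summed the three isoforms would dilute that signal, which is the intuition behind testing at isoform resolution.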
Li Y, Guo Z, Gao X, Wang G.
MMCL-CDR: enhancing cancer drug response prediction with multi-omics and morphology images contrastive representation learning.
Bioinformatics. 2023 Dec 1;39(12):btad734.
doi: 10.1093/bioinformatics/btad734.
Cancer is a highly heterogeneous and complex disease, which prevents a one-size-fits-all approach to effective treatment. In this study, Li et al. developed Multimodal Contrastive Learning for Cancer Drug Responses (MMCL-CDR), a machine learning model for predicting drug resistance and sensitivity in cancer cell lines. MMCL-CDR leverages two state-of-the-art approaches to learn a representation of the cancer cell (trained on multi-modal data, including gene expression levels, copy-number variation, and cell morphology) and a representation of the cancer drug (derived from a graph convolutional network trained on chemical structures). These two representations are then used as input to a final multilayer perceptron that predicts resistance or sensitivity for that cell-drug combination. MMCL-CDR outperformed its competitors in the area under the receiver operating characteristic curve (AUC = 0.89) and under the precision-recall curve (AUC = 0.90), suggesting that the model is better able to classify cell-drug pairs. Through a series of ablation studies, the authors found that the multi-modal features significantly improve model performance, highlighting the importance of integrating diverse features into multi-omic prediction models. This work has the potential to improve our understanding of the features that lead to cancer drug resistance and sensitivity, as well as to help identify novel anticancer drugs.
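The two ingredients described above, contrastive alignment of cell modalities and a multilayer perceptron over the combined cell and drug representations, can be sketched in a few lines of NumPy. This is an illustrative sketch under stated assumptions, not the MMCL-CDR code: the loss shown is a generic symmetric InfoNCE objective (the summary does not specify the exact loss), the embeddings are random stand-ins for encoder outputs, the weights are untrained, and every name and dimension (`info_nce`, `predict_response`, 16-dimensional embeddings) is hypothetical.

```python
import numpy as np

def l2_normalize(x):
    """Scale each row to unit length so dot products are cosine similarities."""
    return x / np.linalg.norm(x, axis=1, keepdims=True)

def info_nce(a, b, temperature=0.1):
    """Symmetric InfoNCE loss: row i of `a` and row i of `b` form a positive pair."""
    logits = l2_normalize(a) @ l2_normalize(b).T / temperature

    def cross_entropy_on_diagonal(l):
        l = l - l.max(axis=1, keepdims=True)  # numerical stability
        log_prob = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -np.mean(np.diag(log_prob))

    return 0.5 * (cross_entropy_on_diagonal(logits)
                  + cross_entropy_on_diagonal(logits.T))

def predict_response(cell_emb, drug_emb, W1, b1, W2, b2):
    """One-hidden-layer MLP on concatenated embeddings; returns a probability."""
    fused = np.concatenate([cell_emb, drug_emb], axis=1)
    hidden = np.maximum(0.0, fused @ W1 + b1)          # ReLU
    return 1.0 / (1.0 + np.exp(-(hidden @ W2 + b2)))   # sigmoid

rng = np.random.default_rng(0)
n_cells, dim = 8, 16

# Stand-ins for encoder outputs: omics and morphology views of the same cells.
omics_emb = rng.normal(size=(n_cells, dim))
morph_emb = omics_emb + 0.01 * rng.normal(size=(n_cells, dim))

loss_aligned = info_nce(omics_emb, morph_emb)
# Shifting rows by one pairs every cell with the wrong morphology embedding.
loss_mismatched = info_nce(omics_emb, np.roll(morph_emb, 1, axis=0))

# Stand-in for drug embeddings (e.g. the output of a graph encoder).
drug_emb = rng.normal(size=(n_cells, dim))
W1 = 0.1 * rng.normal(size=(2 * dim, 32))
b1 = np.zeros(32)
W2 = 0.1 * rng.normal(size=(32, 1))
b2 = np.zeros(1)
probs = predict_response(omics_emb, drug_emb, W1, b1, W2, b2)

print("loss (aligned pairs):   ", round(float(loss_aligned), 3))
print("loss (mismatched pairs):", round(float(loss_mismatched), 3))
print("prediction shape:", probs.shape)
```

The contrastive loss is low when the two views of the same cell agree and high when they are mismatched, which is what pushes the encoders toward a shared cell representation. The final perceptron then only needs to operate on the fused cell and drug vectors.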
Theodoris CV, Xiao L, Chopra A, Chaffin MD, Al Sayed ZR, Hill MC, Mantineo H, Brydon EM, Zeng Z, Liu XS, Ellinor PT.
Transfer learning enables predictions in network biology.
Nature. 2023 Jun;618(7965):616-624.
doi: 10.1038/s41586-023-06139-9
This study introduces “Geneformer”, a context-aware, attention-based deep learning model designed to enhance predictive accuracy in network biology, especially when data are limited. Built on transfer learning, Geneformer was pretrained on a corpus of approximately 30 million human single-cell transcriptomes representing a broad spectrum of human tissues. This pretraining allowed the model to learn a general understanding of network dynamics, which it could then apply to various downstream tasks. Key insights were demonstrated in the model's application to disease modeling, particularly cardiomyopathy, where Geneformer identified novel candidate therapeutic targets. After fine-tuning with limited disease-specific data, the model predicted genes with potential therapeutic implications, which were subsequently validated experimentally. For instance, inhibition of certain predicted targets in induced pluripotent stem cell-derived cardiomyocytes markedly improved cellular function, underscoring the model's practical utility in identifying actionable biological targets. The paper emphasizes the transformative potential of transfer learning in computational biology, showing how models pretrained on extensive datasets can generalize beyond their training data to provide significant insights in specialized applications where data are scarce.
Liao W-W, Asri M, Ebler J, Doerr D, Haukness M, Hickey G, et al.
A draft human pangenome reference.
Nature. 2023 May;617(7960):312-324.
doi: 10.1038/s41586-023-05896-x.
In this study, the Human Pangenome Reference Consortium presents a draft human pangenome assembled from 47 genetically diverse individuals. Using long-read sequencing from Pacific Biosciences and Oxford Nanopore, the team generated phased, diploid assemblies that reflect human genetic diversity more accurately than a traditional single reference genome. The assemblies cover over 99% of the expected sequence and show high accuracy at both the structural and base-pair levels. Importantly, this new reference includes 119 million base pairs of euchromatic polymorphic sequence that are not present in the current GRCh38 reference genome and identifies 1,115 gene duplications, significantly enriching our genomic reference materials. Using the pangenome in genetic analysis offers substantial improvements over the GRCh38 reference, with a 34% reduction in small variant discovery errors and a doubling in the number of structural variants detected per haplotype. These enhancements are pivotal for advancing our understanding of genetic variation and its implications across different human populations, laying a stronger foundation for future genomic research and medical applications.
The authors declare that they have no conflict of interest.
Publication History
Article published online:
08 April 2025
© 2024. The Author(s). This is an open access article published by Thieme under the terms of the Creative Commons Attribution License, permitting unrestricted use, distribution, and reproduction so long as the original work is properly cited. (https://creativecommons.org/licenses/by/4.0/)
Georg Thieme Verlag KG
Rüdigerstraße 14, 70469 Stuttgart, Germany