JoLab.AI :: Publications

Uncertainty-aware genomic classification of Alzheimer's disease: a transformer-based ensemble approach with Monte Carlo dropout

Taeho Jo, Eun Hye Lee, for the Alzheimer's Disease Neuroimaging Initiative (ADNI) and the Alzheimer's Disease Sequencing Project (ADSP), Briefings in Bioinformatics (2025) TrUE-Net combines transformer and random forest models with Monte Carlo Dropout to provide uncertainty-aware AD classification from WGS data. Analyzing 1,050 individuals (607 AD, 443 controls) from ADNI, the framework achieved overall accuracy of 65.1% with AUC 0.664. By stratifying predictions based on uncertainty, the model identified high-confidence predictions (24.6% of samples) with 72.9% accuracy and F1 0.821. This approach enables identification of reliable predictions while acknowledging inherent uncertainty in genomic classification, providing a framework for more informed clinical decision-making.

Genomics & AI
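The uncertainty stratification described for TrUE-Net can be illustrated with a small sketch. This is not the published model: it assumes a toy logistic classifier with dropout applied to input features, synthetic data, and hypothetical names (`mc_dropout_predict`, `stratify_by_uncertainty`, `high_conf_fraction`). It only shows the general Monte Carlo Dropout idea of keeping dropout active at prediction time, repeating the forward pass, and treating the spread of the outputs as an uncertainty score.

```python
import numpy as np

def mc_dropout_predict(x, weights, n_passes=50, drop_rate=0.5, seed=0):
    """Run several stochastic forward passes of a toy logistic model with dropout kept on."""
    rng = np.random.default_rng(seed)
    probs = np.empty((n_passes, x.shape[0]))
    for t in range(n_passes):
        # A fresh dropout mask is sampled on every pass (inverted-dropout scaling).
        mask = rng.random(x.shape[1]) >= drop_rate
        logits = (x * mask / (1.0 - drop_rate)) @ weights
        probs[t] = 1.0 / (1.0 + np.exp(-logits))
    # Mean is the prediction; standard deviation is the uncertainty estimate.
    return probs.mean(axis=0), probs.std(axis=0)

def stratify_by_uncertainty(std, high_conf_fraction=0.25):
    """Return a boolean mask marking the samples with the lowest predictive spread."""
    cutoff = np.quantile(std, high_conf_fraction)
    return std <= cutoff

# Synthetic stand-in data (assumption): 200 samples, 10 features.
rng = np.random.default_rng(1)
x = rng.normal(size=(200, 10))
weights = rng.normal(size=10)

mean_prob, std_prob = mc_dropout_predict(x, weights)
high_conf = stratify_by_uncertainty(std_prob)
print(f"{high_conf.sum()} of {len(high_conf)} samples fall in the high-confidence stratum")
```

Accuracy can then be reported separately for the `high_conf` subset and for the remaining samples, which is the kind of stratified reporting the abstract describes.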

Longitudinal plasma proteomics: relation to incident Alzheimer's disease dementia and biomarkers

Eun Hye Lee, Yen-Ning Huang, Tamina Park, Taeho Jo, Kwangsik Nho, Andrew J. Saykin, for the Indiana Alzheimer Disease Research Center, Alzheimer's & Dementia (2025) Longitudinal proteomics analysis identified dynamic changes in plasma proteins associated with AD progression. Seven proteins (ACES, C7, ZCD1, IL-17C, CC055, SO5A1, IGFALS) showed significant associations with baseline cognitive stage and incident ADD. Machine learning models incorporating these protein trajectories achieved AUC 0.848 for predicting ADD conversion. The study of 346 participants with repeated measurements demonstrated that protein changes were correlated with AD imaging biomarkers and provided superior predictive performance compared to static biomarkers alone.

Proteomics & AI

LD‐informed deep learning for Alzheimer's gene loci detection using WGS data

Taeho Jo, Paula Bice, Kwangsik Nho, Andrew J. Saykin, the Alzheimer's Disease Sequencing Project, Alzheimer's & Dementia: TRCI (2025) Deep-Block is a multi-stage deep learning framework designed to detect AD-associated genetic loci in large-scale WGS data. It segments the genome based on linkage disequilibrium, applies sparse attention to select key blocks, and evaluates SNP feature importance with TabNet/RF. In a study of 7416 participants, 30,218 LD blocks were identified, including novel variants and established APOE loci. The results were supported by eQTL analysis across 13 brain regions and comparisons to existing GWAS data.

Genomics & AI
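The first Deep-Block stage, segmenting the genome by linkage disequilibrium, can be sketched as follows. The exact blocking procedure is not given in this listing, so this is only an assumed, simplified rule: adjacent SNPs stay in the same block while their squared genotype correlation (r²) is at or above a threshold. The function name `ld_blocks`, the threshold of 0.5, and the toy genotype matrix are all hypothetical.

```python
import numpy as np

def ld_blocks(genotypes, r2_threshold=0.5):
    """Group adjacent SNP columns into blocks; a new block starts when adjacent r^2 falls below the threshold."""
    n_snps = genotypes.shape[1]
    blocks, current = [], [0]
    for j in range(1, n_snps):
        # Pearson correlation between neighbouring SNP columns (0/1/2 allele counts).
        r = np.corrcoef(genotypes[:, j - 1], genotypes[:, j])[0, 1]
        if r * r >= r2_threshold:
            current.append(j)
        else:
            # Low (or undefined) correlation: close the current block.
            blocks.append(current)
            current = [j]
    blocks.append(current)
    return blocks

# Toy genotype matrix (assumption): 8 individuals x 4 SNPs.
a = np.array([0, 1, 2, 0, 1, 2, 0, 1])
c = np.array([1, 1, 0, 2, 0, 1, 2, 0])
genotypes = np.column_stack([a, a, c, c])

blocks = ld_blocks(genotypes)
print(blocks)
```

Each resulting block would then be scored as a unit before any per-SNP importance analysis, which keeps the number of candidate inputs far smaller than the number of variants.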

Circular-SWAT for deep learning based diagnostic classification of Alzheimer’s disease: Application to metabolome data

Taeho Jo, Junpyo Kim, Paula Bice, Kevin Huynh, Tingting Wang, Matthias Arnold, Peter J. Meikle, Corey Giles, Rima Kaddurah-Daouk, Andrew J. Saykin, Kwangsik Nho, eBioMedicine (2023) This study introduces the Circular-Sliding Window Association Test (c-SWAT), a methodology designed to enhance the diagnostic classification of AD using serum-based metabolomics data, with a focus on lipidomics. Leveraging data from 997 participants, c-SWAT integrates feature correlation analysis, feature selection via CNN, and final classification through Random Forest, achieving an accuracy of up to 80.8% and an AUC of 0.808 in distinguishing AD from cognitively normal older adults.

Metabolomics & AI
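To make the "circular sliding window" idea in c-SWAT concrete, here is a minimal sketch of window generation in which the last window wraps around to the first features, so every feature appears in a full-sized window. How c-SWAT orders features and sizes its windows is not stated in this listing; `circular_windows`, `window_size`, and `stride` are hypothetical names chosen for illustration.

```python
def circular_windows(n_features, window_size, stride):
    """Yield index lists for sliding windows that wrap past the last feature back to the first."""
    if not 0 < window_size <= n_features:
        raise ValueError("window_size must be between 1 and n_features")
    for start in range(0, n_features, stride):
        # The modulo makes the window circular instead of truncating at the end.
        yield [(start + offset) % n_features for offset in range(window_size)]

windows = list(circular_windows(n_features=10, window_size=4, stride=3))
for w in windows:
    print(w)
```

In a pipeline of the kind described, each window of features would be passed to the feature-selection model in turn, and the per-window results combined before the final classifier is trained.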

Deep Learning-based Integration of Neuroimaging and Genetic Data for Classification of Alzheimer's Disease

Taeho Jo, Kwangsik Nho, Shannon L. Risacher, Andrew J. Saykin, AAIC (2023) This study introduces a new deep learning method using CNNs to analyze tau PET images and identify Alzheimer's Disease (AD) related patterns. The method achieved a 90.8% accuracy in classifying AD and highlighted significant tau deposition regions associated with AD. Additionally, we used the SWAT method to find AD-related SNPs, uncovering key genetic loci, including the known APOE regions, and achieved an AUC of 0.82.

Precision Medicine

Deep Learning-based SWAT-Tab Approach for Identifying Genetic Variants using Whole Genome Sequencing

Taeho Jo, Kwangsik Nho, Andrew J. Saykin, AAIC (2023) The study introduces SWAT-TAB, an evolved form of SWAT-CNN, optimized for identifying genetic variants in Alzheimer's disease (AD). It uses the TabNet algorithm, which selects relevant features through sequential attention, and was applied to ADSP WGS data, revealing pivotal genetic features. SWAT-TAB required less processing time and was easier to implement than its predecessor.

Genomics & AI

Novel circling SWAT for deep learning based diagnostic classification of Alzheimer’s disease: Application to metabolome data

Taeho Jo, Junpyo Kim, Paula Bice, Kevin Huynh, Tingting Wang, Peter J Meikle, Rima Kaddurah-Daouk, Kwangsik Nho, Andrew J. Saykin, AAIC (2022) We used serum-based cross-sectional lipidome data with 781 lipids from the Alzheimer’s Disease Neuroimaging Initiative (ADNI), including 216 cognitively normal (CN), 635 MCI, and 382 dementia (AD) participants. Phenotype influence scores (PIS) were derived by a deep learning-based circling Sliding Window Association Test approach (Circling SWAT), an extension of SWAT (Jo et al., 2022) with correlation heatmap and dendrogram analysis for omics data with minimal features.

Metabolomics & AI

Deep learning-based identification of genetic variants: application to Alzheimer’s disease classification

Taeho Jo, Kwangsik Nho, Paula Bice, Andrew J Saykin, For The Alzheimer’s Disease Neuroimaging Initiative, Briefings in Bioinformatics (2022) We propose a novel three-step deep learning approach (SWAT-CNN) for identifying phenotype-related single nucleotide polymorphisms (SNPs) that can be applied to develop accurate disease classification models. We tested our approach using GWAS data from the ADNI (N = 981; CN = 650, AD = 331). Our approach identified the well-known APOE region as the most significant genetic locus for AD. Our classification model achieved an AUC of 0.82.

Genomics & AI

Deep learning–based genome-wide association analysis in Alzheimer’s disease

Taeho Jo, Kwangsik Nho, Andrew J. Saykin, AAIC (2021) We used genome-wide genotyping data (12,448,786 SNPs following imputation) from 916 participants in the Alzheimer’s Disease Neuroimaging Initiative (458 cognitively normal controls and 458 AD patients). A convolutional neural network (CNN) consisting of convolutional, pooling and fully connected Softmax layers was used in a two-stage approach.

Genomics & AI

Deep learning detection of informative features in tau PET for Alzheimer’s disease classification

Taeho Jo, Kwangsik Nho, Shannon L. Risacher & Andrew J. Saykin for the Alzheimer’s Disease Neuroimaging Initiative, BMC Bioinformatics (2020) We developed a deep learning-based framework to identify informative features for AD classification using tau positron emission tomography (PET) scans. The 3D convolutional neural network (CNN)-based model for classifying AD versus cognitively normal (CN) participants yielded an average accuracy of 90.8% based on five-fold cross-validation. The LRP model identified the brain regions in tau PET images that contributed most to distinguishing AD from CN.

Neuroimaging & AI
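The five-fold cross-validation used to report the average accuracy above follows a standard pattern, sketched here in plain NumPy. This is a generic illustration, not the study's pipeline: the helper names (`five_fold_indices`, `cross_validated_accuracy`, `majority_class`) are hypothetical, the labels are synthetic, and a trivial majority-class predictor stands in for the 3D CNN.

```python
import numpy as np

def five_fold_indices(n_samples, n_folds=5, seed=0):
    """Yield (train_idx, test_idx) pairs so that each sample is tested exactly once."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n_samples), n_folds)
    for i in range(n_folds):
        train_idx = np.concatenate([folds[j] for j in range(n_folds) if j != i])
        yield train_idx, folds[i]

def cross_validated_accuracy(fit_predict, x, y, n_folds=5):
    """Average the held-out accuracy over all folds."""
    scores = []
    for train_idx, test_idx in five_fold_indices(len(y), n_folds):
        preds = fit_predict(x[train_idx], y[train_idx], x[test_idx])
        scores.append(np.mean(preds == y[test_idx]))
    return float(np.mean(scores))

def majority_class(x_train, y_train, x_test):
    """Placeholder model (assumption): always predict the most common training label."""
    return np.full(len(x_test), np.bincount(y_train).argmax())

# Synthetic labels (assumption): 60 controls (0) and 40 cases (1).
y = np.array([0] * 60 + [1] * 40)
x = np.zeros((100, 1))
accuracy = cross_validated_accuracy(majority_class, x, y)
print(f"mean five-fold accuracy: {accuracy:.3f}")
```

Swapping `majority_class` for a function that trains and applies a real classifier gives the averaged held-out accuracy that studies of this kind typically report.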

Deep learning detection of informative features in [18F] flortaucipir PET for Alzheimer’s disease classification

Taeho Jo, Kwangsik Nho, Shannon L. Risacher, Andrew J. Saykin, AAIC (2020) We downloaded 458 tau PET images (196 CN, 196 MCI, and 66 AD) from the Alzheimer’s Disease Neuroimaging Initiative (ADNI) and included only one scan per individual. SPM12 was used to process the tau PET data using standard techniques. We used a 3D convolutional neural network (CNN) method for the classification, and applied a layer-wise relevance propagation (LRP) algorithm to identify informative features and to visualize the classification results. Five-fold cross-validation was applied, where 70% of the entire data set was used for model training, 20% for model testing, and 10% for independent validation.

Neuroimaging & AI

Deep Learning in Alzheimer's Disease: Diagnostic Classification and Prognostic Prediction Using Neuroimaging Data

Taeho Jo, Kwangsik Nho, Andrew J. Saykin, Frontiers in Aging Neuroscience (2019) The application of deep learning to early detection and automated classification of AD has recently gained considerable attention, as rapid progress in neuroimaging techniques has generated large-scale multimodal neuroimaging data. A systematic review of publications using deep learning and neuroimaging data for diagnostic classification of AD was performed. A PubMed and Google Scholar search was used to identify deep learning papers on AD published between Jan 2013 and July 2018. These papers were reviewed, evaluated, and classified by algorithm and neuroimaging type, and the findings were summarized.

Neuroimaging & AI

Multimodal-3DCNN: Diagnostic Classification of Alzheimer's Disease Using Deep Learning on Neuroimaging, Genetic, and Demographic Data

Taeho Jo, Kwangsik Nho, Shannon L. Risacher, Andrew J. Saykin, AAIC (2019) Demographic information, 3D MRI and PET image data, and APOE data were downloaded from the ADNI data repository (N=329; 185 CN and 144 AD). In our novel Multimodal-3DCNN approach, we first applied a 3D Convolutional Neural Network (3D-CNN) to multimodal neuroimaging (MRI and PET) and then combined the output of the 3D-CNN with APOE ε4 genotype and demographic information (age, sex, education, handedness, etc.) using a gram matrix method (mCNN; Jo et al. AAIC2018). Finally, a Deep Neural Network (DNN) was used to distinguish individuals with AD from CN. A 5-fold cross-validation approach was employed to evaluate performance.

Neuroimaging & AI

Multimodal-CNN: Improved Accuracy of MRI-based Classification of Alzheimer’s Disease by Incorporating Clinical Data in Deep Learning

Taeho Jo, Kwangsik Nho, Shannon L. Risacher, Jingwen Yan, Andrew J. Saykin, AAIC (2018) Intermediate layers of the CNN were extracted, and the patient's clinical information was added by the gram matrix method. The clinical information was encoded as 2D matrices in this method, and the 2D images for the training set were extracted using the hippocampal segmentations, downloaded from the LONI ADNI site, that were carried out using Surgical Navigation Technologies (SNT). CNN with augmentation was performed on baseline scans from 103 participants with AD and 144 cognitively normal (CN) controls. Global CDR scores and the number of APOE ε4 alleles were included as clinical and genetic data.

Neuroimaging & AI

Evaluation of Protein Structural Models Using Random Forests

Renzhi Cao, Taeho Jo, Jianlin Cheng, arXiv (2016) We propose a new protein quality assessment method which can predict both local and global quality of protein 3D structural models. Our method uses both multi-model and single-model quality assessment methods for global quality assessment, and uses chemical, physical, and geometrical features together with the global quality score for local quality assessment. CASP9 targets are used to generate the features for local quality assessment. We evaluate the performance of our local quality assessment method on CASP10, where it is comparable with two state-of-the-art QA methods based on the average absolute distance between the real and predicted distance. We blindly tested our method on CASP11, and the good performance shows that combining single-model and multi-model quality assessment methods could be a good way to improve the accuracy of model quality assessment.

Proteomics & AI

Improving Protein Fold Recognition by Deep Learning Networks

Taeho Jo, Jie Hou, Jesse Eickholt & Jianlin Cheng, Scientific Reports (2015)

Proteomics & AI

Improving protein fold recognition by random forest

Taeho Jo & Jianlin Cheng, BMC Bioinformatics (2014) RF-Fold consists of hundreds of decision trees that can be trained efficiently on very large datasets to make accurate predictions on a highly imbalanced dataset. We evaluated RF-Fold on the standard Lindahl's benchmark dataset comprised of 976 × 975 target-template protein pairs through cross-validation. Compared with 17 different fold recognition methods, the performance of RF-Fold is generally comparable to the best performance at each difficulty level, from the easiest family level through the medium-hard superfamily level to the hardest fold level. Based on the top-one template protein ranked by RF-Fold, the correct recognition rate is 84.5%, 63.4%, and 40.8% at family, superfamily, and fold levels, respectively. Based on the top-five template protein folds ranked by RF-Fold, the correct recognition rate increases to 91.5%, 79.3%, and 58.3% at family, superfamily, and fold levels, respectively.

Proteomics & AI
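The top-one and top-five recognition rates quoted for RF-Fold are instances of a top-k metric: a query counts as correctly recognized if any of its k highest-ranked templates is a true match. The sketch below shows that calculation on made-up rankings; the function name, the query and template identifiers, and the data are hypothetical and unrelated to the Lindahl benchmark.

```python
def top_k_recognition_rate(ranked_templates, correct_templates, k):
    """Fraction of queries with at least one correct template among their top-k ranked templates."""
    hits = 0
    for query, ranking in ranked_templates.items():
        if any(template in correct_templates[query] for template in ranking[:k]):
            hits += 1
    return hits / len(ranked_templates)

# Toy rankings (assumption): templates sorted from most to least confident per query.
ranked = {
    "q1": ["t3", "t1", "t2"],
    "q2": ["t5", "t4", "t6"],
    "q3": ["t9", "t8", "t7"],
}
correct = {"q1": {"t1"}, "q2": {"t5"}, "q3": {"t0"}}

top1 = top_k_recognition_rate(ranked, correct, k=1)
top2 = top_k_recognition_rate(ranked, correct, k=2)
print(top1, top2)
```

Because a larger k can only add hits, the top-five rate is never lower than the top-one rate, which matches the pattern in the reported figures.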

Homology Modeling of an Algal Membrane Protein, Heterosigma akashiwo Na+-ATPase

Taeho Jo, Mariko Shono, Masato Wada, Sayaka Ito, Junko Nomoto, Yukichi Hara, Membrane (2010) The three-dimensional structure of Heterosigma akashiwo Na+-ATPase (HANA) was predicted by means of homology modeling based on the crystal structure of the K+-bound form of shark Na+/K+-ATPase (PDB ID: 2ZXE). The overall structure of HANA appears to be similar to that of shark Na+/K+-ATPase. Both contain three characteristic cytoplasmic domains, A, N and P, which are unique to P-type ATPases. HANA has a long TM7–8 junction as a large extracellular domain, in place of the β-subunit of shark Na+/K+-ATPase. Two putative K+-binding sites in the transmembrane domain of HANA were identified by means of valence mapping based on the constructed structure. The presence of K+-binding sites and the reported ion requirements for ATPase activity and EP formation indicate that HANA may transport K+ ions in the same manner as animal Na+/K+...

Proteomics & AI
