Identification of Spatial Domains, Spatially Variable Genes, and Genetic Association Studies of Alzheimer Disease with an Autoencoder-based Fuzzy Clustering Algorithm
- Authors: Cui Y. (1), Wei L. (2), Wang R. (3), Ye X. (1), Sakurai T. (1)
- Affiliations:
  - (1) Department of Computer Science, University of Tsukuba
  - (2) Center of Artificial Intelligence driven Drug Discovery, Faculty of Applied Science, Macao Polytechnic University
  - (3) School of Software, Shandong University
- Issue: Vol 19, No 8 (2024)
- Pages: 765-776
- Section: Life Sciences
- URL: https://jdigitaldiagnostics.com/1574-8936/article/view/644040
- DOI: https://doi.org/10.2174/0115748936278884240102094058
- ID: 644040
Abstract
Introduction: Gene expression profiles and their corresponding spatial information are critical for understanding biological function and mutual regulation and for identifying cell types.
Materials and Methods: Several computational methods have recently been proposed for clustering spatial transcriptomic expression data. Although these algorithms are useful in practice, they do not exploit spatial information effectively and are highly sensitive to noise and outliers. In this study, we propose ACSpot, an autoencoder-based fuzzy clustering algorithm, to address these problems. Specifically, we employ a self-supervised autoencoder to reduce feature dimensionality, mitigate nonlinear noise, and learn high-quality representations. A commonly used clustering method, fuzzy c-means, is then applied to the learned representations. Finally, we use spatial neighbor information to optimize the clustering process and to fine-tune the assignment of each spot to its cluster using probabilistic and statistical methods.
Results and Discussion: A comparative analysis on the 10x Visium human dorsolateral prefrontal cortex (DLPFC) dataset demonstrates that ACSpot outperforms other clustering algorithms. Spatially variable genes were then identified from the clustering outcomes, and their spatial distribution closely resembles the spatial distribution of the subclusters in the clustering results. Notably, these spatially variable genes include APP, PSEN1, APOE, SORL1, BIN1, and PICALM, all of which are well-known Alzheimer's disease-associated genes.
Conclusion: In addition, we applied our model to explore potential Alzheimer's disease-associated genes within the dataset and performed Gene Ontology (GO) enrichment and gene-pathway analyses for validation, illustrating the capability of our model to pinpoint genes linked to Alzheimer's disease.
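The clustering stage described in the abstract can be illustrated with a minimal sketch. It is not the ACSpot implementation: it assumes the low-dimensional embeddings (here a synthetic matrix `Z`) have already been produced by an autoencoder, applies textbook fuzzy c-means updates, and then blends each spot's memberships with those of its nearest spatial neighbors. The function names, the parameters `k` and `alpha`, and the neighbor-averaging rule are all hypothetical stand-ins for the probabilistic refinement that the abstract mentions but does not specify.

```python
import numpy as np


def fuzzy_c_means(X, n_clusters, m=2.0, n_iter=100, tol=1e-5, seed=0):
    """Textbook fuzzy c-means on the rows of X (fuzzifier m > 1)."""
    rng = np.random.default_rng(seed)
    # Random initial memberships; each row sums to 1.
    U = rng.random((X.shape[0], n_clusters))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        W = U ** m
        # Cluster centers are membership-weighted means of the points.
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        # Squared Euclidean distance from every point to every center.
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        d2 = np.maximum(d2, 1e-12)  # avoid division by zero
        inv = d2 ** (-1.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return U, centers


def smooth_with_neighbors(U, coords, k=6, alpha=0.5):
    """Hypothetical refinement: mix each spot's memberships with the
    mean memberships of its k nearest spatial neighbors."""
    d2 = ((coords[:, None, :] - coords[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(d2, np.inf)  # a spot is not its own neighbor
    nbrs = np.argsort(d2, axis=1)[:, :k]
    U_nbr = U[nbrs].mean(axis=1)
    U_s = (1.0 - alpha) * U + alpha * U_nbr
    return U_s / U_s.sum(axis=1, keepdims=True)


# Synthetic stand-in for autoencoder embeddings: two well-separated groups
# of 50 spots each, located in two distinct regions of the tissue plane.
rng = np.random.default_rng(0)
Z = np.vstack([rng.normal(0.0, 0.3, (50, 8)), rng.normal(3.0, 0.3, (50, 8))])
coords = np.vstack([rng.uniform(0, 1, (50, 2)), rng.uniform(2, 3, (50, 2))])

U, centers = fuzzy_c_means(Z, n_clusters=2)
U_refined = smooth_with_neighbors(U, coords)
labels = U_refined.argmax(axis=1)
```

Because fuzzy c-means returns a membership vector per spot rather than a hard label, a spatial refinement step of this kind can operate on soft assignments and only commit to a label at the end, which is one plausible reason to pair the two.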
About the authors
Yaxuan Cui
Department of Computer Science, University of Tsukuba
Email: info@benthamscience.net
Leyi Wei
Center of Artificial Intelligence driven Drug Discovery, Faculty of Applied Science, Macao Polytechnic University
Author for correspondence.
Email: info@benthamscience.net
Ruheng Wang
School of Software, Shandong University
Email: info@benthamscience.net
Xiucai Ye
Department of Computer Science, University of Tsukuba
Author for correspondence.
Email: info@benthamscience.net
Tetsuya Sakurai
Department of Computer Science, University of Tsukuba
Email: info@benthamscience.net
