Computational Approaches to Understand Mechanisms of Human Genetic Disorders.
Details
- Material Type
- Dissertation
- Control Number
- 0017164663
- International Standard Book Number
- 9798342761796
- Dewey Decimal Classification Number
- 590
- Main Entry-Personal Name
- Zhong, Guojie.
- Publication, Distribution, etc. (Imprint)
- [S.l.] : Columbia University, 2024
- Publication, Distribution, etc. (Imprint)
- Ann Arbor : ProQuest Dissertations & Theses, 2024
- Physical Description
- 151 p.
- General Note
- Source: Dissertations Abstracts International, Volume: 86-05, Section: B.
- General Note
- Advisor: Shen, Yufeng.
- Dissertation Note
- Thesis (Ph.D.)--Columbia University, 2024.
- Summary, Etc.
- Human genetics is one of the strongest risk factors for complex diseases. Understanding the effects of genetic variations not only serves as a fundamental approach to studying disease mechanisms but also offers unprecedented opportunities for improved clinical screening, disease diagnosis and therapeutic discoveries. Despite decades of extensive DNA sequencing and genetic research involving large cohorts, two major challenges remain. First, the majority of disease risk genes remain unidentified due to limited statistical power. Second, the functional effects of rare variants, especially missense variants, in disease risk genes are understudied. In this thesis, I describe new computational approaches to address those challenges using statistical genetics and machine learning methods that incorporate intuition about biological mechanisms.
First, I worked on a statistical framework that can identify disease-related pathways from de novo coding variant data. I applied this framework to study the genetics of esophageal atresia / tracheoesophageal fistula (EA/TEF) and identified several potential disease-causal pathways that are involved in endosome trafficking. Next, I developed a new method for identifying disease risk genes by integrating genetic (rare de novo variants) and functional genomics data. Identifying risk genes using rare variants typically has low statistical power due to the rarity of genotype data. Using functional genomics data has the potential to address this challenge because such data serve as informative priors of disease risk. Therefore, I developed a statistical method called VBASS. VBASS is a semi-supervised algorithm that uses a neural network to encode biological priors, such as cell type-specific expression values, into a rigorous Bayesian statistical model to increase statistical power. On simulated data, VBASS demonstrated proper error rate control and better power than current state-of-the-art methods. We applied VBASS to congenital heart disease (CHD) and autism spectrum disorder (ASD), identifying several novel disease risk genes along with their associated cell types.
Finally, I focused on predicting the functional mechanisms of missense variants that cause diseases. Pathogenic missense variants may act through different modes of action (e.g., gain-of-function or loss-of-function) by affecting various aspects of protein function. These variants may result in distinct clinical conditions requiring different treatments, yet current computational tools cannot distinguish between them because their predictions rely heavily on evolutionary conservation data. The recent breakthrough of AI-powered protein structure prediction tools provides an opportunity to address this challenge because the functional mechanisms of variants are intrinsically embedded in their structural properties. Therefore, I developed a deep learning method called PreMode. PreMode is a pretrained SE(3)-equivariant graph neural network model designed to capture the effects of missense variants from their structural contexts and evolutionary information. I pretrained PreMode using labeled pathogenicity data to enable the model to learn a general representation of variant effects, followed by protein-specific transfer learning to predict mode-of-action effects. I applied PreMode to the mode-of-action predictions of 17 genes and demonstrated that PreMode achieved state-of-the-art performance compared to existing models. PreMode has various applications, including identifying novel gain/loss-of-function variants, improving the study design of deep mutational scans, and optimization in protein engineering.
- Subject Added Entry-Topical Term
- Systematic biology.
- Subject Added Entry-Topical Term
- Genetics.
- Subject Added Entry-Topical Term
- Biostatistics.
- Subject Added Entry-Topical Term
- Bioinformatics.
- Index Term-Uncontrolled
- Birth defects
- Index Term-Uncontrolled
- Computational biology
- Index Term-Uncontrolled
- Developmental disorders
- Index Term-Uncontrolled
- Human genetics
- Index Term-Uncontrolled
- Machine learning
- Added Entry-Corporate Name
- Columbia University. Cellular, Molecular and Biomedical Studies.
- Host Item Entry
- Dissertations Abstracts International. 86-05B.
- Electronic Location and Access
- This material is available after logging in.
- Control Number
- joongbu:654746
MARC
■008250224s2024 us ||||||||||||||c||eng d
■001000017164663
■00520250211153029
■006m o d
■007cr#unu||||||||
■020 ▼a9798342761796
■035 ▼a(MiAaPQ)AAI31634638
■040 ▼aMiAaPQ▼cMiAaPQ
■0820 ▼a590
■1001 ▼aZhong, Guojie.
■24510▼aComputational Approaches to Understand Mechanisms of Human Genetic Disorders.
■260 ▼a[S.l.]▼bColumbia University. ▼c2024
■260 1▼aAnn Arbor▼bProQuest Dissertations & Theses▼c2024
■300 ▼a151 p.
■500 ▼aSource: Dissertations Abstracts International, Volume: 86-05, Section: B.
■500 ▼aAdvisor: Shen, Yufeng.
■5021 ▼aThesis (Ph.D.)--Columbia University, 2024.
■520 ▼aHuman genetics is one of the strongest risk factors for complex diseases. Understanding the effects of genetic variations not only serves as a fundamental approach to studying disease mechanisms but also offers unprecedented opportunities for improved clinical screening, disease diagnosis and therapeutic discoveries. Despite decades of extensive DNA sequencing and genetic research involving large cohorts, two major challenges remain. First, the majority of disease risk genes remain unidentified due to limited statistical power. Second, the functional effects of rare variants, especially missense variants, in disease risk genes are understudied. In this thesis, I describe new computational approaches to address those challenges using statistical genetics and machine learning methods that incorporate intuition about biological mechanisms. First, I worked on a statistical framework that can identify disease-related pathways from de novo coding variant data. I applied this framework to study the genetics of esophageal atresia / tracheoesophageal fistula (EA/TEF) and identified several potential disease-causal pathways that are involved in endosome trafficking. Next, I developed a new method for identifying disease risk genes by integrating genetic (rare de novo variants) and functional genomics data. Identifying risk genes using rare variants typically has low statistical power due to the rarity of genotype data. Using functional genomics data has the potential to address this challenge because such data serve as informative priors of disease risk. Therefore, I developed a statistical method called VBASS. VBASS is a semi-supervised algorithm that uses a neural network to encode biological priors, such as cell type-specific expression values, into a rigorous Bayesian statistical model to increase statistical power. On simulated data, VBASS demonstrated proper error rate control and better power than current state-of-the-art methods. We applied VBASS to congenital heart disease (CHD) and autism spectrum disorder (ASD), identifying several novel disease risk genes along with their associated cell types. Finally, I focused on predicting the functional mechanisms of missense variants that cause diseases. Pathogenic missense variants may act through different modes of action (e.g., gain-of-function or loss-of-function) by affecting various aspects of protein function. These variants may result in distinct clinical conditions requiring different treatments, yet current computational tools cannot distinguish between them because their predictions rely heavily on evolutionary conservation data. The recent breakthrough of AI-powered protein structure prediction tools provides an opportunity to address this challenge because the functional mechanisms of variants are intrinsically embedded in their structural properties. Therefore, I developed a deep learning method called PreMode. PreMode is a pretrained SE(3)-equivariant graph neural network model designed to capture the effects of missense variants from their structural contexts and evolutionary information. I pretrained PreMode using labeled pathogenicity data to enable the model to learn a general representation of variant effects, followed by protein-specific transfer learning to predict mode-of-action effects. I applied PreMode to the mode-of-action predictions of 17 genes and demonstrated that PreMode achieved state-of-the-art performance compared to existing models. PreMode has various applications, including identifying novel gain/loss-of-function variants, improving the study design of deep mutational scans, and optimization in protein engineering.
■590 ▼aSchool code: 0054.
■650 4▼aSystematic biology.
■650 4▼aGenetics.
■650 4▼aBiostatistics.
■650 4▼aBioinformatics.
■653 ▼aBirth defects
■653 ▼aComputational biology
■653 ▼aDevelopmental disorders
■653 ▼aHuman genetics
■653 ▼aMachine learning
■690 ▼a0423
■690 ▼a0369
■690 ▼a0715
■690 ▼a0308
■71020▼aColumbia University▼bCellular, Molecular and Biomedical Studies.
■7730 ▼tDissertations Abstracts International▼g86-05B.
■790 ▼a0054
■791 ▼aPh.D.
■792 ▼a2024
■793 ▼aEnglish
■85640▼uhttp://www.riss.kr/pdu/ddodLink.do?id=T17164663▼nKERIS▼zThe full text of this material is provided by the Korea Education and Research Information Service (KERIS).