Using Family Sequencing Data to Understand Sequencing Errors, Meiotic Crossovers, and Disease Risk [electronic resource]
Detailed Information
- Material Type
- Dissertation
- Control Number
- 0016931973
- International Standard Book Number
- 9798379658496
- Dewey Decimal Classification Number
- 300
- Main Entry-Personal Name
- Paskov, Kelley Marie.
- Publication, Distribution, etc. (Imprint)
- [S.l.] : Stanford University, 2022
- Publication, Distribution, etc. (Imprint)
- Ann Arbor : ProQuest Dissertations & Theses, 2022
- Physical Description
- 1 online resource(87 p.)
- General Note
- Source: Dissertations Abstracts International, Volume: 84-12, Section: A.
- General Note
- Advisor: Hastie, Trevor;Sabatti, Chiara;Wall, Dennis.
- Dissertation Note
- Thesis (Ph.D.)--Stanford University, 2022.
- Restrictions on Access Note
- This item must not be sold to any third party vendors.
- Summary, Etc.
- Despite widespread sequencing efforts, the genetic etiologies of many complex diseases remain poorly understood. One explanation is that the recent explosive population growth of humans has dramatically increased the impact of rare variation on complex traits. These variants are unlikely to be in strong linkage disequilibrium with their neighbors, and thus are invisible to association-based approaches. In this work, we show that family-based linkage methods, when adapted to handle the large sample sizes and dense marker sets that are now available, provide an opportunity to find and understand these otherwise hidden variants. Linkage methods do not rely on linkage disequilibrium in a population, and instead exploit genetic inheritance in families to identify risk regions, even when causal variants are unobserved and are not in linkage disequilibrium with nearby markers. We have developed a series of methods that use large cohorts of family-based sequencing datasets to better understand sequencing error rates, meiotic crossovers, and disease risk. First, we show that familial relationships can be leveraged to estimate sample-level estimates of sequencing error rates. These error rates can be used to evaluate variant calling pipelines and to compare their error rates in different genomic contexts. Next, we develop a hidden Markov model that identifies meiotic crossovers, shared genetic material between siblings, and inherited deletions in families. Our algorithm is specifically designed to handle the complexity of whole-genome sequencing data and is able to uncover meiotic crossovers with 10x better resolution than existing microarray-based methods. Finally, because our algorithm produces nearly complete ( 99%) genome-wide identity-by-descent (IBD) status between siblings, we develop a genome-wide sibling-pair linkage test which leverages sibling IBD to identify genomic regions harboring risk variants.
This method not only increases detection power for rare risk variants, but also enables the use of microarrays which are widely and affordably available in the consumer market. Applying our method to crowdsourced autism families who have taken Ancestry.com DNA tests, we identify two significant autism risk regions which we validate with a separate and independent microarray dataset. While family-based approaches to marker detection have taken a back seat to case-control cohort-based approaches in the last 15 years of human genetics, here we show how returning our attention to families provides the power to uncover key events in the genome that cannot be detected otherwise. This thesis provides a framework for extending family-based linkage analysis into the era of next-generation sequencing in order to increase our understanding of genetic risk factors for complex diseases.
- Subject Added Entry-Topical Term
- Parents & parenting.
- Subject Added Entry-Topical Term
- Maps.
- Subject Added Entry-Topical Term
- Chromosomes.
- Subject Added Entry-Topical Term
- Autistic children.
- Subject Added Entry-Topical Term
- Families & family life.
- Subject Added Entry-Topical Term
- Software.
- Subject Added Entry-Topical Term
- Genomes.
- Subject Added Entry-Topical Term
- Algorithms.
- Subject Added Entry-Topical Term
- Siblings.
- Subject Added Entry-Topical Term
- Computer science.
- Subject Added Entry-Topical Term
- Genetics.
- Subject Added Entry-Topical Term
- Individual & family studies.
- Added Entry-Corporate Name
- Stanford University.
- Host Item Entry
- Dissertations Abstracts International. 84-12A.
- Host Item Entry
- Dissertation Abstract International
- Electronic Location and Access
- This material is available after logging in.
- Control Number
- joongbu:640669
MARC
008240220s2022 ulk 00 kor
■001000016931973
■00520240214100355
■006m o d
■007cr#unu||||||||
■020 ▼a9798379658496
■035 ▼a(MiAaPQ)AAI30462692
■035 ▼a(MiAaPQ)STANFORDzd828ty8201
■040 ▼aMiAaPQ▼cMiAaPQ
■0820 ▼a300
■1001 ▼aPaskov, Kelley Marie.
■24510▼aUsing Family Sequencing Data to Understand Sequencing Errors, Meiotic Crossovers, and Disease Risk▼h[electronic resource]
■260 ▼a[S.l.]▼bStanford University. ▼c2022
■260 1▼aAnn Arbor▼bProQuest Dissertations & Theses▼c2022
■300 ▼a1 online resource(87 p.)
■500 ▼aSource: Dissertations Abstracts International, Volume: 84-12, Section: A.
■500 ▼aAdvisor: Hastie, Trevor;Sabatti, Chiara;Wall, Dennis.
■5021 ▼aThesis (Ph.D.)--Stanford University, 2022.
■506 ▼aThis item must not be sold to any third party vendors.
■520 ▼aDespite widespread sequencing efforts, the genetic etiologies of many complex diseases remain poorly understood. One explanation is that the recent explosive population growth of humans has dramatically increased the impact of rare variation on complex traits. These variants are unlikely to be in strong linkage disequilibrium with their neighbors, and thus are invisible to association-based approaches. In this work, we show that family-based linkage methods, when adapted to handle the large sample sizes and dense marker sets that are now available, provide an opportunity to find and understand these otherwise hidden variants. Linkage methods do not rely on linkage disequilibrium in a population, and instead exploit genetic inheritance in families to identify risk regions, even when causal variants are unobserved and are not in linkage disequilibrium with nearby markers. We have developed a series of methods that use large cohorts of family-based sequencing datasets to better understand sequencing error rates, meiotic crossovers, and disease risk. First, we show that familial relationships can be leveraged to estimate sample-level estimates of sequencing error rates. These error rates can be used to evaluate variant calling pipelines and to compare their error rates in different genomic contexts. Next, we develop a hidden Markov model that identifies meiotic crossovers, shared genetic material between siblings, and inherited deletions in families. Our algorithm is specifically designed to handle the complexity of whole-genome sequencing data and is able to uncover meiotic crossovers with 10x better resolution than existing microarray-based methods. Finally, because our algorithm produces nearly complete ( 99%) genome-wide identity-by-descent (IBD) status between siblings, we develop a genome-wide sibling-pair linkage test which leverages sibling IBD to identify genomic regions harboring risk variants. 
This method not only increases detection power for rare risk variants, but also enables the use of microarrays which are widely and affordably available in the consumer market. Applying our method to crowdsourced autism families who have taken Ancestry.com DNA tests, we identify two significant autism risk regions which we validate with a separate and independent microarray dataset. While family-based approaches to marker detection have taken a back seat to case-control cohort-based approaches in the last 15 years of human genetics, here we show how returning our attention to families provides the power to uncover key events in the genome that cannot be detected otherwise. This thesis provides a framework for extending family-based linkage analysis into the era of next-generation sequencing in order to increase our understanding of genetic risk factors for complex diseases.
■590 ▼aSchool code: 0212.
■650 4▼aParents & parenting.
■650 4▼aMaps.
■650 4▼aChromosomes.
■650 4▼aAutistic children.
■650 4▼aFamilies & family life.
■650 4▼aSoftware.
■650 4▼aGenomes.
■650 4▼aAlgorithms.
■650 4▼aSiblings.
■650 4▼aComputer science.
■650 4▼aGenetics.
■650 4▼aIndividual & family studies.
■690 ▼a0984
■690 ▼a0369
■690 ▼a0628
■71020▼aStanford University.
■7730 ▼tDissertations Abstracts International▼g84-12A.
■773 ▼tDissertation Abstract International
■790 ▼a0212
■791 ▼aPh.D.
■792 ▼a2022
■793 ▼aEnglish
■85640▼uhttp://www.riss.kr/pdu/ddodLink.do?id=T16931973▼nKERIS▼zThe full text of this material is provided by the Korea Education and Research Information Service.
■980 ▼a202402▼f2024