Joongbu University Library

Detailed Information

Reconsider Machine Learning Method for Variable Selection and Validation With High Dimensional Data.

Material Type: Dissertation

Control Number: 0017162676

International Standard Book Number: 9798384093374

Dewey Decimal Classification Number: 574

Main Entry-Personal Name: Liu, Lu.

Publication, Distribution, etc. (Imprint): [S.l.] : Duke University, 2024

Publication, Distribution, etc. (Imprint): Ann Arbor : ProQuest Dissertations & Theses, 2024

Physical Description: 89 p.

General Note: Source: Dissertations Abstracts International, Volume: 86-03, Section: A.

General Note: Advisor: Jung, Sin-Ho.

Dissertation Note: Thesis (Ph.D.)--Duke University, 2024.

Summary, Etc.: The big data trend influences how people think and inspires potential research directions. Recent feats of machine learning have seized collective attention because of their strong performance in big data analysis, including text analysis and image processing. Machine learning is also a popular topic in clinical medicine for analyzing electronic health records and medical image data, for which traditional statistical models are not adequate. However, we realize that machine learning is not a panacea, and its defects, such as loss of interpretability and over-selection, may restrict its application. We must also recognize that for many clinical prediction analyses, a simpler approach, the generalized linear model, is sufficient. In this dissertation, we propose to use standard regression methods, without any penalization, combined with a stepwise variable selection procedure to overcome the over-selection issue of popular machine learning methods. For model validation, we propose a permutation approach to estimate the performance of various validation methods. Finally, we propose a repeated sieving approach, which extends the standard regression methods with stepwise variable selection, to handle high-dimensional modeling.

Subject Added Entry-Topical Term: Biostatistics.

Subject Added Entry-Topical Term: Statistics.

Subject Added Entry-Topical Term: Bioinformatics.

Subject Added Entry-Topical Term: Information science.

Index Term-Uncontrolled: Logistic regression

Index Term-Uncontrolled: Machine learning

Index Term-Uncontrolled: Permutation approach

Index Term-Uncontrolled: Variable selection

Index Term-Uncontrolled: Validation methods

Added Entry-Corporate Name: Duke University Biostatistics and Bioinformatics Doctor of Philosophy

Host Item Entry: Dissertations Abstracts International. 86-03A.

Electronic Location and Access: This material is available after logging in.

Control Number: joongbu:658171

008250224s2024        us  ||||||||||||||c||eng  d
■001000017162676
■00520250211152040
■006m          o    d
■007cr#unu||||||||
■020    ▼a9798384093374
■035    ▼a(MiAaPQ)AAI31336592
■040    ▼aMiAaPQ▼cMiAaPQ
■0820  ▼a574
■1001  ▼aLiu, Lu.
■24510▼aReconsider Machine Learning Method for Variable Selection and Validation With High Dimensional Data.
■260    ▼a[S.l.]▼bDuke University. ▼c2024
■260  1▼aAnn Arbor▼bProQuest Dissertations & Theses▼c2024
■300    ▼a89 p.
■500    ▼aSource: Dissertations Abstracts International, Volume: 86-03, Section: A.
■500    ▼aAdvisor: Jung, Sin-Ho.
■5021  ▼aThesis (Ph.D.)--Duke University, 2024.
■520    ▼aThe big data tendency influences how people think and inspires potential research directions. Recent feats of machine learning have seized collective attention because of its profound performance in conducting big data analysis including text analysis and image processing. Machine learning is also a popular topic in clinical medicine to implement analysis on electronic health records and medical image data, which traditional statistics model is not adequate for. However, we realize that machine learning is not panacea and its defects such as loss of interpretability and excess selection may restrict its application. And we must also recognize that for many clinical prediction analyses, the simpler approach-generalized linear model is enough for what we need. In this dissertation, we propose to use standard regression methods, without any penalizing approach, combined with a stepwise variable selection procedure to overcome the over-selection issue of popular machine learning methods. For model validation, we propose a permutation approach to estimate the performance of various validation methods. Finally, we propose a repeated sieving approach, extending the standard regression methods with stepwise variable selection, to handle high dimensional modeling.
■590    ▼aSchool code: 0066.
■650  4▼aBiostatistics.
■650  4▼aStatistics.
■650  4▼aBioinformatics.
■650  4▼aInformation science.
■653    ▼aLogistic regression
■653    ▼aMachine learning
■653    ▼aPermutation approach
■653    ▼aVariable selection
■653    ▼aValidation methods
■690    ▼a0308
■690    ▼a0723
■690    ▼a0715
■690    ▼a0463
■71020▼aDuke University▼bBiostatistics and Bioinformatics Doctor of Philosophy.
■7730  ▼tDissertations Abstracts International▼g86-03A.
■790    ▼a0066
■791    ▼aPh.D.
■792    ▼a2024
■793    ▼aEnglish
■85640▼uhttp://www.riss.kr/pdu/ddodLink.do?id=T17162676▼nKERIS▼zThe full text of this material is provided by the Korea Education and Research Information Service.

Holdings
Registration No.	Call No.	Collection	Status	Loan Info
TQ0034489	T	Full-text material	Viewable/Printable	Viewable/Printable
