Privacy-Enhanced Learning and Inference With Distributed Clinical Datasets.
Details
- Material type
- Dissertation
- Control Number
- 0017164346
- International Standard Book Number
- 9798384041795
- Dewey Decimal Classification Number
- 614
- Main Entry-Personal Name
- Hu, Mengtong.
- Publication, Distribution, etc. (Imprint)
- [S.l.] : University of Michigan, 2024
- Publication, Distribution, etc. (Imprint)
- Ann Arbor : ProQuest Dissertations & Theses, 2024
- Physical Description
- 124 p.
- General Note
- Source: Dissertations Abstracts International, Volume: 86-03, Section: B.
- General Note
- Advisor: Song, Peter X. K.; Shi, Xu.
- Dissertation Note
- Thesis (Ph.D.)--University of Michigan, 2024.
- Summary, Etc.
- The integration of data collected from multiple clinical centers can enhance the statistical power of analysis and the generalizability of findings. It is known that merging subject-level data from individual centers for centralized analyses is often logistically non-trivial and may be restricted by data privacy concerns and lawful protection. In practice, this data management task can be rather time-consuming and thus possibly delays scientific discovery. Such a challenge is amplified when data at some centers are of low quantity, leading to unreliable meta-analyses, because associated local estimates may not be properly generated by such data sets. To overcome this issue, we propose several new solutions in that we can perform efficient statistical analyses of multi-center data while protecting patient-level information privacy. Chapter II develops a collaborative average treatment effect inference framework for a multicenter clinical trial to study basal insulin's effect on reducing post-transplantation diabetes mellitus. Our proposed method relies on sequential processing of summary data rather than merging patient-level data. The proposed sequential analytic method delivers an efficient inverse propensity weighting (IPW) estimation of the marginal differential treatment effects between two treatment arms. The statistical efficiency is achieved as the proposed estimation enjoys the convergence rate at the order of the cumulative sample size of all centers involved in the trial. We show theoretically and numerically that this new distributed inference approach has little loss of statistical power compared to the centralized method based on the entire data. Chapter III extends the distributed inference framework to estimate hazard ratios in the Cox proportional hazards model with no need for centralized data access and risk-set construction through maximum likelihood estimation, instead of partial likelihood estimation.
The proposed method nonparametrically estimates the baseline hazard function and avoids aggregating individual-level data on the formation of risk sets. Of note, risk-set construction has an ample risk of leaking individual patient information which is unacceptable. The proposed approach of distributed likelihood estimation only shares summary statistics with no reliance on risk sets. We establish large-sample properties of the proposed method and illustrate its performance through simulation experiments and a real-world data example of kidney transplantation in the Organ Procurement and Transplantation Network to understand risk factors associated with 5-year death-censored graft failure for patients who underwent kidney transplants in the USA. Chapter IV concerns a collaborative framework for the Accelerated Failure Time (AFT) model, a popular alternative to the Cox model for the analysis of time-to-failure data. The AFT model directly accounts for the effects of the covariates on times to failure, rather than on hazard functions, thus the assumption of proportional hazards is not required. Consequently, it provides more flexibility in data aggregation than the Cox model. Our proposed distributed inference method focuses on a class of parametric AFT models with Weibull, log-normal, and log-logistic distributions for time-to-event outcomes, in which a distributed likelihood ratio test is established under the generalized gamma distribution to assess the goodness-of-fit across different candidate parametric models. We present large-sample properties for the proposed method and illustrate their performance through simulation experiments and a real-world data example on kidney transplantation.
- Subject Added Entry-Topical Term
- Public health.
- Subject Added Entry-Topical Term
- Statistics.
- Subject Added Entry-Topical Term
- Biostatistics.
- Subject Added Entry-Topical Term
- Bioinformatics.
- Index Term-Uncontrolled
- Distributed inference
- Index Term-Uncontrolled
- Federated learning
- Index Term-Uncontrolled
- Data privacy
- Index Term-Uncontrolled
- Collaborative inference
- Index Term-Uncontrolled
- Survival analysis
- Index Term-Uncontrolled
- Causal inference
- Added Entry-Corporate Name
- University of Michigan Biostatistics
- Host Item Entry
- Dissertations Abstracts International. 86-03B.
- Electronic Location and Access
- This material is available after logging in.
- Control Number
- joongbu:657248
MARC
008250224s2024 us ||||||||||||||c||eng d
■001000017164346
■00520250211152951
■006m o d
■007cr#unu||||||||
■020 ▼a9798384041795
■035 ▼a(MiAaPQ)AAI31631038
■035 ▼a(MiAaPQ)umichrackham005755
■040 ▼aMiAaPQ▼cMiAaPQ
■0820 ▼a614
■1001 ▼aHu, Mengtong.
■24510▼aPrivacy-Enhanced Learning and Inference With Distributed Clinical Datasets.
■260 ▼a[S.l.]▼bUniversity of Michigan. ▼c2024
■260 1▼aAnn Arbor▼bProQuest Dissertations & Theses▼c2024
■300 ▼a124 p.
■500 ▼aSource: Dissertations Abstracts International, Volume: 86-03, Section: B.
■500 ▼aAdvisor: Song, Peter X. K.;Shi, Xu.
■5021 ▼aThesis (Ph.D.)--University of Michigan, 2024.
■520 ▼aThe integration of data collected from multiple clinical centers can enhance the statistical power of analysis and the generalizability of findings. It is known that merging subject-level data from individual centers for centralized analyses is often logistically non-trivial and may be restricted by data privacy concerns and lawful protection. In practice, this data management task can be rather time-consuming and thus possibly delays scientific discovery. Such a challenge is amplified when data at some centers are of low quantity, leading to unreliable meta-analyses, because associated local estimates may not be properly generated by such data sets. To overcome this issue, we propose several new solutions in that we can perform efficient statistical analyses of multi-center data while protecting patient-level information privacy. Chapter II develops a collaborative average treatment effect inference framework for a multicenter clinical trial to study basal insulin's effect on reducing post-transplantation diabetes mellitus. Our proposed method relies on sequential processing of summary data rather than merging patient-level data. The proposed sequential analytic method delivers an efficient inverse propensity weighting (IPW) estimation of the marginal differential treatment effects between two treatment arms. The statistical efficiency is achieved as the proposed estimation enjoys the convergence rate at the order of the cumulative sample size of all centers involved in the trial. We show theoretically and numerically that this new distributed inference approach has little loss of statistical power compared to the centralized method based on the entire data. Chapter III extends the distributed inference framework to estimate hazard ratios in the Cox proportional hazards model with no need for centralized data access and risk-set construction through maximum likelihood estimation, instead of partial likelihood estimation. 
The proposed method nonparametrically estimates the baseline hazard function and avoids aggregating individual-level data on the formation of risk sets. Of note, risk-set construction has an ample risk of leaking individual patient information which is unacceptable. The proposed approach of distributed likelihood estimation only shares summary statistics with no reliance on risk sets. We establish large-sample properties of the proposed method and illustrate its performance through simulation experiments and a real-world data example of kidney transplantation in the Organ Procurement and Transplantation Network to understand risk factors associated with 5-year death-censored graft failure for patients who underwent kidney transplants in the USA. Chapter IV concerns a collaborative framework for the Accelerated Failure Time (AFT) model, a popular alternative to the Cox model for the analysis of time-to-failure data. The AFT model directly accounts for the effects of the covariates on times to failure, rather than on hazard functions, thus the assumption of proportional hazards is not required. Consequently, it provides more flexibility in data aggregation than the Cox model. Our proposed distributed inference method focuses on a class of parametric AFT models with Weibull, log-normal, and log-logistic distributions for time-to-event outcomes, in which a distributed likelihood ratio test is established under the generalized gamma distribution to assess the goodness-of-fit across different candidate parametric models. We present large-sample properties for the proposed method and illustrate their performance through simulation experiments and a real-world data example on kidney transplantation.
■590 ▼aSchool code: 0127.
■650 4▼aPublic health.
■650 4▼aStatistics.
■650 4▼aBiostatistics.
■650 4▼aBioinformatics.
■653 ▼aDistributed inference
■653 ▼aFederated learning
■653 ▼aData privacy
■653 ▼aCollaborative inference
■653 ▼aSurvival analysis
■653 ▼aCausal inference
■690 ▼a0308
■690 ▼a0463
■690 ▼a0573
■690 ▼a0769
■690 ▼a0715
■71020▼aUniversity of Michigan▼bBiostatistics.
■7730 ▼tDissertations Abstracts International▼g86-03B.
■790 ▼a0127
■791 ▼aPh.D.
■792 ▼a2024
■793 ▼aEnglish
■85640▼uhttp://www.riss.kr/pdu/ddodLink.do?id=T17164346▼nKERIS▼zThe full text of this item is provided by the Korea Education and Research Information Service (KERIS).