Investigating synthetic medical time-series resemblance

Karan Bhanot; Joseph Pedersen; Isabelle Guyon; Kristin P Bennett

doi:10.1016/j.neucom.2022.04.097

Journal article, Neurocomputing, Year: 2022

Karan Bhanot

Role: Author
PersonId : 1354183

Rensselaer Polytechnic Institute

Joseph Pedersen

Role: Author

Rensselaer Polytechnic Institute

Isabelle Guyon

Role: Author
PersonId : 963159

TAckling the Underspecified

Chalearn

Kristin P Bennett

Role: Author
PersonId : 924690

Rensselaer Polytechnic Institute

Abstract

Access to private medical data is restricted by privacy laws, which hinders research and real-world use. Synthetic data generation offers a viable solution by producing data with high utility and privacy protection without releasing the real data. Healthcare records are often longitudinal and are affected by covariates such as age, gender, and ethnicity. As a result, synthetic healthcare data generation falls within the domain of time-series modeling and requires time-series-based measures to assess the resemblance between real and synthetic data. Covariate plots can be used to assess time-series resemblance qualitatively, but they lack an empirical quantitative measure, so interpretations are biased toward the viewer's perspective. In this paper, we describe four time-series metrics to quantitatively evaluate the resemblance between real and synthetic time series on datasets, both public and private, from previously published healthcare research studies. We apply the metrics to covariate plots for synthetic datasets to investigate the resemblance and compare the results with baseline synthetic datasets. We infer that the metrics effectively capture the time-series resemblance between real and synthetic datasets. The results highlight varying degrees of resemblance across subgroups of covariates and multivariate time-series.

Keywords

Synthetic; Time-series; Covariate; Resemblance; Healthcare; Medical

Domains

Computer Science [cs]

Contributor: Bénédicte Daly

https://hal.science/hal-04465658

Submitted on: Monday, 19 February 2024, 14:10:44

Last modified on: Thursday, 22 February 2024, 03:21:11

Dates and versions

hal-04465658 , version 1 (19-02-2024)

Identifiers

HAL Id : hal-04465658 , version 1
DOI : 10.1016/j.neucom.2022.04.097

Cite

Karan Bhanot, Joseph Pedersen, Isabelle Guyon, Kristin P Bennett. Investigating synthetic medical time-series resemblance. Neurocomputing, 2022, 494, pp.368-378. ⟨10.1016/j.neucom.2022.04.097⟩. ⟨hal-04465658⟩

Collections

CNRS INRIA CENTRALESUPELEC INRIA2 UNIV-PARIS-SACLAY LISN GS-COMPUTER-SCIENCE LISN-AO
