Explore ways to support patient-level longitudinal multivariate analysis #290

karafecho · 2023-10-13T22:36:00Z

We have implemented an approach to support cohort- and study period-level longitudinal multivariate analysis by allowing users to select year (i.e., study period) as an input feature in a multivariate request. This issue suggests that we also explore ways to support patient-level longitudinal multivariate analysis. The approach that I originally conceived was to allow users to select PatientID (i.e., the dummy variable that links patients across years / study periods) as an input feature, which would let users retrieve a subset of the underlying deidentified integrated feature table. However, that approach is not computationally feasible, given the large patient sample sizes (e.g., roughly 160,000 total patients in the asthma cohort).

One approach might be to cap the cohort size for which users are allowed to include PatientID as an input feature in a multivariate request. To implement this, we could (1) return an error when users attempt to include PatientID as an input feature in a multivariate request for a cohort that exceeds a size limit (TBD) and (2) update the documentation to reflect the limitation. While this approach seems relatively straightforward, it also seems rather arbitrary and statistically unsound.
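A minimal sketch of the cap check in step (1), for discussion only. The names `MAX_COHORT_SIZE_FOR_PATIENT_ID`, `CohortTooLargeError`, and `check_patient_id_request` are hypothetical, and the value 1000 is a placeholder, since the actual limit is still TBD:

```python
# Placeholder value only; the real cap is still to be determined.
MAX_COHORT_SIZE_FOR_PATIENT_ID = 1000


class CohortTooLargeError(ValueError):
    """Raised when PatientID is requested for a cohort above the cap."""


def check_patient_id_request(feature_names, cohort_size,
                             cap=MAX_COHORT_SIZE_FOR_PATIENT_ID):
    """Reject a multivariate request that includes PatientID for a large cohort.

    feature_names: names of the input features in the request.
    cohort_size: number of patients in the selected cohort.
    """
    if "PatientID" in feature_names and cohort_size > cap:
        raise CohortTooLargeError(
            f"PatientID cannot be used as an input feature for cohorts "
            f"larger than {cap} patients (cohort size: {cohort_size})."
        )
```

Requests that omit PatientID pass through unchanged regardless of cohort size, so existing multivariate requests would not be affected by the check.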

Another approach might be to create a new multivariate endpoint, one that accepts the following user input: (1) a primary outcome / dependent variable, (2) a set of predictors / independent variables, (3) one or more optional factors to control for repeated observations (e.g., PatientID, year), and (4) a desired multivariate model (e.g., GLM, conditional random forest). The model would then be applied to the data on the backend, and the endpoint would return the model output. This approach may work, although (1) we would have to develop general-purpose models and (2) the run time may be slow, but that's a lesser concern, IMO.
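To make the four inputs concrete, here is a rough sketch of what the request body for such an endpoint might look like, with basic validation. Everything here is an assumption for discussion: the class name, the field names, the set of supported model labels, and the feature names in the usage example are all hypothetical, and no model fitting is shown.

```python
from dataclasses import dataclass, field
from typing import List

# Illustrative labels only; the actual set of supported models is undecided.
SUPPORTED_MODELS = {"GLM", "conditional_random_forest"}


@dataclass
class LongitudinalModelRequest:
    outcome: str                      # (1) primary outcome / dependent variable
    predictors: List[str]             # (2) predictors / independent variables
    model: str                        # (4) desired multivariate model
    # (3) optional factors that control for repeated observations
    repeated_measures_factors: List[str] = field(default_factory=list)

    def validate(self) -> List[str]:
        """Return a list of human-readable problems; empty means valid."""
        errors = []
        if not self.predictors:
            errors.append("At least one predictor is required.")
        if self.outcome in self.predictors:
            errors.append("The outcome cannot also be a predictor.")
        if self.model not in SUPPORTED_MODELS:
            errors.append(f"Unsupported model: {self.model}")
        return errors


# Usage with placeholder feature names.
request = LongitudinalModelRequest(
    outcome="outcome_feature",
    predictors=["predictor_a", "predictor_b"],
    model="GLM",
    repeated_measures_factors=["PatientID", "year"],
)
problems = request.validate()
```

Returning a list of problems rather than raising on the first one would let the endpoint report every issue with a request in a single error response.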

karafecho · 2023-10-18T16:31:52Z

Per discussion with Hong on 2023-10-18: maybe include PatientID as an input parameter, similar to year? For example, PatientID=1 for a single patient or PatientID=1-10 for a range.
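A quick sketch of how the two forms suggested above might be parsed. The function name `parse_patient_id_selector` is hypothetical, and the sketch assumes that PatientID values are positive integers and that a range is inclusive at both ends:

```python
def parse_patient_id_selector(value: str) -> range:
    """Parse "1" or "1-10" into an inclusive range of PatientID values.

    Raises ValueError for malformed input or an empty/negative range.
    """
    text = value.strip()
    if "-" in text:
        start_text, end_text = text.split("-", 1)
        start, end = int(start_text), int(end_text)
    else:
        start = end = int(text)
    if start < 1 or end < start:
        raise ValueError(f"Invalid PatientID selector: {value!r}")
    # range() excludes its upper bound, so add 1 to make the end inclusive.
    return range(start, end + 1)


single = list(parse_patient_id_selector("1"))
first_ten = list(parse_patient_id_selector("1-10"))
```

A bounded range like this would also limit how many patient rows a single request can touch, which speaks to the computational concern raised in the original comment.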

karafecho · 2023-10-19T15:26:13Z

Related to #286

karafecho assigned hyi Oct 13, 2023

karafecho mentioned this issue Oct 19, 2023

Modify multivariate endpoint to recognize PatientID #286

Open
