Merge pull request #266 from rcpch/eatyourpeas/issue265
fake-patients
anchit-chandran authored Oct 28, 2024
2 parents 6d5cb72 + b2bf546 commit 65fd924
Showing 14 changed files with 1,838 additions and 38 deletions.
238 changes: 238 additions & 0 deletions documentation/docs/developer/fake_patient_generator.md
@@ -0,0 +1,238 @@
# `FakePatientCreator`

The `FakePatientCreator` is a utility class designed to generate realistic test data for patient objects and their corresponding visits. This guide will demonstrate how to use it within Django tests to create patients and associated visit data while ensuring specific attributes are valid, such as the date ranges and age categories.

## Overview

`FakePatientCreator` uses:

- A defined **audit period** for setting important date fields such as patient **date of birth**, **diagnosis date**, and **visit dates**.
- Random generation of visit types and allocation of patients to ensure realistic test scenarios.

The following steps provide an example test that ensures patients' ages fall within the expected range and visits are generated accordingly.

## Usage

### Initialise with Audit Period

We initialise the `FakePatientCreator` with a specific audit period because that period is used throughout the creation of fake patients and visits:

- **Date of birth** is set randomly so that the patient's age at `audit_start_date` is valid for the given `AgeRange`.
- **Diagnosis date** is set randomly between `date_of_birth` and `audit_start_date`.
- **Visit dates** are spread evenly across the quarters of the audit period. For each `Visit`, the date is set randomly within its quarter's date range.

```python
from datetime import date

from project.npda.general_functions.audit_period import get_audit_period_for_date
from project.npda.general_functions.data_generator_extended import (
    FakePatientCreator,
    HbA1cTargetRange,
    VisitType,
)

# Set necessary attributes to calibrate all dates
DATE_IN_AUDIT = date(2024, 4, 1)
audit_start_date, audit_end_date = get_audit_period_for_date(DATE_IN_AUDIT)

fake_patient_creator = FakePatientCreator(
    audit_start_date=audit_start_date,
    audit_end_date=audit_end_date,
)
```
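
To make the date calibration concrete, the sketch below mirrors the audit-year rule used by `get_audit_period_for_date`: the audit period runs from 1 April to 31 March. The name `sketch_audit_period` is hypothetical and used only for illustration, and the sketch omits any input validation the real function performs.

```python
from datetime import date


def sketch_audit_period(input_date: date) -> tuple[date, date]:
    """Illustrative stand-in for get_audit_period_for_date (no validation)."""
    # Dates from April onwards belong to the audit year starting that April;
    # January to March belong to the audit year that started the previous April.
    audit_year = input_date.year if input_date.month >= 4 else input_date.year - 1
    return date(audit_year, 4, 1), date(audit_year + 1, 3, 31)


start, end = sketch_audit_period(date(2024, 4, 1))
print(start, end)  # 2024-04-01 2025-03-31
```

Under this rule, a date of 31 March 2024 would fall in the audit period that started on 1 April 2023.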

The `FakePatientCreator` provides utility methods to generate and save test data for patients and their corresponding visits. This guide describes the usage of the following three key methods:

1. `build_fake_patients`: Builds fake patient objects.
2. `build_fake_visits`: Builds fake visit objects for patients.
3. `create_and_save_fake_patients`: Combines patient and visit creation and saves them to the database.

### `build_fake_patients`

The `build_fake_patients` method generates a list of `n` fake patient objects but does **not** save them to the database. This is useful when you want to create patients but manipulate them further before committing them to the database.

#### Method Signature

```python
def build_fake_patients(
    self,
    n: int,
    age_range: AgeRange,
    **patient_kwargs,
) -> list[Patient]:
```

**Parameters**:

- n (int): The number of patients to create.
- age_range (AgeRange): The age range to assign to the patients (e.g., AgeRange.AGE_0_4).
- \*\*patient_kwargs (optional): Additional keyword arguments to pass to the PatientFactory, which allows customising fields like postcode.

**Example Usage:**

```python
# Create 10 fake patients within the age range 0-4
patients = fake_patient_creator.build_fake_patients(
    n=10,
    age_range=AgeRange.AGE_0_4,
    postcode="fake_postcode",  # Customise as needed
)
```

**Returns:**

- A list of Patient objects that are not yet saved to the database.

### `build_fake_visits`

The `build_fake_visits` method generates fake Visit objects for each patient and distributes these visits across different quarters of the audit period.

#### Method Signature

```python
def build_fake_visits(
    self,
    patients: list[Patient],
    age_range: AgeRange,
    hb1ac_target_range: HbA1cTargetRange = HbA1cTargetRange.TARGET,
    visit_types: list[VisitType] = DEFAULT_VISIT_TYPE,
    **visit_kwargs,
) -> list[Visit]:
```

**Parameters:**

- patients (list[Patient]): The list of patients for whom the visits are being created.
- age_range (AgeRange): The age range of the patients to guide the visit characteristics.
- hb1ac_target_range (HbA1cTargetRange, optional): The HbA1c target range for the visits, defaults to TARGET.
- visit_types (list[VisitType], optional): A list of visit types to be assigned to each patient, e.g., VisitType.CLINIC, VisitType.ANNUAL_REVIEW.
- \*\*visit_kwargs (optional): Additional keyword arguments for customising the visit creation, such as is_valid=True.

**Method Behavior:**

- Visits are distributed evenly across the quarters of the audit period. For example, if 12 visits are assigned, 3 will occur in each quarter.
- For each quarter, visit dates are randomly assigned within the quarter's date range.
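
The behaviour described above can be sketched with the standard library alone. This is a minimal illustration, not the actual implementation: the function name `sketch_visit_dates`, the hard-coded 2024-25 quarters, and the round-robin allocation of visits to quarters are all assumptions made for the example.

```python
import random
from datetime import date, timedelta

# Quarters of the 2024-25 audit period (1 April 2024 to 31 March 2025).
QUARTERS = [
    (date(2024, 4, 1), date(2024, 6, 30)),
    (date(2024, 7, 1), date(2024, 9, 30)),
    (date(2024, 10, 1), date(2024, 12, 31)),
    (date(2025, 1, 1), date(2025, 3, 31)),
]


def sketch_visit_dates(n_visits, quarters, rng):
    """Assign visits to quarters round-robin, then pick a random date in each."""
    visit_dates = []
    for i in range(n_visits):
        quarter_start, quarter_end = quarters[i % len(quarters)]
        span_days = (quarter_end - quarter_start).days
        # randint is inclusive at both ends, so the date stays within the quarter.
        visit_dates.append(quarter_start + timedelta(days=rng.randint(0, span_days)))
    return visit_dates


visit_dates = sketch_visit_dates(12, QUARTERS, random.Random(0))
per_quarter = [
    sum(start <= d <= end for d in visit_dates) for start, end in QUARTERS
]
print(per_quarter)  # [3, 3, 3, 3]
```

Passing in a seeded `random.Random` keeps the generated dates reproducible between test runs, which may be useful when debugging a failing test.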

#### Example Usage

```python
# Generate 12 random visit types
VISIT_TYPES = generate_random_visit_types(n=12)

# Build visits for patients
visits = fake_patient_creator.build_fake_visits(
    patients=patients,
    age_range=AgeRange.AGE_0_4,
    hb1ac_target_range=HbA1cTargetRange.WELL_ABOVE,
    visit_types=VISIT_TYPES,
    is_valid=True,  # Customise additional fields if necessary
)
```

Returns a list of Visit objects corresponding to the patients.

### `create_and_save_fake_patients`

The `create_and_save_fake_patients` method handles both patient and visit creation in a single process and saves them to the database. It bulk creates the patients and their associated visits to improve performance.

#### Method Signature

```python
def create_and_save_fake_patients(
    self,
    n: int,
    age_range: AgeRange,
    hb1ac_target_range: HbA1cTargetRange = HbA1cTargetRange.TARGET,
    visit_types: list[VisitType] = DEFAULT_VISIT_TYPE,
    **patient_kwargs,
) -> list[Patient]:
```

**Parameters:**

- n (int): The number of patients to create and save.
- age_range (AgeRange): The age range of the patients.
- hb1ac_target_range (HbA1cTargetRange, optional): The HbA1c target range, defaulting to TARGET.
- visit_types (list[VisitType], optional): A list of visit types to be created for each patient.
- \*\*patient_kwargs (optional): Additional keyword arguments for patient creation, such as postcode="123".

#### Example Usage

```python
# Create and save 100 patients with associated visits
saved_patients = fake_patient_creator.create_and_save_fake_patients(
    n=100,
    age_range=AgeRange.AGE_25_34,
    visit_types=[VisitType.CLINIC, VisitType.ANNUAL_REVIEW],
    postcode="fake_postcode",  # Customise as needed
)
```

Returns a list of Patient objects that have been saved to the database, each with their associated visits.

### Full Example

Below is an example test that demonstrates usage.

```python
import pytest
from datetime import date
from app.models import AgeRange, VisitType, HbA1cTargetRange
from app.utils import FakePatientCreator, get_audit_period_for_date

# NOTE: generate_random_visit_types also needs importing; its module is not
# shown in this guide.


@pytest.mark.django_db
def test_example_use_fake_patient_creator():
    """Tests that the ages of all fake patients fall into the appropriate
    age range.
    NOTE:
    We initialise the FakePatientCreator with a specific audit period as
    this is used throughout the creation of fake patients and visits
    including:
    - setting random date of birth so max age by audit_start_date is valid
      for the given AgeRange. Also, diagnosis_date is between
      date_of_birth and audit_start_date.
    - setting random diagnosis date so it is before audit_start_date
    - Visit dates are spread evenly throughout each quarter of the audit
      period. For each Visit, the date is randomly set within its
      quarter's date range.
    """

    # Set necessary attributes to calibrate all dates
    DATE_IN_AUDIT = date(2024, 4, 1)
    audit_start_date, audit_end_date = get_audit_period_for_date(DATE_IN_AUDIT)
    fake_patient_creator = FakePatientCreator(
        audit_start_date=audit_start_date,
        audit_end_date=audit_end_date,
    )
    age_range = AgeRange.AGE_0_4

    # Build fake patient instances
    pts = fake_patient_creator.build_fake_patients(
        n=10,
        age_range=age_range,
        # Can additionally pass in extra PatientFactory kwargs here
        postcode="fake_postcode",
    )

    # Build fake Visit instances for each patient
    VISIT_TYPES = generate_random_visit_types(n=12)
    visits = fake_patient_creator.build_fake_visits(
        patients=pts,
        visit_types=VISIT_TYPES,
        hb1ac_target_range=HbA1cTargetRange.WELL_ABOVE,
        age_range=age_range,
        # Can additionally pass in extra VisitFactory kwargs here
        is_valid=True,
    )

    assert len(pts) == 10
    assert len(visits) == 120  # 10 patients * 12 visits
```
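
The example test asserts only the number of patients and visits, although its docstring talks about ages. One way an age check could look is sketched below, using plain dates in place of `Patient` objects. The helper `age_in_years` and the inclusive bounds 0 to 4 for `AgeRange.AGE_0_4` are assumptions for illustration; the real enum may define its bounds differently.

```python
from datetime import date


def age_in_years(date_of_birth: date, on_date: date) -> int:
    """Completed years between date_of_birth and on_date."""
    had_birthday = (on_date.month, on_date.day) >= (
        date_of_birth.month,
        date_of_birth.day,
    )
    return on_date.year - date_of_birth.year - (0 if had_birthday else 1)


# Hypothetical bounds for AgeRange.AGE_0_4 (assumed inclusive).
MIN_AGE, MAX_AGE = 0, 4
audit_start_date = date(2024, 4, 1)

# Stand-ins for the date_of_birth values of generated patients.
dates_of_birth = [date(2019, 4, 2), date(2021, 12, 25), date(2024, 3, 31)]
ages = [age_in_years(dob, audit_start_date) for dob in dates_of_birth]
print(ages)  # [4, 2, 0]
assert all(MIN_AGE <= age <= MAX_AGE for age in ages)
```

In a real test, the list of dates would instead be built from the `date_of_birth` of each patient returned by `build_fake_patients`.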
1 change: 1 addition & 0 deletions documentation/mkdocs.yml
@@ -147,6 +147,7 @@ nav:
- 'developer/organisations.md'
- 'developer/users.md'
- 'developer/testing.md'
- 'developer/fake_patient_generator.md'
- 'developer/submission.md'
- KPI Definitions:
- 'developer/kpis/kpi_definitions.md'
28 changes: 28 additions & 0 deletions project/constants/visit_categories.py
@@ -80,6 +80,34 @@ class VisitCategories(Enum):
    (VisitCategories.HOSPITAL_ADMISSION, HOSPITAL_ADMISSION_FIELDS),
)

CLINIC_VISIT_FIELDS = (  # These tend to be addressed at all clinic visits
    (VisitCategories.MEASUREMENT, MEASUREMENT_FIELDS),
    (VisitCategories.HBA1, HBA1_FIELDS),
    (VisitCategories.TREATMENT, TREATMENT_FIELDS),
    (VisitCategories.CGM, CGM_FIELDS),
    (VisitCategories.BP, BP_FIELDS),
)

ANNUAL_REVIEW_FIELDS = (  # These fields are only required once a year and tend to be done at the same time
    (VisitCategories.FOOT, FOOT_FIELDS),
    (VisitCategories.DECS, DECS_FIELDS),
    (VisitCategories.ACR, ACR_FIELDS),
    (VisitCategories.CHOLESTEROL, CHOLESTEROL_FIELDS),
    (VisitCategories.THYROID, THYROID_FIELDS),
    (VisitCategories.COELIAC, COELIAC_FIELDS),
    (VisitCategories.PSYCHOLOGY, PSYCHOLOGY_FIELDS),
    (VisitCategories.SMOKING, SMOKING_FIELDS),
    (VisitCategories.SICK_DAY, SICK_DAY_FIELDS),
    (VisitCategories.FLU, FLU_FIELDS),
)

EXTRA_VISIT_FIELDS = (  # These fields are not always part of annual review and are not always addressed in clinic visits
    (VisitCategories.DIETETIAN, DIETETIAN_FIELDS),
    (VisitCategories.PSYCHOLOGY, PSYCHOLOGY_FIELDS),
    (VisitCategories.HOSPITAL_ADMISSION, HOSPITAL_ADMISSION_FIELDS),
)


VISIT_CATEGORY_COLOURS = (
(VisitCategories.HBA1, "rcpch_dark_grey"),
(VisitCategories.MEASUREMENT, "rcpch_yellow"),
1 change: 1 addition & 0 deletions project/npda/general_functions/__init__.py
@@ -15,4 +15,5 @@
from .model_utils import *
from .audit_period import *
from .session import *
from .utils import *
from .view_preference import *
56 changes: 55 additions & 1 deletion project/npda/general_functions/audit_period.py
@@ -1,4 +1,5 @@
from datetime import date
from dateutil.relativedelta import relativedelta


def get_audit_period_for_date(input_date: date) -> tuple[date, date]:
@@ -23,7 +24,9 @@ def get_audit_period_for_date(input_date: date) -> tuple[date, date]:
        )

    # Audit year is the year of the input date if the month is April or later, otherwise it is the previous year
-    audit_year = input_date.year if input_date.month >= 4 else input_date.year - 1
+    audit_year = (
+        input_date.year if input_date.month >= 4 else input_date.year - 1
+    )

    # Start date is always 1st April
    audit_start_date = date(audit_year, 4, 1)
@@ -32,3 +35,54 @@ def get_audit_period_for_date(input_date: date) -> tuple[date, date]:
    audit_end_date = date(audit_year + 1, 3, 31)

    return audit_start_date, audit_end_date


def get_quarters_for_audit_period(
    audit_start_date: date, audit_end_date: date
) -> list[tuple[date, date]]:
    """Get the quarters for the audit period.
    :param audit_start_date: The start date of the audit period
    :param audit_end_date: The end date of the audit period
    :return: A list of tuples, each containing the start and end date of a quarter
    """

    # Ensure audit_start_date is earlier than audit_end_date
    if audit_start_date >= audit_end_date:
        raise ValueError("Audit start date must be before the audit end date.")

    # Initialize the list of quarters
    quarters = []

    # Calculate the start and end date of each quarter
    current_start = audit_start_date
    while current_start < audit_end_date:
        # Calculate the quarter end date by adding 3 months
        current_end = (
            current_start + relativedelta(months=3) - relativedelta(days=1)
        )

        # If the quarter end date exceeds the audit end date, use the audit end date
        if current_end > audit_end_date:
            current_end = audit_end_date

        quarters.append((current_start, current_end))

        # Move to the next quarter
        current_start = current_end + relativedelta(days=1)

    return quarters


def get_quarter_for_visit(
    visit_date: date,
) -> int:
    """Returns quarter for the visit date"""
    audit_start_date, audit_end_date = get_audit_period_for_date(visit_date)
    quarters = get_quarters_for_audit_period(audit_start_date, audit_end_date)

    for i, (quarter_start, quarter_end) in enumerate(quarters, start=1):
        if quarter_start <= visit_date <= quarter_end:
            return i

    raise ValueError("Visit date is not within the audit period.")