entrypoint and readme update to main #3

Merged: 7 commits, Apr 4, 2024
2 changes: 1 addition & 1 deletion Dockerfile
@@ -3,7 +3,7 @@ FROM rocker/r-ver:4.3.0

# DeGAUSS container metadata
ENV degauss_name="daymet"
ENV degauss_version="0.1.1"
ENV degauss_version="0.1.2"
ENV degauss_description="daymet climate variables"
ENV degauss_argument="short description of optional argument [default: 'insert_default_value_here']"

14 changes: 8 additions & 6 deletions README.md
@@ -16,10 +16,10 @@ Note: The Daymet calendar is based on a standard calendar year. All Daymet years
If `my_addresses.csv` is a file in the current working directory with ID column `id`, start and end date columns `start_date` and `end_date`, and coordinate columns named `lat` and `lon`, then the [DeGAUSS command](https://degauss.org/using_degauss.html#DeGAUSS_Commands):

```sh
docker run --rm -v $PWD:/tmp ghcr.io/degauss-org/daymet:0.1.1 my_addresses.csv
docker run --rm -v $PWD:/tmp ghcr.io/degauss-org/daymet:0.1.2 my_addresses.csv
```

will produce `my_addresses_daymet_0.1.1.csv` with added columns:
will produce `my_addresses_daymet_0.1.2.csv` with added columns:

- **`tmax`**: maximum temperature
- **`tmin`**: minimum temperature
@@ -29,7 +29,7 @@ will produce `my_addresses_daymet_0.1.1.csv` with added columns:
- **`prcp`**: precipitation
- **`dayl`**: day length

Other columns may be present in the input `my_addresses.csv` file, and these other columns will be linked in and included in the output `my_addresses_daymet_0.1.1.csv` file.
Other columns may be present in the input `my_addresses.csv` file, and these other columns will be linked in and included in the output `my_addresses_daymet_0.1.2.csv` file.

### Optional Arguments

@@ -43,14 +43,15 @@ Other columns may be present in the input `my_addresses.csv` file, and these oth
An example DeGAUSS command with all optional arguments used would be:

```sh
docker run --rm -v $PWD:/tmp ghcr.io/degauss-org/daymet:0.1.1 my_addresses.csv tmax,vp,prcp -88.263390 -87.525706 41.470117 42.154247 na
docker run --rm -v $PWD:/tmp ghcr.io/degauss-org/daymet:0.1.2 my_addresses.csv tmax,vp,prcp -88.263390 -87.525706 41.470117 42.154247 na
```

which will return maximum temperature, vapor pressure, and precipitation for observations within a bounding box of Cook County, IL. It is important to specify bounding box coordinates in the order of: `min_lon`, `max_lon`, `min_lat`, `max_lat`.
which will return maximum temperature, vapor pressure, and precipitation for observations within a boundary box of Cook County, IL. It is important to specify bounding box coordinates in the order of: `min_lon`, `max_lon`, `min_lat`, `max_lat`.

## Geomarker Methods

Daymet data on a specified date is linked to coordinate data within the `my_addresses.csv` file by matching on the Daymet 1 km x 1 km raster cell number.
If the boundary box coordinate data is not supplied in the optional arguments, they will be inferred from the .csv file with an added 0.1 degree latitude and longitude buffer to the outermost points to enhance data privacy.

## Geomarker Data

@@ -59,8 +60,9 @@ Daymet data on a specified date is linked to coordinate data within the `my_addr

## Warning

If the bounding box for Daymet data download is inferred from address coordinates, then the size of the Daymet data download may be quite large if the address coordinates are very spread out. If a wide spread of coordinates is desired, then it may be best to stratify your input dataset to coordinates within separate geographic regions.
If the boundary box for Daymet data download is inferred from address coordinates, then the size of the Daymet data download may be quite large if the address coordinates are very spread out. If a wide spread of coordinates is desired, then it may be best to stratify your input dataset to coordinates within separate geographic regions.

## DeGAUSS Details

The Daymet DeGAUSS package was created by Ben Barrett and Peter Graffy, with contributions from Erika Rasnick and Luke Rasmussen.
For detailed documentation on DeGAUSS, including general usage and installation, please see the [DeGAUSS homepage](https://degauss.org).
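
Reviewer sketch: the Geomarker Methods line added to the README says that, when no bounding box is passed, one is inferred from the input coordinates with a 0.1 degree buffer around the outermost points. A minimal base R illustration of that idea follows. The helper `infer_bbox()` and the sample coordinates are hypothetical and are not taken from `entrypoint.R`, whose actual bounding-box code is not shown in this diff.

```r
# Hypothetical helper (not from entrypoint.R): infer a bounding box from
# point coordinates, padding the outermost points by `buffer` degrees.
infer_bbox <- function(lat, lon, buffer = 0.1) {
  stopifnot(is.numeric(lat), is.numeric(lon), length(lat) == length(lon))
  # Returned in the order the README asks for:
  # min_lon, max_lon, min_lat, max_lat
  c(
    min_lon = min(lon, na.rm = TRUE) - buffer,
    max_lon = max(lon, na.rm = TRUE) + buffer,
    min_lat = min(lat, na.rm = TRUE) - buffer,
    max_lat = max(lat, na.rm = TRUE) + buffer
  )
}

# Made-up coordinates, for illustration only
pts <- data.frame(
  lat = c(41.6, 41.9, 42.0),
  lon = c(-88.1, -87.7, -87.6)
)

bbox <- infer_bbox(pts$lat, pts$lon)
print(bbox)
```

Because the box grows with the spread of the input points, this also illustrates the README's warning: widely scattered coordinates imply a large download area.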
23 changes: 11 additions & 12 deletions entrypoint.R
@@ -37,7 +37,6 @@ if (is.null(opt$vars)) {

day_var <- str_remove_all(opt$vars, " ")
day_var <- str_split(day_var, ",", simplify = TRUE)

if (! all(day_var %in% c("tmax", "tmin", "srad", "vp", "swe", "prcp", "dayl", "capricorn"))) {
opt$vars <- "tmax, tmin, srad, vp, swe, prcp, dayl"
cli::cli_alert_warning("Invalid argument for Daymet variable selection. Will return all Daymet variables. Please see {.url https://degauss.org/daymet/} for more information.")
@@ -74,13 +73,13 @@ if (! opt$region %in% c("na", "hi", "pr")) {
}

if (opt$vars %in% c("capricorn")) {
opt$vars <- "tmax, tmin"
opt$min_lon <- -88.263390
opt$max_lon <- -87.525706
opt$min_lat <- 41.470117
opt$max_lat <- 42.154247
opt$region <- "na"
cli::cli_alert_warning("Returning tmax and tmin for lat/lon coordinates of Cook County. Please see {.url https://degauss.org/daymet/} for more information.")
opt$vars <- "tmax, tmin"
opt$min_lon <- -88.263390
opt$max_lon <- -87.525706
opt$min_lat <- 41.470117
opt$max_lat <- 42.154247
opt$region <- "na"
cli::cli_alert_warning("Returning tmax and tmin for lat/lon coordinates of Cook County. Please see {.url https://degauss.org/daymet/} for more information.")
}

# Writing functions
@@ -183,9 +182,6 @@ import_data <- function(.csv_filename = opt$filename, .min_lon = opt$min_lon, .m
print(w)
stop(call. = FALSE)
})
# Inferring the start and end year of Daymet data to download from start_date and end_date
year_start <- year(min(input_data$start_date))
year_end <- year(max(input_data$end_date))
# Expanding the dates between start_date and end_date into a daily series
input_data <- expand_dates(input_data, by = "day") %>%
select(-start_date, -end_date)
@@ -200,14 +196,17 @@ import_data <- function(.csv_filename = opt$filename, .min_lon = opt$min_lon, .m
# Throwing an error if no observations are remaining
if (nrow(input_data) == 0) {
stop(call. = FALSE, 'Zero observations where the start_date is within or after the first year of available Daymet data.')
}
}
# Filtering out any rows in the input data where the date year is equal to the current date year
input_data <- input_data %>%
filter(!(year(date) == year(Sys.Date())))
# Throwing an error if no observations are remaining
if (nrow(input_data) == 0) {
stop(call. = FALSE, 'Zero observations where the end_date is within or before the last year of available Daymet data.')
}
# Inferring the start and end year of Daymet data to download from date
year_start <- year(min(input_data$date))
year_end <- year(max(input_data$date))
# Removing any columns in the input data where everything is NA
input_data <- input_data %>%
select_if(~ !all(is.na(.)))
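
Reviewer sketch: the hunk above moves the `year_start` / `year_end` inference so that it runs after rows outside the available Daymet years have been filtered out. The base R example below shows, under stated assumptions, why the ordering matters. It is a simplification: it uses a single `date` column, base R instead of the `year()` and `filter()` calls in `entrypoint.R`, made-up rows, and an assumed first Daymet year of 1980 (the diff does not show the value the script uses).

```r
# Made-up daily rows: one usable, one before Daymet coverage, and one in
# the current calendar year (which the script filters out).
current_year <- as.integer(format(Sys.Date(), "%Y"))
input_data <- data.frame(
  id = c("a", "b", "c"),
  date = as.Date(c("2015-06-01", "1975-01-15",
                   paste0(current_year, "-01-01")))
)

# Base R stand-in for the year() call used in entrypoint.R
row_year <- function(d) as.integer(format(d, "%Y"))

# Inferring the year range BEFORE filtering would request years that
# have no rows left to link against.
range_before <- range(row_year(input_data$date))

# Assumption: first year of Daymet data; not shown in this diff.
first_daymet_year <- 1980L
yrs <- row_year(input_data$date)
filtered <- input_data[yrs >= first_daymet_year & yrs != current_year, ,
                       drop = FALSE]

# Inferring the year range AFTER filtering, as in the merged change
year_start <- min(row_year(filtered$date))
year_end <- max(row_year(filtered$date))

print(range_before)
print(c(year_start = year_start, year_end = year_end))
```

With the range taken after filtering, only years that still have observations are requested.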