Extend get_vpts() to provide access to the RMI CROW dataset #13

Open
27 tasks
peterdesmet opened this issue Nov 20, 2024 · 4 comments

@peterdesmet
Member

Source

I suggest the value rmi for the source parameter. It is the only VPTS dataset by RMI. I think the alternative value crow would be confusing as a name, since it is also the name of the visualization.

Scope

Metadata and context can be found here. The dataset covers 10 radars and has data since 2019. More data are added daily.

  • Do we have a mechanism for users to discover the scope (temporal + radars) of a dataset?

Data files

Data files are deposited at https://opendata.meteo.be/ftp/observations/radar/vbird/ and organized in radar and year directories. The file names follow the format <radar>_vpts_<yyyymmdd>.txt (e.g. behel_vpts_20191015.txt).
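The naming scheme above can be sketched as a small URL builder. This is a minimal Python illustration, not part of the package: the helper name `rmi_vpts_url` is hypothetical, and the nesting order (radar directory first, then year) is an assumption based on "radar and year directories" that should be checked against the server.

```python
from datetime import date

# Base URL from the issue text. The <radar>/<year>/ nesting order is an assumption.
BASE_URL = "https://opendata.meteo.be/ftp/observations/radar/vbird"


def rmi_vpts_url(radar: str, day: date) -> str:
    """Build the URL of one daily VPTS file (hypothetical helper)."""
    filename = f"{radar}_vpts_{day:%Y%m%d}.txt"
    return f"{BASE_URL}/{radar}/{day:%Y}/{filename}"


print(rmi_vpts_url("behel", date(2019, 10, 15)))
```

Keeping the path logic in one function means only one place changes if the directory layout turns out to differ.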

Data format

The data format is the default stdout of vol2bird, which is fixed width (example). If you write a parser for that format, I would call it vol2bird_vpts, not rmi_vpts. The CROW visualization has a minimal parser.

Unfortunately, the format does not contain all columns of VPTS CSV. Below is a suggestion for how it could be completed.

  • radar: value provided by user
  • datetime: columns <date>T<time>00Z
  • height: <HGHT>
  • u: <u>
  • v: <v>
  • w: <w>
  • ff: <ff>
  • dd: <dd>
  • sd_vvp: <sd_vvp>
  • gap: <gap>
  • eta: <eta>
  • dens: <dens>
  • dbz: <dbz>
  • dbz_all: <DBZH>
  • n: <n>
  • n_dbz: <n_dbz>
  • n_all: <n_all>
  • n_dbz_all: <n_dbz_all>
  • rcs: very likely 11, but it's not recorded and not a required term
  • sd_vvp_threshold: very likely 2, but it's not recorded and not a required term
  • vcp: leave empty
  • radar_latitude: required term, not recorded. I suggest to retrieve that from the radar overview, see suggested function at Possibly integrate radar overview #6
  • radar_longitude: same issue as radar_latitude
  • radar_height: same issue as radar_latitude
  • radar_wavelength: same issue as radar_latitude
  • source_file: not a required term. Recorded in the file's comment lines, e.g. # polar volume input: /tmp/20191015143000.rad.behel.pvol.dbzh.scanz.pvol.h5
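The mapping above could look roughly like this for a single data row. This is a minimal Python sketch under several assumptions: the column order in `VOL2BIRD_COLUMNS` is my understanding of the vol2bird stdout and should be verified against a real file, the sample row is made up, values are split on whitespace rather than on fixed widths, and the datetime is written as ISO 8601. The helper name is hypothetical. Terms that are not recorded are left empty, and the radar metadata terms are omitted because they would come from the radar overview.

```python
from datetime import datetime

# Assumed column order of the vol2bird stdout; verify against a real file.
VOL2BIRD_COLUMNS = [
    "date", "time", "HGHT", "u", "v", "w", "ff", "dd", "sd_vvp", "gap",
    "dbz", "eta", "dens", "DBZH", "n", "n_dbz", "n_all", "n_dbz_all",
]

# vol2bird column name -> VPTS CSV term, for columns that need renaming.
RENAME = {"HGHT": "height", "DBZH": "dbz_all"}

# Made-up row for illustration only, not real data.
SAMPLE_ROW = (
    "20191015 1430  200  -3.21   4.50    nan   5.53 324.5   2.10 F"
    "  -5.20   30.1   2.74   3.50  120   300   150   400"
)


def vol2bird_row_to_vpts(line: str, radar: str) -> dict:
    """Convert one vol2bird data row to a dict of VPTS CSV terms (strings)."""
    values = line.split()
    if len(values) != len(VOL2BIRD_COLUMNS):
        raise ValueError(f"expected {len(VOL2BIRD_COLUMNS)} values, got {len(values)}")
    raw = dict(zip(VOL2BIRD_COLUMNS, values))

    # <date>T<time>00Z: date is yyyymmdd, time is HHMM, seconds are padded.
    timestamp = datetime.strptime(raw.pop("date") + raw.pop("time"), "%Y%m%d%H%M")
    record = {"radar": radar, "datetime": timestamp.strftime("%Y-%m-%dT%H:%M:%SZ")}

    for name, value in raw.items():
        record[RENAME.get(name, name)] = value

    # Not recorded in the file: left empty rather than guessed.
    record.update({"rcs": "", "sd_vvp_threshold": "", "vcp": ""})
    return record


record = vol2bird_row_to_vpts(SAMPLE_ROW, radar="behel")
print(record["datetime"], record["height"], record["dbz_all"])
```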
@PietrH
Contributor

PietrH commented Nov 21, 2024

Looks fun to work on, I'm looking forward to it. I already explored data.table for reading fixed-width files (fwf) earlier this week: https://gist.github.com/PietrH/f13fb98f95b37242e59c92407fde1917

There is also readr::read_fwf(), which offers more control but requires more setup.

I'm slightly tempted to explore arrow::open_dataset() as the radar/year partitioning might actually come in handy.

There is both an FTP and an HTTP endpoint. For now I'll probably prefer the HTTP endpoint.

Questions

  1. Is the metadata header always the same length?
  2. Are the columns always the same width and order?

@bart1
Collaborator

bart1 commented Nov 21, 2024

arrow actually looks quite cool, I need to look into that!

@peterdesmet
Member Author

  1. Is the metadata header always the same length?

Do you mean: the same number of rows? Not sure, but they should always start with #, which can be ignored with readr::read_fwf(comment = "#")

  2. Are the columns always the same width and order?

Yes
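The same idea, sketched in Python for illustration: separate lines that start with # from data rows, so the number of metadata rows does not matter. The function names are hypothetical and the sample text is made up, apart from the "polar volume input" comment quoted earlier in this issue. The comment lines are kept rather than discarded, because source_file can be recovered from them.

```python
from typing import List, Optional, Tuple

# Made-up file content; only the "polar volume input" line comes from the issue.
SAMPLE_TEXT = """\
# polar volume input: /tmp/20191015143000.rad.behel.pvol.dbzh.scanz.pvol.h5
# some other metadata line
20191015 1430  200  -3.21   4.50    nan   5.53 324.5   2.10 F  -5.20   30.1   2.74   3.50  120   300   150   400
"""


def split_comments(text: str) -> Tuple[List[str], List[str]]:
    """Return (comment lines, data lines), skipping blank lines."""
    comments, rows = [], []
    for line in text.splitlines():
        if not line.strip():
            continue
        (comments if line.startswith("#") else rows).append(line)
    return comments, rows


def source_file(comments: List[str]) -> Optional[str]:
    """Extract the polar volume path from the comment lines, if present."""
    prefix = "# polar volume input:"
    for line in comments:
        if line.startswith(prefix):
            return line[len(prefix):].strip()
    return None


comments, rows = split_comments(SAMPLE_TEXT)
print(len(comments), len(rows), source_file(comments))
```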

@PietrH
Contributor

PietrH commented Nov 21, 2024

  1. Is the metadata header always the same length?

Do you mean: the same number of rows? Not sure, but they should always start with #, which can be ignored with readr::read_fwf(comment = "#")

Sadly the header is also commented out, but if we are very certain the columns never change, this shouldn't be an issue.
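One way to guard against silent column changes would be to read the commented-out header anyway and compare it with the expected names. This is a minimal Python sketch with two assumptions that need checking against real files: the column header is the last # line before the first data row, and the names listed in `EXPECTED_COLUMNS` match the vol2bird stdout. The sample lines are made up and the function name is hypothetical.

```python
from typing import Iterable, List

# Assumed column names of the vol2bird stdout; verify against a real file.
EXPECTED_COLUMNS = [
    "date", "time", "HGHT", "u", "v", "w", "ff", "dd", "sd_vvp", "gap",
    "dbz", "eta", "dens", "DBZH", "n", "n_dbz", "n_all", "n_dbz_all",
]

# Made-up file content for illustration only.
SAMPLE_LINES = [
    "# some metadata line",
    "# date   time HGHT    u      v       w     ff    dd  sd_vvp gap dbz"
    "     eta   dens   DBZH   n   n_dbz n_all n_dbz_all",
    "20191015 1430  200  -3.21   4.50    nan   5.53 324.5   2.10 F"
    "  -5.20   30.1   2.74   3.50  120   300   150   400",
]


def commented_header(lines: Iterable[str]) -> List[str]:
    """Return column names from the last comment line before the data rows."""
    header = None
    for line in lines:
        if line.startswith("#"):
            header = line
        elif line.strip():
            break  # first data row reached
    return header.lstrip("#").split() if header else []


columns = commented_header(SAMPLE_LINES)
if columns != EXPECTED_COLUMNS:
    raise ValueError(f"unexpected columns: {columns}")
print(f"{len(columns)} columns match")
```

If the check fails, the parser can stop with a clear error instead of assigning values to the wrong columns.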
