Script to update e-picsa from R-Instat #9289
I support. This is a sensible way of making it easy for ZMD. @lilyclements let me know when you have done the "tweak" and I will do the testing. |
@rdstern @jkmusyoka happy to assist. However, I am unsure where you'd like me to make this tweak. You refer to a "current script": what or where is this current script? |
@lilyclements I thought you had an R script from before, to update the data in google buckets if another year has passed, and so you just wanted the new year included, and the stations and events were all the same as last year? That's the one (if I'm not imagining things) that we suggested could be done from within R-Instat instead? |
@rdstern @jkmusyoka I have written a new function which currently updates only the annual rainfall summaries and the monthly/annual temperature summaries. I can get to the other bits now, but thought I should share this for now. There are three parts; only step 2 involves new code.

1. Importing from Climsoft

# Get Climsoft Data using Climatic > Import from Climsoft
# I tested with Lundazi data, but you can test with any. I can share that script privately, because it contains information for importing from Climsoft.
data_book$database_connect(dbname="...", host="...", port=..., user="...")
# Dialog: Import From Climsoft
data_book$import_climsoft_data(table="observationfinal", station_filter_column="stationId", stations="LUNDAZ01", element_filter_column="elementName", elements=c("Precip daily","Temp daily min","Temp daily max"))
# You then need to rearrange the data: pivot_wider, and create relevant columns, like DOY and Year.
# Is this something we want to automate?
# Dialog: Unstack (Pivot Wider)
observations_data <- data_book$get_data_frame(data_name="observations_data")
observations_data_unstacked <- tidyr::pivot_wider(data=observations_data, names_from=element_abbrv, values_from=value)
data_book$import_data(data_tables=list(observations_data_unstacked=observations_data_unstacked))
rm(list=c("observations_data_unstacked", "observations_data"))
# Dialog: Use Date
data_book$split_date(data_name="observations_data_unstacked", col_name="date", year_val=TRUE, month_val=TRUE, day_in_year_366=TRUE, s_start_month=1)
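If we did want to automate that rearrangement, it could be a small wrapper around the calls above. A minimal sketch only; the function name prepare_climsoft_data is hypothetical, not an existing R-Instat function:

prepare_climsoft_data <- function(data_book, data_name = "observations_data") {
  # pivot the Climsoft elements into one column per element
  observations_data <- data_book$get_data_frame(data_name = data_name)
  observations_data_unstacked <- tidyr::pivot_wider(data = observations_data, names_from = element_abbrv, values_from = value)
  data_book$import_data(data_tables = list(observations_data_unstacked = observations_data_unstacked))
  # then create the Year, Month and DOY columns
  data_book$split_date(data_name = "observations_data_unstacked", col_name = "date", year_val = TRUE, month_val = TRUE, day_in_year_366 = TRUE, s_start_month = 1)
}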
2. Updating the Summaries from Definitions

This is the new bit! You need to update your token to access the bucket:
# read in token to access bucket
gcs_auth_file(file = "tests/testthat/testdata/epicsa_token.json")
# update the summaries using our new functions
annual_summaries_data <- data_book$get_data_frame("observations_data_unstacked")
annual_summaries_data <- update_rainfall_summaries_from_definition(country = "zm_workshops", station_id = "Lundazi Met", daily_data = annual_summaries_data)
data_book$import_data(data_tables=list(annual_summaries_data = annual_summaries_data))
# and for our temperature summaries
monthly_temperature_summaries <- update_monthly_temperature_summaries_from_definition(country = "zm_workshops", station_id = "Lundazi Met", daily_data = observations_data_unstacked)
annual_temperature_summaries <- update_annual_temperature_summaries_from_definition(country = "zm_workshops", station_id = "Lundazi Met", daily_data = observations_data_unstacked)
data_book$import_data(data_tables=list(monthly_temperature_summaries = monthly_temperature_summaries))
data_book$import_data(data_tables=list(annual_temperature_summaries = annual_temperature_summaries))
3. Exporting to Google Buckets

annual_rain <- epicsawrap::reformat_annual_summaries(data=annual_summaries_data,
station_col="station_id"
year_col="year",
start_rains_doy_col="start_rains_doy",
start_rains_date_col="start_rains_date",
end_season_doy_col = "end_season_doy",
end_season_date_col = "end_season_date",
season_length_col = "season_length",
n_rain_col = "n_rain")
# similarly make changes for reformat_temperature_summaries
epicsawrap::export_r_instat_to_bucket(data_by_year = "annual_summaries_data",
rain = "PRECIP",
station = "station_id",
year="year",
month="month_val",
summaries=c("annual_rainfall"),
station_id = "station_id",
definitions_id="999",
country="zm_workshop",
include_summary_data=TRUE,
annual_rainfall_data = annual_rain,
start_rains_column = "start_rains_doy",
end_season_column = "end_season_doy",
seasonal_length_column = "seasonal_length")
# TODO: amend epicsawrap::export_r_instat_to_bucket to include our changes to the temperature summaries too
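For the temperature side, the reformat step might look something like this sketch. The argument names are assumed analogues of the rainfall ones, not a confirmed epicsawrap signature:

# sketch only: station_col/year_col/month_col are assumed, by analogy with reformat_annual_summaries
monthly_temp <- epicsawrap::reformat_temperature_summaries(data = monthly_temperature_summaries,
                                                           station_col = "station_id",
                                                           year_col = "year",
                                                           month_col = "month")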
|
@lilyclements this seems great. For now I am also ok that it works just one station at a time. If the updates are to be from the individual provinces, then there are relatively few stations. |
Great! Good suggestion! I've implemented that now (see point 2.)
E.g.,
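(A sketch only: I'm assuming station_id now accepts a vector of IDs, and "Chipata Met" is a made-up second station.)

annual_summaries_data <- update_rainfall_summaries_from_definition(
  country = "zm_workshops",
  station_id = c("Lundazi Met", "Chipata Met"),  # "Chipata Met" is hypothetical
  daily_data = observations_data_unstacked)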
|
@lilyclements I think using the calculation system may be what we are also suggesting? Currently, when we use the start of the rains (which uses the calculation system), it also generates a (sort of mysterious) definition, and this is exported to google buckets. Could the definition be less mysterious and simply become a "definition object" or "e-picsa object", which is added to the existing objects attached to the data sheet? It would be like a graph object, which can be all sorts of graphs, or a filter object, etc. Then we have the Prepare > R-objects menu to View, Rename, Reorder and Delete them.

I assume this would make the updating much more flexible and simpler to follow. The updating procedure simply (maybe) has a dialog to get (import?) the definition objects from google buckets, rather than from the dialog - so it could be in the File menu. If importing definitions, together with summary data, from google buckets, this is designed so you can prepare e-picsa-type graphs in R-Instat, to confirm that they are sensible and that you can support them. @jkmusyoka may wish to add? |

@lilyclements I'm quite liking this scheme. We are teaching the export to google buckets on Wednesday. I would like to progress our thinking and discussion by then, but would not expect any coding. And there isn't any rush after then. I'm hoping you agree that adding an Import from Google Buckets is a sensible addition; then we could explain that this is coming. I also like the definition objects, but that is too detailed to be discussed in the workshop. |
@lilyclements Looking again at your message above (your point 3), I think my request to save a definition is maybe the same as saving a calculation! We could check with @volloholic or @dannyparsons, because we don't yet save calculations, but I think they always had in mind that we should. That would then be great, because your definitions would then not be a special climatic e-picsa feature, but simply a special case of saving a calculation. And that fits perfectly with the whole idea of the data sheets and data book. |
@rdstern I think I'm a few steps behind you. I'd like to catch up, as what you're saying is very exciting! I think I'm on the right page after writing, rewriting, and re-rewriting this message. But here is my summary, and some questions.

The "Import from Google Buckets" dialog: we can have two buttons at the top, something like "Import/Update Definitions" and "Import Summaries".

Import/Update Definitions

Import Summaries

(Out of interest, if we had a function which used the calculation system, is this something we would want to use to replace our current code in the SoR/EoR dialogs?)

Definition Object bits:
|
@lilyclements I think you are only saying you are behind - in fact you have moved ahead. We have continued discussions here, and @jkmusyoka should also reply to your message above. Also, there is no rush, so we have time to reflect. On the timescale, I'm hoping we might have a workshop for the staff from the provinces in perhaps April - not before. If we do, then this would be an excellent workshop for you to be part of.

If these become calculation (or definition) objects, then I was assuming there would be multiple objects, with each definition, e.g. a start of the rains with no dry spell, being an object. That's like each filter, or each graph.

I'm assuming we shouldn't do much more before involving at least David? With the calculation system being such a selling point of the R-Instat data book etc., how come he and Danny didn't include calculation objects from the start? We are nearly 10 years in, and we have lots of other objects in the data book, but not yet calculations. And should we link this to the need for the General Summaries dialog to be "completed", i.e. to be able to edit the parts of a calculation? And should we then call that dialog General Calculations, as it is more general than a summary?
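To make that concrete, a purely hypothetical sketch of what one such object might hold as an R list - every field name here is made up for illustration, not an existing R-Instat or epicsawrap structure:

# hypothetical "definition object" for the start of the rains with no dry-spell check
start_rains_no_dry_spell <- list(
  definition = "start_rains",
  variation = "no_dry_spell",
  parameters = list(start_day = 1, end_day = 183, over_days = 2, amount_rain = 20)
)
# such objects could then sit alongside the graph and filter objects in the data book,
# managed from Prepare > R-objects (View, Rename, Reorder, Delete)
|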
@lilyclements I think this could be a very simple update to what I understand is your current R script to update the google bucket from Climsoft.
The reason I hope it is a simple "tweak" to your current script is that I suggest it should be a 2-step process as follows:
a) Step 1 is to import all the data for the update into R-Instat. This could be for the whole of Zambia, or it could be province by province, if there are so many stations that it is a bit scary for the ZMD staff to do it all in one step. Or it could even be for a few (new) stations.
b) Then they run your updating script.
I assume it is working already from Climsoft, so all you need to do is adapt it to run from data in R-Instat?
Neat, eh? If so, then this builds on all the changes you have been making in R-Instat recently, and justifies that they were made, at least partly, for e-picsa: that is, being able to import packages from GitHub easily, and also the specific R package for e-picsa. Could you then add the new code into that package?
This would then all fit very sweetly into our advanced workshop, which starts 9 December. We are teaching about using scripts also for the out-filling process, so it is perfect that they need to be able to run a script for the e-picsa updates!
And it fits perfectly into my general plan for R-Instat, namely that we have demolished the steep learning curve for a large group of potential users who find the usual "learn R first" approach daunting. It shows that, for them, using scripts is pretty easy, and much simpler than starting with R by writing them. This facility would be an excellent example!
And we can test it all next week! It is one important step towards making e-picsa a smooth process by the end of this contract.