This repository has been archived by the owner on Dec 11, 2022. It is now read-only.
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Merge pull request #36 from EcoJulia/improvement-query-speed
Improve query speed
Showing 22 changed files with 271 additions and 303 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,7 +1,11 @@ | ||
[deps] | ||
Documenter = "e30172f5-a6a5-5a46-863b-614d45cd2de4" | ||
DataFrames = "a93c6f00-e57d-5684-b7b6-d8193f3e46c0" | ||
Documenter = "e30172f5-a6a5-5a46-863b-614d45cd2de4" | ||
Query = "1a8c2f83-1ff3-5112-b086-8aa67b057ba1" | ||
StatsPlots = "f3b207a7-027a-5e70-b257-86293d7955fd" | ||
|
||
[compat] | ||
Documenter = "0.24" | ||
DataFrames = "0.21" | ||
Documenter = "0.25" | ||
Query = "0.12" | ||
StatsPlots = "0.14" |
This file was deleted.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,38 @@ | ||
# Rank-abundance curves of bats in Europe | ||
|
||
In this example, we will use the `GBIF` package to produce a rank-abundance | ||
curve of Chiroptera species in Europe, based on data from 2000 to 2005. | ||
|
||
```@example bt | ||
using GBIF | ||
using DataFrames | ||
using Query | ||
using StatsPlots | ||
bats = GBIF.taxon("Chiroptera"; strict=false) | ||
occ = occurrences(bats, "continent" => "EUROPE", "year" => (2000, 2005), "limit" => 300) | ||
while length(occ) < size(occ) | ||
occurrences!(occ) | ||
end | ||
``` | ||
|
||
```@example bt | ||
by_country = occ |> | ||
@filter(_.rank == "SPECIES") |> | ||
@map({_.key, _.country, species=_.taxon.name}) |> | ||
@filter(!ismissing(_.species)) |> | ||
@filter(!ismissing(_.country)) |> | ||
@groupby((_.country, _.species)) |> | ||
@map({country = first(unique(_.country)), species = first(unique(_.species)), count = length(_)}) |> | ||
@groupby(_.country) |> | ||
@map({country = key(_), abundance = sort(_.count, rev=true), rank = 1:length(_)}) |> | ||
@filter(length(_.abundance) > 5) |> | ||
DataFrame |> | ||
(d) -> flatten(d, [:abundance, :rank]) |> | ||
(d) -> sort(d, :rank) | ||
theme(:wong) | ||
@df by_country plot(:rank, :abundance, group = :country, m=:circle, legend=:outertopright) | ||
xaxis!("Rank", :log) | ||
yaxis!("Observations", :log) | ||
``` |
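
The grouping and ranking steps in the pipeline above can be hard to follow inside a single `Query` chain. The sketch below reproduces the same idea (count records per country and species, then sort each country's counts in decreasing order so that position equals rank) using only Base Julia. The `records` vector is made-up illustration data, not GBIF output, and the variable names are hypothetical.

```julia
# Hypothetical (country, species) pairs standing in for real GBIF records.
records = [
    ("FR", "Myotis myotis"), ("FR", "Myotis myotis"), ("FR", "Myotis myotis"),
    ("FR", "Plecotus auritus"), ("FR", "Plecotus auritus"),
    ("FR", "Nyctalus noctula"),
    ("DE", "Myotis myotis"), ("DE", "Nyctalus noctula"), ("DE", "Nyctalus noctula"),
]

# Step 1: count the records for each (country, species) pair.
counts = Dict{Tuple{String,String},Int}()
for pair in records
    counts[pair] = get(counts, pair, 0) + 1
end

# Step 2: gather the per-species counts of each country.
abundances = Dict{String,Vector{Int}}()
for ((country, _), n) in counts
    push!(get!(abundances, country, Int[]), n)
end

# Step 3: sort in decreasing order, so that the index of a count is its rank.
foreach(v -> sort!(v, rev=true), values(abundances))

for country in sort(collect(keys(abundances)))
    for (rank, n) in enumerate(abundances[country])
        println(country, "  rank ", rank, "  abundance ", n)
    end
end
```

Plotting rank against abundance on log-log axes, one series per country, then gives the rank-abundance curve.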
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,85 @@ | ||
# Observations of Northern Cardinal over time | ||
|
||
In this example, we will use the `GBIF` package to compare the number of | ||
observations of a species across years. Specifically, we will look at records of the Northern Cardinal (*Cardinalis cardinalis*) in Québec, from 2011 to 2013. This example will allow us to highlight how `GBIFRecords` can be used with `Query` to select records and transform them. | ||
|
||
```@example qc | ||
using GBIF | ||
using DataFrames | ||
using Query | ||
using StatsPlots | ||
using Dates | ||
``` | ||
|
||
We can get the taxonomic object for *Cardinalis cardinalis*: | ||
|
||
```@example qc | ||
sp_code = taxon("Cardinalis cardinalis", rank = :SPECIES) | ||
``` | ||
|
||
The `rank = :SPECIES` argument is not required, as it is the default behaviour | ||
of the API. Yet, it helps the readability of the code to specify what we should | ||
be expecting. With this object created, we can define a rough bounding box for | ||
Québec: | ||
|
||
```@example qc | ||
lat, lon = (44.0, 62.0), (-80.0, -56.0) | ||
``` | ||
|
||
This bounding box will also include a few parts of the continental USA, but this | ||
is not an issue, as we will filter them out once the occurrences have been | ||
retrieved. It would also be possible to add a `"country" => "CA"` parameter to | ||
the query. | ||
|
||
```@example qc | ||
obs_qc = occurrences( | ||
sp_code, | ||
"limit" => 300, | ||
"hasCoordinate" => "true", | ||
"decimalLatitude" => lat, | ||
"decimalLongitude" => lon, | ||
"year" => (2011, 2013) | ||
) | ||
``` | ||
|
||
The `length` method for this object will tell us how many records we currently | ||
have, and the `size` method will tell us how many we can retrieve in total. | ||
Because the query parameters are going to remain within the `obs_qc` variable | ||
(in the `query` field, specifically), all we need to do is call `occurrences!` | ||
on this variable until all occurrences (of which there are `size(obs_qc)`) are | ||
retrieved. | ||
|
||
```@example qc | ||
while length(obs_qc) < size(obs_qc) | ||
occurrences!(obs_qc) | ||
end | ||
``` | ||
|
||
At the end of this loop, the `obs_qc` object will have all of the occurrences. Running this loop may take some time, as there are limitations on speed due to interacting with a remote server. | ||
|
||
The result is directly iterable, so we do not need to do anything specific to | ||
use it in a `for` loop - but if we want to get an array of `GBIFRecord`, we can | ||
use `collect(view(obs_qc))`. Why `view`? The `GBIFRecords` type always starts | ||
with enough "room" to put all the `GBIFRecord`, but any record that was not | ||
retrieved yet is `#undef`. Calling `view` will give us the records that are | ||
initialized (in versions of Julia starting from 1.5, this has no performance | ||
cost); `collect`ing the view generates a `Vector{GBIFRecord}`. Internally, | ||
iteration methods act on the view, so the unassigned records are invisible to | ||
the user. | ||
|
||
The next step is to actually convert the data into a form where we can plot | ||
them, and this showcases how the package can be used with `Query`: | ||
|
||
```@example qc | ||
d = obs_qc |> | ||
@filter(_.rank == "SPECIES") |> | ||
@filter(_.country == "Canada") |> | ||
@map({_.key, year=year(_.date), month=month(_.date)}) |> | ||
@groupby((_.year, _.month)) |> | ||
@map({year = first(unique(_.year)), month = first(unique(_.month)), obs = length(_)}) |> | ||
@orderby(_.month) |> | ||
@thenby(_.year) |> | ||
DataFrame | ||
@df d plot(:month, :obs, group=:year) | ||
``` |
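
The paging loop and the role of `view` described above can be illustrated without contacting GBIF. The sketch below uses a hypothetical `PagedResults` type, which is not the package's `GBIFRecords` implementation: it preallocates room for every record, fills the slots page by page, and only exposes the slots that are assigned. The names `PagedResults`, `total`, `filled`, and `fetch_page!` are assumptions made for illustration; in the text above, the total is reported by `size` and the paging is done by `occurrences!`.

```julia
# Hypothetical stand-in for a paged result set; this is NOT the GBIF.jl type.
mutable struct PagedResults
    items::Vector{String}   # room for every record; unfetched slots are #undef
    fetched::Int            # number of slots filled so far
end

PagedResults(n::Int) = PagedResults(Vector{String}(undef, n), 0)

# Number of records available in total (the text above uses `size` for this).
total(r::PagedResults) = length(r.items)

# Number of records retrieved so far.
Base.length(r::PagedResults) = r.fetched

# View over the assigned slots only, so #undef entries are never touched.
filled(r::PagedResults) = view(r.items, 1:r.fetched)

# Simulate retrieving the next page of records from a remote server.
function fetch_page!(r::PagedResults; page_size::Int = 3)
    stop = min(r.fetched + page_size, total(r))
    for i in (r.fetched + 1):stop
        r.items[i] = "record-$i"
    end
    r.fetched = stop
    return r
end

results = PagedResults(7)
fetch_page!(results)

# After one page, early slots are assigned and late ones are still #undef.
println(isassigned(results.items, 1))
println(isassigned(results.items, 7))

# Same loop shape as in the example: page until everything is retrieved.
while length(results) < total(results)
    fetch_page!(results)
end

records = collect(filled(results))
println(length(records), " records collected")
```

Reading `results.items[7]` before the final page is fetched would throw an `UndefRefError`, which is why iteration goes through the view rather than the raw vector.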
This file was deleted.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,25 +1,33 @@ | ||
# Access GBIF data with Julia | ||
|
||
This package offers access to biodiversity data through the Global Biodiversity | ||
Information Facility ([GBIF](https://www.gbif.org/)) API. The package currently | ||
supports access to occurrence information, and limited support for taxonomic | ||
information. There are a limited number of cleaning routines built-in, but more | ||
can easily be added. | ||
This package offers access to biodiversity data stored by the Global | ||
Biodiversity Information Facility ([GBIF](https://www.gbif.org/)). The package | ||
currently offers a wrapper around the search API (to retrieve information on | ||
occurrences), and a limited wrapper around the species API (to retrieve the | ||
identifier of taxa). | ||
|
||
## How to install | ||
The focus of the package is on retrieving data; filtering and data analysis | ||
should be done using other packages from the Julia ecosystem. In particular, we | ||
provide support for `DataFrames` and `Query` (and therefore the rest of the | ||
"queryverse"). | ||
|
||
The package can be installed from the Julia console: | ||
## Getting started | ||
|
||
The latest release of the package can be installed from the Julia console: | ||
|
||
~~~ julia | ||
Pkg.add("GBIF") | ||
~~~ | ||
|
||
## How to use | ||
|
||
After installing it, load the package as usual: | ||
|
||
~~~ julia | ||
using GBIF | ||
~~~ | ||
|
||
This documentation will walk you through the various features. | ||
## Core features | ||
|
||
- get taxonomic information using the `taxon` function | ||
- retrieve a single occurrence as a `GBIFRecord` using `occurrence` | ||
- search for multiple occurrences as a `GBIFRecords` according to a query using the `occurrences` function, and page through the results with `occurrences!` | ||
- `GBIFRecords` are fully iterable |
973a724
@JuliaRegistrator register
Registration pull request created: JuliaRegistries/General/18600
After the above pull request is merged, it is recommended that a tag is created on this repository for the registered package version.
This will be done automatically if the Julia TagBot GitHub Action is installed, or can be done manually through the github interface, or via: