-
Notifications
You must be signed in to change notification settings - Fork 21
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use c++ code for testing coordinate equality of nodes #130
Comments
Yes, I would even put it high on the priority list ;) |
Hi everyone. If you are willing to introduce a Load packages library(sf)
#> Linking to GEOS 3.8.0, GDAL 3.0.4, PROJ 6.3.1
library(data.table) Create a simple example my_sfc <- st_sfc(
st_linestring(rbind(c(-1, 0), c(0, 0))),
st_linestring(rbind(c(1, 0), c(0, 0))),
st_linestring(rbind(c(0, 0), c(0, 1))),
st_linestring(rbind(c(0, 0), c(0, -1))),
st_linestring(rbind(c(0, 1), c(1, 1))),
st_linestring(rbind(c(1, 1), c(1, 0))),
st_linestring(rbind(c(1, 0), c(1, -1))),
st_linestring(rbind(c(0, -1), c(1, -1)))
) and plot it par(mar = rep(0, 4))
plot(my_sfc, reset = FALSE)
plot(st_cast(my_sfc, "POINT"), pch = 16, add = TRUE) Extract the boundary points nodes <- sfnetworks:::linestring_boundary_points(my_sfc) Extract their coordinates nodes_coordinates <- sfheaders::sfc_to_df(nodes) Convert to setDT(nodes_coordinates) Extract unique coordinates cols_union <- intersect(colnames(nodes_coordinates), c("x", "y", "z", "m"))
unique_coordinates <- unique(nodes_coordinates, by = cols_union) Extract the ID of the row numbers unique_coordinates[nodes_coordinates, which = TRUE, on = cols_union]
#> [1] 1 2 3 2 2 4 2 5 4 6 6 3 3 7 5 7 The output is identical to the current procedure: match(nodes, unique(nodes))
#> [1] 1 2 3 2 2 4 2 5 4 6 6 3 3 7 5 7 Summarise the previous steps into a function my_match_with_unique <- function(nodes) {
nodes_coordinates <- sfheaders::sfc_to_df(nodes)
setDT(nodes_coordinates)
cols_union <- intersect(colnames(nodes_coordinates), c("x", "y", "z", "m"))
unique_coordinates <- unique(nodes_coordinates, by = cols_union)
unique_coordinates[nodes_coordinates, which = TRUE, on = cols_union]
} and test it again my_match_with_unique(nodes)
#> [1] 1 2 3 2 2 4 2 5 4 6 6 3 3 7 5 7 Define the degenerate data used in the previous example degenerate_sfc <- st_sfc(
st_linestring(rbind(
c(
-59.81698520799989893248493899591267108917236328125,
-13.6634482960000003259892764617688953876495361328125
),
c(
-59.81698520799988472163022379390895366668701171875,
-13.6634482960000003259892764617688953876495361328125
)
)),
st_linestring(rbind(c(1, 2), c(3, 4))),
crs = 4326
) Extract the nodes nodes <- sfnetworks:::linestring_boundary_points(degenerate_sfc) Compare new code my_match_with_unique(nodes)
#> [1] 1 2 3 4 Current (wrong) code match(nodes, unique(nodes))
#> [1] 1 1 3 4 Test it again with nodes <- sfnetworks:::linestring_boundary_points(sfnetworks::roxel)
all(my_match_with_unique(nodes) == match(nodes, unique(nodes)))
#> [1] TRUE A slightly more difficult example: iow <- osmextract::oe_get("Isle of Wight")
#> The input place was matched with: Isle of Wight
#> The chosen file was already detected in the download directory. Skip downloading.
#> The corresponding gpkg file was already detected. Skip vectortranslate operations.
#> Reading layer `lines' from data source `C:\Users\Utente\Documents\osm-data\geofabrik_isle-of-wight-latest.gpkg' using driver `GPKG'
#> Simple feature collection with 45621 features and 9 fields
#> geometry type: LINESTRING
#> dimension: XY
#> bbox: xmin: -5.401978 ymin: 43.35489 xmax: -0.175775 ymax: 50.89601
#> geographic CRS: WGS 84
nodes <- sfnetworks:::linestring_boundary_points(iow)
bench::mark(
new = my_match_with_unique(nodes),
current = match(nodes, unique(nodes)),
check = TRUE
)
#> Warning: Some expressions had a GC in every iteration; so filtering is disabled.
#> # A tibble: 2 x 6
#> expression min median `itr/sec` mem_alloc `gc/sec`
#> <bch:expr> <bch:tm> <bch:tm> <dbl> <bch:byt> <dbl>
#> 1 new 68.7ms 76.37ms 12.5 8.69MB 1.79
#> 2 current 1.08s 1.08s 0.923 6.74MB 0.923 Even more difficult greater_london <- osmextract::oe_get("Greater London")
#> The input place was matched with: Greater London
#> The chosen file was already detected in the download directory. Skip downloading.
#> The corresponding gpkg file was already detected. Skip vectortranslate operations.
#> Reading layer `lines' from data source `C:\Users\Utente\Documents\osm-data\geofabrik_greater-london-latest.gpkg' using driver `GPKG'
#> Simple feature collection with 423239 features and 9 fields
#> geometry type: LINESTRING
#> dimension: XY
#> bbox: xmin: -0.6125036 ymin: 51.22684 xmax: 0.399669 ymax: 51.78736
#> geographic CRS: WGS 84
nodes <- sfnetworks:::linestring_boundary_points(greater_london)
bench::mark(
new = my_match_with_unique(nodes),
current = match(nodes, unique(nodes)),
check = TRUE
)
#> # A tibble: 2 x 6
#> expression min median `itr/sec` mem_alloc `gc/sec`
#> <bch:expr> <bch:tm> <bch:tm> <dbl> <bch:byt> <dbl>
#> 1 new 729.49ms 729.49ms 1.37 76.9MB 0
#> 2 current 8.59s 8.59s 0.116 57.1MB 0 Created on 2021-03-13 by the reprex package (v1.0.0) To summarise:
I know that it may create more problems than benefits, but I don't know how to replicate the following code using regular
EDIT: Add more details I also tested a combination of A more general question: I think that the problem of matching and extracting unique |
Thanks @agila5! Interesting approach, did not think before of doing it like that. Personally I am a bit reluctant to include the I have been thinking about this and came up with an idea of using st_match = function(x) {
idxs = do.call("c", lapply(st_equals(x), `[`, 1))
match(idxs, unique(idxs))
} Some checks and benchmarks. Note that the runtime of nodes <- sfnetworks:::linestring_boundary_points(my_sfc)
my_match_with_unique(nodes)
# [1] 1 2 3 2 2 4 2 5 4 6 6 3 3 7 5 7
st_match(nodes)
# [1] 1 2 3 2 2 4 2 5 4 6 6 3 3 7 5 7
nodes <- sfnetworks:::linestring_boundary_points(degenerate_sfc)
my_match_with_unique(nodes)
# [1] 1 2 3 4
st_match(nodes)
# [1] 1 2 3 4
nodes <- sfnetworks:::linestring_boundary_points(sfnetworks::roxel)
all(my_match_with_unique(nodes) == st_match(nodes))
# [1] TRUE
iow <- osmextract::oe_get("Isle of Wight")
#> The input place was matched with: Isle of Wight
#> The chosen file was already detected in the download directory. Skip downloading.
#> The corresponding gpkg file was already detected. Skip vectortranslate operations.
#> Reading layer `lines' from data source `C:\Users\Utente\Documents\osm-data\geofabrik_isle-of-wight-latest.gpkg' using driver `GPKG'
#> Simple feature collection with 45621 features and 9 fields
#> geometry type: LINESTRING
#> dimension: XY
#> bbox: xmin: -5.401978 ymin: 43.35489 xmax: -0.175775 ymax: 50.89601
#> geographic CRS: WGS 84
nodes <- sfnetworks:::linestring_boundary_points(iow)
bench::mark(
dt = my_match_with_unique(nodes),
st = st_match(nodes),
base = match(nodes, unique(nodes)),
check = TRUE
)
# A tibble: 3 x 13
# expression min median `itr/sec` mem_alloc `gc/sec` n_itr
# <bch:expr> <bch:tm> <bch:tm> <dbl> <bch:byt> <dbl> <int>
# 1 dt 46.3ms 51.4ms 19.7 8.75MB 9.84 10
# 2 st 685.4ms 685.4ms 1.46 8.23MB 4.38 1
# 3 base 847.6ms 847.6ms 1.18 5.76MB 2.36 1
# … with 6 more variables: n_gc <dbl>, total_time <bch:tm>,
# result <list>, memory <list>, time <list>, gc <list>
greater_london <- osmextract::oe_get("Greater London")
#> The input place was matched with: Greater London
#> The chosen file was already detected in the download directory. Skip downloading.
#> The corresponding gpkg file was already detected. Skip vectortranslate operations.
#> Reading layer `lines' from data source `C:\Users\Utente\Documents\osm-data\geofabrik_greater-london-latest.gpkg' using driver `GPKG'
#> Simple feature collection with 423239 features and 9 fields
#> geometry type: LINESTRING
#> dimension: XY
#> bbox: xmin: -0.6125036 ymin: 51.22684 xmax: 0.399669 ymax: 51.78736
#> geographic CRS: WGS 84
nodes <- sfnetworks:::linestring_boundary_points(greater_london)
bench::mark(
dt = my_match_with_unique(nodes),
st = st_match(nodes),
base = match(nodes, unique(nodes)),
check = TRUE
)
# A tibble: 3 x 13
# expression min median `itr/sec` mem_alloc `gc/sec` n_itr
# <bch:expr> <bch:tm> <bch:tm> <dbl> <bch:byt> <dbl> <int>
# 1 dt 955.89ms 955.89ms 1.05 81.3MB 1.05 1
# 2 st 7.6s 7.6s 0.132 75.4MB 0.527 1
# 3 base 7.97s 7.97s 0.126 51MB 0.251 1
# … with 6 more variables: n_gc <dbl>, total_time <bch:tm>,
# result <list>, memory <list>, time <list>, gc <list> What I like about the st_equals solution:
What I like about the data.table solution:
I think we need to balance the pros and cons. Any ideas on that? |
Hi @luukvdmeer. Great ideas and great solution (as always). I would adopt your approach since it adds no extra dependency and (I'm pretty sure) nobody cares about 6 seconds of extra computing time when working with such a large spatial network. |
Here's a C++ version that has 3 important advantages:
The source code is this, presumed to be in a local file #include <utility> // for std::pair
#include <Rcpp.h>
typedef std::pair <long int, long int> nodePair_t;
struct pair_hash {
inline std::size_t operator() (const nodePair_t & n) const {
return n.first * 31. + n.second;
}
};
// [[Rcpp::export]]
Rcpp::IntegerVector cpp_match (Rcpp::List g, double precision = 1e-10) {
const long int prec_int = round (1 / precision);
Rcpp::IntegerVector index (g.size ());
std::unordered_map <nodePair_t, int, pair_hash> nodeMap;
int count = 0;
int n = 0;
int index_i = 0;
for (auto i: g) {
const std::vector <double> xy = Rcpp::as <std::vector <double>> (i);
nodePair_t nodes = std::make_pair (
round (xy [0] * prec_int),
round (xy [1] * prec_int));
if (nodeMap.find (nodes) == nodeMap.end ()) {
n = count;
nodeMap.emplace (nodes, count++);
} else {
n = nodeMap.at (nodes);
}
// store n+1 for direct 1-based R indexing:
index (index_i++) = n + 1;
}
return index;
} And here is the benchmarking against the current solutions: library (sfnetworks)
library(sf)
#> Linking to GEOS 3.8.1, GDAL 3.0.4, PROJ 6.3.2
library (data.table)
library(Rcpp)
library (osmextract)
sourceCpp ("src.cpp")
iow <- osmextract::oe_get("Isle of Wight")
nodes <- sfnetworks:::linestring_boundary_points(iow)
my_match_with_unique <- function(nodes) {
nodes_coordinates <- sfheaders::sfc_to_df(nodes)
setDT(nodes_coordinates)
cols_union <- intersect(colnames(nodes_coordinates), c("x", "y", "z", "m"))
unique_coordinates <- unique(nodes_coordinates, by = cols_union)
unique_coordinates[nodes_coordinates, which = TRUE, on = cols_union]
}
st_match = function(x) {
idxs = do.call("c", lapply(st_equals(x), `[`, 1))
match(idxs, unique(idxs))
}
bench::mark(
dt = my_match_with_unique(nodes),
st = st_match (nodes),
base = match(nodes, unique(nodes)),
cpp = cpp_match (nodes),
check = TRUE
)
#> Warning: Some expressions had a GC in every iteration; so filtering is disabled.
#> # A tibble: 4 x 6
#> expression min median `itr/sec` mem_alloc `gc/sec`
#> <bch:expr> <bch:tm> <bch:tm> <dbl> <bch:byt> <dbl>
#> 1 dt 45.7ms 49ms 19.6 NA 7.83
#> 2 st 750.3ms 750ms 1.33 NA 4.00
#> 3 base 706.5ms 707ms 1.42 NA 0
#> 4 cpp 46.1ms 47ms 21.2 NA 0
# and most importrantly:
identical (st_match (nodes), cpp_match (nodes))
#> TRUE Created on 2021-03-15 by the reprex package (v1.0.0) Also a clear demonstration of how amazing @luukvdmeer You currently have no |
Hi @mpadge. I don't understand the C++ code, but thank you very much for your assistance! |
Sorry @agila5, to clarify: save the C++ chunk above as a local file called library (Rcpp)
sourceCpp ("src.cpp") You can then call it just like the other functions, and use it in benchmarking. Don't worry about the details of how it works, but please do mess around with the library (sfnetworks)
library(sf)
#> Linking to GEOS 3.8.1, GDAL 3.0.4, PROJ 6.3.2
library(Rcpp)
library (osmextract)
sourceCpp ("src.cpp")
iow <- osmextract::oe_get("Isle of Wight")
nodes <- sfnetworks:::linestring_boundary_points(iow)
precision <- 10 ^ -(1:10)
nunique <- vapply (precision, function (i)
length (unique (cpp_match (nodes, precision = i))),
integer (1))
x <- data.frame (precision = precision,
n = nunique)
library (ggplot2)
ggplot (x, aes (x = precision, y = n)) +
geom_line () +
scale_x_log10 () +
scale_y_log10 () Created on 2021-03-15 by the reprex package (v1.0.0) |
Thanks @mpadge that looks great! One question I have: the default precision you use (i.e. |
There is no specific reason, just a generic value; lots of routines use around which happens to perfectly sensibly be 0.0. And that means extracting These considerations nevertheless suggest that it would be good to first check |
Good morning @mpadge, and (again) thank you very much for your time, help, and explanations. I have just one more consideration regarding the # packages
library(ggplot2)
# load cpp function
Rcpp::sourceCpp("C:/Users/Utente/Desktop/src.cpp")
# OSM data
iow <- sf::st_transform(osmextract::oe_get("Isle of Wight"), 27700)
nodes <- sfnetworks:::linestring_boundary_points(iow)
precision <- 10 ^ -(1:12)
nunique <- vapply (precision, function (i) {
length(unique (cpp_match (nodes, precision = i)))
},
integer (1)
)
x <- data.frame (
precision = precision,
n = nunique
)
ggplot (x, aes (x = precision, y = n)) +
geom_line () +
scale_x_log10 () +
scale_y_log10 () Created on 2021-03-16 by the reprex package (v1.0.0) Maybe it's possible to modify the C++ code and merge the two scenarios (I don't know), but I just wanted to highlight that (at the moment) the default value of |
That can be handled straight in R by something like the following lines, which estimate how many digits are used to represent the whole-number part of coordinates, as well as the maximal possible number that can be used to store integer values. n <- vapply (nodes [[1]], function (i)
nchar (as.character (round (i))),
integer (1))
nmax <- nchar (as.character (.Machine$integer.max)) For 4326, |
I think it must be related to the fact that I use Windows on my laptop because when I run the same code on a virtual machine (running ubuntu 16.04) I observe the same behaviour that you describe (i.e. a flat line going down for smaller values of precision). |
Just for "fun", I tried to test those problems a little bit more (since I really don't want to update On Ubuntu 16.04 I observe the same behaviour that you describe, i.e. C++ code uses a long int which does not exist in R and is much bigger than R's # load cpp function
Rcpp::sourceCpp(
code = '
#include <Rcpp.h>
// [[Rcpp::export]]
int cpp_match (double precision = 1e-10) {
const long int prec_int = round (1 / precision);
Rcpp::Rcout << "prec_int is equal to: " << prec_int << std::endl;
return prec_int;
}',
verbose = TRUE
)
#>
#> Generated extern "C" functions
#> --------------------------------------------------------
#>
#>
#> #include <Rcpp.h>
#> // cpp_match
#> int cpp_match(double precision);
#> RcppExport SEXP sourceCpp_1_cpp_match(SEXP precisionSEXP) {
#> BEGIN_RCPP
#> Rcpp::RObject rcpp_result_gen;
#> Rcpp::RNGScope rcpp_rngScope_gen;
#> Rcpp::traits::input_parameter< double >::type precision(precisionSEXP);
#> rcpp_result_gen = Rcpp::wrap(cpp_match(precision));
#> return rcpp_result_gen;
#> END_RCPP
#> }
#>
#> Generated R functions
#> -------------------------------------------------------
#>
#> `.sourceCpp_1_DLLInfo` <- dyn.load('/tmp/RtmpmMDLp0/sourceCpp-x86_64-pc-linux-gnu-1.0.6/sourcecpp_83071fec14b1/sourceCpp_2.so')
#>
#> cpp_match <- Rcpp:::sourceCppFunction(function(precision = 1e-10) {}, FALSE, `.sourceCpp_1_DLLInfo`, 'sourceCpp_1_cpp_match')
#>
#> rm(`.sourceCpp_1_DLLInfo`)
#>
#> Building shared library
#> --------------------------------------------------------
#>
#> DIR: /tmp/RtmpmMDLp0/sourceCpp-x86_64-pc-linux-gnu-1.0.6/sourcecpp_83071fec14b1
#>
#> /usr/lib/R/bin/R CMD SHLIB -o 'sourceCpp_2.so' 'file830749bf89dd.cpp'
res <- vapply(
X = 10 ^ (-(0:12)),
FUN = cpp_match,
FUN.VALUE = integer(1)
)
#> prec_int is equal to: 1
#> prec_int is equal to: 10
#> prec_int is equal to: 100
#> prec_int is equal to: 1000
#> prec_int is equal to: 10000
#> prec_int is equal to: 100000
#> prec_int is equal to: 1000000
#> prec_int is equal to: 10000000
#> prec_int is equal to: 100000000
#> prec_int is equal to: 1000000000
#> prec_int is equal to: 10000000000
#> prec_int is equal to: 100000000000
#> prec_int is equal to: 1000000000000
res
#> [1] 1 10 100 1000 10000 100000
#> [7] 1000000 10000000 100000000 1000000000 1410065408 1215752192
#> [13] -727379968 Created on 2021-03-17 by the reprex package (v0.3.0) Session infodevtools::session_info()
#> � Session info ���������������������������������������������������������������
#> setting value
#> version R version 3.6.3 (2020-02-29)
#> os Ubuntu 16.04.6 LTS
#> system x86_64, linux-gnu
#> ui X11
#> language (EN)
#> collate en_US.UTF-8
#> ctype en_US.UTF-8
#> tz Europe/Rome
#> date 2021-03-17
#>
#> � Packages �������������������������������������������������������������������
#> package * version date lib source
#> assertthat 0.2.1 2019-03-21 [3] CRAN (R 3.5.3)
#> backports 1.1.10 2020-09-15 [1] CRAN (R 3.6.3)
#> callr 3.4.4 2020-09-07 [1] CRAN (R 3.6.3)
#> cli 2.2.0 2020-11-20 [1] CRAN (R 3.6.3)
#> crayon 1.3.4 2017-09-16 [1] CRAN (R 3.6.3)
#> desc 1.2.0 2018-05-01 [3] CRAN (R 3.5.0)
#> devtools 2.2.2 2020-02-17 [3] CRAN (R 3.6.2)
#> digest 0.6.27 2020-10-24 [1] CRAN (R 3.6.3)
#> ellipsis 0.3.1 2020-05-15 [1] CRAN (R 3.6.3)
#> evaluate 0.14 2019-05-28 [3] CRAN (R 3.6.0)
#> fansi 0.4.1 2020-01-08 [3] CRAN (R 3.6.2)
#> fs 1.3.2 2020-03-05 [3] CRAN (R 3.6.3)
#> glue 1.4.2 2020-08-27 [1] CRAN (R 3.6.3)
#> highr 0.8 2019-03-20 [3] CRAN (R 3.5.3)
#> htmltools 0.5.0 2020-06-16 [1] CRAN (R 3.6.3)
#> knitr 1.30 2020-09-22 [1] CRAN (R 3.6.3)
#> magrittr 2.0.1 2020-11-17 [1] CRAN (R 3.6.3)
#> memoise 1.1.0 2017-04-21 [3] CRAN (R 3.5.0)
#> pkgbuild 1.1.0 2020-07-13 [1] CRAN (R 3.6.3)
#> pkgload 1.1.0 2020-05-29 [1] CRAN (R 3.6.3)
#> prettyunits 1.1.1 2020-01-24 [3] CRAN (R 3.6.2)
#> processx 3.4.4 2020-09-03 [1] CRAN (R 3.6.3)
#> ps 1.3.4 2020-08-11 [1] CRAN (R 3.6.3)
#> R6 2.5.0 2020-10-28 [1] CRAN (R 3.6.3)
#> Rcpp 1.0.6 2021-01-15 [1] CRAN (R 3.6.3)
#> remotes 2.1.1 2020-02-15 [3] CRAN (R 3.6.2)
#> rlang 0.4.9 2020-11-26 [1] CRAN (R 3.6.3)
#> rmarkdown 2.5 2020-10-21 [1] CRAN (R 3.6.3)
#> rprojroot 1.3-2 2018-01-03 [3] CRAN (R 3.5.0)
#> sessioninfo 1.1.1 2018-11-05 [3] CRAN (R 3.5.1)
#> stringi 1.5.3 2020-09-09 [1] CRAN (R 3.6.3)
#> stringr 1.4.0 2019-02-10 [1] CRAN (R 3.6.3)
#> testthat 2.3.2 2020-03-02 [3] CRAN (R 3.6.3)
#> usethis 1.6.1 2020-04-29 [1] CRAN (R 3.6.3)
#> withr 2.3.0 2020-09-22 [1] CRAN (R 3.6.3)
#> xfun 0.21 2021-02-10 [1] CRAN (R 3.6.3)
#> yaml 2.2.1 2020-02-01 [3] CRAN (R 3.6.2)
#>
#> [1] /home/dsuser/R/x86_64-pc-linux-gnu-library/3.6
#> [2] /usr/local/lib/R/site-library
#> [3] /usr/lib/R/site-library
#> [4] /usr/lib/R/library On the other hand, on Windows: # load cpp function
Rcpp::sourceCpp(
code = '
#include <Rcpp.h>
// [[Rcpp::export]]
int cpp_match (double precision = 1e-10) {
const long int prec_int = round (1 / precision);
Rcpp::Rcout << "prec_int is equal to: " << prec_int << std::endl;
return prec_int;
}',
verbose = TRUE
)
#>
#> Generated extern "C" functions
#> --------------------------------------------------------
#>
#>
#> #include <Rcpp.h>
#> // cpp_match
#> int cpp_match(double precision);
#> RcppExport SEXP sourceCpp_1_cpp_match(SEXP precisionSEXP) {
#> BEGIN_RCPP
#> Rcpp::RObject rcpp_result_gen;
#> Rcpp::RNGScope rcpp_rngScope_gen;
#> Rcpp::traits::input_parameter< double >::type precision(precisionSEXP);
#> rcpp_result_gen = Rcpp::wrap(cpp_match(precision));
#> return rcpp_result_gen;
#> END_RCPP
#> }
#>
#> Generated R functions
#> -------------------------------------------------------
#>
#> `.sourceCpp_1_DLLInfo` <- dyn.load('C:/Users/Utente/AppData/Local/Temp/RtmpSUEJy1/sourceCpp-x86_64-w64-mingw32-1.0.6/sourcecpp_2bd017893460/sourceCpp_2.dll')
#>
#> cpp_match <- Rcpp:::sourceCppFunction(function(precision = 1e-10) {}, FALSE, `.sourceCpp_1_DLLInfo`, 'sourceCpp_1_cpp_match')
#>
#> rm(`.sourceCpp_1_DLLInfo`)
#>
#> Building shared library
#> --------------------------------------------------------
#>
#> DIR: C:/Users/Utente/AppData/Local/Temp/RtmpSUEJy1/sourceCpp-x86_64-w64-mingw32-1.0.6/sourcecpp_2bd017893460
#>
#> C:/PROGRA~1/R/R-40~1.4/bin/x64/R CMD SHLIB -o "sourceCpp_2.dll" "file2bd053da4388.cpp"
res <- vapply(
X = 10 ^ (-(0:12)),
FUN = cpp_match,
FUN.VALUE = integer(1)
)
#> prec_int is equal to: 1
#> prec_int is equal to: 10
#> prec_int is equal to: 100
#> prec_int is equal to: 1000
#> prec_int is equal to: 10000
#> prec_int is equal to: 100000
#> prec_int is equal to: 1000000
#> prec_int is equal to: 10000000
#> prec_int is equal to: 100000000
#> prec_int is equal to: 1000000000
#> prec_int is equal to: -2147483648
#> prec_int is equal to: -2147483648
#> prec_int is equal to: -2147483648
res
#> [1] 1 10 100 1000 10000 100000
#> [7] 1000000 10000000 100000000 1000000000 NA NA
#> [13] NA Created on 2021-03-17 by the reprex package (v1.0.0) Session infosessioninfo::session_info()
#> - Session info ---------------------------------------------------------------
#> setting value
#> version R version 4.0.4 (2021-02-15)
#> os Windows 10 x64
#> system x86_64, mingw32
#> ui RTerm
#> language (EN)
#> collate Italian_Italy.1252
#> ctype Italian_Italy.1252
#> tz Europe/Berlin
#> date 2021-03-17
#>
#> - Packages -------------------------------------------------------------------
#> package * version date lib source
#> assertthat 0.2.1 2019-03-21 [1] CRAN (R 4.0.4)
#> backports 1.2.1 2020-12-09 [1] CRAN (R 4.0.3)
#> cli 2.3.1 2021-02-23 [1] CRAN (R 4.0.4)
#> crayon 1.4.1 2021-02-08 [1] CRAN (R 4.0.4)
#> digest 0.6.27 2020-10-24 [1] CRAN (R 4.0.4)
#> ellipsis 0.3.1 2020-05-15 [1] CRAN (R 4.0.4)
#> evaluate 0.14 2019-05-28 [1] CRAN (R 4.0.4)
#> fansi 0.4.2 2021-01-15 [1] CRAN (R 4.0.4)
#> fs 1.5.0 2020-07-31 [1] CRAN (R 4.0.4)
#> glue 1.4.2 2020-08-27 [1] CRAN (R 4.0.4)
#> highr 0.8 2019-03-20 [1] CRAN (R 4.0.4)
#> htmltools 0.5.1.1 2021-01-22 [1] CRAN (R 4.0.4)
#> knitr 1.31 2021-01-27 [1] CRAN (R 4.0.4)
#> lifecycle 1.0.0 2021-02-15 [1] CRAN (R 4.0.4)
#> magrittr 2.0.1 2020-11-17 [1] CRAN (R 4.0.4)
#> pillar 1.5.1 2021-03-05 [1] CRAN (R 4.0.4)
#> pkgconfig 2.0.3 2019-09-22 [1] CRAN (R 4.0.4)
#> purrr 0.3.4 2020-04-17 [1] CRAN (R 4.0.4)
#> Rcpp 1.0.6 2021-01-15 [1] CRAN (R 4.0.4)
#> reprex 1.0.0 2021-01-27 [1] CRAN (R 4.0.4)
#> rlang 0.4.10 2020-12-30 [1] CRAN (R 4.0.4)
#> rmarkdown 2.7 2021-02-19 [1] CRAN (R 4.0.4)
#> sessioninfo 1.1.1 2018-11-05 [1] CRAN (R 4.0.4)
#> stringi 1.5.3 2020-09-09 [1] CRAN (R 4.0.3)
#> stringr 1.4.0 2019-02-10 [1] CRAN (R 4.0.4)
#> styler 1.3.2 2020-02-23 [1] CRAN (R 4.0.4)
#> tibble 3.1.0 2021-02-25 [1] CRAN (R 4.0.4)
#> utf8 1.2.1 2021-03-12 [1] CRAN (R 4.0.4)
#> vctrs 0.3.6 2020-12-17 [1] CRAN (R 4.0.4)
#> withr 2.4.1 2021-01-26 [1] CRAN (R 4.0.4)
#> xfun 0.22 2021-03-11 [1] CRAN (R 4.0.4)
#> yaml 2.2.1 2020-02-01 [1] CRAN (R 4.0.3)
#>
#> [1] C:/Users/Utente/Documents/R/win-library/4.0
#> [2] C:/Program Files/R/R-4.0.4/library Unfortunately, that also implies that I cannot use # load cpp function
Rcpp::sourceCpp("src.cpp") # the same code that you shared
# load data
roads <- sfnetworks::roxel
# extract nodes
nodes <- sfnetworks:::linestring_boundary_points(roads)
table(cpp_match(nodes), useNA = "ifany")
#>
#> 1
#> 1702 Created on 2021-03-17 by the reprex package (v1.0.0) Does it make sense to use # load cpp function
Rcpp::sourceCpp(
code = '
#include <Rcpp.h>
// [[Rcpp::export]]
int cpp_match (double precision = 1e-10) {
const unsigned long long int prec_int = round (1 / precision);
Rcpp::Rcout << "prec_int is equal to: " << prec_int << std::endl;
return prec_int;
}',
verbose = TRUE
)
#>
#> Generated extern "C" functions
#> --------------------------------------------------------
#>
#>
#> #include <Rcpp.h>
#> // cpp_match
#> int cpp_match(double precision);
#> RcppExport SEXP sourceCpp_1_cpp_match(SEXP precisionSEXP) {
#> BEGIN_RCPP
#> Rcpp::RObject rcpp_result_gen;
#> Rcpp::RNGScope rcpp_rngScope_gen;
#> Rcpp::traits::input_parameter< double >::type precision(precisionSEXP);
#> rcpp_result_gen = Rcpp::wrap(cpp_match(precision));
#> return rcpp_result_gen;
#> END_RCPP
#> }
#>
#> Generated R functions
#> -------------------------------------------------------
#>
#> `.sourceCpp_1_DLLInfo` <- dyn.load('C:/Users/Utente/AppData/Local/Temp/RtmpE7A6zT/sourceCpp-x86_64-w64-mingw32-1.0.6/sourcecpp_39b04576172f/sourceCpp_2.dll')
#>
#> cpp_match <- Rcpp:::sourceCppFunction(function(precision = 1e-10) {}, FALSE, `.sourceCpp_1_DLLInfo`, 'sourceCpp_1_cpp_match')
#>
#> rm(`.sourceCpp_1_DLLInfo`)
#>
#> Building shared library
#> --------------------------------------------------------
#>
#> DIR: C:/Users/Utente/AppData/Local/Temp/RtmpE7A6zT/sourceCpp-x86_64-w64-mingw32-1.0.6/sourcecpp_39b04576172f
#>
#> C:/PROGRA~1/R/R-40~1.4/bin/x64/R CMD SHLIB -o "sourceCpp_2.dll" "file39b067833a46.cpp"
res <- vapply(
X = 10 ^ (-(0:16)),
FUN = cpp_match,
FUN.VALUE = integer(1)
)
#> prec_int is equal to: 1
#> prec_int is equal to: 10
#> prec_int is equal to: 100
#> prec_int is equal to: 1000
#> prec_int is equal to: 10000
#> prec_int is equal to: 100000
#> prec_int is equal to: 1000000
#> prec_int is equal to: 10000000
#> prec_int is equal to: 100000000
#> prec_int is equal to: 1000000000
#> prec_int is equal to: 10000000000
#> prec_int is equal to: 100000000000
#> prec_int is equal to: 1000000000000
#> prec_int is equal to: 10000000000000
#> prec_int is equal to: 100000000000000
#> prec_int is equal to: 1000000000000000
#> prec_int is equal to: 10000000000000000
res
#> [1] 1 10 100 1000 10000 100000
#> [7] 1000000 10000000 100000000 1000000000 1410065408 1215752192
#> [13] -727379968 1316134912 276447232 -1530494976 1874919424 Created on 2021-03-17 by the reprex package (v1.0.0) Session infosessioninfo::session_info()
#> - Session info ---------------------------------------------------------------
#> setting value
#> version R version 4.0.4 (2021-02-15)
#> os Windows 10 x64
#> system x86_64, mingw32
#> ui RTerm
#> language (EN)
#> collate Italian_Italy.1252
#> ctype Italian_Italy.1252
#> tz Europe/Berlin
#> date 2021-03-17
#>
#> - Packages -------------------------------------------------------------------
#> package * version date lib source
#> assertthat 0.2.1 2019-03-21 [1] CRAN (R 4.0.4)
#> backports 1.2.1 2020-12-09 [1] CRAN (R 4.0.3)
#> cli 2.3.1 2021-02-23 [1] CRAN (R 4.0.4)
#> crayon 1.4.1 2021-02-08 [1] CRAN (R 4.0.4)
#> digest 0.6.27 2020-10-24 [1] CRAN (R 4.0.4)
#> ellipsis 0.3.1 2020-05-15 [1] CRAN (R 4.0.4)
#> evaluate 0.14 2019-05-28 [1] CRAN (R 4.0.4)
#> fansi 0.4.2 2021-01-15 [1] CRAN (R 4.0.4)
#> fs 1.5.0 2020-07-31 [1] CRAN (R 4.0.4)
#> glue 1.4.2 2020-08-27 [1] CRAN (R 4.0.4)
#> highr 0.8 2019-03-20 [1] CRAN (R 4.0.4)
#> htmltools 0.5.1.1 2021-01-22 [1] CRAN (R 4.0.4)
#> knitr 1.31 2021-01-27 [1] CRAN (R 4.0.4)
#> lifecycle 1.0.0 2021-02-15 [1] CRAN (R 4.0.4)
#> magrittr 2.0.1 2020-11-17 [1] CRAN (R 4.0.4)
#> pillar 1.5.1 2021-03-05 [1] CRAN (R 4.0.4)
#> pkgconfig 2.0.3 2019-09-22 [1] CRAN (R 4.0.4)
#> purrr 0.3.4 2020-04-17 [1] CRAN (R 4.0.4)
#> Rcpp 1.0.6 2021-01-15 [1] CRAN (R 4.0.4)
#> reprex 1.0.0 2021-01-27 [1] CRAN (R 4.0.4)
#> rlang 0.4.10 2020-12-30 [1] CRAN (R 4.0.4)
#> rmarkdown 2.7 2021-02-19 [1] CRAN (R 4.0.4)
#> sessioninfo 1.1.1 2018-11-05 [1] CRAN (R 4.0.4)
#> stringi 1.5.3 2020-09-09 [1] CRAN (R 4.0.3)
#> stringr 1.4.0 2019-02-10 [1] CRAN (R 4.0.4)
#> styler 1.3.2 2020-02-23 [1] CRAN (R 4.0.4)
#> tibble 3.1.0 2021-02-25 [1] CRAN (R 4.0.4)
#> utf8 1.2.1 2021-03-12 [1] CRAN (R 4.0.4)
#> vctrs 0.3.6 2020-12-17 [1] CRAN (R 4.0.4)
#> withr 2.4.1 2021-01-26 [1] CRAN (R 4.0.4)
#> xfun 0.22 2021-03-11 [1] CRAN (R 4.0.4)
#> yaml 2.2.1 2020-02-01 [1] CRAN (R 4.0.3)
#>
#> [1] C:/Users/Utente/Documents/R/win-library/4.0
#> [2] C:/Program Files/R/R-4.0.4/library I have no idea about the drawbacks. |
Wow, thanks @agila5, those results on windows are really ... interesting. The short answer to your Q is, Yes, you can easily just use A slight longer answer: if you really wanted to ensure that all was truly okay back over in R, you could determine a precision value like this: xymax <- max (abs (st_coordinates (nodes)))
precision <- .Machine$integer.max / xymax
precision <- nchar (as.character (round (precision)))
extra_bit <- 2 # or whatever
precision <- 1 / 10 ^ (precision - extra_bit) But using |
Just to fix the bug for the time being I am now using |
No worries @luukvdmeer, time is precious. As said, the C++ code is all there, so happy to PR when you've got time to review it - just let me know. |
Describe the bug
There is a bug in
create_nodes_from_edges()
when the input coordinates are really really really close but not exactly "identical" (I'm not even sure how to define "identical" here).Reproducible example
The problem occurs in
create_nodes_from_edges()
and, more precisely, here:sfnetworks/R/utils.R
Lines 68 to 72 in eba28b7
In fact,
has only three rows. Please note that the last line of code was taken from
sfnetworks/R/utils.R
Line 81 in eba28b7
tidygraph
are wrong, creating an error. Hence, sooner or later I think we should remove the code that compares the equality of numerical values (which is also the problem flagged by @mpadge here).I know that it sounds like a pathological example, but I found the problem while running some R code for a GIS SE question.
Expected behavior
No error.
R Session Info
Session info
The text was updated successfully, but these errors were encountered: