Export is_polars_dtype() #927

Merged
merged 12 commits into from
Mar 17, 2024
2 changes: 1 addition & 1 deletion DESCRIPTION
@@ -115,7 +115,7 @@ Collate:
'sql.R'
'vctrs.R'
'zzz.R'
-Config/rextendr/version: 0.3.1
+Config/rextendr/version: 0.3.1.9000
VignetteBuilder: knitr
Config/polars/LibVersion: 0.38.2
Config/polars/RustToolchainVersion: nightly-2024-02-23
1 change: 1 addition & 0 deletions NAMESPACE
@@ -258,6 +258,7 @@ export(as_polars_df)
export(as_polars_lf)
export(as_polars_series)
export(is_polars_df)
+export(is_polars_dtype)
export(is_polars_lf)
export(is_polars_series)
export(pl)
1 change: 1 addition & 0 deletions NEWS.md
@@ -16,6 +16,7 @@
- New functions `pl$datetime()`, `pl$date()`, and `pl$time()` to easily create
Expr of class datetime, date, and time via columns and literals (#918).
- New function `pl$arg_where()` to get the indices that match a condition (#922).
+- New function `is_polars_dtype()` (#927).

## Polars R Package 0.15.1

12 changes: 6 additions & 6 deletions R/dataframe__frame.R
@@ -279,10 +279,10 @@ pl_DataFrame = function(..., make_names_unique = TRUE, schema = NULL) {
# no args create empty DataFrame
if (length(largs) == 0L) {
if (!is.null(schema)) {
-largs = lapply(seq_along(schema), \(x) {
+out = lapply(seq_along(schema), \(x) {
pl$lit(numeric(0))$cast(schema[[x]])$alias(names(schema)[x])
-})
-out = pl$select(largs)
+}) |>
+pl$select()
} else {
out = .pr$DataFrame$default()
}
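The schema branch in this hunk builds a zero-row DataFrame by casting an empty literal to each requested dtype. A minimal sketch of what a caller would see, assuming the polars R package at roughly this version is installed; the column names `a` and `b` are made up for illustration:

```r
library(polars)

# Hypothetical schema: the column names "a" and "b" are illustrative only.
schema = list(a = pl$Int64, b = pl$Float64)

# No data arguments, only a schema: this takes the branch shown in the hunk.
empty_df = pl$DataFrame(schema = schema)

# The frame has the requested columns and no rows.
empty_df$columns
empty_df$height
```

The refactor only changes how the intermediate list of expressions is handed to `pl$select()`, so the result should be the same before and after the change.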
@@ -735,7 +735,7 @@ DataFrame_drop_in_place = function(name) {
#' @description Check if two DataFrames are equal.
#'
#' @param other DataFrame to compare with.
-#' @return A boolean.
+#' @return A logical value
#' @keywords DataFrame
#' @examples
#' dat1 = pl$DataFrame(iris)
@@ -962,7 +962,7 @@ DataFrame_to_data_frame = function(..., int64_conversion = polars_options()$int6

#' Return Polars DataFrame as a list of vectors
#'
-#' @param unnest_structs Boolean. If `TRUE` (default), then `$unnest()` is applied
+#' @param unnest_structs Logical. If `TRUE` (default), then `$unnest()` is applied
#' on any struct column.
#' @inheritParams DataFrame_to_data_frame
#'
@@ -1682,7 +1682,7 @@ DataFrame_describe = function(percentiles = c(.25, .75), interpolation = "neares
#' @title Glimpse values in a DataFrame
#' @keywords DataFrame
#' @param ... not used
-#' @param return_as_string Boolean (default `FALSE`). If `TRUE`, return the
+#' @param return_as_string Logical (default `FALSE`). If `TRUE`, return the
#' output as a string.
#' @return DataFrame
#' @examples
10 changes: 0 additions & 10 deletions R/datatype.R
@@ -85,16 +85,6 @@ print.RPolarsDataType = function(x, ...) {
"!=.RPolarsDataType" = function(e1, e2) e1$ne(e2)


-#' check if x is a valid RPolarsDataType
-#' @name is_polars_dtype
-#' @noRd
-#' @param x a candidate
-#' @return a list DataType with an inner DataType
-#' @examples .pr$env$is_polars_dtype(pl$Int64)
-is_polars_dtype = function(x, include_unknown = FALSE) {
-inherits(x, "RPolarsDataType") && (x != pl$Unknown || include_unknown)
-}
-
#' check if x is a valid RPolarsDataType
#' @name same_outer_datatype
#' @param lhs an RPolarsDataType
20 changes: 10 additions & 10 deletions R/expr__expr.R
@@ -563,10 +563,10 @@ Expr_alias = use_extendr_wrapper

#' Apply logical AND on a column
#'
-#' Check if all boolean values in a Boolean column are `TRUE`. This method is an
+#' Check if all values in a Boolean column are `TRUE`. This method is an
#' expression - not to be confused with `pl$all()` which is a function to select
#' all columns.
-#' @param drop_nulls Boolean. Default TRUE, as name says.
+#' @param drop_nulls Logical. Default TRUE, as name says.
#' @return Boolean literal
#' @examples
#' pl$DataFrame(
@@ -586,7 +586,7 @@ Expr_all = function(drop_nulls = TRUE) {
#' Apply logical OR on a column
#'
#' Check if any boolean value in a Boolean column is `TRUE`.
-#' @param drop_nulls Boolean. Default TRUE, as name says.
+#' @param drop_nulls Logical. Default TRUE, as name says.
#' @return Boolean literal
#' @examples
#' pl$DataFrame(
@@ -736,7 +736,7 @@ construct_ProtoExprArray = function(...) {
#' to inform schema of the actual return type of the R function. Setting this wrong
#' could theoretically have some downstream implications to the query.
#' @param agg_list Aggregate list. Map from vector to group in group_by context.
-#' @param in_background Boolean. Whether to execute the map in a background R
+#' @param in_background Logical. Whether to execute the map in a background R
#' process. Combined with setting e.g. `options(polars.rpool_cap = 4)` it can speed
#' up some slow R functions as they can run in parallel R sessions. The
#' communication speed between processes is quite slower than between threads.
@@ -839,7 +839,7 @@ Expr_map_batches = function(f, output_type = NULL, agg_list = FALSE, in_backgrou
#' so each element will itself be a Series. Therefore, depending on the context,
#' requirements for function differ:
#' * in `$select()` or `$with_columns()` (selection context), the function must
-#' operate on R scalar values. Polars will convert each element into an R value
+#' operate on R values of length 1. Polars will convert each element into an R value
#' and pass it to the function. The output of the user function will be converted
#' back into a polars type (the return type must match, see argument `return_type`).
#' Using `$map_elements()` in this context should be avoided as a `lapply()`
@@ -871,7 +871,7 @@ Expr_map_batches = function(f, output_type = NULL, agg_list = FALSE, in_backgrou
#' pl$DataFrame(iris)$group_by("Species")$agg(e_sum, e_head)
#'
#' # apply a function on each value (should be avoided): here the input is an R
-#' # scalar
+#' # value of length 1
#' # select only Float64 columns
#' my_selection = pl$col(pl$dtypes$Float64)
#'
@@ -1579,7 +1579,7 @@ Expr_sort_by = function(by, descending = FALSE) {

#' Gather values by index
#'
-#' @param indices R scalar/vector or Series, or Expr that leads to a Series of
+#' @param indices R vector or Series, or Expr that leads to a Series of
#' dtype Int64. (0-indexed)
#' @return Expr
#' @examples
@@ -3387,8 +3387,8 @@ Expr_rolling = function(
#' the names are the old values and the values are the replacements. Note that
#' if old values are numeric, the names must be wrapped in backticks;
#' * an Expr
-#' @param new Either a scalar, a vector of same length as `old` or an Expr. If
-#' missing, `old` must be a named list.
+#' @param new Either a vector of length 1, a vector of same length as `old` or
+#' an Expr. If missing, `old` must be a named list.
#' @param default The default replacement if the value is not in `old`. Can be
#' an Expr. If `NULL` (default), then the value doesn't change.
#' @param return_dtype The data type of the resulting expression. If set to
@@ -3399,7 +3399,7 @@ Expr_rolling = function(
#' @examples
#' df = pl$DataFrame(a = c(1, 2, 2, 3))
#'
-#' # "old" and "new" can take either scalars or vectors of same length
+#' # "old" and "new" can take vectors of length 1 or of same length
#' df$with_columns(replaced = pl$col("a")$replace(2, 100))
#' df$with_columns(replaced = pl$col("a")$replace(c(2, 3), c(100, 200)))
#'
2 changes: 1 addition & 1 deletion R/expr__list.R
@@ -400,7 +400,7 @@ ExprList_tail = function(n = 5L) {
#' @param fields If the name and number of the desired fields is known in
#' advance, a list of field names can be given, which will be assigned by
#' index. Otherwise, to dynamically assign field names, a custom R function
-#' that takes an R scalar double and outputs a string value can be used. If
+#' that takes an R double and outputs a string value can be used. If
#' `NULL` (default), fields will be `field_0`, `field_1` ... `field_n`.

#' @param upper_bound A `LazyFrame` needs to know the schema at all time. The
4 changes: 2 additions & 2 deletions R/expr__meta.R
@@ -4,7 +4,7 @@
#' counterpart [`$meta$neq()`][ExprMeta_neq].
#'
#' @param other Expr to compare with
-#' @return A boolean: `TRUE` if equal, `FALSE` otherwise
+#' @return A logical value
#' @examples
#' # three naive expression literals
#' e1 = pl$lit(40) + 2
@@ -30,7 +30,7 @@ ExprMeta_eq = function(other) {
#' the counterpart [`$meta$eq()`][ExprMeta_eq].
#'
#' @inheritParams ExprMeta_eq
-#' @return A boolean: `TRUE` if different, `FALSE` otherwise
+#' @return A logical value
#' @examples
#' # three naive expression literals
#' e1 = pl$lit(40) + 2
2 changes: 1 addition & 1 deletion R/functions__lazy.R
@@ -9,7 +9,7 @@
#' `pl$lit(NULL)` translates into a polars `null`.
#'
#' @examples
-#' # scalars to literal, explicit `pl$lit(42)` implicit `+ 2`
+#' # values to literal, explicit `pl$lit(42)` implicit `+ 2`
#' pl$col("some_column") / pl$lit(42) + 2
#'
#' # vector to literal explicitly via Series and back again
24 changes: 21 additions & 3 deletions R/is_polars.R
@@ -1,6 +1,6 @@
#' Test if the object is a polars DataFrame
#'
-#' These functions test if the object is a polars DataFrame.
+#' This function tests if the object is a polars DataFrame.
#' @param x An object
#' @return A logical value
#' @export
@@ -15,7 +15,7 @@ is_polars_df = function(x) {

#' Test if the object is a polars LazyFrame
#'
-#' These functions test if the object is a polars LazyFrame.
+#' This function tests if the object is a polars LazyFrame.
#' @inherit is_polars_df params return
#' @export
#' @examples
@@ -29,7 +29,7 @@ is_polars_lf = function(x) {

#' Test if the object is a polars Series
#'
-#' These functions test if the object is a polars Series.
+#' This function tests if the object is a polars Series.
#' @inherit is_polars_df params return
#' @export
#' @examples
@@ -39,3 +39,21 @@ is_polars_lf = function(x) {
is_polars_series = function(x) {
inherits(x, "RPolarsSeries")
}
+
+#' Test if the object is a polars DataType
+#'
+#' @param x An object to be tested.
+#' @param include_unknown If `FALSE` (default), `pl$Unknown` is considered as
+#' an invalid datatype.
+#'
+#' @export
+#' @return A logical value
+#'
+#' @examples
+#' is_polars_dtype(pl$Int64)
+#' is_polars_dtype(mtcars)
+#' is_polars_dtype(pl$Unknown)
+#' is_polars_dtype(pl$Unknown, include_unknown = TRUE)
+is_polars_dtype = function(x, include_unknown = FALSE) {
+inherits(x, "RPolarsDataType") && (x != pl$Unknown || include_unknown)
+}
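
One way the newly exported predicate might be used is to validate a user-supplied schema before passing it on. The sketch below assumes the polars R package is attached; `check_schema()` and the names in `candidate` are hypothetical and not part of the package:

```r
library(polars)

# Hypothetical helper (not part of polars): flag which entries of a
# named list are usable polars data types.
check_schema = function(schema, include_unknown = FALSE) {
  vapply(
    schema,
    is_polars_dtype,
    logical(1),
    include_unknown = include_unknown
  )
}

# Illustrative input: one valid dtype, one non-dtype, and pl$Unknown.
candidate = list(id = pl$Int64, label = "not a dtype", other = pl$Unknown)

# pl$Unknown is rejected unless include_unknown = TRUE.
strict = check_schema(candidate)
lenient = check_schema(candidate, include_unknown = TRUE)
```

Because the predicate short-circuits on `inherits()`, objects that are not polars data types, such as the string above, never reach the `!=` comparison against `pl$Unknown`.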
30 changes: 15 additions & 15 deletions R/lazyframe__lazy.R
@@ -361,23 +361,23 @@ LazyFrame_get_optimization_toggle = function() {
#' @title Configure optimization toggles
#' @description Configure the optimization toggles for the lazy query
#' @keywords LazyFrame
-#' @param type_coercion Boolean. Coerce types such that operations succeed and
+#' @param type_coercion Logical. Coerce types such that operations succeed and
#' run on minimal required memory.
-#' @param predicate_pushdown Boolean. Applies filters as early as possible at
+#' @param predicate_pushdown Logical. Applies filters as early as possible at
#' scan level.
-#' @param projection_pushdown Boolean. Select only the columns that are needed
+#' @param projection_pushdown Logical. Select only the columns that are needed
#' at the scan level.
-#' @param simplify_expression Boolean. Various optimizations, such as constant
+#' @param simplify_expression Logical. Various optimizations, such as constant
#' folding and replacing expensive operations with faster alternatives.
-#' @param slice_pushdown Boolean. Only load the required slice from the scan
+#' @param slice_pushdown Logical. Only load the required slice from the scan
#' level. Don't materialize sliced outputs (e.g. `join$head(10)`).
-#' @param comm_subplan_elim Boolean. Will try to cache branching subplans that
+#' @param comm_subplan_elim Logical. Will try to cache branching subplans that
#' occur on self-joins or unions.
-#' @param comm_subexpr_elim Boolean. Common subexpressions will be cached and
+#' @param comm_subexpr_elim Logical. Common subexpressions will be cached and
#' reused.
-#' @param streaming Boolean. Run parts of the query in a streaming fashion
+#' @param streaming Logical. Run parts of the query in a streaming fashion
#' (this is in an alpha state).
-#' @param eager Boolean. Run the query eagerly.
+#' @param eager Logical. Run the query eagerly.
#' @return LazyFrame with specified optimization toggles
#' @examples
#' pl$LazyFrame(mtcars)$set_optimization_toggle(type_coercion = FALSE)
@@ -410,12 +410,12 @@ LazyFrame_set_optimization_toggle = function(
#' DataFrame
#' @inheritParams LazyFrame_set_optimization_toggle
#' @param ... Ignored.
-#' @param no_optimization Boolean. Sets the following parameters to `FALSE`:
+#' @param no_optimization Logical. Sets the following parameters to `FALSE`:
#' `predicate_pushdown`, `projection_pushdown`, `slice_pushdown`,
#' `comm_subplan_elim`, `comm_subexpr_elim`.
-#' @param inherit_optimization Boolean. Use existing optimization settings
+#' @param inherit_optimization Logical. Use existing optimization settings
#' regardless the settings specified in this function call.
-#' @param collect_in_background Boolean. Detach this query from R session.
+#' @param collect_in_background Logical. Detach this query from R session.
#' Computation will start in background. Get a handle which later can be converted
#' into the resulting DataFrame. Useful in interactive mode to not lock R session.
#' @details
@@ -546,7 +546,7 @@ LazyFrame_collect_in_background = function() {
#' * "gzip": min-level: 0, max-level: 10.
#' * "brotli": min-level: 0, max-level: 11.
#' * "zstd": min-level: 1, max-level: 22.
-#' @param statistics Boolean. Whether compute and write column statistics.
+#' @param statistics Logical. Whether compute and write column statistics.
#' This requires extra compute.
#' @param row_group_size `NULL` or Integer. Size of the row groups in number of
#' rows. If `NULL` (default), the chunks of the DataFrame are used. Writing in
@@ -1259,10 +1259,10 @@ LazyFrame_join = function(
#' @param by Column(s) to sort by. Can be character vector of column names,
#' a list of Expr(s) or a list with a mix of Expr(s) and column names.
#' @param ... More columns to sort by as above but provided one Expr per argument.
-#' @param descending Boolean. Sort in descending order (default is `FALSE`). This must be
+#' @param descending Logical. Sort in descending order (default is `FALSE`). This must be
#' either of length 1 or a logical vector of the same length as the number of
#' Expr(s) specified in `by` and `...`.
-#' @param nulls_last Boolean. Place `NULL`s at the end? Default is `FALSE`.
+#' @param nulls_last Logical. Place `NULL`s at the end? Default is `FALSE`.
#' @param maintain_order Whether the order should be maintained if elements are
#' equal. If `TRUE`, streaming is not possible and performance might be worse
#' since this requires a stable search.
2 changes: 1 addition & 1 deletion R/polars_options.R
@@ -286,7 +286,7 @@ pl_disable_string_cache = function() {
#' This function simply checks if the global string cache is active.
#'
#' @keywords options
-#' @return A boolean
+#' @return A logical value
#' @seealso
#' [`pl$with_string_cache`][pl_with_string_cache]
#' [`pl$enable_string_cache`][pl_enable_string_cache]