Add coerce_float option to pl.from_arrow #3761
Comments
If you need it, you can cast the decimal columns to float64 on the pyarrow table, using pyarrow, before converting it to a polars dataframe. It might get supported natively in the future.
Polars has some support for Decimal now. It still converts pyarrow decimals to float64 by default, but conversion to Decimal can be enabled with the POLARS_ACTIVATE_DECIMAL environment variable:
In [40]: tbl = pa.table(
...: {
...: "a": pa.array([1, 2, 3, 4, 5], pa.decimal128(38, 2)),
...: "b": pa.array([1, 2, 3, 4, 5], pa.int64()),
...: }
...: )
In [41]: tbl.schema
Out[41]:
a: decimal128(38, 2)
b: int64
In [42]: pl.from_arrow(tbl)
Out[42]:
shape: (5, 2)
┌─────┬─────┐
│ a   ┆ b   │
│ --- ┆ --- │
│ f64 ┆ i64 │
╞═════╪═════╡
│ 1.0 ┆ 1   │
│ 2.0 ┆ 2   │
│ 3.0 ┆ 3   │
│ 4.0 ┆ 4   │
│ 5.0 ┆ 5   │
└─────┴─────┘
In [8]: import os
In [9]: os.environ["POLARS_ACTIVATE_DECIMAL"] = "1"
In [23]: pl.from_arrow(tbl)
Out[23]:
shape: (5, 2)
┌────────────────┬─────┐
│ a              ┆ b   │
│ ---            ┆ --- │
│ decimal[.38,2] ┆ i64 │
╞════════════════╪═════╡
│ 1              ┆ 1   │
│ 2              ┆ 2   │
│ 3              ┆ 3   │
│ 4              ┆ 4   │
│ 5              ┆ 5   │
└────────────────┴─────┘
Or use the config option instead of the environment variable: pl.Config.activate_decimals(True)
Closing as it seems resolved now.
Currently I can't read in Arrow data with decimals. If adding support isn't easy, another, perhaps easier/quicker option is adding a coerce_float flag that will convert them to floating point, similar to what Pandas does [1]. It's kinda hacky, but I need something now 😭 and coercing is totally acceptable for my use case.
[1] https://pandas.pydata.org/docs/reference/api/pandas.read_sql.html#pandas.read_sql
Thanks y'all! This is an amazing project!