Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: generic DataFrame #293

Merged
merged 11 commits into from
Nov 21, 2024

Conversation

scarf005
Copy link
Contributor

@scarf005 scarf005 commented Nov 14, 2024

image

resolves #292. please check individual commits for detail.

@scarf005 scarf005 force-pushed the feat/generic-dataframes branch 2 times, most recently from ee1b921 to adf2c51 Compare November 14, 2024 13:26
@scarf005 scarf005 force-pushed the feat/generic-dataframes branch from 01bf949 to 0b83d6c Compare November 18, 2024 13:25
@scarf005 scarf005 force-pushed the feat/generic-dataframes branch from 0b83d6c to e13e1f8 Compare November 18, 2024 13:41
@@ -1314,8 +1314,8 @@ describe("dataframe", () => {
]);
expect(actual).toFrameEqual(expected);
});
test("pivot", () => {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason for removing this? Thx

Copy link
Contributor Author

@scarf005 scarf005 Nov 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

gah, accidently removed when adding scopes, will amend and push again
EDIT: force-pushed on a675894 (#293)


df = pl.DataFrame({
{
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason for these extra { } ? thx

Copy link
Contributor Author

@scarf005 scarf005 Nov 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

existing tests were failing as the type of DataFrame got more specific, so reassignment was failing due to type mismatch. thus i changed usage of let to const, and added {} for scope to prevent name collision of df, actual and expected. the changes are easier to see with whitespace disabled: https://github.com/pola-rs/nodejs-polars/pull/293/files?w=1

@scarf005 scarf005 force-pushed the feat/generic-dataframes branch from e13e1f8 to a675894 Compare November 18, 2024 14:45
@scarf005 scarf005 requested a review from Bidek56 November 18, 2024 15:39
Copy link
Collaborator

@universalmind303 universalmind303 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks good, but I think we should update some of the docstrings to include examples with the new type system.

@scarf005
Copy link
Contributor Author

Overall looks good, but I think we should update some of the docstrings to include examples with the new type system.

applied in ed11909 (#293) and 1cf2275 (#293).

@universalmind303
Copy link
Collaborator

thanks @scarf005

@universalmind303 universalmind303 merged commit d806f3c into pola-rs:main Nov 21, 2024
12 checks passed
@scarf005 scarf005 deleted the feat/generic-dataframes branch November 21, 2024 00:04
universalmind303 pushed a commit that referenced this pull request Dec 17, 2024
Adding generic types to a few more methods beyond what was added in #293
by @scarf005

Focusing mostly on adding identity types to methods which I believe
don’t change the original type of the dataframe. I added “identity” type
signatures to the following methods:
> extend, fillNull, filter, interpolate, limit, max, mean, median, min,
quantile, rechunk, shiftAndFill, shrinkToFit, slice, sort, std, sum,
tail, unique, var, vstack, where, upsample

These previously returned `DataFrame<any>`, even when called on a
well-typed DataFrame, but now return `DataFrame<T>` (the original type)

---

I also added better types for a few slightly more complex ones:
- map
- improved return type based on the function passed, but unimproved
parameter type
- nullCount
- toRecords
- toSeries
- for now, returning a broad union type, rather than identifying the
specific column by index
- withColumn

---

Along the way, I added minor fixes for the types of:
1. `pl.intRange`
[[1]](890bf21)
which had overloads in the wrong order leading to incorrect return
types, and
2. the `pl.Series(name, values, dtype)` constructor
[[2]](a2635bd),
whose strongly-typed overload was failing to apply in simple cases like
`pl.Series("index", [0, 1, 2, 3, 4], pl.Int64)` when the input array
used `number`s instead of `BigInt`s
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

generic signature for dataframes
3 participants