Group By Transform in mo.ui.dataframe(df) does not return valid Polars code #3348

Closed
henryharbeck opened this issue Jan 6, 2025 · 1 comment
Labels
bug Something isn't working

Comments

@henryharbeck

Describe the bug

Group By Transform in mo.ui.dataframe(df) does not return valid Polars code.

Details are in the example below: the generated code aggregates the aliased output column (age_max) instead of the source column (age).

Environment

I am running in WASM, so I am unsure how to run marimo env.
Instead, here are the package versions I am using:

marimo.__version__: "0.10.8-dev3"
polars.__version__: "1.18.0"

Code to reproduce

import marimo as mo
import polars as pl

df = pl.DataFrame({"group": ["a", "a", "b"], "age": [10, 11, 12]})
mo.ui.dataframe(df)

This Transform
[Screenshot of the Group By transform configuration]

Produces this Polars code

df_next = df
df_next = df_next.group_by(["group"], maintain_order=True).agg([pl.col("age_max").max().alias("age_max_max")])

which raises polars.exceptions.ColumnNotFoundError: age_max

The correct Polars code (with no other stylistic adjustments) would be

df_next = df
df_next = df_next.group_by(["group"], maintain_order=True).agg([pl.col("age").max().alias("age_max")])
@henryharbeck added the bug label on Jan 6, 2025
mscolnick added a commit that referenced this issue Jan 6, 2025
# Fix Polars GroupBy Issue #3348

This PR fixes an issue where group by transformations in Polars were not
correctly referencing the original column names in the generated code.

## Changes
- Modified the code generation in `print_code.py` to use `pl.col()` for
group by columns
- Added test case `test_polars_groupby_alias` to verify proper column
name handling in group by transformations
- Ensures both group by and aggregation operations reference original
column names correctly
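
The actual change in `print_code.py` is not shown in this thread, so the following is only an illustrative sketch of the idea: build the aggregation from the original column name and derive the alias from it, rather than feeding the alias back in as the source. The function `groupby_agg_code` and its parameters are hypothetical and are not marimo's API.

```python
def groupby_agg_code(df_name, group_cols, agg_cols, aggregation):
    """Hypothetical generator: return a Polars group_by/agg statement as a string."""
    # Group keys are referenced through pl.col(), as described in the PR.
    keys = ", ".join(f'pl.col("{col}")' for col in group_cols)
    # Each aggregation reads the ORIGINAL column and aliases the result.
    aggs = ", ".join(
        f'pl.col("{col}").{aggregation}().alias("{col}_{aggregation}")'
        for col in agg_cols
    )
    return f"{df_name}.group_by([{keys}], maintain_order=True).agg([{aggs}])"


code = groupby_agg_code("df_next", ["group"], ["age"], "max")
```

Keeping the source name and the alias as two separate values in the generator makes it harder to reintroduce the `age_max_max` form reported in the issue.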

## Testing
Added a new test that:
- Creates a test DataFrame with "group" and "age" columns
- Applies a group by transformation with max aggregation
- Verifies the transformed DataFrame structure and values
- Checks that the generated code correctly uses `pl.col()` for both
group by and aggregation columns

## Link to Devin run
https://app.devin.ai/sessions/ba11f083aa6b4f63857d6f1fbe11ac00

---------

Co-authored-by: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com>
Co-authored-by: Myles Scolnick <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
@mscolnick
Contributor

Fixed by #3349
