Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
The purpose of this adapter is to showcase how you can write transforms that are agnostic of the dataframe type.

Assumptions for this plugin:

* You can only have one "backend"; you can't mix and match. That means you can't load some data in pandas and some in polars, I don't think. This is a narwhals limitation.
* This change uses the narwhals decorator. This assumes that non-pandas/polars values would be left alone by it. If not, we could just skip adding it if we don't detect a type.
* This makes the user choose what the return result builder is and then requires them to nest it in the narwhals result builder, which just converts the outputs to the backend that is being used.
* I think this is a good enough integration to get out. We'll likely tweak it and add more functionality as feedback comes in.

Squashed commits:

* Adds stub of NarwhalsAdapter

  Assumptions narwhals has (I believe):

  1. You can only have one "backend"; you can't mix and match. That means you can't load some data in pandas and some in polars, I don't think.
  2. This change uses the narwhals decorator. This assumes that non-pandas/polars values would be left alone by it. If not, we could just skip adding it if we don't detect a type. Otherwise we probably need a better example from narwhals.

* Adds one attempt at a result builder

  This makes the user choose what the return type is and then requires them to nest it in the narwhals result builder, which just converts the outputs to the backend that is being used.

* Adds narwhals plugin v1

  First version of narwhals support.

* Completes Narwhals example

  Adds README and notebook so that people can run this example easily. Also adds circleci tests.

* Adds missing dependency
* Fixes polars test for polars 1.0+
* Adds narwhals to integration docs
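The result-builder nesting described in the commit message can be pictured with a small, library-free sketch. Every name here (`Wrapped`, `DictResultBuilder`, `UnwrappingResultBuilder`) is hypothetical and only stands in for the real narwhals and Hamilton classes, whose source is not shown in this view; the point is just the shape of the pattern: unwrap each output to its native object first, then hand everything to the builder the user picked.

```python
from dataclasses import dataclass
from typing import Any, Dict


@dataclass
class Wrapped:
    """Hypothetical stand-in for a narwhals-wrapped dataframe or series."""

    native: Any


class DictResultBuilder:
    """Hypothetical user-chosen builder: returns the outputs as a plain dict."""

    def build_result(self, **outputs: Any) -> Dict[str, Any]:
        return dict(outputs)


class UnwrappingResultBuilder:
    """Hypothetical outer builder: unwraps outputs, then delegates to `inner`."""

    def __init__(self, inner: Any) -> None:
        self.inner = inner

    def build_result(self, **outputs: Any) -> Any:
        # Convert wrapped values back to native objects; leave others untouched.
        native = {
            name: value.native if isinstance(value, Wrapped) else value
            for name, value in outputs.items()
        }
        return self.inner.build_result(**native)


# The inner builder decides the final shape; the outer one only unwraps.
builder = UnwrappingResultBuilder(DictResultBuilder())
result = builder.build_result(group_by_mean=Wrapped([4.5, 6.5, 8.0]), example1=3)
print(result)
```

Keeping the conversion in the outer builder means the inner builder never needs to know that a wrapper layer was involved, which is presumably why the commit asks users to nest rather than replace their existing builder.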
Showing 14 changed files with 619 additions and 0 deletions.
@@ -0,0 +1,28 @@
# Narwhals

[Narwhals](https://narwhals-dev.github.io/narwhals/) is a library that aims
to unify expression across dataframe libraries. It is meant to be lightweight
and focuses on Python-first dataframe libraries.

This example shows how you can write dataframe-agnostic code
and then load pandas or polars data to use with it.

## Running the example

You can run the example with:

```bash
# cd examples/narwhals/
python example.py
```
This will run both variants one after the other.

Or run the notebook:

```bash
# cd examples/narwhals
jupyter notebook # pip install jupyter if you don't have it
```
Or you can open up the notebook in Colab:

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/dagworks-inc/hamilton/blob/main/examples/narwhals/notebook.ipynb)
@@ -0,0 +1,70 @@
import narwhals as nw
import pandas as pd
import polars as pl

from hamilton.function_modifiers import config, tag


@config.when(load="pandas")
def df__pandas() -> nw.DataFrame:
    return pd.DataFrame({"a": [1, 1, 2, 2, 3], "b": [4, 5, 6, 7, 8]})


@config.when(load="pandas")
def series__pandas() -> nw.Series:
    return pd.Series([1, 3])


@config.when(load="polars")
def df__polars() -> nw.DataFrame:
    return pl.DataFrame({"a": [1, 1, 2, 2, 3], "b": [4, 5, 6, 7, 8]})


@config.when(load="polars")
def series__polars() -> nw.Series:
    return pl.Series([1, 3])


@tag(nw_kwargs=["eager_only"])
def example1(df: nw.DataFrame, series: nw.Series, col_name: str) -> int:
    return df.filter(nw.col(col_name).is_in(series.to_numpy())).shape[0]


def group_by_mean(df: nw.DataFrame) -> nw.DataFrame:
    return df.group_by("a").agg(nw.col("b").mean()).sort("a")


if __name__ == "__main__":
    import __main__ as example

    from hamilton import base, driver
    from hamilton.plugins import h_narwhals, h_polars

    # pandas
    dr = (
        driver.Builder()
        .with_config({"load": "pandas"})
        .with_modules(example)
        .with_adapters(
            h_narwhals.NarwhalsAdapter(),
            h_narwhals.NarwhalsDataFrameResultBuilder(base.PandasDataFrameResult()),
        )
        .build()
    )
    r = dr.execute([example.group_by_mean, example.example1], inputs={"col_name": "a"})
    print(r)

    # polars
    dr = (
        driver.Builder()
        .with_config({"load": "polars"})
        .with_modules(example)
        .with_adapters(
            h_narwhals.NarwhalsAdapter(),
            h_narwhals.NarwhalsDataFrameResultBuilder(h_polars.PolarsDataFrameResult()),
        )
        .build()
    )
    r = dr.execute([example.group_by_mean, example.example1], inputs={"col_name": "a"})
    print(r)
    dr.display_all_functions("example.png")
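
The `__pandas` and `__polars` suffixes in the file above are what let `example1` and `group_by_mean` ask for plain `df` and `series`: with `@config.when`, the variant whose condition matches the driver config is exposed under the name before the double underscore. The sketch below imitates that selection in plain Python so the mechanism is easy to see. `when` and `resolve` are hypothetical helpers written for illustration only; they are not Hamilton's implementation, and the string return values stand in for real dataframes.

```python
from typing import Any, Callable, Dict, List

# Hypothetical registry of decorated functions (illustration only).
_REGISTRY: List[Callable[..., Any]] = []


def when(**conditions: Any) -> Callable[[Callable[..., Any]], Callable[..., Any]]:
    """Hypothetical stand-in for `config.when`: record conditions on the function."""

    def decorator(fn: Callable[..., Any]) -> Callable[..., Any]:
        fn.conditions = conditions
        _REGISTRY.append(fn)
        return fn

    return decorator


@when(load="pandas")
def df__pandas() -> str:
    return "pandas frame"


@when(load="polars")
def df__polars() -> str:
    return "polars frame"


def resolve(config: Dict[str, Any]) -> Dict[str, Callable[..., Any]]:
    """Keep variants whose conditions match, keyed by the name before '__'."""
    nodes: Dict[str, Callable[..., Any]] = {}
    for fn in _REGISTRY:
        if all(config.get(key) == value for key, value in fn.conditions.items()):
            nodes[fn.__name__.split("__")[0]] = fn
    return nodes


pandas_nodes = resolve({"load": "pandas"})
polars_nodes = resolve({"load": "polars"})
print(pandas_nodes["df"](), "|", polars_nodes["df"]())
```

Because only one variant survives per config, each driver in the example sees a single backend, which lines up with the one-backend assumption stated in the commit message.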