171 implement multi table helper functions (second round) #248

Merged
merged 14 commits into from
Oct 23, 2024

folmos-at-orange

No description provided.

@folmos-at-orange folmos-at-orange force-pushed the 171-implement-multi-table-helper-functions branch 2 times, most recently from a2d8df1 to 8af29c8 Compare October 8, 2024 13:08
@folmos-at-orange folmos-at-orange self-assigned this Oct 15, 2024
@folmos-at-orange folmos-at-orange force-pushed the 171-implement-multi-table-helper-functions branch 2 times, most recently from 8b0b70b to d5a9366 Compare October 22, 2024 12:38
khiops/utils/helpers.py Outdated Show resolved Hide resolved

@popescu-v popescu-v left a comment

See comments.

@folmos-at-orange folmos-at-orange force-pushed the 171-implement-multi-table-helper-functions branch 3 times, most recently from 82d9b1d to 31208f5 Compare October 23, 2024 10:10
- Move the dataset spec check methods out of the Dataset class
- Simplify the messages of the aforementioned check errors
  - In particular, eliminate all references to `X` or `y`
- Add a few new tests to `tests/test_dataset_errors.py`
- Standardize the pattern of the `tests/test_dataset_errors.py` tests
Also:
- Simplify the dictionary dataset tests
- Add exceptions to dictionary dataset fixtures
- Use a fixed seed for the generated data
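
As a rough illustration of the fixed-seed point above, here is a minimal sketch that uses only the standard library. The function name `generate_rows`, the column names, and the default seed value are hypothetical and are not taken from the khiops test suite.

```python
import random


def generate_rows(n_rows, seed=1234):
    """Hypothetical generator of test rows; the same seed gives the same data."""
    # A dedicated Random instance avoids touching the global random state
    rng = random.Random(seed)
    return [
        {"id": i, "value": rng.random(), "label": rng.choice(["a", "b"])}
        for i in range(n_rows)
    ]
```

Two calls with the same seed return identical rows, which keeps test failures reproducible across runs.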
Before this commit, input tables needed to have the same number of
columns, with the same names and types, as the model dictionary. The
columns also needed to be in the same order.

Now the conditions for the predict* and transform methods are the
following:
- Columns must have the same names as in the model dictionary,
  regardless of their order in the input table.
  - An additional flexibility for supervised models: the target column
    may be present in the input table.
- The input column types must match those of the model, with one
  exception:
  - If a given column has Numerical type in the input but is
    Categorical in the model, then it is coerced to categorical with a
    warning.
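
The conditions above can be sketched roughly as follows. This is not the code from this pull request: `align_input_columns`, its parameters, and the plain-dict table representation are assumptions made for illustration, and the sketch assumes the model spec does not list the target column.

```python
import warnings


def align_input_columns(table, model_types, target_name=None):
    """Hypothetical helper: align an input table with a model's column spec.

    table: dict mapping column name -> list of values
    model_types: dict mapping column name -> "Numerical" or "Categorical"
    """
    # The target column is tolerated in the input table and ignored
    input_names = [name for name in table if name != target_name]

    # Names are compared as sets, so the input column order is irrelevant
    missing = set(model_types) - set(input_names)
    unexpected = set(input_names) - set(model_types)
    if missing or unexpected:
        raise ValueError(
            f"Column mismatch: missing={sorted(missing)}, "
            f"unexpected={sorted(unexpected)}"
        )

    aligned = {}
    for name, model_type in model_types.items():
        values = table[name]
        is_numerical = all(
            isinstance(v, (int, float)) and not isinstance(v, bool)
            for v in values
        )
        if model_type == "Categorical" and is_numerical:
            # The single allowed type mismatch: coerce with a warning
            warnings.warn(
                f"Column '{name}' is numerical in the input but Categorical "
                "in the model: coercing to categorical"
            )
            values = [str(v) for v in values]
        elif model_type == "Numerical" and not is_numerical:
            raise TypeError(f"Column '{name}' must be numerical")
        aligned[name] = values
    return aligned
```

The result follows the column order of the model spec, so downstream code does not depend on the order of the input.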
- Remove `target_column_type` and `target_column_dtype` members
- Make `is_in_memory` and `is_multitable` properties
- Minor changes in comments and renamings
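
For readers unfamiliar with the pattern, turning such members into properties might look like the sketch below. `DatasetSketch` and its attributes are hypothetical stand-ins, not the khiops `Dataset` class, and the rule that a `str` table denotes a file path is an assumption made for the example.

```python
class DatasetSketch:
    """Hypothetical dataset with a main table and optional secondary tables."""

    def __init__(self, main_table, secondary_tables=()):
        self.main_table = main_table
        self.secondary_tables = list(secondary_tables)

    @property
    def is_in_memory(self):
        # Assumption: file-based tables are given as str paths
        return not isinstance(self.main_table, str)

    @property
    def is_multitable(self):
        # Derived from the current state instead of being stored separately
        return len(self.secondary_tables) > 0
```

Computing the flags on access means they cannot drift out of sync with the tables they describe.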
@folmos-at-orange folmos-at-orange force-pushed the 171-implement-multi-table-helper-functions branch from 31208f5 to d386853 Compare October 23, 2024 11:55
@folmos-at-orange folmos-at-orange force-pushed the 171-implement-multi-table-helper-functions branch from d386853 to 863e189 Compare October 23, 2024 12:09

@popescu-v popescu-v left a comment

LGTM.

@folmos-at-orange folmos-at-orange merged commit 8147793 into dev Oct 23, 2024
30 checks passed
@folmos-at-orange folmos-at-orange deleted the 171-implement-multi-table-helper-functions branch October 23, 2024 16:21
2 participants