Describe the bug
The rows of the Adult dataset end with a dot: '...,<=50K.'
When handled naively, the dot ends up in the label column, which is undesirable. To prevent this, the preprocessing function of the Adult class (Adult.py) contains the following code:
df["Target"] = df["Target"].str.replace(r".", "", regex=True)
In my case, however, this does not correct the issue.
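One possible explanation, offered as an assumption rather than a confirmed diagnosis: in regular-expression syntax an unescaped `.` matches any character, and how pandas treats a single-character pattern with `regex=True` appears to have differed between releases (older versions reportedly treated it as a literal string, while pandas 2.0 and later treat it as a regex). The sketch below uses a small made-up Series, not the real Adult data, to show the wildcard semantics with Python's `re` module and two replacements that target only a literal dot regardless of version.

```python
import re

import pandas as pd

# Hypothetical sample labels mimicking rows that end with a dot.
target = pd.Series(["<=50K.", ">50K."], name="Target")

# Plain regex semantics: an unescaped "." matches every character,
# so substituting it with "" leaves an empty string.
wildcard_result = re.sub(r".", "", "<=50K.")

# Option 1: escape the dot and anchor it to the end of the string.
escaped = target.str.replace(r"\.$", "", regex=True)

# Option 2: treat the pattern as a literal string (no regex involved).
literal = target.str.replace(".", "", regex=False)

print(repr(wildcard_result))
print(escaped.tolist())
print(literal.tolist())
```

Either option removes only the dot. The anchored variant additionally leaves any dot that is not at the end of the value untouched.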
To Reproduce
import pandas as pd

# Adult is the dataset class defined in Adult.py
adult = Adult(
    test_path="local_path_to_test_adult.csv",
    train_path="local_path_to_train_adult.csv",
    preprocess=True,
)
adult_test_data = adult.inverse_preprocess(adult.test_data)
adult_data = pd.concat(
    [adult_test_data,
     adult.test_labels[">50K"].to_frame(name="labels").astype("float32")],
    axis=1,
)
--> This raises the error.
Expected behavior
I would expect the code to remove the trailing dot from the label values.
Environment