Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

datetime objects in row_filter expressions are not casted and raise an error #1456

Open
jayceslesar opened this issue Dec 20, 2024 · 3 comments

Comments

@jayceslesar
Copy link
Contributor

Feature Request / Improvement

in the following row_filter in a scan() call,

row_filter=And(
            GreaterThanOrEqual("timestamp_received", start_time),
            LessThan("timestamp_received", end_time),
        ),

the start_time and end_time variables are datetime objects. When running a scan with that row_filter I end up with the error of

TypeError: Invalid literal value: datetime.datetime(2024, 11, 13, 11, 35, 28, 730000, tzinfo=datetime.timezone.utc)

Instead, the expressions should correctly cast the datetime object instead of expecting an isoformat string, which is the current behavior

@kevinjqliu
Copy link
Contributor

Thanks for reporting this issue!
I believe this is the relevant code

def literal(value: L) -> Literal[L]:
"""
Construct an Iceberg Literal based on Python primitive data type.
Args:
value (Python primitive type): the value to be associated with literal.
Example:
from pyiceberg.expressions.literals import literal.
>>> literal(123)
LongLiteral(123)
"""
if isinstance(value, float):
return DoubleLiteral(value) # type: ignore
elif isinstance(value, bool):
return BooleanLiteral(value)
elif isinstance(value, int):
return LongLiteral(value)
elif isinstance(value, str):
return StringLiteral(value)
elif isinstance(value, UUID):
return UUIDLiteral(value.bytes) # type: ignore
elif isinstance(value, bytes):
return BinaryLiteral(value)
elif isinstance(value, Decimal):
return DecimalLiteral(value)
else:
raise TypeError(f"Invalid literal value: {repr(value)}")

Looks like we don't take into account python's datetime object.

cc @Fokko do you have context around this?

@jayceslesar
Copy link
Contributor Author

Any thoughts here? there are a few ways to implement this from what I can tell, the easiest would be to just add a clause to the literal function that returns TimestampLiteral(datetime_to_micros(value)) if value is a datetime

@kunnapatt
Copy link
Contributor

@jayceslesar
Thanks I got the same problem, but when I followed your comment, It worked.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants
@kevinjqliu @kunnapatt @jayceslesar and others