When I use PyIceberg to scan the empty Iceberg table described below into a DuckDB connection, it works and returns an empty DataFrame, as expected.
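For reference, the PyIceberg path that works looks roughly like the sketch below. It assumes `table` is a `pyiceberg.table.Table` already loaded from a catalog; the helper name `scan_with_pyiceberg` and the default view name `iceberg_tbl` are made up for illustration and are not part of either library.

```python
def scan_with_pyiceberg(table, view_name="iceberg_tbl"):
    """Read an Iceberg table through PyIceberg and return it as a DataFrame.

    `table` is assumed to be a pyiceberg.table.Table. `view_name` is a
    hypothetical name under which the scan result is registered in DuckDB.
    """
    # PyIceberg resolves the schema and data files itself, then registers the
    # scan result in an in-memory DuckDB connection and returns that connection.
    con = table.scan().to_duckdb(table_name=view_name)
    # Query the registered table like any other DuckDB relation.
    return con.sql(f"SELECT * FROM {view_name}").fetchdf()
```

Because PyIceberg does the metadata parsing here, DuckDB's iceberg extension is never involved, which is presumably why this path does not hit the errors shown further down.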
However, when I query it directly with DuckDB like this:
duckdb.sql(f"""INSTALL iceberg; LOAD iceberg; SELECT * FROM iceberg_scan('{table.metadata_location}', skip_schema_inference=True)""").fetchdf()
I get the following error:
InternalException: INTERNAL Error: Value::LIST without providing a child-type requires a non-empty list of values. Use Value::LIST(child_type, list) instead.
This error signals an assertion failure within DuckDB. This usually occurs due to unexpected conditions or errors in the program's logic.
For more information, see https://duckdb.org/docs/dev/internal_errors
If I leave schema inference enabled (that is, without skip_schema_inference=True), I get this error instead:
IOException: IO Error: Invalid field found while parsing field: type
Thanks!
D select * from iceberg_scan("s3://a-test-bucket/test_db/metadata/00024-364b99cb-1888-46fa-adb8-5d59db1029b1.metadata.json");
IO Error: Invalid field found while parsing field: type
D select * from iceberg_scan("s3://a-test-bucket/test_db/metadata/00024-364b99cb-1888-46fa-adb8-5d59db1029b1.metadata.json", skip_schema_inference=True);
IO Error: Failed to read file "s3://a-test-bucket/test_db/data/ingest_ts_day=2023-06-21/00000-1495-abf583d7-67bb-4a8a-af8a-f061a00b5963-00009.parquet": schema mismatch in glob: column "source_file_identifier" was read from the original file "s3://a-test-bucket/test_db/data/ingest_ts_day=2024-06-26/00000-57-e3eb283b-c2d2-487a-b9f7-989be9b00edb-00001.parquet", but could not be found in file "s3://a-test-bucket/test_db/data/ingest_ts_day=2023-06-21/00000-1495-abf583d7-67bb-4a8a-af8a-f061a00b5963-00009.parquet".
Candidate names: .... snip(list of column names) ....
If you are trying to read files with different schemas, try setting union_by_name=True
D select * from iceberg_scan("s3://a-test-bucket/test_db/metadata/00024-364b99cb-1888-46fa-adb8-5d59db1029b1.metadata.json", skip_schema_inference=True, union_by_name=True);
Binder Error: Invalid named parameter "union_by_name" for function iceberg_scan
Candidates:
version_name_format VARCHAR
version VARCHAR
mode VARCHAR
metadata_compression_codec VARCHAR
allow_moved_paths BOOLEAN
skip_schema_inference BOOLEAN
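To avoid tripping over this Binder Error again, one option is to check option names before building the query. The sketch below is only an illustration: the helper names `build_iceberg_scan_sql` and `_sql_literal` are hypothetical, and the accepted parameters are copied from the candidates list in the error above, so they reflect this extension build and may differ in other versions.

```python
# Named parameters and types copied from the "Candidates" list in the
# Binder Error above (they may differ in other extension versions).
ICEBERG_SCAN_PARAMS = {
    "version_name_format": str,
    "version": str,
    "mode": str,
    "metadata_compression_codec": str,
    "allow_moved_paths": bool,
    "skip_schema_inference": bool,
}


def _sql_literal(value):
    """Render a Python bool or str as a SQL literal."""
    if isinstance(value, bool):
        return "true" if value else "false"
    # Escape embedded single quotes by doubling them.
    return "'" + str(value).replace("'", "''") + "'"


def build_iceberg_scan_sql(metadata_location, **options):
    """Build a SELECT over iceberg_scan, rejecting unknown named parameters."""
    unknown = sorted(set(options) - set(ICEBERG_SCAN_PARAMS))
    if unknown:
        raise ValueError(f"unsupported iceberg_scan parameter(s): {unknown}")
    args = [_sql_literal(metadata_location)]
    for name, value in options.items():
        expected = ICEBERG_SCAN_PARAMS[name]
        if not isinstance(value, expected):
            raise TypeError(f"{name} expects {expected.__name__}")
        args.append(f"{name} = {_sql_literal(value)}")
    return f"SELECT * FROM iceberg_scan({', '.join(args)})"
```

With this, passing `union_by_name=True` fails early in Python with a `ValueError` rather than at bind time in DuckDB.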
(schema has nested structs/arrays)
FROM duckdb_extensions() select extension_name, extension_version;
│ iceberg │ d62d91d │
Hey,
I have the following empty Iceberg table:
table_name (
1: col1: optional string,
2: col2: optional timestamptz,
3: col3: optional string,
4: col4: optional timestamptz,
5: col5: optional timestamptz,
6: col6: optional list,
7: col7: optional timestamptz,
8: col8: optional list,
9: col9: optional list,
10: col10: optional list,
11: col11: optional string,
12: col12: optional string,
13: col13: optional string,
14: col14: optional list,
15: col15: optional string,
16: col16: optional list,
17: col17: optional list,
18: col18: optional list,
19: col19: optional list,
20: col20: optional list,
21: col21: optional list,
22: col22: optional list,
23: col23: optional boolean,
24: col24: optional list,
25: col25: optional list,
26: col26: optional list,
27: col27: optional timestamptz,
28: col28: optional timestamptz,
29: col29: optional timestamptz
),
partition by: [],
sort order: [],
snapshot: Operation.APPEND: id=3172179944265825688, schema_id=0