v3.2.0
What's Changed
[new] Sliding Windows
Sliding windows are overlapping time-based windows that advance with each incoming message rather than at fixed intervals like hopping windows.
They have a fixed 1 ms resolution, perform better, and are less resource-intensive than hopping windows with a 1 ms step.
Read more in the Sliding Windows docs.
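To make the "advances with each incoming message" behavior concrete, here is a minimal pure-Python sketch of a sliding sum. It is not the Quix Streams implementation and does not use its API; the function name `sliding_sums` is hypothetical, and the sketch assumes in-order timestamps and windows that are inclusive on both ends.

```python
from collections import deque


def sliding_sums(events, duration_ms):
    """Yield (start, end, total) for a window ending at each event's timestamp.

    `events` is an iterable of (timestamp_ms, value) pairs in timestamp order.
    Assumption: the window covers [timestamp - duration_ms, timestamp].
    """
    window = deque()
    total = 0
    for ts, value in events:
        window.append((ts, value))
        total += value
        start = ts - duration_ms
        # Drop events that fell out of the window ending at `ts`.
        while window[0][0] < start:
            _, old_value = window.popleft()
            total -= old_value
        yield start, ts, total


events = [(100, 1), (104, 2), (109, 3), (112, 4)]
results = list(sliding_sums(events, duration_ms=10))
```

Each message produces its own window result, whereas a hopping window would only emit results on a fixed step regardless of when messages arrive.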
PR by @gwaramadze - #515
[new] FileSink and FileSource connectors
FileSink writes batches of data to files on disk in JSON and Parquet formats.
FileSource enables processing data streams from JSON or Parquet files.
The resulting messages can be produced in "replay" mode, where the delay between produced records matches the original timing as closely as possible.
Learn more on the File Sink and File Source pages.
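The idea behind "replay" mode can be sketched in plain Python as follows. This is an illustration only, not FileSource code: the `replay` function, its `sleep` parameter, and the record layout of (timestamp_ms, value) pairs are all assumptions made for the example.

```python
import time


def replay(records, sleep=time.sleep):
    """Yield record values, pausing for the original gap between timestamps.

    `records` is an iterable of (timestamp_ms, value) pairs in file order.
    `sleep` is injectable so the pacing can be inspected without waiting.
    """
    previous_ts = None
    for ts_ms, value in records:
        if previous_ts is not None:
            # Never sleep for a negative duration if timestamps go backwards.
            gap_seconds = max(0, ts_ms - previous_ts) / 1000
            sleep(gap_seconds)
        previous_ts = ts_ms
        yield value


delays = []
records = [(1000, "a"), (1250, "b"), (1300, "c")]
replayed = list(replay(records, sleep=delays.append))
```

Here `delays.append` stands in for `time.sleep`, so the recorded delays show the pacing that a real replay would apply.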
PRs:
- local file sink by @tomas-quix in #560
- local file source by @tim-quix in #601
[upd] Updated time tracking in windowed aggregations
In previous versions, windowed aggregations tracked event time per topic-partition but expired windows per key.
This behavior was not fully consistent, and it also caused problems when processing data from misaligned producers.
For example, IoT and other physical devices may produce data at a certain frequency, which results in misaligned data streams within one topic-partition, so more data is considered "late" and dropped from processing.
To make the processing of such data more complete, Quix Streams now tracks event time per message key in windows.
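The effect of the change can be illustrated with a simplified model. The sketch below is not the library's implementation: it assumes tumbling windows with no grace period, and the function `count_late` and the device keys are hypothetical. It counts how many messages would be treated as late when time is tracked per partition versus per key.

```python
def count_late(messages, window_ms, per_key):
    """Count messages whose window has already closed per the tracked time.

    `messages` is a list of (key, timestamp_ms) pairs in arrival order.
    With per_key=False a single clock is shared by the whole partition.
    """
    latest = {}
    late = 0
    for key, ts in messages:
        clock = key if per_key else "partition"
        seen = latest.get(clock, ts)
        window_end = (ts // window_ms + 1) * window_ms
        # The window is expired once the tracked time has reached its end.
        if window_end <= seen:
            late += 1
        latest[clock] = max(seen, ts)
    return late


# Device "A" runs several seconds ahead of device "B" in the same partition.
messages = [("A", 5000), ("B", 1000), ("A", 6000), ("B", 2000)]
late_per_partition = count_late(messages, window_ms=1000, per_key=False)
late_per_key = count_late(messages, window_ms=1000, per_key=True)
```

With a shared partition clock, device "A" pushes time forward and both messages from "B" count as late; with per-key clocks, none do.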
PRs:
- #591 by @daniil-quix
- #607 by @daniil-quix
[upd] Updated CSVSource
Some breaking changes were made to CSVSource to make it easier to use:
- It now accepts CSV files in arbitrary formats and produces each row as a message value, making it less opinionated about the data format.
- It now requires the name to be passed directly. Previously, it used the file name as the name of the source.
- Message keys and timestamps can be extracted from the rows via the key_extractor and timestamp_extractor params.
- Removed the key_serializer and value_serializer params.
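As a rough illustration of what extractor callables might look like, the sketch below applies two functions to rows parsed with Python's standard csv module. It does not call CSVSource itself. It assumes each row reaches the extractors as a dict, and the column names (device_id, ts_ms, temperature) are hypothetical.

```python
import csv
import io


def key_extractor(row: dict) -> str:
    # Use the (hypothetical) device_id column as the message key.
    return row["device_id"]


def timestamp_extractor(row: dict) -> int:
    # Use the (hypothetical) ts_ms column as the message timestamp.
    return int(row["ts_ms"])


csv_text = (
    "device_id,ts_ms,temperature\n"
    "sensor-1,1700000000000,21.5\n"
    "sensor-2,1700000000500,19.0\n"
)
rows = list(csv.DictReader(io.StringIO(csv_text)))

# Each row becomes (key, timestamp, value), with the whole row as the value.
messages = [(key_extractor(row), timestamp_extractor(row), row) for row in rows]
```

Keeping the extractors as small standalone functions makes them easy to test against sample rows before passing them to the source.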
PR by @daniil-quix in #602
Bug fixes
- Fix invalid mapping for oauth_cb in BaseSettings by @daniil-quix in #606
Dependencies
- Update confluent-kafka requirement from <2.5,>=2.2 to >=2.6,<2.7 by @dependabot in #578
Docs
- Update README by @gwaramadze in #604
- Update sinks.md by @SteveRosam in #610
Full Changelog: v3.1.1...v3.2.0