This repository has been archived by the owner on Nov 11, 2022. It is now read-only.
Version 1.6.0
- Added
InProcessPipelineRunner
, an improvement over theDirectPipelineRunner
that better implements the Dataflow model.InProcessPipelineRunner
runs on a user's local machine and supports multithreaded execution, unboundedPCollection
s, and triggers for speculative and late outputs. - Added display data, which allows annotating user functions (
DoFn
,CombineFn
, andWindowFn
),Source
s, andSink
s with static metadata to be displayed in the Dataflow Monitoring Interface. Display data has been implemented for core components and is automatically applied to allPipelineOptions
. - Added the ability to compose multiple
CombineFn
s into a singleCombineFn
usingCombineFns.compose
orCombineFns.composeKeyed
. - Added the methods
getSplitPointsConsumed
andgetSplitPointsRemaining
to theBoundedReader
API to improve Dataflow's ability to automatically scale a job reading from these sources. Default implementations of these functions have been provided, but reader implementers should override them to provide better information when available. - Improved performance of side inputs when using workers with many cores.
- Improved efficiency when using
CombineFnWithContext
. - Fixed several issues related to stability in the streaming mode.