diff --git a/kernel/src/lib.rs b/kernel/src/lib.rs
index fa88e7afa..49dceea75 100644
--- a/kernel/src/lib.rs
+++ b/kernel/src/lib.rs
@@ -371,8 +371,20 @@ pub trait JsonHandler: AsAny {
         output_schema: SchemaRef,
     ) -> DeltaResult>;
 
-    /// Read and parse the JSON format file at given locations and return
-    /// the data as EngineData with the columns requested by physical schema.
+    /// Read and parse the JSON format file at given locations and return the data as EngineData with
+    /// the columns requested by physical schema. Note: The [`FileDataReadResultIterator`] must emit
+    /// data from files in the order that `files` is given. For example if files ["a", "b"] is provided,
+    /// then the engine data iterator must first return all the engine data from file "a", _then_ all
+    /// the engine data from file "b". Moreover, for a given file, all of its [`EngineData`] and
+    /// constituent rows must be in order that they occur in the file. Consider a file with rows
+    /// (1, 2, 3). The following are legal iterator batches:
+    /// iter: [EngineData(1, 2), EngineData(3)]
+    /// iter: [EngineData(1), EngineData(2, 3)]
+    /// iter: [EngineData(1, 2, 3)]
+    /// The following are illegal batches:
+    /// iter: [EngineData(3), EngineData(1, 2)]
+    /// iter: [EngineData(1), EngineData(3, 2)]
+    /// iter: [EngineData(2, 1, 3)]
     ///
     /// # Parameters
     ///
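
Review note: the ordering contract in the new doc comment amounts to "concatenating the emitted batches, front to back, must reproduce the file's rows exactly". A minimal sketch of that check follows. It is not part of the patch and does not use the kernel's types: `Batch` and `preserves_file_order` are hypothetical names, and a plain `Vec<i32>` stands in for an `EngineData` batch.

```rust
/// Hypothetical stand-in for one `EngineData` batch: just the row values it holds.
/// (Assumption for illustration only; the real `EngineData` is a trait object.)
type Batch = Vec<i32>;

/// Returns true if concatenating `batches` front to back reproduces `file_rows`
/// exactly, meaning rows are not reordered within a batch or across batches,
/// and no row is dropped or duplicated.
fn preserves_file_order(file_rows: &[i32], batches: &[Batch]) -> bool {
    // Flatten the batches in emission order into a single row sequence.
    let flattened: Vec<i32> = batches.iter().flatten().copied().collect();
    // The contract holds only if that sequence equals the file's row sequence.
    flattened.as_slice() == file_rows
}
```

For a file with rows (1, 2, 3), this sketch accepts the three batchings the doc comment lists as legal and rejects the three it lists as illegal. The same idea extends to multiple files by concatenating each file's rows in the order `files` was given.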