-
Notifications
You must be signed in to change notification settings - Fork 242
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Perform handle spill IO outside of locked section in SpillFramework #11880
base: branch-25.02
Are you sure you want to change the base?
Conversation
Signed-off-by: Zach Puller <[email protected]>
Signed-off-by: Zach Puller <[email protected]>
Signed-off-by: Zach Puller <[email protected]>
Signed-off-by: Zach Puller <[email protected]>
sql-plugin/src/main/scala/com/nvidia/spark/rapids/spill/SpillFramework.scala
Outdated
Show resolved
Hide resolved
sql-plugin/src/main/scala/com/nvidia/spark/rapids/spill/SpillFramework.scala
Outdated
Show resolved
Hide resolved
sql-plugin/src/main/scala/com/nvidia/spark/rapids/spill/SpillFramework.scala
Outdated
Show resolved
Hide resolved
sql-plugin/src/main/scala/com/nvidia/spark/rapids/spill/SpillFramework.scala
Outdated
Show resolved
Hide resolved
sql-plugin/src/main/scala/com/nvidia/spark/rapids/spill/SpillFramework.scala
Outdated
Show resolved
Hide resolved
sql-plugin/src/main/scala/com/nvidia/spark/rapids/spill/SpillFramework.scala
Outdated
Show resolved
Hide resolved
sql-plugin/src/main/scala/com/nvidia/spark/rapids/spill/SpillFramework.scala
Outdated
Show resolved
Hide resolved
Signed-off-by: Zach Puller <[email protected]>
Signed-off-by: Zach Puller <[email protected]>
sql-plugin/src/main/scala/com/nvidia/spark/rapids/spill/SpillFramework.scala
Outdated
Show resolved
Hide resolved
Signed-off-by: Zach Puller <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you please update the docs for all of the public APIs in SpillableHandle
and for SpillableHandle
itself to make it clear what APIs need to be thread safe and how they are supposed to be protected. For example spill
has no indication as to how it should behave when multiple people call it. We can infer that it one wins and the others fail. I want this mostly so it is clear what the contract is for each of these APIs. spillable
being a best effort API intended only for quick filtering is fine, but it needs to be documented so if someone tries to use it in a way that requires it to be exact we know that is wrong and it violates the contract. This also will let me reason about the code and who is calling it so that I can better reason about the correctness.
Also could you please explain how error handling/error recovery is intended to happen when spilling? Like if an exception is thrown while we are in the middle of trying to spill something what should happen. Are we supposed to just keep it in the spilling state?
sql-plugin/src/main/scala/com/nvidia/spark/rapids/spill/SpillFramework.scala
Outdated
Show resolved
Hide resolved
Absolutely yeah, I'm just planning to resolve the other outstanding issues in the PR and ensure I have a clear understanding and answer to the other questions, and then will document accordingly. |
Signed-off-by: Zach Puller <[email protected]>
Signed-off-by: Zach Puller <[email protected]>
I discussed offline with @abellina and he gave me some ideas on how to rework this to not require a separate lock, but to just move the IO component out of the protected section of |
I'm triggering a build job to see if some test consistently fails for all PRs. |
build |
Signed-off-by: Zach Puller <[email protected]>
Signed-off-by: Zach Puller <[email protected]>
Signed-off-by: Zach Puller <[email protected]>
Signed-off-by: Zach Puller <[email protected]>
Signed-off-by: Zach Puller <[email protected]>
Signed-off-by: Zach Puller <[email protected]>
Signed-off-by: Zach Puller <[email protected]>
Signed-off-by: Zach Puller <[email protected]>
Signed-off-by: Zach Puller <[email protected]>
…into spill_lock
Signed-off-by: Zach Puller <[email protected]>
Signed-off-by: Zach Puller <[email protected]>
I've pushed several updates to the branch. Per my previous comment, now the changes are structured in such a way that the interface does not need to change. I was able to test this and see evidence in traces showing that it allows multiple threads to spill concurrently. I considered refactoring some duplicate code fragments between the handle types around how we do the buffer swaps etc. but I'm not sure if it will actually simplify the overall code readability/PR so I held off for now. |
Signed-off-by: Zach Puller <[email protected]>
Addresses #11830
Moves the IO outside of the critical section in the spillable buffer handle
spill
functions to allow threads interacting with theSpillFramework
to manage the spill state of a handle consistently without being blocked on IO. Eg. if threadt1
is in the middle of spilling, and threadt2
wants to check whether this handle is currently spilling, it doesn't need to wait for the spill IO operation to complete in order to check whether the handle it spillable.