-
Notifications
You must be signed in to change notification settings - Fork 44
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Return effective trim point LSN on successful trim request #2468
Conversation
c54764d
to
7f9d532
Compare
7f9d532
to
b86ac06
Compare
@@ -587,7 +587,7 @@ impl<T: TransportConnect> Service<T> { | |||
info!( | |||
?log_id, | |||
trim_point_inclusive = ?trim_point, | |||
"Manual trim log command received"); | |||
"Trim log command received"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is slightly more accurate; it might have been issued by automation.
/// Trim the log to the minimum of new_trim_point and last_committed_offset | ||
/// new_trim_point is inclusive (will be trimmed) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Better documented in the trait.
@@ -46,7 +46,7 @@ pub(super) struct ReplicatedLoglet<T> { | |||
#[debug(skip)] | |||
networking: Networking<T>, | |||
#[debug(skip)] | |||
logservers_rpc: LogServersRpc, | |||
log_servers_rpc: LogServersRpc, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could not resist some trivial fixups.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks a lot for creating this PR @pcholakov. This is a good improvement. Before we can merge this PR we need to revisit the contract of trim
wrt to the return value. I would suggest to not return Option<Lsn>
and instead the last trim point independent whether this call did the trimming or not. We should also make sure that the different loglet implementations are aligned wrt the contract we settle on.
@@ -94,6 +94,11 @@ message TrimLogRequest { | |||
uint64 trim_point = 2; | |||
} | |||
|
|||
message TrimLogResponse { | |||
uint32 log_id = 1; | |||
optional uint64 trim_point = 2; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Under which condition would a none trim_point
value a valid response? Maybe document it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ack; one could also make the case we don't need an optional
here as we could represent Lsn::INVALID
as a uint64
but I rather like the cleaner semantics of a None
value.
crates/bifrost/src/bifrost.rs
Outdated
/// Trim the log to the specified LSN trim point (inclusive). Returns the new trim point LSN if | ||
/// the log was actually trimmed by this call, or `None` otherwise. | ||
pub async fn trim(&self, log_id: LogId, trim_point: Lsn) -> Result<Option<Lsn>, Error> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am wondering whether changing this contract t always return the lowest available lsn is a bit nicer. In the current form if I send trim(0, 5) twice, my first request will receive a Some(5)
as response and the second None
. Differently asked, why is it important to be able to distinguish whether a trim
call has actually performed the trim or not?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It behaves in the way you describe now! I was over-indexing on the needs of testing, if you zoom out a bit it's pretty clear this is how the API should work.
if current_trim_point >= actual_trim_point { | ||
return Ok(()); | ||
if current_trim_point >= requested_trim_point { | ||
return Ok(Some(current_trim_point)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this implementation is not in line with how you described it earlier. According to your specification, it should return None
if nothing is done. The local loglet is implemented differently in this regard.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for flagging this! I've revised the method contract as you suggested, and this is now the correct behavior. I've also added some minimal coverage for these edge cases to gapless_loglet_smoke_test
which runs against all loglet providers.
Agree, this is probably the better API contract as it makes the operation naturally idempotent. I was keen to determine whether a specific call had caused the trim for the purposes of testing but I think that's not very important overall. Will update, thanks for the input! |
b3bc1c4
to
73943fb
Compare
Updated the method contract to return the actual trim point regardless of whether a trim was performed (or None
, if the log has never been trimmed / the trim position is 0
).
// If we get a quorum, we will acknowledge the trim request with the highest trim point | ||
// reported by any log server. This may be higher than what was requested. | ||
let mut response_trim_point = LogletOffset::INVALID; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On re-reading the change, I realized that the replicated loglet retured as the actual trim point that reported by the latest log server to respond as we reach the write consensus requirement. A more paranoid approach is to take the max of all the reported trim points thus far - this could be higher than was was requested, but it's probably safer to over-report.
As discussed offline. There is a directional change that needs to happen with trimming and I'm a little concerned that starting to return the trim-point may introduce an unintentional dependency on the current semantics. I'd prefer if we park this and use |
Ack, let's can this. |
This change enables
restatectl
reporting of the effective trim point as well as positive validation in places like the upcoming trim-gap handling test (#2463).It's mostly just plumbing and a couple of opportunistic cleanups.
@tillrohrmann - making you the primary reviewer since you've been quite engaged with this!
cc: @AhmedSoliman for visibility - I've slightly updated the Bifrost log server
Trimmed
response RPC to support the updated semantics.