Return effective trim point LSN on successful trim request #2468

pcholakov · 2025-01-06T11:51:59Z

This change enables restatectl reporting of the effective trim point as well as positive validation in places like the upcoming trim-gap handling test (#2463).

It's mostly just plumbing and a couple of opportunistic cleanups.

@tillrohrmann - making you the primary reviewer since you've been quite engaged with this!

cc: @AhmedSoliman for visibility - I've slightly updated the Bifrost log server Trimmed response RPC to support the updated semantics.

github-actions · 2025-01-06T12:09:04Z

Test Results

7 files ±0 · 7 suites ±0 · 4m 21s ⏱️ -6s
47 tests ±0: 46 ✅ ±0, 1 💤 ±0, 0 ❌ ±0
182 runs ±0: 179 ✅ ±0, 3 💤 ±0, 0 ❌ ±0

Results for commit d55eee4. ± Comparison against base commit 3ff1c0c.

♻️ This comment has been updated with latest results.

pcholakov · 2025-01-06T12:37:30Z

crates/admin/src/cluster_controller/service.rs

@@ -587,7 +587,7 @@ impl<T: TransportConnect> Service<T> {
                info!(
                    ?log_id,
                    trim_point_inclusive = ?trim_point,
-                    "Manual trim log command received");
+                    "Trim log command received");


I think this is slightly more accurate; it might have been issued by automation.

pcholakov · 2025-01-06T12:45:09Z

crates/bifrost/src/providers/local_loglet/mod.rs

-    /// Trim the log to the minimum of new_trim_point and last_committed_offset
-    /// new_trim_point is inclusive (will be trimmed)


Better documented in the trait.

pcholakov · 2025-01-06T12:45:44Z

crates/bifrost/src/providers/replicated_loglet/loglet.rs

@@ -46,7 +46,7 @@ pub(super) struct ReplicatedLoglet<T> {
    #[debug(skip)]
    networking: Networking<T>,
    #[debug(skip)]
-    logservers_rpc: LogServersRpc,
+    log_servers_rpc: LogServersRpc,


Could not resist some trivial fixups.

tillrohrmann

Thanks a lot for creating this PR @pcholakov. This is a good improvement. Before we can merge this PR, we need to revisit the contract of trim with respect to the return value. I would suggest not returning Option<Lsn> and instead returning the last trim point, regardless of whether this call did the trimming or not. We should also make sure that the different loglet implementations are aligned with the contract we settle on.

tillrohrmann · 2025-01-07T08:20:41Z

crates/admin/protobuf/cluster_ctrl_svc.proto

@@ -94,6 +94,11 @@ message TrimLogRequest {
  uint64 trim_point = 2;
 }

+message TrimLogResponse {
+  uint32 log_id = 1;
+  optional uint64 trim_point = 2;


Under which condition would a None trim_point value be a valid response? Maybe document it.

Ack; one could also make the case we don't need an optional here as we could represent Lsn::INVALID as a uint64 but I rather like the cleaner semantics of a None value.
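To make the trade-off concrete, here is a minimal sketch of the sentinel encoding mentioned above, next to the `Option` view that the `optional uint64` field gives for free. The `Lsn` type below is a hypothetical stand-in, and treating `Lsn::INVALID` as `0` is an assumption made for illustration; the helper names are not from the PR.

```rust
/// Hypothetical stand-in for the real `Lsn` type; only what this sketch needs.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub struct Lsn(pub u64);

impl Lsn {
    /// Assumption for this sketch: the invalid LSN is represented as 0.
    pub const INVALID: Lsn = Lsn(0);
}

/// Sentinel encoding: a plain `uint64` on the wire, where the invalid LSN
/// stands for "no trim point".
pub fn encode_sentinel(trim_point: Option<Lsn>) -> u64 {
    trim_point.map_or(Lsn::INVALID.0, |lsn| lsn.0)
}

/// Decoding has to remember that one value is special; with an `optional`
/// field the caller would receive this `Option` directly.
pub fn decode_sentinel(raw: u64) -> Option<Lsn> {
    let lsn = Lsn(raw);
    (lsn != Lsn::INVALID).then_some(lsn)
}
```

Both encodings carry the same information. The difference is that with the sentinel every reader must know about the special value, whereas `None` makes the "no trim point" case explicit in the type.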

tillrohrmann · 2025-01-07T08:30:41Z

crates/bifrost/src/bifrost.rs

+    /// Trim the log to the specified LSN trim point (inclusive). Returns the new trim point LSN if
+    /// the log was actually trimmed by this call, or `None` otherwise.
+    pub async fn trim(&self, log_id: LogId, trim_point: Lsn) -> Result<Option<Lsn>, Error> {


I am wondering whether changing this contract to always return the lowest available LSN would be a bit nicer. In the current form, if I send trim(0, 5) twice, my first request will receive Some(5) as the response and the second None. Put differently, why is it important to be able to distinguish whether a trim call has actually performed the trim or not?

It behaves in the way you describe now! I was over-indexing on the needs of testing, if you zoom out a bit it's pretty clear this is how the API should work.


tillrohrmann · 2025-01-07T08:39:59Z

crates/bifrost/src/providers/memory_loglet.rs

-        if current_trim_point >= actual_trim_point {
-            return Ok(());
+        if current_trim_point >= requested_trim_point {
+            return Ok(Some(current_trim_point));


I think this implementation is not in line with how you described it earlier. According to your specification, it should return None if nothing is done. The local loglet is implemented differently in this regard.

Thanks for flagging this! I've revised the method contract as you suggested, and this is now the correct behavior. I've also added some minimal coverage for these edge cases to gapless_loglet_smoke_test which runs against all loglet providers.


pcholakov · 2025-01-07T08:56:59Z

Agree, this is probably the better API contract as it makes the operation naturally idempotent. I was keen to determine whether a specific call had caused the trim for the purposes of testing but I think that's not very important overall. Will update, thanks for the input!


Updated the method contract to return the actual trim point regardless of whether a trim was performed (or None, if the log has never been trimmed / the trim position is 0).
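The revised contract can be illustrated with a toy in-memory log. This is only a sketch under stated assumptions: `ToyLog`, its fields, and the synchronous signatures are hypothetical, `Lsn::INVALID` is assumed to be `0`, and clamping the request to the last committed position follows the doc comment removed from the local loglet rather than any code shown in this PR.

```rust
/// Hypothetical stand-in for the real `Lsn` type.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub struct Lsn(pub u64);

impl Lsn {
    /// Assumption for this sketch: the invalid LSN is represented as 0.
    pub const INVALID: Lsn = Lsn(0);
}

/// Toy log that tracks only its trim point and last committed position.
pub struct ToyLog {
    trim_point: Lsn, // INVALID until the first effective trim
    last_committed: Lsn,
}

impl ToyLog {
    pub fn with_records(count: u64) -> Self {
        ToyLog { trim_point: Lsn::INVALID, last_committed: Lsn(count) }
    }

    /// Trims up to `requested` (inclusive) and returns the current trim point,
    /// whether or not this particular call moved it.
    pub fn trim(&mut self, requested: Lsn) -> Option<Lsn> {
        // Never trim past what has been committed.
        let effective = requested.min(self.last_committed);
        // The trim point only ever moves forward.
        self.trim_point = self.trim_point.max(effective);
        self.get_trim_point()
    }

    /// `None` means the log has never been trimmed.
    pub fn get_trim_point(&self) -> Option<Lsn> {
        (self.trim_point != Lsn::INVALID).then_some(self.trim_point)
    }
}
```

With this shape, repeating `trim(Lsn(5))` yields `Some(Lsn(5))` both times, and a later `trim(Lsn(3))` still reports `Some(Lsn(5))`, which is what makes the operation naturally idempotent.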


pcholakov · 2025-01-08T08:41:35Z

crates/bifrost/src/providers/replicated_loglet/tasks/trim.rs

+        // If we get a quorum, we will acknowledge the trim request with the highest trim point
+        // reported by any log server. This may be higher than what was requested.
+        let mut response_trim_point = LogletOffset::INVALID;


On re-reading the change, I realized that the replicated loglet returned, as the actual trim point, the value reported by whichever log server happened to respond last as we reached the write consensus requirement. A more paranoid approach is to take the max of all the trim points reported thus far. This could be higher than what was requested, but it's probably safer to over-report.
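A rough sketch of the "max of everything reported so far" idea follows. It is deliberately simplified: `acknowledge_trim` is a hypothetical helper, the quorum check is reduced to counting successful responses (the real task presumably uses a proper write-quorum check), and `LogletOffset` is a stand-in type with `INVALID` assumed to be `0`.

```rust
/// Hypothetical stand-in for the real `LogletOffset` type.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub struct LogletOffset(pub u32);

impl LogletOffset {
    /// Assumption for this sketch: the invalid offset is represented as 0.
    pub const INVALID: LogletOffset = LogletOffset(0);
}

/// Consumes log server responses in arrival order. Returns the highest trim
/// point reported so far once `quorum` servers have acknowledged, or `None`
/// if the responses run out first.
pub fn acknowledge_trim(
    responses: impl IntoIterator<Item = Result<LogletOffset, String>>,
    quorum: usize,
) -> Option<LogletOffset> {
    let mut response_trim_point = LogletOffset::INVALID;
    let mut acks = 0;
    for response in responses {
        // Failed responses neither count towards the quorum nor affect the max.
        if let Ok(reported) = response {
            response_trim_point = response_trim_point.max(reported);
            acks += 1;
            if acks >= quorum {
                return Some(response_trim_point);
            }
        }
    }
    None
}
```

If two servers report 7 and then 5 and the quorum is two, this returns 7, whereas taking only the last responder's value would have returned 5.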

AhmedSoliman · 2025-01-08T12:27:00Z

As discussed offline. There is a directional change that needs to happen with trimming and I'm a little concerned that starting to return the trim-point may introduce an unintentional dependency on the current semantics. I'd prefer if we park this and use get_trim_point() after trimming instead.
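For comparison, the suggested alternative keeps trim free of a return value and reads the trim point in a separate step. The sketch below assumes a hypothetical synchronous `TrimmableLog` trait and `FakeLog` type purely for illustration; only the `get_trim_point()` name comes from the comment above, and the real API may differ in signature and error handling.

```rust
/// Hypothetical stand-in for the real `Lsn` type.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub struct Lsn(pub u64);

/// Hypothetical, simplified view of a log: trim reports nothing, and the
/// trim point is observed through a separate read.
pub trait TrimmableLog {
    fn trim(&mut self, trim_point: Lsn);
    fn get_trim_point(&self) -> Lsn;
}

/// What a caller such as a test or CLI would do: trim, then ask.
pub fn trim_then_read<L: TrimmableLog>(log: &mut L, requested: Lsn) -> Lsn {
    log.trim(requested);
    log.get_trim_point()
}

/// Toy implementation used only to exercise the helper.
pub struct FakeLog {
    pub trim_point: Lsn,
    pub tail: Lsn,
}

impl TrimmableLog for FakeLog {
    fn trim(&mut self, trim_point: Lsn) {
        // Clamp to the tail and never move the trim point backwards.
        self.trim_point = self.trim_point.max(trim_point.min(self.tail));
    }

    fn get_trim_point(&self) -> Lsn {
        self.trim_point
    }
}
```

The caller gets the same information, but nothing ties the reported value to the semantics of a particular trim call, which is the dependency this comment wants to avoid.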

pcholakov · 2025-01-08T12:37:43Z

Ack, let's can this.

pcholakov force-pushed the feat/trim-log-report-lsn branch from c54764d to 7f9d532 Compare January 6, 2025 12:46

pcholakov requested a review from muhamadazmy January 6, 2025 12:47

pcholakov force-pushed the feat/trim-log-report-lsn branch from 7f9d532 to b86ac06 Compare January 6, 2025 12:50

pcholakov marked this pull request as ready for review January 6, 2025 12:57


pcholakov mentioned this pull request Jan 6, 2025

Add an end-to-end test for trim gap handling using snapshots #2463

Merged

tillrohrmann previously requested changes Jan 7, 2025


pcholakov added 2 commits January 7, 2025 16:59

Return effective trim point LSN on successful trim

5454dac

Update Trim contract to return the actual latest trim point for all loglet implementations

73943fb

pcholakov force-pushed the feat/trim-log-report-lsn branch from b3bc1c4 to 73943fb Compare January 7, 2025 15:11

PR feedback

97beb75

pcholakov requested a review from tillrohrmann January 7, 2025 15:33

pcholakov removed the request for review from muhamadazmy January 7, 2025 15:36

AhmedSoliman self-requested a review January 7, 2025 15:59

Replicated Loglet reports trim LSN as max of all log servers part of quorum

d55eee4


pcholakov closed this Jan 8, 2025
