[DBZ-PGYB][yugabyte/yugabyte-db#24555] Add task ID to PostgresPartition #163

Merged on Oct 24, 2024 (18 commits)


vaibhav-yb
Collaborator

Problem

With the introduction of the parallel snapshot model, the connector can run multiple tasks when the snapshot mode is set to parallel. This introduces a problem at the underlying layer, where the connector stores the sourceInfo for its partitions (i.e. PostgresPartition objects) in Kafka.

The PostgresPartition is identified by a map with the structure {"server", topicPrefix}. Currently this map is identical for all the PostgresPartition objects created by the tasks when snapshot.mode is parallel, so they all end up referring to the same source partition in the Kafka topic. Assuming we have 2 tasks, i.e. 0 and 1, the following then happens:

  1. One task (task_0) completes its snapshot while the other is yet to start.
    a. After completion, task_0 updates the sourceInfo to record that its snapshot is completed.
  2. When task_1 starts up, it reads the same sourceInfo object, concludes that the snapshot is already completed, and skips its own snapshot.

The above situation causes data loss, since task_1 never actually takes a snapshot.
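The sequence above can be reproduced with a small standalone sketch. It is not connector code: the in-memory `offsetStore` stands in for the Kafka offset topic, and the `snapshot_completed` field, the `runTask` method, and the `ybserver` prefix are hypothetical names chosen for illustration only.

```java
import java.util.HashMap;
import java.util.Map;

public class SharedPartitionDemo {
    // Stand-in for the Kafka offset store: source partition map -> stored offset map.
    static final Map<Map<String, String>, Map<String, Object>> offsetStore = new HashMap<>();

    // Mirrors the key described above: {"server", topicPrefix}, identical for every task.
    static Map<String, String> sourcePartition(String topicPrefix) {
        return Map.of("server", topicPrefix);
    }

    // Returns true if the task actually took a snapshot.
    // Note that taskId plays no part in the lookup key, which is the root of the problem.
    static boolean runTask(int taskId, String topicPrefix) {
        Map<String, String> key = sourcePartition(topicPrefix);
        Map<String, Object> stored = offsetStore.get(key);
        // "snapshot_completed" is a hypothetical field name, not the connector's real offset schema.
        if (stored != null && Boolean.TRUE.equals(stored.get("snapshot_completed"))) {
            return false; // the task believes the snapshot is already done and skips it
        }
        offsetStore.put(key, Map.of("snapshot_completed", true));
        return true;
    }

    public static void main(String[] args) {
        System.out.println("task_0 took snapshot: " + runTask(0, "ybserver")); // true
        System.out.println("task_1 took snapshot: " + runTask(1, "ybserver")); // false
    }
}
```

In this sketch task_1 skips its snapshot only because it finds the entry written by task_0 under the shared key.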

Solution

This PR implements a short-term solution where we simply add the task ID to the partition so that each PostgresPartition can identify a sourcePartition uniquely. The identifying map will now become {"server", topicPrefix_taskId}.
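A minimal sketch of the new key shape, assuming the task ID is appended to the topic prefix with an underscore as described above. The class name, the `sourcePartition` helper, and the `ybserver` prefix are hypothetical and are not taken from the actual change to PostgresPartition.

```java
import java.util.Map;

public class TaskScopedPartitionDemo {
    // Hypothetical helper: builds {"server", topicPrefix_taskId} so each task gets its own key.
    static Map<String, String> sourcePartition(String topicPrefix, String taskId) {
        return Map.of("server", topicPrefix + "_" + taskId);
    }

    public static void main(String[] args) {
        Map<String, String> task0 = sourcePartition("ybserver", "0");
        Map<String, String> task1 = sourcePartition("ybserver", "1");

        System.out.println(task0.get("server"));   // ybserver_0
        System.out.println(task1.get("server"));   // ybserver_1
        System.out.println(task0.equals(task1));   // false: the tasks no longer share a source partition
    }
}
```

Because the two maps are no longer equal, an offset stored by one task is not returned when the other task looks up its own partition.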

Note:
This solution is a quick fix for the problem and holds only as long as the number of tasks in the connector remains the same.

This partially fixes yugabyte/yugabyte-db#24555

vaibhav-yb added the enhancement label on Oct 22, 2024
vaibhav-yb self-assigned this on Oct 22, 2024
@Sumukh-Phalgaonkar

Can you modify the existing parallel snapshot UT by setting tasks.max to a sufficiently large value, so that one of the tasks finishes its select query before the next task starts?

@vaibhav-yb
Collaborator Author

Can you modify the existing parallel snapshot UT by setting tasks.max to a sufficiently large value, so that one of the tasks finishes its select query before the next task starts?

@Sumukh-Phalgaonkar It is not possible to have multiple tasks in the connector test framework currently.

vaibhav-yb merged commit 4e55ebd into ybdb-debezium-2.5.2 on Oct 24, 2024
1 check passed
Development

Successfully merging this pull request may close these issues.

[DBZ-PGYB] Decouple connector to not share the same PostgresPartition to sourcePartition mapping across tasks