【Popular Science Article】 Understand SeaTunnel CDC data synchronization in 3 minutes

Apache SeaTunnel
5 min readApr 15, 2024

CDC (Change Data Capture) is a technology used to track row-level changes in database events (insertions, updates, deletions) and notify other systems of these events in the order they occurred. In disaster recovery scenarios, CDC primarily facilitates data synchronization between primary and backup systems, enabling real-time data synchronization from the primary database to the backup database.

source ----------> CDC ----------> sink

Apache SeaTunnel CDC

Data synchronization in SeaTunnel CDC is divided into two types:

  • Snapshot Read: Reading historical data from a table.
  • Incremental Tracking: Reading incremental log changes from a table.

2.1 Lock-Free Snapshot Synchronization

The reason why we emphasize “lock-free” in the lock-free snapshot synchronization phase is that existing CDC platforms might lock tables during synchronization of historical data, such as Debezium. The snapshot read phase involves synchronizing historical database data. The basic overview of the process is as follows:

storage------------->splitEnumerator----------split---------->reader
^ |
| |
\-----------------report------------/
  • Split division: The splitEnumerator (split distributor) divides the table data into multiple splits based on specified fields (e.g., table id or unique key) and step size.
  • Parallel processing: Each split is assigned to different readers through a routing algorithm for parallel reading, with each reader occupying one connection.
  • Event feedback: After completing the reading of a split, each reader reports progress back to the splitEnumerator.
  • The splitEnumerator sends a split to the reader, with the metadata of the split as follows:
String              splitId         Routing id
TableId tableId Table id
SeatunnelRowType splitKeyType Type of field the split is based on
Object splitStart Start point of the split read
Object splitEnd End point of the split read
  • The reader, upon receiving the split information, generates the relevant SQL statements. Before this, it records the starting position of the database log log corresponding to the current split and reports back to the splitEnumerator after processing the current split. The report content is as follows:
String      splitId         Split id
Offset highWatermark Position of the split's corresponding log for subsequent verification

2.2 Incremental Synchronization

The incremental synchronization phase is based on the snapshot read phase. When changes occur in the source database, the changed data is synchronized to the backup database in real-time. This phase listens to the database’s log, such as MySQL’s bin log. Incremental tracking is typically single-threaded to avoid duplicating bin log pulls and reduce database load. Therefore, only one reader works during this phase, using only one connection.

data log------------->splitEnumerator----------split---------->reader
^ |
| |
\-----------------report------------/
  • Incremental synchronization combines all splits and tables from the snapshot phase, so there is only one split during the incremental phase. The split information for this phase is as follows:
String                              splitId
Offset startingOffset Smallest log start among all splits
Offset endingOffset Log end position, continuous if no end, like in incremental phase
List<TableId> tableIds
Map<TableId, Offset> tableWatermarks Watermarks of all splits
List<CompletedSnapshotSplitInfo> completedSnapshotSplitInfos Details of splits read during the snapshot phase
  • The fields in CompletedSnapshotSplitInfo are as follows:
String              splitId
TableId tableId
SeatunnelRowType splitKeyType
Object splitStart
Object splitEnd
Offset watermark Corresponding to the highWatermark in the report
  • The split in the incremental phase includes all watermarks from the snapshot phase, selecting an appropriate position for incremental synchronization, which is the smallest watermark.

Exactly-once

Whether during the snapshot read or the incremental read, the database may also be undergoing changes. How do we ensure exactly-once?

3.1 Snapshot Read Phase

In the snapshot read phase, for example, if during the synchronization of a split, the data in that split changes, such as the operations shown below (inserting k3, updating k2, deleting k1), if no identifiers are used during the reading process, then the updates to this part of the data would be lost. SeaTunnel’s approach is:

  • Check the bin log position in the database before reading the split: low watermark.
  • Read split {start, end} data.
  • Record the high watermark afterward.
  • If high = low, it indicates that the data within that split did not change during the reading; if (high — low) > 0, it indicates that the data changed during processing. The following steps are taken: ① Establish an in-memory table cache with the split data read; ② Change from low watermark to high watermark; ③ Replay operations in order of primary key to the in-memory table.
  • Report high watermark.
insert k3      update k2      delete k1
| | |
v v v
bin log --|---------------------------------------------------|-- log offset
low watermark high watermark
CDC read data: k1 k3 k4
| replay
v
Real data: k2 k3' k4

Incremental Phase

Before starting the incremental phase, all splits from the previous step are verified, as data updates might occur between splits, such as inserting several records between split1 and split2 during the snapshot phase, which would be missed. Seatunnel’s approach for retrieving data between splits is:

  • Find the smallest watermark from all split reports as the start watermark, and start reading the log.
  • For each log entry read, check if it has been processed in any split from the completedSnapshotSplitInfos. If not, it indicates that it is data from between splits and should be corrected.
  • After filtering the table, the entries from completedSnapshotSplitInfos can be deleted, and the remaining tables continued.
  • Once all splits have been verified, the full incremental phase begins.
|------------filter split2-----------------|
|----filter split1------|
data log -|-----------------------|------------------|----------------------------------|- log offset
min watermark split1 watermark split2 watermark max watermark

Checkpoint Resume

How to achieve pause and resume? Distributed snapshot algorithm (Chandy-Lamport):

Suppose the system contains two processes, p1 and p2. Process p1’s state includes three variables X1, Y1, Z1, and p2 contains three variables X2, Y2, Z2, with the initial state as follows:

p1                                  p2
X1:0 X2:4
Y1:0 Y2:2
Z1:0 Z2:3

At this point, p1 initiates a global snapshot recording, first recording its own process state, then sending marker information to p2. Before the marker information reaches p2, p2 sends message M to p1.

p1                                  p2
X1:0 -------marker-------> X2:4
Y1:0 <---------M---------- Y2:2
Z1:0 Z2:3

After p2 receives the marker from p1, it records its own state, and then p1 receives the message M sent by p2 before. Since p1 has already taken a local snapshot, p1 only needs to record M. Thus, the final snapshot is as follows:

p1 M                                p2
X1:0 X2:4
Y1:0 Y2:2
Z1:0 Z2:3

During the SeaTunnel CDC process, markers sent to all nodes like readers, splitEnumerators, and writers will save their own memory state.

--

--

Apache SeaTunnel

The next-generation high-performance, distributed, massive data integration tool.