CDC MySQL

Captures changes from the MySQL/MariaDB binary log (Change Data Capture, CDC). Supports batched reads, automatic reconnection with exponential backoff, and ordered checkpointing.

Prerequisites: MySQL must have log_bin = ON, binlog_format = ROW, and binlog_row_image = FULL. For GTID mode, also enable gtid_mode = ON and enforce_gtid_consistency = ON. The user needs REPLICATION SLAVE, REPLICATION CLIENT, and SELECT privileges.
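
The settings above can be checked before the component is started. The following is a minimal sketch, not part of the component: `check_prerequisites` and the `variables` dict are hypothetical names, and the dict is assumed to have been built from the rows returned by `SHOW VARIABLES`.

```python
# Required settings, taken from the prerequisites listed above.
REQUIRED = {"log_bin": "ON", "binlog_format": "ROW", "binlog_row_image": "FULL"}
REQUIRED_GTID = {"gtid_mode": "ON", "enforce_gtid_consistency": "ON"}


def check_prerequisites(variables, gtid=True):
    """Return a list of problems; an empty list means all settings match.

    `variables` is a hypothetical dict of server variables, for example
    built from the rows returned by SHOW VARIABLES.
    """
    expected = dict(REQUIRED)
    if gtid:
        expected.update(REQUIRED_GTID)
    problems = []
    for name, want in expected.items():
        # Compare case-insensitively; a missing variable counts as a problem.
        got = str(variables.get(name, "<unset>")).upper()
        if got != want:
            problems.append(f"{name} = {got}, expected {want}")
    return problems


server = {"log_bin": "ON", "binlog_format": "MIXED", "binlog_row_image": "FULL"}
print(check_prerequisites(server, gtid=False))
# prints ['binlog_format = MIXED, expected ROW']
```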

Connection

| Field | Type | Default | Description |
|---|---|---|---|
| Host | string | localhost | MySQL server hostname or IP address |
| Port | integer | 3306 | MySQL server port |
| User | string | root | MySQL username |
| Password | string | | MySQL password |
| Server ID | string | 1000 | Unique server ID for this binlog consumer. Corresponds to MySQL's server_id |
| Flavor | string | mysql | Database flavor (mysql or mariadb) |

Position tracking

Requires a cache resource to persist binlog position across restarts.

| Field | Type | Default | Description |
|---|---|---|---|
| Position Cache | cache | | Cache resource for storing binlog position |
| Position Cache Key | string | | Key within the cache to store position data |
| Position Mode | string | gtid | Position tracking mode: gtid or file |
| Cache Save Interval | string | 30s | How often to persist position to cache. 0s for immediate saves |

GTID mode (default, recommended): stores a GTID set and, on first start, queries gtid_purged from MySQL. File mode: stores the binlog filename and position. If the stored position is no longer available (e.g., binlogs purged), the component automatically discards it and reconnects from the earliest available position.
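
The fallback for a purged position can be illustrated for file mode. This is a sketch under stated assumptions, not the component's implementation: `choose_start` is a hypothetical helper, and the list of available files is assumed to come from something like `SHOW BINARY LOGS`. The offset 4 reflects the 4-byte magic header at the start of every binlog file.

```python
BINLOG_HEADER_SIZE = 4  # events begin after the 4-byte magic header


def choose_start(stored, available_files):
    """Pick a (filename, offset) to start streaming from in file mode.

    stored:          (filename, offset) loaded from the cache, or None
    available_files: binlog filenames still on the server, oldest first
    """
    if not available_files:
        raise ValueError("server reports no binary logs")
    if stored is not None and stored[0] in available_files:
        return stored  # stored position is still reachable, so resume there
    # Stored position is missing or purged: restart from the earliest binlog.
    return (available_files[0], BINLOG_HEADER_SIZE)


files = ["mysql-bin.000007", "mysql-bin.000008"]
print(choose_start(("mysql-bin.000003", 1542), files))
# prints ('mysql-bin.000007', 4)
```

Restarting from the earliest file means events already processed may be delivered again, so downstream consumers are better off being idempotent.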

Table filtering

| Field | Type | Default | Description |
|---|---|---|---|
| Include Tables | array | | Tables to monitor in schema.table format. If empty, all tables are monitored |
| Exclude Tables | array | | Tables to exclude in schema.table format |
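
The interaction of the two lists can be sketched as follows. `is_monitored` is a hypothetical helper; the sketch assumes exact schema.table matching (no wildcards) and that an exclusion takes precedence when a table appears in both lists, neither of which is stated above.

```python
def is_monitored(schema, table, include=(), exclude=()):
    """Decide whether changes to schema.table should be captured."""
    name = f"{schema}.{table}"
    if name in exclude:
        return False  # assumption: exclude wins over include
    # An empty include list means every table is monitored.
    return not include or name in include


print(is_monitored("shop", "orders"))                            # True
print(is_monitored("shop", "orders", include=["shop.users"]))    # False
print(is_monitored("shop", "audit", exclude=["shop.audit"]))     # False
```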

Schema cache

| Field | Type | Default | Description |
|---|---|---|---|
| Use Schema Cache | boolean | true | Query INFORMATION_SCHEMA for column names |
| Schema Cache TTL | string | 5m | How long cached schema information is valid |

Batching and backpressure

| Field | Type | Default | Description |
|---|---|---|---|
| Max Batch Size | integer | 1000 | Maximum messages per batch |
| Max Pending Checkpoints | integer | 100 | Maximum unacknowledged batches before applying backpressure |
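
One way to picture ordered checkpointing with a pending limit is the sketch below. `CheckpointTracker` is a hypothetical class, not the component's code: it assumes the saved position advances only once every earlier batch has been acknowledged, and that reading pauses while the number of unacknowledged batches is at the limit.

```python
class CheckpointTracker:
    """Track unacknowledged batches and advance the position in order."""

    def __init__(self, max_pending=100):
        self.max_pending = max_pending
        self._pending = {}    # batch_id -> [position, acked], in emit order
        self._next_id = 0
        self.committed = None  # last position that is safe to persist

    def can_emit(self):
        # Backpressure: stop reading once too many batches are in flight.
        return len(self._pending) < self.max_pending

    def track(self, position):
        if not self.can_emit():
            raise RuntimeError("backpressure: too many unacknowledged batches")
        batch_id = self._next_id
        self._next_id += 1
        self._pending[batch_id] = [position, False]
        return batch_id

    def ack(self, batch_id):
        self._pending[batch_id][1] = True
        # Advance only across the contiguous acknowledged prefix, so a late
        # ack for an older batch can never be skipped over.
        while self._pending:
            oldest = next(iter(self._pending))
            position, acked = self._pending[oldest]
            if not acked:
                break
            self.committed = position
            del self._pending[oldest]
        return self.committed


tracker = CheckpointTracker(max_pending=2)
first = tracker.track("pos-1")
second = tracker.track("pos-2")
print(tracker.can_emit())   # False: limit of 2 pending batches reached
print(tracker.ack(second))  # None: the first batch is still unacknowledged
print(tracker.ack(first))   # pos-2: both batches are now acknowledged
```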

Connection retry

| Field | Type | Default | Description |
|---|---|---|---|
| Retry Initial Interval | string | 1s | Initial wait before first reconnect attempt |
| Retry Max Interval | string | 30s | Maximum wait between reconnect attempts |
| Retry Multiplier | float | 2.0 | Exponential backoff multiplier |
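
With the defaults, the three settings combine into the wait sequence sketched below. `backoff_delays` is a hypothetical helper working in seconds; it assumes plain exponential growth capped at the maximum interval, with no jitter, which the settings above do not specify either way.

```python
def backoff_delays(attempts, initial=1.0, max_interval=30.0, multiplier=2.0):
    """Yield the wait in seconds before each reconnect attempt."""
    delay = initial
    for _ in range(attempts):
        yield min(delay, max_interval)  # never wait longer than the cap
        delay *= multiplier


print(list(backoff_delays(7)))
# prints [1.0, 2.0, 4.0, 8.0, 16.0, 30.0, 30.0]
```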

Output format

Each message contains:

| Field | Type | Description |
|---|---|---|
| database | string | Database name |
| table | string | Table name |
| type | string | Operation type: insert, update, or delete |
| ts | integer | Unix timestamp when the event was processed |
| server_id | string | The configured server ID value |
| new | object | Row after the change; present for insert and update, absent for delete |
| old | object | Row before the change; present for update and delete, absent for insert |
| gtid | string | GTID of the transaction (when using GTID mode) |
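
A consumer typically branches on type and reads new or old accordingly. The sketch below uses a made-up update message; every value in it (database, table, columns, timestamp, GTID) is illustrative only, and `describe` is a hypothetical helper.

```python
def describe(msg):
    """Summarise a change message using only the fields present for its type."""
    target = f"{msg['database']}.{msg['table']}"
    if msg["type"] == "insert":
        return f"insert into {target}: {msg['new']}"   # 'old' is absent
    if msg["type"] == "delete":
        return f"delete from {target}: {msg['old']}"   # 'new' is absent
    if msg["type"] == "update":
        # Report only the columns whose value actually changed.
        changed = {
            col: (msg["old"].get(col), value)
            for col, value in msg["new"].items()
            if msg["old"].get(col) != value
        }
        return f"update {target}: {changed}"
    raise ValueError(f"unknown operation type: {msg['type']}")


# Illustrative message; all values are invented for this example.
message = {
    "database": "shop",
    "table": "orders",
    "type": "update",
    "ts": 1700000000,
    "server_id": "1000",
    "old": {"id": 7, "status": "pending"},
    "new": {"id": 7, "status": "shipped"},
    "gtid": "3e11fa47-71ca-11e1-9e33-c80aa9429562:23",
}
print(describe(message))
# prints update shop.orders: {'status': ('pending', 'shipped')}
```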
Tip: For CDC position tracking, use a persistent cache type like Redis or File. The Memory cache works for development but loses position data on restart, causing the consumer to reprocess events from the earliest available position.