[documentation] Configuring CDC source with a Cassandra cluster behind a router #76

aymkhalil · 2022-07-06T17:48:44Z

While the CDC agent publishes mutations to the dirty topic in Pulsar, the deployed C* source needs to query the Cassandra nodes for the converged record before publishing to the clean topic. This setup can be challenging because it is not obvious how the source connector talks to the Cassandra cluster and performs node discovery.

It would be nice to document a reference network topology with Cassandra sitting behind a NAT address and explain to users how to choose their source contact points.

A few notes:

The Cassandra driver automatically discovers nodes after connecting to an initial set of nodes defined by the contact points.
Contact points are configured on the source via:

--source-config "{
  \"keyspace\": \"ks1\",
  \"table\": \"table1\",
  ...
  \"contactPoints\": \"localhost OR NAT address/etc.\",
  ...
}"
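As a rough sketch of how that argument could be generated instead of hand-escaped, the snippet below builds the JSON with Python's standard library. The helper name `build_source_config` and the address `203.0.113.10` (a documentation-range IP standing in for the NAT address) are hypothetical, and joining several contact points with commas is an assumption about the format the connector accepts. The settings elided with `...` above are intentionally left out.

```python
import json
import shlex


def build_source_config(keyspace, table, contact_points):
    """Return the JSON string to pass to --source-config.

    Only the three keys shown in this issue are set; any other
    connector settings would have to be added by the caller.
    """
    config = {
        "keyspace": keyspace,
        "table": table,
        # Assumption: multiple contact points are comma-separated.
        "contactPoints": ",".join(contact_points),
    }
    return json.dumps(config)


if __name__ == "__main__":
    # Hypothetical NAT address exposed in front of a seed node.
    cfg = build_source_config("ks1", "table1", ["203.0.113.10"])
    # shlex.quote makes the JSON safe to paste into a POSIX shell command.
    print("--source-config " + shlex.quote(cfg))
```

Generating the JSON this way avoids the backslash-escaped quotes and makes it easier to swap `localhost` for a NAT address per environment.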

Cassandra nodes have a few relevant settings in the cassandra.yaml configuration file (namely listen_address, rpc_address, broadcast_address and broadcast_rpc_address). Check the advanced settings section.
Seed nodes are good candidates for bootstrapping the driver (and can go in contactPoints); it might be reasonable to expose only those via NAT and keep the non-seed nodes private. (More regarding seed points)
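To illustrate why the address settings matter for discovery, here is a minimal sketch of a reachability check. It assumes that a node advertises broadcast_rpc_address to drivers and falls back to rpc_address when that setting is blank. The function names, node names and all addresses are hypothetical, and the check is only a model of the topology, not something the connector or driver provides.

```python
def advertised_client_address(node_conf):
    """Address a node is assumed to report to drivers.

    Assumption: broadcast_rpc_address wins when set, otherwise
    rpc_address is used.
    """
    return node_conf.get("broadcast_rpc_address") or node_conf.get("rpc_address")


def unreachable_nodes(nodes, reachable_addresses):
    """Names of nodes whose advertised address the connector cannot reach."""
    return sorted(
        name
        for name, conf in nodes.items()
        if advertised_client_address(conf) not in reachable_addresses
    )


# Hypothetical cluster: the seed is exposed through NAT, the other node is private.
nodes = {
    "seed1": {"rpc_address": "0.0.0.0", "broadcast_rpc_address": "203.0.113.10"},
    "node2": {"rpc_address": "10.0.0.6"},
}

# Addresses the source connector can route to from outside the private network.
reachable = {"203.0.113.10"}

print(unreachable_nodes(nodes, reachable))
```

In this model the driver can bootstrap through the seed, but any discovered node that advertises only a private address stays unreachable from the connector. That is the case a reference topology would need to address.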
