[documentation] Configuring CDC source with a Cassandra cluster behind a router #76

Open
aymkhalil opened this issue Jul 6, 2022 · 0 comments

aymkhalil commented Jul 6, 2022

While the CDC agent publishes mutations to the dirty topic in Pulsar, the deployed C* source needs to query the Cassandra nodes for the converged record before publishing to the clean topic. This setup can be challenging because it is not obvious how the source connector should reach the Cassandra cluster and perform node discovery.

It would be nice to document a reference network topology with Cassandra sitting behind a NAT address, and to explain how users should choose their source contact points.
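
As a starting point for such a doc, here is a minimal sketch of a reachability check that could be run from the host where the source connector is deployed. It only verifies that a TCP connection to a candidate contact point can be opened; it does not speak CQL and says nothing about node discovery. The function name is hypothetical, and the default port assumes Cassandra's standard native transport port (9042), which may differ in a given cluster.

```python
import socket


def is_reachable(host, port=9042, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened.

    Hypothetical helper, not part of the connector or the driver.
    The default port 9042 is an assumption (Cassandra's usual native
    transport port); pass the real one if the cluster uses another.
    """
    try:
        # create_connection resolves the host and attempts a TCP connect.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS resolution failures.
        return False
```

Calling `is_reachable("<NAT address>")` from the connector host gives a quick yes/no before debugging the source itself. A `True` result only means the first hop works: the addresses the driver discovers afterwards must be reachable from the same host as well.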

A few notes:

  • The Cassandra driver automatically discovers nodes after connecting to an initial set of nodes defined by the contact points.
  • Contact points are configured on the source via
--source-config "{
  \"keyspace\": \"ks1\",
  \"table\": \"table1\",
  ...
  \"contactPoints\": \"localhost OR NAT address/etc.\",
  ...
}" 
  • Cassandra nodes have a few relevant settings in the cassandra.yaml configuration file (namely listen_address, rpc_address, broadcast_address and broadcast_rpc_address). Check the advanced settings section.
  • Seed nodes are good candidates for bootstrapping the driver (and can go in contactPoints). It might be reasonable to expose only those via NAT and keep the non-seed nodes private. (More regarding seed points)
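
To make the contactPoints note concrete, here is a small sketch that builds the `--source-config` value programmatically instead of hand-escaping the quotes. Only the three keys shown in this issue (keyspace, table, contactPoints) are included; the elided settings are left out rather than guessed. The helper names and the address in the usage note are hypothetical.

```python
import json
import shlex


def build_source_config(keyspace, table, contact_points):
    """Return the source config as a JSON string.

    Hypothetical helper. It only sets the three keys shown in this
    issue; any other settings the source needs must be added by the user.
    """
    config = {
        "keyspace": keyspace,
        "table": table,
        # Address the connector can actually reach, e.g. the NAT address
        # of a seed node rather than a private node address.
        "contactPoints": contact_points,
    }
    return json.dumps(config)


def as_cli_argument(config_json):
    """Quote the JSON so a POSIX shell passes it through unchanged."""
    return "--source-config " + shlex.quote(config_json)
```

For example, `as_cli_argument(build_source_config("ks1", "table1", "10.0.0.5"))` yields a single-quoted JSON argument, where `10.0.0.5` stands in for whatever NAT address fronts the seed nodes.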