JetStream publish api needs better Leader election signal #1781

Open
VadimZhiltsov opened this issue Jan 16, 2025 · 3 comments
Labels
defect Suspected defect such as a bug or regression

Comments

@VadimZhiltsov

Observed behavior

Currently, js.Publish() treats a noResponders error as a sign that the stream is undergoing leader election.
https://github.com/nats-io/nats.go/blob/main/jetstream/publish.go#L201

That is a good way to detect that the stream is not ready to accept messages, but it breaks down if there are any Core NATS subscriptions on a subject that is part of the stream.
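To make the mechanism concrete, here is a minimal, stdlib-only Go sketch of retrying a publish when it fails with a no-responders error. It is not the nats.go implementation: `publishWithRetry`, `publishFunc`, and both sentinel errors are hypothetical stand-ins, and the retry count and wait are arbitrary.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// Hypothetical stand-ins for the client's errors; not the nats.go values.
var (
	errNoResponders     = errors.New("no responders")
	errNoStreamResponse = errors.New("no response from stream")
)

// publishFunc is a hypothetical single publish attempt.
type publishFunc func() error

// publishWithRetry retries only while the attempt fails with errNoResponders.
// It returns the number of attempts made and the final error.
func publishWithRetry(publish publishFunc, retries int, wait time.Duration) (int, error) {
	for try := 0; ; try++ {
		err := publish()
		if !errors.Is(err, errNoResponders) {
			// Success, or an error that is not treated as a retry signal
			// (a timeout, for example).
			return try + 1, err
		}
		if try >= retries {
			return try + 1, errNoStreamResponse
		}
		time.Sleep(wait)
	}
}

func main() {
	// Simulate an election that finishes after two failed attempts.
	calls := 0
	electing := func() error {
		calls++
		if calls <= 2 {
			return errNoResponders
		}
		return nil
	}
	n, err := publishWithRetry(electing, 2, 10*time.Millisecond)
	fmt.Println("attempts:", n, "err:", err)

	// A timeout is returned immediately, with no retry.
	errTimeout := errors.New("timeout")
	n, err = publishWithRetry(func() error { return errTimeout }, 2, 10*time.Millisecond)
	fmt.Println("attempts:", n, "err:", err)
}
```

In this sketch a timeout goes straight back to the caller after one attempt, which is the path the publish takes when something other than the stream holds interest in the subject.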

Expected behavior

The NATS SDK should trigger its retry logic, because leader election is ongoing and the message cannot be stored.

Server and client version

latest & latest

Host environment

does not matter

Steps to reproduce

Steps to reproduce:

  1. Run a NATS cluster with 3 nodes
  2. Set up a JetStream stream with replication factor 3
  3. Include the subject subject.example in the stream
  4. Trigger a leader election for the stream (don't wait for it to complete)
  5. Publish a message to subject.example with the JS.Publish API

As a result, because of the Core subscriber, the request to store the message in JetStream fails with a timeout instead of triggering a noResponders error and the retry logic in the NATS SDK.

@VadimZhiltsov VadimZhiltsov added the defect Suspected defect such as a bug or regression label Jan 16, 2025
@Jarema
Member

Jarema commented Jan 16, 2025

No responders can also occur for other reasons:

  • JetStream itself is not enabled on the server
  • There is no stream with a matching subject

The SDK has no way to differentiate those, so it should not retry.
Also, we are very careful about adding automatic retries, since in many cases they are unwanted when response time (whether for a failure or a success) is critical.

@VadimZhiltsov
Author

@Jarema this functionality is already in the NATS SDK, so is there a desire to remove it?

I believe we still need some resiliency pattern to ensure message delivery to NATS during stream leader election.
Maybe we could keep the stream's subscription on the subject and respond with a kind of negative acknowledgement carrying an error such as "LeaderElectionInProgress", so that we can retry?
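A rough Go sketch of how a client could act on such a reply, purely to illustrate the proposal. Everything here is hypothetical: the `pubReply` shape, the `error` field carrying the string "LeaderElectionInProgress", and the `shouldRetry` helper are assumptions and do not describe the existing JetStream protocol or nats.go.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// pubReply is a hypothetical reply shape for this proposal only.
type pubReply struct {
	Stream string `json:"stream,omitempty"`
	Seq    uint64 `json:"seq,omitempty"`
	Error  string `json:"error,omitempty"`
}

// leaderElectionInProgress is the hypothetical error name from the proposal.
const leaderElectionInProgress = "LeaderElectionInProgress"

// shouldRetry decodes a reply and reports whether the publish should be
// retried. Any other error is returned to the caller without a retry.
func shouldRetry(raw []byte) (bool, error) {
	var r pubReply
	if err := json.Unmarshal(raw, &r); err != nil {
		return false, fmt.Errorf("decode reply: %w", err)
	}
	switch r.Error {
	case "":
		return false, nil // stored, nothing to retry
	case leaderElectionInProgress:
		return true, nil // explicit signal that a retry makes sense
	default:
		return false, fmt.Errorf("publish rejected: %s", r.Error)
	}
}

func main() {
	replies := []string{
		`{"stream":"EXAMPLE","seq":1}`,
		`{"error":"LeaderElectionInProgress"}`,
		`{"error":"stream not found"}`,
	}
	for _, raw := range replies {
		retry, err := shouldRetry([]byte(raw))
		fmt.Printf("retry=%v err=%v\n", retry, err)
	}
}
```

The point of an explicit error is that the retry decision no longer depends on whether anyone else is subscribed to the subject.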

@Jarema
Member

Jarema commented Jan 16, 2025

What I'm trying to say is that there is no way of knowing whether no responders is caused by leader election.

No responders happens because no one is listening on the subject.
If you have a core subscription on the same subject as the stream, both the server and the client register interest, so removing the stream, triggering a leader election, or disabling JetStream will no longer cause no responders, because there are responders.
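A toy Go model of that interest rule, as a sketch only. The `server`, `handler`, and error values below are hypothetical and stdlib-only; they mimic the behavior described above rather than the real NATS server or client.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// Hypothetical stand-ins for the client's errors.
var (
	errNoResponders = errors.New("no responders")
	errTimeout      = errors.New("timeout")
)

// handler receives a request and reports whether it sent a reply.
type handler func(data string) (reply string, replied bool)

// server is a toy model that tracks interest per subject.
type server struct {
	subs map[string][]handler
}

func newServer() *server {
	return &server{subs: make(map[string][]handler)}
}

func (s *server) subscribe(subject string, h handler) {
	s.subs[subject] = append(s.subs[subject], h)
}

// request fails fast with errNoResponders only when nobody is subscribed.
// If someone is subscribed but never replies, it waits and then times out.
func (s *server) request(subject, data string, timeout time.Duration) (string, error) {
	handlers := s.subs[subject]
	if len(handlers) == 0 {
		return "", errNoResponders
	}
	for _, h := range handlers {
		if reply, ok := h(data); ok {
			return reply, nil
		}
	}
	time.Sleep(timeout)
	return "", errTimeout
}

func main() {
	// A core subscriber consumes the message but never replies.
	coreSub := func(string) (string, bool) { return "", false }
	// A stream leader replies with an acknowledgement.
	leader := func(string) (string, bool) { return "+ACK", true }

	s := newServer()
	_, err := s.request("subject.example", "msg", 50*time.Millisecond)
	fmt.Println("no leader, no core sub:", err)

	s.subscribe("subject.example", coreSub)
	_, err = s.request("subject.example", "msg", 50*time.Millisecond)
	fmt.Println("no leader, core sub:", err)

	s.subscribe("subject.example", leader)
	reply, err := s.request("subject.example", "msg", 50*time.Millisecond)
	fmt.Println("leader present:", reply, err)
}
```

In the second scenario the core subscriber alone turns the fast no-responders failure into a timeout, which is the case reported in this issue.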

PS - you're right, there is retry logic; however, I'd prefer it if it were opt-in. There are no plans to remove it, though. Sorry for the confusion.

@wallyqs wallyqs changed the title Jetstream publish api needs better Leader election signal JetStream publish api needs better Leader election signal Jan 24, 2025