Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

jetstream.PublishMsgAsync panic #1661

Open
AetheWu opened this issue Jun 21, 2024 · 9 comments
Open

jetstream.PublishMsgAsync panic #1661

AetheWu opened this issue Jun 21, 2024 · 9 comments
Labels
defect Suspected defect such as a bug or regression

Comments

@AetheWu
Copy link

AetheWu commented Jun 21, 2024

Observed behavior

Publish messages to a stream which is not existed, then the client panic

panic: runtime error: slice bounds out of range [14:0]

goroutine 2369 [running]:
github.com/nats-io/nats.go/jetstream.(*jetStream).PublishMsgAsync(0xc00007e690, 0xc00033f810, {0xc00073e7c8, 0x1, 0x1?})
	/go/pkg/mod/github.com/nats-io/[email protected]/jetstream/publish.go:306 +0xa49
github.com/nats-io/nats.go/jetstream.(*jetStream).handleAsyncReply.func2()
	/go/pkg/mod/github.com/nats-io/[email protected]/jetstream/publish.go:398 +0xd8
created by time.goFunc
	/usr/local/go/src/time/sleep.go:177 +0x2d

Expected behavior

no panic

Server and client version

nats-server: v2.10.16
nats-client: nats.go-v1.34.1

Host environment

linux amd64;
nats deploy with helm

Steps to reproduce

No response

@AetheWu AetheWu added the defect Suspected defect such as a bug or regression label Jun 21, 2024
@piotrpio
Copy link
Collaborator

Hello @AetheWu, do you have a reproduction example? I tried to reproduce this and so far I am getting a valid error (nats: no response from stream) and no panic, so it would be awesome if you could provide some more details.

@patihomirov
Copy link

patihomirov commented Jan 23, 2025

in version 1.36 is bug:

    if paf == nil && m.Reply != "" {
        return nil, ErrAsyncPublishReplySubjectSet
    }

if paf == nil {
...
    } else {
        // when retrying, get the ID from existing reply subject
        id = m.Reply[js.replyPrefixLen:]
    }

Obviously, if paf != nil && m.Reply == "", we praise panic.

And in version 1.38:

  } else {
        // when retrying, get the ID from existing reply subject
        reply = paf.reply
        id = reply[js.replyPrefixLen:]
    }

that is, the data is taken from a non-empty paf.

@patihomirov
Copy link

patihomirov commented Jan 23, 2025

Can you confirm that this bug on version 1.38 was resolved?

@patihomirov
Copy link

It seems to me that panic occurs when trying to resend, when using paf.

@patihomirov
Copy link

Its covered by tests in 1.38 here:

{
name: "invalid subject set",
msgs: []publishConfig{
{
msg: &nats.Msg{
Data: []byte("msg 1"),
Subject: "ABC",
},
withAckError: func(t *testing.T, err error) {
if !errors.Is(err, jetstream.ErrNoStreamResponse) {
t.Fatalf("Expected error: %v; got: %v", jetstream.ErrNoStreamResponse, err)
}
},
},
},
},

If use 1.38 and rollback this:

        // when retrying, get the ID from existing reply subject
        reply = paf.reply
        id = reply[js.replyPrefixLen:]
    }

to 1.36:

 } else {
        // when retrying, get the ID from existing reply subject
        id = m.Reply[js.replyPrefixLen:]
    }

you will catch this bug.

@piotrpio
Copy link
Collaborator

@patihomirov I'm a bit lost here, so you're saying there is no issue in 1.38.0 or are you still seeing something? I would assume this (and a few other) issue were fixed in #1719

@patihomirov
Copy link

patihomirov commented Jan 24, 2025

Im not sure yet. We try to catch/retry bug on 1.36 to be shure that it was fixed on 1.38.
Our bug appear when message cant be delivered and m.Reply became "" before we retry deliever.

@patihomirov
Copy link

patihomirov commented Jan 24, 2025

I think I caught something on 1.36 while I was using a debugger on the code during the test. But not stable. 2 times out of 6 attempts.

Here: https://github.com/nats-io/nats.go/blob/c97f022375952007086bf4b063b8e5e708e64620/jetstream/publish .go#L290-L293

Panic happens if I spend a long time using a debugger on PublishMsgAsync and this defer

defer func() { m.Reply = "" }()
if err != nil {
return nil, fmt.Errorf("nats: error creating async reply handler: %s", err)
}

It works before we get to here:
paf.retries++

@patihomirov
Copy link

patihomirov commented Jan 24, 2025

We have made a global chan. We are waiting for this chan in the defer here

defer func() { m.Reply = "" }()
, provided that m.The Subject contains the required topic. Here:
_, err := js.PublishMsgAsync(paf.msg, func(po *pubOpts) error {
before calling PublishMsgAsync, we write to this chan and do a 50ms slip. In this case, this test consistently panics.
name: "invalid subject set",

The pill for this panic for our project is to call PublishMsgAsync with opts: []jetstream.PublishOpt{jetstream.WithRetryAttempts(0)}, because we do not use Retry in the project. Or upgrade to 1.38, but we are not completely sure that we can safely include 1.38 in our project.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
defect Suspected defect such as a bug or regression
Projects
None yet
Development

No branches or pull requests

3 participants