Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pass at least one key to script executions #49

Open
wants to merge 7 commits into
base: master
Choose a base branch
from

Conversation

nyergler
Copy link
Contributor

@nyergler nyergler commented Jan 6, 2021

This allows Redis to route our execution to the correct node when running against a Redis Cluster.


I encountered this issue with an on premises deployment of our application, in which our customer is using Redis Cluster. They were seeing errors with cross-slot key access in a Lua script. This surprised me, since I knew that work provides a mechanism for namespacing keys to avoid this very problem.

And then I read the following in the EVAL docs:

All Redis commands must be analyzed before execution to determine which keys the command will operate on. In order for this to be true for EVAL, keys must be passed explicitly. This is useful in many ways, but especially to make sure Redis Cluster can forward your request to the appropriate cluster node.

Note this rule is not enforced in order to provide the user with opportunities to abuse the Redis single instance configuration, at the cost of writing scripts not compatible with Redis Cluster

When I noticed that work doesn't pass any keys to the script evaluation, I surmised that this allowed Redis to route the script to any node in the cluster.

This PR is the minimal update to fix this issue, passing a single namespaced key to the script evaluation.

A more thorough, albeit slightly more invasive, fix would be to perform all key composition outside the script. That would also have the nice advantage of not requiring the Lua & Go key creation to be kept in sync.

If you're open to that, I'll happily update my fork and this PR.

@taylorchu
Copy link
Owner

taylorchu commented Jan 8, 2021

We run sentinel but getting this to work sounds good to me.

I will try to add a circle ci redis "cluster" environment for this here:

build_and_test:

the previous idea is to use this "hash" in either name space or queue name:

Hash tags are documented in the Redis Cluster specification, but the gist is that if there is a substring between {} brackets in a key, only what is inside the string is hashed, so for example this{foo}key and another{foo}key are guaranteed to be in the same hash slot, and can be used together in a command with multiple keys as arguments.

@taylorchu
Copy link
Owner

@nyergler check this out: #50

This current master already works by adding {} "hash" to either namespace / queue name.

  • if you add {} namespace, all kvs created will be on the same node
  • furthermore, adding {} to queue name will be able to shard queues in the same namespace.

@nyergler
Copy link
Contributor Author

We were previously specifying the namespace using {} in order to get Redis Cluster to treat it as a hash tag. However, it seems that even then, unless you specify the key(s) as EVAL arguments, you're basically just getting lucky if it works. My understanding is that Redis Cluster routes the script to a single node, choosing that node based on the KEYS passed in. If no KEYS are passed, it gets routed to any node, since it doesn't know what needs to be accessed until the script is actually executed.

For what it's worth, we deployed this initially just specifying the hash tag as the namespace (in {}), and things seemed to work fine. It was only when utilization grew and Redis rebalanced the nodes that we started seeing this issue. Passing the key in as part of the call to EVAL immediately fixed the problem.

@taylorchu
Copy link
Owner

I need more info to produce this locally.

  1. the redis version that you are running.
  2. how redis cluster master/slave is added in rebalancing (the steps/commands that you run). there might be an issue of moving slots. I think there is still a missing piece from how redis works from your description because I can change the hash {} to any value and the tests still pass under the same cluster. If what you described is true that it might send to any node, the tests will fail 67% of the time after I change hash to another value.

@nyergler
Copy link
Contributor Author

Hey @taylorchu, makes sense; I'll come up with a crisp set of repro steps and update here.

nyergler and others added 2 commits November 10, 2022 11:46
This allows Redis to route our execution to the correct node when
running in a Redis Cluster.
Quick fix. We use the heartbeat middlewear which allows long-running
jobs to work. However, the context is also ending up canceled which is
causing other issues.
@taylorchu
Copy link
Owner

@jpalawaga @nyergler could you check if 016e03b fixes it?

@taylorchu
Copy link
Owner

backstory: it turns out go-redis 9.0.3 fixes a bug where it might take the first non key (normal args) as routing key, so this problem is not surfaced. after upgrading to 9.0.5, I finally can reproduce this problem and fix it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants