Sticky routing based on next-uri fields #446

Open
shk3 opened this issue Aug 22, 2024 · 14 comments

Comments

@shk3

shk3 commented Aug 22, 2024

Hi folks,

Have we considered rewriting next-uri and info-uri directly in the responses in order to achieve query-level sticky routing?

The idea is kinda similar to Trino Proxy, where Trino Gateway proxies all requests. Then we bind the URLs in the following ways:

  • /v1/...: handled by the current logic and proxied to one of the Trino coordinators using whichever load-balancing algorithm is configured. In the response, we rewrite the next-uri / info-uri to something like /backend1/v1/... that points directly to the particular backend (or rewrite the X-Forwarded-* headers to achieve the same goal).
  • /backend1/v1/...: directly proxied to backend 1 (but block POST to /v1/statement so that queries can only be submitted via the load balancer);
  • /backend2/v1/...: directly proxied to backend 2 (but block POST to /v1/statement so that queries can only be submitted via the load balancer);
  • ...
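
To make the URL binding above concrete, here is a minimal sketch of the prefix-based dispatch, assuming a static map from prefix to coordinator URL. The class name `PrefixRouter`, the `Route` holder, and the example backend URLs are hypothetical and are not part of Trino Gateway; this only illustrates the idea, not how the gateway would actually implement it.

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical helper, not part of Trino Gateway: resolves "/<backend>/v1/..." paths.
final class PrefixRouter {
    /** The pinned backend, the path to forward, and whether the request is allowed. */
    static final class Route {
        final String backendUrl;
        final String forwardPath;
        final boolean allowed;

        Route(String backendUrl, String forwardPath, boolean allowed) {
            this.backendUrl = backendUrl;
            this.forwardPath = forwardPath;
            this.allowed = allowed;
        }
    }

    // Assumed mapping, e.g. "backend1" -> "http://trino-1.example:8080" (placeholder URL).
    private final Map<String, String> backends;

    PrefixRouter(Map<String, String> backends) {
        this.backends = Map.copyOf(backends);
    }

    /**
     * Returns a Route when the path starts with a known backend prefix followed by /v1/.
     * Returns empty otherwise, so the caller can fall back to normal load balancing.
     */
    Optional<Route> resolve(String method, String path) {
        int second = path.indexOf('/', 1);
        if (!path.startsWith("/") || second < 0) {
            return Optional.empty();
        }
        String name = path.substring(1, second);
        String rest = path.substring(second); // path with the backend prefix stripped
        String url = backends.get(name);
        if (url == null || !rest.startsWith("/v1/")) {
            return Optional.empty();
        }
        // New queries may only be submitted through the load-balanced /v1/statement,
        // so a POST to exactly /v1/statement on a pinned path is marked as not allowed.
        boolean submission = method.equalsIgnoreCase("POST") && rest.equals("/v1/statement");
        return Optional.of(new Route(url, rest, !submission));
    }
}
```

A caller would forward `forwardPath` to `backendUrl` when `allowed` is true, reject the request otherwise, and use the existing routing logic when `resolve` returns empty.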

With this approach, for query-level sticky routing, we don't need to track which backend each query id gets assigned to. Instead, such assignment is retained on the client side.

The caveat is that for the Trino UI, we would need to develop a way for users to run a combined query search across all backends and to see a summary of all backends' stats.

Has this approach been considered in the past? We could eliminate the dependency on the databases / caches. If cross-regional networking could be a concern, we could even change the URLs with different domains to avoid inter-regional proxying.

I know Trino Gateway's architecture is pretty much set, so it's not necessarily something we have to do now, but mostly a discussion just in case later on it's needed.

George

@xkrogen
Member

xkrogen commented Aug 22, 2024

We talked a bit about making the GW more of a "full proxy" in one of the recent GW dev syncs. It potentially unlocks a lot of new capabilities.

I like the idea you've proposed here of embedding this state into the client instead of storing it on the GW side. Tracking when a query has finished, and thus when its state can be cleaned up, is an annoying process. Right now we just have a periodic task that runs every 2 hours and clears out query records older than a configurable time window (even though such a query may actually still be running!):

private void startCleanUps()
{
    executorService.scheduleWithFixedDelay(
            () -> {
                log.info("Performing query history cleanup task");
                // Cutoff timestamp: records created before (now - retention hours) are deleted
                long created = System.currentTimeMillis() - TimeUnit.HOURS.toMillis(this.configuration.getQueryHistoryHoursRetention());
                jdbi.onDemand(QueryHistoryDao.class).deleteOldHistory(created);
            },
            1, // initial delay
            120, // delay between runs
            TimeUnit.MINUTES);
}

Moving it to the client is in line with Trino philosophy in general, IMO, like how we implement session properties and prepared statements on the client-side.

For the UI, I think as you said, we could do a fan-out that pulls query results from each backend ... That also has the benefit of not having two copies of the same data (query IDs / query history stored on both GW and Coordinator).

Curious to hear what others think, but personally at first pass I like the idea. One thing we should consider is whether this would make it harder to implement other new functionality in the future.

@shk3
Author

shk3 commented Aug 23, 2024

One thing we should consider is whether this would make it harder to implement other new functionality in the future.

Yes! This is the exact concern I have too.

A while ago, we evaluated Trino Gateway vs. running Envoy with a query ID cache vs. just using a thin layer that rewrites headers for next-uri in combination with some cloud load balancers. It's great to see that Trino Gateway is now officially part of the Trino project and is collaborating with Trino!

We could actually achieve this next-uri design even today with the current Trino Gateway, if we tweak the X-Forwarded-* header rewriting logic in some way and put the Trino coordinators on their own domains (e.g. trino-gw.mydomain, trino-1.mydomain, trino-2.mydomain). In this way, Trino Gateway effectively acts as a query dispatcher, and the subsequent calls won't go through Trino Gateway.
However, I'm worried about creating yet another snowflake use case for Trino Gateway. So, let's see whether this idea could fit into Trino Gateway's bigger design in any way without breaking any functionality Trino Gateway wants to support.

@oneonestar
Member

oneonestar commented Aug 23, 2024

I have been thinking about routing using the query ID. When a Trino coordinator starts, it generates a random coordinatorId and embeds it as the last part of every query ID. (ref)

If we can keep track of the coordinatorId for each cluster, we can route it to the corresponding cluster without any additional info.

For example, all query IDs from the same coordinator share the same suffix:

Cluster A (tr8tg):
20240801_040236_47295_tr8tg
20240801_040244_44562_tr8tg
20240801_040245_41234_tr8tg

Cluster B (fejs4):
20240801_040301_24461_fejs4
20240801_040302_21235_fejs4
20240801_040303_25678_fejs4
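
As a rough sketch of what suffix-based routing could look like, assuming the gateway already knows each cluster's coordinatorId by some means: the class `QueryIdRouter` and the backend URLs below are hypothetical, and the only fact taken from the examples above is that the coordinatorId is the part after the last underscore.

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical helper, not part of Trino Gateway: routes by the query ID suffix.
final class QueryIdRouter {
    // Assumed mapping from coordinatorId to backend URL, e.g. "tr8tg" -> cluster A.
    private final Map<String, String> backendByCoordinatorId;

    QueryIdRouter(Map<String, String> backendByCoordinatorId) {
        this.backendByCoordinatorId = Map.copyOf(backendByCoordinatorId);
    }

    /** Extracts the part after the last underscore, e.g. "tr8tg" from "20240801_040236_47295_tr8tg". */
    static Optional<String> coordinatorId(String queryId) {
        int idx = queryId.lastIndexOf('_');
        if (idx < 0 || idx == queryId.length() - 1) {
            return Optional.empty();
        }
        return Optional.of(queryId.substring(idx + 1));
    }

    /** Returns the backend for a query ID, or empty when the suffix is unknown. */
    Optional<String> backendFor(String queryId) {
        // Optional.map yields empty when the lookup returns null.
        return coordinatorId(queryId).map(backendByCoordinatorId::get);
    }
}
```

Note that a coordinator restart generates a new coordinatorId, so the mapping would have to be refreshed whenever a cluster restarts.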

@vishalya
Member

Having a coordinator (or cluster) ID as part of the Trino protocol is a good idea. This could also solve issue #465.

@oneonestar
Member

Currently we can't obtain the coordinatorId through the Trino coordinator's API. It would be great if we could obtain this info from the API when doing the cluster health check.

@mosabua
Member

mosabua commented Oct 16, 2024

Ideally we chat about this with @wendigo and @electrum .. might be good to wrap some of this into the current work on client protocol.

@wendigo
Contributor

wendigo commented Oct 29, 2024

How can I help here? :)

@mosabua
Member

mosabua commented Oct 29, 2024

I think it would be great if you can chime in at trinodb/trino#23910 and help there and also take this into account for the spooling protocol work @wendigo

@oneonestar
Member

oneonestar commented Oct 30, 2024

Looks like we can modify the infoUri and nextUri using X-Forwarded-XX from trinodb/trino#22227.

protocol://{X-Forwarded-Host}/{X-Forwarded-Prefix}/v1/statement/...
$ curl -XPOST -vvvv http://127.0.0.1:8080/v1/statement -d "SELECT 1" -H "X-Trino-User: star"
Note: Unnecessary use of -X or --request, POST is already inferred.
*   Trying 127.0.0.1:8080...
* Connected to 127.0.0.1 (127.0.0.1) port 8080
> POST /v1/statement HTTP/1.1
> Host: 127.0.0.1:8080
> User-Agent: curl/8.7.1
> Accept: */*
> X-Trino-User: star
> Content-Length: 8
> Content-Type: application/x-www-form-urlencoded
>
* upload completely sent off: 8 bytes
< HTTP/1.1 200 OK
< Date: Wed, 30 Oct 2024 15:04:09 GMT
< Vary: Accept-Encoding
< Content-Type: application/json
< X-Content-Type-Options: nosniff
< Content-Length: 595
<
{"id":"20241030_150409_00000_zhg55","infoUri":"http://127.0.0.1:8080/ui/query.html?20241030_150409_00000_zhg55","nextUri":"http://127.0.0.1:8080/v1/statement/queued/20241030_150409_00000_zhg55/y3ad3f7e909dd59fa86eaf43c51a1f45bed0357f0/1","stats":{"state":"QUEUED","queued":true,"scheduled":false,"nodes":0,"totalSplits":0,"queuedSplits":0,"runningSplits":0,"completedSplits":0,"cpuTimeMillis":0,"wallTimeMillis":0,"queuedTimeMillis":0,"elapsedTimeMillis":0,"processedRows":0,"processedBytes":0,"physicalInputBytes":0,"physicalWrittenBytes":0,"peakMemoryBytes":0,"spilledBytes":0},"warnings":[]}
* Connection #0 to host 127.0.0.1 left intact

$ curl -XPOST -vvvv http://127.0.0.1:8080/v1/statement -d "SELECT 1" -H "X-Trino-User: star" -H "X-Forwarded-Prefix: some-prefix"
Note: Unnecessary use of -X or --request, POST is already inferred.
*   Trying 127.0.0.1:8080...
* Connected to 127.0.0.1 (127.0.0.1) port 8080
> POST /v1/statement HTTP/1.1
> Host: 127.0.0.1:8080
> User-Agent: curl/8.7.1
> Accept: */*
> X-Trino-User: star
> X-Forwarded-Prefix: some-prefix
> Content-Length: 8
> Content-Type: application/x-www-form-urlencoded
>
* upload completely sent off: 8 bytes
< HTTP/1.1 200 OK
< Date: Wed, 30 Oct 2024 15:04:36 GMT
< Vary: Accept-Encoding
< Content-Type: application/json
< X-Content-Type-Options: nosniff
< Content-Length: 619
<
{"id":"20241030_150436_00001_zhg55","infoUri":"http://127.0.0.1:8080/some-prefix/ui/query.html?20241030_150436_00001_zhg55","nextUri":"http://127.0.0.1:8080/some-prefix/v1/statement/queued/20241030_150436_00001_zhg55/y6f4617b74af025647da20558223ecfbf0dc324ee/1","stats":{"state":"QUEUED","queued":true,"scheduled":false,"nodes":0,"totalSplits":0,"queuedSplits":0,"runningSplits":0,"completedSplits":0,"cpuTimeMillis":0,"wallTimeMillis":0,"queuedTimeMillis":0,"elapsedTimeMillis":0,"processedRows":0,"processedBytes":0,"physicalInputBytes":0,"physicalWrittenBytes":0,"peakMemoryBytes":0,"spilledBytes":0},"warnings":[]}
* Connection #0 to host 127.0.0.1 left intact

$ curl -XPOST -vvvv http://127.0.0.1:8080/v1/statement -d "SELECT 1" -H "X-Trino-User: star" -H "X-Forwarded-Prefix: some-prefix" -H "X-Forwarded-Host: some-host.com"
Note: Unnecessary use of -X or --request, POST is already inferred.
*   Trying 127.0.0.1:8080...
* Connected to 127.0.0.1 (127.0.0.1) port 8080
> POST /v1/statement HTTP/1.1
> Host: 127.0.0.1:8080
> User-Agent: curl/8.7.1
> Accept: */*
> X-Trino-User: star
> X-Forwarded-Prefix: some-prefix
> X-Forwarded-Host: some-host.com
> Content-Length: 8
> Content-Type: application/x-www-form-urlencoded
>
* upload completely sent off: 8 bytes
< HTTP/1.1 200 OK
< Date: Wed, 30 Oct 2024 15:05:00 GMT
< Vary: Accept-Encoding
< Content-Type: application/json
< X-Content-Type-Options: nosniff
< Content-Length: 617
<
{"id":"20241030_150500_00002_zhg55","infoUri":"http://some-host.com/some-prefix/ui/query.html?20241030_150500_00002_zhg55","nextUri":"http://some-host.com/some-prefix/v1/statement/queued/20241030_150500_00002_zhg55/y284ba1cdd09640c34cec06ca74afeb21acf10123/1","stats":{"state":"QUEUED","queued":true,"scheduled":false,"nodes":0,"totalSplits":0,"queuedSplits":0,"runningSplits":0,"completedSplits":0,"cpuTimeMillis":0,"wallTimeMillis":0,"queuedTimeMillis":0,"elapsedTimeMillis":0,"processedRows":0,"processedBytes":0,"physicalInputBytes":0,"physicalWrittenBytes":0,"peakMemoryBytes":0,"spilledBytes":0},"warnings":[]}
* Connection #0 to host 127.0.0.1 left intact

One interesting thing is the url with prefix won't work. Coordinator will return 404 for that.

$ curl -XPOST http://127.0.0.1:8080/v1/statement -d "SELECT 1" -H "X-Trino-User: star" -H "X-Forwarded-Prefix: some-prefix"
{"id":"20241030_153616_00003_z5drh","infoUri":"http://127.0.0.1:8080/some-prefix/ui/query.html?20241030_153616_00003_z5drh","nextUri":"http://127.0.0.1:8080/some-prefix/v1/statement/queued/20241030_153616_00003_z5drh/yb1171cfcc1af0a43017540da914037c349f8753c/1","stats":{"state":"QUEUED","queued":true,"scheduled":false,"nodes":0,"totalSplits":0,"queuedSplits":0,"runningSplits":0,"completedSplits":0,"cpuTimeMillis":0,"wallTimeMillis":0,"queuedTimeMillis":0,"elapsedTimeMillis":0,"processedRows":0,"processedBytes":0,"physicalInputBytes":0,"physicalWrittenBytes":0,"peakMemoryBytes":0,"spilledBytes":0},"warnings":[]}

$ curl http://127.0.0.1:8080/some-prefix/v1/statement/queued/20241030_153616_00003_z5drh/yb1171cfcc1af0a43017540da914037c349f8753c/1
Error 404 Not Found: HTTP 404 Not Found%

@shk3
Author

shk3 commented Oct 30, 2024 via email

@wendigo
Contributor

wendigo commented Oct 30, 2024

@oneonestar yeah, we plan to add support for it to the client as well, but for now the server-to-server should work just fine

@shk3
Author

shk3 commented Oct 31, 2024

One interesting thing is the url with prefix won't work. Coordinator will return 404 for that.

@oneonestar Sorry, I missed this part.

What I had in mind is that we could use the X-Forwarded-* headers to point next-uri / info-uri at the configured external URL of each backend, so that follow-up requests no longer go through Trino Gateway. Say backend1.somehost.com/some-prefix publicly exposes the Trino backend1 coordinator through nginx or some other gateway. In Trino Gateway, we can set these headers so that the returned next-uri / info-uri point directly to the external URL, something like http://backend1.somehost.com/some-prefix/v1/statement/queued/20241030_153616_00003_z5drh/yb1171cfcc1af0a43017540da914037c349f8753c/1.
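
For illustration only, the resulting URI shape can be sketched as a plain string transformation. This is the response-rewriting variant rather than the header-based one, and the class `NextUriRewriter` and its `rewrite` method are hypothetical names, not an existing Trino or Trino Gateway API.

```java
import java.net.URI;

// Hypothetical helper: re-bases a coordinator-issued URI onto a backend's external URL.
final class NextUriRewriter {
    /**
     * Keeps the path and query of the original URI, but replaces scheme and host
     * with those of externalBase and prepends externalBase's path as a prefix.
     */
    static String rewrite(String originalUri, String externalBase) {
        URI original = URI.create(originalUri);
        URI base = URI.create(externalBase);
        String prefix = base.getRawPath() == null ? "" : base.getRawPath();
        if (prefix.endsWith("/")) {
            prefix = prefix.substring(0, prefix.length() - 1); // avoid a double slash
        }
        StringBuilder result = new StringBuilder()
                .append(base.getScheme()).append("://")
                .append(base.getRawAuthority())
                .append(prefix)
                .append(original.getRawPath());
        if (original.getRawQuery() != null) {
            result.append('?').append(original.getRawQuery());
        }
        return result.toString();
    }
}
```

With the X-Forwarded-* approach the coordinator would produce such a URI itself, so a helper like this would only be needed if the gateway rewrote response bodies instead.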

Alternatively, if we want next-uri / info-uri to carry some-prefix on the same host as Trino Gateway, we would need proxy rules that route requests to the proper cluster based on the prefix. When Trino Gateway sees the prefix, it knows which backend the request needs to go to, and the prefix is stripped before forwarding, so the URL the Trino coordinator receives no longer contains it.

@shk3
Author

shk3 commented Nov 25, 2024 via email

@wendigo
Contributor

wendigo commented Nov 25, 2024

This is possible to achieve using the X-Forwarded-Prefix approach.

6 participants