Skip to content

This issue was moved to a discussion.

You can continue the conversation there. Go to discussion →

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support for Stateful Map Vertex #2384

Closed
th0ger opened this issue Feb 7, 2025 · 2 comments
Closed

Support for Stateful Map Vertex #2384

th0ger opened this issue Feb 7, 2025 · 2 comments
Labels
enhancement New feature or request

Comments

@th0ger
Copy link
Contributor

th0ger commented Feb 7, 2025

Summary

Proposing a stateful map vertex.

Inside the map handler, the user must be able to read and write a global state for the vertex.

Use Cases

The state content is fully up to the user, but may for example hold:

  • the previous message,
  • a list of the N previous messages,
  • an aggregated metric of prior events,
  • an online ML model with updateable weights,
  • a histogram of prior events.

This proposal could also solve the GPS data smoothening raised in #2235 in cases where only "past" events are needed. (If "future" events are needed, one could use the state as a N-message ring buffer, where the incoming message is added to the buffer and the oldest buffer message is output from the map.)

Design Considerations

  • The state must persisted to be resilient to pod/pipeline restarts.
  • State writes and msg acks should somehow be covered in the same transaction. (For ex. a handler should not be able to increase a count in the state, crash on msg writeout, pod restarts, all repeating in a loop.)
  • The state should be keyed (keyed streams have one global state per key).
  • Only single-partition is required (per key).

Message from the maintainers:

If you wish to see this enhancement implemented please add a 👍 reaction to this issue! We often sort issues this way to know what to prioritize.

@th0ger th0ger added the enhancement New feature or request label Feb 7, 2025
@vigith
Copy link
Member

vigith commented Feb 10, 2025

There are two concerns I have with implementing this within Numaflow.

Choosing a Store

The current stores that come with Numaflow are optimized for data and metadata movement. It won't be able to support any types that is deviant from what we have optimized for. E.g., we will experience OOMs if the size grows or the throughput will be severely compromised, causing a lot of unwanted side effects.

On the other hand, there are lots of open-source cloud-native stores out there and they can be deployed very easily in K8s. One can choose any optimal store of any API style and configure it specifically for their needs.

NOTE: Even in the Flink pipelines we write, we move the state out from Flink to external DBs because Flink simply cannot scale as these stores can get huge at high TPS.

Fulfilling Completeness Property

For a platform to implement the Stateful Map Vertex obeying the "completeness property" (should work in all use cases) is quite tricky.
As a platform, we will need to have concrete answers for:

  • How big should these stores be (apriori knowledge)? (we have auto-scaling mechanisms if we see backpressure for ISB, which cannot be translated for custom states).
  • What should the APIs look like? Just put, and get (KV style), or should we support pop, push (list style).
  • When to GC/delete these datasets (element based or time based), we can give APIs but then someone will have to track the names of these datasets?

@vigith
Copy link
Member

vigith commented Feb 10, 2025

I am moving this to GitHub discussion, we can convert it to an issue once we have a better picture.

@numaproj numaproj locked and limited conversation to collaborators Feb 10, 2025
@vigith vigith converted this issue into discussion #2388 Feb 10, 2025

This issue was moved to a discussion.

You can continue the conversation there. Go to discussion →

Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

2 participants