-
There are two concerns I have with implementing this within Numaflow.
Choosing a Store
The current stores that come with Numaflow are optimized for data and metadata movement. They cannot support any type of usage that deviates from what we have optimized for. For example, we would hit OOMs as the state size grows, or throughput would be severely compromised, causing a lot of unwanted side effects. On the other hand, there are many open-source, cloud-native stores available, and they can be deployed very easily in K8s. One can choose an optimal store with any API style and configure it specifically for one's needs.
NOTE: Even in the Flink pipelines we write, we move the state out of Flink to external DBs, because Flink simply cannot scale when these stores get huge at high TPS.
Fulfilling Completeness Property
For a platform, implementing a Stateful Map Vertex that obeys the "completeness property" (it should work in all use cases) is quite tricky.
-
I am moving this to a GitHub discussion; we can convert it to an issue once we have a better picture.
-
Summary
Proposing a stateful map vertex.
Inside the map handler, the user must be able to read and write a global state for the vertex.
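To make the request concrete, here is a minimal sketch of what such a handler could look like. It is not the Numaflow SDK API: the `stateful_map` signature, the `State` alias, and the idea of passing the state as the first argument are all assumptions made for illustration. A plain dict stands in for the vertex-wide state.

```python
from typing import Dict, List, Tuple

# Hypothetical state type; the proposal leaves the state content up to the user.
State = Dict[str, int]


def stateful_map(state: State, keys: List[str], value: bytes) -> Tuple[List[str], bytes]:
    """Count the messages seen per key and append the running count to the payload."""
    key = ":".join(keys)
    # Read the current count from the state, then write the updated count back.
    state[key] = state.get(key, 0) + 1
    return keys, value + b"|seen=" + str(state[key]).encode()


# The dict below stands in for state that the platform would keep across handler calls.
vertex_state: State = {}
results = [stateful_map(vertex_state, ["sensor-a"], b"ping") for _ in range(3)]
print(results[-1])
```

In this sketch the state only lives in process memory, so it says nothing about durability or about how the state would be shared across replicas of the vertex.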
Use Cases
The state content is fully up to the user, but may for example hold:
This proposal could also solve the GPS data smoothing raised in #2235 in cases where only "past" events are needed. (If "future" events are needed, one could use the state as an N-message ring buffer, where the incoming message is added to the buffer and the oldest buffered message is output from the map.)
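The ring-buffer idea can be sketched as follows. This is an illustration under assumptions, not part of the proposal: the `lookahead_map` name, the use of plain floats as messages, and the simple averaging are all hypothetical, and a `collections.deque` stands in for the vertex state. Returning `None` stands in for emitting nothing while the buffer warms up.

```python
from collections import deque
from typing import Deque, List, Optional


def lookahead_map(buffer: Deque[float], value: float, n: int) -> Optional[float]:
    """Buffer `value`; once n newer values exist, emit the oldest one smoothed over its window."""
    buffer.append(value)
    if len(buffer) <= n:
        return None  # warm-up: the oldest value does not yet have n "future" values
    window = list(buffer)  # the oldest value followed by its n successors
    buffer.popleft()       # drop the oldest so the buffer holds n values again
    return sum(window) / len(window)


# The deque stands in for state that would persist across handler calls.
state: Deque[float] = deque()
outputs: List[Optional[float]] = [lookahead_map(state, v, n=2) for v in [1.0, 2.0, 3.0, 4.0, 5.0]]
print(outputs)  # [None, None, 2.0, 3.0, 4.0]
```

The cost of this approach is latency: every output lags its input by N messages, and the last N messages stay in the buffer until more input arrives.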
Design Considerations
Message from the maintainers:
If you wish to see this enhancement implemented please add a 👍 reaction to this issue! We often sort issues this way to know what to prioritize.