feat: Add offline scaling support for background ddl with arrangement backfill #20006
base: main
3a43e0b to 3bb430b
3bb430b to 4da7f10
…in streaming processes.
…trieval Signed-off-by: Shanicky Chen <[email protected]>
…/cancel, init streaming nodes.
df36762 to 868aa26
…ports in `fragment.rs`.
src/meta/src/controller/fragment.rs
    ..
} in fragments
{
    let mut stream_node = stream_node.to_protobuf();
I think we can use `fragment_type_mask` instead to reduce memory usage.
It seems that `fragment_type_mask` cannot determine the type of stream scan; at least it couldn't in previous versions. We only have the stream node's node body to make that judgment. cc @kwannoel
We can shortcut for snapshot backfill
Snapshot backfill uses `fragment_type_mask`, iirc.
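The trade-off discussed in this thread can be sketched as follows: check the cheap bitmask first (which, per the comments above, is enough for snapshot backfill), and only walk the stream node tree when the mask cannot tell the scan type apart. All types, constants, and function names here (`ScanType`, `NodeBody`, `MASK_STREAM_SCAN`, `MASK_SNAPSHOT_BACKFILL`, `find_scan_type`) are simplified stand-ins invented for illustration; they are not RisingWave's actual definitions.

```rust
// Hypothetical, simplified types for illustration only.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum ScanType {
    ArrangementBackfill,
    SnapshotBackfill,
}

enum NodeBody {
    StreamScan(ScanType),
    Other,
}

struct StreamNode {
    body: NodeBody,
    inputs: Vec<StreamNode>,
}

// Assumed bit layout; the real mask values may differ.
const MASK_STREAM_SCAN: u32 = 1 << 0;
const MASK_SNAPSHOT_BACKFILL: u32 = 1 << 1;

struct Fragment {
    type_mask: u32,
    root: StreamNode,
}

// Depth-first visit of a stream node tree.
fn visit(node: &StreamNode, f: &mut impl FnMut(&StreamNode)) {
    f(node);
    for input in &node.inputs {
        visit(input, &mut *f);
    }
}

fn find_scan_type(fragment: &Fragment) -> Option<ScanType> {
    // Shortcut: snapshot backfill is assumed to be visible in the mask.
    if fragment.type_mask & MASK_SNAPSHOT_BACKFILL != 0 {
        return Some(ScanType::SnapshotBackfill);
    }
    // No stream scan at all: skip the tree walk entirely.
    if fragment.type_mask & MASK_STREAM_SCAN == 0 {
        return None;
    }
    // Otherwise the node body is the only place that records the scan type.
    let mut found = None;
    visit(&fragment.root, &mut |n: &StreamNode| {
        if let NodeBody::StreamScan(t) = &n.body {
            found.get_or_insert(*t);
        }
    });
    found
}
```

With this shape, the stream node only needs to be inspected for fragments whose mask says they contain a stream scan, which limits the extra work to the case the mask cannot answer.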
…Controller, Integration Tests
I found that performing offline scaling directly during the recovery process causes the background DDL job status to remain at 100%. Do I need to do anything else? cc @kwannoel
Did the jobs get created successfully? If not, is the state inside the `CreateMviewProgressTracker` up to date? Did all the new actors get included in it? You can also add some tracing logs to see which actor has not yet reported that it is finished. You can share the test, and I'll try to take a look tomorrow.
Ah, I've found the problem. If the meta undergoes recovery without a restart, the states in the progress tracker are not updated. These states should be reconstructed.
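A minimal sketch of why reconstruction matters, under the assumption that the tracker records which actors of each job still need to report completion and that offline scaling replaces the actor set. The `ProgressTracker` type and its methods below are hypothetical and much simpler than the real `CreateMviewProgressTracker`; they only illustrate one way a stale tracker could leave a job unfinished.

```rust
use std::collections::{HashMap, HashSet};

type JobId = u32;
type ActorId = u32;

// Hypothetical tracker: for each job, the actors that have not finished yet.
struct ProgressTracker {
    pending: HashMap<JobId, HashSet<ActorId>>,
}

impl ProgressTracker {
    // Rebuild the state from the current job-to-actor assignment,
    // e.g. after recovery has produced a new set of actors.
    fn rebuild(jobs: &HashMap<JobId, Vec<ActorId>>) -> Self {
        let pending: HashMap<JobId, HashSet<ActorId>> = jobs
            .iter()
            .map(|(job, actors)| (*job, actors.iter().copied().collect::<HashSet<_>>()))
            .collect();
        Self { pending }
    }

    // Record that one actor of a job has finished its backfill.
    fn report_finished(&mut self, job: JobId, actor: ActorId) {
        if let Some(actors) = self.pending.get_mut(&job) {
            actors.remove(&actor);
        }
    }

    // A tracked job is finished once no actor is pending.
    fn is_finished(&self, job: JobId) -> bool {
        self.pending.get(&job).map_or(false, |a| a.is_empty())
    }
}
```

In this model, a tracker built before scaling keeps waiting for actor IDs that no longer exist, so reports from the new actors never complete the job; calling `rebuild` with the post-recovery assignment gives a tracker that the new actors can actually drain.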
…e_manager.rs` & `recovery.rs`.
lgtm
…dd `visit_stream_node` import.
I hereby agree to the terms of the RisingWave Labs, Inc. Contributor License Agreement.
What's changed and what's your intention?
WIP STATE
It seems that No Shuffle backfill is still difficult to support in offline scaling, and this PR is still in a WIP state.
I found it somewhat challenging to directly support background jobs in online scaling, so I proposed this PR to add support for background jobs in offline scaling. At the same time, I attempted to skip jobs related to background DDL jobs in online scaling instead of completely omitting them.
This PR makes several specific changes.