Atlas Api performs worse than Picture Api #2688
Comments
I'm also considering using Atlas, and while I don't think it makes as much sense for my use case, I'm 90% sure it would for yours. I'm pretty new to Skia and to having to care about the rendering pipeline this much, so I'm using this as an opportunity to put my current understanding into a concrete rationale that someone else has an interest in validating. Please poke holes if any part conflicts with what you're seeing in practice; I didn't validate this empirically beyond a quick rewrite of your Atlas code to better leverage Skia.

Note: I also added a variable speed per circle because the lack of variation bugged me while checking frame rates 😅. Other than that, the only intentional change was fixing the circle position to be centered (cx/cy is the center, not the top left (x/y)), which is why your atlas circles were only quarters of circles. And, err, making the circles red on white, which was easier to see during the day.

Result
Here's the result of my rewrite (on emulators).
Gist with entire code: https://gist.github.com/awhiteside1/a734d3135e5b75be2376123d20151372
Recording: Untitled.mov
Excalidraws: https://link.excalidraw.com/readonly/s69kVVZhFgZt0zPHAIKo?darkMode=true

The important change
// This is the part that matters, since it runs per frame.
// Instead of creating a new picture of per-circle drawing commands, we update a buffer of small transform matrices, one per circle.
// Atlas can then move these matrices over to the GPU (as opposed to all the drawing commands), which only has to apply each one to the image already in GPU memory.
// This can in theory happen in parallel on the GPU, but I don't know whether Skia actually does that.
const transforms = useRSXformBuffer(CIRCLE_COUNT, (val, i) => {
  'worklet'
  const point = startingPositions.value[i]
  if (!point) return
  // This uses the clock as a deterministic way of getting position, including looping when overflowing the canvas height.
  const dy = clock.value * point.speed
  const y = (point.y + dy) % canvasSize.value.height
  // Scale 1, rotate 0, position using original X and new y
  const form = Skia.RSXform(1, 0, point.x, y)
  // Set the matrix's value (which mutates the buffer)
  val.set(form.scos, form.ssin, form.tx, form.ty)
})

Conceptual Explanation

Using Pictures
Conceptually speaking, the big issue with using Pictures for this use case is that on every frame, Skia generates new positions for each circle, O(n), then generates the drawing commands for every circle while creating the picture, another O(n), then has to transfer the entire picture over to the GPU, which then has to actually create the image. (Or worse, it uses the CPU to draw the image; I'm not 100% sure on this.) This is really only viable in your case because your element is a circle. If it were a more complex path or a bitmap, this would render at a crawl.

Using Atlas (properly)
When using Atlas, you can perform the rasterization once at the start, and Atlas moves the result to the GPU, where it is reused. Then, instead of computing a new set of drawing commands each frame (each with 1000 circles), you create a transform matrix per circle. By using the hooks provided by Skia, you are in fact mutating the same buffer each frame, so you benefit from effectively no data copying at all per frame. The real boost, though, happens when Atlas generates the image. Because it has the matrices in a form usable by the GPU and the image texture is already on the GPU, it can apply each matrix in parallel and merge the results into a final image.

Issue with your original Atlas approach
Because you were giving Atlas a new transform array every frame (instead of using the hook to mutate the same one), and because you were iterating over the array multiple times and storing a new mutated set of positions, you were not fully benefiting from Atlas. You were more or less doing the same thing as the picture approach, but with the added overhead of new atlas buffers every frame.

cc @wcandillon in case there are some easy-to-point-out flaws in the above or anything significant I missed.
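
To make the "mutate one buffer per frame" point concrete, here is a minimal sketch in plain TypeScript, without any Skia imports. It is a model of the idea, not react-native-skia's implementation: the Circle type, wrappedY, fillTransforms, the speed unit, and the flat Float32Array layout of four values per circle (scos, ssin, tx, ty) are all assumptions made for illustration.

```typescript
// Plain-TypeScript model of the per-frame update; nothing here is Skia API.
// Circle, wrappedY and fillTransforms are hypothetical names.

interface Circle {
  x: number; // starting x position
  y: number; // starting y position
  speed: number; // pixels per millisecond (assumed unit)
}

// Deterministic position derived from the clock, wrapping at the canvas height.
function wrappedY(circle: Circle, clockMs: number, canvasHeight: number): number {
  return (circle.y + clockMs * circle.speed) % canvasHeight;
}

// Writes one transform (scos, ssin, tx, ty) per circle into an existing buffer.
// Scale 1 and rotation 0 give scos = 1 and ssin = 0.
function fillTransforms(
  out: Float32Array,
  circles: Circle[],
  clockMs: number,
  canvasHeight: number
): void {
  circles.forEach((circle, i) => {
    const offset = i * 4;
    out[offset] = 1;
    out[offset + 1] = 0;
    out[offset + 2] = circle.x;
    out[offset + 3] = wrappedY(circle, clockMs, canvasHeight);
  });
}

const circles: Circle[] = [
  { x: 10, y: 0, speed: 0.5 },
  { x: 20, y: 90, speed: 0.25 },
];

// Allocated once, then overwritten on every frame.
const buffer = new Float32Array(circles.length * 4);

fillTransforms(buffer, circles, 100, 100); // frame at t = 100 ms
fillTransforms(buffer, circles, 200, 100); // frame at t = 200 ms, same buffer
```

The per-frame work is still O(n), but it is n small writes into memory that already exists, rather than n new objects plus a freshly recorded set of drawing commands.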
Hey @awhiteside1
Thank you for taking a look at this. I was hopeful that using the I changed this part:
const transforms = useDerivedValue(() => {
  return circles.value.map(circle => Skia.RSXform(1, 0, circle.x, circle.y));
}, []);
to this:
const transforms = useRSXformBuffer(CIRCLE_COUNT, (val, i) => {
  'worklet';
  const point = circles.value[i];
  if (!point) return;
  const form = Skia.RSXform(1, 0, point.x, point.y);
  val.set(form.scos, form.ssin, form.tx, form.ty);
});
but unfortunately the result was even worse. Here is a table of every approach and its recording. Note that I bumped the circle count from 1000 to 10000, since on the emulator 60fps is achievable with all approaches at 1000 circles:
Description
Hi, first of all thanks for the great library!!
The issue is that Atlas is said to be more efficient at rendering the same instance multiple times, but unfortunately that is not the case for me.
For example, I tried creating 1000 circles that are falling down, so it is the same image with different transformations, and the Picture API, while it still has poor performance, was better than Atlas.
Note that I am testing on an Android device, a Samsung Galaxy A34.
So here is an example using Picture:
And here is the Atlas version:
Screen_Recording_20241013_194143_PictureVsAtlas.mp4
Screen_Recording_20241013_194351_PictureVsAtlas.mp4
I'm also not sure why the output looks different. Are the Atlas circles smaller? Anyway, performance is the bug at the moment.
Version
1.4.2
Steps to reproduce
Clone the repo provided and check out the main branch, which uses the Picture API, and the atlas branch, which uses the Atlas API, on an Android device, perhaps a low-end one if possible, e.g. Samsung A34.
Snack, code example, screenshot, or link to a repository
Picture
Atlas