Clarification on timeline semaphore signals chaining availability from previous signals #2481

Open
glebov-andrey opened this issue Jan 22, 2025 · 0 comments


Hi,

While trying to optimize some pipeline stage masks and avoid ALL_COMMANDS where possible, I've run into a question about the exact synchronization scopes a timeline semaphore provides. This triggered a synchronization validation error, but I'm not sure whether I misunderstand the wording or the validation error is a false positive.

We have two queues of different families - call them GRAPHICS and COMPUTE.
The COMPUTE queue has a timeline semaphore associated with its submissions, and the GRAPHICS queue sometimes needs to wait on it.
Here's the order of operations:

  • COMPUTE queue
    • submit(signal: value = 1, stages = ALL_COMMANDS) (ALL_COMMANDS because of the queue family ownership transfer, QFOT)
      • vkCmdDispatch() (writes an image before QFOT)
      • queue family release barrier COMPUTE_SHADER, WRITE => NONE, NONE
    • submit(signal: value = 2, stages = COMPUTE_SHADER)
      • vkCmdDispatch() (writes another resource which doesn't require QFOT)
  • GRAPHICS queue
    • submit(wait: value = 2, stages = ALL_COMMANDS)
      • queue family acquire barrier NONE, NONE => FRAGMENT_SHADER, READ (ERROR HERE)
      • use the results of both dispatches from the COMPUTE queue

The error is SYNC-HAZARD-WRITE-AFTER-WRITE (submitted_usage: SYNC_IMAGE_LAYOUT_TRANSITION in GRAPHICS queue, prior_usage: SYNC_IMAGE_LAYOUT_TRANSITION in COMPUTE queue) for the image which had the QFOT.

TL;DR: Should waiting for value = 2 be enough to satisfy the availability of the queue family release operation, or do I have to wait for both value = 1 and value = 2 in the same batch?

IMO there are actually a few potential ambiguities around signaling a semaphore:

First, the spec seems to say that the memory dependency only relates to the batch:

"When a batch is submitted to a queue via a queue submission, and it includes semaphores to be signaled, it defines a memory dependency on the batch, and defines semaphore signal operations which set the semaphores to the signaled state."

But then it goes on to define the first synchronization scope as including more than just the batch:

"Semaphore signal operations that are defined by vkQueueSubmit or vkQueueSubmit2 additionally include all commands that occur earlier in submission order."

Second, as in #1175, it is also ambiguous which set of operations stageMask applies to: only the batch, or also everything earlier in submission order.

The third and most important question is about this sentence:

"Semaphore signal operations that are defined by vkQueueSubmit or vkQueueSubmit2 or vkQueueBindSparse additionally include in the first synchronization scope any semaphore and fence signal operations that occur earlier in signal operation order."

If I understand the "Execution and Memory Dependencies" chapter correctly, then doesn't this imply that the signal with value = 2 happens-after the signal with value = 1, which in turn happens-after the availability operation of the queue family release barrier?

If this isn't the case, and the validation error is correct, then it calls into question the usability of any stageMask other than ALL_COMMANDS when signaling a semaphore (at least when we want to wait for everything that happened before a particular value was signaled).
This becomes more problematic if the host calls vkGetSemaphoreCounterValue() or vkWaitSemaphores(). Assuming these functions' synchronization scopes are defined reasonably (see #2463), it would mean that unless the host observes (or waits for) every single signaled value one by one, it can't submit work which requires a memory dependency on everything which happened-before each previous signal.
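
To make the third question concrete, the two readings can be modelled with a toy sketch in plain C. This is not Vulkan API code and not the spec's definition. The `Batch` type, the `wait_covers` helper and the `STAGE_*` bits (including a made-up `STAGE_QFOT_RELEASE` bit standing in for the release barrier's operation) are hypothetical names introduced only for illustration. Under the "chained" reading, a signal also carries whatever earlier signals on the same queue already covered. Under the "unchained" reading, only the waited signal's own stageMask counts.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stage bits for this model only; not the Vulkan enums. */
#define STAGE_COMPUTE      0x1u
#define STAGE_QFOT_RELEASE 0x2u       /* stand-in for the release barrier's operation */
#define STAGE_ALL          UINT32_MAX /* stand-in for ALL_COMMANDS */

/* One batch on the COMPUTE queue, listed in submission order. */
typedef struct {
    uint64_t value;       /* timeline value signaled by this batch */
    uint32_t signal_mask; /* stageMask of the signal operation */
} Batch;

/* The two submissions from the issue. */
static const Batch compute_queue[2] = {
    { 1, STAGE_ALL },     /* dispatch + queue family release barrier */
    { 2, STAGE_COMPUTE }, /* dispatch without QFOT */
};

/*
 * Does a wait on batch `waited`'s signal depend on work of `stages`
 * recorded in batch `target`?
 *   chain == true : any signal from `target` up to `waited` may cover the
 *                   work (earlier signals are carried along).
 *   chain == false: only the waited signal's own mask is considered.
 */
static bool wait_covers(const Batch *b, size_t n, size_t waited,
                        size_t target, uint32_t stages, bool chain)
{
    assert(waited < n && target < n);
    if (target > waited)
        return false; /* work submitted after the waited signal */

    size_t first = chain ? target : waited;
    for (size_t j = first; j <= waited; ++j) {
        if ((b[j].signal_mask & stages) == stages)
            return true;
    }
    return false;
}
```

With this data, `wait_covers(compute_queue, 2, 1, 0, STAGE_QFOT_RELEASE, true)` is true while the same call with `chain = false` is false: a wait on value = 2 reaches the release barrier only under the chained reading. Which of the two readings the spec intends is exactly what this issue asks.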
