Clarification on timeline semaphore signals chaining availability from previous signals #2481

glebov-andrey · 2025-01-22T11:30:05Z

Hi,

While trying to optimize some pipeline stage masks and avoid ALL_COMMANDS where possible, I've come across an interesting issue with the exact synchronization scopes a timeline semaphore provides. This caused a synchronization validation error, but I'm not sure whether I simply misunderstand the wording or the validation error is a false positive.

We have two queues of different families - call them GRAPHICS and COMPUTE.
The COMPUTE queue has a timeline semaphore associated with its submissions, and the GRAPHICS queue sometimes needs to wait on it.
Here's the order of operations:

COMPUTE queue
- submit(signal: value = 1, stages = ALL_COMMANDS) (ALL_COMMANDS because of the queue family ownership transfer, QFOT)
  - vkCmdDispatch() (writes an image before QFOT)
  - queue family release barrier (COMPUTE_SHADER, WRITE => NONE, NONE)
- submit(signal: value = 2, stages = COMPUTE_SHADER)
  - vkCmdDispatch() (writes another resource which doesn't require QFOT)

GRAPHICS queue
- submit(wait: value = 2, stages = ALL_COMMANDS)
  - queue family acquire barrier (NONE, NONE => FRAGMENT_SHADER, READ) --- (ERROR HERE)
  - use the results of both dispatches from the COMPUTE queue

The error is SYNC-HAZARD-WRITE-AFTER-WRITE (submitted_usage: SYNC_IMAGE_LAYOUT_TRANSITION in GRAPHICS queue, prior_usage: SYNC_IMAGE_LAYOUT_TRANSITION in COMPUTE queue) for the image which had the QFOT.

TL;DR: The question is, should waiting for value = 2 be enough to satisfy the availability of the queue family release operation, or do I have to wait for both value = 1 and value = 2 in the same batch?

IMO there are actually a few potential ambiguities around signaling a semaphore:

First, the spec. seems to say that the memory dependency only relates to the batch:

> When a batch is submitted to a queue via a queue submission, and it includes semaphores to be signaled, it defines a memory dependency on the batch, and defines semaphore signal operations which set the semaphores to the signaled state.

But then it goes on to define the first synchronization scope as including more than just the batch:

> Semaphore signal operations that are defined by vkQueueSubmit or vkQueueSubmit2 additionally include all commands that occur earlier in submission order.

Secondly, similarly to #1175 there is also ambiguity around which set of operations stageMask applies to - only the batch or also everything earlier in submission order.

The third and most important question is about this sentence:

> Semaphore signal operations that are defined by vkQueueSubmit or vkQueueSubmit2 or vkQueueBindSparse additionally include in the first synchronization scope any semaphore and fence signal operations that occur earlier in signal operation order.

If I understand the "Execution and Memory Dependencies" chapter correctly, then doesn't this imply that the signal with value = 2 happens-after the signal with value = 1, which in turn happens-after the availability operation of the queue family release barrier?
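To make the two readings concrete, here is a toy model in plain C++ (no Vulkan involved). It is only a sketch of my question, not something derived from the spec's formal model: `Signal`, `visible_after_wait` and `compute_queue_signals` are made-up names, each signal's "own scope" is simplified to the work in its own batch (ignoring stage masks and the "earlier in submission order" clause), and `chain` switches between the two interpretations.

```cpp
#include <cassert>
#include <cstdint>
#include <set>
#include <string>
#include <vector>

// Hypothetical model type: one timeline semaphore signal operation.
struct Signal {
    std::uint64_t value;              // timeline value being signaled
    std::set<std::string> own_scope;  // work this signal covers by itself (simplified)
};

// Returns the work a waiter on `wait_value` could rely on in this model.
// chain == false: only the signal with exactly that value contributes.
// chain == true:  signals earlier in signal operation order contribute as well.
std::set<std::string> visible_after_wait(const std::vector<Signal>& signals,
                                         std::uint64_t wait_value, bool chain) {
    std::set<std::string> visible;
    std::uint64_t previous = 0;  // assumes the semaphore's initial value is 0
    for (const Signal& s : signals) {
        // Signals must be listed in signal operation order.
        assert(s.value > previous && "timeline values must strictly increase");
        previous = s.value;
        const bool contributes = chain ? s.value <= wait_value
                                       : s.value == wait_value;
        if (contributes) {
            visible.insert(s.own_scope.begin(), s.own_scope.end());
        }
    }
    return visible;
}

// The COMPUTE queue from the scenario above, in this simplified model.
std::vector<Signal> compute_queue_signals() {
    return {
        {1, {"dispatch A", "QFOT release"}},
        {2, {"dispatch B"}},
    };
}
```

In this model, `visible_after_wait(compute_queue_signals(), 2, false)` contains only "dispatch B", while the same call with `chain = true` also contains "QFOT release". The validation error looks consistent with the first reading; the sentence quoted above is what makes me think the second one might be intended.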

If this isn't the case, and the validation error is correct, then this poses a question about the usability of any stageMask other than ALL_COMMANDS when signaling a semaphore (at least for the case when we are interested in waiting for anything which happened before a particular value was signaled).
This becomes more problematic if the host calls vkGetSemaphoreCounterValue() or vkWaitSemaphores(). Assuming these functions' synchronization scopes are defined reasonably (see #2463), it would mean that if the host doesn't observe (or wait for) every single signaled value one-by-one, it can't submit work which requires a memory dependency on everything which happened-before each previous signal.
