Looking at Mamba's code, the inference path that takes the full sequence as input (as opposed to the step-by-step path, where only the latest token is fed in) appears to be identical to the training-time forward pass, as discussed in the issue linked below:
#187
Does this mean that constant-time inference is only achievable with the step-by-step method?
Thanks in advance!
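To make the question concrete, here is a toy sketch of the two paths for a scalar linear recurrence. It is not Mamba's actual code: the functions `step` and `forward_full` and the constants `a`, `b`, `c` are hypothetical names chosen for illustration. It only shows that a full-sequence pass does work proportional to the sequence length, while a single step on a cached state does a fixed amount of work and yields the same last output.

```python
# Toy scalar state-space recurrence (illustrative only, NOT Mamba's code):
#   h_t = a * h_{t-1} + b * x_t
#   y_t = c * h_t

def step(h, x, a=0.9, b=1.0, c=0.5):
    """Advance by one token. Work is constant given the cached state h."""
    h_new = a * h + b * x
    return h_new, c * h_new

def forward_full(xs, a=0.9, b=1.0, c=0.5):
    """Process the whole sequence from a zero state. Work grows with len(xs)."""
    h, ys = 0.0, []
    for x in xs:
        h, y = step(h, x, a, b, c)
        ys.append(y)
    return ys, h

prompt = [1.0, 2.0, 3.0]
new_token = 4.0

# Path 1: process the prompt once, keep the final state, then take one step.
_, h_cache = forward_full(prompt)
_, y_step = step(h_cache, new_token)

# Path 2: re-run the full sequence including the new token.
ys_all, _ = forward_full(prompt + [new_token])

# Both paths agree on the output for the new token.
print(abs(y_step - ys_all[-1]) < 1e-12)
```

In this toy setting, only Path 1 has a per-token cost that is independent of how many tokens came before. Whether Mamba's full-sequence path behaves the same way is exactly what is being asked here.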