Hey authors, thank you for this insanely insightful paper. I read through the entire paper and have a few questions that I hope you can explain or address in simple terms:
How does the gradient-based scoring function achieve O(M) complexity, compared to SGP's O(3M)?
For the bootstrap correction, are the additional L steps performed after the global weights of the meta-model are updated in the outer loop from the K inner-loop steps, or are they performed within the inner loop, right after the K steps? (The sketch below shows the two orderings I have in mind.)
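To make this concrete, here is a minimal runnable sketch of the two orderings. Everything in it is my own placeholder (toy scalar updates on a quadratic, the names `inner_step`/`outer_update`, the learning rates); only the control flow, i.e. where the L steps sit relative to the outer update, is the point of the question.

```python
# Minimal runnable sketch of the two orderings; the toy scalar updates and
# all names here (inner_step, outer_update, theta, phi) are my own
# placeholders, not identifiers from the paper or its code.

def inner_step(phi, theta, lr=0.1):
    # toy inner-loop update: pull the task weights phi toward theta
    return phi - lr * 2.0 * (phi - theta)

def outer_update(theta, phi, lr=0.01):
    # toy outer-loop update: pull the meta-weights theta toward phi
    return theta - lr * 2.0 * (theta - phi)

def ordering_a(theta, phi, K, L):
    """L additional steps AFTER the outer-loop meta-update."""
    for _ in range(K):
        phi = inner_step(phi, theta)    # K inner-loop steps
    theta = outer_update(theta, phi)    # outer loop updates meta-weights first
    for _ in range(L):
        phi = inner_step(phi, theta)    # then the L additional steps
    return theta, phi

def ordering_b(theta, phi, K, L):
    """L additional steps WITHIN the inner loop, before the outer update."""
    for _ in range(K):
        phi = inner_step(phi, theta)    # K inner-loop steps
    for _ in range(L):
        phi = inner_step(phi, theta)    # L additional steps, still inner loop
    theta = outer_update(theta, phi)    # outer update sees the (K+L)-step result
    return theta, phi

print(ordering_a(theta=1.0, phi=0.0, K=3, L=2))
print(ordering_b(theta=1.0, phi=0.0, K=3, L=2))
```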
Backpropagation is obviously still performed to adapt through the K new steps. How exactly does this save memory? And by introducing L more computations, doesn't that increase the overall computation time?
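For the memory part, here is my current guess, sketched in PyTorch-style code: the K inner steps are tracked for backpropagation, while the L additional steps are run detached (no computation graph), so they add compute but not backprop memory. Is that roughly what happens? The loss, update rule, and no-grad placement below are all my assumptions, not taken from the paper.

```python
import torch

# My guess at why the L steps might not add backprop memory (an assumption,
# not the paper's stated method): the K inner steps keep a computation
# graph, while the L additional steps run under torch.no_grad() and only
# produce a detached target.

phi = torch.nn.Parameter(torch.zeros(4))  # stand-in for the meta-weights
target = torch.ones(4)                    # toy task data
K, L, lr = 3, 2, 0.1

def loss_fn(p):
    return ((p - target) ** 2).sum()      # toy quadratic inner loss

# K differentiable inner steps: each step's graph is kept in memory.
adapted = phi
for _ in range(K):
    g = torch.autograd.grad(loss_fn(adapted), adapted, create_graph=True)[0]
    adapted = adapted - lr * g

# L additional steps with no graph: extra compute, constant memory.
with torch.no_grad():
    boot = adapted.detach().clone()
    for _ in range(L):
        boot = boot - lr * 2.0 * (boot - target)  # closed-form grad of loss_fn

# Meta-loss matches the K-step result to the detached (K+L)-step target,
# so gradients flow only through the K tracked steps.
meta_loss = ((adapted - boot) ** 2).sum()
meta_loss.backward()
print(phi.grad)
```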
Thank you for your time!