After adding ConvNeXt as a use case, it is clear that the current workflow for adding new modules is not elegant.
Here are the current pain points:
- scnn only deals with batch normalization, not with any other normalization layer. Layer normalization is also difficult because it cannot be handled the same way as BN: setting the weights to 1 and the bias to 0 normalizes the input to exactly 0, which breaks the tile statistics calculations (see the sketch after this list).
- ConvNeXt uses layer scale, so the tile calculations will again break when its parameters are not set to 1.
- Linear layers can be used if they are modelled as 1x1 convolutions. This works, but it might be good to emit a warning (see the conversion sketch after this list).
- scnn only handles ReLU activations, which conveniently break at 0 and then proceed linearly; newer activation functions might break the tile calculations.
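A minimal sketch of the LayerNorm failure mode, assuming the tile-statistics pass produces constant activations (as the description above implies):

```python
import torch
import torch.nn as nn

# BN in eval mode uses running statistics, so it acts as a fixed affine
# map and can be neutralized by setting weight=1, bias=0.
bn = nn.BatchNorm1d(8).eval()
x = torch.full((2, 8), 3.0)   # constant activations
print(bn(x)[0, 0].item())     # ~3.0: the signal survives

# LayerNorm normalizes each sample at runtime: a constant input has
# zero variance, so the output is exactly 0 and the signal is destroyed.
ln = nn.LayerNorm(8)
with torch.no_grad():
    ln.weight.fill_(1.0)
    ln.bias.fill_(0.0)
print(ln(x)[0, 0].item())     # 0.0: tile statistics break
```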
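The linear-to-conv conversion is a standard weight reshape; `linear_to_conv1x1` below is a hypothetical helper (not scnn's actual API) that also emits the suggested warning:

```python
import warnings

import torch
import torch.nn as nn

def linear_to_conv1x1(linear: nn.Linear) -> nn.Conv2d:
    """Hypothetical helper: re-express nn.Linear as an equivalent 1x1 conv."""
    warnings.warn("nn.Linear re-expressed as a 1x1 nn.Conv2d for streaming")
    conv = nn.Conv2d(linear.in_features, linear.out_features,
                     kernel_size=1, bias=linear.bias is not None)
    with torch.no_grad():
        # A linear layer's (out, in) weight is a 1x1 conv's (out, in, 1, 1) kernel.
        conv.weight.copy_(linear.weight.view(linear.out_features,
                                             linear.in_features, 1, 1))
        if linear.bias is not None:
            conv.bias.copy_(linear.bias)
    return conv

# Sanity check: both give the same result at every spatial position.
lin = nn.Linear(64, 32)
conv = linear_to_conv1x1(lin)
x = torch.randn(1, 64, 7, 7)
print(torch.allclose(conv(x),
                     lin(x.permute(0, 2, 3, 1)).permute(0, 3, 1, 2),
                     atol=1e-6))  # True
```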
Proposed solution:
It is easier to set most of the problematic modules to nn.Identity() and restore them later. Since most layers do not contribute to the lost-tile calculations, this will simplify a lot:

- For normalization layers this is easier, since no weights/biases need to be handled.
- Arbitrary activation functions will not influence the tile statistics calculations.

In essence, during streaming initialization only conv, local max pooling, and local average pooling layers are left. This way the tile calculations are as basic as they can be, and the modules can be reconstructed afterwards; a swap/restore sketch follows below.
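A sketch of the swap/restore idea, assuming hypothetical helpers (`swap_for_identity`, `restore_modules`, and the `SWAP_TYPES` list are illustrative, not scnn's actual API):

```python
import torch.nn as nn

# Module types to park as nn.Identity() during streaming initialization.
SWAP_TYPES = (nn.BatchNorm2d, nn.LayerNorm, nn.ReLU, nn.GELU)

def swap_for_identity(model: nn.Module) -> dict:
    """Replace problematic leaf modules with nn.Identity(), remembering
    the originals so they can be restored after tile statistics are
    computed; only convs and local poolings are left in place."""
    saved = {}
    for parent_name, parent in model.named_modules():
        for child_name, child in parent.named_children():
            if isinstance(child, SWAP_TYPES):
                key = f"{parent_name}.{child_name}" if parent_name else child_name
                saved[key] = child
                setattr(parent, child_name, nn.Identity())
    return saved

def restore_modules(model: nn.Module, saved: dict) -> None:
    """Put the original modules back after streaming initialization."""
    for key, original in saved.items():
        parent_name, _, child_name = key.rpartition(".")
        parent = model.get_submodule(parent_name) if parent_name else model
        setattr(parent, child_name, original)

# Usage sketch:
# saved = swap_for_identity(net)
# ...compute tile statistics with only conv/pool layers active...
# restore_modules(net, saved)
```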