Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Experimental interface for torch ops
See SHI-Labs#184 Only supports forward pass for now, due to current limitations of registering custom ops with torch compared to autograd functions. Some of those limitations are: * No stable interface for supporting autocasting to fp16/bf16, * Gradient scaling doesn't seem to be supported either, leading to training instability. * Ops cannot indicate that they expect contiguous operands, and need to call `.contiguous()` within, and this incurs additional tensor copy costs, and brings down throughput (in some cases it's hard to even tell the difference between compiled and eager.)
- Loading branch information