Ensure layer math, parameter semantics, and model conversion/backends remain consistent so inference results don’t silently change.
Apply these rules:
Example pattern for fast-path gating:
int create_pipeline(const Option& opt) {
if (opt.fast_gelu == 0) {
support_packing = false; // disable incomplete fast/packing
}
return 0;
}
int forward_inplace(Mat& x, const Option& opt) const {
if (opt.fast_gelu == 0) {
// correct baseline implementation
return GELU::forward(x, opt);
}
// otherwise run optimized path
// ...
return 0;
}
These rules prevent silent output drift across AI inference workloads and keep model conversion reliable.
Enter the URL of a public GitHub repository