Minimize expensive operations within inner loops to improve performance. Key practices include: 1. Extract loop-invariant calculations outside of loops
Minimize expensive operations within inner loops to improve performance. Key practices include:
// Good: Calculate once before the loop const size_t loop_limit = output.shape().Dim(dimension) * inner_dimensions_size; for (size_t j = 0; j < loop_limit; j++) { // Loop body }
2. Avoid expensive function calls like `Offset()` in inner loops
```cpp
// Bad: Using Offset() in inner loop
for (int filter_d1 = filter_d1_start; filter_d1 < filter_d1_end; ++filter_d1) {
for (int filter_d2 = filter_d2_start; filter_d2 < filter_d2_end; ++filter_d2) {
for (int filter_d3 = filter_d3_start; filter_d3 < filter_d3_end; ++filter_d3) {
total += input_data[Offset(input_shape, batch, in_d1, in_d2, in_d3, channel)];
}
}
}
// Good: Calculate indices incrementally
// Initialize base_offset before the loops
for (int filter_d1 = filter_d1_start; filter_d1 < filter_d1_end; ++filter_d1) {
int d1_offset = base_offset + stride_d1 * filter_d1;
for (int filter_d2 = filter_d2_start; filter_d2 < filter_d2_end; ++filter_d2) {
int d2_offset = d1_offset + stride_d2 * filter_d2;
for (int filter_d3 = filter_d3_start; filter_d3 < filter_d3_end; ++filter_d3) {
int index = d2_offset + stride_d3 * filter_d3;
total += input_data[index];
}
}
}
// Good: Using small vector optimization llvm::SmallVector<int64_t, 4> a1_transpose_dims; // Or absl::InlinedVector<size_t, 6> lhs_index; ```
These optimizations can significantly improve performance in computation-intensive applications by reducing memory allocation overhead and unnecessary calculations.
Enter the URL of a public GitHub repository