Back to all reviewers

Optimize loop operations

tensorflow/tensorflow
Based on 5 comments
Other

Minimize expensive operations within inner loops to improve performance. Key practices include: 1. Extract loop-invariant calculations outside of loops

Performance Optimization Other

Reviewer Prompt

Minimize expensive operations within inner loops to improve performance. Key practices include:

  1. Extract loop-invariant calculations outside of loops ```cpp // Bad: Recalculating the same value in every iteration for (size_t j = 0; j < (output.shape().Dim(dimension) * inner_dimensions_size); j++) { // Loop body }

// Good: Calculate once before the loop const size_t loop_limit = output.shape().Dim(dimension) * inner_dimensions_size; for (size_t j = 0; j < loop_limit; j++) { // Loop body }


2. Avoid expensive function calls like `Offset()` in inner loops
```cpp
// Bad: Using Offset() in inner loop
for (int filter_d1 = filter_d1_start; filter_d1 < filter_d1_end; ++filter_d1) {
  for (int filter_d2 = filter_d2_start; filter_d2 < filter_d2_end; ++filter_d2) {
    for (int filter_d3 = filter_d3_start; filter_d3 < filter_d3_end; ++filter_d3) {
      total += input_data[Offset(input_shape, batch, in_d1, in_d2, in_d3, channel)];
    }
  }
}

// Good: Calculate indices incrementally
// Initialize base_offset before the loops
for (int filter_d1 = filter_d1_start; filter_d1 < filter_d1_end; ++filter_d1) {
  int d1_offset = base_offset + stride_d1 * filter_d1;
  for (int filter_d2 = filter_d2_start; filter_d2 < filter_d2_end; ++filter_d2) {
    int d2_offset = d1_offset + stride_d2 * filter_d2;
    for (int filter_d3 = filter_d3_start; filter_d3 < filter_d3_end; ++filter_d3) {
      int index = d2_offset + stride_d3 * filter_d3;
      total += input_data[index];
    }
  }
}
  1. Use stack-based containers for small data structures: ```cpp // Bad: Using std::vector for small arrays std::vector a1_transpose_dims;

// Good: Using small vector optimization llvm::SmallVector<int64_t, 4> a1_transpose_dims; // Or absl::InlinedVector<size_t, 6> lhs_index; ```

These optimizations can significantly improve performance in computation-intensive applications by reducing memory allocation overhead and unnecessary calculations.

5
Comments Analyzed
Other
Primary Language
Performance Optimization
Category

Source Discussions