Back to all reviewers

Cross-platform algorithm optimization

deeplearning4j/deeplearning4j
Based on 4 comments
Other

When implementing algorithms that need to execute efficiently across different platforms, consider both compile-time and runtime optimizations: 1. For template-heavy code, use explicit instantiations to improve compilation speed and avoid compiler limitations on specific platforms:

Algorithms Other

Reviewer Prompt

When implementing algorithms that need to execute efficiently across different platforms, consider both compile-time and runtime optimizations:

  1. For template-heavy code, use explicit instantiations to improve compilation speed and avoid compiler limitations on specific platforms:
    // Use CMake to generate explicit template instantiations 
    // in separate compilation units
    #cmakedefine LIBND4J_TYPE_GEN 
       
    #include <ops/declarable/helpers/cpu/summaryReductions.hpp>
    
  2. Use platform-agnostic type definitions to ensure consistent behavior, particularly when interfacing between languages like C++ and Java:
    // Prefer explicit sized types rather than platform-dependent types
    typedef int Nd4jInt;  // For 32-bit integers
    typedef long long Nd4jLong;  // For 64-bit integers
    
  3. Avoid calling member functions from constructors when inlining might be deactivated:
    // Instead of: 
    // if (rootSeed == 0)
    //     rootSeed = currentMilliseconds();
       
    // Directly use the implementation:
    if (rootSeed == 0){
        auto s = std::chrono::system_clock::now().time_since_epoch();
        rootSeed = std::chrono::duration_cast<std::chrono::milliseconds>(s).count();
    }
    
  4. When working with hardware acceleration libraries like CUDA, create wrapper classes that handle version differences and provide consistent error handling:
    template<typename Op, typename ...Args>
    FORCEINLINE void callCudnnIfNoErr(cudnnStatus_t &err, Op op, Args&&... args) {
        if(err==CUDNN_STATUS_SUCCESS) {
            err = op(std::forward<Args>(args)...);
            if(err) {
                nd4j_printf("Cudnn error code %s\n", cudnnGetErrorString(err));
            }
        }
    }
    

Always document platform-specific considerations directly in the code to help future developers understand your optimization decisions.

4
Comments Analyzed
Other
Primary Language
Algorithms
Category

Source Discussions