Choose data structures and algorithms based on their computational complexity and access patterns rather than convenience. Consider performance implications when selecting between alternatives.
Key principles:
- Match data structure to access patterns: Use np.array instead of a Python list when performing mathematical operations, since numpy can optimize operations internally without type-conversion overhead.
- Choose algorithms based on constraints: Select streaming algorithms when memory usage matters more than speed, and batch algorithms when speed is critical and memory is available (see the streaming/batch sketch after this list).
- Avoid expensive operations in hot paths: Replace costly operations like inspect.getmodule() with simpler alternatives such as direct I/O operations or cached lookups (see the caching sketch after this list).
- Make implementations algorithm-agnostic: Design interfaces that work with different underlying implementations, e.g., calling queue.clear() rather than creating a new queue instance (see the buffer sketch after this list).
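As a rough illustration of the streaming-versus-batch trade-off (not from the codebase; the function names and the float32-from-file layout are assumptions), a batch reduction loads everything before computing, while a single-pass version keeps memory bounded:

import numpy as np

def batch_mean(path):
    # Batch: load the whole dataset, then reduce -- fastest when it fits in memory.
    data = np.fromfile(path, dtype=np.float32)
    return float(data.mean())

def streaming_mean(chunks):
    # Streaming: one pass over an iterable of numpy arrays, O(1) extra memory.
    total, count = 0.0, 0
    for chunk in chunks:
        total += float(np.sum(chunk))
        count += chunk.size
    return total / count if count else float("nan")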
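For hot paths, a minimal sketch of the cached-lookup alternative, assuming Python 3.9+ (the helper name is hypothetical):

import functools

@functools.lru_cache(maxsize=None)
def module_name_for(filename):
    # Hypothetical cached lookup: compute the name once per filename instead of
    # calling an expensive reflection helper (e.g. inspect.getmodule()) on every call.
    return filename.rsplit("/", 1)[-1].removesuffix(".py")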
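And a minimal sketch of an algorithm-agnostic interface (FrameBuffer is a hypothetical name): callers ask the buffer to clear itself, so the backing container can change without touching call sites:

from collections import deque

class FrameBuffer:
    def __init__(self, maxlen=None):
        self._queue = deque(maxlen=maxlen)

    def push(self, item):
        self._queue.append(item)

    def clear(self):
        # Callers never construct a replacement deque/list themselves,
        # so the underlying container can be swapped freely.
        self._queue.clear()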
Example from the codebase:
# Before: Python list requiring conversion
self.posenet_stds = [POSENET_STD_INITIAL_VALUE] * (POSENET_STD_HIST_HALF * 2)
old_mean = np.mean(self.posenet_stds[:POSENET_STD_HIST_HALF]) # Converts list to array internally
# After: Direct numpy array for better performance
self.posenet_stds = np.array([POSENET_STD_INITIAL_VALUE] * (POSENET_STD_HIST_HALF * 2))
old_mean = np.mean(self.posenet_stds[:POSENET_STD_HIST_HALF]) # No conversion needed
When choosing between algorithmic approaches, document the trade-offs. For instance, zstd.decompress() is faster for files with size headers, while streaming decompression handles variable-size data at a slight performance cost. Choose based on your data characteristics and constraints.
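A rough sketch of the two decompression paths, assuming the python-zstandard bindings are in use (the function names here are illustrative):

import io
import zstandard

def decompress_with_size_header(payload):
    # One-shot path: works when the frame header records the decompressed size,
    # so the output buffer can be allocated up front.
    return zstandard.ZstdDecompressor().decompress(payload)

def decompress_streaming(payload, chunk_size=1 << 20):
    # Streaming path: handles frames without a size header, at the cost of
    # incremental reads and buffer growth.
    out = bytearray()
    with zstandard.ZstdDecompressor().stream_reader(io.BytesIO(payload)) as reader:
        while chunk := reader.read(chunk_size):
            out.extend(chunk)
    return bytes(out)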