Performance¶
Software performance depends on many choices: language (like Rust versus Python), framework (like FastAPI versus Django), architecture (e.g. map-reduce), networking (e.g. batch requests), etc. Many choices are costly to change at a later date (e.g. full rewrite).
Profiling¶
Use profiling to:
Identify slow dependencies, in case faster alternatives can be easily swapped in
Find major hotspots, like a loop that runs in exponential time instead of quadratic time
Find minor hotspots, if changing language, etc. is too costly
Once a hotspot is found, the solution might be to:
Call it once, via refactoring: for example, traversing JSON once for all CoVE calculations, instead of once for each calculation, in lib-cove
Call it less, via batching: for example, reducing the number of SQL queries in Django projects
Cache the results: for example, caching mergers in Kingfisher Process
Process in parallel: for example, distributing work to multiple workers, like we do with RabbitMQ
Replace it entirely: for example, using the orjson package instead of the json library
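As a minimal sketch of the "cache the results" option, the standard library's functools.lru_cache can memoize a pure function. The function slow_square and the counter call_count are hypothetical stand-ins for a real hotspot, not code from any of the projects named above.

```python
from functools import lru_cache

call_count = 0  # counts how often the expensive body actually runs


@lru_cache(maxsize=None)
def slow_square(n: int) -> int:
    """Hypothetical stand-in for an expensive, pure computation."""
    global call_count
    call_count += 1
    return sum(n for _ in range(n))  # deliberately wasteful way to compute n * n


# The hotspot is called many times, but with only two distinct arguments.
results = [slow_square(n) for n in [1000, 2000, 1000, 2000, 1000]]

print(call_count)                # the body ran once per distinct argument
print(slow_square.cache_info())  # hits and misses recorded by the cache
```

Caching only helps if the function is called repeatedly with the same arguments and its result depends on nothing else. An unbounded cache (maxsize=None) also trades memory for speed.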
CPU¶
cProfile is a deterministic profiler, measures functions, and lacks support for threads. For example:
cat packages.json | python -m cProfile -o code.prof ocdskit/__main__.py compile > /dev/null
gprof2dot -f pstats code.prof | dot -Tpng -o output.png
open output.png
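cProfile can also be driven from Python, which may be convenient when only one call needs profiling. The sketch below assumes a hypothetical hotspot named quadratic, and prints the five entries with the highest cumulative time using the standard pstats module.

```python
import cProfile
import io
import pstats


def quadratic(n: int) -> int:
    """Hypothetical hotspot: a nested loop that runs in quadratic time."""
    total = 0
    for i in range(n):
        for j in range(n):
            total += i * j
    return total


def main() -> int:
    return quadratic(200)


# Profile a single call, instead of the whole program.
profiler = cProfile.Profile()
profiler.runcall(main)

# Write the report to a string, sorted by cumulative time, limited to 5 entries.
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)
stats.sort_stats(pstats.SortKey.CUMULATIVE).print_stats(5)
report = stream.getvalue()
print(report)
```

Calling stats.dump_stats("code.prof") would write a file in the same format that gprof2dot reads with -f pstats.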
py-spy is a statistical profiler, measures lines, and supports threads, subprocesses and native extensions. The top command can attach to a running process.
pprofile is a statistical profiler (and a very slow deterministic profiler), measures lines, and supports threads and PyPy.
vmprof is a statistical profiler, measures functions or lines, and supports threads and PyPy (and is aware of JIT).
timeit is a deterministic profiler for code snippets.
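For instance, timeit can time a single candidate snippet before and after a change. The sketch below times json.dumps on made-up data; the document variable and its contents are assumptions for illustration only.

```python
import json
import timeit

# Made-up data, standing in for a real payload.
document = {"releases": [{"id": i, "tag": ["tender"]} for i in range(1000)]}

# Run the snippet 100 times per measurement, and repeat the measurement 5 times.
timings = timeit.repeat(lambda: json.dumps(document), number=100, repeat=5)

# The minimum is usually the most stable figure; higher values reflect interference.
best = min(timings) / 100
print(f"json.dumps: {best * 1e6:.1f} microseconds per call")
```

The same pattern can time an alternative implementation on the same data, so that the two figures are comparable.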
Other profilers
yappi is a deterministic profiler, measures functions, supports threads and async, and has wall and CPU clocks.
pyinstrument is a statistical profiler, measures functions, and supports async but not threads.
line-profiler is a deterministic profiler, measures lines, requires decorators, and lacks support for threads.
Memory¶
Tip
When profiling a Django project, ensure DEBUG = False: for example, by running env DJANGO_ENV=production.
Memory profilers have two use cases: reducing memory consumption (like in data processing) and fixing memory leaks (like in long-running processes). Tools for reducing memory consumption typically measure peaks and draw flamegraphs; that said, they can also be used to find memory leaks, by generating work that leaks memory.
When evaluating memory usage in production, remember the differences between heap memory and resident memory. In particular, resident memory is not freed immediately.
memray to diagnose peak memory, using attach for a running process, including live reporting
filprofiler to diagnose peak memory
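To illustrate what "peak memory" means, the sketch below uses the standard library's tracemalloc module as a stand-in; it is not memray or filprofiler, and it only sees allocations made through Python's allocator. The functions build_list, stream_items and peak_bytes are hypothetical.

```python
import tracemalloc


def build_list(n: int) -> int:
    """Hypothetical workload that holds n integers in memory at once."""
    items = [i * 2 for i in range(n)]
    return sum(items)


def stream_items(n: int) -> int:
    """Same result, but a generator avoids materializing the list."""
    return sum(i * 2 for i in range(n))


def peak_bytes(func, *args) -> int:
    """Return the peak traced memory, in bytes, while func runs."""
    tracemalloc.start()
    try:
        func(*args)
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return peak


list_peak = peak_bytes(build_list, 100_000)
stream_peak = peak_bytes(stream_items, 100_000)
print(f"list: {list_peak} bytes, generator: {stream_peak} bytes")
```

Both functions return the same value, but the list version should report a much higher peak, which is the kind of difference a peak-memory profiler is meant to surface.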
Optimizations¶
Set __slots__ on classes or slots=True on dataclasses that are instantiated frequently.
“The space saved over using __dict__ can be significant. Attribute lookup speed can be significantly improved as well.”