
python-performance-optimization skill


This skill helps you profile and optimize Python code, applying best practices and tooling to eliminate bottlenecks and reduce memory usage.

npx playbooks add skill xfstudio/skills --skill python-performance-optimization

Review the files below or copy the command above to add this skill to your agents.

Files (2)
SKILL.md
1.3 KB
---
name: python-performance-optimization
description: Profile and optimize Python code using cProfile, memory profilers, and performance best practices. Use when debugging slow Python code, optimizing bottlenecks, or improving application performance.
---

# Python Performance Optimization

Comprehensive guide to profiling, analyzing, and optimizing Python code for better performance, including CPU profiling, memory optimization, and implementation best practices.

## Use this skill when

- Identifying performance bottlenecks in Python applications
- Reducing application latency and response times
- Optimizing CPU-intensive operations
- Reducing memory consumption and fixing memory leaks
- Improving database query performance
- Optimizing I/O operations
- Speeding up data processing pipelines
- Implementing high-performance algorithms
- Profiling production applications

## Do not use this skill when

- The task is unrelated to Python performance optimization
- You need a different domain or tool outside this scope

## Instructions

- Clarify goals, constraints, and required inputs.
- Apply relevant best practices and validate outcomes.
- Provide actionable steps and verification.
- If detailed examples are required, open `resources/implementation-playbook.md`.

## Resources

- `resources/implementation-playbook.md` for detailed patterns and examples.

Overview

This skill profiles and optimizes Python code using cProfile, memory profilers, and proven performance practices. It helps find CPU and memory bottlenecks, suggest targeted fixes, and validate improvements. The goal is measurable speedups and lower resource use while preserving correctness.

How this skill works

I start by collecting runtime profiles (cProfile for CPU, tracemalloc or memory-profiler for memory) and representative workload traces. I analyze hotspots, call graphs, and allocation patterns, then recommend focused fixes: algorithm changes, caching, concurrency, or I/O batching. Finally, I verify improvements with regression-safe benchmarks and repeat profiling to ensure gains.
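The CPU-profiling step above can be sketched with the standard library alone; the `slow_sum` function below is a made-up hotspot for illustration, not part of the skill:

```python
import cProfile
import io
import pstats

def slow_sum(n):
    # Deliberately quadratic work so a hotspot shows up in the profile
    total = 0
    for i in range(n):
        total += sum(range(i))
    return total

profiler = cProfile.Profile()
profiler.enable()
slow_sum(500)
profiler.disable()

# Print the five most expensive entries by cumulative time
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)
stats.sort_stats("cumulative").print_stats(5)
print(stream.getvalue())
```

The same report can be generated from the command line with `python -m cProfile -s cumulative script.py`, which is often enough for a first pass.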

When to use it

  • Diagnosing slow endpoints, long-running tasks, or high-latency operations
  • Reducing memory usage or tracking memory leaks in services
  • Optimizing CPU-bound data processing or numerical loops
  • Improving throughput of I/O-heavy workflows (network, disk, DB)
  • Validating performance impact of code changes before deployment

Best practices

  • Measure before changing: collect representative profiles and benchmarks
  • Focus on hot paths with high cost per request, not micro-optimizations
  • Prefer algorithmic improvements and data structure changes first
  • Use lazy evaluation, caching, and vectorized libraries (NumPy/Pandas) when appropriate
  • Avoid premature concurrency; choose async, threads, or processes based on the workload
  • Run post-change benchmarks and add targeted tests to prevent regressions
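The "measure before changing" practice can be sketched with `timeit` and `functools.lru_cache`; the naive Fibonacci function here is a toy workload chosen only to make the caching effect visible:

```python
import timeit
from functools import lru_cache

def fib(n):
    # Naive recursion: exponential number of repeated calls
    return n if n < 2 else fib(n - 1) + fib(n - 2)

@lru_cache(maxsize=None)
def fib_cached(n):
    # Same algorithm, but each argument is computed once and memoized
    return n if n < 2 else fib_cached(n - 1) + fib_cached(n - 2)

# Benchmark both versions under identical conditions
baseline = timeit.timeit(lambda: fib(20), number=100)
cached = timeit.timeit(lambda: fib_cached(20), number=100)
print(f"baseline={baseline:.4f}s cached={cached:.4f}s")
```

Recording both numbers before and after a change is what makes a speedup claim verifiable rather than anecdotal.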

Example use cases

  • Profile a web endpoint that spikes latency to pinpoint slow DB calls or serialization
  • Reduce memory growth in a long-running worker by identifying persistent object retention
  • Speed up a data pipeline by replacing Python loops with NumPy vectorized operations
  • Diagnose a CPU-bound task and decide between C extensions, PyPy, or multiprocessing
  • Optimize file I/O by batching reads/writes and using async patterns for many small requests
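For the file-I/O batching case, a minimal sketch comparing per-record writes with a single joined write (the file names and helper functions are illustrative, not prescribed by the skill):

```python
import os
import tempfile

def write_unbatched(path, lines):
    # Many small writes: one f.write call per record
    with open(path, "w") as f:
        for line in lines:
            f.write(line + "\n")

def write_batched(path, lines):
    # Build the payload in memory, then write once
    with open(path, "w") as f:
        f.write("\n".join(lines) + "\n")

records = [f"record-{i}" for i in range(1000)]
with tempfile.TemporaryDirectory() as tmp:
    a = os.path.join(tmp, "unbatched.txt")
    b = os.path.join(tmp, "batched.txt")
    write_unbatched(a, records)
    write_batched(b, records)
    # Batching must not change the output, only how it is produced
    same = open(a).read() == open(b).read()
print("outputs identical:", same)
```

Python's buffered file objects already absorb much of the per-call cost, so the win from explicit batching should itself be benchmarked on the real workload before committing to it.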

FAQ

What profilers do you use and when?

Use cProfile for CPU hotspots, tracemalloc or memory-profiler for memory allocations, and line_profiler for per-line cost. Choose tools whose overhead and fidelity match your production constraints.
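A minimal `tracemalloc` sketch matching the answer above; the list comprehension is just a stand-in allocation to give the snapshot something to report:

```python
import tracemalloc

tracemalloc.start()

# A stand-in allocation large enough to appear in the snapshot
data = [str(i) * 10 for i in range(10_000)]

# Group tracked allocations by source line and show the top entries
snapshot = tracemalloc.take_snapshot()
top_stats = snapshot.statistics("lineno")[:3]
for stat in top_stats:
    print(stat)

current, peak = tracemalloc.get_traced_memory()
tracemalloc.stop()
print(f"current={current} bytes, peak={peak} bytes")
```

Comparing two snapshots taken at different points (`snapshot2.compare_to(snapshot1, "lineno")`) is the usual way to track down steady memory growth in a long-running process.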

Will these changes affect correctness?

Recommendations prioritize correctness; I propose safe refactors and include verification steps and benchmarks. Any higher-risk changes (C extensions, concurrency) include tests and gradual rollout plans.