A reproducible, production-minded guide to configuring Ubuntu for performance and AI/ML workloads.
This repository documents real-world system tuning decisions across CPU, GPU, memory, and Python environments — with an emphasis on correctness, stability, and maintainability.
Rather than focusing on tool installation alone, this guide treats the operating system as part of the engineering surface area.
- **CPU performance tuning**
  - Power profiles and governor behavior
  - Safe performance configuration using OS-supported mechanisms
- **NVIDIA GPU setup**
  - PRIME (Optimus) configuration
  - CUDA-ready driver usage
  - Explicit verification of GPU-backed computation
- **Python AI/ML environment**
  - Virtual environments (PEP 668–compliant)
  - CUDA-enabled PyTorch installation
  - Clean separation from system Python
- **Memory & swap optimization**
  - Swap configuration for ML workloads
  - Swappiness tuning and trade-offs
- **Verification & validation**
  - Scripts to confirm system state
  - Minimal GPU-backed computation tests
- **Cleanup & system hygiene**
  - Removal of setup-only dependencies
  - Log and package cleanup for long-term stability
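To ground the CPU-tuning topic, here is a minimal sketch of inspecting governor and power-profile state through OS-supported interfaces. The sysfs path is standard on Linux; `powerprofilesctl` is only present when power-profiles-daemon is installed, and the profile switch is deliberately left commented:

```bash
# Inspect the current CPU frequency governor via sysfs (standard on Ubuntu kernels).
cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor 2>/dev/null \
  || echo "cpufreq interface not exposed on this system"

# If power-profiles-daemon is installed, show the active power profile.
if command -v powerprofilesctl >/dev/null 2>&1; then
  powerprofilesctl get
  # powerprofilesctl set performance   # uncomment to switch profiles
fi
```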
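For the PEP 668 point: Ubuntu 24.04 marks the system interpreter as externally managed, so `pip install` outside a virtual environment is refused. A sketch of the isolated-environment pattern (the `~/.venvs/ml` location is illustrative, and the actual PyTorch install command should be taken from the official install selector for your CUDA version):

```bash
# Create an isolated environment instead of installing into the system Python
# (Ubuntu 24.04 marks it externally managed per PEP 668).
VENV_DIR="${VENV_DIR:-$HOME/.venvs/ml}"   # location is illustrative
python3 -m venv "$VENV_DIR"
. "$VENV_DIR/bin/activate"

# Install CUDA-enabled PyTorch inside the venv; check the official PyTorch
# install selector for the wheel matching your CUDA version.
# pip install --upgrade pip
# pip install torch
```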
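The swappiness trade-off can be made concrete: Ubuntu's default of 60 swaps fairly eagerly, while ML workloads holding large resident tensors often prefer a lower value. A sketch, with 10 as an illustrative choice rather than this repository's prescription:

```bash
# Show the current swappiness (default is typically 60 on Ubuntu).
cat /proc/sys/vm/swappiness 2>/dev/null || echo "swappiness not exposed"

# Apply a lower value for the running session (requires root):
# sudo sysctl vm.swappiness=10

# Persist across reboots with a sysctl fragment:
# echo 'vm.swappiness=10' | sudo tee /etc/sysctl.d/99-swappiness.conf
```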
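The cleanup step can be previewed without root; a sketch assuming a Debian-based system (`apt-get -s` only simulates, and the destructive commands are left commented):

```bash
# Preview which setup-only packages would be removed (simulation; no root needed).
apt-get -s autoremove 2>/dev/null | grep '^Remv' || echo "nothing to autoremove"

# Check how much space the systemd journal occupies before vacuuming.
journalctl --disk-usage 2>/dev/null || echo "journal not available"

# Actual cleanup (requires root):
# sudo apt-get autoremove --purge -y
# sudo journalctl --vacuum-time=2weeks
```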
This repository is guided by a few core principles:
- **Correctness over shortcuts:** Prefer OS-supported mechanisms over brittle tweaks.
- **Reproducibility:** Steps should work consistently on fresh Ubuntu installs.
- **Graceful degradation:** Systems without NVIDIA GPUs should still function correctly.
- **Explicit trade-offs:** Performance decisions are documented, not hidden.
This setup has been validated on the following configuration:
- Ubuntu 24.04 LTS
- Intel CPU
- NVIDIA GPU (Optimus / PRIME)
- Python 3.12
- PyTorch with CUDA support
The scripts are written to degrade gracefully on systems without NVIDIA GPUs.
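That degradation typically reduces to gating GPU steps on a working driver; a sketch of the kind of guard involved (the actual logic in `scripts/setup.sh` may differ):

```bash
# Skip GPU-specific configuration when no usable NVIDIA driver is present.
if command -v nvidia-smi >/dev/null 2>&1 && nvidia-smi >/dev/null 2>&1; then
  echo "NVIDIA GPU detected; applying GPU configuration"
else
  echo "No usable NVIDIA GPU; skipping GPU steps"
fi
```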
Clone the repository and run:
```bash
bash scripts/setup.sh
bash scripts/verify.sh
```

The setup script configures performance defaults and creates an isolated Python environment for AI/ML work. The verification script confirms CPU, memory, and GPU behavior.
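The GPU verification amounts to running a small tensor operation on whatever device is available; a sketch in the spirit of `examples/gpu_test.py` (the actual script may differ; this one skips cleanly when PyTorch is absent):

```bash
# Minimal GPU-backed computation check; degrades gracefully without CUDA or PyTorch.
python3 - <<'PY'
try:
    import torch
except ImportError:
    print("PyTorch not installed; skipping GPU check")
    raise SystemExit(0)

device = "cuda" if torch.cuda.is_available() else "cpu"
x = torch.randn(512, 512, device=device)
checksum = (x @ x).sum().item()  # matrix multiply forces real computation
print(f"matmul ran on {device}; checksum={checksum:.3f}")
PY
```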
Performance validation and GPU verification results are documented in `Benchmarks.md`.
These benchmarks focus on correctness and observability rather than peak theoretical performance.
```
ubuntu-performance-ml-setup/
├── README.md
├── Benchmarks.md
├── scripts/
│   ├── setup.sh          # System setup & performance tuning
│   ├── verify.sh         # Validate CPU, GPU, memory state
│   └── cleanup.sh        # Remove setup-only dependencies
├── examples/
│   └── gpu_test.py       # Minimal CUDA-backed PyTorch test
└── docs/
    ├── architecture.md       # System design philosophy
    ├── tradeoffs.md          # Performance trade-offs
    ├── case-study.md         # Real-world debugging case study
    ├── lessons-learned.md    # Key engineering takeaways
    ├── design-decisions.md   # Explicit design decisions
    └── common-pitfalls.md    # Common failure modes to avoid
```
Most guides focus on getting things to run.
In practice, performance and stability issues often come from:
- CPU power states
- Memory pressure and swap behavior
- GPU offload configuration
- Python packaging boundaries
This repository exists to make those details explicit — and to provide a clean, explainable baseline that can be extended for real workloads.
A short real-world case study is available in `docs/case-study.md`.
This guide may be useful if you:
- Work on Linux-based development or ML systems
- Use NVIDIA GPUs on Ubuntu
- Care about system-level performance and correctness
- Want a clean, reproducible workstation setup
- Prefer understanding trade-offs over applying opaque tweaks
Vikram Pratap Singh
- GitHub: https://github.com/vikram2327
- LinkedIn: https://www.linkedin.com/in/vikrampratapsingh2
This repository is intentionally conservative:
- It avoids unsupported kernel or driver hacks
- It respects OS-level Python packaging rules
- It favors debuggability and predictability over maximum theoretical performance
The goal is a system that remains understandable, stable, and reproducible over time.