Ubuntu Performance & AI/ML System Setup

A reproducible, production-minded guide to configuring Ubuntu for performance and AI/ML workloads.
This repository documents real-world system tuning decisions across CPU, GPU, memory, and Python environments — with an emphasis on correctness, stability, and maintainability.

Rather than focusing on tool installation alone, this guide treats the operating system as part of the engineering surface area.


✨ What This Repository Covers

  • CPU performance tuning
    • Power profiles and governor behavior
    • Safe performance configuration using OS-supported mechanisms (see the sketch after this list)
  • NVIDIA GPU setup
    • PRIME (Optimus) configuration
    • CUDA-ready driver usage
    • Explicit verification of GPU-backed computation
  • Python AI/ML environment
    • Virtual environments (PEP 668–compliant)
    • CUDA-enabled PyTorch installation
    • Clean separation from system Python
  • Memory & swap optimization
    • Swap configuration for ML workloads
    • Swappiness tuning and trade-offs
  • Verification & validation
    • Scripts to confirm system state
    • Minimal GPU-backed computation tests
  • Cleanup & system hygiene
    • Removal of setup-only dependencies
    • Log and package cleanup for long-term stability
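
For orientation, the CPU tuning above relies only on mechanisms Ubuntu already ships. A minimal sketch of that kind of workflow (the exact commands in scripts/setup.sh may differ):

# Inspect and set the platform power profile (power-profiles-daemon, standard on Ubuntu desktop)
powerprofilesctl get
powerprofilesctl set performance

# Check the per-core frequency governor the kernel exposes
cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor

# Optional: detailed frequency and driver info (requires the linux-tools packages)
cpupower frequency-info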

🎯 Design Principles

This repository is guided by a few core principles:

  • Correctness over shortcuts
    Prefer OS-supported mechanisms over brittle tweaks.

  • Reproducibility
    Steps should work consistently on fresh Ubuntu installs.

  • Graceful degradation
    Systems without NVIDIA GPUs should still function correctly.

  • Explicit trade-offs
    Performance decisions are documented, not hidden.


🧪 Tested Environment

This setup has been validated on the following configuration:

  • Ubuntu 24.04 LTS
  • Intel CPU
  • NVIDIA GPU (Optimus / PRIME)
  • Python 3.12
  • PyTorch with CUDA support

The scripts are written to degrade gracefully on systems without NVIDIA GPUs.
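
As an illustration of that graceful degradation, GPU-specific steps can be guarded by a simple presence check. This is a sketch of the pattern, not necessarily the exact logic used in the scripts:

# Only run GPU-related steps if the NVIDIA driver stack is present
if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi --query-gpu=name,driver_version --format=csv,noheader
    # On Optimus laptops, report the current PRIME mode (nvidia-prime package)
    command -v prime-select >/dev/null 2>&1 && prime-select query
else
    echo "No NVIDIA GPU detected - skipping GPU setup" >&2
fi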


🚀 Quick Start

Clone the repository and run:

bash scripts/setup.sh
bash scripts/verify.sh

The setup script configures performance defaults and creates an isolated Python environment for AI/ML work. The verification script confirms CPU, memory, and GPU behavior.
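
The Python side follows the PEP 668–compliant workflow described earlier. A rough sketch of that shape (the venv path and CUDA wheel index below are illustrative; the exact versions depend on your driver):

# Create an isolated environment instead of installing into system Python (PEP 668)
python3 -m venv ~/.venvs/ml
source ~/.venvs/ml/bin/activate

# Install a CUDA-enabled PyTorch build; pick the wheel index matching your CUDA/driver version
pip install torch --index-url https://download.pytorch.org/whl/cu121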


📊 Benchmarks

Performance validation and GPU verification results are documented in:

👉 Benchmarks.md

These benchmarks focus on correctness and observability rather than peak theoretical performance.


📁 Repository Structure

ubuntu-performance-ml-setup/
├── README.md
├── Benchmarks.md
├── scripts/
│   ├── setup.sh        # System setup & performance tuning
│   ├── verify.sh       # Validate CPU, GPU, memory state
│   └── cleanup.sh      # Remove setup-only dependencies
├── examples/
│   └── gpu_test.py     # Minimal CUDA-backed PyTorch test
└── docs/
    ├── architecture.md       # System design philosophy
    ├── tradeoffs.md          # Performance trade-offs
    ├── case-study.md         # Real-world debugging case study
    ├── lessons-learned.md    # Key engineering takeaways
    ├── design-decisions.md   # Explicit design decisions
    └── common-pitfalls.md    # Common failure modes to avoid
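
The repository ships its own test in examples/gpu_test.py; a roughly equivalent quick check can be run from the activated environment (a sketch, not the file's actual contents):

# Confirm PyTorch sees the GPU, then run a small computation on it (falls back to CPU if absent)
python -c "import torch; print(torch.cuda.is_available())"
python -c "import torch; d = 'cuda' if torch.cuda.is_available() else 'cpu'; x = torch.rand(1024, 1024, device=d); print((x @ x).sum().item())"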

🧠 Why This Repository Exists

Most guides focus on getting things to run.

In practice, performance and stability issues often come from:

  • CPU power states
  • Memory pressure and swap behavior
  • GPU offload configuration
  • Python packaging boundaries

This repository exists to make those details explicit — and to provide a clean, explainable baseline that can be extended for real workloads.
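
On the memory and swap side, the relevant state is easy to inspect and adjust through standard interfaces. A minimal sketch, assuming swap itself is already configured (the values here are illustrative, not the repository's defaults):

# Inspect current swap devices and memory pressure
swapon --show
free -h

# Check swappiness and lower it so the kernel prefers RAM for ML workloads
cat /proc/sys/vm/swappiness
sudo sysctl vm.swappiness=10

# Persist the setting across reboots
echo 'vm.swappiness=10' | sudo tee /etc/sysctl.d/99-swappiness.conf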

A short real-world case study is available in docs/case-study.md.


🔍 Who This Is For

This guide may be useful if you:

  • Work on Linux-based development or ML systems
  • Use NVIDIA GPUs on Ubuntu
  • Care about system-level performance and correctness
  • Want a clean, reproducible workstation setup
  • Prefer understanding trade-offs over applying opaque tweaks

👤 Author

Vikram Pratap Singh


📌 Notes

This repository is intentionally conservative:

  • It avoids unsupported kernel or driver hacks
  • It respects OS-level Python packaging rules
  • It favors debuggability and predictability over maximum theoretical performance

The goal is a system that remains understandable, stable, and reproducible over time.
