CachyOS Performance Guide

This guide explains how CachyOS achieves its performance improvements, what optimizations are used, and how to get the best performance from your system.


Table of Contents

  1. Understanding CachyOS Performance Optimizations
  2. CPU Instruction Set Optimizations
  3. BORE Scheduler
  4. Other Scheduler Options
  5. Link Time Optimization (LTO)
  6. Profile-Guided Optimization (PGO)
  7. BOLT Optimization
  8. Custom Kernel (linux-cachyos)
  9. Performance Tips
  10. Measuring Performance

Understanding CachyOS Performance Optimizations

What Makes CachyOS Fast?

CachyOS achieves better performance through multiple optimization techniques:

  1. CPU Instruction Set Optimizations - Packages compiled for modern CPU features
  2. Advanced Schedulers - Better CPU task scheduling (BORE, EEVDF, etc.)
  3. Link Time Optimization (LTO) - Compiler optimizations across entire programs
  4. Profile-Guided Optimization (PGO) - Packages optimized based on real usage
  5. BOLT Optimization - Binary-level optimizations for specific packages
  6. Custom Kernel - Optimized kernel with performance patches

Real-World Performance Benefits

What you'll notice:

  • Faster application startup - Programs launch quicker
  • Lower input lag - Mouse and keyboard feel more responsive
  • Smoother gaming - Better frame times and lower latency
  • Faster compilation - Developers build code faster
  • Better multitasking - System stays responsive under load
  • Improved battery life - More efficient CPU usage (on laptops)

CPU Instruction Set Optimizations

What Are CPU Instruction Sets?

CPU instruction sets are collections of commands that a CPU can execute. Newer CPUs support more advanced instruction sets that can perform operations faster.

What is an instruction set?

  • Instruction: A command that tells the CPU what to do
  • Set: A collection of available instructions
  • Example instructions: Add two numbers, multiply, load data from memory
  • Different CPUs: Support different instruction sets

Why do instruction sets matter?

  • Older CPUs: Support basic instructions (can do the job, but slower)
  • Newer CPUs: Support advanced instructions (can do the same job faster)
  • Optimized software: Uses advanced instructions when available
  • Result: Same program runs faster on newer CPUs with advanced instructions

Real-world analogy:

  • Older CPUs: Basic tools (hammer, screwdriver)
  • Can build things, but takes longer
  • More manual work required
  • Slower but gets the job done
  • Newer CPUs: Power tools (drill, impact driver)
  • Can build the same things, but much faster
  • Less manual work required
  • Faster and more efficient

How CachyOS uses this:

  • Compiles software: Uses advanced instructions if your CPU supports them
  • Result: Programs run faster on your specific CPU
  • Example: If your CPU supports AVX2, CachyOS uses AVX2 instructions (faster)
  • If CPU doesn't support it: Uses basic instructions (still works, just slower)
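
As a hedged, illustrative sketch (the file name and tiny C function below are made up for the example), the commands show the same source compiled for generic x86-64 and for x86-64-v3; with GCC 11 or newer, the v3 build is allowed to use AVX2 and related instructions:

# Create a small loop that benefits from vectorization (example code)
cat > sum.c <<'EOF'
int sum(const int *a, int n) {
    int s = 0;
    for (int i = 0; i < n; i++) s += a[i];
    return s;
}
EOF

# Build for generic x86-64 and for x86-64-v3 (the -v3 target name needs GCC 11+)
gcc -O3 -march=x86-64    -c sum.c -o sum-generic.o
gcc -O3 -march=x86-64-v3 -c sum.c -o sum-v3.o

# The v3 object typically uses 256-bit AVX registers (ymm); the generic one sticks to SSE (xmm)
objdump -d sum-v3.o | grep -c ymm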

CachyOS Optimization Levels

CachyOS compiles packages for different CPU generations:

x86-64-v3 (Recommended Minimum)

What it is:

  • Optimized for CPUs from roughly 2013 onwards (Intel Haswell and later)
  • Uses AVX, AVX2, BMI1/BMI2, FMA, and other modern instructions

Supported CPUs:

  • Intel: Haswell (4th gen Core) or newer
  • Examples: Core i5-4xxx, Core i7-4xxx, Xeon E3 v3+
  • AMD: Excavator or newer
  • Examples: Excavator-based APUs, all Ryzen CPUs, EPYC

Performance gain:

  • 5-15% faster than generic x86-64
  • Better for most modern systems

How to check:

# Check if your CPU supports x86-64-v3
lscpu | grep "Flags" | grep -i "avx2"

What this command does:

  • lscpu: Lists CPU information
  • Shows CPU model, cores, architecture, and features
  • | grep "Flags": Finds the line showing CPU feature flags
  • Flags: CPU features/instructions your processor supports
  • | grep -i "avx2": Searches for "avx2" (Advanced Vector Extensions 2)
  • -i: Case-insensitive search
  • AVX2: A CPU instruction set required for x86-64-v3

Example output if supported:

Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm cpuid_fault epb pti ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid mpx rdseed adx smap clflushopt intel_pt xsaveopt xsavec xgetbv1 dtherm ida arat pln pts hwp hwp_notify hwp_act_window hwp_epp

What to look for:

  • If you see avx2 in the output: Your CPU supports x86-64-v3
  • If you don't see avx2: Your CPU doesn't support v3 (use lower optimization level)

Example output if NOT supported:

Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm

What this means:

  • No avx2 in the list
  • Your CPU is older (pre-Haswell, roughly pre-2013)
  • Use standard x86-64 packages (not v3 optimized)

Alternative check method:

# Check CPU model directly
lscpu | grep "Model name"

Example output:

Model name:            Intel(R) Core(TM) i7-9700K CPU @ 3.60GHz

What this tells you:

  • i7-9700K: 9th generation Intel Core (2018)
  • This CPU supports x86-64-v3 (Haswell or newer)
  • You can use v3 optimized packages
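
Another quick check, assuming a reasonably recent glibc (2.33 or newer): the dynamic loader can report directly which x86-64 microarchitecture levels your CPU supports. The loader path may be /lib64/ld-linux-x86-64.so.2 on some setups.

# Ask the glibc dynamic loader which x86-64 levels this CPU supports
/lib/ld-linux-x86-64.so.2 --help | grep supported

Lines such as "x86-64-v3 (supported, searched)" mean that level is available on this machine.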

x86-64-v4 (Better Performance)

What it is:

  • Optimized for CPUs with AVX-512 support
  • Uses AVX-512 and other advanced instructions

Supported CPUs:

  • Intel: CPUs with AVX-512 (mostly server and high-end desktop parts)
  • Examples: Xeon Scalable (Skylake-SP and newer), Core X-series, Ice Lake / Tiger Lake / Rocket Lake Core CPUs
  • AMD: Zen 4 or newer
  • Examples: Ryzen 7000 series, EPYC 9004 series

Performance gain:

  • Up to 10-20% faster than x86-64-v3 in workloads that use AVX-512
  • Only worthwhile on CPUs with full AVX-512 support

How to check:

# Check if your CPU supports x86-64-v4
lscpu | grep "Flags" | grep -i "avx512"

What this command does:

  • lscpu: Lists CPU information
  • | grep "Flags": Finds CPU feature flags
  • | grep -i "avx512": Searches for "avx512" (Advanced Vector Extensions 512)
  • AVX-512: A CPU instruction set required for x86-64-v4
  • More advanced than AVX2 (used in v3)

Example output if supported:

Flags: ... avx2 avx512f avx512dq avx512cd avx512bw avx512vl ...

What to look for:

  • If you see the avx512 flags: Your CPU likely supports x86-64-v4
  • Full v4 support requires avx512f, avx512bw, avx512cd, avx512dq, and avx512vl

Example output if NOT supported:

Flags: ... avx2 ... (no avx512)

What this means:

  • Your CPU supports v3 (has AVX2) but not v4 (no AVX-512)
  • Use x86-64-v3 optimized packages
  • Still get good performance improvements

Important note:

  • Many recent CPUs lack AVX-512: AMD Ryzen up through the 5000 series (Zen 3), and Intel 12th-gen-and-later consumer CPUs ship with it disabled
  • This doesn't mean they're slow - they simply top out at x86-64-v3
  • Check your specific CPU model for the best optimization level (a quick check script is shown below)
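
To check all five AVX-512 subsets that x86-64-v4 requires in one go, a small shell loop like the one below works (the flag names are the standard ones exposed in /proc/cpuinfo):

# Check every AVX-512 flag required for x86-64-v4
for flag in avx512f avx512bw avx512cd avx512dq avx512vl; do
    if grep -qw "$flag" /proc/cpuinfo; then
        echo "$flag: present"
    else
        echo "$flag: MISSING (no full x86-64-v4 support)"
    fi
done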

Zen4 (Best for AMD Ryzen 7000+)

What it is:

  • Specifically optimized for AMD Zen 4 architecture
  • Latest optimizations for newest AMD CPUs

Supported CPUs:

  • AMD: Ryzen 7000 series (Zen 4) or newer
  • Examples: Ryzen 5 7600X, Ryzen 7 7700X, Ryzen 9 7900X
  • EPYC 9004 series

Performance gain:

  • Best performance on supported CPUs
  • Optimized for latest AMD architecture features

How to check:

# Check your CPU model
lscpu | grep "Model name"

# Look for "Ryzen 7xxx" or "7xxx" in the name
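
If GCC is installed, you can also ask it which microarchitecture "native" resolves to on this machine; on a Ryzen 7000 CPU this typically reports znver4 (the Zen 4 target). This is a generic compiler check, not a CachyOS-specific tool.

# Ask GCC which -march value "native" resolves to on this CPU
gcc -march=native -Q --help=target | grep -- '-march='
# Expected on Ryzen 7000 series: -march=  znver4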

Which Optimization Level Should You Use?

General rule:

  • Use the highest level your CPU supports
  • Higher levels = better performance (if CPU supports it)
  • Using a level your CPU doesn't support will cause programs to crash with "illegal instruction" errors

Recommendations:

  • CPU with AVX2 but no AVX-512 (most desktop CPUs from 2013-2022): Use x86-64-v3
  • CPU with full AVX-512 support: Use x86-64-v4
  • AMD Ryzen 7000 or newer (Zen 4): Use Zen4 packages if available
  • Not sure?: Use x86-64-v3 (most compatible) - the repository check below shows what your install uses
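
You can also check which CachyOS repositories are enabled in /etc/pacman.conf; on a correctly set-up install the repository names reflect the optimization level (for example names containing v3, v4, or znver4 - the exact names depend on your installation).

# List the CachyOS repositories enabled in pacman.conf
grep '^\[cachyos' /etc/pacman.conf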

BORE Scheduler

What is a CPU Scheduler?

The CPU scheduler is a critical part of the operating system that decides:

  • Which programs run on which CPU cores: Distributes work across CPU cores
  • When programs get CPU time: Decides when each program gets to run
  • How CPU time is distributed: Shares CPU time fairly (or prioritizes certain tasks)

What is a CPU core?

  • CPU core: A processing unit inside your CPU
  • Modern CPUs: Have multiple cores (2, 4, 6, 8, 12, 16, etc.)
  • Each core: Can run one task at a time (or two hardware threads with SMT/hyperthreading)
  • Scheduler's job: Decide which program runs on which core
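
To see how many cores and threads your own CPU has, and how they are numbered, the standard tools below are enough:

# Number of logical CPUs (threads) available to the scheduler
nproc

# Per-core layout: which logical CPU belongs to which physical core
lscpu --extended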

Why it matters:

  • Affects system responsiveness: How quickly your system responds to you
  • Good scheduler: System feels snappy and responsive
  • Bad scheduler: System feels sluggish and laggy
  • Determines input lag: Delay between your action and system response
  • Example: Moving mouse → cursor moves (lower lag = better)
  • Gaming: Lower input lag = better gaming experience
  • Impacts gaming performance: How smoothly games run
  • Good scheduler: Games run smoothly, consistent frame times
  • Bad scheduler: Games stutter, inconsistent performance
  • Affects multitasking: Running multiple programs at once
  • Good scheduler: All programs run smoothly
  • Bad scheduler: Some programs lag when others are running

Real-world example:

  • Without good scheduler:
  • You're playing a game, then open a browser
  • Game stutters because browser gets too much CPU time
  • System feels unresponsive
  • With good scheduler:
  • You're playing a game, then open a browser
  • Game keeps running smoothly (gets priority)
  • Browser still works, but doesn't interfere with game
  • System stays responsive

What is BORE?

BORE stands for "Burst-Oriented Response Enhancer".

What does "Burst-Oriented" mean?

  • Burst: A sudden increase in activity (you click something, type, move mouse)
  • Oriented: Designed to handle bursts of activity
  • BORE's focus: Responds quickly when you interact with the system

What it does:

  • Prioritizes interactive tasks: Gives priority to things you're actively using
  • Interactive tasks: Mouse movement, keyboard input, games, applications you're using
  • Background tasks: File downloads, system updates, background processes
  • BORE's approach: Interactive tasks get CPU time first
  • Reduces latency: Makes things respond faster
  • Latency: Delay between action and response
  • Example: Click button → application responds (lower latency = faster response)
  • Improves responsiveness under load: System stays responsive even when busy
  • Under load: When CPU is busy (compiling, rendering, etc.)
  • BORE's benefit: System still feels responsive even when CPU is working hard

Key features:

Burst detection:

  • What it does: Identifies when you're actively using the system
  • How it works: Detects sudden increases in activity (mouse movement, keyboard input)
  • Result: System knows when you're interacting and prioritizes accordingly
  • Example: You move mouse → BORE detects burst → gives priority to your applications

Priority boost:

  • What it does: Gives interactive tasks more CPU time
  • How it works: Temporarily increases priority of tasks you're using
  • Result: Your active applications get more CPU time than background tasks
  • Example: Game you're playing gets more CPU time than background download

Low latency:

  • What it does: Reduces delay between input and response
  • How it works: Prioritizes tasks that need immediate response
  • Result: System responds faster to your actions
  • Example: Clicking a button → application responds almost instantly

How BORE Improves Performance

Gaming:

  • Lower input lag: Delay between your action and game response is reduced
  • What it means: Mouse movement, keyboard presses feel more immediate
  • Real-world impact: Games feel more responsive, easier to aim, better control
  • Example: Moving mouse in FPS game → crosshair moves almost instantly
  • More consistent frame times: Frame rendering times are more stable
  • What it means: Each frame takes similar time to render
  • Real-world impact: Smoother gameplay, less stuttering
  • Example: Game runs at 60 FPS consistently instead of jumping between 50-70 FPS
  • Better performance in CPU-intensive games: Games that need lots of CPU run better
  • What it means: Games that heavily use CPU get better performance
  • Real-world impact: Complex games run smoother, less lag
  • Example: Strategy games, simulation games, games with many NPCs run better

Desktop use:

  • Instant response to mouse/keyboard: Input devices respond immediately
  • What it means: Mouse cursor and keyboard input feel instant
  • Real-world impact: System feels snappy and responsive
  • Example: Moving mouse → cursor moves instantly, no delay
  • Smoother window animations: Window transitions are fluid
  • What it means: Opening, closing, resizing windows is smooth
  • Real-world impact: Desktop feels polished and professional
  • Example: Opening application → window animates smoothly, no stuttering
  • No stuttering when background tasks run: System stays smooth even when busy
  • What it means: Background tasks don't cause visual stuttering
  • Real-world impact: Can run updates, downloads, etc. without affecting desktop
  • Example: Downloading large file → desktop still smooth, no lag

Multitasking:

  • Active applications stay responsive: Programs you're using don't lag
  • What it means: Applications you're actively using get priority
  • Real-world impact: Can work with multiple programs without slowdown
  • Example: Browser, text editor, music player all run smoothly together
  • Background tasks don't interrupt your work: Background processes don't interfere
  • What it means: System updates, downloads, etc. don't slow down active work
  • Real-world impact: Can run background tasks without affecting productivity
  • Example: System updating packages → your work continues smoothly
  • Better balance between foreground and background: System balances priorities well
  • What it means: Active tasks get priority, but background tasks still progress
  • Real-world impact: Best of both worlds - responsive system and background progress
  • Example: Your work is responsive, but downloads still complete

BORE vs Standard Scheduler

Standard Linux scheduler (CFS - Completely Fair Scheduler, the default before kernel 6.6):

  • Fair distribution of CPU time: All tasks get equal CPU time
  • What it means: Every program gets the same amount of CPU time
  • Problem: Doesn't prioritize what you're actively using
  • Result: Background tasks can slow down active work
  • All tasks treated equally: No priority for interactive tasks
  • What it means: Your active application gets same priority as background download
  • Problem: System doesn't know what you're actively using
  • Result: Can cause lag when background tasks are running
  • Can cause latency spikes: Sometimes has delays
  • What it means: System can occasionally feel unresponsive
  • Problem: Fair scheduling can cause delays for interactive tasks
  • Result: Mouse/keyboard input can feel laggy sometimes

BORE scheduler:

  • Prioritizes interactive tasks: Gives priority to things you're using
  • What it means: Active applications get more CPU time
  • Benefit: System knows what you're using and prioritizes it
  • Result: Active work stays responsive
  • Reduces latency for user actions: Input responds faster
  • What it means: Mouse/keyboard input gets immediate response
  • Benefit: System feels more responsive
  • Result: Lower input lag, faster response times
  • Better for desktop and gaming use: Optimized for interactive use
  • What it means: Designed for how people actually use computers
  • Benefit: Better experience for desktop users and gamers
  • Result: Smoother, more responsive system

Real-world difference:

Scenario 1: Playing a game while downloading files

  • With CFS (standard scheduler):
  • Game stutters when download starts
  • Input lag increases
  • Frame rate drops
  • Why: Download gets equal CPU time, interferes with game
  • With BORE:
  • Game continues running smoothly
  • Input lag stays low
  • Frame rate remains stable
  • Why: Game gets priority, download runs in background

Scenario 2: Compiling code while browsing the web

  • With CFS (standard scheduler):
  • Browser becomes laggy during compilation
  • Scrolling stutters
  • Page loading slows down
  • Why: Compilation gets equal CPU time, slows down browser
  • With BORE:
  • Browser stays responsive
  • Scrolling is smooth
  • Page loading remains fast
  • Why: Browser gets priority when you're using it

Scenario 3: System update while working

  • With CFS (standard scheduler):
  • Active applications lag during update
  • Mouse/keyboard feel unresponsive
  • System feels sluggish
  • Why: Update process competes equally with active work
  • With BORE:
  • Active applications stay responsive
  • Mouse/keyboard feel instant
  • System feels snappy
  • Why: Active work gets priority, update runs in background

Specific improvements:

  • Mouse movement: Feels instant with BORE (no delay, immediate response)
  • Game input: Lower latency, smoother gameplay (better control, less lag)
  • Application switching: Faster, more responsive (Alt+Tab feels instant)
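
To confirm you are actually running a BORE-enabled kernel, check the kernel name and, on kernels carrying the BORE patch, the sysctl it exposes. The sysctl name kernel.sched_bore is an assumption based on the upstream BORE patch and may differ between kernel versions.

# Kernel name should contain "cachyos"
uname -r

# On BORE-patched kernels this sysctl is typically present (a nonzero value usually means BORE is enabled)
sysctl kernel.sched_bore 2>/dev/null || echo "sched_bore sysctl not found (kernel may not carry the BORE patch)"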

Other Scheduler Options

CachyOS offers multiple scheduler options for different use cases:

EEVDF Scheduler

What it is:

  • Earliest Eligible Virtual Deadline First
  • Modern scheduler from Linux kernel 6.6+
  • Fair and efficient

Best for:

  • General desktop use
  • Servers
  • Balanced workloads

Characteristics:

  • Fair CPU time distribution
  • Good for multitasking
  • Modern design

sched-ext Scheduler

What it is:

  • Extensible scheduler framework (sched_ext)
  • Allows custom schedulers implemented as BPF programs loaded from user space
  • Experimental but powerful

Best for:

  • Advanced users
  • Custom scheduler development
  • Research and experimentation

Characteristics:

  • Highly customizable
  • Can implement custom scheduling policies
  • Requires kernel support
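
As a rough sketch of what using sched-ext looks like in practice: user-space schedulers are shipped as separate programs that you run on top of a sched-ext-capable kernel. The names below (the scx-scheds package and the scx_rusty scheduler) come from the upstream sched_ext project and are assumptions about how CachyOS packages them - check the repositories before relying on them.

# Install the collection of sched_ext example schedulers (package name assumed)
sudo pacman -S scx-scheds

# Run one of them in the foreground; when it exits (Ctrl+C), the kernel falls back to the default scheduler
sudo scx_rusty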

ECHO Scheduler

What it is:

  • Another scheduler option
  • Different scheduling algorithm
  • Alternative to BORE

Best for:

  • Users who want to try different schedulers
  • Specific workloads that benefit from ECHO

RT (Real-Time) Scheduler

What it is:

  • Real-Time scheduler
  • For time-critical applications
  • Guarantees response times

Best for:

  • Audio production
  • Real-time applications
  • Time-critical tasks

Characteristics:

  • Predictable timing
  • Low latency guarantees
  • May affect other applications

Which Scheduler Should You Use?

Recommendations:

  • Most users: BORE (default) - best for desktop and gaming
  • Servers: EEVDF - fair and efficient
  • Audio production: RT - time-critical
  • Experimentation: sched-ext - customizable
  • Not sure: Stick with BORE (default)

How to change scheduler:

  • Select during installation
  • Or change kernel package later:
    # Install different kernel variant
    sudo pacman -S linux-cachyos-eevdf  # For EEVDF

Link Time Optimization (LTO)

What is LTO?

Link Time Optimization (LTO) is a compiler optimization technique that optimizes code across the entire program, not just individual files.

What is a compiler?

  • Compiler: Software that converts source code (human-readable) into machine code (computer-readable)
  • Optimization: Making code run faster and use resources more efficiently
  • Traditional compilation: Each file is compiled and optimized separately
  • LTO compilation: Entire program is analyzed and optimized together

How it works:

  1. Compiler analyzes entire program
  • Looks at all source files together (not separately)
  • Understands how different files interact
  • Sees relationships between code in different files
  • Benefit: Can make better optimization decisions
  2. Optimizes across file boundaries
  • Can optimize code that spans multiple files
  • Removes redundant code between files
  • Better function inlining across files
  • Benefit: More efficient code overall
  3. Removes unused code
  • Identifies functions and code that's never called
  • Removes dead code (code that can never execute)
  • Benefit: Smaller programs, faster execution
  4. Inlines functions more aggressively
  • Inlining: Replaces function calls with the actual function code
  • Why it helps: Eliminates function call overhead
  • LTO advantage: Can inline functions even when they're in different files
  • Benefit: Faster execution (no function call overhead)
  5. Better register allocation
  • Registers: Fast storage inside the CPU (much faster than RAM)
  • Allocation: Deciding which variables go in registers
  • LTO advantage: Can make better decisions across entire program
  • Benefit: More variables in fast registers = faster execution

Think of it like:

  • Without LTO: Optimizing each room of a house separately
  • Each room optimized independently
  • Doesn't consider how rooms connect
  • May miss optimization opportunities
  • Example: Two rooms both have heating - could share one system
  • With LTO: Optimizing the entire house as a whole
  • Considers entire house layout
  • Optimizes connections between rooms
  • Better overall optimization
  • Example: One heating system for entire house (more efficient)
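
A minimal sketch of what LTO looks like to a compiler, using two hypothetical C files (main.c and util.c): with -flto, the link step sees both files' intermediate representation and can inline and prune across them.

# Without LTO: each file is optimized in isolation
gcc -O2 -c main.c -o main.o
gcc -O2 -c util.c -o util.o
gcc -O2 main.o util.o -o app-nolto

# With LTO: optimization decisions are made again at link time,
# across both files (cross-file inlining, dead code removal)
gcc -O2 -flto -c main.c -o main.o
gcc -O2 -flto -c util.c -o util.o
gcc -O2 -flto main.o util.o -o app-lto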

Benefits of LTO

Performance improvements:

  • 5-15% faster execution: Programs run noticeably faster
  • Real-world impact: Applications feel more responsive
  • Example: Web browser loads pages 10% faster
  • Smaller binary sizes: Compiled programs are smaller
  • Why: Unused code is removed
  • Benefit: Less disk space, faster loading from disk
  • Better code optimization: More efficient code generation
  • Why: Compiler sees entire program
  • Benefit: Better optimization decisions
  • More efficient memory usage: Better memory access patterns
  • Why: Optimized code layout
  • Benefit: Better CPU cache usage

Real-world examples:

  • Application startup: 10-20% faster launch times
  • Game loading: Levels and assets load quicker
  • Compilation: Development tools compile code faster
  • System responsiveness: Overall system feels snappier

Trade-offs:

  • Longer compilation time: Takes more time to compile packages
  • Why: Compiler does more analysis (looks at entire program)
  • Impact: Only affects package builders, not end users
  • For you: You don't notice this (packages are pre-compiled)
  • More memory during compilation: Needs more RAM when compiling
  • Why: Analyzes entire program at once (needs more memory)
  • Impact: Only affects package builders, not end users
  • For you: No impact (you're not compiling packages)
  • Slightly larger package repository: LTO packages may be slightly larger
  • Why: Contains optimization metadata
  • Impact: Minimal (usually offset by smaller final binaries)
  • For you: Negligible impact on disk space

Is LTO worth it?

  • For end users: Absolutely! You get faster programs with no downsides
  • For package maintainers: Trade-off between build time and performance
  • CachyOS choice: Uses LTO for better performance (worth the trade-off)

What Packages Use LTO?

In CachyOS:

  • Core system packages
  • Frequently used applications
  • Performance-critical software

Examples:

  • Kernel
  • Desktop environments
  • System libraries
  • Development tools

Profile-Guided Optimization (PGO)

What is PGO?

Profile-Guided Optimization (PGO) optimizes packages based on how they're actually used in real-world scenarios.

What is profiling?

  • Profiling: Recording how a program runs in real use
  • Data collected: Which functions are called, how often, which code paths are taken
  • Purpose: Understand real-world usage patterns (not theoretical)
  • Result: Optimize based on actual usage, not compiler guesses

How it works:

  1. Package is compiled with profiling enabled
  • First compilation includes special profiling code
  • Profiling code records execution information as program runs
  • Creates "instrumented" binary (binary with tracking code built in)
  • Think of it: Like adding sensors to a car to see how it's driven
  2. Package is used in typical scenarios
  • Package is run in real-world situations
  • People use it normally (browse web, edit documents, play games, etc.)
  • Profiling code records what happens during use
  • Think of it: Driving the car normally while sensors record data
  3. Profiling data is collected
  • System gathers information about:
  • Which functions are called most: Most-used functions identified
  • Which code paths are taken: Common execution paths found
  • How often operations occur: Frequency of different operations
  • Execution time: How long different parts take
  • Think of it: Analyzing the sensor data to see driving patterns
  4. Package is recompiled using profiling data
  • Compiler reads the profiling data
  • Understands how the program is actually used
  • Optimizes based on real usage patterns (not guesses)
  • Creates final optimized binary
  • Think of it: Redesigning the car based on how it's actually driven
  5. Result: Optimized for actual usage patterns
  • Code is optimized for how it's really used
  • Frequently used code is optimized more (gets more attention)
  • Rarely used code is optimized less (saves compilation time)
  • Better overall performance for real-world use
  • Think of it: Car is now optimized for actual driving conditions

Think of it like:

  • Without PGO: Optimizing based on guesses
  • Compiler guesses what's important
  • May optimize wrong things (things rarely used)
  • May miss optimization opportunities (things frequently used)
  • Example: Optimizing a rarely-used feature instead of common one
  • With PGO: Optimizing based on real usage data
  • Compiler knows what's actually used
  • Optimizes the right things (frequently used code)
  • Better optimization decisions
  • Example: Optimizing the feature people use most
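
A minimal sketch of the PGO workflow with GCC, using a hypothetical program app.c and a made-up workload flag: -fprofile-generate and -fprofile-use are the standard GCC options (Clang has equivalents).

# Step 1: build with profiling instrumentation
gcc -O2 -fprofile-generate -o app app.c

# Step 2: run the program on a typical workload; this writes .gcda profile files
./app --typical-workload

# Step 3: rebuild using the collected profile
gcc -O2 -fprofile-use -o app app.c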

Benefits of PGO

Performance improvements:

  • 10-30% faster for optimized packages: Significant speedup
  • Real-world impact: Noticeably faster applications
  • Example: Web browser renders pages 20% faster
  • Better branch prediction: CPU predicts code paths better
  • Branch prediction: CPU guessing which code path will be taken (if/else, loops)
  • Better prediction: Less CPU stalls, faster execution
  • Why: Code layout optimized based on actual paths taken
  • More efficient code paths: Frequently used paths optimized more
  • Code paths: Different ways code can execute (different branches)
  • Optimization: Common paths get more optimization
  • Benefit: Most-used code runs fastest
  • Optimized for common operations: Common tasks run faster
  • Common operations: Things people do most often
  • Optimization: These get special attention
  • Benefit: Everyday tasks are faster

Real-world examples:

  • Web browsers: Faster page loading, smoother scrolling, quicker JavaScript execution
  • Compilers: Faster code compilation (compilers compile themselves with PGO)
  • Desktop environments: Smoother animations, faster window operations
  • System libraries: Better performance for all applications using them
  • Games: Faster loading, smoother gameplay

What gets optimized:

  • Frequently used functions: Functions called often are optimized more
  • Example: In a web browser, page rendering function gets optimized
  • Hot code paths: Code paths taken frequently are optimized
  • Example: Common user interactions get optimized
  • Common operations: Everyday operations run faster
  • Example: Opening files, saving documents, scrolling
  • Real-world usage patterns: Optimized for how software is actually used
  • Example: Optimized for typical user behavior, not edge cases

What Packages Use PGO?

In CachyOS:

  • Critical system components
  • Frequently used applications
  • Performance-sensitive software

Examples:

  • Web browsers
  • Compilers
  • System libraries
  • Desktop environments

BOLT Optimization

What is BOLT?

BOLT stands for Binary Optimization and Layout Tool.

What it does:

  • Optimizes compiled binaries at the binary level
  • Rearranges code for better CPU cache usage
  • Improves branch prediction
  • Optimizes hot code paths

How it works:

  1. Analyzes binary execution
  2. Identifies hot (frequently used) code
  3. Rearranges code layout
  4. Optimizes for CPU cache
  5. Improves performance
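
A rough sketch of the BOLT workflow, assuming LLVM's BOLT tools are installed, the binary was linked with relocations kept (-Wl,--emit-relocs), and the CPU supports branch sampling for perf; exact flags vary between BOLT versions, and the program name is a placeholder.

# 1. Collect an execution profile while the program does typical work
perf record -e cycles:u -j any,u -- ./myapp --typical-workload

# 2. Convert the perf profile into BOLT's format
perf2bolt -p perf.data -o myapp.fdata ./myapp

# 3. Produce an optimized binary with a better code layout
llvm-bolt ./myapp -o myapp.bolt -data=myapp.fdata \
    -reorder-blocks=ext-tsp -reorder-functions=hfsort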

Benefits of BOLT

Performance improvements:

  • 5-20% faster for optimized binaries
  • Better CPU cache utilization
  • Improved branch prediction
  • More efficient code layout

What gets optimized:

  • Frequently executed code
  • Hot functions
  • Critical code paths

What Packages Use BOLT?

In CachyOS:

  • Selected high-impact packages
  • Performance-critical applications
  • Frequently used software

Examples:

  • Web browsers
  • Development tools
  • System utilities

Custom Kernel (linux-cachyos)

What is linux-cachyos?

linux-cachyos is CachyOS's custom-compiled Linux kernel with performance optimizations.

What's different:

  • BORE scheduler (or other scheduler options)
  • Optimized compilation flags
  • LTO compilation
  • Performance patches
  • Modern CPU optimizations

Kernel Variants

Available kernels:

  • linux-cachyos - Default with BORE scheduler
  • linux-cachyos-eevdf - EEVDF scheduler
  • linux-cachyos-sched-ext - sched-ext scheduler
  • linux-cachyos-rt - Real-time kernel
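
Installing a variant alongside your current kernel is a normal pacman operation; the matching -headers package (name assumed to follow the usual Arch pattern) is only needed if you build out-of-tree modules such as DKMS drivers.

# Example: install the EEVDF variant and its headers alongside the current kernel
sudo pacman -S linux-cachyos-eevdf linux-cachyos-eevdf-headers

# After installing, reboot and select the new kernel in your boot menu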

How to check your kernel:

# Check kernel version
uname -r

# Should show something like:
# 6.x.x-cachyos

Benefits of Custom Kernel

Performance:

  • Better scheduler (BORE)
  • Optimized compilation
  • Performance patches
  • Modern optimizations

Features:

  • Multiple scheduler options
  • Better hardware support
  • Performance improvements

Performance Tips

1. Use the Right CPU Optimization Level

Make sure you're using the highest optimization level your CPU supports:

# Check your CPU
lscpu

# List installed CachyOS packages (the repository they come from reflects the optimization level)
pacman -Q | grep cachyos

2. Choose the Right Scheduler

For most users:

  • Use BORE scheduler (default)
  • Best for desktop and gaming

For specific use cases:

  • Servers: EEVDF
  • Audio: RT
  • Experimentation: sched-ext

3. Keep System Updated

Regular updates include performance improvements:

# Update system regularly
sudo pacman -Syu

4. Use SSD for System Drive

SSD provides:

  • Faster boot times
  • Faster application launches
  • Better overall responsiveness

If using HDD:

  • Consider upgrading to SSD
  • Or use SSD for system, HDD for data

5. Optimize Desktop Environment

Lightweight DEs are faster:

  • XFCE, LXQt: Lightweight
  • KDE, GNOME: More features, more resources

Choose based on:

  • Your hardware capabilities
  • Your preferences
  • Performance requirements

6. Disable Unnecessary Services

Reduce background processes:

# Check running services
systemctl list-units --type=service --state=running

# Disable services you don't need
sudo systemctl disable service-name

7. Use Appropriate Graphics Drivers

Install proper drivers:

# Show chwd (CachyOS hardware detection tool) options
sudo chwd -h

# Or install manually
# NVIDIA (the dkms variant builds modules for custom kernels like linux-cachyos):
sudo pacman -S nvidia-dkms

# AMD: Open-source drivers usually work out of the box

8. Monitor System Resources

Keep an eye on resource usage:

# Check CPU and memory
htop

# Or
top

Identify resource hogs:

  • Close unnecessary applications
  • Optimize startup programs
  • Manage browser tabs

Measuring Performance

Benchmarking Tools

CPU performance:

# Install benchmarking tools
sudo pacman -S sysbench

# Run CPU benchmark
sysbench cpu --threads=4 run

Disk performance:

# Test disk speed
sudo pacman -S hdparm

# Test read speed (replace /dev/sda with your disk, e.g. /dev/nvme0n1)
sudo hdparm -tT /dev/sda

Memory performance:

# Test memory
sysbench memory --threads=4 run

Real-World Performance Tests

Application startup time:

  • Time how long applications take to launch
  • Compare before/after optimizations
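
One easy way to put numbers on this is hyperfine (available in the Arch repositories), which runs a command several times and reports the mean and spread; the application name below is just a placeholder.

# Install hyperfine and time a (hypothetical) application's startup
sudo pacman -S hyperfine
hyperfine --warmup 1 'someapp --version'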

Gaming performance:

  • Use in-game FPS counters
  • Monitor frame times
  • Check input lag

System responsiveness:

  • Test mouse/keyboard latency
  • Check window animation smoothness
  • Monitor system under load

Comparing Performance

Before/after comparisons:

  • Benchmark before optimizations
  • Apply optimizations
  • Benchmark again
  • Compare results

Note: Performance improvements vary by:

  • Hardware
  • Workload
  • Applications used
  • System configuration

Summary

This guide covered:

  1. CPU instruction set optimizations - x86-64-v3, v4, Zen4
  2. BORE scheduler - Burst-Oriented Response Enhancer
  3. Other schedulers - EEVDF, sched-ext, RT, ECHO
  4. LTO - Link Time Optimization
  5. PGO - Profile-Guided Optimization
  6. BOLT - Binary Optimization and Layout Tool
  7. Custom kernel - linux-cachyos
  8. Performance tips - Getting the best performance
  9. Measuring performance - Benchmarking and testing

Key Takeaways:

  • CachyOS uses multiple optimization techniques
  • BORE scheduler improves responsiveness
  • CPU instruction set optimizations provide significant gains
  • LTO, PGO, and BOLT optimize packages further
  • Custom kernel adds performance improvements
  • Choose optimizations based on your hardware

This guide is based on the CachyOS Wiki and expanded with detailed explanations for beginners. For the most up-to-date performance information, always refer to the official CachyOS documentation.
