Merged
4 changes: 2 additions & 2 deletions .clang-format
@@ -103,8 +103,8 @@ IncludeCategories:
- Regex: '^<[^/]+>$'
Priority: 2
SortPriority: 2
# 3. Third-party libraries (uni-algo, catch2, nlohmann)
- Regex: '^<(uni|catch2|nlohmann)/'
# 3. Third-party libraries (catch2, nlohmann)
- Regex: '^<(catch2|nlohmann)/'
Priority: 3
SortPriority: 3
# 4. Project headers (skyr/...)
2 changes: 1 addition & 1 deletion .github/workflows/clang-tidy.yml
@@ -28,7 +28,7 @@ jobs:
./bootstrap-vcpkg.sh
export CXXFLAGS="-stdlib=libc++"
export LDFLAGS="-stdlib=libc++ -lc++abi"
./vcpkg install uni-algo nlohmann-json
./vcpkg install nlohmann-json

- name: Install Catch2
run: |
11 changes: 0 additions & 11 deletions .github/workflows/sanitizers.yml
@@ -115,17 +115,6 @@ jobs:
git checkout -b master origin/master
./bootstrap-vcpkg.sh

# For Clang builds, use custom triplet with libc++ and set compiler
if [[ "${{ matrix.config.use_libcxx }}" == "true" ]]; then
export CC=${{ matrix.config.cc }}
export CXX=${{ matrix.config.cxx }}
./vcpkg install uni-algo \
--triplet x64-linux-libcxx \
--overlay-triplets=${GITHUB_WORKSPACE}/cmake/vcpkg-triplets
else
./vcpkg install uni-algo
fi

- name: Configure
shell: cmake -P {0}
run: |
8 changes: 4 additions & 4 deletions .github/workflows/skyr-url-ci.yml
@@ -298,11 +298,11 @@ jobs:
if [[ "${{ matrix.config.cxx }}" == clang++* ]]; then
export CC=${{ matrix.config.cc }}
export CXX=${{ matrix.config.cxx }}
./vcpkg install catch2 uni-algo nlohmann-json \
./vcpkg install catch2 nlohmann-json \
--triplet x64-linux-libcxx \
--overlay-triplets=${GITHUB_WORKSPACE}/cmake/vcpkg-triplets
else
./vcpkg install catch2 uni-algo nlohmann-json
./vcpkg install catch2 nlohmann-json
fi

- name: Install LLVM Clang on macOS
@@ -338,7 +338,7 @@ jobs:
git checkout -b master origin/master
export
./bootstrap-vcpkg.sh
./vcpkg install catch2 uni-algo nlohmann-json
./vcpkg install catch2 nlohmann-json

- name: Install vcpkg (Windows)
id: vcpkg_windows
@@ -348,7 +348,7 @@ jobs:
git clone https://github.com/microsoft/vcpkg.git ${Env:GITHUB_WORKSPACE}\vcpkg
cd ${Env:GITHUB_WORKSPACE}\vcpkg
.\bootstrap-vcpkg.bat
vcpkg install catch2 uni-algo nlohmann-json --triplet x64-windows
vcpkg install catch2 nlohmann-json --triplet x64-windows


- name: Configure
2 changes: 1 addition & 1 deletion .github/workflows/wpt.yml
@@ -61,7 +61,7 @@ jobs:
git fetch origin master
git checkout -b master origin/master
./bootstrap-vcpkg.sh
./vcpkg install uni-algo nlohmann-json
./vcpkg install nlohmann-json

- name: Configure
shell: cmake -P {0}
168 changes: 160 additions & 8 deletions CLAUDE.md
@@ -21,11 +21,11 @@ This file provides guidance to Claude Code (claude.ai/code) when working with co
- `std::expected<T, E>` for error handling (replaces `tl::expected`)
- `std::format` for string formatting (replaces `fmt::format`)
- `std::ranges` for range-based algorithms and views (replaces `range-v3`)
- `uni-algo` library for Unicode processing
- Custom Unicode/IDNA implementation (header-only)

**Key Advantages**:
- **Header-only** - just include and use, no linking required
- **Minimal external dependencies** - only requires `uni-algo` for Unicode support
- **Zero external dependencies** - completely self-contained for core URL parsing

## Building

@@ -35,17 +35,16 @@ This file provides guidance to Claude Code (claude.ai/code) when working with co
- C++23-compliant compiler (GCC 13+, Clang 16+, MSVC 2022 17.6+)

**Optional** (automatically disabled with warnings if not found):
- `uni-algo` for full Unicode/IDNA processing
- `catch2` for tests
- `nlohmann-json` for JSON functionality

To install optional dependencies:
```bash
cd ${VCPKG_ROOT}
./vcpkg install uni-algo catch2 nlohmann-json
./vcpkg install catch2 nlohmann-json
```

**Note**: The library will work for basic URL parsing even without dependencies, but IDNA/Punycode (internationalized domain names) require `uni-algo`.
**Note**: The core library is self-contained and has no required external dependencies. Unicode/IDNA/Punycode support is built in via a custom header-only implementation.

### Configure and Build

@@ -64,6 +63,7 @@ cmake --build _build
Key build options:
- `skyr_BUILD_TESTS` (ON): Build tests
- `skyr_BUILD_WPT` (OFF): Build Web Platform Tests runner
- `skyr_BUILD_BENCHMARKS` (OFF): Build performance benchmarks
- `skyr_ENABLE_FILESYSTEM_FUNCTIONS` (ON): Enable filesystem::path conversion
- `skyr_ENABLE_JSON_FUNCTIONS` (ON): Enable JSON serialization
- `skyr_BUILD_WITHOUT_EXCEPTIONS` (OFF): Build without exceptions
@@ -222,6 +222,159 @@ Test data comes from the official WPT repository:

This ensures compliance testing against the latest WhatWG URL specification test cases.

## Benchmarks

**Performance benchmarks** measure runtime URL parsing speed to identify optimization opportunities and track performance regressions.

### Philosophy

- **Measure, don't guess** - Profile before optimizing
- **Real-world scenarios** - Tests diverse URL patterns (ASCII, IDN, IPv6, percent-encoded, etc.)
- **Actionable metrics** - Reports average µs/URL and throughput (URLs/second)
- **Optional** - Not required for normal development (disabled by default)

### Building Benchmarks

```bash
cmake \
-B _build \
-G "Ninja" \
-Dskyr_BUILD_BENCHMARKS=ON \
.
cmake --build _build --target url_parsing_bench
```

### Running Benchmarks

```bash
# Default: 10,000 iterations × 34 URLs = 340,000 parses
./_build/benchmark/url_parsing_bench

# Custom iteration count (100,000 iterations)
./_build/benchmark/url_parsing_bench 100000

# Quick test (1,000 iterations)
./_build/benchmark/url_parsing_bench 1000
```
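
The benchmark source itself is not part of this diff, so the following is only a minimal sketch of how a harness with this command line could be structured: an optional iteration count read from the first argument, and a timed loop over a fixed URL list. `parse_stub`, `iterations_from_arg`, and `run_benchmark` are hypothetical names for illustration, and `parse_stub` merely stands in for the real parser call.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdlib>
#include <string_view>
#include <vector>

// Hypothetical stand-in for the real URL parser: "succeeds" if a scheme
// separator is present. Replace with the actual parse call when adapting.
inline bool parse_stub(std::string_view url) {
  return url.find("://") != std::string_view::npos;
}

// Read the iteration count from an optional argument (e.g. argv[1]),
// falling back to the default when it is absent, malformed, or zero.
inline std::size_t iterations_from_arg(const char* arg, std::size_t fallback = 10000) {
  if (arg == nullptr) {
    return fallback;
  }
  char* end = nullptr;
  const unsigned long long n = std::strtoull(arg, &end, 10);
  if (end == arg || *end != '\0' || n == 0) {
    return fallback;
  }
  return static_cast<std::size_t>(n);
}

struct run_result {
  std::size_t ok;
  std::size_t failed;
  std::chrono::nanoseconds elapsed;
};

// Parse every URL `iterations` times and record successes, failures,
// and the wall-clock time of the whole loop.
template <class Parse>
run_result run_benchmark(const std::vector<std::string_view>& urls,
                         std::size_t iterations, Parse parse) {
  assert(!urls.empty());
  std::size_t ok = 0;
  std::size_t failed = 0;
  const auto start = std::chrono::steady_clock::now();
  for (std::size_t i = 0; i < iterations; ++i) {
    for (const auto url : urls) {
      if (parse(url)) {
        ++ok;
      } else {
        ++failed;
      }
    }
  }
  const auto stop = std::chrono::steady_clock::now();
  return {ok, failed, std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start)};
}
```

A `main` would typically call `iterations_from_arg(argc > 1 ? argv[1] : nullptr)` and pass the result to `run_benchmark`. `steady_clock` is used here because it is monotonic, which suits interval measurement better than `system_clock`.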

### Example Output

```
=================================================
URL Parsing Benchmark Results
=================================================

Configuration:
Test URLs: 34 unique patterns
Iterations: 10000
Total URLs: 340000

Results:
Total time: 820 ms
Successful: 330000 (97.1%)
Failed: 10000 (2.9%)

Performance:
Average: 2.412 µs/URL
Throughput: 414634 URLs/second

=================================================
```
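
The two performance figures follow directly from the total time and the total URL count. As a sketch of that arithmetic (the `bench_metrics` type and `compute_metrics` function are illustrative names, not taken from the benchmark source):

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Hypothetical result type for illustration only.
struct bench_metrics {
  double avg_us_per_url;   // average microseconds per parsed URL
  double urls_per_second;  // throughput
};

// Derive average latency and throughput from the measured totals.
inline bench_metrics compute_metrics(std::chrono::milliseconds total_time,
                                     std::uint64_t total_urls) {
  assert(total_urls > 0 && total_time.count() > 0);
  const double total_us =
      std::chrono::duration<double, std::micro>(total_time).count();
  const double total_s = std::chrono::duration<double>(total_time).count();
  return {total_us / static_cast<double>(total_urls),
          static_cast<double>(total_urls) / total_s};
}
```

With the sample run above, `compute_metrics(std::chrono::milliseconds(820), 340000)` gives about 2.412 µs/URL and about 414,634 URLs/second, matching the reported values.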

### Interpreting Results

**Good performance (on modern hardware):**
- Average: < 5 µs/URL
- Throughput: > 200,000 URLs/second

**Investigate if:**
- Average: > 10 µs/URL
- Throughput: < 100,000 URLs/second

### Profiling

To find actual performance bottlenecks, use profiling tools:

**macOS (Instruments - requires Xcode):**
```bash
# First, install Xcode from App Store or https://developer.apple.com/download/
# Verify: xctrace version

cmake -B _build -G Ninja -Dskyr_BUILD_BENCHMARKS=ON -DCMAKE_BUILD_TYPE=RelWithDebInfo
cmake --build _build --target url_parsing_bench

# Profile with xctrace (modern replacement for 'instruments' command)
xctrace record --template 'Time Profiler' \
--output /tmp/url_bench.trace \
--launch ./_build/benchmark/url_parsing_bench 50000

# Open results in Instruments GUI
open /tmp/url_bench.trace
```

**macOS (sample - built-in, no Xcode needed):**
```bash
cmake -B _build -G Ninja -Dskyr_BUILD_BENCHMARKS=ON -DCMAKE_BUILD_TYPE=RelWithDebInfo
cmake --build _build --target url_parsing_bench
# -wait makes sample block until the benchmark process exists before sampling
sample url_parsing_bench 10 -wait -file /tmp/profile.txt &
./_build/benchmark/url_parsing_bench 50000
open /tmp/profile.txt
```

**Linux (perf):**
```bash
cmake -B _build -G Ninja -Dskyr_BUILD_BENCHMARKS=ON -DCMAKE_BUILD_TYPE=RelWithDebInfo
cmake --build _build --target url_parsing_bench
perf record -g ./_build/benchmark/url_parsing_bench 50000
perf report
```

**Linux, and macOS where Valgrind is supported (Callgrind):**
```bash
cmake -B _build -G Ninja -Dskyr_BUILD_BENCHMARKS=ON -DCMAKE_BUILD_TYPE=RelWithDebInfo
cmake --build _build --target url_parsing_bench
valgrind --tool=callgrind ./_build/benchmark/url_parsing_bench 1000
# Callgrind names its output callgrind.out.<pid>
qcachegrind callgrind.out.* # macOS: brew install qcachegrind
# Linux: kcachegrind callgrind.out.*
```

### Test Coverage

The benchmark tests 34 diverse URL patterns:
- Simple ASCII URLs (http, https, ftp)
- URLs with query parameters and fragments
- URLs with authentication (user:pass@host)
- URLs with non-default ports
- Internationalized domain names (IDN): `http://example.إختبار/`, `https://münchen.de/`
- Unicode in paths: `http://example.com/π`, `https://example.org/文档/`
- Percent-encoded URLs: `http://example.com/path%20with%20spaces`
- Complex real-world URLs (Google search, GitHub, Wikipedia)
- IPv4 addresses: `http://192.168.1.1/`, `https://127.0.0.1:8443/`
- IPv6 addresses: `http://[::1]/`, `https://[2001:db8::1]/`
- Edge cases: file://, data:, mailto:

### Performance Expectations

**Typical results on modern hardware (Apple M1/M2, Intel i7+, AMD Ryzen):**
- Average: 2-4 µs/URL
- Throughput: 250,000 - 500,000 URLs/second

**Why this is fast enough:**
- Most applications parse URLs once per request
- A typical HTTP request takes 10-100ms
- URL parsing is at most about 0.04% of total request time (4 µs out of a 10 ms request)
- Bottleneck is almost never URL parsing
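
To make the proportion concrete, the share of a request spent parsing one URL can be computed from the figures quoted in this section (2-4 µs per parse, 10-100 ms per request). The helper below is an illustrative sketch, and `parse_share_percent` is a hypothetical name:

```cpp
#include <cassert>

// Percentage of a request's wall time spent parsing a single URL.
// parse_us: parse time in microseconds; request_ms: request time in milliseconds.
constexpr double parse_share_percent(double parse_us, double request_ms) {
  assert(request_ms > 0.0);
  return parse_us / (request_ms * 1000.0) * 100.0;
}

// Worst case from the quoted ranges: slowest parse, fastest request (about 0.04%).
static_assert(parse_share_percent(4.0, 10.0) < 0.05);
// Best case: fastest parse, slowest request (about 0.002%).
static_assert(parse_share_percent(2.0, 100.0) < 0.0025);
```

Across the whole quoted range the share stays a small fraction of one percent.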

### Before Adding Dependencies

Before adding external libraries like simdutf for "faster UTF conversion":

1. **Profile first** - Use profiling tools to find real bottlenecks
2. **Measure UTF time** - Is UTF conversion > 10% of runtime?
3. **Consider trade-offs** - Zero dependencies vs marginal speedup

The benchmark helps answer: "Is optimization worth the complexity?"
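
When a profiler is not available, a rough version of the "is this stage more than 10% of runtime?" check can be made by timing the stage directly. This is a sketch under assumptions: `stage_clock` and `exceeds_share` are hypothetical helpers, not part of the library, and manual timers add overhead that a sampling profiler avoids.

```cpp
#include <cassert>
#include <chrono>
#include <utility>

// Accumulates wall-clock time spent inside one stage (for example, UTF conversion).
class stage_clock {
 public:
  template <class F>
  void run(F&& work) {
    const auto start = std::chrono::steady_clock::now();
    std::forward<F>(work)();
    total_ += std::chrono::duration_cast<std::chrono::nanoseconds>(
        std::chrono::steady_clock::now() - start);
  }

  std::chrono::nanoseconds total() const { return total_; }

 private:
  std::chrono::nanoseconds total_{0};
};

// True when the stage accounts for more than `threshold` of the whole run.
inline bool exceeds_share(std::chrono::nanoseconds stage,
                          std::chrono::nanoseconds whole,
                          double threshold = 0.10) {
  assert(whole.count() > 0);
  return static_cast<double>(stage.count()) /
             static_cast<double>(whole.count()) > threshold;
}
```

Wrapping the suspected stage in `stage_clock::run` and comparing its total against the full benchmark time gives a first estimate; a profiler remains the more reliable source for the decision.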

## Code Structure

**Directory Layout**:
@@ -279,11 +432,10 @@ Aliases for compatibility:
## Key Dependencies

- **C++23 standard library**: `std::expected`, `std::format`, `std::ranges`
- **uni-algo**: Unicode algorithms and IDNA processing
- **nlohmann-json** (optional): JSON serialization
- **Catch2** (tests): Testing framework
- **Catch2** (optional, tests only): Testing framework

**Key advantage**: Minimal external dependencies - only requires `uni-algo` for Unicode support. All other modern C++ features (`expected`, `format`, `ranges`) are provided by the standard library!
**Key advantage**: Zero external dependencies for core URL parsing! All modern C++ features (`expected`, `format`, `ranges`) and Unicode/IDNA support are either from the standard library or custom header-only implementations.

## Code Quality Tools

14 changes: 7 additions & 7 deletions CMakeLists.txt
@@ -28,6 +28,7 @@ option(skyr_BUILD_TESTS "Build the URL tests." ON)
option(skyr_BUILD_WPT "Build Web Platform Tests runner (report-only, not unit tests)." OFF)
option(skyr_BUILD_DOCS "Build the URL documentation." OFF)
option(skyr_BUILD_EXAMPLES "Build the URL examples." OFF)
option(skyr_BUILD_BENCHMARKS "Build performance benchmarks." OFF)
option(skyr_FULL_WARNINGS "Build the library with all warnings turned on." ON)
option(skyr_WARNINGS_AS_ERRORS "Treat warnings as errors." ON)
option(skyr_BUILD_WITHOUT_EXCEPTIONS "Build without exceptions." OFF)
@@ -50,13 +51,6 @@ if (skyr_IS_TOP_LEVEL_PROJECT)
set(CMAKE_CXX_STANDARD_REQUIRED ON)
endif()

# C++23 standard library provides std::expected, std::format, std::ranges
# Only external dependency is uni-algo for Unicode processing
find_package(uni-algo CONFIG)
if (NOT uni-algo_FOUND)
message(WARNING "uni-algo not found. Install with: vcpkg install uni-algo")
endif()

if (skyr_ENABLE_JSON_FUNCTIONS)
find_package(nlohmann_json CONFIG)
if (NOT nlohmann_json_FOUND)
@@ -115,6 +109,12 @@ if (skyr_BUILD_EXAMPLES)
add_subdirectory(examples)
endif()

# Benchmarks
if (skyr_BUILD_BENCHMARKS)
message(STATUS "[skyr-url] Configuring benchmarks")
add_subdirectory(benchmark)
endif()

## Install
include(CMakePackageConfigHelpers)
include(GNUInstallDirs)
18 changes: 7 additions & 11 deletions README.md
@@ -32,13 +32,13 @@ This library provides:
* **`std::expected`** for error handling (no external dependency needed!)
* **`std::format`** for string formatting
* **`std::ranges`** for functional-style operations
* Minimal external dependencies - only `uni-algo` required for Unicode/IDNA support
* **Zero external dependencies** - Unicode/IDNA/Punycode support built-in

## Using the library

This project requires:
* A **C++23 compliant compiler** (GCC 13+, Clang 16+, MSVC 2022 17.6+)
* **uni-algo** library (optional but recommended for full IDNA support)
* **No external dependencies** for core URL parsing

### ``vcpkg``

@@ -62,7 +62,7 @@ call to ``bootstrap-vcpkg.sh`` with ``bootstrap-vcpkg.bat``.

### Installing dependencies using `vcpkg`

Using `vcpkg`, install the library dependencies:
Using `vcpkg`, install the optional test dependencies:

```bash
> cd ${VCPKG_ROOT}
@@ -71,10 +71,10 @@ Using `vcpkg`, install the library dependencies:
> git fetch origin master
> git checkout -b master origin/master
> ./bootstrap-vcpkg.sh
> ./vcpkg install uni-algo catch2 nlohmann-json
> ./vcpkg install catch2 nlohmann-json
```

**Note**: Only `uni-algo` is required for the library itself. `catch2` and `nlohmann-json` are only needed for tests and JSON functionality.
**Note**: The core library has zero external dependencies. `catch2` is only needed for tests, and `nlohmann-json` only for the optional JSON functionality.

### Building the project with `CMake` and `Ninja`

@@ -209,17 +209,13 @@ This library leverages **C++23 standard library features**:
* **`std::format`** (C++20) - String formatting
* **`std::ranges`** (C++20/23) - Range algorithms and views

**External dependencies**:
* [uni-algo](https://github.com/uni-algo/uni-algo) - Unicode algorithms and IDNA processing (optional but recommended)
**Core library**: **Zero external dependencies!**
* Unicode/IDNA/Punycode support via a custom header-only implementation

**Test dependencies** (optional):
* [Catch2](https://github.com/catchorg/catch2) - Testing framework
* [nlohmann-json](https://github.com/nlohmann/json) - JSON support (for optional JSON features)

## Acknowledgements

This library includes Unicode processing support from [uni-algo](https://github.com/uni-algo/uni-algo).

## Platform support

Look at the [GitHub Actions Status](https://github.com/cpp-netlib/url/actions)