PDF Coding Standards

Wei Lin edited this page Mar 2, 2026 · 2 revisions

This document defines the coding standards and conventions for the MiniPdf library — a minimal, zero-dependency .NET library for generating PDF documents.


General Principles

  1. Zero dependencies — The library must not reference any external NuGet packages. Only built-in .NET APIs (System.IO, System.IO.Compression, System.Text, System.Xml, etc.) are permitted.
  2. Minimal surface area — Expose only what consumers need. Internal implementation details must stay internal.
  3. Correctness first — Generated PDF output must be valid PDF 1.4 that can be opened by any compliant reader.
  4. Simplicity over flexibility — Prefer a small, easy-to-use API over a highly configurable one. Add options only when there is a clear use case.

Project Structure

MiniPdf.sln
├── src/MiniPdf/                # Library project
│   ├── PdfDocument.cs          # Document model (public entry point)
│   ├── PdfPage.cs              # Page with text placement
│   ├── PdfTextBlock.cs         # Text block data
│   ├── PdfWriter.cs            # Low-level PDF 1.4 binary writer
│   ├── ExcelReader.cs          # .xlsx parser (ZIP + XML)
│   └── ExcelToPdfConverter.cs  # Excel-to-PDF public API
└── tests/MiniPdf.Tests/        # xUnit test project
    ├── PdfDocumentTests.cs
    └── ExcelToPdfConverterTests.cs

Rules

  • One primary type per file. The file name must match the type name (e.g., PdfPage → PdfPage.cs).
  • Source files go in src/MiniPdf/.
  • Test files go in tests/MiniPdf.Tests/ and are named <TypeUnderTest>Tests.cs.
  • Do not introduce additional project layers (e.g., abstractions projects) unless the library scope significantly grows.

Naming Conventions

| Element | Convention | Example |
| --- | --- | --- |
| Namespace | MiniSoftware | namespace MiniSoftware; |
| Public class | PascalCase | PdfDocument, PdfPage |
| Public method | PascalCase | AddPage(), Save() |
| Public property | PascalCase | Width, Height, Pages |
| Private field | _camelCase | _textBlocks, _pages |
| Parameter | camelCase | fontSize, maxWidth |
| Local variable | camelCase | currentY, lineHeight |
| Constants | PascalCase | DefaultPageWidth |
| Test method | MethodUnderTest_Scenario_ExpectedResult | AddText_StoresTextBlock() |
| Nested options class | PascalCase, inside owner | ExcelToPdfConverter.ConversionOptions |

Prefixes

  • Pdf — Types that represent core PDF concepts (PdfDocument, PdfPage, PdfWriter, PdfTextBlock).
  • Excel — Types related to Excel reading or conversion (ExcelReader, ExcelToPdfConverter).

C# Style Guidelines

Language Version & Target

  • Target .NET 9.0 or later.
  • Use the latest C# language features where they improve readability (e.g., collection expressions [], file-scoped namespaces, raw string literals).

Formatting

  • Use file-scoped namespaces (namespace MiniSoftware;), matching the namespace listed under Naming Conventions.
  • Use 4-space indentation (no tabs).
  • Place opening braces on the same line for statements; on a new line for type and method declarations (Allman style).
  • Limit lines to a reasonable length (~120 characters).

Type Design

  • Mark classes as sealed unless they are explicitly designed for inheritance.
  • Use internal for implementation types (PdfWriter, ExcelReader). Only expose types consumers need.
  • Prefer IReadOnlyList<T> for public collection properties backed by private List<T>.
// Good
private readonly List<PdfPage> _pages = [];
public IReadOnlyList<PdfPage> Pages => _pages;

Method Design

  • Support method chaining where it improves usability (return this).
public PdfPage AddText(string text, float x, float y, float fontSize = 12)
{
    _textBlocks.Add(new PdfTextBlock(text, x, y, fontSize));
    return this; // enables chaining
}
  • Use optional parameters with sensible defaults instead of multiple overloads.
  • Prefer float over double for PDF coordinates (PDF specification uses single-precision values).

Patterns

  • Use StringBuilder for building PDF content streams.
  • Use CultureInfo.InvariantCulture for all numeric-to-string conversions in PDF output (PDF requires . as decimal separator).
var fontSize = block.FontSize.ToString(CultureInfo.InvariantCulture);
  • Use collection expressions ([]) for initializing empty collections.
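
The StringBuilder and invariant-culture rules above can be combined in one small helper. The following is a minimal sketch only: PdfNumberFormat, Format, and AppendOperator are hypothetical names chosen for illustration, not part of MiniPdf's documented API, and the three-decimal precision is an assumption.

```csharp
using System.Globalization;
using System.Text;

// Hypothetical helper for illustration; not part of MiniPdf's documented API.
internal static class PdfNumberFormat
{
    // Formats a number with '.' as the decimal separator, whatever the
    // current thread culture is. "0.###" keeps at most three decimals (an assumption).
    public static string Format(float value)
    {
        return value.ToString("0.###", CultureInfo.InvariantCulture);
    }

    // Appends "operand operand ... operator\n" to a content stream being built.
    public static StringBuilder AppendOperator(StringBuilder sb, string op, params float[] operands)
    {
        foreach (var operand in operands) {
            sb.Append(Format(operand)).Append(' ');
        }
        return sb.Append(op).Append('\n');
    }
}
```

Routing every number through one helper means a culture that formats 12.5 as "12,5" cannot leak a comma into the output.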

PDF-Specific Standards

PDF Version

  • Target PDF 1.4. All generated output must begin with %PDF-1.4.
  • Include the recommended binary comment after the header: %âãÏÓ (bytes 0xE2 0xE3 0xCF 0xD3).
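
As a hedged illustration of the two header rules above, the sketch below writes the header as raw bytes. PdfHeaderSketch and WriteHeader are hypothetical names; how PdfWriter actually emits the header is not shown on this page.

```csharp
using System.IO;

// Hypothetical helper for illustration; PdfWriter's real internals are not shown here.
internal static class PdfHeaderSketch
{
    public static void WriteHeader(Stream output)
    {
        // "%PDF-1.4" followed by a line feed, written as raw ASCII bytes.
        output.Write("%PDF-1.4\n"u8);

        // '%' plus the four high-bit bytes, written directly rather than
        // through a text encoding (UTF-8 would turn each of them into two bytes).
        output.Write([(byte)'%', 0xE2, 0xE3, 0xCF, 0xD3, (byte)'\n']);
    }
}
```

The point of the design is that the comment bytes never pass through a string encoder, so the file contains exactly 0xE2 0xE3 0xCF 0xD3.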

Object Numbering

  • Reserve fixed object numbers:
    • Object 1: Catalog (/Type /Catalog)
    • Object 2: Pages tree (/Type /Pages)
    • Object 3: Font resource (Helvetica)
    • Object 4+: Page objects and their content streams

Font Handling

  • Use Helvetica as the built-in font (Type 1, no embedding required).
  • Encoding: /WinAnsiEncoding.
  • Font resource is referenced as /F1 in content streams.

Content Streams

  • All text rendering must be enclosed in BT (begin text) / ET (end text) operators.
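  • Position each text block with the Td operator. Td moves relative to the start of the current line, so to keep every placement effectively absolute, move back by the same offset after each text block: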
BT
/F1 12 Tf
50 700 Td
(Hello World) Tj
-50 -700 Td
ET
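
A content stream of the shape shown above could be produced roughly as follows. This is a sketch under assumptions: ContentStreamSketch and its tuple-based input are hypothetical, and the text is assumed to be escaped already.

```csharp
using System.Collections.Generic;
using System.Globalization;
using System.Text;

// Illustrative sketch; the type, method, and tuple shape are assumptions.
internal static class ContentStreamSketch
{
    public static string Build(IEnumerable<(string EscapedText, float X, float Y, float FontSize)> blocks)
    {
        var inv = CultureInfo.InvariantCulture;
        var sb = new StringBuilder();
        sb.Append("BT\n");
        foreach (var b in blocks) {
            sb.Append("/F1 ").Append(b.FontSize.ToString(inv)).Append(" Tf\n");
            sb.Append(b.X.ToString(inv)).Append(' ').Append(b.Y.ToString(inv)).Append(" Td\n");
            sb.Append('(').Append(b.EscapedText).Append(") Tj\n");
            // Td is relative, so return to the origin before the next block.
            sb.Append((-b.X).ToString(inv)).Append(' ').Append((-b.Y).ToString(inv)).Append(" Td\n");
        }
        sb.Append("ET\n");
        return sb.ToString();
    }
}
```

Because each block undoes its own offset, the coordinates passed for the next block can again be read as positions measured from the page origin.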

String Escaping

All text placed in PDF strings must be escaped:

| Character | Escape Sequence |
| --- | --- |
| \ | \\ |
| ( | \( |
| ) | \) |
| CR | \r |
| LF | \n |
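
The escaping rules listed above map directly onto a single pass over the input. The sketch below is one way to write it; PdfStringSketch.Escape is a hypothetical name, and the library's own escaping routine may differ.

```csharp
using System.Text;

// Hypothetical helper for illustration; the library's own routine may differ.
internal static class PdfStringSketch
{
    public static string Escape(string text)
    {
        var sb = new StringBuilder(text.Length + 8);
        foreach (var c in text) {
            switch (c) {
                case '\\': sb.Append(@"\\"); break; // backslash first-class: always doubled
                case '(': sb.Append(@"\("); break;
                case ')': sb.Append(@"\)"); break;
                case '\r': sb.Append(@"\r"); break; // carriage return becomes the two characters \ and r
                case '\n': sb.Append(@"\n"); break; // line feed becomes the two characters \ and n
                default: sb.Append(c); break;
            }
        }
        return sb.ToString();
    }
}
```

Working character by character avoids the double-escaping that chained Replace calls can cause when the backslash is not handled first.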

Cross-Reference Table

  • The xref table must list all objects with 10-digit zero-padded byte offsets.
  • The trailer must reference the Catalog as /Root.
  • The file must end with %%EOF.
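
To make the 10-digit offset rule concrete, here is a minimal sketch that formats an xref section from a list of recorded byte offsets. XrefSketch is a hypothetical name, and it assumes objects are numbered contiguously from 1 with generation 0.

```csharp
using System.Collections.Generic;
using System.Globalization;
using System.Text;

// Illustrative sketch; assumes objects are numbered 1..N with generation 0.
internal static class XrefSketch
{
    // offsets[i] is the byte offset of object number i + 1.
    public static string Build(IReadOnlyList<long> offsets)
    {
        var inv = CultureInfo.InvariantCulture;
        var sb = new StringBuilder();
        sb.Append("xref\n");
        // Subsection header: first object number and entry count (object 0 included).
        sb.Append("0 ").Append((offsets.Count + 1).ToString(inv)).Append('\n');
        // Object 0 is the head of the free list.
        sb.Append("0000000000 65535 f \n");
        foreach (var offset in offsets) {
            // "D10" zero-pads to 10 digits; each entry is exactly 20 bytes including " \n".
            sb.Append(offset.ToString("D10", inv)).Append(" 00000 n \n");
        }
        return sb.ToString();
    }
}
```

For example, offsets 15 and 64 yield the entries 0000000015 00000 n and 0000000064 00000 n after the free-list entry.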

Coordinate System

  • PDF coordinates are in points (1 point = 1/72 inch).
  • Origin is at the bottom-left corner of the page.
  • Y values decrease as you move downward on the page.
  • Default page size is US Letter: 612 × 792 points.
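
Since most callers think in terms of distance from the top of the page, a small conversion helps illustrate the rules above. PdfCoordinatesSketch and its members are hypothetical names used only for this example.

```csharp
// Hypothetical helper for illustration; not part of MiniPdf's documented API.
internal static class PdfCoordinatesSketch
{
    public const float PointsPerInch = 72f;

    // Converts a length in inches to PDF points.
    public static float FromInches(float inches)
    {
        return inches * PointsPerInch;
    }

    // Converts a distance measured down from the top edge into a PDF y value,
    // which is measured up from the bottom edge.
    public static float FromTop(float pageHeight, float distanceFromTop)
    {
        return pageHeight - distanceFromTop;
    }
}
```

On a US Letter page, text placed one inch below the top edge therefore has y = 792 - 72 = 720.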

Page Sizes Reference

| Size | Width (pt) | Height (pt) |
| --- | --- | --- |
| US Letter | 612 | 792 |
| A4 | 595 | 842 |

Public API Design

Entry Points

The library provides two main entry points:

  1. PdfDocument — For programmatic text-to-PDF generation.
  2. ExcelToPdfConverter — For converting .xlsx files to PDF.

API Conventions

  • Fluent chaining: Methods that modify a page return the PdfPage instance.
  • Multiple output targets: Support saving to file path, Stream, and byte[].
doc.Save("output.pdf");       // file
doc.Save(stream);              // stream
byte[] bytes = doc.ToArray();  // byte array
  • Options classes: Use a nested sealed class with public settable properties and sensible defaults.
public sealed class ConversionOptions
{
    public float FontSize { get; set; } = 10;
    public float PageWidth { get; set; } = 612;  // US Letter
    public float PageHeight { get; set; } = 792;
    // ...
}
  • Options parameters should be nullable with a fallback to defaults:
public static PdfDocument Convert(string path, ConversionOptions? options = null)
{
    options ??= new ConversionOptions();
    // ...
}

Breaking Changes

  • Avoid breaking changes to public API signatures. If a new option is needed, add it to the options class with a default value that preserves existing behavior.

Error Handling

  • Do not silently swallow errors. If input is invalid (e.g., corrupt .xlsx), let the exception propagate.
  • Validate early: Check for null/empty inputs at public API boundaries.
  • Guard against edge cases:
    • Empty documents should still produce valid PDF output.
    • Sheets with zero rows should be handled gracefully (produce an empty page, not an exception).
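
The first two rules above might look like this at a public boundary. This is a sketch only: InputGuardSketch.ReadWorkbookBytes is a hypothetical method, not one of the library's entry points.

```csharp
using System;
using System.IO;

// Hypothetical entry point for illustration; the name and signature are assumptions.
internal static class InputGuardSketch
{
    public static byte[] ReadWorkbookBytes(string path)
    {
        // Validate at the boundary: throws ArgumentNullException for null
        // and ArgumentException for "" (available since .NET 7).
        ArgumentException.ThrowIfNullOrEmpty(path);

        // No try/catch here: a missing or unreadable file surfaces to the caller
        // as FileNotFoundException or another IOException instead of being swallowed.
        return File.ReadAllBytes(path);
    }
}
```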

Testing Standards

Framework

  • Use xUnit for all tests.
  • Test files follow the pattern <TypeUnderTest>Tests.cs.

Test Naming

Use the pattern: MethodName_Scenario_ExpectedBehavior

[Fact]
public void AddText_StoresTextBlock() { ... }

[Fact]
public void Save_MultiplePages_AllIncluded() { ... }

[Fact]
public void Convert_EmptyExcel_CreatesAtLeastOnePage() { ... }

What to Test

| Category | Examples |
| --- | --- |
| PDF structure | Header starts with %PDF-1.4, ends with %%EOF |
| Text content | Written text appears in PDF bytes |
| Page management | Multiple pages, correct /Count, custom dimensions |
| Special characters | Parentheses, backslashes are properly escaped |
| Excel conversion | Simple data, custom options, multi-page, empty sheets |
| Edge cases | Empty document, empty text wrapping, zero-row sheets |
| File I/O | Save(path) creates a file, ConvertToFile produces output |

Test Helpers

  • Use in-memory streams (MemoryStream) for test data. For Excel tests, construct .xlsx files programmatically using ZipArchive rather than relying on external test fixtures.
  • Clean up temp files in finally blocks.
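
A fixture built with ZipArchive, as recommended above, could be sketched as follows. XlsxFixtureSketch is a hypothetical helper, and the set of package parts it writes is an assumption: this page does not state which parts ExcelReader actually requires.

```csharp
using System.IO;
using System.IO.Compression;
using System.Security;

// Illustrative test helper; the class name and the exact set of parts are assumptions.
internal static class XlsxFixtureSketch
{
    private const string MainNs = "http://schemas.openxmlformats.org/spreadsheetml/2006/main";
    private const string PackageRelNs = "http://schemas.openxmlformats.org/package/2006/relationships";
    private const string DocRelNs = "http://schemas.openxmlformats.org/officeDocument/2006/relationships";

    // Returns a readable stream holding a one-sheet workbook with `text` in cell A1.
    public static MemoryStream CreateSingleCell(string text)
    {
        var stream = new MemoryStream();
        // leaveOpen keeps the MemoryStream usable after the archive is disposed;
        // disposing the archive is what writes the ZIP central directory.
        using (var zip = new ZipArchive(stream, ZipArchiveMode.Create, leaveOpen: true)) {
            AddPart(zip, "[Content_Types].xml", """
                <Types xmlns="http://schemas.openxmlformats.org/package/2006/content-types">
                  <Default Extension="rels" ContentType="application/vnd.openxmlformats-package.relationships+xml"/>
                  <Default Extension="xml" ContentType="application/xml"/>
                  <Override PartName="/xl/workbook.xml"
                    ContentType="application/vnd.openxmlformats-officedocument.spreadsheetml.sheet.main+xml"/>
                  <Override PartName="/xl/worksheets/sheet1.xml"
                    ContentType="application/vnd.openxmlformats-officedocument.spreadsheetml.worksheet+xml"/>
                </Types>
                """);
            AddPart(zip, "_rels/.rels", $"""
                <Relationships xmlns="{PackageRelNs}">
                  <Relationship Id="rId1" Type="{DocRelNs}/officeDocument" Target="xl/workbook.xml"/>
                </Relationships>
                """);
            AddPart(zip, "xl/workbook.xml", $"""
                <workbook xmlns="{MainNs}" xmlns:r="{DocRelNs}">
                  <sheets><sheet name="Sheet1" sheetId="1" r:id="rId1"/></sheets>
                </workbook>
                """);
            AddPart(zip, "xl/_rels/workbook.xml.rels", $"""
                <Relationships xmlns="{PackageRelNs}">
                  <Relationship Id="rId1" Type="{DocRelNs}/worksheet" Target="worksheets/sheet1.xml"/>
                </Relationships>
                """);
            AddPart(zip, "xl/worksheets/sheet1.xml", $"""
                <worksheet xmlns="{MainNs}">
                  <sheetData>
                    <row r="1"><c r="A1" t="inlineStr"><is><t>{SecurityElement.Escape(text)}</t></is></c></row>
                  </sheetData>
                </worksheet>
                """);
        }
        stream.Position = 0; // rewind so the caller can read from the start
        return stream;
    }

    private static void AddPart(ZipArchive zip, string name, string xml)
    {
        // StreamWriter defaults to UTF-8 without a byte-order mark.
        using var writer = new StreamWriter(zip.CreateEntry(name).Open());
        writer.Write(xml);
    }
}
```

Because the fixture lives entirely in a MemoryStream, a test that uses it leaves no temp files behind and needs no finally block for cleanup.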

Documentation

XML Documentation

  • All public types and members must have <summary> XML doc comments.
  • Include <param> tags for method parameters and <returns> for return values.
/// <summary>
/// Adds a text block at the specified position.
/// </summary>
/// <param name="text">The text to render.</param>
/// <param name="x">X position in points from the left edge.</param>
/// <param name="y">Y position in points from the bottom edge.</param>
/// <param name="fontSize">Font size in points (default: 12).</param>
/// <returns>The current page for chaining.</returns>
public PdfPage AddText(string text, float x, float y, float fontSize = 12)
  • internal types should also have <summary> comments for maintainability.

Wiki Documentation

  • Keep this wiki up to date when API surface changes.
  • Document all public types, methods, and options with usage examples.

Version Control

Branch Strategy

  • main is the primary branch. All code in main must pass CI.
  • Use feature branches for new work and merge via pull request.

Commit Messages

  • Use clear, imperative-mood messages: "Add text wrapping support", not "Added text wrapping".
  • Reference issue numbers where applicable.

CI/CD

  • The project uses GitHub Actions for CI.
  • All pull requests must pass dotnet build and dotnet test before merging.