Reorg and new custom decoder support #177


Status: Open. Wants to merge 36 commits into main.
Conversation

oschwald (Owner) commented:
  • Move decoder code into new internal package
  • Add change log
  • Make functional option functions return a function
  • Add support for options to Open and FromBytes
  • Add missing docs
  • Tighten up the golangci-lint config
  • Start decoupling the decoding and reflection
  • Move size checks to DataDecoder
  • Improve naming of data types and make them public
  • Add separate Decoder type for manual decoding
  • Add support for UnmarshalMaxMindDB

@oschwald oschwald requested a review from Copilot June 22, 2025 21:46

oschwald added 15 commits June 28, 2025 16:20

This will be used in the future for things like cache configuration. Adding now as it is a breaking change.

Although this doesn't provide immediate benefit, the intent is to make the data decoder available separately in a future commit for public use.

This further decouples the code, though there is likely room to reduce redundancy in future work.
Extend the UnmarshalMaxMindDB interface to work recursively with nested
types, matching the behavior of encoding/json's UnmarshalJSON. Custom
unmarshalers are now called for:

- Struct fields that implement Unmarshaler
- Pointer fields (creates value if nil, then checks for Unmarshaler)
- Slice elements that implement Unmarshaler
- Map values that implement Unmarshaler

This enhancement allows for more flexible custom decoding strategies in
complex data structures, improving performance optimization opportunities
for nested types.
oschwald added 11 commits June 28, 2025 20:01
Add PeekType() method to Decoder that returns the type of the current value
without consuming it, similar to jsontext.Decoder.PeekKind(). This enables
look-ahead parsing for conditional decoding logic.

The method follows pointers to return the actual data type being pointed to
rather than just returning TypePointer.
The Decoder and Unmarshaler types are now available in the mmdbdata
package for applications that need direct access to the decoding API.

Renames Type* constants to Kind* and PeekType() to PeekKind()
throughout the codebase to match Go's encoding/json/v2 naming
conventions. This improves API consistency with the standard
library.

Renames all Decoder methods from Decode* to Read* (e.g.,
DecodeString to ReadString) to match Go's encoding/json/v2
jsontext.Decoder naming conventions. This improves API
consistency with the standard library.

Adds DecoderOption type and variadic options parameter to enable future
configuration without breaking API changes. Follows existing library
patterns like ReaderOption and NetworksOption.

Adds String(), IsContainer(), and IsScalar() methods to Kind type
for better debugging and type classification. These methods enable
human-readable Kind names and easy identification of container vs
scalar types.
Improved error messages to include byte offset information and, for the
reflection-based API, path information for nested structures using JSON
Pointer format. For example, errors may now show "at offset 1234, path
/city/names/en" or "at offset 1234, path /list/0/name" instead of just the
underlying error message.

The implementation maintains zero allocation on the happy path through
retroactive path building during error unwinding.
Replaces unbounded cache with fixed 512-entry array using offset-based
indexing. Provides 15% performance improvement while preventing memory
growth and ensuring thread safety for concurrent reader usage.
Move size validation logic to DataDecoder to eliminate duplication
and create single source of truth for data validation.
And make them more accurate.
@oschwald oschwald force-pushed the greg/decoder branch 2 times, most recently from 604ec94 to 652ee9d on June 30, 2025 02:42
oschwald added 3 commits June 29, 2025 19:49
- Add comprehensive field documentation to Metadata struct with
  references to MaxMind DB specification
- Add BuildTime() convenience method to convert BuildEpoch to time.Time
- Enhance Verify() documentation explaining validation scope and use cases
- Add ExampleReader_Verify showing verification and metadata access
- Add TestMetadataBuildTime to verify BuildTime() method correctness
Add makeTestName helper function to create reasonable test names from
long hex strings. TestDecodeByte and TestDecodeString now show concise
names like '9e06b37878787878...7878' instead of extremely long hex
strings that made test output unreadable.
@oschwald oschwald requested a review from Copilot June 30, 2025 02:52
@Copilot Copilot AI left a comment
Pull Request Overview

This PR reorganizes the codebase to extract decoding logic into internal packages, introduces a public mmdbdata API for custom decoders, and centralizes error handling with a new mmdberrors package. It also refactors Reader and Result types to use a unified ReflectionDecoder, adds functional options to Open/FromBytes, and updates tests and examples for these changes.

  • Extract decoder and error types into internal/decoder and internal/mmdberrors
  • Introduce public mmdbdata package with Decoder and Unmarshaler interface
  • Refactor Reader, Result, and verify logic to use new internal types and centralized errors

Reviewed Changes

Copilot reviewed 32 out of 33 changed files in this pull request and generated no comments.

Summary per file:

  • reader.go: Added ReaderOption, decoupled decoder to use ReflectionDecoder, centralized errors under mmdberrors
  • result.go: Simplified Decode/DecodePath to delegate to ReflectionDecoder
  • verifier.go: Moved data-section verification into internal/decoder/verifier.go and updated error creation
  • traverse.go / traverse_test.go: Changed option functions to return NetworksOption closures and updated test calls
  • mmdbdata/type.go & interface.go: New public API for low-level decoder types and Unmarshaler interface
  • internal/mmdberrors/errors.go: New centralized error types and constructors
  • internal/decoder/verifier.go: New VerifyDataSection implementation
Comments suppressed due to low confidence (2)

reader.go:365

  • The errors package is not imported in reader.go, causing errors.New to fail to compile. Add import "errors" to the imports.

deserializer_test.go:72

  • The call to d.add(v) passes an int32 to a method expecting int. Either revert to d.add(int(v)) or update the add method to accept int32.
	return d.add(v)

oschwald added 5 commits June 29, 2025 20:03
Remove the experimental deserializer interface and all supporting code:
- Delete deserializer.go interface definition
- Delete deserializer_test.go test file
- Remove deserializer support from reflection.go
- Remove deserializer methods from data_decoder.go
- Remove unused math/big import from data_decoder.go
- Add breaking change notice to changelog recommending UnmarshalMaxMindDB
Error messages for type mismatches now display readable type names like
'Map' and 'Slice' instead of numeric codes, making debugging easier.
Make StringCache, buffer access, InternAt, and all DataDecoder methods
package-private since they are only used within the decoder package.
Keeps Kind methods public as they are exposed via mmdbdata type alias.
Replaces global mutex with per-entry mutexes to reduce allocation
count from 33 to 10 per operation in downstream libraries while
maintaining thread safety and good concurrent performance.
Add BenchmarkCityLookupConcurrent to demonstrate string cache performance
improvements under concurrent load. Tests 1, 4, 16, and 64 goroutines
performing realistic city lookups, providing clear metrics for concurrent
scaling behavior.