Reorg and new custom decoder support #177


Status: Open. Wants to merge 36 commits into main.
Conversation

oschwald (Owner) commented:
  • Move decoder code into new internal package
  • Add change log
  • Make functional option functions return a function
  • Add support for options to Open and FromBytes
  • Add missing docs
  • Tighten up the golangci-lint config
  • Start decoupling the decoding and reflection
  • Move size checks to DataDecoder
  • Improve naming of data types and make them public
  • Add separate Decoder type for manual decoding
  • Add support for UnmarshalMaxMindDB

@oschwald oschwald requested a review from Copilot June 22, 2025 21:46

oschwald added 15 commits June 28, 2025 16:20

This will be used in the future for things like cache configuration. Adding now as it is a breaking change.

Although this doesn't provide immediate benefit, the intent is to make the data decoder available separately in a future commit for public use.

This further decouples the code, though there is likely room to reduce redundancy in future work.
Extend the UnmarshalMaxMindDB interface to work recursively with nested
types, matching the behavior of encoding/json's UnmarshalJSON. Custom
unmarshalers are now called for:

- Struct fields that implement Unmarshaler
- Pointer fields (creates value if nil, then checks for Unmarshaler)
- Slice elements that implement Unmarshaler
- Map values that implement Unmarshaler

This enhancement allows for more flexible custom decoding strategies in
complex data structures, improving performance optimization opportunities
for nested types.
oschwald added 11 commits June 28, 2025 20:01
Add PeekType() method to Decoder that returns the type of the current value
without consuming it, similar to jsontext.Decoder.PeekKind(). This enables
look-ahead parsing for conditional decoding logic.

The method follows pointers to return the actual data type being pointed to
rather than just returning TypePointer.
The Decoder and Unmarshaler types are now available in the mmdbdata
package for applications that need direct access to the decoding API.

Renames Type* constants to Kind* and PeekType() to PeekKind()
throughout the codebase to match Go's encoding/json/v2 naming
conventions. This improves API consistency with the standard
library.

Renames all Decoder methods from Decode* to Read* (e.g.,
DecodeString to ReadString) to match Go's encoding/json/v2
jsontext.Decoder naming conventions. This improves API
consistency with the standard library.

Adds DecoderOption type and variadic options parameter to enable future
configuration without breaking API changes. Follows existing library
patterns like ReaderOption and NetworksOption.

Adds String(), IsContainer(), and IsScalar() methods to Kind type
for better debugging and type classification. These methods enable
human-readable Kind names and easy identification of container vs
scalar types.
Improved error messages to include byte offset information and, for the
reflection-based API, path information for nested structures using JSON
Pointer format. For example, errors may now show "at offset 1234, path
/city/names/en" or "at offset 1234, path /list/0/name" instead of just the
underlying error message.

The implementation maintains zero allocation on the happy path through
retroactive path building during error unwinding.
Replaces unbounded cache with fixed 512-entry array using offset-based
indexing. Provides 15% performance improvement while preventing memory
growth and ensuring thread safety for concurrent reader usage.
Move size validation logic to DataDecoder to eliminate duplication
and create single source of truth for data validation.
And make them more accurate.
@oschwald oschwald force-pushed the greg/decoder branch 2 times, most recently from 604ec94 to 652ee9d on June 30, 2025 02:42
oschwald added 3 commits June 29, 2025 19:49
- Add comprehensive field documentation to Metadata struct with
  references to MaxMind DB specification
- Add BuildTime() convenience method to convert BuildEpoch to time.Time
- Enhance Verify() documentation explaining validation scope and use cases
- Add ExampleReader_Verify showing verification and metadata access
- Add TestMetadataBuildTime to verify BuildTime() method correctness
Add makeTestName helper function to create reasonable test names from
long hex strings. TestDecodeByte and TestDecodeString now show concise
names like '9e06b37878787878...7878' instead of extremely long hex
strings that made test output unreadable.
@oschwald oschwald requested a review from Copilot June 30, 2025 02:52
@Copilot Copilot AI left a comment
Pull Request Overview

This PR reorganizes the codebase to extract decoding logic into internal packages, introduces a public mmdbdata API for custom decoders, and centralizes error handling with a new mmdberrors package. It also refactors Reader and Result types to use a unified ReflectionDecoder, adds functional options to Open/FromBytes, and updates tests and examples for these changes.

  • Extract decoder and error types into internal/decoder and internal/mmdberrors
  • Introduce public mmdbdata package with Decoder and Unmarshaler interface
  • Refactor Reader, Result, and verify logic to use new internal types and centralized errors

Reviewed Changes

Copilot reviewed 32 out of 33 changed files in this pull request and generated no comments.

Summary per file:

  • reader.go: Added ReaderOption, decoupled decoder to use ReflectionDecoder, centralized errors under mmdberrors
  • result.go: Simplified Decode/DecodePath to delegate to ReflectionDecoder
  • verifier.go: Moved data-section verification into internal/decoder/verifier.go and updated error creation
  • traverse.go / traverse_test.go: Changed option functions to return NetworksOption closures and updated test calls
  • mmdbdata/type.go & interface.go: New public API for low-level decoder types and Unmarshaler interface
  • internal/mmdberrors/errors.go: New centralized error types and constructors
  • internal/decoder/verifier.go: New VerifyDataSection implementation
Comments suppressed due to low confidence (2)

reader.go:365

  • The errors package is not imported in reader.go, causing errors.New to fail to compile. Add import "errors" to the imports.

deserializer_test.go:72

  • The call to d.add(v) passes an int32 to a method expecting int. Either revert to d.add(int(v)) or update the add method to accept int32.
	return d.add(v)

oschwald added 5 commits June 29, 2025 20:03
Remove the experimental deserializer interface and all supporting code:
- Delete deserializer.go interface definition
- Delete deserializer_test.go test file
- Remove deserializer support from reflection.go
- Remove deserializer methods from data_decoder.go
- Remove unused math/big import from data_decoder.go
- Add breaking change notice to changelog recommending UnmarshalMaxMindDB
Error messages for type mismatches now display readable type names like
'Map' and 'Slice' instead of numeric codes, making debugging easier.
Make StringCache, buffer access, InternAt, and all DataDecoder methods
package-private since they are only used within the decoder package.
Keeps Kind methods public as they are exposed via mmdbdata type alias.
Replaces global mutex with per-entry mutexes to reduce allocation
count from 33 to 10 per operation in downstream libraries while
maintaining thread safety and good concurrent performance.
Add BenchmarkCityLookupConcurrent to demonstrate string cache performance
improvements under concurrent load. Tests 1, 4, 16, and 64 goroutines
performing realistic city lookups, providing clear metrics for concurrent
scaling behavior.