Tags: linkedin/isolation-forest

v4.1.5

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
Bump pytest from 8.3.2 to 9.0.3 in /isolation-forest-onnx (#84)

Bumps [pytest](https://github.com/pytest-dev/pytest) from 8.3.2 to 9.0.3.
- [Release notes](https://github.com/pytest-dev/pytest/releases)
- [Changelog](https://github.com/pytest-dev/pytest/blob/main/CHANGELOG.rst)
- [Commits](pytest-dev/pytest@8.3.2...9.0.3)

---
updated-dependencies:
- dependency-name: pytest
  dependency-version: 9.0.3
  dependency-type: direct:development
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

v4.1.4

Upgrade ONNX to 1.21.0 and align dependencies (onnxruntime, numpy, Python >=3.11) (#83)

* Bump onnx from 1.17.0 to 1.21.0 in /isolation-forest-onnx

Bumps [onnx](https://github.com/onnx/onnx) from 1.17.0 to 1.21.0.
- [Release notes](https://github.com/onnx/onnx/releases)
- [Changelog](https://github.com/onnx/onnx/blob/main/docs/Changelog-ml.md)
- [Commits](onnx/onnx@v1.17.0...v1.21.0)

---
updated-dependencies:
- dependency-name: onnx
  dependency-version: 1.21.0
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>

* Pin ONNX model IR version to 10 for onnxruntime compatibility

onnx 1.21.0 defaults to IR version 13, which is unsupported by
onnxruntime < 1.24.1. Since the model only uses opset 14, IR version 10
is sufficient and ensures broad onnxruntime compatibility.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Upgrade onnxruntime, numpy, and Python version for onnx 1.21.0 compatibility

- Upgrade onnxruntime from 1.19.2/1.18.0 to 1.24.1 (supports IR version 13
  that onnx 1.21.0 produces by default)
- Upgrade numpy from 1.26.4 to 2.2.6 and fix np.trapz -> np.trapezoid
  (trapz was deprecated in numpy 2.0 in favor of trapezoid)
- Update python_requires from >=3.9 to >=3.10 (required by onnx 1.21.0)
- Remove the ir_version=10 pin since onnxruntime 1.24.1 natively supports
  IR version 13

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Align Python version in CI and Black config with python_requires >=3.10

- Update pypi-publish job from Python 3.9 to 3.10
- Update Black target-version from py39 to py310

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Pin ONNX model IR version to 10 for maximum runtime portability

The model only uses opset 14, which is fully supported by IR version 10.
Pinning avoids requiring onnxruntime >= 1.24.1 or other recent runtimes
just to load the model, maximizing cross-platform compatibility.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Add explicit Python setup to the build job

The build job runs Gradle which creates a Python venv for the
isolation-forest-onnx tests. Without setup-python, it relies on
whatever python3 the runner provides, which is implicit and fragile.
Pin to 3.10 to match python_requires.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Raise minimum Python to 3.12

onnxruntime 1.24.x only ships Linux wheels for Python 3.11+, and 3.12
is the current ubuntu-latest default. Since this is a converter tool
(not a foundational library), targeting 3.12 is a practical baseline
that ensures all dependencies have prebuilt wheels.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Set python_requires to >=3.11 based on actual dependency floor

onnxruntime 1.24.1 only ships wheels for Python 3.11+, making that the
true minimum. CI remains on 3.12 (runner default with full wheel
coverage), but the package contract should reflect what users can
actually install, not what CI happens to run.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: James Verbus <james.verbus@gmail.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

v4.1.3

Add eif heatmap plots to README.md (#82)

* Add Standard IF vs Extended IF score heatmap plots to README

Add synthetic 2D score heatmaps illustrating the axis-aligned bias of
Standard Isolation Forest that Extended Isolation Forest eliminates.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Update heatmap plots with improved versions

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Update heatmap plots

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Update heatmap plots

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

v4.1.2

Add project description to isolation-forest-onnx PyPI package (#81)

Add a README.md for the isolation-forest-onnx package and configure
setup.cfg to use it as the long description on PyPI. Include the
README in MANIFEST.in so it is packaged in the source distribution.

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

v4.1.1

Upgrade GitHub Actions for Node 24 compatibility (#80)

* Upgrade GitHub Actions for Node 24 compatibility

Signed-off-by: Salman Muin Kayser Chishti <13schishti@gmail.com>

* Add required distribution parameter to setup-java@v5

The upgrade to actions/setup-java@v5 requires the distribution input.
Use Eclipse Temurin as the JDK distribution for all CI jobs.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Fix Java version format for setup-java@v5

setup-java@v5 uses semver and does not recognize '1.8'. Use '8' instead,
which matches the available Temurin distribution versions.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Signed-off-by: Salman Muin Kayser Chishti <13schishti@gmail.com>
Co-authored-by: Salman Muin Kayser Chishti <13schishti@gmail.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

v4.1.0

Add Extended Isolation Forest (EIF) support to the Scala/Spark isolation-forest library. (#79)

* First working version of extended isolation forest training and scoring. Results look reasonable, but detailed correctness not yet verified.

* Updated rough draft code for EIF.

* Refactor Extended Isolation Forest so its logic is clearer and more in line with Isolation Forest; improve docs and parameter validation

- Renamed local variables in `ExtendedIsolationForest.scala` for clarity (`dataset` → `data`).
- Moved and refined parameter validation in `validateAndResolveParams`, logging chosen samples/features.
- Updated Javadoc-style comments in `ExtendedIsolationForest`, `ExtendedIsolationForestModel`, and related classes.
- Changed schema checks to use `VectorType` instead of `SQLDataTypes.VectorType`.
- Renamed and documented internal methods (e.g., `pathLengthInternal`) in `ExtendedIsolationTree`.
- Ensured consistent naming across `ExtendedIsolationForestModel` fields (e.g., `extendedIsolationTrees`).
- Cleaned up imports, minor style fixes, and removed commented-out debug prints.

There are likely still opportunities to factor out more shared logic into `core`.

* Got standard isolation forest R/W working after major refactor. Still a work in progress.

* Fixed package structure.

* WORK IN PROGRESS - Have prototype extended isolation forest read / write working with tests.

* Did linting for eif code.

* fix(EIF): align hyperplane split + path test with paper; correct intercept sampling, ≤ semantics, and degeneracy handling

- Sample normal in the selected subspace with up to (extensionLevel+1) non‑zero coords; normalize and guard zero‑norm.
- Sample intercept as point p by drawing each active coordinate uniformly from that node’s data range; set offset = n·p.
- Use inclusive left-branch test x·n ≤ n·p in both training and scoring so the split predicate matches the paper.
- Treat minDot == maxDot or an empty partition as a leaf (stores numInstances); keeps trees well‑formed.
- Compute dot against a full‑length normal (zeros for unused coords) to match the (x − p)·n test.
- Minor: log message tweaks; one‑pass min/max scan instead of materializing arrays; consistent ≤ in train/score.
- No change to model IO or public params.

* fix(EIF): retry degenerate hyperplane splits instead of premature leafing

  Previously, a single failed split attempt (constant feature, all-same
  dot products, or empty partition) immediately produced a leaf node.
  This meant extensionLevel=0 was not equivalent to standard IF when the
  first randomly chosen feature happened to be constant. Now retries up
  to 50 times before falling back to a leaf.

* chore(EIF): remove dead code, unused imports, and fix test description typos

* fix(EIF): validate extensionLevel at fit time instead of silent clamping

  Remove Int.MaxValue-1 sentinel default. If the user sets extensionLevel
  above numFeatures-1, throw immediately. If unset, default to
  numFeatures-1 (fully extended). The resolved value is persisted in the
  model rather than the sentinel.
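
A minimal Python sketch of the fit-time resolution described above, for illustration only. The function name `resolve_extension_level` and the use of `None` for "unset" are assumptions, not the library's Scala API.

```python
from typing import Optional


def resolve_extension_level(requested: Optional[int], num_features: int) -> int:
    """Return the extension level to use for one fit (hypothetical helper)."""
    if num_features < 1:
        raise ValueError("num_features must be at least 1")
    max_level = num_features - 1
    if requested is None:
        # Unset: default to fully extended.
        return max_level
    if requested < 0 or requested > max_level:
        # Fail fast instead of silently clamping to max_level.
        raise ValueError(
            f"extensionLevel must be in [0, {max_level}], got {requested}"
        )
    return requested
```

Because the function returns the resolved value instead of storing it, the caller can persist it on the trained model without a sentinel.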

* fix: fail fast on empty partition in shared tree training

Guard against dataForTree.head crash when a partition receives zero
sampled points. Throws a clear IllegalStateException instead of a
confusing NoSuchElementException.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: use actual tree count instead of numEstimators param in scoring

  Divide path length sum by the actual number of trees in the model
  rather than the $(numEstimators) parameter, preventing model/param
  drift from producing incorrect anomaly scores.
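
The idea can be sketched in Python as follows. This is an illustration under assumptions, not the library's code: it uses the standard isolation forest score 2^(-E[h]/c(n)) from Liu et al., the function names are hypothetical, and the empty-ensemble guard is a defensive choice of this sketch.

```python
import math

EULER_GAMMA = 0.5772156649015329


def avg_path_length(n: int) -> float:
    """c(n): average path length of an unsuccessful binary search tree lookup."""
    if n <= 1:
        return 0.0
    return 2.0 * (math.log(n - 1) + EULER_GAMMA) - 2.0 * (n - 1) / n


def anomaly_score(path_lengths, num_samples):
    """Score one point from its per-tree path lengths (num_samples >= 2)."""
    # Normalize by the trees actually present, not by a configured parameter.
    num_trees = len(path_lengths)
    if num_trees == 0:
        raise ValueError("cannot score with an empty ensemble")
    mean_path = sum(path_lengths) / num_trees
    return 2.0 ** (-mean_path / avg_path_length(num_samples))
```

The function takes no `numEstimators` argument at all, so a mismatch between the parameter and the loaded trees cannot affect the result.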

* test(EIF): replace toString tree comparison with structural equality check

  Use recursive node-by-node comparison with epsilon tolerance for
  doubles instead of fragile toString matching.
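
A rough Python sketch of such a comparison, assuming a simplified axis-aligned tree. The `Leaf` and `Split` classes are hypothetical stand-ins, not the library's node types.

```python
import math
from dataclasses import dataclass
from typing import Union


@dataclass
class Leaf:
    num_instances: int


@dataclass
class Split:
    feature: int
    threshold: float
    left: "Node"
    right: "Node"


Node = Union[Leaf, Split]


def trees_equal(a, b, eps=1e-9):
    """Compare two trees node by node; doubles match within eps."""
    if isinstance(a, Leaf) and isinstance(b, Leaf):
        return a.num_instances == b.num_instances
    if isinstance(a, Split) and isinstance(b, Split):
        return (
            a.feature == b.feature
            and math.isclose(a.threshold, b.threshold, rel_tol=0.0, abs_tol=eps)
            and trees_equal(a.left, b.left, eps)
            and trees_equal(a.right, b.right, eps)
        )
    # One side is a leaf and the other an internal node.
    return False
```

Unlike string matching, this tolerates last-digit differences in thresholds while still catching any structural change.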

* docs: add Extended Isolation Forest documentation to README

  Add EIF section covering when to use it, the extensionLevel parameter
  and its interaction with maxFeatures, and a usage example. Call out
  that ONNX export is not supported for EIF. Add Hariri et al. 2018
  to references.

* Added citation info to readme.

* fix(EIF): use strict < for hyperplane split and stop fit() from mutating estimator

  Change the split criterion in ExtendedIsolationTree from <= to strict <,
  matching both the reference implementation (sahandha/eif) and our own
  standard IsolationTree. Affects tree building (partition) and scoring
  (path traversal).

  Remove the set(extensionLevel, resolvedExtensionLevel) call in
  ExtendedIsolationForest.fit() that mutated the estimator. When
  extensionLevel was unset (defaulting to fully extended), the first
  fit() permanently set it, causing reuse on a dataset with fewer
  features to either fail validation or silently use the wrong level.

* fix(EIF): match reference implementation split semantics instead of retry loop

  Remove bounded retry loop for degenerate splits. Instead, follow the
  EIF paper and reference implementation: allow empty partitions to
  become ExtendedExternalNode(0) leaves. Change split predicate from
  <= to strict < to match reference implementation's (x-p)·n < 0.
  Relax ExtendedExternalNode to accept numInstances >= 0.
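
The split semantics above can be sketched in Python as follows. This is an illustration under assumptions: it ignores the maxFeatures subspace selection, draws normal components from a standard normal distribution as in the EIF paper, and uses hypothetical function names.

```python
import math
import random


def sample_hyperplane(points, extension_level, rng):
    """Draw a random split hyperplane (unit normal, offset) for one tree node."""
    dim = len(points[0])
    # extension_level + 1 coordinates are allowed to be non-zero.
    active = rng.sample(range(dim), extension_level + 1)
    normal = [0.0] * dim
    for i in active:
        normal[i] = rng.gauss(0.0, 1.0)  # assumption: standard normal components
    norm = math.sqrt(sum(w * w for w in normal))
    if norm == 0.0:
        raise ValueError("zero-norm normal; resample")
    normal = [w / norm for w in normal]
    # Intercept point p: uniform within this node's data range on each active
    # coordinate; the offset is n . p.
    offset = 0.0
    for i in active:
        lo = min(x[i] for x in points)
        hi = max(x[i] for x in points)
        offset += normal[i] * rng.uniform(lo, hi)
    return normal, offset


def partition(points, normal, offset):
    """Strict '<' sends a point left; either side may end up empty."""
    left, right = [], []
    for x in points:
        dot = sum(n_i * x_i for n_i, x_i in zip(normal, x))
        (left if dot < offset else right).append(x)
    return left, right
```

An empty `left` or `right` list corresponds to the zero-instance leaf case rather than triggering a retry.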

* docs: update benchmarks with StandardIF, ExtendedIF_0, and ExtendedIF_max results

  Replace the old IF-only benchmark table with comprehensive results
  across 13 datasets comparing all three model variants against Liu
  et al. and the reference Python EIF implementation.

* docs: update benchmark table with reference Python results for all 13 datasets

* fix(EIF): persist resolved extensionLevel on trained model

Set the resolved extensionLevel on the estimator before copyValues so
it flows into the model's param map. Without this, models trained
without explicitly calling setExtensionLevel() would lose the effective
value on save/load. Add test covering default resolution and
round-trip persistence.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* test(EIF): add pre-merge tests for zero-size leaves and ext=0 axis-aligned splits

Exercise the numInstances >= 0 semantics that became first-class EIF
behavior when degenerate hyperplane splits were allowed to produce empty
children. New tests cover:
- ExtendedExternalNode(0) construction and subtreeDepth
- Path length through a zero-size leaf contributes avgPathLength(0) = 0
- Save/load round-trip preserves a tree containing a zero-size leaf
- extensionLevel=0 produces strictly axis-aligned normals (1 non-zero coordinate)
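
A small Python sketch of the zero-size leaf path-length case, for illustration. The node classes are hypothetical stand-ins, and the c(n) formula is the standard one from Liu et al., assumed here rather than copied from the library.

```python
import math
from dataclasses import dataclass
from typing import Tuple, Union

EULER_GAMMA = 0.5772156649015329


def avg_path_length(n: int) -> float:
    """c(n); zero for leaves holding 0 or 1 instances."""
    if n <= 1:
        return 0.0
    return 2.0 * (math.log(n - 1) + EULER_GAMMA) - 2.0 * (n - 1) / n


@dataclass
class ExternalNode:
    num_instances: int  # may be 0 after a degenerate split


@dataclass
class InternalNode:
    normal: Tuple[float, ...]
    offset: float
    left: "Node"
    right: "Node"


Node = Union[ExternalNode, InternalNode]


def path_length(node, x):
    """Depth of the leaf reached by x, plus c(leaf size)."""
    depth = 0
    while isinstance(node, InternalNode):
        dot = sum(n_i * x_i for n_i, x_i in zip(node.normal, x))
        node = node.left if dot < node.offset else node.right
        depth += 1
    return depth + avg_path_length(node.num_instances)
```

A point landing in an `ExternalNode(0)` gets exactly its depth as path length, since the leaf adds nothing.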

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* docs: soften benchmark claims and clarify EIF_0 vs StandardIF wording

- Use "closely matches" for ExtendedIF_max reference comparison
- Note mulcross as an open outlier in ExtendedIF_0 parity (12 of 13)
- Describe extensionLevel=0 as "uses axis-aligned splits" instead of
  "recovers standard axis-aligned splits"
- Frame low-dimensional underperformance as our benchmark observation,
  not a broad established finding from the paper

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* test(EIF): enable saved model tree structure regression test

  Uncomment savedExtendedIsolationForestModelTreeStructureTest and add the
  required resource files: a saved ExtendedIsolationForestModel and its
  expected first-tree toString golden file. This provides a regression
  guard against accidental changes to tree serialization or structure.

* refactor: extract duplicated validateAndResolveParams into SharedTrainLogic

  Extract the duplicated validateAndResolveParams method into
  SharedTrainLogic (where the other shared training helpers already live).
  Both IsolationForest.scala and ExtendedIsolationForest.scala now call the
  single shared implementation, passing $(maxFeatures) and $(maxSamples) as
  arguments.

  Files changed:
  - core/SharedTrainLogic.scala — added validateAndResolveParams(dataset, maxFeatures, maxSamples) method and its ResolvedParams import
  - IsolationForest.scala — removed private method, updated import and call site
  - extended/ExtendedIsolationForest.scala — removed private method, updated import and call site

* refactor: extract duplicated transformSchema into Utils.validateAndTransformSchema

  All four Estimator/Model classes had identical 15-line transformSchema
  overrides. Extract the shared logic into Utils and delegate with a
  one-liner in each class.

* chore(EIF): remove unused import, fix docstring, and align threshold comparison style

  - Remove unused IsolationForestModel import from ExtendedIsolationForestModelReadWrite
  - Fix reader docstring that incorrectly said "standard" instead of "Extended"
  - Change `outlierScoreThreshold > 0.0` to `> 0` to match standard IF style

* test(EIF): add tests for L2-normalized normals, invalid extensionLevel, and intermediate levels

  - Verify all hyperplane normals are L2-normalized across extension levels and seeds
  - Verify extensionLevel > numFeatures - 1 throws IllegalArgumentException at fit time
  - Verify intermediate extensionLevel values (1-4) train valid models with reasonable AUROC

* chore(EIF): fix redundant import and stale docstrings in ExtendedIsolationForestModel

  - Remove unnecessary self-import of ExtendedIsolationForestModel in ReadWrite file
  - Fix companion object and threshold comments that said "IsolationForestModel"
    instead of "ExtendedIsolationForestModel"

* fix: address EIF review findings and harden model edge cases

  Resolve the review issues uncovered while comparing the extended isolation
  forest branch against master and the EIF reference implementation.

  ExtendedIsolationForest
  - stop mutating the estimator with a resolved default extensionLevel during
    fit()
  - keep dataset-dependent extensionLevel resolution local to each fit and
    apply the resolved value only to the trained model
  - add a regression test that reuses the same estimator across datasets with
    different feature dimensions to ensure default extensionLevel does not
    leak across fits

  IsolationForestModel / ExtendedIsolationForestModel
  - fail fast when transform() is called on an empty ensemble instead of
    dividing by zero and producing invalid scores
  - keep scoring normalized by the actual loaded tree count, but guard the
    zero-tree case explicitly
  - add transform-throws coverage for manually constructed empty standard and
    extended models
  - preserve existing empty model write/read tests so persistence still
    round-trips this edge case correctly

  Tests and style cleanup
  - move ExtendedIsolationForestModelWriteReadTest into the
    com.linkedin.relevance.isolationforest.extended package so package names
    match file paths and the surrounding test suites
  - restore the Spark-derived attribution header on the moved/copied
    read-write helpers
  - align ExtendedIsolationForestModelReadWrite visibility with the rest of
    the package-private isolation forest internals

  Verification
  - ./gradlew -g /tmp/codex-gradle-home :isolation-forest:test
  - ./gradlew -g /tmp/codex-gradle-home :isolation-forest:test --tests com.linkedin.relevance.isolationforest.extended.ExtendedIsolationForestTest
  - ./gradlew -g /tmp/codex-gradle-home :isolation-forest:test --tests com.linkedin.relevance.isolationforest.IsolationForestModelWriteReadTest --tests com.linkedin.relevance.isolationforest.extended.ExtendedIsolationForestModelWriteReadTest
  - ./gradlew -g /tmp/codex-gradle-home :isolation-forest:compileScala :isolation-forest:compileTestScala

* docs: refresh README for EIF and current build defaults

  Update the README so the documented examples and version references match
  the current repo state and are copy-paste runnable.

  README updates
  - change the documented default Spark version from 3.5.1 to 3.5.5
  - update the example build command to use the current default Spark/Scala
    combination
  - replace stale hardcoded library and ONNX package versions with
    <latest-version> / <matching-version> placeholders
  - switch the Gradle dependency example from deprecated `compile` to
    `implementation`
  - add the missing `org.apache.spark.sql.functions.col` import to the Scala
    training example
  - fix the training example text to refer to the `label` column instead of
    `labels`
  - clarify the EIF `extensionLevel(5)` example comment so the dimensional
    assumption is explicit
  - define `dataset_name` and `num_examples_to_print` in the ONNX Python
    inference example so the snippet is runnable as written
  - remove the benchmark prose reference to a `LI IF` comparison column that
    is not present in the table

  This is a documentation-only change.

* feat: add extended isolation forest with sparse hyperplane persistence

  Add Extended Isolation Forest (EIF) support alongside the existing standard
  Isolation Forest implementation, and harden the standard/extended model
  persistence and scoring paths.

  Extended Isolation Forest
  - add ExtendedIsolationForest estimator, ExtendedIsolationForestModel,
    ExtendedIsolationForestParams, ExtendedIsolationTree, ExtendedNodes, and
    ExtendedUtils
  - implement EIF training with extensionLevel-controlled random hyperplane
    splits based on the Hariri et al. algorithm
  - resolve extensionLevel per fit without mutating estimator state
  - support axis-aligned EIF (extensionLevel = 0) through fully extended EIF
    (extensionLevel = numFeatures - 1)

  Sparse EIF model representation
  - store hyperplanes sparsely as (indices, weights, offset) instead of dense
    per-node normal vectors
  - canonicalize stored sparse coordinates by sorting feature indices before
    constructing SplitHyperplane
  - use sparse dot products for tree traversal and add a direct Spark Vector
    scoring path so EIF scoring benefits from sparsity end to end
  - enforce sparse hyperplane invariants: non-empty, length-matched,
    non-negative, distinct, sorted indices
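
A Python sketch of the sparse representation and its invariants, for illustration only. The class name `SparseHyperplane` and the helper `canonical_hyperplane` are hypothetical; the library's own type is the Scala `SplitHyperplane`.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class SparseHyperplane:
    indices: Tuple[int, ...]
    weights: Tuple[float, ...]
    offset: float

    def __post_init__(self):
        if not self.indices:
            raise ValueError("indices must be non-empty")
        if len(self.indices) != len(self.weights):
            raise ValueError("indices and weights must have equal length")
        if self.indices[0] < 0:
            raise ValueError("indices must be non-negative")
        # Strictly increasing implies both sorted and distinct.
        if any(a >= b for a, b in zip(self.indices, self.indices[1:])):
            raise ValueError("indices must be sorted and distinct")

    def dot(self, features):
        """Sparse dot product: only the stored coordinates are touched."""
        return sum(w * features[i] for i, w in zip(self.indices, self.weights))


def canonical_hyperplane(indices, weights, offset):
    """Sort coordinates by feature index before constructing the hyperplane."""
    if len(indices) != len(weights):
        raise ValueError("indices and weights must have equal length")
    pairs = sorted(zip(indices, weights))
    return SparseHyperplane(
        tuple(i for i, _ in pairs), tuple(w for _, w in pairs), offset
    )
```

With a low extension level in a high-dimensional space, `dot` touches only a handful of coordinates per node instead of the full feature vector.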

  Persistence and read/write refactor
  - move standard model read/write into a top-level
    IsolationForestModelReadWrite implementation
  - add shared metadata helpers in IsolationForestModelReadWriteUtils
  - add sparse EIF model read/write support and checked-in EIF persistence
    fixtures
  - preserve standard-model backward compatibility when loading older saved
    models that do not contain totalNumFeatures metadata, logging that
    dimension validation is unavailable for those legacy models

  Model/scoring hardening
  - reject numSamples values that resolve to fewer than 2 samples during
    training
  - fail fast when transform() is called on empty standard or extended models
  - store totalNumFeatures in newly saved models and validate scoring input
    dimension when that training dimension is known
  - keep standard IF backward compatibility by restoring the legacy public
    4-arg IsolationForestModel constructor while making the richer internal
    constructor package-private
  - restrict the extended model constructor to package-private use so
    totalNumFeatures remains internal to fit/load/copy flows
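
The scoring-time dimension check can be sketched in Python as follows. This is a simplified illustration: the function name is hypothetical, and representing an unknown training dimension as `None` is an assumption about how a legacy model might be modeled.

```python
from typing import Optional, Sequence


def validate_num_features(
    features: Sequence[float], total_num_features: Optional[int]
) -> None:
    """Check the input dimension only when the training dimension is known."""
    if total_num_features is None:
        # Legacy model without stored dimension metadata: nothing to check.
        return
    if len(features) != total_num_features:
        raise ValueError(
            f"expected {total_num_features} features, got {len(features)}"
        )
```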

  Tests
  - add comprehensive EIF estimator, tree, sparse-hyperplane, and write/read
    tests
  - add regression coverage for repeated EIF fits, empty-model scoring guards,
    numSamples >= 2 enforcement, scoring-time feature dimension validation,
    standard legacy metadata loading, and standard legacy constructor behavior
  - update saved model metadata/tree-structure fixtures for the new extended
    persistence format and formatting changes

  Documentation
  - refresh README dependency/version examples and fix copy-paste issues in the
    Scala and ONNX examples
  - add EIF usage and persistence examples
  - document benchmark results for standard IF vs EIF variants
  - fix benchmark/doc typos and soften the benchmark agreement statement to
    avoid overstating row-by-row verification

  Verification
  - ./gradlew -g /tmp/codex-gradle-home :isolation-forest:test

* Updated readme.

* docs: update README benchmark table and references

  - Apply rounding to all value ± error pairs (1 significant figure on the
    error, 2 if its leading digit is 1)
  - Move Ref Python results from StandardIF to ExtendedIF_0 rows since
    the reference Python EIF at ext=0 is not a true standard IF
  - Add DOI to EIF paper reference and add reference Python eif repo
  - Clarify column headers (Liu et al., Ref. Python with IF/EIF labels)
  - Simplify key observations and fix overstated dimensionality claim
  - Minor wording improvements throughout
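
The rounding rule in the first bullet can be sketched in Python as follows. This is an illustrative helper, not the script used for the README; the name `format_with_error` is hypothetical, and the sketch targets errors below 1, as in the benchmark table.

```python
from decimal import Decimal


def format_with_error(value, error):
    """Round value and error to the error's 1 or 2 significant figures."""
    val = Decimal(str(value))
    err = Decimal(str(error))
    if err <= 0:
        raise ValueError("error must be positive")
    # Position of the error's leading digit (e.g. -3 for 0.0041).
    exponent = err.adjusted()
    leading_digit = err.as_tuple().digits[0]
    # Keep 2 significant figures when the leading digit is 1, else 1.
    sig_figs = 2 if leading_digit == 1 else 1
    quantum = Decimal(1).scaleb(exponent - (sig_figs - 1))
    return f"{val.quantize(quantum)} ± {err.quantize(quantum)}"
```

Using `Decimal` avoids the binary floating-point surprises that can creep in when extracting the leading digit with `log10` and division.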

* Added scroll to results table.

* updated readme

  1. Non-breaking spaces around ± — replaced ± with &nbsp;±&nbsp; in all value cells so values like 0.813 ± 0.004 won't wrap mid-value.
  2. Dashes in empty cells — all empty reference cells now show - instead of blank:
    - StandardIF rows: - in both Ref. Python columns
    - ExtendedIF rows: - in the Liu et al. column

* Updated readme.

* fix(EIF): use float-precision hyperplane weights for Spark 4.x Avro compatibility

  Spark 4.x's Avro encoder silently demotes Array[Double] elements to
  float (32-bit) precision during serialization, while scalar Double
  fields survive intact. This caused all five EIF model write/read tests
  to fail on Spark 4.0.1 and 4.1.1, with weight mismatches at ~1e-8
  (the exact double→float→double precision boundary).

  The fix changes SplitHyperplane weights from Array[Double] to
  Array[Float]. This is the correct design from first principles:
  - Features are already Array[Float] (DataPoint.features)
  - Weights define the hyperplane *direction* (analogous to the feature
    index in standard IF, which is just an Int)
  - The offset defines *where* to split and remains Double (analogous to
    splitValue in standard IF, which is Double for the same reason)
  - The dot product is accumulated in Double regardless of operand type
  - The split comparison (dot < offset) is always Double vs Double

  Weights are converted to float after normalization but before computing
  the offset, so training and scoring are consistent. Benchmarks confirm
  the change is invisible after rounding: only one value across all 13
  datasets changed (breastw ExtendedIF_max AUPRC: 0.9568 → 0.9569,
  well within the ±0.0015 error bar).
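
The size of the double→float→double round-trip error can be reproduced with a short Python sketch. This is only an illustration with the standard `struct` module; it does not involve Spark or Avro, and the sample weight is arbitrary.

```python
import struct


def to_float32(x: float) -> float:
    """Round a Python float (64-bit) to the nearest 32-bit float and back."""
    return struct.unpack("f", struct.pack("f", x))[0]


weight = 0.1234567890123  # arbitrary example weight with |weight| <= 1
roundtrip = to_float32(weight)
error = abs(roundtrip - weight)

# The relative error is bounded by 2**-24 (about 6e-8), so exact equality
# checks against the original double fail after a single round trip.
print(f"{weight!r} -> {roundtrip!r} (abs error {error:.2e})")
```

A second round trip is lossless, which is why storing the weights as float from the start makes write/read comparisons exact.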

  Production code:
  - ExtendedUtils.scala: SplitHyperplane.weights Array[Double] → Array[Float]
  - ExtendedIsolationTree.scala: normalize to float before offset computation
  - ExtendedIsolationForestModelReadWrite.scala: ExtendedNodeData.weights
    and NullWeights updated to float

  Test code:
  - ExtendedIsolationTreeTest.scala: float literals, L2 norm tolerance
    widened from 1e-10 to 1e-6 (appropriate for float precision)
  - ExtendedIsolationForestModelWriteReadTest.scala: float literals,
    added disabled regenerateGoldenExtendedModel helper
  - Regenerated golden model and expected tree structure

  README:
  - Updated breastw ExtendedIF_max AUPRC from 0.9568 to 0.9569

  Verified all 67 tests pass on Spark 3.5.5, 4.0.1, and 4.1.1.

* docs: address Copilot review feedback on PR #79

Copilot review responses:

1. Grammar fix (accepted): "some dataset" → "some datasets" in README
   benchmark observations.

2. .toFloat cast comment (accepted): Added clarifying comment explaining
   why features(indices(i)).toFloat is intentional — it matches the
   DataPoint (Array[Float]) precision used during training, ensuring
   scoring consistency between the DataPoint and Vector code paths.

3. Shuffle optimization (declined): Copilot suggested replacing
   Random.shuffle + take with reservoir sampling. The shuffle operates
   on a tiny array (≤ dim features, typically < 100 elements) once per
   tree node during training — not a hot path. Readability outweighs
   the micro-optimization.

4. outlierScoreThreshold > 0 sentinel (declined): Copilot noted that
   threshold=0.0 would be treated as "unset". Technically correct, but
   this mirrors the existing standard IF pattern identically. A
   threshold of 0.0 (label everything as outlier) is not a practical
   use case. Fixing it properly requires changing both IF and EIF
   together in a separate PR.
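
The sentinel behavior discussed in item 4 can be illustrated with a small Python sketch. Both functions are hypothetical, and the sketch assumes that an unset threshold means no point is labeled an outlier.

```python
from typing import Optional


def label_with_sentinel(score: float, threshold: float) -> float:
    """Current pattern: any threshold <= 0 is treated as 'unset'."""
    if threshold > 0:
        return 1.0 if score >= threshold else 0.0
    return 0.0


def label_with_optional(score: float, threshold: Optional[float]) -> float:
    """Alternative: 'unset' is explicit, so 0.0 is a usable threshold."""
    if threshold is None:
        return 0.0
    return 1.0 if score >= threshold else 0.0
```

With a threshold of 0.0, the first variant labels nothing while the second labels every point as an outlier, which is the difference Copilot pointed out.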

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

v4.0.12

Bump protobuf from 5.29.5 to 5.29.6 in /isolation-forest-onnx (#78)

Bumps [protobuf](https://github.com/protocolbuffers/protobuf) from 5.29.5 to 5.29.6.
- [Release notes](https://github.com/protocolbuffers/protobuf/releases)
- [Commits](https://github.com/protocolbuffers/protobuf/commits)

---
updated-dependencies:
- dependency-name: protobuf
  dependency-version: 5.29.6
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

v4.0.11

build(isolation-forest-onnx): bump twine to 6.2.0 for PyPI upload compatibility (#77)

- Update dev tooling pin from twine==5.1.1 -> 6.2.0 in requirements-dev.txt and setup.cfg
- Fixes CI publish failures caused by newer wheel metadata validation

v4.0.9

Bump wheel from 0.38.1 to 0.46.2 in /isolation-forest-onnx (#75)

Bumps [wheel](https://github.com/pypa/wheel) from 0.38.1 to 0.46.2.
- [Release notes](https://github.com/pypa/wheel/releases)
- [Changelog](https://github.com/pypa/wheel/blob/main/docs/news.rst)
- [Commits](pypa/wheel@0.38.1...0.46.2)

---
updated-dependencies:
- dependency-name: wheel
  dependency-version: 0.46.2
  dependency-type: direct:development
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

v4.0.8

Add support for Spark 4.1 (#74)

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>