metrics.py: fix inconsistent interface 3/n #41

TaXxER · 2025-12-09T15:25:09Z

Summary:
Some of our methods use predicted_labels (for discrete predictions):

recall
precision
dcg_score
ndcg_score
etc.

Other methods use predicted_scores (for continuous functions):

expected_calibration_error
proportional_expected_calibration_error
adaptive_calibration_error
proportional_adaptive_calibration_error
calibration_ratio
kuiper_calibration

However, normalized_entropy and calibration_free_normalized_entropy used predictions, which is inconsistent. Let's switch to predicted_scores like the rest.

Differential Revision: D88746323
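
To illustrate the naming convention the summary describes, here is a minimal sketch of a score-based metric that takes `predicted_scores`. The signature, the `eps` clipping parameter, and the body are assumptions made for illustration; they use the common definition of normalized entropy (log loss divided by the entropy of the base rate) and are not the library's actual code.

```python
import numpy as np


def normalized_entropy(labels, predicted_scores, eps=1e-12):
    """Log loss of the scores divided by the entropy of the label base rate.

    Hypothetical sketch: the parameter name follows the `predicted_scores`
    convention from the summary; the rest is assumed for illustration.
    """
    labels = np.asarray(labels, dtype=float)
    # Clip scores away from 0 and 1 so the logarithms stay finite.
    scores = np.clip(np.asarray(predicted_scores, dtype=float), eps, 1 - eps)

    log_loss = -np.mean(labels * np.log(scores) + (1 - labels) * np.log(1 - scores))

    # Baseline: log loss of always predicting the observed base rate.
    base_rate = np.clip(labels.mean(), eps, 1 - eps)
    baseline = -(base_rate * np.log(base_rate) + (1 - base_rate) * np.log(1 - base_rate))
    return log_loss / baseline
```

Under this definition, a model that always predicts the base rate scores 1.0, and lower values are better. If every label is identical, the baseline is close to zero and the ratio becomes very large.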

- Fix spelling and grammar issues throughout documentation - Enhance clarity with concrete examples and expanded acronyms - Correct MCE definition from 'Maximum Calibration Error' to 'Multicalibration Error' - Fix all code examples to match actual DataFrame-based API - Add academic references for main paper (KDD 2026), metrics paper, and applications paper - Update all documentation files with accurate method signatures and parameter names Files modified: - README.md: Add proper citations, fix code example API - pyproject.toml: Fix subject-verb agreement - website/docs/: Update all doc pages with correct API usage - website/src/pages/index.js: Update homepage with correct citations and code example

The internal and external repositories are out of sync. This Pull Request attempts to bring them back in sync by patching the GitHub repository. Please carefully review this patch. You must disable ShipIt for your project in order to merge this pull request. DO NOT IMPORT this pull request. Instead, merge it directly on GitHub using the MERGE BUTTON. Re-enable ShipIt after merging.

Summary: Pull Request resolved: #24 We're starting to see some "running out of disk space" errors on the machines that build Sphinx. We have plenty of budget, so let's just throw a bigger machine at it. #18 Reviewed By: Lorenzo-Perini Differential Revision: D88486737 fbshipit-source-id: 3ab67c816a15a6d54bc330106e962b3e1f315e70

… internal only module (#36) Summary: Pull Request resolved: #36 * MCNet is not part of the OSS release * Create a new module for internal only implementations * For backward compatibility re-export MCNet in `methods.py` with `oss-disable` directive * To avoid circular import due to `BaseCalibrator` move the abstract class to a new private module Reviewed By: TaXxER Differential Revision: D88473570 fbshipit-source-id: 96f52388e4d9055db020418b8b44703c5bac2466

…ng -> PlattScalingWithFeatures (#34) Summary: Pull Request resolved: #34 SwissCheese is not a public name so we should rename this to something more generic. * Rename the implementation `SwissCheesePlattScaling` -> `PlattScalingWithFeatures` * Create a wrapper class of `SwissCheesePlattScaling` for backward compatibility (oss-disabled) Reviewed By: TaXxER Differential Revision: D88475946 Privacy Context Container: L1334583 fbshipit-source-id: 0a2120d75ab42d1ee8f4965383ff43157e70b3fc

…/private methods naming (#43) Summary: Pull Request resolved: #43 We have several methods that are just used internally and not meant to be exposed to users / documented in our website / docs. Let's make this consistent and only expose a selection of methods that are core functionality and we want to fully document. Reviewed By: TaXxER Differential Revision: D88750778 fbshipit-source-id: b125880f51d516b0d246cd93c79489dca50ff2eb

Summary: Pull Request resolved: #45 The latest sklearn version no longer accepts empty input and raises an error. This breaks our unit test and our GitHub CI. This change keeps our behavior the same as it was. See error: https://github.com/facebookincubator/MCGrad/actions/runs/20099929745/job/57667810962 Reviewed By: Lorenzo-Perini Differential Revision: D88847925 fbshipit-source-id: 57e50b0deaa6da4a36af5d180ea4e65309d5b482

Summary: Pull Request resolved: #39 This diff just does the low-hanging fruit: removing references to internal wikis, documentation, code, etc. # Discussion point - There are still a bunch of metrics that have an `adjust_unjoined` argument, and a description of what unjoined data is. It will be somewhat non-trivial to fully disentangle this from the main metrics. Do we want to: 1) Fully remove all mentions of `adjust_unjoined`, disentangle all unjoined versions of the metrics from the main metric implementation, and move those to the `internal` folder, or 2) Simply remove the references to where unjoined data is used (previously one comment said that it is used a lot in the Ads org), and be OK with `adjust_unjoined` existing in the codebase. My feeling is that this will come across at worst as "a bit silly" to outsiders, and they will wonder "why the hell would anyone ever have data in that format", but as far as I can foresee, the harm of leaving it in won't really go beyond that. Reviewed By: Lorenzo-Perini Differential Revision: D88739146 fbshipit-source-id: e229d5d056d45a43e3c963d33def69d3915e8478

Summary: Pull Request resolved: #40 Changes: - Fixed typo min_segmens_size -> min_segments_size - One of the tests was missing the `test_` prefix and therefore never ran. It was actually a test to check that a mutable input argument is never modified, so given the recent SEV, it is an important type of test - In one unit test, I switched to using an RNG seed, so that the test cannot be flaky / non-deterministic - A list comprehension was unnecessarily complex, with redundant brackets and parentheses Reviewed By: Lorenzo-Perini Differential Revision: D88744756 fbshipit-source-id: f4a0d8228149b943b32b4ecf0373d923e61bb271
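
The two testing points in #40 (a test only runs if its name starts with `test_`, and random inputs should come from a seeded generator) can be sketched as follows. The function under test, `top_k_indices`, is a hypothetical stand-in and not part of the repository.

```python
import numpy as np


def top_k_indices(scores, k):
    """Hypothetical function under test: indices of the k largest scores."""
    # np.argsort returns a new array and leaves `scores` untouched.
    return np.argsort(scores)[::-1][:k]


def test_input_is_not_modified():
    # The `test_` prefix is what lets pytest collect this function.
    rng = np.random.default_rng(42)  # fixed seed keeps the test deterministic
    scores = rng.uniform(size=100)
    snapshot = scores.copy()

    top_k_indices(scores, k=5)

    # The caller's array must be identical after the call.
    np.testing.assert_array_equal(scores, snapshot)
```

Copying the input before the call and comparing afterwards is what catches an in-place operation such as `scores.sort()` inside the function.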

Summary: Pull Request resolved: #46 Providing arguments to pass df_val into tuning/final fit. Likely not something everyone wants to do, especially if people want to use cross validation. But this provides an option for those use cases where it's needed. Reviewed By: flinder Differential Revision: D88816384 fbshipit-source-id: 80da8accc544027222bae7cc8052b02650eb6fad

…public methods/properties in methods.py (#44) Summary: Pull Request resolved: #44 as title Reviewed By: Lorenzo-Perini Differential Revision: D88756988 Privacy Context Container: L1334583 fbshipit-source-id: 01debc913a8bb884f8ee657ae62e2b11f4d6492c

…ith `ValueErrors` (#48) Summary: Pull Request resolved: #48 Replaced 3 assert statements used for input validation with proper raise ValueError exceptions. Asserts can be disabled with Python's -O flag, making them unsuitable for production validation. This ensures validation always runs regardless of Python optimization settings. Reviewed By: leepface Differential Revision: D89060673 fbshipit-source-id: 8eb8c199b24135b23a3eb1643d9c0277ed835382
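
A minimal sketch of the pattern #48 describes: input validation that raises `ValueError` and so keeps running under `python -O`, where `assert` statements are stripped. The function name and the specific checks are assumptions for illustration; the three checks actually replaced are not shown in this thread.

```python
import numpy as np


def validate_inputs(labels, predicted_scores):
    """Hypothetical validator; the checks are illustrative only."""
    labels = np.asarray(labels)
    scores = np.asarray(predicted_scores, dtype=float)

    # Before: `assert labels.shape == scores.shape` (skipped under -O).
    if labels.shape != scores.shape:
        raise ValueError(
            f"labels and predicted_scores must have the same shape, "
            f"got {labels.shape} and {scores.shape}"
        )

    # Before: `assert ((scores >= 0) & (scores <= 1)).all()`.
    if scores.size and (scores.min() < 0 or scores.max() > 1):
        raise ValueError("predicted_scores must lie in [0, 1]")

    return labels, scores
```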

…or EARLY_STOPPING_ESTIMATION_METHOD (#49) Summary: Pull Request resolved: #49 Replaced a deeply nested ternary expression with a clear if/elif/else block when determining the early stopping estimation method. Added explicit type annotation before the conditional to satisfy Pyre. Reviewed By: leepface Differential Revision: D89060683 fbshipit-source-id: ebb4b7b3f4b6cb74eff03975d4379f7e54e0a641
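
The refactor in #49 might look roughly like the sketch below. The enum, the function names, and the selection rule (including the 10,000-row cutoff) are hypothetical and only show the shape of the change: a nested ternary replaced by an if/elif/else block, with the variable annotated before the conditional.

```python
from enum import Enum
from typing import Optional


class EstimationMethod(Enum):  # hypothetical stand-in
    CROSS_VALIDATION = "cross_validation"
    HOLDOUT = "holdout"


def choose_nested(requested: Optional[EstimationMethod], n_rows: int) -> EstimationMethod:
    # Before: correct, but the nesting hides the order of the decisions.
    return (
        requested
        if requested is not None
        else (EstimationMethod.HOLDOUT if n_rows >= 10_000 else EstimationMethod.CROSS_VALIDATION)
    )


def choose_explicit(requested: Optional[EstimationMethod], n_rows: int) -> EstimationMethod:
    # After: declare the type once, then assign in each branch.
    method: EstimationMethod
    if requested is not None:
        method = requested
    elif n_rows >= 10_000:
        method = EstimationMethod.HOLDOUT
    else:
        method = EstimationMethod.CROSS_VALIDATION
    return method
```

Both functions return the same result for every input; only readability and the explicit annotation differ.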

…n expressions (#50) Summary: Pull Request resolved: #50 Simplified two redundant True if condition else False patterns to direct boolean expressions. The comparison already returns a boolean, so wrapping in a ternary is unnecessary. Reviewed By: leepface Differential Revision: D89060702 fbshipit-source-id: 4c18f4edc177509582dabfff0ce63e7c53e610a6
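
As a small illustration of the simplification in #50, with hypothetical function names: a comparison already evaluates to a boolean, so the surrounding ternary adds nothing.

```python
def exceeds_before(value: float, threshold: float) -> bool:
    # Redundant: the comparison is already True or False.
    return True if value > threshold else False


def exceeds_after(value: float, threshold: float) -> bool:
    # Equivalent and shorter.
    return value > threshold


def has_rows(rows: list) -> bool:
    # When the condition is not a comparison, bool() keeps the return type exact.
    return bool(rows)
```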

…> lower case class attributes (start with MONTONE_T) (#54) Summary: Pull Request resolved: #54 Dynamically set class attributes should be lower case according to PEP8 / Google style guide. I introduce a backward compatibility module `_compat.py` (Python convention) to avoid cluttering the main class. How it works: It implements a mixin class which adds deprecated alias descriptors to the class. This allows retaining the old attribute name and produces a deprecation warning whenever it is set or accessed. Since we don't have open source users yet, backward compatibility is not a concern there, therefore I `oss-disable` the mixin. Reviewed By: Lorenzo-Perini Differential Revision: D89061588 fbshipit-source-id: e7564e6d6437c31234b381a73b2b5c6c7a32aa93
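
The deprecated-alias mechanism described in #54 could be sketched as below. The class names `DeprecatedAlias`, `DeprecatedAttributesMixin`, and `Model` are hypothetical, and the real `_compat.py` may be structured differently; the sketch only shows how a descriptor can forward an old upper-case attribute to its lower-case replacement while emitting a `DeprecationWarning`.

```python
import warnings


class DeprecatedAlias:
    """Descriptor that forwards an old attribute name to a new one."""

    def __init__(self, new_name):
        self.new_name = new_name

    def __set_name__(self, owner, name):
        # Python calls this with the attribute name used in the class body.
        self.old_name = name

    def _warn(self):
        warnings.warn(
            f"'{self.old_name}' is deprecated; use '{self.new_name}' instead.",
            DeprecationWarning,
            stacklevel=3,  # attribute the warning to the caller's line
        )

    def __get__(self, instance, owner=None):
        if instance is None:
            return self  # accessed on the class itself
        self._warn()
        return getattr(instance, self.new_name)

    def __set__(self, instance, value):
        self._warn()
        setattr(instance, self.new_name, value)


class DeprecatedAttributesMixin:
    # Old upper-case name kept as an alias for the new lower-case attribute.
    N_FOLDS = DeprecatedAlias("n_folds")


class Model(DeprecatedAttributesMixin):  # hypothetical consumer class
    def __init__(self):
        self.n_folds = 5
```

Reading or writing `model.N_FOLDS` then behaves like `model.n_folds` and warns on each access. Keeping the aliases in a mixin means they can be dropped from a build by removing one base class.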

…> lower case class attributes - EARLY_STOPPING (#53) Summary: Pull Request resolved: #53 as title Reviewed By: Lorenzo-Perini Differential Revision: D89219315 fbshipit-source-id: 253574a10cbec85d30aeb6b9fea5968145dff2be

…> lower case class attributes - EARLY_STOPPING_ESTIMATION_METHOD (#52) Summary: Pull Request resolved: #52 As title. Reviewed By: Lorenzo-Perini Differential Revision: D89219345 fbshipit-source-id: c88ad4dc1330383bf88247bebd515303d47b59ff

…> lower case class attributes - EARLY_STOPPING_TIMEOUT (#51) Summary: Pull Request resolved: #51 As title Reviewed By: Lorenzo-Perini Differential Revision: D89219361 fbshipit-source-id: 911977a715a6518fe009f64346633fa2b24f4648

…> lower case class attributes - N_FOLDS (#55) Summary: Pull Request resolved: #55 As title Reviewed By: Lorenzo-Perini Differential Revision: D89219389 fbshipit-source-id: f2bb72cbba822e86d53303bb09dbbc112cfadc79

TaXxER and others added 30 commits October 1, 2025 13:26

initial commit

9f2f528

OSS Automated Fix: Addition of Contributing

e5815b3

OSS Automated Fix: Addition of Code of Conduct

4e40f39

.gitignore for pytest cache

ca87751

Initial commit: just copied fbcode/multicalibration, but without mcne…

b233276

…t.py, thresholding.py, and torchscript_modules.py

Add *.egg-info to .gitignore

1f3fc44

Move src files to src/multicalibration, in order to match 'from multi…

23931ed

…calibration import [...]'

Add pyproject.toml to build MCGrad package

6628c1f

Solving failing tests: removing tests that depend on regression MCBoo…

532d2a1

…st, which is not included in this release

Solving failing tests: there seems to be a BoTorch incompatibility wi…

a2b0e30

…th Apple M1/M2 hardware, resulting in segmentation faults. We ignore affected tests.

Merge pull request #1 from facebookincubator/automated_fixup_contribu…

a2cd70c

…ting_file_exists Adding Contributing file

Merge pull request #2 from facebookincubator/automated_fixup_code_of_…

08c945e

…conduct_file_exists Adding Code of Conduct file

Add MIT license file

ab4c2b3

Applied isort and black

9c01105

Adds pre-commit hook that formats on commit and runs unit tests on push

45aff7d

Add a CODEOWNERS file

ff4987f

Add pytest to continuous integration

6396fad

Drop support for Python 3.9

153de92

Ax version 1.2.0 changes behavior that affects test_tuning.py, but it…

dbdc9b3

… is not released yet on PyPI. Here we undo the tuning change to make it compatible with 1.1.2.

Cleanup disk space in CI machines

fdfd3a2

Switch to bigger machine

0692382

Initial setup of docusaurus

f159663

Update CI to use Node.js 20 for Docusaurus

24c6dc8

Set up test coverage report in CI

d53a16f

Add test coverage coming from mutmul mutation testing report

ab48f28

meta-codesync bot added the meta-exported label Dec 9, 2025

flinder added 3 commits December 9, 2025 09:21

methods.py internal/external check 3/n - Remove reference to Bento no…

07abfe3

…tebook (#35) Summary: Pull Request resolved: #35 as title Reviewed By: TaXxER Differential Revision: D88475947 Privacy Context Container: L1334583 fbshipit-source-id: efe4cb270d2596a1d5d9aa620d8b52486724e347

facebook-github-bot force-pushed the export-D88746323 branch from 0ddd339 to 3cee263 Compare December 10, 2025 13:16

flinder and others added 15 commits December 10, 2025 05:56

facebook-github-bot force-pushed the export-D88746323 branch 2 times, most recently from d39f7fc to 142d2b4 Compare December 19, 2025 16:32

facebook-github-bot closed this Jan 12, 2026

facebook-github-bot force-pushed the main branch from d3f884a to 126ba8b Compare January 12, 2026 13:48

4 participants
