@tevko tevko commented Dec 6, 2025

No description provided.

tevko and others added 28 commits November 17, 2025 21:05
* remove deprecated conversation fields

* add GET all_conversations route

* superadmin all-conversations view

* Participant Management WIP

* refactor xid logic; show xid list with pids in client-admin

* new xid tests

* Enable XID Upload

* show xid vote_count

* block non-xid participants when xid is required

* update some internal naming from "whitelist" to "allow list"

* xid arg not needed in votesPost

* fix test

* participation-management e2e

* upgrade cypress

* fix e2e test

* update alpha client with xid concerns

* normalize message; fix test

* rebuild astro
* update pip-tools; remove pip version restriction; update requirements.lock

* simplify Dockerfile; remove unused `IS_GITHUB_ACTION` conditional

* update cypress config to not use `IS_GITHUB_ACTION`

* conditionally use cpu-only torch libs in test builds
* Fix run_math_pipeline test import to use proper package path

The test file was importing `from run_math_pipeline import main` which
failed locally because `run_math_pipeline.py` lives inside the `polismath`
package at `delphi/polismath/run_math_pipeline.py`.

CI was working around this by copying the file to a flat location:
  docker cp delphi/polismath/run_math_pipeline.py delphi:/app/run_math_pipeline.py

This created a discrepancy between local and CI environments.

The fix:
1. Update test imports to use the correct package path:
   `from polismath.run_math_pipeline import main`
2. Update mock.patch paths to match:
   `mock.patch('polismath.run_math_pipeline.fetch_comments', ...)`
3. Remove the CI workaround that copied the file to /app flat
4. Simplify coverage to `--cov=polismath` (run_math_pipeline is inside it)

The Docker image already has `polismath/` at `/app/polismath/` and the
package is installed via `pip install --no-deps .`, so the proper import
path works in both local and CI environments.
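
Why the `mock.patch` targets had to move together with the import can be sketched as follows. This is a minimal, self-contained illustration rather than the project's test: it builds a stand-in `polismath.run_math_pipeline` module in memory, and the bodies of `fetch_comments` and `main` are hypothetical.

```python
import sys
import types
from unittest import mock

# Stand-in package and submodule built in memory (hypothetical contents;
# the real module lives at delphi/polismath/run_math_pipeline.py).
pkg = types.ModuleType("polismath")
pkg.__path__ = []  # mark the stand-in as a package
mod = types.ModuleType("polismath.run_math_pipeline")

source = '''
def fetch_comments():
    raise RuntimeError("would hit the database")

def main():
    # fetch_comments is looked up in this module's globals at call time
    return len(fetch_comments())
'''
exec(source, mod.__dict__)

sys.modules["polismath"] = pkg
sys.modules["polismath.run_math_pipeline"] = mod
pkg.run_math_pipeline = mod

from polismath.run_math_pipeline import main

# Patch the name in the module where main() looks it up.
with mock.patch(
    "polismath.run_math_pipeline.fetch_comments",
    return_value=["comment a", "comment b"],
):
    result = main()
```

Because `main` resolves `fetch_comments` through its own module's namespace, the patch target has to name that module by its package path; a patch aimed at a flat `run_math_pipeline` module would replace a different object.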

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* Improve CI coverage reporting reliability

Changes to the CI workflow:
1. Print coverage report to workflow logs (always visible)
2. Upload coverage report as downloadable artifact
3. Make PR comment step non-fatal with continue-on-error: true
   (fork PRs cannot post comments due to GitHub token restrictions)

Coverage is now accessible three ways:
- In the workflow logs (step 7)
- As a downloadable artifact (step 8)
- As a PR comment when permissions allow (step 9)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* Add graceful error handling for coverage comment on fork PRs

Instead of showing an unhandled error when posting coverage comments
fails on fork PRs, the script now catches the 403 error and displays
a helpful message explaining:
- Why the comment could not be posted (GitHub token permissions)
- Where to find the coverage report (logs and artifact)
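
The catch-and-explain behaviour described above can be sketched in Python as follows. This is only an illustration of the pattern under stated assumptions: the commit does not show the workflow script itself, and `ApiError`, `post_coverage_comment`, and the message text are hypothetical names introduced here.

```python
class ApiError(Exception):
    """Hypothetical stand-in for an HTTP client error carrying a status code."""

    def __init__(self, status, message):
        super().__init__(message)
        self.status = status


def post_coverage_comment(post, body):
    """Call post(body); turn a 403 into a helpful message instead of a crash."""
    try:
        post(body)
    except ApiError as err:
        if err.status != 403:
            raise  # anything other than the permission failure is unexpected
        return (
            "Coverage comment not posted: the GitHub token cannot comment on "
            "fork PRs. See the workflow logs or the uploaded artifact."
        )
    return "Coverage comment posted."


def forbidden_post(body):
    # Simulates the failure seen on fork PRs.
    raise ApiError(403, "Forbidden")


message = post_coverage_comment(forbidden_post, "coverage report")
```

Re-raising every status other than 403 keeps genuine failures visible while the expected fork-PR case degrades to a message.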

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

---------

Co-authored-by: Claude <noreply@anthropic.com>
* Merge Squashed onto `edge`:

commit 7f14aed
Merge: 93a2d31 780f129
Author: Julien Cornebise <julien@cornebise.com>
Date:   Thu Nov 20 15:50:04 2025 +0000

    Merge commit '780f1298ca7d72b9717f6aa38526301305e520e8' into replace_named_matrix

    This will allow CI to run correctly.

commit 93a2d31
Author: Julien Cornebise <julien@cornebise.com>
Date:   Tue Nov 18 21:10:15 2025 +0000

    Recompile requirements.lock to include natsort

commit 0fd3734
Author: Julien Cornebise <julien@cornebise.com>
Date:   Tue Nov 18 15:01:16 2025 +0000

    Update golden records

    Now that we have changed behaviours of matrix in terms of ordering and of types,
    we need to update the golden records to reflect these changes.

commit 08d2383
Author: Julien Cornebise <julien@cornebise.com>
Date:   Tue Nov 18 15:01:04 2025 +0000

    Fix regression bugs from package reorganization due to hallucinations

    During refactoring to polismath.regression package, introduced bugs by
    hallucinating non-existent methods and changing behavior without checking
    the original code (commit afb8525).

    Fixed:
    - prepare_votes_data(): Restored CSV columns ('voter-id', 'comment-id')
      and vote dict keys ('pid', 'tid') instead of hallucinated alternatives
    - compute_all_stages(): Restored actual methods (update_votes(),
      _compute_pca(), _compute_clusters()) instead of hallucinated ones
      (process_votes(), compute_pca(), compute_clustering())
    - compute_all_stages_with_benchmark(): Restored original implementation
    - get_dataset_files(): Restored original dict keys ('votes', 'comments')
      instead of changed keys ('votes_csv', 'comments_csv')
    - load_golden_snapshot(): Restored golden_path computation logic
    - Numpy type handling: Added custom JSON encoder to preserve numeric types
      and extended comparer to treat Python/numpy numeric types as compatible

commit 334c01b
Author: Julien Cornebise <julien@cornebise.com>
Date:   Tue Nov 18 14:11:31 2025 +0000

    Reorganize regression testing into dedicated polismath.regression package

    - Split monolithic regression.py (1167 lines) into focused modules:
      - recorder.py: ConversationRecorder class
      - comparer.py: ConversationComparer class
      - datasets.py: Dataset configuration (moved from tests/)
      - utils.py: Shared utility functions

    - Clean architecture: No backwards dependencies from production to tests
    - Updated all imports in CLI scripts and test files
    - Regression testing now treated as first-class production feature

    This improves code organization, maintainability, and makes the regression
    tools suitable for use in production environments (monitoring, validation).

    🤖 Generated with [Claude Code](https://claude.com/claude-code)

    Co-Authored-By: Claude <noreply@anthropic.com>

commit afb8525
Author: Julien Cornebise <julien@cornebise.com>
Date:   Tue Nov 18 13:47:11 2025 +0000

    Improve logging throughout regression testing system

    - Replace all print statements with proper logging calls in polismath/regression.py
      - Use logger.info() for progress updates and results
      - Use logger.warning() for comparison mismatches
      - Use logger.debug() for detailed diagnostic information

    - Make PCA debug output conditional on DEBUG log level
      - Only save debug JSON files when logger.isEnabledFor(logging.DEBUG)
      - Move debug outputs from current directory to .test_outputs/debug/

    - Add --log-level CLI argument to regression scripts
      - Support DEBUG, INFO, WARNING, ERROR, CRITICAL levels
      - Default to INFO level
      - DEBUG level enables PCA debug file generation

    - Fix conversation module's logging initialization
      - Check logging.root.handlers instead of logger.handlers
      - Prevents duplicate handlers when logging is externally configured
      - Simplifies logging setup in CLI scripts

    The regression tools now provide full control over logging verbosity,
    making it easier to debug issues (with DEBUG) or run quietly (with WARNING/ERROR).
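
A minimal sketch of the two logging changes above, using only the standard library. The function names `parse_log_level` and `configure_module_logging` and the logger name are assumptions made for illustration; they are not taken from the repository.

```python
import argparse
import logging


def parse_log_level(argv):
    """Parse a --log-level option that defaults to INFO."""
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--log-level",
        default="INFO",
        choices=["DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL"],
    )
    return parser.parse_args(argv).log_level


def configure_module_logging(level=logging.INFO):
    """Install a handler only if logging has not been configured elsewhere."""
    if not logging.root.handlers:
        logging.basicConfig(level=level)
        return True
    return False


# The CLI script configures the root logger first...
level_name = parse_log_level(["--log-level", "DEBUG"])
logging.basicConfig(level=getattr(logging, level_name), force=True)

# ...so the library-side setup sees the root handler and adds nothing.
installed = configure_module_logging()

logger = logging.getLogger("polismath.regression")
write_debug_files = logger.isEnabledFor(logging.DEBUG)
```

Checking `logging.root.handlers` matters because a module-level logger usually has no handlers of its own even when the root logger is fully configured, so a check on `logger.handlers` would add a second handler and duplicate every message.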

    🤖 Generated with Claude Code
    Co-Authored-By: Claude <noreply@anthropic.com>

commit 87f8cb2
Author: Julien Cornebise <julien@cornebise.com>
Date:   Tue Nov 18 13:14:10 2025 +0000

    Reorganize regression tests and consolidate test outputs

    - Move golden snapshots to dataset folders (real_data/{dataset}/golden_snapshot.json)
    - Relocate regression library from regression_tests/ to polismath/regression.py
    - Move CLI tools to scripts/ with clearer names (regression_recorder.py, regression_comparer.py)
    - Mark Clojure comparison tests as legacy with 'legacy_' prefix
    - Consolidate ALL test outputs in hidden .test_outputs/ directory:
      - Regression outputs → .test_outputs/regression/
      - Python implementation outputs → .test_outputs/python_output/{dataset}/
    - Keep real_data/ clean with only source data and golden snapshots
    - Fix path resolution bugs and unknown dataset handling in regression system
    - Update documentation and simplify .gitignore

    This reorganization clearly separates:
    - Source data and golden snapshots (real_data/) from temporary outputs (.test_outputs/)
    - Standard Python regression tests from legacy Clojure comparisons
    - Core libraries (polismath/) from CLI tools (scripts/)

commit a947c5a
Author: Julien Cornebise <julien@cornebise.com>
Date:   Tue Nov 18 12:08:03 2025 +0000

    Process appropriate RuntimeWarning in correlation tests

    The fourth row of the test matrix is intentionally constant, which
    causes a RuntimeWarning when computing correlations. This commit updates
    the test to properly handle this warning using the warnings module, ensuring
    that the test suite runs cleanly without unhandled warnings.
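
As an illustration of handling an expected warning with the `warnings` module, here is a minimal sketch. The `pearson` function is a hypothetical pure-Python stand-in for the project's correlation code; it is written to warn on constant input in the way numerical libraries commonly do.

```python
import math
import warnings


def pearson(xs, ys):
    """Pearson correlation; warns and returns NaN when an input is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        warnings.warn("constant input: correlation undefined", RuntimeWarning)
        return math.nan
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / math.sqrt(sxx * syy)


constant_row = [1, 1, 1, 1]  # intentionally constant, like the fourth row
other_row = [1, -1, 1, -1]

# Record the warning instead of letting it leak into the test output.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    r = pearson(constant_row, other_row)

runtime_warnings = [w for w in caught if issubclass(w.category, RuntimeWarning)]
```

Recording the warning lets the test assert that it was raised for the constant row, which is stricter than silencing it globally.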

commit b6fbc09
Author: Julien Cornebise <julien@cornebise.com>
Date:   Tue Nov 18 11:56:16 2025 +0000

    Skip failing Clojure regression tests

    It's OK for now, as we want Delphi to stand on its own.

commit d8cb942
Author: Julien Cornebise <julien@cornebise.com>
Date:   Tue Nov 18 11:49:45 2025 +0000

    Remove hardcoded paths fed to Claude

commit 8dca87d
Author: Julien Cornebise <julien@cornebise.com>
Date:   Tue Nov 18 11:48:03 2025 +0000

    Factorize the clojure comparison and pipeline tests

    A lot of code was redundant and there was little separation
    of purpose between the clojure comparison logic and the
    pipeline tests. This change factorizes the clojure comparison
    logic into its own module and simplifies the pipeline tests.

commit 4622440
Author: Julien Cornebise <julien@cornebise.com>
Date:   Tue Nov 18 11:07:57 2025 +0000

    Fix output of full pipeline test

commit a274b8a
Author: Julien Cornebise <julien@cornebise.com>
Date:   Tue Nov 18 11:05:11 2025 +0000

    Refactor comparison to Clojure results

commit b94a6c1
Author: Julien Cornebise <julien@cornebise.com>
Date:   Tue Nov 18 09:47:07 2025 +0000

    Preserve original data types and use natural sorting.

    Makes for a much clearer output. Will need to update the golden record.
    All tests passing.

commit 7c6412b
Author: Julien Cornebise <julien@cornebise.com>
Date:   Tue Nov 18 09:39:30 2025 +0000

    Add test for natural sorting order before implementing

commit e06f0eb
Author: Julien Cornebise <julien@cornebise.com>
Date:   Tue Nov 18 09:14:17 2025 +0000

    Match old sorting and converting behaviour

commit e5f47cd
Author: Julien Cornebise <julien@cornebise.com>
Date:   Mon Nov 17 15:31:59 2025 +0000

    Comment out BG2018 report for tests

commit cdc238c
Author: Julien Cornebise <julien@cornebise.com>
Date:   Mon Nov 17 15:05:00 2025 +0000

    Remove every mention of NamedMatrix

commit dacd95a
Author: Julien Cornebise <julien@cornebise.com>
Date:   Mon Nov 17 14:49:18 2025 +0000

    Restrict pytest regression test to VW dataset only for speed

commit b657c87
Author: Julien Cornebise <julien@cornebise.com>
Date:   Mon Nov 17 14:43:41 2025 +0000

    Vectorize matrix clean-up

commit 4bb11b5
Author: Julien Cornebise <julien@cornebise.com>
Date:   Mon Nov 17 14:39:00 2025 +0000

    Fix bug in PCA that caused different results

    Found the bug ! (With Claude Code's help)
    The PCA code starts by "cleaning" the matrix with some replacement rules for NaN
    and strings. Then it proceeds to compute the PCA on that cleaned up matrix.
    Great, I've done the cleaning, and done it in-place for efficiency, since the
    matrix is cleaned up first thing in the code and the unclean one therefore not
    used.  Right ?
    ...
    RIGHT ??
    *It turns out*, hidden way below, the projection of the participants on the
    low-dimensional space is (intentionally) done *on the non-cleaned matrix* !!
    (TODO : I'll have to put my math thinking cap on to understand exactly why it was
    coded like that...)

    Adding "copy=True" in one built-in invocation solved it.

    This version here also restored the loop-in-loop cleanup code. My next commit will clean it up.
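
The aliasing problem described above can be reproduced in a few lines of plain Python. This is a sketch under assumptions: the real code works on a matrix object and was fixed by passing `copy=True` to a built-in call, whereas the list-of-lists matrix and the helper names below are hypothetical.

```python
import copy
import math


def clean_in_place(matrix):
    """Replace NaN with 0.0, mutating the caller's matrix."""
    for row in matrix:
        for j, value in enumerate(row):
            if isinstance(value, float) and math.isnan(value):
                row[j] = 0.0
    return matrix


def clean_copy(matrix):
    """Clean a deep copy so the caller's matrix keeps its NaN entries."""
    return clean_in_place(copy.deepcopy(matrix))


raw = [[1.0, math.nan], [-1.0, 1.0]]

# Safe variant: later steps that need the raw matrix still see the NaN.
cleaned = clean_copy(raw)
raw_still_has_nan = math.isnan(raw[0][1])

# Buggy variant: the "cleaned" matrix is the raw one, silently modified.
aliased = clean_in_place(raw)
raw_was_mutated = raw[0][1] == 0.0
```

Any later step that reads the original matrix, such as the participant projection mentioned above, gets different input depending on which variant ran first.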

commit 8da68f3
Author: Julien Cornebise <julien@cornebise.com>
Date:   Mon Nov 17 13:35:27 2025 +0000

    Try but fail to mimic the old handling of strings and NaNs

commit 0765357
Author: Julien Cornebise <julien@cornebise.com>
Date:   Mon Nov 17 13:32:15 2025 +0000

    Add a sanity check test for matrix cleaning functions

    Compare old and new way of doing things, to spot differences.

commit b5ca831
Author: Julien Cornebise <julien@cornebise.com>
Date:   Mon Nov 17 10:19:51 2025 +0000

    Print differences in regular order

    Set operations are unordered...

commit 762dcaa
Author: Julien Cornebise <julien@cornebise.com>
Date:   Mon Nov 17 10:07:27 2025 +0000

    Order lexicographically (by str) upon moderation

commit dde8d6f
Author: Julien Cornebise <julien@cornebise.com>
Date:   Mon Nov 17 10:07:05 2025 +0000

    Store actual computation results

commit e9a231a
Author: Julien Cornebise <julien@cornebise.com>
Date:   Mon Nov 17 10:02:09 2025 +0000

    Test ordering to match pre-NamedMatrixectomy ordering

commit fd592f7
Author: Julien Cornebise <julien@cornebise.com>
Date:   Mon Nov 17 09:28:40 2025 +0000

    Save computed JSON for outside comparison

    Also create a symlink to the latest, for ease of opening without having
    to read timestamps.
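
One way to keep a "latest" symlink next to timestamped output files is sketched below. The file names, the `latest.json` link name, and the helper are hypothetical, and the sketch assumes a platform where creating symlinks is permitted.

```python
import os
import tempfile
from pathlib import Path


def save_with_latest_link(directory, filename, text):
    """Write a file and repoint the 'latest.json' symlink to it."""
    directory = Path(directory)
    target = directory / filename
    target.write_text(text)

    link = directory / "latest.json"
    if link.is_symlink() or link.exists():
        link.unlink()  # drop the previous link before repointing it
    link.symlink_to(target.name)  # relative link, survives moving the folder
    return link


with tempfile.TemporaryDirectory() as tmp:
    save_with_latest_link(tmp, "computed_first.json", '{"run": 1}')
    link = save_with_latest_link(tmp, "computed_second.json", '{"run": 2}')
    latest_text = link.read_text()
    latest_target = os.readlink(link)
```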

commit 3392f44
Author: Julien Cornebise <julien@cornebise.com>
Date:   Fri Nov 14 15:22:06 2025 +0000

    Sort comment ids and participants

    Sort the comment ids and participant ids using natsort to ensure consistent ordering.
    Not sure why things need to be ordered, but it is probably less surprising this way.
    As a bonus, our indices can now be any type instead of being force-converted to strings.
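
To show what natural ordering changes compared with plain or by-string sorting, here is a small standard-library sketch. It is an illustration only: the commit uses the `natsort` package, and `natural_key` below is a hypothetical helper that does not reproduce that library's implementation.

```python
import re


def natural_key(value):
    """Split into text and digit chunks so that 'c10' sorts after 'c2'."""
    parts = re.split(r"(\d+)", str(value))
    # Odd positions hold the captured digit runs; tag each chunk so that
    # numbers and text are never compared with each other directly.
    return [(0, int(p)) if i % 2 else (1, p) for i, p in enumerate(parts) if p]


comment_ids = ["c10", "c2", "c1"]
plain_order = sorted(comment_ids)
natural_order = sorted(comment_ids, key=natural_key)

# Integer ids keep their type; only the sort key is derived from them.
participant_ids = [10, 2, 1]
by_str_order = sorted(participant_ids, key=str)
by_natural_order = sorted(participant_ids, key=natural_key)
```

Sorting through a key leaves the ids themselves untouched, which is what allows indices of any type instead of force-converting them to strings.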

commit 27b4cca
Author: Julien Cornebise <julien@cornebise.com>
Date:   Fri Nov 14 13:52:21 2025 +0000

    Remove python output that is dynamically generated during tests

commit c056ef6
Author: Julien Cornebise <julien@cornebise.com>
Date:   Fri Nov 14 13:49:09 2025 +0000

    Remove duplicate files

commit 81b3d9d
Author: Julien Cornebise <julien@cornebise.com>
Date:   Fri Nov 14 13:46:33 2025 +0000

    Rename folder to new name

commit 7c2cc3f
Author: Julien Cornebise <julien@cornebise.com>
Date:   Fri Nov 14 13:42:51 2025 +0000

    Fix trailing comma

commit 020ae42
Author: Julien Cornebise <julien@cornebise.com>
Date:   Fri Nov 14 13:39:53 2025 +0000

    Correct spaces to avoid false positives in git diffs.

commit 3866da5
Merge: 9bbdc49 2081ed8
Author: Julien Cornebise <julien@cornebise.com>
Date:   Fri Nov 14 12:26:08 2025 +0000

    Merge remote-tracking branch 'upstream/edge' into replace_named_matrix

    A lot of merge conflicts due to this branch having merged changes earlier
    that were merge-squashed into upstream/edge since then.

commit 9bbdc49
Author: Julien Cornebise <julien@cornebise.com>
Date:   Thu Nov 13 20:06:08 2025 +0000

    Pass all unit tests without NamedMatrix

commit 3fbeee6
Author: Julien Cornebise <julien@cornebise.com>
Date:   Thu Nov 13 20:03:45 2025 +0000

    Remove python output

    This python output is overwritten each time the tests are run, and should not be committed.

commit 1392b65
Author: Julien Cornebise <julien@cornebise.com>
Date:   Thu Nov 13 19:46:56 2025 +0000

    Pass correlation tests without NamedMatrix

commit 6b47f00
Author: Julien Cornebise <julien@cornebise.com>
Date:   Thu Nov 13 19:41:39 2025 +0000

    Pass Clustering tests without NamedMatrix :)

commit dbe197b
Author: Julien Cornebise <julien@cornebise.com>
Date:   Thu Nov 13 19:38:12 2025 +0000

    Pass all PCA unit tests

    PCA now works without NamedMatrix !

commit 13b8395
Author: Julien Cornebise <julien@cornebise.com>
Date:   Thu Nov 13 19:18:24 2025 +0000

    Replace NamedMatrix by DF in corr. clusters, and repness

    This passes test_conversation.py !

commit 2d1a6f7
Author: Julien Cornebise <julien@cornebise.com>
Date:   Thu Nov 13 19:02:01 2025 +0000

    Revert "Skip a warning generated by boto3 about datetime.utcnow being deprecated"

    This reverts commit 80be8bc.

commit 11f18b2
Author: Julien Cornebise <julien@cornebise.com>
Date:   Thu Nov 13 19:01:11 2025 +0000

    Replace NamedMatrix by DF in conversation.recompute()

    This means also applying to pca and clustering!

commit 7ac7b25
Author: Julien Cornebise <julien@cornebise.com>
Date:   Thu Nov 13 15:47:37 2025 +0000

    First replacement and first test to pass

    Replace NamedMatrix by DataFrame in
    - conversation.update_votes()
    - conversation._get_clean_matrix()
    - conversation._apply_moderation()

    and modify test_conversation::test_init

commit 80be8bc
Author: Julien Cornebise <julien@cornebise.com>
Date:   Thu Nov 13 13:01:24 2025 +0000

    Skip a warning generated by boto3 about datetime.utcnow being deprecated

commit 767a2d2
Author: Julien Cornebise <julien@cornebise.com>
Date:   Thu Nov 13 12:48:53 2025 +0000

    Add BG2018 and rename for clarity + Replace DB connection by URL

commit b80bd07
Merge: 58bd636 52a458a
Author: Julien Cornebise <julien@cornebise.com>
Date:   Thu Nov 13 09:35:31 2025 +0000

    Merge remote-tracking branch 'upstream/te-delphi-py-tests' into replace_named_matrix

commit 52a458a
Author: tevko <tim@devzero.io>
Date:   Wed Nov 12 21:55:27 2025 -0600

    build dependency

commit e26b482
Author: tevko <tim@devzero.io>
Date:   Wed Nov 12 21:41:04 2025 -0600

    update other action

commit 58bd636
Merge: e00d970 b427582
Author: Julien Cornebise <julien@cornebise.com>
Date:   Wed Nov 12 22:54:15 2025 +0000

    Merge branch 'te-delphi-py-tests' into replace_named_matrix

commit e00d970
Author: Julien Cornebise <julien@cornebise.com>
Date:   Wed Nov 12 22:43:17 2025 +0000

    Allow comparison to be run on multiple datasets

commit 1e48a4a
Author: Julien Cornebise <julien@cornebise.com>
Date:   Wed Nov 12 22:16:45 2025 +0000

    Add BG2050 and BG2018 datasets to dataset_config.py

commit d99c79e
Author: Julien Cornebise <julien@cornebise.com>
Date:   Wed Nov 12 22:04:55 2025 +0000

    Improve downloading of real data

    - Check if data already exists before downloading
    - Add option to force re-download of data
    - Change paths to include dataset name if known
    - Download all datasets from the config, by default
    - Add progress bars
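
The skip-if-present and force options above could look roughly like this. It is a sketch under assumptions: `ensure_dataset`, the injected `fetch` callable, and the paths are hypothetical, and the real script's download code and progress bars are not shown.

```python
import tempfile
from pathlib import Path


def ensure_dataset(dest, fetch, force=False):
    """Download to dest unless it already exists; return True if fetched."""
    dest = Path(dest)
    if dest.exists() and not force:
        return False  # data already present, nothing to do
    dest.parent.mkdir(parents=True, exist_ok=True)
    dest.write_bytes(fetch())
    return True


calls = []


def fake_fetch():
    # Stands in for the real network download.
    calls.append(1)
    return b"voter-id,comment-id\n"


with tempfile.TemporaryDirectory() as tmp:
    dest = Path(tmp) / "some_dataset" / "votes.csv"
    first = ensure_dataset(dest, fake_fetch)
    second = ensure_dataset(dest, fake_fetch)
    forced = ensure_dataset(dest, fake_fetch, force=True)
```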

commit 46eac1b
Author: Julien Cornebise <julien@cornebise.com>
Date:   Wed Nov 12 22:00:15 2025 +0000

    Allow to run tests in parallel

    Particularly useful now that we have multiple tests.

commit f43cbc0
Author: tevko <tim@devzero.io>
Date:   Wed Nov 12 15:52:48 2025 -0600

    free up space

commit 26f023c
Author: tevko <tim@devzero.io>
Date:   Wed Nov 12 15:42:21 2025 -0600

    shared, test db

commit fb6beeb
Author: tevko <tim@devzero.io>
Date:   Wed Nov 12 15:39:07 2025 -0600

    add access keys

commit ad85a95
Author: tevko <tim@devzero.io>
Date:   Wed Nov 12 15:22:00 2025 -0600

    update region

commit aa69f3f
Author: Julien Cornebise <julien@cornebise.com>
Date:   Wed Nov 12 21:09:39 2025 +0000

    Configure VScode to run pytest on the regression tests

commit 457125e
Author: Julien Cornebise <julien@cornebise.com>
Date:   Wed Nov 12 21:06:31 2025 +0000

    Run regression tests on all available datasets by default

commit 44ef75c
Author: tevko <tim@devzero.io>
Date:   Wed Nov 12 15:02:30 2025 -0600

    ensure dynamo tables created

commit 29ee422
Author: Julien Cornebise <julien@cornebise.com>
Date:   Wed Nov 12 21:01:54 2025 +0000

    Fix regression tests' logic

    The golden tests should not be generated by the tests, but by the developer once.

commit 695ede0
Author: tevko <tim@devzero.io>
Date:   Wed Nov 12 14:47:09 2025 -0600

    remove duplicate data

commit ae17516
Author: tevko <tim@devzero.io>
Date:   Wed Nov 12 12:37:03 2025 -0600

    add real data, update action

commit 1ef3c06
Author: tevko <tim@devzero.io>
Date:   Wed Nov 12 09:33:40 2025 -0600

    use pol.is baseurl -- actions

commit f1ce4aa
Author: tevko <tim@devzero.io>
Date:   Wed Nov 12 09:15:10 2025 -0600

    try more robust action -- actions

commit a5b4e6c
Author: tevko <tim@devzero.io>
Date:   Wed Nov 12 09:04:53 2025 -0600

    remove pg check again - actions

commit 966a551
Author: tevko <tim@devzero.io>
Date:   Wed Nov 12 08:57:02 2025 -0600

    change healthcheck - actions

commit 585cd2b
Author: tevko <tim@devzero.io>
Date:   Wed Nov 12 08:37:39 2025 -0600

    remove pg check

commit dd1c492
Author: Julien Cornebise <julien@cornebise.com>
Date:   Wed Nov 12 14:33:33 2025 +0000

    Improve benchmark: 3 runs, statistical test

commit 55d39a2
Author: tevko <tim@devzero.io>
Date:   Wed Nov 12 08:24:10 2025 -0600

    actions - change baseUrl

commit 9778137
Author: tevko <tim@devzero.io>
Date:   Wed Nov 12 08:07:00 2025 -0600

    actions - mount volume tests

commit 017d686
Author: Julien Cornebise <julien@cornebise.com>
Date:   Wed Nov 12 13:45:15 2025 +0000

    Refactor regression test and add basic benchmark

commit d90dd3c
Author: Julien Cornebise <julien@cornebise.com>
Date:   Wed Nov 12 09:41:30 2025 +0000

    Fix comparer and recorder to properly record and compare

    Saves PCA, clusters, etc

commit b427582
Author: tevko <tim@devzero.io>
Date:   Tue Nov 11 21:40:00 2025 -0600

    another actions fix again again again

commit 411c751
Author: tevko <tim@devzero.io>
Date:   Tue Nov 11 21:29:13 2025 -0600

    action fix again

commit d9421e1
Author: tevko <tim@devzero.io>
Date:   Tue Nov 11 21:19:04 2025 -0600

    another actions fix

commit be7bfbb
Author: tevko <tim@devzero.io>
Date:   Tue Nov 11 21:10:20 2025 -0600

    update action again

commit dac822b
Author: tevko <tim@devzero.io>
Date:   Tue Nov 11 21:09:44 2025 -0600

    add delphi service to test

commit 2b3aa6c
Author: tevko <tim@devzero.io>
Date:   Tue Nov 11 20:27:46 2025 -0600

    actions update 2

commit 483c5b4
Author: tevko <tim@devzero.io>
Date:   Tue Nov 11 20:15:37 2025 -0600

    fix action

commit 008bd22
Author: tevko <tim@devzero.io>
Date:   Tue Nov 11 20:07:15 2025 -0600

    fix all tests

commit 09edb21
Author: tevko <tim@devzero.io>
Date:   Tue Nov 11 16:30:06 2025 -0600

    use env for data script

commit b9f8f60
Author: Julien Cornebise <julien@cornebise.com>
Date:   Tue Nov 11 12:54:29 2025 +0000

    First draft of regression tests based on recorder

    The output is not yet the kind of exhaustive result I was expecting,
    so needs more work.

    Done with Claude.

commit 531280b
Author: tevko <tim@devzero.io>
Date:   Mon Nov 10 22:32:39 2025 -0600

    update action 3

commit ea8f989
Author: tevko <tim@devzero.io>
Date:   Mon Nov 10 22:24:08 2025 -0600

    update action 2

commit 779f5dd
Author: tevko <tim@devzero.io>
Date:   Mon Nov 10 22:17:10 2025 -0600

    update action

commit 359dbe3
Author: tevko <tim@devzero.io>
Date:   Mon Nov 10 22:05:53 2025 -0600

    add action

commit cb33f2d
Author: Julien Cornebise <julien@cornebise.com>
Date:   Mon Nov 10 13:16:35 2025 +0000

    Exclude Conversation serialization tests

    Until #2284 is resolved

commit 5a9d60a
Author: Julien Cornebise <julien@cornebise.com>
Date:   Mon Nov 10 10:05:42 2025 +0000

    Add assert failure messages

commit 78a27df
Author: Julien Cornebise <julien@cornebise.com>
Date:   Sun Nov 9 21:02:12 2025 +0000

    Refactor test_repness_comparison.py to proper pytest structure

    Similar to pca tests, refactor test_repness_comparison.py
    - Converts test_comparison() function to TestRepnessComparison class
    - Uses @pytest.mark.parametrize for multiple datasets
    - Proper fixtures for clojure_results, conversation, python_results
    - Two test methods: test_structural_compatibility and test_comparison_visibility
    - Replaces print() with logging.info/debug
    - Adds warning that results are known to be very different
    - Reports comparison results for visibility without asserting on match rates
    - Maintains comparison functionality for manual inspection

    Test results: 4 tests passed (2 datasets × 2 test methods)

commit bc4f9e0
Author: Julien Cornebise <julien@cornebise.com>
Date:   Sun Nov 9 19:27:07 2025 +0000

    Rename test_repness.py to test_repness_unit.py for clarity

    Rename to clarify that these are unit tests with synthetic data,
    following the same naming convention established for PCA tests:

    - test_repness.py → test_repness_unit.py (unit tests, synthetic data)
    - test_repness_smoke.py (real data, smoke tests - already renamed)
    - test_repness_comparison.py (Python vs Clojure - already clear)

    This mirrors the PCA test structure:
    - test_pca_unit.py (unit tests)
    - test_pca_edge_cases.py (edge cases)
    - test_pca_smoke.py (smoke tests)

    All 14 tests pass:
    - Statistical utility functions (z-scores, proportion tests)
    - Comment statistics calculation
    - Representative comment selection
    - Consensus selection
    - Integration tests (conv_repness, participant_stats)

commit 41355a6
Author: Julien Cornebise <julien@cornebise.com>
Date:   Sun Nov 9 19:19:56 2025 +0000

    Refactor repness smoke test

    Similar to how we refactored the "direct PCA" tests

commit c3947d1
Author: Julien Cornebise <julien@cornebise.com>
Date:   Sun Nov 9 19:04:05 2025 +0000

    Ignore warning from library ddtrace in pytest

commit 622adb4
Author: Julien Cornebise <julien@cornebise.com>
Date:   Sun Nov 9 19:01:51 2025 +0000

    Clarify the naming of PCA test files and remove redundant tests

commit 0a0b55e
Author: Julien Cornebise <julien@cornebise.com>
Date:   Sun Nov 9 18:57:28 2025 +0000

    Refactor direct_pca_test.py to test_pca_smoke.py with pytest structure

    Converted legacy procedural test script to proper pytest:
    - Class-based structure with TestPCAImplementation
    - Parametrized tests for all datasets
    - Fixtures for vote matrix loading
    - Proper logging instead of prints
    - Smoke test warning (no correctness validation)
    - Tests: runs without error, projection statistics, clustering

    Tests PCA functions directly (not through Conversation class).

commit 43593a0
Author: Julien Cornebise <julien@cornebise.com>
Date:   Sun Nov 9 18:26:20 2025 +0000

    Fix direct conversation test

    - Convert to proper pytest format, not standalone script
    - Use fixtures for setup/teardown
    - Warn that it is a test to check Conversation class instantiation and method calls
    - Replace prints by logging
    - Parametrize the test to run over all available real_data
    - Add some dimension and attributes assertions
    - Rename to test_conversation_smoke.py

commit c717c47
Author: Julien Cornebise <julien@cornebise.com>
Date:   Sun Nov 9 18:12:07 2025 +0000

    Fix buggy test that blocked pytest collection

    The `test_batch_id.py` was running code at load time, and that code had an error,
    thus crashed during pytest collection, preventing all tests from running.

    By refactoring into a proper test function, pytest can now collect all tests and run them.

    We also fix the error itself, which was a missing escape of the "scan" reserved word in DynamoDB.
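
DynamoDB escapes reserved words in expressions through `ExpressionAttributeNames` placeholders that start with `#`. The sketch below only builds the request parameters and never calls AWS. It assumes the reserved word appeared as an attribute name in a projection, which the commit message does not state; the helper name, the attribute list, and the small reserved-word subset are hypothetical.

```python
# Small subset of DynamoDB reserved words, for illustration only.
RESERVED_SUBSET = {"SCAN", "NAME", "STATUS"}


def projection_kwargs(attributes, reserved=RESERVED_SUBSET):
    """Build ProjectionExpression kwargs, aliasing reserved attribute names."""
    names = {}
    parts = []
    for attr in attributes:
        if attr.upper() in reserved:
            placeholder = f"#{attr}"
            names[placeholder] = attr  # map the placeholder to the real name
            parts.append(placeholder)
        else:
            parts.append(attr)
    kwargs = {"ProjectionExpression": ", ".join(parts)}
    if names:
        kwargs["ExpressionAttributeNames"] = names
    return kwargs


params = projection_kwargs(["batch_id", "scan"])
```

The resulting dictionary is shaped so it could be passed as keyword arguments to a table query or scan call.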

commit 84547f2
Author: Julien Cornebise <julien@cornebise.com>
Date:   Sun Nov 9 17:54:50 2025 +0000

    Clarify terms in messages and comments

commit 23d1833
Author: Julien Cornebise <julien@cornebise.com>
Date:   Sun Nov 9 17:54:30 2025 +0000

    Fix path...

commit 6adbd51
Merge: d560fe6 b8df940
Author: Julien Cornebise <julien@cornebise.com>
Date:   Sun Nov 9 11:24:27 2025 +0000

    Merge branch 'edge' into replace_named_matrix

commit d560fe6
Merge: be3d50e c5ec899
Author: Julien Cornebise <julien@cornebise.com>
Date:   Sat Nov 8 11:45:13 2025 +0000

    Merge remote-tracking branch 'upstream/edge' into replace_named_matrix

commit be3d50e
Author: Julien Cornebise <julien@cornebise.com>
Date:   Sat Nov 8 11:44:22 2025 +0000

    Print whether comment priorities are missing from test data

commit d7970d8
Author: Julien Cornebise <julien@cornebise.com>
Date:   Fri Nov 7 12:38:16 2025 +0000

    Refactor real_data loading

    Remove duplication, allow for automatic finding of the files within a location,
    allow for generalisation to other conversations than the two used so far.

commit f5ac669
Author: Julien Cornebise <julien@cornebise.com>
Date:   Fri Nov 7 10:55:55 2025 +0000

    Create script to download real data for tests

    This is useful if no folder `real data` was provided. I suspect these tests were
    written with a `real data` folder already in place. I do not have it, therefore
    we need to download it. See the `README` file that has been updated.

commit 23cced0
Author: Julien Cornebise <julien@cornebise.com>
Date:   Thu Nov 6 13:51:13 2025 +0000

    Extract common function to utils file

    That function was defined 3 times in 3 different files.

commit 3c6e788
Author: Julien Cornebise <julien@cornebise.com>
Date:   Mon Nov 3 17:57:21 2025 +0000

    Add type hint in some poller functions

* Fix test for malformed votes

Malformed votes should be ignored.

* Clean up unused variables and imports

Address GitHub Copilot review comments:
- Log superseded votes count in conversation.py instead of leaving unused
- Remove unused p1_idx/p2_idx index lookups in corr.py
- Remove unused all_passed variable in regression_comparer.py
- Remove unused imports (numpy, Path, List, datetime, stats, pca/cluster functions)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

---------

Co-authored-by: Claude <noreply@anthropic.com>
* Only run python-ci for delphi changes; minimize output

* address PR feedback
This reverts commit 51665ab, reversing
changes made to 3901ee5.
* add narrative pipeline test

* change filename

* slight mocking adjustment

* mock sentence transformer

* better evoc

* try massaging mock data again

* more mocking

* diff mock strategy

* fix cov report

* test 500 gen embed

* syntax fixes

* update syntax again

* syntax fix again

* attempt mock fix

* another mock attempt

* fix action

* fix action again

* actions fix

* add another test

* add another test

* fix test
* add client-visualization submodule

* add pca visualization to alpha client

* show user in the data viz

* fetch and animate new pca data

* remove gitmodule

* use concaveman lib; update package.json; use gray color; only show when vis_type is set

* reset selected statement when group changes

* update astro types

* include remaining comment count
Bumps [js-yaml](https://github.com/nodeca/js-yaml) from 4.1.0 to 4.1.1.
- [Changelog](https://github.com/nodeca/js-yaml/blob/master/CHANGELOG.md)
- [Commits](nodeca/js-yaml@4.1.0...4.1.1)

---
updated-dependencies:
- dependency-name: js-yaml
  dependency-version: 4.1.1
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [js-yaml](https://github.com/nodeca/js-yaml) from 3.14.1 to 3.14.2.
- [Changelog](https://github.com/nodeca/js-yaml/blob/master/CHANGELOG.md)
- [Commits](nodeca/js-yaml@3.14.1...3.14.2)

---
updated-dependencies:
- dependency-name: js-yaml
  dependency-version: 3.14.2
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [glob](https://github.com/isaacs/node-glob) from 10.3.16 to 10.5.0.
- [Changelog](https://github.com/isaacs/node-glob/blob/main/changelog.md)
- [Commits](isaacs/node-glob@v10.3.16...v10.5.0)

---
updated-dependencies:
- dependency-name: glob
  dependency-version: 10.5.0
  dependency-type: direct:development
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps two versions of [js-yaml](https://github.com/nodeca/js-yaml). These dependencies needed to be updated together.

Updates `js-yaml` from 4.1.0 to 4.1.1
- [Changelog](https://github.com/nodeca/js-yaml/blob/master/CHANGELOG.md)
- [Commits](nodeca/js-yaml@4.1.0...4.1.1)

Updates `js-yaml` from 3.14.1 to 3.14.2
- [Changelog](https://github.com/nodeca/js-yaml/blob/master/CHANGELOG.md)
- [Commits](nodeca/js-yaml@3.14.1...3.14.2)

---
updated-dependencies:
- dependency-name: js-yaml
  dependency-version: 4.1.1
  dependency-type: indirect
- dependency-name: js-yaml
  dependency-version: 3.14.2
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* Optimize update_votes with vectorized pivot_table (5x speedup)

Replace the row-by-row for-loop in update_votes with a vectorized
pivot_table approach. This dramatically speeds up vote loading for
large datasets.

Performance on bg2050 dataset (1M+ votes, 7.8k participants, 7.7k comments):
- Before: 18.5s average, 56k votes/sec
- After:  3.5s average, 295k votes/sec
- Speedup: 5.3x overall, 16x for the batch update step

The optimization:
1. Use pivot_table to reshape long-form votes to wide-form matrix
2. Use DataFrame.where() to merge with existing matrix
3. Use float32 for intermediate matrix to halve memory usage
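
The first two steps can be sketched as follows. This is an illustration of the technique, not the code in `update_votes`: the column names `pid`, `tid`, and `vote`, the sample data, and the choice of `aggfunc="last"` are assumptions, and the float32 step is left out.

```python
import numpy as np
import pandas as pd

# Existing wide matrix: rows are participants, columns are comments,
# NaN means "has not voted".
existing = pd.DataFrame(
    [[1.0, np.nan], [np.nan, -1.0]],
    index=pd.Index([10, 11], name="pid"),
    columns=pd.Index([100, 101], name="tid"),
)

# New votes in long form, assumed to be in chronological order.
# (11, 101) overrides an existing vote; pid 12 and tid 102 are new.
new_votes = pd.DataFrame({
    "pid": [10, 11, 12, 12],
    "tid": [101, 101, 100, 102],
    "vote": [1.0, 1.0, -1.0, 0.0],
})

# Step 1: long -> wide. "last" keeps the latest vote per (pid, tid) pair.
update = new_votes.pivot_table(
    index="pid", columns="tid", values="vote", aggfunc="last"
)

# Align both matrices on the union of participants and comments.
rows = existing.index.union(update.index)
cols = existing.columns.union(update.columns)
existing_full = existing.reindex(index=rows, columns=cols)
update_full = update.reindex(index=rows, columns=cols)

# Step 2: take the new vote where there is one, else keep the old value.
merged = update_full.where(update_full.notna(), existing_full)
print(merged)
```

Using `notna()` as the mask rather than a truthiness check keeps a pass vote of `0.0` distinct from a missing vote.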

Also adds a benchmark script at polismath/benchmarks/bench_update_votes.py
for measuring update_votes performance.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* Vectorize _compute_vote_stats and make benchmark standalone

- _compute_vote_stats: Replace per-row/per-column loops with numpy
  vectorized operations using boolean masks and axis-based sums.
  This eliminates O(rows + cols) Python loops.

- bench_update_votes.py: Make standalone by accepting CSV path directly
  instead of depending on datasets package. Add TODO for using datasets
  package once PR #2312 is merged.

Combined with pivot_table optimization, achieves ~10x speedup on bg2050
dataset (1M votes): 18-30s -> 2.5s (~400k votes/sec).
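
As a rough illustration of the boolean-mask approach described for `_compute_vote_stats`, the sketch below computes a few counts without any Python-level loop. The statistic names and the sample matrix are assumptions; the real function may report different fields.

```python
import numpy as np

# Rows are participants, columns are comments; NaN means no vote.
votes = np.array([
    [1.0, -1.0, np.nan],
    [1.0, np.nan, 0.0],
    [-1.0, 1.0, 1.0],
])

# Comparisons against NaN are False, so missing votes drop out of the masks.
agree = votes == 1
disagree = votes == -1
voted = ~np.isnan(votes)

stats = {
    "n_votes": int(voted.sum()),
    "n_agree": int(agree.sum()),
    "n_disagree": int(disagree.sum()),
    # axis=1 sums across comments, giving one count per participant.
    "votes_per_participant": voted.sum(axis=1),
    # axis=0 sums across participants, giving one count per comment.
    "agree_per_comment": agree.sum(axis=0),
    "disagree_per_comment": disagree.sum(axis=0),
}
```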

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* Fix: Remove misleading float32 conversion in update_votes

Addresses GitHub Copilot review comments on PR #2313:
- Removed float32 conversion that only provided temporary memory savings
- The comment was misleading as savings didn't persist after .where()

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* Fix: Use vectorized pandas operations in benchmark loader

Replace iterrows() with rename() + to_dict('records') for efficiency,
as suggested by GitHub Copilot review.
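
For comparison, a small sketch of the two loading styles. The CSV column names (`voter-id`, `comment-id`) and the target keys (`pid`, `tid`) are assumptions made for the example.

```python
import pandas as pd

raw = pd.DataFrame({
    "voter-id": [10, 11],
    "comment-id": [100, 101],
    "vote": [1, -1],
})

# Row-by-row version: builds a Series per row, then a dict per row.
slow = [
    {"pid": row["voter-id"], "tid": row["comment-id"], "vote": row["vote"]}
    for _, row in raw.iterrows()
]

# Vectorized version: rename the columns once, then convert in one call.
fast = raw.rename(
    columns={"voter-id": "pid", "comment-id": "tid"}
).to_dict("records")
```

Both produce the same list of records; the second avoids constructing a `Series` for every row.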

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* Add timing logging for PCA and repness

* Add benchmark script for repness

* Add profiling to benchmark for repness

* Vectorize vote count: 2x speedup on large convos

* Extract common setup code

* Rename vote_matrix to vote_matrix_df for clarity

* Keep NaNs instead of None: 2x more speedup

* Refactor conv_repness() to use long-format DataFrame

Convert wide-format vote matrix to long-format using melt() and use
vectorized pandas groupby operations instead of nested loops.

Key changes:
- Add compute_group_comment_stats_df() for vectorized (group, comment) stats
- Add prop_test_vectorized() and two_prop_test_vectorized() for batch z-tests
- Add select_rep_comments_df() and select_consensus_comments_df() for
  DataFrame-native selection, converting to dicts only at the end
- Compute "other" stats as total - group instead of recalculating
- Use MultiIndex.from_product() to ensure all (group, comment) combinations
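
A minimal sketch of the long-format idea, assuming a tiny vote matrix and a hypothetical participant-to-group mapping. It is not the body of `compute_group_comment_stats_df()`; it only shows `melt()`, a grouped sum, the `MultiIndex.from_product()` fill, and "other = total - group".

```python
import numpy as np
import pandas as pd

vote_matrix_df = pd.DataFrame(
    [[1.0, -1.0], [1.0, np.nan], [np.nan, 1.0], [np.nan, 1.0]],
    index=pd.Index([10, 11, 12, 13], name="pid"),
    columns=pd.Index([100, 101], name="tid"),
)
# Hypothetical clustering result: participant id -> group id.
group_of = pd.Series({10: 0, 11: 0, 12: 1, 13: 1})

# Wide -> long: one row per cast vote; missing votes are dropped.
long_df = (
    vote_matrix_df.melt(ignore_index=False, var_name="tid", value_name="vote")
    .reset_index()
    .dropna(subset=["vote"])
)
long_df["group"] = long_df["pid"].map(group_of)
long_df["agree"] = (long_df["vote"] == 1).astype(int)
long_df["disagree"] = (long_df["vote"] == -1).astype(int)
long_df["seen"] = 1

cols = ["agree", "disagree", "seen"]

# Every (group, comment) pair gets a row, even if the group never voted on it.
full_index = pd.MultiIndex.from_product(
    [sorted(group_of.unique()), vote_matrix_df.columns],
    names=["group", "tid"],
)
group_stats = (
    long_df.groupby(["group", "tid"])[cols].sum()
    .reindex(full_index, fill_value=0)
)

# Per-comment totals, repeated once per group so the indexes line up.
total_stats = long_df.groupby("tid")[cols].sum()
tids = group_stats.index.get_level_values("tid")
total_per_pair = total_stats.reindex(tids, fill_value=0)
total_per_pair.index = group_stats.index

# "Other" stats fall out of a subtraction instead of a second pass.
other_stats = total_per_pair - group_stats
```

Here group 1 never voted on comment 100, so its row exists only because of the `from_product` reindex, and its "other" counts equal the comment totals.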

Test changes:
- Add test_old_format_repness.py to preserve backwards compatibility tests
- Add TestVectorizedFunctions class with 8 tests for new DataFrame interface

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* Shorten imports as per GH Copilot Review

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update docstring as per GH Copilot Review

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Remove unused import as per GH Copilot Review

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Move profiler to within profiling function as per GH Copilot review

* Remove unused import as per GH Copilot review

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Profile new functions

---------

Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* remove express from oidc-simulator; update other libs

* pin auth0-simulator to 0.10.2

* e2e lib updates

* client-admin lib updates

* fix delphi dockerfile -- torch versions for cpu
Bumps two versions of [js-yaml](https://github.com/nodeca/js-yaml). These dependencies needed to be updated together.

Updates `js-yaml` from 4.1.0 to 4.1.1
- [Changelog](https://github.com/nodeca/js-yaml/blob/master/CHANGELOG.md)
- [Commits](nodeca/js-yaml@4.1.0...4.1.1)

Updates `js-yaml` from 3.14.1 to 3.14.2
- [Changelog](https://github.com/nodeca/js-yaml/blob/master/CHANGELOG.md)
- [Commits](nodeca/js-yaml@3.14.1...3.14.2)

---
updated-dependencies:
- dependency-name: js-yaml
  dependency-version: 4.1.1
  dependency-type: indirect
- dependency-name: js-yaml
  dependency-version: 3.14.2
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [nodemailer](https://github.com/nodemailer/nodemailer) from 7.0.7 to 7.0.11.
- [Release notes](https://github.com/nodemailer/nodemailer/releases)
- [Changelog](https://github.com/nodemailer/nodemailer/blob/master/CHANGELOG.md)
- [Commits](nodemailer/nodemailer@v7.0.7...v7.0.11)

---
updated-dependencies:
- dependency-name: nodemailer
  dependency-version: 7.0.11
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [node-forge](https://github.com/digitalbazaar/forge) from 1.3.1 to 1.3.3.
- [Changelog](https://github.com/digitalbazaar/forge/blob/main/CHANGELOG.md)
- [Commits](digitalbazaar/forge@v1.3.1...v1.3.3)

---
updated-dependencies:
- dependency-name: node-forge
  dependency-version: 1.3.3
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps two versions of [jws](https://github.com/brianloveswords/node-jws). These dependencies needed to be updated together.

Updates `jws` from 4.0.0 to 4.0.1
- [Release notes](https://github.com/brianloveswords/node-jws/releases)
- [Changelog](https://github.com/auth0/node-jws/blob/master/CHANGELOG.md)
- [Commits](auth0/node-jws@v4.0.0...v4.0.1)

Updates `jws` from 3.2.2 to 3.2.3
- [Release notes](https://github.com/brianloveswords/node-jws/releases)
- [Changelog](https://github.com/auth0/node-jws/blob/master/CHANGELOG.md)
- [Commits](auth0/node-jws@v3.2.2...v3.2.3)

---
updated-dependencies:
- dependency-name: jws
  dependency-version: 4.0.1
  dependency-type: indirect
- dependency-name: jws
  dependency-version: 3.2.3
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
@tevko tevko marked this pull request as ready for review December 6, 2025 16:59
@tevko tevko merged commit a6d7215 into stable Dec 6, 2025
3 checks passed
6 participants