PROD DEPLOY 12/6 #2326

tevko · 2025-12-06T16:59:03Z

* add db scaling, install datadog
* add to example env
* dd instrumentation

* expand on object properties for LLM
* prompt hardening

* remove deprecated conversation fields
* add GET all_conversations route
* superadmin all-conversations view
* Participant Management WIP
* refactor xid logic; show xid list with pids in client-admin
* new xid tests
* Enable XID Upload
* show xid vote_count
* block non-xid participants when xid is required
* update some internal naming from "whitelist" to "allow list"
* xid arg not needed in votesPost
* fix test
* participation-management e2e
* upgrade cypress
* fix e2e test
* update alpha client with xid concerns
* normalize message; fix test
* rebuild astro

* update pip-tools; remove pip version restriction; update requirements.lock
* simplify Dockerfile; remove unused `IS_GITHUB_ACTION` conditional
* update cypress config to not use `IS_GITHUB_ACTION`
* conditionally use cpu-only torch libs in test builds

add more tests

* Fix run_math_pipeline test import to use proper package path

  The test file was importing `from run_math_pipeline import main`, which failed locally because `run_math_pipeline.py` lives inside the `polismath` package at `delphi/polismath/run_math_pipeline.py`. CI was working around this by copying the file to a flat location:

  `docker cp delphi/polismath/run_math_pipeline.py delphi:/app/run_math_pipeline.py`

  This created a discrepancy between local and CI environments. The fix:

  1. Update test imports to use the correct package path: `from polismath.run_math_pipeline import main`
  2. Update mock.patch paths to match: `mock.patch('polismath.run_math_pipeline.fetch_comments', ...)`
  3. Remove the CI workaround that copied the file to /app flat
  4. Simplify coverage to `--cov=polismath` (run_math_pipeline is inside it)

  The Docker image already has `polismath/` at `/app/polismath/` and the package is installed via `pip install --no-deps .`, so the proper import path works in both local and CI environments.

  🤖 Generated with [Claude Code](https://claude.com/claude-code)

  Co-Authored-By: Claude <noreply@anthropic.com>

* Improve CI coverage reporting reliability

  Changes to the CI workflow:

  1. Print coverage report to workflow logs (always visible)
  2. Upload coverage report as downloadable artifact
  3. Make PR comment step non-fatal with continue-on-error: true (fork PRs cannot post comments due to GitHub token restrictions)

  Coverage is now accessible three ways:

  - In the workflow logs (step 7)
  - As a downloadable artifact (step 8)
  - As a PR comment when permissions allow (step 9)

  🤖 Generated with [Claude Code](https://claude.com/claude-code)

  Co-Authored-By: Claude <noreply@anthropic.com>

* Add graceful error handling for coverage comment on fork PRs

  Instead of showing an unhandled error when posting coverage comments fails on fork PRs, the script now catches the 403 error and displays a helpful message explaining:

  - Why the comment could not be posted (GitHub token permissions)
  - Where to find the coverage report (logs and artifact)

  🤖 Generated with [Claude Code](https://claude.com/claude-code)

  Co-Authored-By: Claude <noreply@anthropic.com>

---------

Co-authored-by: Claude <noreply@anthropic.com>
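
For reviewers, here is a minimal sketch of why the `mock.patch` target has to follow the import path. It does not use the real `polismath` package: `demo_polismath` and the bodies of `fetch_comments` and `main` are hypothetical stand-ins built in memory, so the snippet runs anywhere. The point it illustrates is that `mock.patch` replaces the name in the module where `main` looks it up, so the dotted path in the patch must match the package path used by the import.

```python
import sys
import types
from unittest import mock

# Hypothetical stand-in for a packaged module. The real code lives in
# polismath/run_math_pipeline.py and is NOT reproduced here.
MODULE_SOURCE = '''
def fetch_comments():
    raise RuntimeError("would hit the database")

def main():
    # Looks up fetch_comments in this module's globals at call time.
    return len(fetch_comments())
'''

# Register an in-memory package "demo_polismath" with one submodule.
package = types.ModuleType("demo_polismath")
package.__path__ = []  # marks the stand-in as a package
module = types.ModuleType("demo_polismath.run_math_pipeline")
exec(MODULE_SOURCE, module.__dict__)
package.run_math_pipeline = module
sys.modules["demo_polismath"] = package
sys.modules["demo_polismath.run_math_pipeline"] = module

# Import through the package path, as the fixed tests do.
from demo_polismath.run_math_pipeline import main

# Patch the name where it is looked up: the packaged module path.
with mock.patch(
    "demo_polismath.run_math_pipeline.fetch_comments",
    return_value=["c1", "c2"],
):
    patched_result = main()
```

Outside the `with` block the original `fetch_comments` is restored, so calling `main()` again would raise the stand-in `RuntimeError`.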

* Merge Squashed onto `edge`: commit 7f14aed Merge: 93a2d31 780f129 Author: Julien Cornebise <julien@cornebise.com> Date: Thu Nov 20 15:50:04 2025 +0000 Merge commit '780f1298ca7d72b9717f6aa38526301305e520e8' into replace_named_matrix This will allow CI to run correctly. commit 93a2d31 Author: Julien Cornebise <julien@cornebise.com> Date: Tue Nov 18 21:10:15 2025 +0000 Recompile requirements.lock to include natsort commit 0fd3734 Author: Julien Cornebise <julien@cornebise.com> Date: Tue Nov 18 15:01:16 2025 +0000 Update golden records Now that we have changed behaviours of matrix in terms of ordering and of types, we need to update the golden records to reflect these changes. commit 08d2383 Author: Julien Cornebise <julien@cornebise.com> Date: Tue Nov 18 15:01:04 2025 +0000 Fix regression bugs from package reorganization due to hallucinations During refactoring to polismath.regression package, introduced bugs by hallucinating non-existent methods and changing behavior without checking the original code (commit afb8525). 
Fixed: - prepare_votes_data(): Restored CSV columns ('voter-id', 'comment-id') and vote dict keys ('pid', 'tid') instead of hallucinated alternatives - compute_all_stages(): Restored actual methods (update_votes(), _compute_pca(), _compute_clusters()) instead of hallucinated ones (process_votes(), compute_pca(), compute_clustering()) - compute_all_stages_with_benchmark(): Restored original implementation - get_dataset_files(): Restored original dict keys ('votes', 'comments') instead of changed keys ('votes_csv', 'comments_csv') - load_golden_snapshot(): Restored golden_path computation logic - Numpy type handling: Added custom JSON encoder to preserve numeric types and extended comparer to treat Python/numpy numeric types as compatible commit 334c01b Author: Julien Cornebise <julien@cornebise.com> Date: Tue Nov 18 14:11:31 2025 +0000 Reorganize regression testing into dedicated polismath.regression package - Split monolithic regression.py (1167 lines) into focused modules: - recorder.py: ConversationRecorder class - comparer.py: ConversationComparer class - datasets.py: Dataset configuration (moved from tests/) - utils.py: Shared utility functions - Clean architecture: No backwards dependencies from production to tests - Updated all imports in CLI scripts and test files - Regression testing now treated as first-class production feature This improves code organization, maintainability, and makes the regression tools suitable for use in production environments (monitoring, validation). 
🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> commit afb8525 Author: Julien Cornebise <julien@cornebise.com> Date: Tue Nov 18 13:47:11 2025 +0000 Improve logging throughout regression testing system - Replace all print statements with proper logging calls in polismath/regression.py - Use logger.info() for progress updates and results - Use logger.warning() for comparison mismatches - Use logger.debug() for detailed diagnostic information - Make PCA debug output conditional on DEBUG log level - Only save debug JSON files when logger.isEnabledFor(logging.DEBUG) - Move debug outputs from current directory to .test_outputs/debug/ - Add --log-level CLI argument to regression scripts - Support DEBUG, INFO, WARNING, ERROR, CRITICAL levels - Default to INFO level - DEBUG level enables PCA debug file generation - Fix conversation module's logging initialization - Check logging.root.handlers instead of logger.handlers - Prevents duplicate handlers when logging is externally configured - Simplifies logging setup in CLI scripts The regression tools now provide full control over logging verbosity, making it easier to debug issues (with DEBUG) or run quietly (with WARNING/ERROR). 
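
A rough sketch of the logging behaviour described in commit afb8525, under stated assumptions: the function names `ensure_logging_configured` and `parse_args`, the log format, and the logger name are hypothetical, and the real CLI scripts may be wired differently. It shows the two ideas from the message: configure logging only when `logging.root.handlers` is empty, and expose the level through a `--log-level` argument.

```python
import argparse
import logging

LOG_LEVELS = ["DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL"]


def ensure_logging_configured(level: str = "INFO") -> bool:
    """Install a root handler only if nothing has configured logging yet.

    Checking the root logger (not a module logger) avoids adding a second
    handler when the application or test runner already set logging up.
    """
    if logging.root.handlers:
        return False
    logging.basicConfig(
        level=getattr(logging, level),
        format="%(levelname)s %(name)s: %(message)s",
    )
    return True


def parse_args(argv):
    # Hypothetical CLI; only the --log-level flag comes from the commit message.
    parser = argparse.ArgumentParser(description="hypothetical regression CLI")
    parser.add_argument("--log-level", choices=LOG_LEVELS, default="INFO")
    return parser.parse_args(argv)


args = parse_args(["--log-level", "DEBUG"])
ensure_logging_configured(args.log_level)
handlers_after_first = len(logging.root.handlers)

# A second call is a no-op, so no duplicate handler is added.
configured_again = ensure_logging_configured(args.log_level)
handlers_after_second = len(logging.root.handlers)

logger = logging.getLogger("demo.regression")
if logger.isEnabledFor(logging.DEBUG):
    # Expensive debug-only output (e.g. dumping JSON files) would go here.
    logger.debug("debug artifacts would be written here")
```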
🤖 Generated with Claude Code Co-Authored-By: Claude <noreply@anthropic.com> commit 87f8cb2 Author: Julien Cornebise <julien@cornebise.com> Date: Tue Nov 18 13:14:10 2025 +0000 Reorganize regression tests and consolidate test outputs - Move golden snapshots to dataset folders (real_data/{dataset}/golden_snapshot.json) - Relocate regression library from regression_tests/ to polismath/regression.py - Move CLI tools to scripts/ with clearer names (regression_recorder.py, regression_comparer.py) - Mark Clojure comparison tests as legacy with 'legacy_' prefix - Consolidate ALL test outputs in hidden .test_outputs/ directory: - Regression outputs → .test_outputs/regression/ - Python implementation outputs → .test_outputs/python_output/{dataset}/ - Keep real_data/ clean with only source data and golden snapshots - Fix path resolution bugs and unknown dataset handling in regression system - Update documentation and simplify .gitignore This reorganization clearly separates: - Source data and golden snapshots (real_data/) from temporary outputs (.test_outputs/) - Standard Python regression tests from legacy Clojure comparisons - Core libraries (polismath/) from CLI tools (scripts/) commit a947c5a Author: Julien Cornebise <julien@cornebise.com> Date: Tue Nov 18 12:08:03 2025 +0000 Process appropriate RunTimeWarning in correlation tests The fourth row of the test matrix is intentionally constant, which causes a RuntimeWarning when computing correlations. This commit updates the test to properly handle this warning using the warnings module, ensuring that the test suite runs cleanly without unhandled warnings. commit b6fbc09 Author: Julien Cornebise <julien@cornebise.com> Date: Tue Nov 18 11:56:16 2025 +0000 Skip failing Clojure regression tests It's OK for now, as we want Delphi to stand on its own.
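
To illustrate the RuntimeWarning mentioned in commit a947c5a above, here is a small sketch. The matrix values are made up for illustration and `np.corrcoef` is an assumption about how correlations are computed; the actual test and correlation code are not shown in this PR. A constant row has zero variance, so its correlations are 0/0, which NumPy reports as a `RuntimeWarning` and fills with NaN.

```python
import warnings

import numpy as np

# Illustrative data only: the fourth row is constant on purpose.
matrix = np.array([
    [1.0, -1.0, 1.0, -1.0],
    [1.0, 1.0, -1.0, -1.0],
    [-1.0, 1.0, 1.0, -1.0],
    [1.0, 1.0, 1.0, 1.0],  # zero variance
])

# Capture warnings explicitly instead of letting them leak into test output.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    corr = np.corrcoef(matrix)

runtime_warnings = [w for w in caught if issubclass(w.category, RuntimeWarning)]

# Correlations involving the constant row are undefined (NaN);
# the other rows are unaffected.
constant_row_is_nan = bool(np.isnan(corr[3]).all())
```

In a pytest suite the same expectation could be written with `pytest.warns(RuntimeWarning)`, which also fails the test if the warning stops being raised.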
commit d8cb942 Author: Julien Cornebise <julien@cornebise.com> Date: Tue Nov 18 11:49:45 2025 +0000 Remove hardcoded paths fed to Claude commit 8dca87d Author: Julien Cornebise <julien@cornebise.com> Date: Tue Nov 18 11:48:03 2025 +0000 Factorize the clojure comparison and pipeline tests A lot of code was redundant and there was little separation of purpose between the clojure comparison logic and the pipeline tests. This change factorizes the clojure comparison logic into its own module and simplifies the pipeline tests. commit 4622440 Author: Julien Cornebise <julien@cornebise.com> Date: Tue Nov 18 11:07:57 2025 +0000 Fix output of full pipeline test commit a274b8a Author: Julien Cornebise <julien@cornebise.com> Date: Tue Nov 18 11:05:11 2025 +0000 Refactor comparison to Clojure results commit b94a6c1 Author: Julien Cornebise <julien@cornebise.com> Date: Tue Nov 18 09:47:07 2025 +0000 Preserve original data types and use natural sorting. Makes for a much clearer output. Will need to update the golden record. All tests passing.
commit 7c6412b Author: Julien Cornebise <julien@cornebise.com> Date: Tue Nov 18 09:39:30 2025 +0000 Add test for natural sorting order before implementing commit e06f0eb Author: Julien Cornebise <julien@cornebise.com> Date: Tue Nov 18 09:14:17 2025 +0000 Match old sorting and converting behaviour commit e5f47cd Author: Julien Cornebise <julien@cornebise.com> Date: Mon Nov 17 15:31:59 2025 +0000 Comment out BG2018 report for tests commit cdc238c Author: Julien Cornebise <julien@cornebise.com> Date: Mon Nov 17 15:05:00 2025 +0000 Remove every mention of NamedMatrix commit dacd95a Author: Julien Cornebise <julien@cornebise.com> Date: Mon Nov 17 14:49:18 2025 +0000 Restrict pytest regression test to VW dataset only for speed commit b657c87 Author: Julien Cornebise <julien@cornebise.com> Date: Mon Nov 17 14:43:41 2025 +0000 Vectorize matrix clean-up commit 4bb11b5 Author: Julien Cornebise <julien@cornebise.com> Date: Mon Nov 17 14:39:00 2025 +0000 Fix bug in PCA that caused different results Found the bug! (With Claude Code's help) The PCA code starts by "cleaning" the matrix with some replacement rules for NaN and strings. Then it proceeds to compute the PCA on that cleaned up matrix. Great, I've done the cleaning, and done it in-place for efficiency, since the matrix is cleaned up first thing in the code and the unclean one therefore not used. Right? ... RIGHT?? *It turns out*, hidden way below, the projection of the participants on the low-dimensional space is (intentionally) done *on the non-cleaned matrix*!! (TODO: I'll have to put my math thinking cap on to understand exactly why it was coded like that...) Adding "copy=True" in one built-in invocation solved it. This version here also restored the loop-in-loop cleanup code. My next commit will clean it up.
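
The in-place cleaning pitfall described in commit 4bb11b5 can be sketched as below. This is an illustration only: the commit does not say which built-in call received `copy=True`, so `np.nan_to_num` is used here as an assumed example of a NumPy function with a `copy` parameter, and `clean_in_place` / `clean_copy` are hypothetical helper names. The sketch shows that cleaning in place silently changes the matrix that later code still expects to contain NaNs.

```python
import numpy as np


def clean_in_place(votes: np.ndarray) -> np.ndarray:
    # copy=False overwrites NaNs inside the caller's array.
    return np.nan_to_num(votes, copy=False, nan=0.0)


def clean_copy(votes: np.ndarray) -> np.ndarray:
    # copy=True leaves the caller's array untouched.
    return np.nan_to_num(votes, copy=True, nan=0.0)


raw = np.array([[1.0, np.nan], [np.nan, -1.0]])

# Safe variant: the raw matrix keeps its NaNs for later steps
# (e.g. projecting participants from the non-cleaned matrix).
cleaned = clean_copy(raw)
raw_still_has_nan = bool(np.isnan(raw).any())

# Buggy variant: the "raw" matrix is no longer raw afterwards.
raw_for_bug = raw.copy()
clean_in_place(raw_for_bug)
bugged_raw_has_nan = bool(np.isnan(raw_for_bug).any())
```

Any later computation that reads the original matrix gets different inputs in the two variants, which is consistent with the differing PCA results the commit describes.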
commit 8da68f3 Author: Julien Cornebise <julien@cornebise.com> Date: Mon Nov 17 13:35:27 2025 +0000 Try but fail to mimic the old handling of strings and NaNs commit 0765357 Author: Julien Cornebise <julien@cornebise.com> Date: Mon Nov 17 13:32:15 2025 +0000 Add a sanity check test for matrix cleaning functions Compare old and new way of doing things, to spot differences. commit b5ca831 Author: Julien Cornebise <julien@cornebise.com> Date: Mon Nov 17 10:19:51 2025 +0000 Print differences in regular order Set operations are unordered... commit 762dcaa Author: Julien Cornebise <julien@cornebise.com> Date: Mon Nov 17 10:07:27 2025 +0000 Order lexicographically (by str) upon moderation commit dde8d6f Author: Julien Cornebise <julien@cornebise.com> Date: Mon Nov 17 10:07:05 2025 +0000 Store actual computation results commit e9a231a Author: Julien Cornebise <julien@cornebise.com> Date: Mon Nov 17 10:02:09 2025 +0000 Test ordering to match pre-NamedMatrixectomy ordering commit fd592f7 Author: Julien Cornebise <julien@cornebise.com> Date: Mon Nov 17 09:28:40 2025 +0000 Save computed JSON for outside comparison Also create a symlink to the latest, for ease of opening without having to read timestamps. commit 3392f44 Author: Julien Cornebise <julien@cornebise.com> Date: Fri Nov 14 15:22:06 2025 +0000 Sort comment ids and participants Sort the comment ids and participant ids using natsort to ensure consistent ordering. Not sure why things need to be ordered, but it is probably less surprising this way. As a bonus, our indices can now be any type instead of being force-converted to strings.
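
The difference between lexicographic and natural ordering in commit 3392f44 can be seen in a short sketch. The commit uses the `natsort` library; to keep this snippet dependency-free it swaps in a small standard-library key function (`natural_key`, a hypothetical helper) that approximates the same idea for simple ids. It is not a replacement for `natsort` on more complex inputs.

```python
import re


def natural_key(value):
    """Split an id into text and integer chunks so that 9 sorts before 10."""
    parts = re.split(r"(\d+)", str(value))
    # Chunks alternate text / digits, so positions always compare like with like.
    return [int(p) if p.isdigit() else p for p in parts]


string_ids = ["10", "9", "2", "100", "1"]

# Plain string sorting compares character by character.
lexicographic = sorted(string_ids)

# Natural sorting compares the numeric chunks as numbers.
natural = sorted(string_ids, key=natural_key)

# The key only affects ordering; ids keep their original type.
int_ids = sorted([10, 9, 2], key=natural_key)
```

Here `lexicographic` comes out as `['1', '10', '100', '2', '9']`, while `natural` is `['1', '2', '9', '10', '100']`.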
commit 27b4cca Author: Julien Cornebise <julien@cornebise.com> Date: Fri Nov 14 13:52:21 2025 +0000 Remove python output that is dynamically generated during tests commit c056ef6 Author: Julien Cornebise <julien@cornebise.com> Date: Fri Nov 14 13:49:09 2025 +0000 Remove duplicate files commit 81b3d9d Author: Julien Cornebise <julien@cornebise.com> Date: Fri Nov 14 13:46:33 2025 +0000 Rename folder to new name commit 7c2cc3f Author: Julien Cornebise <julien@cornebise.com> Date: Fri Nov 14 13:42:51 2025 +0000 Fix trailing comma commit 020ae42 Author: Julien Cornebise <julien@cornebise.com> Date: Fri Nov 14 13:39:53 2025 +0000 Correct spaces to avoid false positives in git diffs. commit 3866da5 Merge: 9bbdc49 2081ed8 Author: Julien Cornebise <julien@cornebise.com> Date: Fri Nov 14 12:26:08 2025 +0000 Merge remote-tracking branch 'upstream/edge' into replace_named_matrix A lot of merge conflicts due to this branch having merged changes earlier that were merge-squashed into upstream/edge since then. commit 9bbdc49 Author: Julien Cornebise <julien@cornebise.com> Date: Thu Nov 13 20:06:08 2025 +0000 Pass all unit tests without NamedMatrix commit 3fbeee6 Author: Julien Cornebise <julien@cornebise.com> Date: Thu Nov 13 20:03:45 2025 +0000 Remove python output This python output is overwritten each time the tests are run, and should not be committed. commit 1392b65 Author: Julien Cornebise <julien@cornebise.com> Date: Thu Nov 13 19:46:56 2025 +0000 Pass correlation tests without NamedMatrix commit 6b47f00 Author: Julien Cornebise <julien@cornebise.com> Date: Thu Nov 13 19:41:39 2025 +0000 Pass Clustering tests without NamedMatrix :) commit dbe197b Author: Julien Cornebise <julien@cornebise.com> Date: Thu Nov 13 19:38:12 2025 +0000 Pass all PCA unit tests PCA now works without NamedMatrix ! commit 13b8395 Author: Julien Cornebise <julien@cornebise.com> Date: Thu Nov 13 19:18:24 2025 +0000 Replace NamedMatrix by DF in corr. 
clusters, and repness This passes test_conversation.py ! commit 2d1a6f7 Author: Julien Cornebise <julien@cornebise.com> Date: Thu Nov 13 19:02:01 2025 +0000 Revert "Skip a warning generated by boto3 about datetime.utcnow being deprecated" This reverts commit 80be8bc. commit 11f18b2 Author: Julien Cornebise <julien@cornebise.com> Date: Thu Nov 13 19:01:11 2025 +0000 Replace NamedMatrix by DF in conversation.recompute() This means also applying to pca and clustering! commit 7ac7b25 Author: Julien Cornebise <julien@cornebise.com> Date: Thu Nov 13 15:47:37 2025 +0000 First replacement and first test to pass Replace NamedMatrix by DataFrame in - conversation.update_votes() - conversation._get_clean_matrix() - conversation._apply_moderation() and modify test_conversation::test_init commit 80be8bc Author: Julien Cornebise <julien@cornebise.com> Date: Thu Nov 13 13:01:24 2025 +0000 Skip a warning generated by boto3 about datetime.utcnow being deprecated commit 767a2d2 Author: Julien Cornebise <julien@cornebise.com> Date: Thu Nov 13 12:48:53 2025 +0000 Add BG2018 and rename for clarity + Replace DB connection by URL commit b80bd07 Merge: 58bd636 52a458a Author: Julien Cornebise <julien@cornebise.com> Date: Thu Nov 13 09:35:31 2025 +0000 Merge remote-tracking branch 'upstream/te-delphi-py-tests' into replace_named_matrix commit 52a458a Author: tevko <tim@devzero.io> Date: Wed Nov 12 21:55:27 2025 -0600 build dependency commit e26b482 Author: tevko <tim@devzero.io> Date: Wed Nov 12 21:41:04 2025 -0600 update other action commit 58bd636 Merge: e00d970 b427582 Author: Julien Cornebise <julien@cornebise.com> Date: Wed Nov 12 22:54:15 2025 +0000 Merge branch 'te-delphi-py-tests' into replace_named_matrix commit e00d970 Author: Julien Cornebise <julien@cornebise.com> Date: Wed Nov 12 22:43:17 2025 +0000 Allow comparison to be run on multiple datasets commit 1e48a4a Author: Julien Cornebise <julien@cornebise.com> Date: Wed Nov 12 22:16:45 2025 +0000 Add BG2050 and BG2018 datasets 
to dataset_config.py commit d99c79e Author: Julien Cornebise <julien@cornebise.com> Date: Wed Nov 12 22:04:55 2025 +0000 Improve downloading of real data - Check if data already exists before downloading - Add option to force re-download of data - Change paths to include dataset name if known - Download all datasets from the config, by default - Add progress bars commit 46eac1b Author: Julien Cornebise <julien@cornebise.com> Date: Wed Nov 12 22:00:15 2025 +0000 Allow to run tests in parallel Particularly useful now that we have multiple tests. commit f43cbc0 Author: tevko <tim@devzero.io> Date: Wed Nov 12 15:52:48 2025 -0600 freup space commit 26f023c Author: tevko <tim@devzero.io> Date: Wed Nov 12 15:42:21 2025 -0600 shared, test db commit fb6beeb Author: tevko <tim@devzero.io> Date: Wed Nov 12 15:39:07 2025 -0600 add access keys commit ad85a95 Author: tevko <tim@devzero.io> Date: Wed Nov 12 15:22:00 2025 -0600 update region commit aa69f3f Author: Julien Cornebise <julien@cornebise.com> Date: Wed Nov 12 21:09:39 2025 +0000 Configure VScode to run pytest on the regression tests commit 457125e Author: Julien Cornebise <julien@cornebise.com> Date: Wed Nov 12 21:06:31 2025 +0000 Run regression tests on all available datasets by default commit 44ef75c Author: tevko <tim@devzero.io> Date: Wed Nov 12 15:02:30 2025 -0600 ensure dynamo tables created commit 29ee422 Author: Julien Cornebise <julien@cornebise.com> Date: Wed Nov 12 21:01:54 2025 +0000 Fix regression tests' logic The golden tests should not be generated by the tests, but by the developer once. 
commit 695ede0 Author: tevko <tim@devzero.io> Date: Wed Nov 12 14:47:09 2025 -0600 remove duplicate data commit ae17516 Author: tevko <tim@devzero.io> Date: Wed Nov 12 12:37:03 2025 -0600 add real data, update action commit 1ef3c06 Author: tevko <tim@devzero.io> Date: Wed Nov 12 09:33:40 2025 -0600 use pol.is baseurl -- actions commit f1ce4aa Author: tevko <tim@devzero.io> Date: Wed Nov 12 09:15:10 2025 -0600 try more robust action -- actions commit a5b4e6c Author: tevko <tim@devzero.io> Date: Wed Nov 12 09:04:53 2025 -0600 remove pg check again - actions commit 966a551 Author: tevko <tim@devzero.io> Date: Wed Nov 12 08:57:02 2025 -0600 change healthcheck - actions commit 585cd2b Author: tevko <tim@devzero.io> Date: Wed Nov 12 08:37:39 2025 -0600 remove pg check commit dd1c492 Author: Julien Cornebise <julien@cornebise.com> Date: Wed Nov 12 14:33:33 2025 +0000 Improve benchmark: 3 runs, statistical test commit 55d39a2 Author: tevko <tim@devzero.io> Date: Wed Nov 12 08:24:10 2025 -0600 actions - change baseUrl commit 9778137 Author: tevko <tim@devzero.io> Date: Wed Nov 12 08:07:00 2025 -0600 actions - mount volume tests commit 017d686 Author: Julien Cornebise <julien@cornebise.com> Date: Wed Nov 12 13:45:15 2025 +0000 Refactor regression test and add basic benchmark commit d90dd3c Author: Julien Cornebise <julien@cornebise.com> Date: Wed Nov 12 09:41:30 2025 +0000 Fix comparer and recorder to properly record and compare Saves PCA, clusters, etc commit b427582 Author: tevko <tim@devzero.io> Date: Tue Nov 11 21:40:00 2025 -0600 another actions fix again again again commit 411c751 Author: tevko <tim@devzero.io> Date: Tue Nov 11 21:29:13 2025 -0600 action fix again commit d9421e1 Author: tevko <tim@devzero.io> Date: Tue Nov 11 21:19:04 2025 -0600 another actions fix commit be7bfbb Author: tevko <tim@devzero.io> Date: Tue Nov 11 21:10:20 2025 -0600 update action again commit dac822b Author: tevko <tim@devzero.io> Date: Tue Nov 11 21:09:44 2025 -0600 add delphi service to 
test commit 2b3aa6c Author: tevko <tim@devzero.io> Date: Tue Nov 11 20:27:46 2025 -0600 actions update 2 commit 483c5b4 Author: tevko <tim@devzero.io> Date: Tue Nov 11 20:15:37 2025 -0600 fix action commit 008bd22 Author: tevko <tim@devzero.io> Date: Tue Nov 11 20:07:15 2025 -0600 fix all tests commit 09edb21 Author: tevko <tim@devzero.io> Date: Tue Nov 11 16:30:06 2025 -0600 use env for data script commit b9f8f60 Author: Julien Cornebise <julien@cornebise.com> Date: Tue Nov 11 12:54:29 2025 +0000 First draft of regression tests based on recorder The output is not yet the kind of exhaustive result I was expecting, so needs more work. Done with Claude. commit 531280b Author: tevko <tim@devzero.io> Date: Mon Nov 10 22:32:39 2025 -0600 update action 3 commit ea8f989 Author: tevko <tim@devzero.io> Date: Mon Nov 10 22:24:08 2025 -0600 update action 2 commit 779f5dd Author: tevko <tim@devzero.io> Date: Mon Nov 10 22:17:10 2025 -0600 update action commit 359dbe3 Author: tevko <tim@devzero.io> Date: Mon Nov 10 22:05:53 2025 -0600 add action commit cb33f2d Author: Julien Cornebise <julien@cornebise.com> Date: Mon Nov 10 13:16:35 2025 +0000 Exclude Conversation serialization tests Until #2284 is resolved commit 5a9d60a Author: Julien Cornebise <julien@cornebise.com> Date: Mon Nov 10 10:05:42 2025 +0000 Add assert failure messages commit 78a27df Author: Julien Cornebise <julien@cornebise.com> Date: Sun Nov 9 21:02:12 2025 +0000 Refactor test_repness_comparison.py to proper pytest structure Similar to pca tests, refactor test_repness_comparison.py - Converts test_comparison() function to TestRepnessComparison class - Uses @pytest.mark.parametrize for multiple datasets - Proper fixtures for clojure_results, conversation, python_results - Two test methods: test_structural_compatibility and test_comparison_visibility - Replaces print() with logging.info/debug - Adds warning that results are known to be very different - Reports comparison results for visibility without asserting 
on match rates - Maintains comparison functionality for manual inspection Test results: 4 tests passed (2 datasets × 2 test methods) commit bc4f9e0 Author: Julien Cornebise <julien@cornebise.com> Date: Sun Nov 9 19:27:07 2025 +0000 Rename test_repness.py to test_repness_unit.py for clarity Rename to clarify that these are unit tests with synthetic data, following the same naming convention established for PCA tests: - test_repness.py → test_repness_unit.py (unit tests, synthetic data) - test_repness_smoke.py (real data, smoke tests - already renamed) - test_repness_comparison.py (Python vs Clojure - already clear) This mirrors the PCA test structure: - test_pca_unit.py (unit tests) - test_pca_edge_cases.py (edge cases) - test_pca_smoke.py (smoke tests) All 14 tests pass: - Statistical utility functions (z-scores, proportion tests) - Comment statistics calculation - Representative comment selection - Consensus selection - Integration tests (conv_repness, participant_stats) commit 41355a6 Author: Julien Cornebise <julien@cornebise.com> Date: Sun Nov 9 19:19:56 2025 +0000 Refactor repness smoke test Similar to how we refactored the "direct PCA" tests commit c3947d1 Author: Julien Cornebise <julien@cornebise.com> Date: Sun Nov 9 19:04:05 2025 +0000 Ignore warning from library ddtrace in pytest commit 622adb4 Author: Julien Cornebise <julien@cornebise.com> Date: Sun Nov 9 19:01:51 2025 +0000 Clarify the naming of PCA test files and remove redundant tests commit 0a0b55e Author: Julien Cornebise <julien@cornebise.com> Date: Sun Nov 9 18:57:28 2025 +0000 Refactor direct_pca_test.py to test_pca_smoke.py with pytest structure Converted legacy procedural test script to proper pytest: - Class-based structure with TestPCAImplementation - Parametrized tests for all datasets - Fixtures for vote matrix loading - Proper logging instead of prints - Smoke test warning (no correctness validation) - Tests: runs without error, projection statistics, clustering Tests PCA functions 
directly (not through Conversation class). commit 43593a0 Author: Julien Cornebise <julien@cornebise.com> Date: Sun Nov 9 18:26:20 2025 +0000 Fix direct conversation test - Convert to proper pytest format, not standalone script - Use fixtures for setup/teardown - Warn it is test to check Conversation class instantiation and method calls - Replace prints by logging - Parametrize the test to run over all available real_data - Add some dimension and attributes assertions - Rename to test_conversation_smoke.py commit c717c47 Author: Julien Cornebise <julien@cornebise.com> Date: Sun Nov 9 18:12:07 2025 +0000 Fix buggy test that blocked pytest collection The `test_batch_id.py` was running code at load time, and that code had an error, thus crashed during pytest collection, preventing all tests from running. By refactoring into a proper test function, pytest can now collect all tests and run them. We also fix the error itself, which was a missing escape of the "scan" reserved word in DynamoDB. commit 84547f2 Author: Julien Cornebise <julien@cornebise.com> Date: Sun Nov 9 17:54:50 2025 +0000 Clarify terms in messages and comments commit 23d1833 Author: Julien Cornebise <julien@cornebise.com> Date: Sun Nov 9 17:54:30 2025 +0000 Fix path... 
commit 6adbd51 Merge: d560fe6 b8df940 Author: Julien Cornebise <julien@cornebise.com> Date: Sun Nov 9 11:24:27 2025 +0000 Merge branch 'edge' into replace_named_matrix commit d560fe6 Merge: be3d50e c5ec899 Author: Julien Cornebise <julien@cornebise.com> Date: Sat Nov 8 11:45:13 2025 +0000 Merge remote-tracking branch 'upstream/edge' into replace_named_matrix commit be3d50e Author: Julien Cornebise <julien@cornebise.com> Date: Sat Nov 8 11:44:22 2025 +0000 Print whether comment priorites are missing from test data commit d7970d8 Author: Julien Cornebise <julien@cornebise.com> Date: Fri Nov 7 12:38:16 2025 +0000 Refactor real_data loading Remove duplication, allow for automatic finding of the files within a location, allow for generalisation to other conversations than the two used so far. commit f5ac669 Author: Julien Cornebise <julien@cornebise.com> Date: Fri Nov 7 10:55:55 2025 +0000 Create script to download real data for tests This is useful if no folder `real data` was provided. I suspect these tests were written with a `real data` folder already in place. I do not have it, therefore we need to download it. See the `README` file that has been updated. commit 23cced0 Author: Julien Cornebise <julien@cornebise.com> Date: Thu Nov 6 13:51:13 2025 +0000 Extract common function to utils file That function was defined 3 times in 3 different files. commit 3c6e788 Author: Julien Cornebise <julien@cornebise.com> Date: Mon Nov 3 17:57:21 2025 +0000 Add type hint in some poller functions * Fix run_math_pipeline test import to use proper package path The test file was importing `from run_math_pipeline import main` which failed locally because `run_math_pipeline.py` lives inside the `polismath` package at `delphi/polismath/run_math_pipeline.py`. CI was working around this by copying the file to a flat location: docker cp delphi/polismath/run_math_pipeline.py delphi:/app/run_math_pipeline.py This created a discrepancy between local and CI environments. The fix: 1. 
Update test imports to use the correct package path: `from polismath.run_math_pipeline import main` 2. Update mock.patch paths to match: `mock.patch('polismath.run_math_pipeline.fetch_comments', ...)` 3. Remove the CI workaround that copied the file to /app flat 4. Simplify coverage to `--cov=polismath` (run_math_pipeline is inside it) The Docker image already has `polismath/` at `/app/polismath/` and the package is installed via `pip install --no-deps .`, so the proper import path works in both local and CI environments. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * Improve CI coverage reporting reliability Changes to the CI workflow: 1. Print coverage report to workflow logs (always visible) 2. Upload coverage report as downloadable artifact 3. Make PR comment step non-fatal with continue-on-error: true (fork PRs cannot post comments due to GitHub token restrictions) Coverage is now accessible three ways: - In the workflow logs (step 7) - As a downloadable artifact (step 8) - As a PR comment when permissions allow (step 9) 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * Add graceful error handling for coverage comment on fork PRs Instead of showing an unhandled error when posting coverage comments fails on fork PRs, the script now catches the 403 error and displays a helpful message explaining: - Why the comment could not be posted (GitHub token permissions) - Where to find the coverage report (logs and artifact) 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * Fix test for malformed votes Malformed votes should be ignored. 
* Clean up unused variables and imports Address GitHub Copilot review comments: - Log superseded votes count in conversation.py instead of leaving unused - Remove unused p1_idx/p2_idx index lookups in corr.py - Remove unused all_passed variable in regression_comparer.py - Remove unused imports (numpy, Path, List, datetime, stats, pca/cluster functions) 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> --------- Co-authored-by: Claude <noreply@anthropic.com>

* Only run python-ci for delphi changes; minimize output
* address PR feedback

This reverts commit 51665ab, reversing changes made to 3901ee5.

* add narrative pipeline test
* change filename
* slight mocking adjustment
* mock sentence transformer
* better evoc
* try massaging mock data again
* more mocking
* diff mock strategy
* fix cov report
* test 500 gen embed
* syntax fixes
* update syntax again
* syntax fix again
* attempt mock fix
* another mock attempt
* fix action
* fix action again
* actions fix
* add another test
* add another test
* fix test

* add client-visualization submodule
* add pca visualization to alpha client
* show user in the data viz
* fetch and animate new pca data
* remove gitmodule
* use concaveman lib; update package.json; use gray color; only show when vis_type is set
* reset selected statement when group changes
* update astro types
* include remaining comment count

Bumps [js-yaml](https://github.com/nodeca/js-yaml) from 4.1.0 to 4.1.1. - [Changelog](https://github.com/nodeca/js-yaml/blob/master/CHANGELOG.md) - [Commits](nodeca/js-yaml@4.1.0...4.1.1) --- updated-dependencies: - dependency-name: js-yaml dependency-version: 4.1.1 dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

Bumps [js-yaml](https://github.com/nodeca/js-yaml) from 3.14.1 to 3.14.2. - [Changelog](https://github.com/nodeca/js-yaml/blob/master/CHANGELOG.md) - [Commits](nodeca/js-yaml@3.14.1...3.14.2) --- updated-dependencies: - dependency-name: js-yaml dependency-version: 3.14.2 dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

Bumps [glob](https://github.com/isaacs/node-glob) from 10.3.16 to 10.5.0. - [Changelog](https://github.com/isaacs/node-glob/blob/main/changelog.md) - [Commits](isaacs/node-glob@v10.3.16...v10.5.0) --- updated-dependencies: - dependency-name: glob dependency-version: 10.5.0 dependency-type: direct:development ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

Bumps [js-yaml](https://github.com/nodeca/js-yaml) and [js-yaml](https://github.com/nodeca/js-yaml). These dependencies needed to be updated together. Updates `js-yaml` from 4.1.0 to 4.1.1 - [Changelog](https://github.com/nodeca/js-yaml/blob/master/CHANGELOG.md) - [Commits](nodeca/js-yaml@4.1.0...4.1.1) Updates `js-yaml` from 3.14.1 to 3.14.2 - [Changelog](https://github.com/nodeca/js-yaml/blob/master/CHANGELOG.md) - [Commits](nodeca/js-yaml@3.14.1...3.14.2) --- updated-dependencies: - dependency-name: js-yaml dependency-version: 4.1.1 dependency-type: indirect - dependency-name: js-yaml dependency-version: 3.14.2 dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Optimize update_votes with vectorized pivot_table (5x speedup) Replace the row-by-row for-loop in update_votes with a vectorized pivot_table approach. This dramatically speeds up vote loading for large datasets. Performance on bg2050 dataset (1M+ votes, 7.8k participants, 7.7k comments): - Before: 18.5s average, 56k votes/sec - After: 3.5s average, 295k votes/sec - Speedup: 5.3x overall, 16x for the batch update step The optimization: 1. Use pivot_table to reshape long-form votes to wide-form matrix 2. Use DataFrame.where() to merge with existing matrix 3. Use float32 for intermediate matrix to halve memory usage Also adds a benchmark script at polismath/benchmarks/bench_update_votes.py for measuring update_votes performance. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * Vectorize _compute_vote_stats and make benchmark standalone - _compute_vote_stats: Replace per-row/per-column loops with numpy vectorized operations using boolean masks and axis-based sums. This eliminates O(rows + cols) Python loops. - bench_update_votes.py: Make standalone by accepting CSV path directly instead of depending on datasets package. Add TODO for using datasets package once PR #2312 is merged. Combined with pivot_table optimization, achieves ~10x speedup on bg2050 dataset (1M votes): 18-30s -> 2.5s (~400k votes/sec). 
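For readers who have not seen the technique, the `pivot_table` plus `DataFrame.where()` approach and the mask-based vote statistics described above can be sketched roughly as follows. This is an illustration only, not the code from the PR: the function names `update_votes_sketch` and `vote_stats_sketch`, the long-form column names `pid`, `tid`, and `vote`, and the "last vote wins" rule for duplicates are all assumptions made for the example.

```python
import numpy as np
import pandas as pd


def update_votes_sketch(existing: pd.DataFrame, votes: pd.DataFrame) -> pd.DataFrame:
    """Merge a batch of long-form votes into a wide participant x comment matrix.

    `existing` is indexed by participant id with one column per comment id.
    `votes` has the (assumed) columns pid, tid, vote, in chronological order.
    """
    # Reshape long-form rows into a wide matrix. aggfunc="last" keeps the most
    # recent row when a participant voted on the same comment more than once.
    batch = votes.pivot_table(index="pid", columns="tid", values="vote", aggfunc="last")

    # Grow both matrices to the union of participants and comments.
    rows = existing.index.union(batch.index)
    cols = existing.columns.union(batch.columns)
    existing = existing.reindex(index=rows, columns=cols)
    batch = batch.reindex(index=rows, columns=cols)

    # Take the new vote where the batch has one, otherwise keep the old value.
    return batch.where(batch.notna(), existing)


def vote_stats_sketch(matrix: pd.DataFrame) -> dict:
    """Summarize a vote matrix with boolean masks instead of Python loops."""
    values = matrix.to_numpy(dtype=float)
    voted = ~np.isnan(values)  # True wherever a vote exists
    return {
        "n_votes": int(voted.sum()),
        "n_positive": int((values == 1).sum()),
        "n_negative": int((values == -1).sum()),
        "votes_per_participant": voted.sum(axis=1),  # one count per row
        "votes_per_comment": voted.sum(axis=0),  # one count per column
    }
```

The point of the design is that the per-vote Python loop disappears: the reshape, the merge, and the counts each run as a single array operation.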
🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * Fix: Remove misleading float32 conversion in update_votes Addresses GitHub Copilot review comments on PR #2313: - Removed float32 conversion that only provided temporary memory savings - The comment was misleading as savings didn't persist after .where() 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * Fix: Use vectorized pandas operations in benchmark loader Replace iterrows() with rename() + to_dict('records') for efficiency, as suggested by GitHub Copilot review. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * Add timing logging for PCA and repness * Add benchmark script for repness * Add profiling to benchmark for repness * Vectorize vote count: 2x speedup on large convos * Extract common setup code * Rename vote_matrix to vote_matrix_df for clarity * Keep NaNs instead of None: 2x more speedup * Refactor conv_repness() to use long-format DataFrame Convert wide-format vote matrix to long-format using melt() and use vectorized pandas groupby operations instead of nested loops. 
Key changes: - Add compute_group_comment_stats_df() for vectorized (group, comment) stats - Add prop_test_vectorized() and two_prop_test_vectorized() for batch z-tests - Add select_rep_comments_df() and select_consensus_comments_df() for DataFrame-native selection, converting to dicts only at the end - Compute "other" stats as total - group instead of recalculating - Use MultiIndex.from_product() to ensure all (group, comment) combinations Test changes: - Add test_old_format_repness.py to preserve backwards compatibility tests - Add TestVectorizedFunctions class with 8 tests for new DataFrame interface 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * Shorten imports as per GH Copilot Review Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * Update docstring as per GH Copilot Review Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * Remove unused import as per GH Copilot Review Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * Move profiler to within profiling function as per GH Copilot review * Remove unused import as per GH Copilot review Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * Profile new functions --------- Co-authored-by: Claude <noreply@anthropic.com> Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
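The long-format refactor described above (melt, groupby, "other" computed as total minus group, and a batch z-test) can be illustrated with a small sketch. It is not the PR's implementation: `group_comment_stats_sketch` and `two_prop_z_sketch` are hypothetical names, a vote value of `1` is assumed to mean "agree", and the z-test shown is the textbook pooled two-proportion test, which may differ from what `two_prop_test_vectorized()` does (for example, in smoothing or pseudocounts).

```python
import numpy as np
import pandas as pd


def group_comment_stats_sketch(vote_matrix_df: pd.DataFrame, groups: pd.Series) -> pd.DataFrame:
    """Vote and agree counts per (group, comment), plus the same for everyone else.

    `groups` maps participant id (the matrix index) to a group id.
    """
    # Wide -> long: one row per (participant, comment) vote that exists.
    long = vote_matrix_df.melt(
        ignore_index=False, var_name="tid", value_name="vote"
    ).dropna(subset=["vote"])
    long["group"] = groups.reindex(long.index).to_numpy()
    long["agree"] = (long["vote"] == 1).to_numpy()

    per_group = long.groupby(["group", "tid"]).agg(
        n=("vote", "count"), n_agree=("agree", "sum")
    )
    # Make sure every (group, comment) pair is present, even with zero votes.
    full = pd.MultiIndex.from_product(
        [sorted(groups.unique()), vote_matrix_df.columns], names=["group", "tid"]
    )
    per_group = per_group.reindex(full, fill_value=0)

    totals = long.groupby("tid").agg(n=("vote", "count"), n_agree=("agree", "sum"))
    totals = totals.reindex(vote_matrix_df.columns, fill_value=0)

    # "Other" stats are total minus group, so nothing is recounted.
    tid_level = per_group.index.get_level_values("tid")
    per_group["other_n"] = totals["n"].reindex(tid_level).to_numpy() - per_group["n"].to_numpy()
    per_group["other_agree"] = (
        totals["n_agree"].reindex(tid_level).to_numpy() - per_group["n_agree"].to_numpy()
    )
    return per_group


def two_prop_z_sketch(succ_a, n_a, succ_b, n_b) -> np.ndarray:
    """Pooled two-proportion z-scores for whole arrays at once (0.0 if undefined)."""
    succ_a, n_a, succ_b, n_b = (
        np.asarray(x, dtype=float) for x in (succ_a, n_a, succ_b, n_b)
    )
    with np.errstate(divide="ignore", invalid="ignore"):
        pooled = (succ_a + succ_b) / (n_a + n_b)
        se = np.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
        z = (succ_a / n_a - succ_b / n_b) / se
    return np.where(np.isfinite(z), z, 0.0)
```

With the stats in one DataFrame, the z-scores for every (group, comment) pair come from a single call such as `two_prop_z_sketch(stats["n_agree"], stats["n"], stats["other_agree"], stats["other_n"])`.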

* remove express from oidc-simulator; update other libs * pin auth0-simulator to 0.10.2 * e2e lib updates * client-admin lib updates * fix delphi dockerfile -- torch versions for cpu

Bumps [js-yaml](https://github.com/nodeca/js-yaml) and [js-yaml](https://github.com/nodeca/js-yaml). These dependencies needed to be updated together. Updates `js-yaml` from 4.1.0 to 4.1.1 - [Changelog](https://github.com/nodeca/js-yaml/blob/master/CHANGELOG.md) - [Commits](nodeca/js-yaml@4.1.0...4.1.1) Updates `js-yaml` from 3.14.1 to 3.14.2 - [Changelog](https://github.com/nodeca/js-yaml/blob/master/CHANGELOG.md) - [Commits](nodeca/js-yaml@3.14.1...3.14.2) --- updated-dependencies: - dependency-name: js-yaml dependency-version: 4.1.1 dependency-type: indirect - dependency-name: js-yaml dependency-version: 3.14.2 dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

Bumps [nodemailer](https://github.com/nodemailer/nodemailer) from 7.0.7 to 7.0.11. - [Release notes](https://github.com/nodemailer/nodemailer/releases) - [Changelog](https://github.com/nodemailer/nodemailer/blob/master/CHANGELOG.md) - [Commits](nodemailer/nodemailer@v7.0.7...v7.0.11) --- updated-dependencies: - dependency-name: nodemailer dependency-version: 7.0.11 dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

Bumps [node-forge](https://github.com/digitalbazaar/forge) from 1.3.1 to 1.3.3. - [Changelog](https://github.com/digitalbazaar/forge/blob/main/CHANGELOG.md) - [Commits](digitalbazaar/forge@v1.3.1...v1.3.3) --- updated-dependencies: - dependency-name: node-forge dependency-version: 1.3.3 dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

Bumps [jws](https://github.com/brianloveswords/node-jws) and [jws](https://github.com/brianloveswords/node-jws). These dependencies needed to be updated together. Updates `jws` from 4.0.0 to 4.0.1 - [Release notes](https://github.com/brianloveswords/node-jws/releases) - [Changelog](https://github.com/auth0/node-jws/blob/master/CHANGELOG.md) - [Commits](auth0/node-jws@v4.0.0...v4.0.1) Updates `jws` from 3.2.2 to 3.2.3 - [Release notes](https://github.com/brianloveswords/node-jws/releases) - [Changelog](https://github.com/auth0/node-jws/blob/master/CHANGELOG.md) - [Commits](auth0/node-jws@v4.0.0...v4.0.1) --- updated-dependencies: - dependency-name: jws dependency-version: 4.0.1 dependency-type: indirect - dependency-name: jws dependency-version: 3.2.3 dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

tevko and others added 30 commits September 6, 2025 17:34

add db scaling, install datadog (#2147)

68cc4a3

* add db scaling, install datadog * add to example env * dd instrumentation

Update deploy-prod.yml

012c5ea

Merge branch 'stable' into edge

66c78f7

fix dd

c376ca4

Merge branch 'stable' into edge

548bcd4

more dd config (#2150)

71b3b70

stop dd agent

1c62a90

Merge branch 'stable' into edge

3ff3ee8

more dd config

61692ed

more dd instrumentation

c7c2959

Merge branch 'stable' into edge

99c20e9

dd config add network

cc10204

Merge branch 'stable' into edge

de4714f

add log tags

4ca3f4c

Merge branch 'stable' into edge

c908a16

delphi dd config

ed46435

Merge branch 'stable' into edge

d571a9f

dd add report RUM

5a1036a

Merge branch 'stable' into edge

1b285fc

try new rum strategy

cafff99

Merge branch 'stable' into edge

f8533bf

fix obj prop name

ed807ca

Merge branch 'stable' into edge

f0acf77

fix err superadmin

68e30f5

Merge branch 'stable' into edge

714f557

another superadmin fix

08b16e1

Merge branch 'stable' into edge

68172f7

make collective statements scroll more good (#2163)

b1f9ee5

Merge branch 'stable' into edge

84d414c

Te adjust collective stmt prmpt (#2167)

2642d3b

* expand on object properties for LLM * prompt hardening

tevko and others added 28 commits November 17, 2025 21:05

fix field names

ab3f0e4

fix dynamo calls in test

a80ff60

switch to scan

2a74958

relax test assertions

c99a51f

more relaxed tests

bb61b30

relax tests further

1189b71

ignore pakistan test

4707b63

Merge branch 'edge' into te-delphi-tests-2

9dc6bb9

Merge branch 'edge' into te-delphi-tests-2

769c4e6

Merge pull request #2291 from compdemocracy/te-delphi-tests-2

d95ad13

add more tests

Only run python-ci for delphi changes; minimize output (#2315)

1ce8660

* Only run python-ci for delphi changes; minimize output * address PR feedback

Revert "Merge branch 'stable' into edge" (#2305)

af27bba

This reverts commit 51665ab, reversing changes made to 3901ee5.

some lib updates (#2323)

ddd4e64

* remove express from oidc-simulator; update other libs * pin auth0-simulator to 0.10.2 * e2e lib updates * client-admin lib updates * fix delphi dockerfile -- torch versions for cpu

tevko marked this pull request as ready for review December 6, 2025 16:59

tevko merged commit a6d7215 into stable Dec 6, 2025
3 checks passed

tevko commented Dec 6, 2025

6 participants