Comparing changes
base repository: CogStack/MedCAT
base: v1.13.2
head repository: CogStack/MedCAT
compare: v1.14.0
- 20 commits
- 29 files changed
- 2 contributors
Commits on Aug 30, 2024
- Commit b8bb4e3
Commits on Sep 2, 2024
- Commit b28fa05
Commits on Sep 5, 2024
- Pushing bug fix for metacat (#487)
  Commit 6127f77
  * Pushing bug fix for metacat. 2-phase learning for MetaCAT utilises data_undersampled. Fixed a bug in the eval function, which was incorrectly using data_undersampled instead of full_data
  * Pushing change for lazy logging
  * Pushing update for lazy logging
  * Pushing lint fix
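The "lazy logging" change mentioned in this commit is not shown on this page. As a rough illustration of the general idea only, the sketch below contrasts eager f-string formatting with the deferred `%s` style of Python's standard `logging` module. The logger name, the `Expensive` class, and both helper functions are hypothetical and are not taken from MedCAT.

```python
import logging

# Hypothetical logger; DEBUG records are disabled on purpose.
logger = logging.getLogger("metacat_example")
logger.setLevel(logging.WARNING)


class Expensive:
    """Counts how often it is rendered to a string."""

    def __init__(self):
        self.renders = 0

    def __str__(self):
        self.renders += 1
        return "expensive-repr"


def log_eager(value):
    # The f-string is built before logger.debug is even called,
    # so str(value) runs although the record is then discarded.
    logger.debug(f"value: {value}")


def log_lazy(value):
    # The message is only formatted if the record is actually emitted,
    # so str(value) is never called while DEBUG is disabled.
    logger.debug("value: %s", value)
```

With DEBUG disabled, `log_eager` still pays for the string conversion while `log_lazy` does not, which is the usual motivation for this style.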
Commits on Sep 16, 2024
- Commit 2588670
- CU-8695uhe5n: Update docs dependency pins (#491)
  Commit 56a2856
  * CU-8695uhe5n: Update docs dependency pins
  * CU-8695uhe5n: Fix typo in fsspec version pin
Commits on Sep 17, 2024
- Commit 394e17b
- CU-8695pvhfe fix usage monitoring for multiprocessing (#488)
  Commit eb912d6
  * CU-8695pvhfe: Rename a test class
  * CU-8695pvhfe: Add tests for multiprocessing usage monitoring
  * CU-8695pvhfe: Fix usage monitor for multiprocessing. When using CAT.multiprocessing_batch_char_size (CAT._multiprocessing_batch and CAT._mp_cons internally), flush the usage monitor at the end of the multiprocessing method. When using CAT.get_entities_multi_texts or CAT.multiprocessing_batch_docs_size (which uses the former internally), add logging of usage to output
  * CU-8695pvhfe: Fix remaining issues with usage monitor for multiprocessing. Avoid checking length of (potentially) non-existent strings. Avoid early iteration of generator.
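The last bullet of this commit names two pitfalls: calling `len()` on a value that may not be a string, and consuming a generator before the caller does. The sketch below shows one generic way to avoid both, under the assumption that usage is recorded as a list of text lengths. `monitor_lengths` and the `usage_log` list are hypothetical names and do not reflect MedCAT's actual usage monitor.

```python
from typing import Iterable, Iterator, List, Optional


def monitor_lengths(
    texts: Iterable[Optional[str]], usage_log: List[int]
) -> Iterator[Optional[str]]:
    """Yield texts unchanged while recording their lengths lazily."""
    for text in texts:
        # Guard against missing (None) or non-string inputs
        # instead of calling len() unconditionally.
        usage_log.append(len(text) if isinstance(text, str) else 0)
        yield text
```

Because the function is itself a generator, nothing is read from `texts` until the caller starts iterating, so a one-shot input generator is not exhausted early.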
Commits on Sep 30, 2024
- CU-8695knfbg Add name edits to regression suite (#486)
  Commit cbae5b3
  * CU-8695knfbg: Decouple the edit finder methods from the spell checker
  * CU-8695knfbg: Add methods for random edit picking and variant estimation to utils; Plus a few tests
  * CU-8695knfbg: Add edit distance option and use to CLI
  * CU-8695knfbg: Allow retaining order of elements in generator when getting edits for run-to-run consistency
  * CU-8695knfbg: Add safeguard for name order to be consistent across runs
  * CU-8695knfbg: Sort names when getting from CDB to avoid run to run variance
  * CU-8695knfbg: Move edit finding methods back to BasicSpellChecker class, but make the 1-distance method a class method
  * CU-8695knfbg: Move validation earlier in edit finder
  * CU-8695knfbg: Simplify edit finder somewhat
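To make the idea of order-preserving distance-1 edits and seeded random picking concrete, here is a minimal sketch in the style of the classic spell-checker edit generator. It is not the `BasicSpellChecker` implementation. The function names, the lowercase ASCII alphabet, and the seeding scheme are all assumptions made for illustration.

```python
import random
import string
from typing import Iterator, List


def edits1(word: str, alphabet: str = string.ascii_lowercase) -> Iterator[str]:
    """Yield each distinct string one edit away from `word`, in a fixed order."""
    seen = {word}  # never yield the word itself or a duplicate
    for i in range(len(word) + 1):
        left, right = word[:i], word[i:]
        candidates = []
        if right:
            candidates.append(left + right[1:])  # deletion
        if len(right) > 1:
            # transposition of two adjacent characters
            candidates.append(left + right[1] + right[0] + right[2:])
        if right:
            # replacement of one character
            candidates.extend(left + c + right[1:] for c in alphabet)
        # insertion of one character
        candidates.extend(left + c + right for c in alphabet)
        for cand in candidates:
            if cand not in seen:
                seen.add(cand)
                yield cand


def pick_random_edits(word: str, n: int, seed: int = 0) -> List[str]:
    """Pick up to `n` edits reproducibly by sampling from the ordered list."""
    rng = random.Random(seed)
    all_edits = list(edits1(word))
    return rng.sample(all_edits, min(n, len(all_edits)))
```

Tracking `seen` in a set while yielding in generation order keeps the output deterministic, whereas collecting the edits into a plain `set` and iterating over it could vary between runs because of string hash randomisation.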
- CU-869574kvp update snomed preprocessing naming (#469)
  Commit a9544f7
  * CU-869574kvp: Add pattern based release version identifying for Snomed preprocessing
  * CU-869574kvp: Add tests for pattern-based snomed release identification
  * CU-869574kvp: Update Snomed preprocessing: Separate extensions into an Enum. Do the release/paths check at init to allow for early failures in case of issues
  * CU-869574kvp: Simplify mappings somewhat. Move common avoids to a common location. Fix UK Drug relationship name
  * CU-869574kvp: Simplify mappings somewhat more. Remove some clutter by separating common prefixes for release types and file names.
  * CU-869574kvp: Simplify mappings somewhat more, again. Remove some clutter by separating common suffixes for release types.
  * CU-869574kvp: Update preprocessing. New abstraction. Use supported extensions which describe their file formats along with bundles which give some further insight and control.
  * CU-869574kvp: Fix data class init
  * CU-869574kvp: Fix issue with file paths
  * CU-869574kvp: Fix a UK Clinical description file path
  * CU-869574kvp: Add (optional) 2nd part of folder name to extension. For AU models, the folder name seems to be 'SnomedCT_Release_AU1000036_20240630T120000Z', so the 1st part is just 'Release' and the 2nd part is indicative of AU. Add usage of this where relevant.
  * CU-869574kvp: Fix preprocessing tests. Add patch for files/folders where applicable. Change the paths of attributes where applicable.
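The commit describes identifying a release from its folder name by pattern, using a first and an optional second name part. The actual patterns are not visible on this page, so the following is only a sketch that handles the single folder name quoted above. The regular expression, `ReleaseInfo`, and `parse_release_folder` are hypothetical.

```python
import re
from typing import NamedTuple, Optional

# Assumed layout: SnomedCT_<part1>_<part2>_<YYYYMMDD>T<HHMMSS>Z
FOLDER_PATTERN = re.compile(
    r"^SnomedCT_(?P<part1>[^_]+)_(?P<part2>[^_]+)_(?P<date>\d{8})T\d{6}Z$"
)


class ReleaseInfo(NamedTuple):
    part1: str
    part2: str
    release_date: str


def parse_release_folder(name: str) -> Optional[ReleaseInfo]:
    """Return the name parts and release date, or None if the name does not match."""
    match = FOLDER_PATTERN.match(name)
    if match is None:
        return None
    return ReleaseInfo(match["part1"], match["part2"], match["date"])
```

For the AU folder name from the commit message this yields `part1="Release"` and `part2="AU1000036"`, which matches the commit's remark that the first part is just 'Release' and the second part indicates AU. Returning `None` for unknown names lets a caller fail early, in the spirit of the "check at init" bullet.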
- Commit b433195
Commits on Oct 7, 2024
- CU-8695ucw9b deid transformers fix (#490)
  Commit 44db08b
  * CU-8695ucw9b: Fix older DeID models due to changes in transformers. Since transformers 4.42.0, the tokenizer is expected to have the 'split_special_tokens' attribute. But the version we've saved does not. So when it's loaded, this causes an exception to be raised (which is currently caught and logged by medcat).
  * CU-8695ucw9b: Add functionality for transformers NER to spectacularly fail upon consistent consecutive exceptions. The idea is that this way, if something in the underlying models is consistently failing, the exception is raised rather than simply logged
  * CU-8695ucw9b: Add tests for exception raising after a pre-defined number of failed document processes
  * CU-8695ucw9b: Change conditions for raising exception on consecutive failure. Now only raise the exception if the consecutive failure is identical (or similar). We determine that from the type and string-representation of the exception being raised.
  * CU-8695ucw9b: Small additional cleanup on successful TNER processing
  * CU-8695ucw9b: Use custom exception when failing due to consecutive exceptions
  * CU-8695ucw9b: Remove try-except when processing transformers NER to force immediate raising of exception
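The intermediate bullets describe raising a custom exception only after a pre-defined number of consecutive failures that are identical, judged by exception type and string representation. A minimal sketch of that bookkeeping is given below. The class names, the default threshold of 3, and the method names are assumptions, and the final bullet suggests the shipped code ends up raising immediately instead.

```python
from typing import Optional, Tuple


class ConsecutiveFailureError(Exception):
    """Hypothetical custom exception for repeated identical failures."""


class FailureTracker:
    def __init__(self, max_consecutive: int = 3) -> None:
        self.max_consecutive = max_consecutive
        self._last_key: Optional[Tuple[type, str]] = None
        self._count = 0

    def record_success(self) -> None:
        # Any successful document resets the streak.
        self._last_key = None
        self._count = 0

    def record_failure(self, exc: Exception) -> None:
        # Two failures count as "identical" if type and message match.
        key = (type(exc), str(exc))
        if key == self._last_key:
            self._count += 1
        else:
            self._last_key = key
            self._count = 1
        if self._count >= self.max_consecutive:
            raise ConsecutiveFailureError(
                f"{self._count} identical consecutive failures: {exc!r}"
            ) from exc
```

A caller would invoke `record_failure` inside its `except` block and `record_success` after each document that processes cleanly, so sporadic or unrelated errors are still only logged while a systematic failure surfaces quickly.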
Commits on Oct 9, 2024
- Commit 909cfad
Commits on Oct 14, 2024
- MetaCAT fixes and upgrades (#495)
  Commit 3e01747
  * MetaCAT fixes and upgrades. Pushing for 3 updates:
    1) Removed the check and update for labels with zero data, as this was causing issues during evaluation
    2) Resolved an issue where the confusion matrix couldn't be calculated when testing on a single class with an F1 score of 1, as it expected the original number of training classes (3)
    3) Updated the attention mask creation to dynamically use the actual pad_idx value instead of assuming it to be 0
  * Pushing type fix
  * Pushing for type fix
  * Fixing type issues
  * Pushing change
  * Pushing update w/o try except block. For the issue where the confusion matrix couldn't be calculated when testing on a single class with an F1 score of 1, as it expected the original number of training classes (3), pushing an optimized version w/o the try except block
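Update 3 above concerns building the attention mask from the real padding index rather than a hard-coded 0. The sketch below shows the principle on plain Python lists so that it runs without any deep-learning dependency. The function name and the list-based representation are assumptions, and MetaCAT's own code may well operate on tensors instead.

```python
from typing import List


def attention_mask(batch: List[List[int]], pad_idx: int) -> List[List[int]]:
    """Return 1 for real tokens and 0 for padding, based on `pad_idx`."""
    return [[0 if token == pad_idx else 1 for token in seq] for seq in batch]
```

If the padding index were assumed to be 0, a genuine token with id 0 would be masked out and padding tokens with a different id would be attended to. Passing `pad_idx` explicitly avoids both errors.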
Commits on Oct 28, 2024
- CU-869671bn4: Update requirements to fix workflow issue due to mypy (#497)
  Commit 976adc2
  * CU-869671bn4: Update requirements (GHA should fail due to mypy)
  * CU-869671bn4: Update mypy dev requirement to be less than 1.12
- CU-86967nnra Drop python 3.8 support (EoL) (#498)
  Commit 04efda5
  * CU-86967nnra: Remove python 3.8 from GHA
  * CU-86967nnra: Remove python 3.8 from classifiers
  * CU-86967nnra: Add python version requirements to setup.py (allowing from 3.9 to 3.11)
  * CU-86967nnra: Remove upper bound from python requirements. Upper bound could be lifted as soon as `spacy` releases a compatible version. And it _shouldn't_ require any changes from our side. And it isn't possible to install it on higher versions (currently) due to no `spacy` being available for those versions
Commits on Nov 1, 2024
- CU-86964zm4d fix preprocessing (#496)
  Commit e924798
  * CU-86964zm4d: Use ignore tag correctly to ignore certain parts of UK release
  * CU-86964zm4d: Use OPCS4 later refset ID by default (and switch to older if needed)
  * CU-86964zm4d: Fix OPCS4 refset ID tests. Fix the default value being tested for (i.e. in case of international release that'll be shown). Add a test for old UK extension.
  * CU-86964zm4d: Add note regarding OPCS refset ID relevance only for UK extensions.
  * CU-86964zm4d: Fix checking of extension outside loops. I.e. determine if a UK release/bundle is used for OPCS4/ICD10 mappings splitting. Always returning separate refsets for ICD10 and OSC internally, even if the latter is None.
- CU-8695hghww backwards compatibility workflow (#478)
  Commit c0082ef
  * CU-8695hghww: Add bash script to run backwards compatibility
  * CU-8695hghww: Rename backwards compatibility running bash script
  * CU-8695hghww: Add new step to workflow to run model backwards compatibility
  * CU-8695hghww: Fix model compatibility regression suite path
  * CU-8695hghww: Simplify creation and removal of fake model folder
Commits on Nov 15, 2024
- Commit df3df66
Commits on Nov 19, 2024
- CU-8696m1mch: Remove versioning utility since all its parts were deprecated (#500)
  Commit 37a8a63
  * CU-8696m1mch: Remove versioning utility since all its parts were deprecated
  * CU-8696m1mch: Remove tests for versioning utility
  * CU-8696m1mch: Remove unused test-specific binary (CDB)
- Commit ceb74b1
You can try running this command locally to see the comparison on your machine:
git diff v1.13.2...v1.14.0