Skip to content
This repository was archived by the owner on Jul 28, 2025. It is now read-only.

Conversation

@shubham-s-agarwal
Copy link
Collaborator

2-phase learning for MetaCAT utilises data_undersampled.
Fixed a bug in the eval function, which was incorrectly using the data_undersampled instead of the full_data

2-phase learning for MetaCAT utilises data_undersampled. Fixed a bug in the eval function, which was incorrectly using the data_undersampled instead of the full_data
Copy link
Collaborator

@mart-r mart-r left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just the lazy logging formatting.
Also, perhaps that should be a logger.debug instead?

PS:
At initial look it seems like you're changing the order of the returned arguments in the method as well as when unpacking. And that wouldn't change the behaviour. But for the lines changed, that is the intention.
What the change actually does is force the use of encode_category_values in line 354 to use the full data rather than the undersampled data.
So there's a little bit of a confusing path of fixing an issue. In order to fix an issue of unpacking data in line 354, the order of the returned arguments is changed everywhere else.
But this order probably does make sense overall.

if data[i][2] in category_value2id.values():
label_data_[data[i][2]] = label_data_[data[i][2]] + 1

logger.info(f"Original label_data: {label_data_}")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We want to use lazy formatting for logging. So instead of an f-string, use %s formatting.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh yes I'll fix that!
Regarding the ordering of unpacking, I was going to only change the order for link 354, however Tom pointed out that it would make the most sense to have it full_data and then undersampled_data
That's why the long route

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should I make the change for all logging statements to use lazy logging?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally don't want to change the purpose of a PR.
But I don't think there's too many. So why not.

As for unpacking - yes, I agree. This order makes more sense in general. Just thought I'd mention it.

@shubham-s-agarwal shubham-s-agarwal marked this pull request as draft September 4, 2024 12:44
Copy link
Collaborator

@mart-r mart-r left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lazy logging means allowing the logger to parse in the % formats if (and only if) the logged message has a handler (i.e if it needs to be shown/written somewhere).

So a messages gets logged as follows:

logger.info("Messages about %s and nr %d", "topic 1", 4)

@shubham-s-agarwal shubham-s-agarwal marked this pull request as ready for review September 4, 2024 14:06
@shubham-s-agarwal
Copy link
Collaborator Author

Fixed the lazy logging formatting

Copy link
Collaborator

@mart-r mart-r left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me.

@shubham-s-agarwal shubham-s-agarwal merged commit 6127f77 into master Sep 5, 2024
@shubham-s-agarwal shubham-s-agarwal deleted the metacat_bug_fix branch September 5, 2024 08:12
mart-r pushed a commit to mart-r/MedCAT that referenced this pull request Oct 14, 2024
* Pushing bug fix for metacat

2-phase learning for MetaCAT utilises data_undersampled. Fixed a bug in the eval function, which was incorrectly using the data_undersampled instead of the full_data

* Pushing change for lazy logging

* Pushing update for lazy logging

* Pushing lint fix
mart-r pushed a commit that referenced this pull request Oct 14, 2024
* Pushing bug fix for metacat

2-phase learning for MetaCAT utilises data_undersampled. Fixed a bug in the eval function, which was incorrectly using the data_undersampled instead of the full_data

* Pushing change for lazy logging

* Pushing update for lazy logging

* Pushing lint fix
alhendrickson pushed a commit to CogStack/cogstack-nlp that referenced this pull request Jul 1, 2025
* Pushing bug fix for metacat

2-phase learning for MetaCAT utilises data_undersampled. Fixed a bug in the eval function, which was incorrectly using the data_undersampled instead of the full_data

* Pushing change for lazy logging

* Pushing update for lazy logging

* Pushing lint fix
alhendrickson pushed a commit to CogStack/cogstack-nlp that referenced this pull request Jul 1, 2025
* Pushing bug fix for metacat

2-phase learning for MetaCAT utilises data_undersampled. Fixed a bug in the eval function, which was incorrectly using the data_undersampled instead of the full_data

* Pushing change for lazy logging

* Pushing update for lazy logging

* Pushing lint fix
alhendrickson pushed a commit to CogStack/cogstack-nlp that referenced this pull request Jul 1, 2025
* Pushing bug fix for metacat

2-phase learning for MetaCAT utilises data_undersampled. Fixed a bug in the eval function, which was incorrectly using the data_undersampled instead of the full_data

* Pushing change for lazy logging

* Pushing update for lazy logging

* Pushing lint fix
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants