Skip to content

Conversation

@NicolasColombi
Copy link
Collaborator

@NicolasColombi NicolasColombi commented Mar 26, 2025

Changes proposed in this PR:

  • Convert only internally available variables into enum class
  • Access such variables from a new function get_region_per_countries() to assign impact function ids

PR Author Checklist

PR Reviewer Checklist

@NicolasColombi NicolasColombi requested a review from chahank March 26, 2025 16:20
Copy link
Member

@chahank chahank left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very useful to simplify the use of the tropical cyclone impact function.

  • There are now many tests failing, can you please fix that?
  • Please update the changelog

)

@staticmethod
def get_regions_per_countries(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please add tests for this method?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yess ✅

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would the name get_impf_id_regions_per_countries be clearer?

Just give it a thought: would you have found the use of this function with the current name if you had not written it just by looking at the class?

key
for country in countries
for key, value in country_dict.items()
if country in value
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you sure you want to check inclusion and not equality?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think so, as value is the list of countries iso codes corresponding to the region ID key. Consequently, I want to check if the country iso code country is included in that list.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, then can use a better name than value? For me, value means a single value, not a list. The same for key which is unclear what this is.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes sure, normally key & value refers to the key-value pair in a dictionary, but I guess the meaning of key and value is clear only if you know what the dictionary is... it should be clearer now.

@NicolasColombi
Copy link
Collaborator Author

Very useful to simplify the use of the tropical cyclone impact function.

  • There are now many tests failing, can you please fix that?
  • Please update the changelog

Done ✅

@NicolasColombi NicolasColombi marked this pull request as ready for review March 26, 2025 22:16
@NicolasColombi
Copy link
Collaborator Author

I think this is ready, what do you think @chahank ?

def test_get_region_per_countries(self):
"""Test static get_regions_per_countries()"""
ifs = ImpfSetTropCyclone()
out = ifs.get_regions_per_countries(countries=["CHE"], code_type="ISO3A")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please use more clear variable names. out is not very clear here. Can you also make clear why the outputs are what they are?

a string if the code is iso3a or an integer if ISO3N. For example, for Switzerland:
the ISO3A code is "CHE" and the ISO3N is 756.
code_type : str
Either "ISO3A" or "ISO3N".
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please write clearly what this is. ISOA and ISO3N are not official acronyms. Please correct in all the code, the error messages, and the tests.

Comment on lines 475 to 478
regions_ids : list
List of the regions that match the countries.
regions_names : list
List of the regions that match the countries.
Copy link
Member

@chahank chahank Apr 1, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are identical docstrings, yet different variables. PLease be clear. I think you mean that one contains the ids defined in the paper by Eberenz, the other the region names?

Comment on lines 482 to 488
raise ValueError("code_type must be either 'iso3a' or 'iso3n'")
elif not all(isinstance(country, type(countries[0])) for country in countries):
raise ValueError("All elements in the list must be of the same type.")
elif code_type == "ISO3A" and isinstance((countries[0]), int):
raise ValueError("ISO3A code type cannot have integer values.")
elif code_type == "ISO3N" and isinstance((countries[0]), str):
raise ValueError("ISO3N code type cannot have string values.")
Copy link
Member

@chahank chahank Apr 1, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please choose clear words / variable names. Here it is neither consistent (e.e. iso3a and ISO3A), nor standard.

self.assertListEqual(out[2], [124, 840])
self.assertListEqual(out[3], ["CAN", "USA"])

def test_get_region_per_countries(self):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add at least one test with multiple countries.

Co-authored-by: Chahan M. Kropf <chahan.kropf@usys.ethz.ch>
Copy link
Member

@peanutfun peanutfun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the contribution! However, I have a lot of questions design-wise. What you implemented seems unnecessarily restrictive to me: Users have to enter a list of either exclusively strings or ints, and then tell the function in a separate parameter whether they entered strings or ints 😬 Also, the enum doubles functionality with country_to_iso, which allows for converting numerical to alpha-3 ISO codes and vice versa.

Overall, the function/enum combination can be made much more accessible and concise, I think.

LOGGER = logging.getLogger(__name__)


class CountryCode(Enum):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does this enum contain numerical and alpha-3 country codes? Conversion between the two can be done by country_to_iso and this effectively doubles the functionality

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, only one is enough, just thought it was better practice to have both in case one might need them, I can delate one or the other if you prefer.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it is just that the conversion should be done ideally with the existing method so that we do not duplicate existing code.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I will delate one then 👍

Comment on lines 472 to 476
code_type : str
Either "ISO3A" or "ISO3N". "ISO3A" stands for "ISO 3166-1 alpha-3" which is a
three-letter country code, "ISO3N" stands for "ISO 3166-1 numeric" which is a
three-digit country code, the numeric version of "ISO3A". For example, for Switzerland:
the "ISO3A" code is "CHE" and the "ISO3N" is 756.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is inconsistent with country_to_iso, where the two types are called numeric and alpha-3

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

get_countries_per_region also uses iso3a and iso3n, so we need to chose which name to use consistenly in all climada code. The official would be "ISO 3166-1 alpha-3" and "ISO 3166-1 numeric".

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's unfortunate. I think numeric and alpha-3 are clearest and least error-prone. Note that there is also an alpha-2 representation for many countries, so iso3a is not precise.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's go for numeric and alpha-3 then. Shall I change it in the other method too ? (get_countries_per_region)

"""

return region_name[region], impf_id[region], iso3n[region], iso3a[region]
if code_type not in {"ISO3A", "ISO3N"}:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is code_type a parameter at all? If any item in the list is a str, it must be an alpha-3 code. If it is int, it must be a numerical code.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree, it is not really needed nor optimal 👍

Comment on lines 492 to 493
elif not all(isinstance(country, type(countries[0])) for country in countries):
raise ValueError("All elements in the list must be of the same type.")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why? It's easy to decide on a per-element basis if a numerical or an alpha-3 code is given.

Comment on lines +480 to +481
impf_ids : list
List of impact function ids matching the countries.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not immediately return the appropriate impact function (set)s?

Copy link
Collaborator Author

@NicolasColombi NicolasColombi Aug 21, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't you typically define an impact function id column in the exposure ? (this is what I had in mind as an application of this function)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, understood. You can keep it as is then.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok. But this only makes sense then if nobody changes the regions and impf IDs defined in from_calibrated_regional_ImpfSet:

def from_calibrated_regional_ImpfSet(

I think this method should then also use the information in CountryCode and not re-define it.

Also here:

regions_short = ["NA1", "NA2", "NI", "OC", "SI", "WP1", "WP2", "WP3", "WP4"]

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree, I did not saw the other 2 functions. Then I guess we just need to use CountryCodein these two.

@peanutfun
Copy link
Member

@NicolasColombi What's the status here?

@NicolasColombi
Copy link
Collaborator Author

@NicolasColombi What's the status here?

I'll take some time next week to implement your suggestions, as for the other PRs as well.

@chahank
Copy link
Member

chahank commented Aug 20, 2025

@NicolasColombi it would be good to finish this PR soon for the next release in the middle of September.

@NicolasColombi
Copy link
Collaborator Author

Hi there @chahank @peanutfun ! I eventually incorporated your suggestions... Let me know what you think of the current version. One thing we need to decide is a consistent use of country codes acronyms: some climada functions uses ISO3A/N and others numeric or alpha-3, the official would be ISO 3166-1 alpha-3 and ISO 3166-1 numeric.

@NicolasColombi
Copy link
Collaborator Author

Okei, I delated the numeric codes in the Enum and used the country_to_iso to convert numeric codes to alpha-3, I also homogenize the use of numeric and alpha-3 throughout the class and modified the other functions that were defining regions so that they use the Enum class.

@NicolasColombi
Copy link
Collaborator Author

Okei, I delated the numeric codes in the Enum and used the country_to_iso to convert numeric codes to alpha-3, I also homogenize the use of numeric and alpha-3 throughout the class and modified the other functions that were defining regions so that they use the Enum class.

@chahank @peanutfun I will be on holidays for the next two week. If you require additional changes I will be available from Sept.15, or if the changes are really small, I might carve out some time tomorrow. I think this PR should be ready now.

@chahank
Copy link
Member

chahank commented Aug 28, 2025

@peanutfun if this is fine for you, I would let you finalize the review for this PR.

@peanutfun peanutfun merged commit 6cc6e08 into develop Sep 3, 2025
18 of 19 checks passed
@peanutfun
Copy link
Member

Jenkins compatibility test hangs, but the GitHub one passed. Merging. Great work, @NicolasColombi!

@emanuel-schmid emanuel-schmid deleted the feature/get-region-per-country branch September 4, 2025 08:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants