fix: test_get_series & get_series #19

Merged
merged 2 commits into from
Oct 8, 2024

Conversation

csautter
Contributor

@csautter csautter commented Oct 6, 2024

Nice work @philsv! I have started using your myeia Python module and ran into some issues.

Calling get_series did not work for me, and test_get_series does not pass either.
I have fixed the issues and added some improvements:

  • deprecation decorator for get_series as it uses a legacy api
  • backoff giveup condition
  • message for 403 errors
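
For readers unfamiliar with the first item, a deprecation decorator can be sketched with the standard library alone. This is a minimal illustration, not the code from this pull request: the `deprecated` helper, its `reason` argument, and the placeholder `get_series` body are all assumptions made for the example.

```python
import functools
import warnings


def deprecated(reason):
    """Return a decorator that emits a DeprecationWarning on every call.

    Hypothetical helper for illustration; not taken from myeia.
    """

    def decorator(func):
        @functools.wraps(func)  # keep the wrapped function's name and docstring
        def wrapper(*args, **kwargs):
            warnings.warn(
                f"{func.__name__} is deprecated: {reason}",
                category=DeprecationWarning,
                stacklevel=2,  # point the warning at the caller, not the wrapper
            )
            return func(*args, **kwargs)

        return wrapper

    return decorator


@deprecated("it relies on legacy APIv1 series IDs")
def get_series(series_id):
    # Placeholder body; the real function requests data from the EIA API.
    return series_id
```

The decorated function still runs normally; callers only see the warning if their warning filters show `DeprecationWarning`.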

Output of test_get_series run (without my fix):

C:\Users\chris\.virtualenvs\myeia_313\Scripts\python.exe "C:/Users/chris/AppData/Local/Programs/PyCharm Professional/plugins/python-ce/helpers/pycharm/_jb_pytest_runner.py" --target test_myeia.py::test_get_series 
Testing started at 19:30 ...
Launching pytest with arguments test_myeia.py::test_get_series --no-header --no-summary -q in C:\Users\chris\PycharmProjects\myeia\tests

============================= test session starts =============================
collecting ... collected 4 items

test_myeia.py::test_get_series[NG.RNGC1.D-2024-01-01-2024-02-01] 
test_myeia.py::test_get_series[PET.WCESTUS1.W-2024-01-01-2024-02-01] 
test_myeia.py::test_get_series[INTL.29-12-HKG-BKWH.A-2024-01-01-2024-02-01] 
test_myeia.py::test_get_series[STEO.PATC_WORLD.M-2024-01-01-2024-02-01] 

=================== 3 failed, 1 passed, 7 warnings in 6.84s ===================
FAILED  [ 25%]
tests\test_myeia.py:8 (test_get_series[NG.RNGC1.D-2024-01-01-2024-02-01])
self = Index(['NEW YORK CITY'], dtype='object'), key = 'product-name'

    def get_loc(self, key):
        """
        Get integer location, slice or boolean mask for requested label.
    
        Parameters
        ----------
        key : label
    
        Returns
        -------
        int if unique index, slice if monotonic index, else mask
    
        Examples
        --------
        >>> unique_index = pd.Index(list('abc'))
        >>> unique_index.get_loc('b')
        1
    
        >>> monotonic_index = pd.Index(list('abbc'))
        >>> monotonic_index.get_loc('b')
        slice(1, 3, None)
    
        >>> non_monotonic_index = pd.Index(list('abcb'))
        >>> non_monotonic_index.get_loc('b')
        array([False,  True, False,  True])
        """
        casted_key = self._maybe_cast_indexer(key)
        try:
>           return self._engine.get_loc(casted_key)

..\..\..\.virtualenvs\myeia_313\Lib\site-packages\pandas\core\indexes\base.py:3805: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
index.pyx:167: in pandas._libs.index.IndexEngine.get_loc
    ???
index.pyx:196: in pandas._libs.index.IndexEngine.get_loc
    ???
pandas\\_libs\\hashtable_class_helper.pxi:7081: in pandas._libs.hashtable.PyObjectHashTable.get_item
    ???
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

>   ???
E   KeyError: 'product-name'

pandas\\_libs\\hashtable_class_helper.pxi:7089: KeyError

The above exception was the direct cause of the following exception:

series_id = 'NG.RNGC1.D', start_date = '2024-01-01', end_date = '2024-02-01'

    @pytest.mark.parametrize(
        "series_id, start_date, end_date",
        [
            ("NG.RNGC1.D", "2024-01-01", "2024-02-01"),
            ("PET.WCESTUS1.W", "2024-01-01", "2024-02-01"),
            ("INTL.29-12-HKG-BKWH.A", "2024-01-01", "2024-02-01"),
            ("STEO.PATC_WORLD.M", "2024-01-01", "2024-02-01"),
        ],
    )
    def test_get_series(series_id, start_date, end_date):
        """Test get_series method."""
>       df = eia.get_series(series_id, start_date=start_date, end_date=end_date)

test_myeia.py:20: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
..\myeia\api.py:105: in get_series
    df = df.rename(columns={data_identifier: df[col][0]})
..\..\..\.virtualenvs\myeia_313\Lib\site-packages\pandas\core\frame.py:4102: in __getitem__
    indexer = self.columns.get_loc(key)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = Index(['NEW YORK CITY'], dtype='object'), key = 'product-name'

    def get_loc(self, key):
        """
        Get integer location, slice or boolean mask for requested label.
    
        Parameters
        ----------
        key : label
    
        Returns
        -------
        int if unique index, slice if monotonic index, else mask
    
        Examples
        --------
        >>> unique_index = pd.Index(list('abc'))
        >>> unique_index.get_loc('b')
        1
    
        >>> monotonic_index = pd.Index(list('abbc'))
        >>> monotonic_index.get_loc('b')
        slice(1, 3, None)
    
        >>> non_monotonic_index = pd.Index(list('abcb'))
        >>> non_monotonic_index.get_loc('b')
        array([False,  True, False,  True])
        """
        casted_key = self._maybe_cast_indexer(key)
        try:
            return self._engine.get_loc(casted_key)
        except KeyError as err:
            if isinstance(casted_key, slice) or (
                isinstance(casted_key, abc.Iterable)
                and any(isinstance(x, slice) for x in casted_key)
            ):
                raise InvalidIndexError(key)
>           raise KeyError(key) from err
E           KeyError: 'product-name'

..\..\..\.virtualenvs\myeia_313\Lib\site-packages\pandas\core\indexes\base.py:3812: KeyError
FAILED [ 50%]
tests\test_myeia.py:8 (test_get_series[PET.WCESTUS1.W-2024-01-01-2024-02-01])
self = Index(['U.S.'], dtype='object'), key = 'product-name'

    def get_loc(self, key):
        """
        Get integer location, slice or boolean mask for requested label.
    
        Parameters
        ----------
        key : label
    
        Returns
        -------
        int if unique index, slice if monotonic index, else mask
    
        Examples
        --------
        >>> unique_index = pd.Index(list('abc'))
        >>> unique_index.get_loc('b')
        1
    
        >>> monotonic_index = pd.Index(list('abbc'))
        >>> monotonic_index.get_loc('b')
        slice(1, 3, None)
    
        >>> non_monotonic_index = pd.Index(list('abcb'))
        >>> non_monotonic_index.get_loc('b')
        array([False,  True, False,  True])
        """
        casted_key = self._maybe_cast_indexer(key)
        try:
>           return self._engine.get_loc(casted_key)

..\..\..\.virtualenvs\myeia_313\Lib\site-packages\pandas\core\indexes\base.py:3805: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
index.pyx:167: in pandas._libs.index.IndexEngine.get_loc
    ???
index.pyx:196: in pandas._libs.index.IndexEngine.get_loc
    ???
pandas\\_libs\\hashtable_class_helper.pxi:7081: in pandas._libs.hashtable.PyObjectHashTable.get_item
    ???
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

>   ???
E   KeyError: 'product-name'

pandas\\_libs\\hashtable_class_helper.pxi:7089: KeyError

The above exception was the direct cause of the following exception:

series_id = 'PET.WCESTUS1.W', start_date = '2024-01-01', end_date = '2024-02-01'

    @pytest.mark.parametrize(
        "series_id, start_date, end_date",
        [
            ("NG.RNGC1.D", "2024-01-01", "2024-02-01"),
            ("PET.WCESTUS1.W", "2024-01-01", "2024-02-01"),
            ("INTL.29-12-HKG-BKWH.A", "2024-01-01", "2024-02-01"),
            ("STEO.PATC_WORLD.M", "2024-01-01", "2024-02-01"),
        ],
    )
    def test_get_series(series_id, start_date, end_date):
        """Test get_series method."""
>       df = eia.get_series(series_id, start_date=start_date, end_date=end_date)

test_myeia.py:20: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
..\myeia\api.py:105: in get_series
    df = df.rename(columns={data_identifier: df[col][0]})
..\..\..\.virtualenvs\myeia_313\Lib\site-packages\pandas\core\frame.py:4102: in __getitem__
    indexer = self.columns.get_loc(key)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = Index(['U.S.'], dtype='object'), key = 'product-name'

    def get_loc(self, key):
        """
        Get integer location, slice or boolean mask for requested label.
    
        Parameters
        ----------
        key : label
    
        Returns
        -------
        int if unique index, slice if monotonic index, else mask
    
        Examples
        --------
        >>> unique_index = pd.Index(list('abc'))
        >>> unique_index.get_loc('b')
        1
    
        >>> monotonic_index = pd.Index(list('abbc'))
        >>> monotonic_index.get_loc('b')
        slice(1, 3, None)
    
        >>> non_monotonic_index = pd.Index(list('abcb'))
        >>> non_monotonic_index.get_loc('b')
        array([False,  True, False,  True])
        """
        casted_key = self._maybe_cast_indexer(key)
        try:
            return self._engine.get_loc(casted_key)
        except KeyError as err:
            if isinstance(casted_key, slice) or (
                isinstance(casted_key, abc.Iterable)
                and any(isinstance(x, slice) for x in casted_key)
            ):
                raise InvalidIndexError(key)
>           raise KeyError(key) from err
E           KeyError: 'product-name'

..\..\..\.virtualenvs\myeia_313\Lib\site-packages\pandas\core\indexes\base.py:3812: KeyError
FAILED [ 75%]
tests\test_myeia.py:8 (test_get_series[INTL.29-12-HKG-BKWH.A-2024-01-01-2024-02-01])
series_id = 'INTL.29-12-HKG-BKWH.A', start_date = '2024-01-01'
end_date = '2024-02-01'

    @pytest.mark.parametrize(
        "series_id, start_date, end_date",
        [
            ("NG.RNGC1.D", "2024-01-01", "2024-02-01"),
            ("PET.WCESTUS1.W", "2024-01-01", "2024-02-01"),
            ("INTL.29-12-HKG-BKWH.A", "2024-01-01", "2024-02-01"),
            ("STEO.PATC_WORLD.M", "2024-01-01", "2024-02-01"),
        ],
    )
    def test_get_series(series_id, start_date, end_date):
        """Test get_series method."""
>       df = eia.get_series(series_id, start_date=start_date, end_date=end_date)

test_myeia.py:20: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
..\myeia\api.py:105: in get_series
    df = df.rename(columns={data_identifier: df[col][0]})
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = Series([], Name: productName, dtype: object), key = 0

    def __getitem__(self, key):
        check_dict_or_set_indexers(key)
        key = com.apply_if_callable(key, self)
    
        if key is Ellipsis:
            if using_copy_on_write() or warn_copy_on_write():
                return self.copy(deep=False)
            return self
    
        key_is_scalar = is_scalar(key)
        if isinstance(key, (list, tuple)):
            key = unpack_1tuple(key)
    
        if is_integer(key) and self.index._should_fallback_to_positional:
            warnings.warn(
                # GH#50617
                "Series.__getitem__ treating keys as positions is deprecated. "
                "In a future version, integer keys will always be treated "
                "as labels (consistent with DataFrame behavior). To access "
                "a value by position, use `ser.iloc[pos]`",
                FutureWarning,
                stacklevel=find_stack_level(),
            )
>           return self._values[key]
E           IndexError: index 0 is out of bounds for axis 0 with size 0

..\..\..\.virtualenvs\myeia_313\Lib\site-packages\pandas\core\series.py:1118: IndexError
PASSED [100%]
Process finished with exit code 1

* added backoff giveup condition
* added message for 403 errors
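
The idea behind a give-up condition and a dedicated 403 message can be sketched as follows. This is a standard-library illustration under stated assumptions, not the change merged here and not the API of any retry package: `HTTPStatusError`, `should_give_up`, `describe_error`, and `call_with_retries` are hypothetical names, and the wording of the 403 hint is invented for the example.

```python
import time


class HTTPStatusError(Exception):
    """Hypothetical exception that carries an HTTP status code."""

    def __init__(self, status_code):
        super().__init__(f"HTTP {status_code}")
        self.status_code = status_code


def should_give_up(exc):
    """Stop retrying on client errors, which a retry cannot fix.

    429 (Too Many Requests) is excluded because waiting can help there.
    """
    return 400 <= exc.status_code < 500 and exc.status_code != 429


def describe_error(exc):
    """Return a readable message; the 403 text is an assumed example."""
    if exc.status_code == 403:
        return "403 Forbidden: check that the API key is set and valid."
    return str(exc)


def call_with_retries(func, max_tries=3, base_delay=0.0):
    """Call func, retrying with exponential delay unless we should give up."""
    for attempt in range(1, max_tries + 1):
        try:
            return func()
        except HTTPStatusError as exc:
            if should_give_up(exc) or attempt == max_tries:
                raise RuntimeError(describe_error(exc)) from exc
            time.sleep(base_delay * 2 ** (attempt - 1))
```

With this shape, a 503 is retried up to `max_tries` times, while a 403 fails on the first attempt with the explanatory message instead of burning through all retries.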
@philsv
Owner

philsv commented Oct 7, 2024

Hi @csautter,

thank you for your contribution! However, it's not entirely true that the get_series() function is deprecated by EIA.

If you look at the documentation: https://www.eia.gov/opendata/documentation.php#Submittingrequesttoour

You will find this text:

This APIv1 method is now deprecated. However, if you are fond of APIv1 Series IDs, you may invoke them using the special /series/ route, as follows. Note that we use a full route, but at the end we include the APIv1 Series_ID:

https://api.eia.gov/v2/seriesid/ELEC.SALES.CO-RES.A?api_key=xxxxxx
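
As a rough sketch of that URL pattern, the request address can be assembled from a series ID and an API key with the standard library. The function name `build_seriesid_url` is hypothetical, and only the `api_key` query parameter shown in the example URL is used; no other parameters are assumed.

```python
from urllib.parse import quote, urlencode

# Base of the route shown in the example URL above.
SERIESID_BASE = "https://api.eia.gov/v2/seriesid/"


def build_seriesid_url(series_id, api_key):
    """Build the v2 seriesid URL for an APIv1 series ID (illustrative helper)."""
    # quote() escapes characters that are unsafe in a URL path segment;
    # letters, digits, dots, and hyphens pass through unchanged.
    path = quote(series_id, safe="")
    query = urlencode({"api_key": api_key})
    return f"{SERIESID_BASE}{path}?{query}"
```

Calling `build_seriesid_url("ELEC.SALES.CO-RES.A", "xxxxxx")` reproduces the example URL quoted from the documentation.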

I think the tests were failing due to the date ranges.

That said, the rest of the code you provided seems very helpful. If that's okay with you, I will accept your pull request but remove the deprecation decorator from the get_series() function.

Could you try the tests again for the get_series() function on your side with a different start_date value, and let me know if they are still failing?

@philsv philsv merged commit 0767138 into philsv:main Oct 8, 2024
2 checks passed
@csautter csautter deleted the dev branch October 8, 2024 11:28