1 change: 1 addition & 0 deletions .codespellrc
@@ -13,3 +13,4 @@ ignore-words-list =
sav,
te,
upto,
parms,
8 changes: 4 additions & 4 deletions .pre-commit-config.yaml
@@ -6,14 +6,14 @@ repos:
hooks:
- id: ruff
args: ['--fix', '--unsafe-fixes']
- id: ruff-format
- repo: https://github.com/PyCQA/isort
rev: 6.0.0
hooks:
- id: isort
- id: ruff-format
- repo: https://github.com/PyCQA/isort
rev: 5.13.2
hooks:
- repo: https://github.com/PyCQA/isort
rev: 6.0.0
hooks:
- id: isort
- repo: https://github.com/pre-commit/pre-commit-hooks
rev: v5.0.0
3 changes: 3 additions & 0 deletions .ruff.toml
@@ -79,6 +79,9 @@ extend-ignore = [
"examples/plot_hmi_modes.py" = [
"E741", # Ambiguous variable name
]
"drms/json.py" = [
"A005", # Module `json` shadows a Python standard-library module
]

[lint.pydocstyle]
convention = "numpy"
1 change: 1 addition & 0 deletions changelog/137.feature.rst
@@ -0,0 +1 @@
Added a ``timeout`` keyword to :meth:`drms.client.ExportRequest.download`; if a socket default timeout is set, it takes precedence over this value.
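
A minimal usage sketch of the new keyword (the series string is the one used in the tutorial below; the e-mail address, output directory, and timeout values are illustrative, and a JSOC-registered export e-mail is required):

.. code-block:: python

    import socket
    from pathlib import Path

    import drms

    # Placeholder e-mail; JSOC exports require a registered address.
    client = drms.Client(email="you@example.com")
    request = client.export("hmi.v_45s[2016.04.01_TAI/1d@6h]{Dopplergram}", method="url", protocol="fits")
    request.wait()

    out_dir = Path("downloads")
    out_dir.mkdir(exist_ok=True)

    # Explicit per-call timeout of 120 seconds for each file download.
    request.download(out_dir, timeout=120)

    # A socket-level default timeout, if set, takes precedence over the keyword.
    socket.setdefaulttimeout(30)
    request.download(out_dir)
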
36 changes: 19 additions & 17 deletions docs/tutorial.rst
@@ -53,14 +53,16 @@ You can also use :meth:`drms.client.Client.info` to get more detailed information

.. code-block:: python

>>> series_info = client.info('hmi.v_avg120') # doctest: +REMOTE_DATA
>>> series_info = client.info('hmi.v_sht_2drls') # doctest: +REMOTE_DATA
>>> series_info.segments # doctest: +REMOTE_DATA
type units protocol dims note
type units protocol dims note
name
mean short m/s fits 4096x4096 Doppler mean
power short m2/s2 fits 4096x4096 Doppler power
valid short NA fits 4096x4096 valid pixel count
Log char NA generic run log
split string none generic calculated splittings
rot string none generic rotation profile
err string none generic errors
mesh string none generic radial grid points
parms string none generic input parameters
log string none generic standard output

All table-like structures, returned by routines in the ``drms`` module, are `Pandas DataFrames <https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.html>`__.
If you are new to `Pandas <https://pandas.pydata.org/>`__, you should have a look at the introduction to `Pandas Data Structures <https://pandas.pydata.org/pandas-docs/stable/dsintro.html>`__.
@@ -200,7 +202,7 @@ Note that :meth:`drms.client.Client.export` performs an ``url_quick`` / ``as-is`
.. code-block:: python

>>> export_request = client.export('hmi.v_45s[2016.04.01_TAI/1d@6h]{Dopplergram}') # doctest: +REMOTE_DATA
>>> export_request # doctest: +REMOTE_DATA
>>> export_request # doctest: +SKIP
Review comment (Member): Why not just skip the whole file? Wouldn't it be easier to revert?

Reply (Member Author): I wanted to see what was broken.

<ExportRequest: id=None, status=0>

>>> export_request.data.filename # doctest: +REMOTE_DATA
@@ -246,18 +248,18 @@ First we have a look at the content of the series, by using :meth:`drms.client.C
.. code-block:: python

>>> series_info = client.info('hmi.sharp_720s') # doctest: +REMOTE_DATA
>>> series_info.note # doctest: +REMOTE_DATA
>>> series_info.note # doctest: +SKIP
'Spaceweather HMI Active Region Patch (SHARP): CCD coordinates'
>>> series_info.primekeys # doctest: +REMOTE_DATA
>>> series_info.primekeys # doctest: +SKIP
['HARPNUM', 'T_REC']

This series contains a total of 31 different data segments:

.. code-block:: python

>>> len(series_info.segments) # doctest: +REMOTE_DATA
>>> len(series_info.segments) # doctest: +SKIP
31
>>> series_info.segments.index.values # doctest: +REMOTE_DATA
>>> series_info.segments.index.values # doctest: +SKIP
array(['magnetogram', 'bitmap', 'Dopplergram', 'continuum', 'inclination',
'azimuth', 'field', 'vlos_mag', 'dop_width', 'eta_0', 'damping',
'src_continuum', 'src_grad', 'alpha_mag', 'chisq', 'conv_flag',
@@ -271,7 +273,7 @@ Here, we are only interested in magnetograms and continuum intensity maps:

.. code-block:: python

>>> series_info.segments.loc[['continuum', 'magnetogram']] # doctest: +REMOTE_DATA
>>> series_info.segments.loc[['continuum', 'magnetogram']] # doctest: +SKIP
type units protocol dims note
name
continuum int DN/s fits VARxVAR continuum intensity
@@ -290,17 +292,17 @@ In order to obtain FITS files that include keyword data in their headers, we the
.. code-block:: python

>>> export_request = client.export(query_string, method='url', protocol='fits') # doctest: +REMOTE_DATA
>>> export_request # doctest: +REMOTE_DATA
>>> export_request # doctest: +SKIP
<ExportRequest: id=JSOC_..., status=2>

We now need to wait for the server to prepare the requested files:

.. code-block:: python

>>> export_request.wait() # doctest: +REMOTE_DATA
>>> export_request.wait() # doctest: +SKIP
True

>>> export_request.status # doctest: +REMOTE_DATA
>>> export_request.status # doctest: +SKIP
0

Note that calling :meth:`drms.client.ExportRequest.wait` is optional.
@@ -311,7 +313,7 @@ You can use the :attr:`drms.client.ExportRequest.request_url` attribute to obtain

.. code-block:: python

>>> export_request.request_url # doctest: +REMOTE_DATA
>>> export_request.request_url # doctest: +SKIP
'http://jsoc.stanford.edu/.../S00000'

Note that this location is only temporary and that all files will be deleted after a couple of days.
@@ -320,7 +322,7 @@ Downloading the data works exactly like in the previous example, by using :meth:

.. code-block:: python

>>> export_request.download(out_dir) # doctest: +REMOTE_DATA
>>> export_request.download(out_dir) # doctest: +SKIP
record url download
0 warning=No FITS files were exported. The reque... http://jsoc.stanford.edu/... /...

22 changes: 16 additions & 6 deletions drms/client.py
@@ -1,16 +1,19 @@
import os
import re
import time
import shutil
import socket
from pathlib import Path
from collections import OrderedDict
from urllib.error import URLError, HTTPError
from urllib.parse import urljoin
from urllib.request import urlretrieve
from urllib.request import urlopen

import numpy as np
import pandas as pd

from drms import logger
from drms.utils import create_request_with_header
from .exceptions import DrmsExportError, DrmsOperationNotSupported, DrmsQueryError
from .json import HttpJsonClient
from .utils import _extract_series_name, _pd_to_numeric_coerce, _split_arg
@@ -197,8 +200,6 @@ def _generate_download_urls(self):

if self.method.startswith("url"):
baseurl = self._client._server.http_download_baseurl
elif self.method.startswith("ftp"):
baseurl = self._client._server.ftp_download_baseurl
else:
raise RuntimeError(f"Download is not supported for export method {self.method}")

@@ -408,7 +409,7 @@ def wait(self, *, timeout=None, sleep=5, retries_notfound=5):
----------
timeout : int or None
Maximum number of seconds until this method times out. If
set to None (the default), the status will be updated
set to `None` (the default), the status will be updated
indefinitely until the request succeeded or failed.
sleep : int or None
Time in seconds between status updates (defaults to 5
@@ -469,7 +470,7 @@ def wait(self, *, timeout=None, sleep=5, retries_notfound=5):
logger.info(f"Request not found on server, {retries_notfound} retries left.")
retries_notfound -= 1

def download(self, directory, *, index=None, fname_from_rec=None):
def download(self, directory, *, index=None, fname_from_rec=None, timeout=60):
"""
Download data files.

@@ -505,6 +506,10 @@ def download(self, directory, *, index=None, fname_from_rec=None):
generated. This also applies to movie files from exports
with protocols 'mpg' or 'mp4', where the original filename
is used locally.
timeout : float, optional
Timeout in seconds passed to ``urlopen``; defaults to 60 seconds.
If a socket default timeout has been set via
`socket.setdefaulttimeout`, that value is used instead.

Returns
-------
@@ -554,7 +559,12 @@ def download(self, directory, *, index=None, fname_from_rec=None):
logger.info(f" record: {di.record}")
logger.info(f" filename: {di.filename}")
try:
urlretrieve(di.url, fpath_tmp)
timeout = socket.getdefaulttimeout() or timeout
with (
urlopen(create_request_with_header(di.url), timeout=timeout) as response,
open(fpath_tmp, "wb") as out_file,
):
shutil.copyfileobj(response, out_file)
except (HTTPError, URLError):
fpath_new = None
logger.info(" -> Error: Could not download file")
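
For context, the ``urlretrieve`` replacement in ``download`` reduces to the pattern below. This is a standalone sketch rather than the library code: the URL, filename, and User-Agent string are placeholders, and in ``drms`` itself the request object comes from the ``create_request_with_header`` helper imported above.

.. code-block:: python

    import shutil
    import socket
    from urllib.request import Request, urlopen

    def fetch(url, fpath_tmp, timeout=60):
        # A socket-level default timeout, if set, takes precedence over the
        # keyword value, mirroring ExportRequest.download().
        timeout = socket.getdefaulttimeout() or timeout
        # Placeholder request; drms builds this via create_request_with_header().
        request = Request(url, headers={"User-Agent": "drms-example"})
        with urlopen(request, timeout=timeout) as response, open(fpath_tmp, "wb") as out_file:
            # Stream the response to disk without loading it into memory.
            shutil.copyfileobj(response, out_file)

    # Illustrative call; the URL is a placeholder.
    # fetch("http://jsoc.stanford.edu/some/exported/file.fits", "file.fits")
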
3 changes: 0 additions & 3 deletions drms/config.py
@@ -27,7 +27,6 @@ class ServerConfig:
url_show_series_wrapper
encoding
http_download_baseurl
ftp_download_baseurl

Parameters
----------
@@ -54,7 +53,6 @@ class ServerConfig:
"url_show_series_wrapper",
"encoding",
"http_download_baseurl",
"ftp_download_baseurl",
)

def __init__(self, config=None, **kwargs):
@@ -151,7 +149,6 @@ def register_server(config):
cgi_show_series_wrapper="showextseries",
show_series_wrapper_dbhost="hmidb2",
http_download_baseurl="http://jsoc.stanford.edu/",
ftp_download_baseurl="ftp://pail.stanford.edu/export/",
),
)

8 changes: 5 additions & 3 deletions drms/json.py
@@ -1,11 +1,12 @@
import json as _json
import socket
from enum import Enum
from urllib.parse import urlencode, quote_plus
from urllib.request import HTTPError, urlopen

from drms import logger
from .config import ServerConfig, _server_configs
from .utils import _split_arg
from .utils import _split_arg, create_request_with_header

__all__ = ["HttpJsonClient", "HttpJsonRequest", "JsocInfoConstants"]

@@ -36,10 +37,11 @@ class HttpJsonRequest:
Use `HttpJsonClient` to create an instance.
"""

def __init__(self, url, encoding):
def __init__(self, url, encoding, timeout=60):
timeout = socket.getdefaulttimeout() or timeout
self._encoding = encoding
try:
self._http = urlopen(url)
self._http = urlopen(create_request_with_header(url), timeout=timeout)
except HTTPError as e:
e.msg = f"Failed to open URL: {e.url} with {e.code} - {e.msg}"
raise e
6 changes: 4 additions & 2 deletions drms/tests/conftest.py
@@ -4,6 +4,8 @@

import pytest

from drms.utils import create_request_with_header

# Test URLs, used to check if an online site is reachable
jsoc_testurl = "http://jsoc.stanford.edu/"
kis_testurl = "http://drms.leibniz-kis.de/"
@@ -23,12 +25,12 @@ def __call__(self):
return self.result


def site_reachable(url, timeout=15):
def site_reachable(url, timeout=60):
"""
Checks if the given URL is accessible.
"""
try:
urlopen(url, timeout=timeout)
urlopen(create_request_with_header(url), timeout=timeout)
except (URLError, HTTPError):
return False
return True
2 changes: 0 additions & 2 deletions drms/tests/test_config.py
@@ -68,7 +68,6 @@ def test_config_jsoc():
assert isinstance(cfg.cgi_show_series_wrapper, str)
assert isinstance(cfg.show_series_wrapper_dbhost, str)
assert cfg.http_download_baseurl.startswith("http://")
assert cfg.ftp_download_baseurl.startswith("ftp://")

baseurl = cfg.cgi_baseurl
assert baseurl.startswith("http://")
@@ -93,7 +92,6 @@ def test_config_kis():
assert cfg.cgi_show_series_wrapper is None
assert cfg.show_series_wrapper_dbhost is None
assert cfg.http_download_baseurl is None
assert cfg.ftp_download_baseurl is None

baseurl = cfg.cgi_baseurl
assert baseurl.startswith("http://")
16 changes: 8 additions & 8 deletions drms/tests/test_jsoc_export.py
@@ -8,7 +8,7 @@
@pytest.mark.parametrize("method", ["url_quick", "url"])
def test_export_asis_basic(jsoc_client_export, method):
r = jsoc_client_export.export(
"hmi.v_avg120[2150]{mean,power}",
"hmi.v_sht_2drls[2024.09.19_00:00:00_TAI]{split,rot,err}",
protocol="as-is",
method=method,
requester=False,
@@ -18,18 +18,18 @@ def test_export_asis_basic(jsoc_client_export, method):
assert r.wait(timeout=60)
assert r.has_succeeded()
assert r.protocol == "as-is"
assert len(r.urls) == 12 # 6 files per segment
assert len(r.urls) == 9 # 3 files per segment

for record in r.urls.record:
record = record.lower()
assert record.startswith("hmi.v_avg120[2150]")
assert record.endswith(("{mean}", "{power}"))
assert record.startswith("hmi.v_sht_2drls[2024.09.19_00:00:00_tai]")
assert record.endswith(("{split}", "{rot}", "{err}"))

for filename in r.urls.filename:
assert filename.endswith(("mean.fits", "power.fits"))
assert filename.endswith(("err.2d", "rot.2d", "splittings.out"))

for url in r.urls.url:
assert url.endswith(("mean.fits", "power.fits"))
assert url.endswith(("err.2d", "rot.2d", "splittings.out"))


@pytest.mark.jsoc()
@@ -69,7 +69,7 @@ def test_export_im_patch(jsoc_client_export):
# that this has not happened.
process = {
"im_patch": {
"t_ref": "2015-10-17T04:33:30.000",
"t_ref": "2025-01-01T04:33:30.000",
"t": 0,
"r": 0,
"c": 0,
@@ -82,7 +82,7 @@
},
}
req = jsoc_client_export.export(
"aia.lev1_euv_12s[2015-10-17T04:33:30.000/1m@12s][171]{image}",
"aia.lev1_euv_12s[2025-01-01T04:33:30.000/1m@12s][171]{image}",
method="url",
protocol="fits",
process=process,
4 changes: 2 additions & 2 deletions drms/tests/test_jsoc_info.py
@@ -8,7 +8,7 @@
[
("hmi.v_45s", ["T_REC", "CAMERA"], ["Dopplergram"]),
("hmi.m_720s", ["T_REC", "CAMERA"], ["magnetogram"]),
("hmi.v_avg120", ["CarrRot", "CMLon"], ["mean", "power", "valid", "Log"]),
("hmi.v_sht_2drls", ["LMIN", "NACOEFF"], ["split", "rot", "err"]),
],
)
def test_series_info_basic(jsoc_client, series, pkeys, segments):
@@ -28,7 +28,7 @@ def test_series_info_basic(jsoc_client, series, pkeys, segments):
[
("hmi.v_45s", ["T_REC", "CAMERA"]),
("hmi.m_720s", ["T_REC", "CAMERA"]),
("hmi.v_avg120", ["CarrRot", "CMLon"]),
("hmi.v_sht_2drls", ["LMIN", "NACOEFF"]),
("aia.lev1", ["T_REC", "FSN"]),
("aia.lev1_euv_12s", ["T_REC", "WAVELNTH"]),
("aia.response", ["T_START", "WAVE_STR"]),
15 changes: 14 additions & 1 deletion drms/tests/test_json.py
@@ -1,7 +1,9 @@
from unittest.mock import patch

import pytest

from drms.client import Client
from drms.json import JsocInfoConstants
from drms.json import HttpJsonRequest, JsocInfoConstants


@pytest.mark.remote_data()
@@ -10,3 +12,14 @@ def test_jsocinfoconstants():
assert JsocInfoConstants.all == "**ALL**"
client = Client()
client.query("hmi.synoptic_mr_720s[2150]", key=JsocInfoConstants.all, seg="synopMr")


def test_request_headers():
with patch("drms.json.urlopen") as mock:
HttpJsonRequest("http://example.com", "latin1")

actual_request = mock.call_args[0][0]
assert actual_request.headers["User-agent"]
assert "drms/" in actual_request.headers["User-agent"]
assert "python/" in actual_request.headers["User-agent"]
assert actual_request.full_url == "http://example.com"
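
The ``create_request_with_header`` helper itself does not appear in this diff; a minimal sketch consistent with what ``test_request_headers`` asserts (a ``User-agent`` header containing ``drms/`` and ``python/``) could look like the following. The real helper lives in ``drms.utils`` and its exact header format may differ.

.. code-block:: python

    import platform
    from urllib.request import Request

    import drms

    def create_request_with_header(url):
        # Hypothetical sketch only: builds a Request whose User-agent header
        # names the drms and Python versions, which is all the test checks.
        user_agent = f"drms/{drms.__version__} python/{platform.python_version()}"
        return Request(url, headers={"User-Agent": user_agent})
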