Commit 933be09

committed
Visualization
1 parent af4838c commit 933be09

7 files changed: 141 additions and 26 deletions

docs/getting_started_with_epidatpy.rst

Lines changed: 24 additions & 4 deletions
@@ -77,7 +77,7 @@ The ``pub_covidcast`` function lets us access the ``covidcast`` endpoint:
 
    print(apicall)
 
-``pub_covidcast`` returns an ``EpiDataCall``, which can be further converted into different output formats - such as a Pandas DataFrame:
+``pub_covidcast`` returns an ``EpiDataCall``, which is a not-yet-executed query that can be inspected. The query can be executed and converted to a DataFrame by using the ``.df()`` method:
 
 .. exec::
    :context: true
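
Note on the reworded sentence above: the idea of a "not-yet-executed query" can be illustrated with a small deferred-call object. This is only a sketch of the general pattern. `LazyQuery`, `fake_fetch`, and the parameter names are hypothetical and are not taken from epidatpy's source; the real `EpiDataCall` may be built differently.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class LazyQuery:
    """Hypothetical stand-in for a deferred API call (not epidatpy's class)."""

    endpoint: str
    params: dict
    fetch: Callable  # called as fetch(endpoint, params) -> list of row dicts

    def __str__(self) -> str:
        # Describing the query only formats its parameters; nothing is fetched.
        args = ", ".join(f"{k}={v!r}" for k, v in sorted(self.params.items()))
        return f"LazyQuery({self.endpoint}: {args})"

    def rows(self) -> list:
        # The request is performed only when results are asked for.
        return self.fetch(self.endpoint, self.params)


calls = []


def fake_fetch(endpoint, params):
    # Assumed stand-in for a network request; records that it was called.
    calls.append(endpoint)
    return [{"geo_value": "pa", "value": 1.0}]


query = LazyQuery("covidcast", {"signals": "smoothed_cli", "geo_type": "state"}, fake_fetch)
description = str(query)          # inspect the query
fetches_before_rows = len(calls)  # still zero at this point
rows = query.rows()               # the fetch happens here
```

Separating description from execution is what lets the documentation print the call first and only later turn it into a DataFrame.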
@@ -175,9 +175,29 @@ it using any of the available Python libraries:
 
 .. code-block:: python
 
-   data.plot(x="time_value", y="value", title="Smoothed CLI from Facebook Survey", xlabel="Date", ylabel="CLI")
-
-.. image:: images/Figure_1.png
+   import matplotlib.pyplot as plt
+
+   fig, ax = plt.subplots(figsize=(6, 5))
+   plt.rc("axes", titlesize=16)
+   plt.rc("axes", labelsize=16)
+   plt.rc("xtick", labelsize=14)
+   plt.rc("ytick", labelsize=14)
+   ax.spines["right"].set_visible(False)
+   ax.spines["left"].set_visible(False)
+   ax.spines["top"].set_visible(False)
+
+   data.pivot_table(values = "value", index = "time_value", columns = "geo_value").plot(
+       title="Smoothed CLI from Facebook Survey",
+       xlabel="Date",
+       ylabel="CLI",
+       ax = ax,
+       linewidth = 1.5
+   )
+
+   plt.subplots_adjust(bottom=.2)
+   plt.show()
+
+.. image:: images/Getting_Started.png
    :width: 800
    :alt: Smoothed CLI from Facebook Survey
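
Note on the new plotting code: the `pivot_table` call reshapes the long API result (one row per date and geography) into a wide table with one column per `geo_value`, which is why `.plot()` then draws one line per state. The sketch below shows the reshaping alone on made-up values; the numbers are illustrative and are not API output.

```python
import pandas as pd

# Made-up values in the long layout: one row per (time_value, geo_value).
long_df = pd.DataFrame(
    {
        "time_value": pd.to_datetime(
            ["2021-04-05", "2021-04-05", "2021-04-06", "2021-04-06"]
        ),
        "geo_value": ["ca", "pa", "ca", "pa"],
        "value": [0.5, 0.7, 0.6, 0.8],
    }
)

# Wide layout: dates as the index, one column per geography.
wide_df = long_df.pivot_table(values="value", index="time_value", columns="geo_value")
```

`pivot_table` averages duplicate (index, column) pairs by default, so it assumes each date and geography appears at most once if the plotted values are meant to be the raw ones.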

docs/images/Figure_1.png

-23.7 KB
Binary file not shown.

docs/images/Getting_Started.png

117 KB

docs/images/Versioned_Data.png

187 KB

docs/signal_discovery.rst

Lines changed: 20 additions & 18 deletions
@@ -28,16 +28,20 @@ data streams which are publically accessible from the COVIDcast API. See the `da
 and signals documentation <https://cmu-delphi.github.io/delphi-epidata/api/covidcast_signals.html>`_
 for descriptions of the available sources.
 
->>> from epidatpy import CovidcastEpidata
->>> epidata = CovidcastEpidata()
->>> sources = epidata.source_df
->>> sources.head()
-source name description reference_signal license dua signals
-0 chng Change Healthcare Change Healthcare is a healthcare technology c... smoothed_outpatient_cli CC BY-NC https://cmu.box.com/s/cto4to822zecr3oyq1kkk9xm... smoothed_outpatient_cli,smoothed_adj_outpatien...
-1 covid-act-now Covid Act Now (CAN) COVID Act Now (CAN) tracks COVID-19 testing st... pcr_specimen_total_tests CC BY-NC None pcr_specimen_positivity_rate,pcr_specimen_tota...
-2 doctor-visits Doctor Visits From Claims Information about outpatient visits, provided ... smoothed_cli CC BY https://cmu.box.com/s/l2tz6kmiws6jyty2azwb43po... smoothed_cli,smoothed_adj_cli
-3 fb-survey Delphi US COVID-19 Trends and Impact Survey We conduct the Delphi US COVID-19 Trends and I... smoothed_cli CC BY https://cmu.box.com/s/qfxplcdrcn9retfzx4zniyug... raw_wcli,raw_cli,smoothed_cli,smoothed_wcli,ra...
-4 google-symptoms Google Symptoms Search Trends Google's [COVID-19 Search Trends symptoms data... s05_smoothed_search To download or use the data, you must agree to... None ageusia_raw_search,ageusia_smoothed_search,ano...
+.. exec::
+   :context: true
+
+   from epidatpy import CovidcastEpidata
+   import pandas as pd
+
+   pd.set_option('display.max_columns', None)
+   pd.set_option('display.max_rows', None)
+   pd.set_option('display.width', 1000)
+
+   epidata = CovidcastEpidata()
+   sources = epidata.source_df
+
+   print(sources.head())
 
 This DataFrame contains the following columns:

@@ -52,14 +56,12 @@ The ``signal_df`` DataFrame can also be used to obtain information about the sig
 that are available - for example, what time range they are available for,
 and when they have been updated.
 
->>> signals = epidata.signal_df
->>> signals.head()
-source signal name active short_description description time_type time_label value_label format category high_values_are is_smoothed is_weighted is_cumulative has_stderr has_sample_size geo_types
-0 chng smoothed_outpatient_cli COVID-Related Doctor Visits False Estimated percentage of outpatient doctor visi... Estimated percentage of outpatient doctor visi... day Date Value raw early bad True False False False False county,hhs,hrr,msa,nation,state
-1 chng smoothed_adj_outpatient_cli COVID-Related Doctor Visits (Day-adjusted) False Estimated percentage of outpatient doctor visi... Estimated percentage of outpatient doctor visi... day Date Value raw early bad True False False False False county,hhs,hrr,msa,nation,state
-2 chng smoothed_outpatient_covid COVID-Confirmed Doctor Visits False COVID-Confirmed Doctor Visits Estimated percentage of outpatient doctor visi... day Date Value raw early bad True False False False False county,hhs,hrr,msa,nation,state
-3 chng smoothed_adj_outpatient_covid COVID-Confirmed Doctor Visits (Day-adjusted) False COVID-Confirmed Doctor Visits Estimated percentage of outpatient doctor visi... day Date Value raw early bad True False False False False county,hhs,hrr,msa,nation,state
-4 chng smoothed_outpatient_flu Influenza-Confirmed Doctor Visits False Estimated percentage of outpatient doctor visi... Estimated percentage of outpatient doctor visi... day Day Value raw early bad True False False None None county,hhs,hrr,msa,nation,state
+.. exec::
+   :context: true
+
+   signals = epidata.signal_df
+
+   print(signals.head())
 
 This DataFrame contains one row each available signal, with the following columns:
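
Note on the display settings in this file: `pd.set_option` changes pandas' display options for the rest of the session, so every later printout is affected. Where that is not wanted, `pd.option_context` applies the same options only inside a `with` block. The sketch below uses a made-up 30-column frame rather than `source_df`, and the limit of 5 columns is chosen only to force truncation.

```python
import pandas as pd

# Made-up wide frame standing in for a metadata table with many columns.
wide = pd.DataFrame({f"col_{i}": [i] for i in range(30)})

max_columns_before = pd.get_option("display.max_columns")

# A small column limit makes pandas elide the middle columns.
with pd.option_context("display.max_columns", 5, "display.width", 80):
    truncated_text = repr(wide)

# Same effect as the set_option calls in the diff, but scoped to the block.
with pd.option_context("display.max_columns", None, "display.width", 1000):
    full_text = repr(wide)

max_columns_after = pd.get_option("display.max_columns")
```

After both blocks the option is back at its previous value, which is the difference from calling `pd.set_option` at the top of a page.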

docs/versioned_data.rst

Lines changed: 45 additions & 0 deletions
@@ -90,6 +90,51 @@ for forecasting tasks. To backtest a forecasting model on past data, it is
 important to use the data that would have been available *at the time* the model
 was or would have been fit, not data that arrived much later.
 
+By plotting API results with different values of the ``as_of`` parameter, we can
+see how the indicator value changes over time as new observations become available:
+
+.. code-block:: python
+
+   results = []
+   for as_of_date in ["2020-05-07", "2020-05-14", "2020-05-21", "2020-05-28"]:
+       apicall = epidata.pub_covidcast(
+           data_source = "doctor-visits",
+           signals = "smoothed_adj_cli",
+           time_type = "day",
+           time_values = EpiRange("2020-04-20", "2020-04-27"),
+           geo_type = "state",
+           geo_values = "pa",
+           as_of = as_of_date)
+
+       results.append(apicall.df())
+
+   final_df = pd.concat(results)
+   final_df["issue"] = final_df["issue"].dt.date
+
+   fig, ax = plt.subplots(figsize=(6, 5))
+   ax.spines["right"].set_visible(False)
+   ax.spines["left"].set_visible(False)
+   ax.spines["top"].set_visible(False)
+
+   def sub_cmap(cmap, vmin, vmax):
+       return lambda v: cmap(vmin + (vmax - vmin) * v)
+
+   final_df.pivot_table(values = "value", index = "time_value", columns = "issue").plot(
+       xlabel="Date",
+       ylabel="CLI",
+       ax = ax,
+       linewidth = 1.5,
+       colormap=sub_cmap(plt.get_cmap('viridis').reversed(), 0.2, 1)
+   )
+
+   plt.title("Smoothed CLI from Doctor Visits", fontsize=16)
+   plt.subplots_adjust(bottom=.2)
+   plt.show()
+
+.. image:: images/Versioned_Data.png
+   :width: 800
+   :alt: Smoothed CLI from Facebook Survey
+
 Multiple issues of observations
 -------------------------------
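
Note on `sub_cmap`: the helper wraps a colormap so that inputs in [0, 1] are rescaled onto [vmin, vmax], which restricts the plot to part of the color range. With reversed viridis and a lower bound of 0.2, the palest yellow end is skipped. The sketch below repeats the helper from the diff and exercises it with `toy_gray`, a hypothetical stand-in colormap, so that it runs without matplotlib; matplotlib colormaps are likewise callables over [0, 1].

```python
def sub_cmap(cmap, vmin, vmax):
    # Same helper as in the diff: map v in [0, 1] linearly onto [vmin, vmax].
    return lambda v: cmap(vmin + (vmax - vmin) * v)


def toy_gray(x):
    """Hypothetical stand-in colormap: position x in [0, 1] -> RGBA gray."""
    return (x, x, x, 1.0)


restricted = sub_cmap(toy_gray, 0.2, 1)

# The endpoints of the restricted map land on 0.2 and 1 of the original map.
low_end = restricted(0.0)
high_end = restricted(1.0)
midpoint = restricted(0.5)
```

Because the result is a plain callable, it can be passed wherever only calling the colormap is required, as the `colormap=` argument in the diff relies on.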

docs_charts.py

Lines changed: 52 additions & 4 deletions
@@ -15,14 +15,62 @@
 apicall = epidata.pub_covidcast(
     data_source = "fb-survey",
     signals = "smoothed_cli",
-    geo_type = "nation",
+    geo_type = "state",
+    geo_values = "pa,ca,fl",
     time_type = "day",
-    geo_values = "us",
     time_values = EpiRange(20210405, 20210410))
 print(apicall)
 
 data = apicall.df()
 
-data.plot(x="time_value", y="value", title="Smoothed CLI from Facebook Survey", xlabel="Date", ylabel="CLI")
+fig, ax = plt.subplots(figsize=(6, 5))
+ax.spines["right"].set_visible(False)
+ax.spines["left"].set_visible(False)
+ax.spines["top"].set_visible(False)
+
+data.pivot_table(values = "value", index = "time_value", columns = "geo_value").plot(
+    xlabel="Date",
+    ylabel="CLI",
+    ax = ax,
+    linewidth = 1.5
+)
+
+plt.title("Smoothed CLI from Facebook Survey", fontsize=16)
+plt.subplots_adjust(bottom=.2)
+plt.savefig("docs/images/Getting_Started.png", dpi=300)
+
+results = []
+for as_of_date in ["2020-05-07", "2020-05-14", "2020-05-21", "2020-05-28"]:
+    apicall = epidata.pub_covidcast(
+        data_source = "doctor-visits",
+        signals = "smoothed_adj_cli",
+        time_type = "day",
+        time_values = EpiRange("2020-04-20", "2020-04-27"),
+        geo_type = "state",
+        geo_values = "pa",
+        as_of = as_of_date)
+
+    results.append(apicall.df())
+
+final_df = pd.concat(results)
+final_df["issue"] = final_df["issue"].dt.date
+
+fig, ax = plt.subplots(figsize=(6, 5))
+ax.spines["right"].set_visible(False)
+ax.spines["left"].set_visible(False)
+ax.spines["top"].set_visible(False)
+
+def sub_cmap(cmap, vmin, vmax):
+    return lambda v: cmap(vmin + (vmax - vmin) * v)
+
+final_df.pivot_table(values = "value", index = "time_value", columns = "issue").plot(
+    xlabel="Date",
+    ylabel="CLI",
+    ax = ax,
+    linewidth = 1.5,
+    colormap=sub_cmap(plt.get_cmap('viridis').reversed(), 0.2, 1)
+)
+
+plt.title("Smoothed CLI from Doctor Visits", fontsize=16)
 plt.subplots_adjust(bottom=.2)
-plt.show()
+plt.savefig("docs/images/Versioned_Data.png", dpi=300)
