update documentation with Sphinx
lisphilar committed Aug 25, 2023
1 parent 761293a commit b293633
Showing 35 changed files with 2,121 additions and 1,723 deletions.
100 changes: 50 additions & 50 deletions docs/01_data_preparation.html
@@ -248,11 +248,11 @@ <h3>1-1. With <code class="docutils literal notranslate"><span class="pre">DataE
</div>
<div class="output_area docutils container">
<div class="highlight"><pre>
2023-07-20 at 13:37:47 | <span class="ansi-bold">INFO</span> | <span class="ansi-bold">Retrieving COVID-19 dataset from https://github.com/lisphilar/covid19-sir/data/</span>
2023-07-20 at 13:37:48 | <span class="ansi-bold">INFO</span> | <span class="ansi-bold">Retrieving datasets from COVID-19 Data Hub https://covid19datahub.io/</span>
2023-07-20 at 13:37:56 | <span class="ansi-bold">INFO</span> | <span class="ansi-bold">Retrieving datasets from Our World In Data https://github.com/owid/covid-19-data/</span>
2023-07-20 at 13:37:57 | <span class="ansi-bold">INFO</span> | <span class="ansi-bold">Retrieving datasets from Our World In Data https://github.com/owid/covid-19-data/</span>
2023-07-20 at 13:37:58 | <span class="ansi-bold">INFO</span> | <span class="ansi-bold">Retrieving datasets from Our World In Data https://github.com/owid/covid-19-data/</span>
2023-08-25 at 05:04:23 | <span class="ansi-bold">INFO</span> | <span class="ansi-bold">Retrieving COVID-19 dataset from https://github.com/lisphilar/covid19-sir/data/</span>
2023-08-25 at 05:04:23 | <span class="ansi-bold">INFO</span> | <span class="ansi-bold">Retrieving datasets from COVID-19 Data Hub https://covid19datahub.io/</span>
2023-08-25 at 05:04:30 | <span class="ansi-bold">INFO</span> | <span class="ansi-bold">Retrieving datasets from Our World In Data https://github.com/owid/covid-19-data/</span>
2023-08-25 at 05:04:32 | <span class="ansi-bold">INFO</span> | <span class="ansi-bold">Retrieving datasets from Our World In Data https://github.com/owid/covid-19-data/</span>
2023-08-25 at 05:04:32 | <span class="ansi-bold">INFO</span> | <span class="ansi-bold">Retrieving datasets from Our World In Data https://github.com/owid/covid-19-data/</span>
</pre></div></div>
</div>
<div class="nboutput nblast docutils container">
@@ -261,7 +261,7 @@ <h3>1-1. With <code class="docutils literal notranslate"><span class="pre">DataE
</div>
<div class="output_area docutils container">
<div class="highlight"><pre>
&lt;covsirphy.engineering.engineer.DataEngineer at 0x7f420af66c90&gt;
&lt;covsirphy.engineering.engineer.DataEngineer at 0x7f36dfd90290&gt;
</pre></div></div>
</div>
<p>We can get all downloaded records as a <code class="docutils literal notranslate"><span class="pre">pandas.DataFrame</span></code> with the <code class="docutils literal notranslate"><span class="pre">DataEngineer().all()</span></code> method.</p>
@@ -281,39 +281,39 @@ <h3>1-1. With <code class="docutils literal notranslate"><span class="pre">DataE
<div class="output_area docutils container">
<div class="highlight"><pre>
&lt;class &#39;pandas.core.frame.DataFrame&#39;&gt;
RangeIndex: 277948 entries, 0 to 277947
RangeIndex: 281012 entries, 0 to 281011
Data columns (total 27 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 ISO3 277948 non-null category
1 Province 277948 non-null category
2 City 277948 non-null category
3 Date 277948 non-null datetime64[ns]
0 ISO3 281012 non-null category
1 Province 281012 non-null category
2 City 281012 non-null category
3 Date 281012 non-null datetime64[ns]
4 Cancel_events 197519 non-null Float64
5 Confirmed 237960 non-null Float64
5 Confirmed 238267 non-null Float64
6 Contact_tracing 197545 non-null Float64
7 Country 271444 non-null string
8 Fatal 220763 non-null Float64
7 Country 274182 non-null string
8 Fatal 221070 non-null Float64
9 Gatherings_restrictions 197519 non-null Float64
10 Information_campaigns 197545 non-null Float64
11 Internal_movement_restrictions 197545 non-null Float64
12 International_movement_restrictions 197552 non-null Float64
13 Population 270282 non-null Float64
14 Product 159057 non-null string
15 Recovered 73432 non-null Float64
13 Population 273020 non-null Float64
14 Product 161552 non-null string
15 Recovered 73536 non-null Float64
16 School_closing 197544 non-null Float64
17 Stay_home_restrictions 197513 non-null Float64
18 Stringency_index 197508 non-null Float64
19 Testing_policy 197545 non-null Float64
20 Tests 90026 non-null Float64
20 Tests 90175 non-null Float64
21 Transport_closing 197525 non-null Float64
22 Vaccinated_full 56130 non-null Float64
23 Vaccinated_once 59594 non-null Float64
24 Vaccinations 62568 non-null Float64
25 Vaccinations_boosters 33894 non-null Float64
22 Vaccinated_full 56777 non-null Float64
23 Vaccinated_once 60200 non-null Float64
24 Vaccinations 63237 non-null Float64
25 Vaccinations_boosters 34477 non-null Float64
26 Workplace_closing 197544 non-null Float64
dtypes: Float64(21), category(3), datetime64[ns](1), string(2)
memory usage: 57.5 MB
memory usage: 58.2 MB
</pre></div></div>
</div>
<p><code class="docutils literal notranslate"><span class="pre">DataEngineer.citations()</span></code> shows citations of the datasets.</p>
@@ -352,8 +352,8 @@ <h3>1-1. With <code class="docutils literal notranslate"><span class="pre">DataE
</div>
<div class="output_area docutils container">
<div class="highlight"><pre>
2023-07-20 at 13:38:04 | <span class="ansi-bold">INFO</span> | <span class="ansi-bold">Retrieving COVID-19 dataset from https://github.com/lisphilar/covid19-sir/data/</span>
2023-07-20 at 13:38:05 | <span class="ansi-bold">INFO</span> | <span class="ansi-bold">Retrieving datasets from COVID-19 Data Hub https://covid19datahub.io/</span>
2023-08-25 at 05:04:36 | <span class="ansi-bold">INFO</span> | <span class="ansi-bold">Retrieving COVID-19 dataset from https://github.com/lisphilar/covid19-sir/data/</span>
2023-08-25 at 05:04:37 | <span class="ansi-bold">INFO</span> | <span class="ansi-bold">Retrieving datasets from COVID-19 Data Hub https://covid19datahub.io/</span>
</pre></div></div>
</div>
<div class="nboutput nblast docutils container">
@@ -544,7 +544,7 @@ <h3>1-1. With <code class="docutils literal notranslate"><span class="pre">DataE
</div>
<div class="output_area docutils container">
<div class="highlight"><pre>
2023-07-20 at 13:38:08 | <span class="ansi-bold">INFO</span> | <span class="ansi-bold">Retrieving datasets from COVID-19 Data Hub https://covid19datahub.io/</span>
2023-08-25 at 05:04:39 | <span class="ansi-bold">INFO</span> | <span class="ansi-bold">Retrieving datasets from COVID-19 Data Hub https://covid19datahub.io/</span>
</pre></div></div>
</div>
<div class="nboutput nblast docutils container">
@@ -748,39 +748,39 @@ <h3>1-2. With <code class="docutils literal notranslate"><span class="pre">DataD
<div class="output_area docutils container">
<div class="highlight"><pre>
&lt;class &#39;pandas.core.frame.DataFrame&#39;&gt;
RangeIndex: 277948 entries, 0 to 277947
RangeIndex: 281012 entries, 0 to 281011
Data columns (total 27 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 ISO3 277948 non-null string
1 Province 277948 non-null string
2 City 277948 non-null string
3 Date 277948 non-null datetime64[ns]
0 ISO3 281012 non-null string
1 Province 281012 non-null string
2 City 281012 non-null string
3 Date 281012 non-null datetime64[ns]
4 Cancel_events 197519 non-null Float64
5 Confirmed 237960 non-null Float64
5 Confirmed 238267 non-null Float64
6 Contact_tracing 197545 non-null Float64
7 Country 271444 non-null string
8 Fatal 220763 non-null Float64
7 Country 274182 non-null string
8 Fatal 221070 non-null Float64
9 Gatherings_restrictions 197519 non-null Float64
10 Information_campaigns 197545 non-null Float64
11 Internal_movement_restrictions 197545 non-null Float64
12 International_movement_restrictions 197552 non-null Float64
13 Population 270282 non-null Float64
14 Product 159057 non-null string
15 Recovered 73432 non-null Float64
13 Population 273020 non-null Float64
14 Product 161552 non-null string
15 Recovered 73536 non-null Float64
16 School_closing 197544 non-null Float64
17 Stay_home_restrictions 197513 non-null Float64
18 Stringency_index 197508 non-null Float64
19 Testing_policy 197545 non-null Float64
20 Tests 90026 non-null Float64
20 Tests 90175 non-null Float64
21 Transport_closing 197525 non-null Float64
22 Vaccinated_full 56130 non-null Float64
23 Vaccinated_once 59594 non-null Float64
24 Vaccinations 62568 non-null Float64
25 Vaccinations_boosters 33894 non-null Float64
22 Vaccinated_full 56777 non-null Float64
23 Vaccinated_once 60200 non-null Float64
24 Vaccinations 63237 non-null Float64
25 Vaccinations_boosters 34477 non-null Float64
26 Workplace_closing 197544 non-null Float64
dtypes: Float64(21), datetime64[ns](1), string(5)
memory usage: 62.8 MB
memory usage: 63.5 MB
</pre></div></div>
</div>
<p>Note that ISO3/Province/City columns have string data instead of categorical data.</p>
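<p>The difference between string and categorical columns can be sketched with plain pandas. This is only an illustration under stated assumptions: the sample records below are hypothetical and far smaller than the real dataset, and the conversion shown is generic pandas, not necessarily what covsirphy does internally.</p>

```python
import pandas as pd

# Hypothetical sample records; the real dataset has about 281,000 rows.
df = pd.DataFrame(
    {
        "ISO3": ["JPN", "JPN", "USA", "USA"] * 1000,
        "Province": ["Tokyo", "Osaka", "Texas", "Ohio"] * 1000,
        "City": ["-"] * 4000,
    }
).astype("string")

location_cols = ["ISO3", "Province", "City"]

# Memory used by the location columns when stored as strings.
string_bytes = int(df[location_cols].memory_usage(deep=True).sum())

# Convert the location columns to categorical data.
categorical = df.astype({col: "category" for col in location_cols})
category_bytes = int(categorical[location_cols].memory_usage(deep=True).sum())

print(categorical.dtypes)
print(f"string: {string_bytes} bytes, category: {category_bytes} bytes")
```

<p>Columns with few distinct values repeated over many rows usually take less memory as categories, which is consistent with the smaller memory usage reported for the categorical output earlier in this diff.</p>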
@@ -841,9 +841,9 @@ <h3>2-1. Retrieve Monkeypox line list<a class="headerlink" href="#2-1.-Retrieve-
</div>
<div class="output_area docutils container">
<div class="highlight"><pre>
Global.health Monkeypox (accessed on 2023-07-20):
Global.health Monkeypox (accessed on 2023-08-25):
Kraemer, Tegally, Pigott, Dasgupta, Sheldon, Wilkinson, Schultheiss, et al. Tracking the 2022 Monkeypox Outbreak with Epidemiological Data in Real-Time. The Lancet Infectious Diseases. https://doi.org/10.1016/S1473-3099(22)00359-0.
European Centre for Disease Prevention and Control/WHO Regional Office for Europe. Monkeypox, Joint Epidemiological overview, 20 7, 2022
European Centre for Disease Prevention and Control/WHO Regional Office for Europe. Monkeypox, Joint Epidemiological overview, 25 8, 2022
</pre></div></div>
</div>
<p>Retrieve CSV file with <code class="docutils literal notranslate"><span class="pre">pandas.read_csv()</span></code>, using <a class="reference external" href="https://arrow.apache.org/docs/python/index.html">Pyarrow</a> as the engine.</p>
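<p>A minimal sketch of reading CSV data with the Pyarrow engine follows. The in-memory sample, its column names, and the helper <code>read_line_list</code> are hypothetical and stand in for the actual line list file; the fallback to the default parser when Pyarrow is not installed is an assumption added for convenience.</p>

```python
import io

import pandas as pd

# Hypothetical sample standing in for the downloaded line list.
csv_bytes = (
    b"ID,Country,Date_confirmation\n"
    b"N1,Japan,2022-07-25\n"
    b"N2,France,2022-07-26\n"
)


def read_line_list(buffer):
    """Read CSV data with the Pyarrow engine, or the default engine if Pyarrow is missing."""
    try:
        return pd.read_csv(buffer, engine="pyarrow")
    except ImportError:
        # Pyarrow is not installed: rewind and use the default parser.
        buffer.seek(0)
        return pd.read_csv(buffer)


line_list = read_line_list(io.BytesIO(csv_bytes))
print(line_list)
```

<p>With a real file, a path or URL can be passed to <code>pandas.read_csv()</code> in place of the in-memory buffer.</p>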
@@ -1753,7 +1753,7 @@ <h3>2-2. Convert line list to the number of cases data<a class="headerlink" href
</div>
<div class="output_area docutils container">
<div class="highlight"><pre>
2023-07-20 at 13:39:28 | <span class="ansi-bold">INFO</span> | <span class="ansi-bold">Retrieving GIS data from Natural Earth https://www.naturalearthdata.com/</span>
2023-08-25 at 05:05:50 | <span class="ansi-bold">INFO</span> | <span class="ansi-bold">Retrieving GIS data from Natural Earth https://www.naturalearthdata.com/</span>
</pre></div></div>
</div>
<div class="nboutput nblast docutils container">
@@ -1809,8 +1809,8 @@ <h3>2-3. Retrieve total population data<a class="headerlink" href="#2-3.-Retriev
</div>
<div class="output_area docutils container">
<div class="highlight"><pre>
2023-07-20 at 13:39:33 | <span class="ansi-bold">INFO</span> | <span class="ansi-bold">Retrieving datasets from World Population Prospects https://population.un.org/wpp/</span>
2023-07-20 at 13:39:52 | <span class="ansi-bold">INFO</span> | <span class="ansi-bold"> [INFO] &#39;Province&#39; layer was removed.</span>
2023-08-25 at 05:05:54 | <span class="ansi-bold">INFO</span> | <span class="ansi-bold">Retrieving datasets from World Population Prospects https://population.un.org/wpp/</span>
2023-08-25 at 05:06:05 | <span class="ansi-bold">INFO</span> | <span class="ansi-bold"> [INFO] &#39;Province&#39; layer was removed.</span>
</pre></div></div>
</div>
<div class="nboutput docutils container">
@@ -2103,13 +2103,13 @@ <h3>2-4. Register Monkeypox data<a class="headerlink" href="#2-4.-Register-Monke
<div class="highlight"><pre>
[&#39;United Nations, Department of Economic and Social Affairs, Population &#39;
&#39;Division (2022). World Population Prospects 2022, Online Edition.&#39;,
&#39;Global.health Monkeypox (accessed on 2023-07-20):\n&#39;
&#39;Global.health Monkeypox (accessed on 2023-08-25):\n&#39;
&#39;Kraemer, Tegally, Pigott, Dasgupta, Sheldon, Wilkinson, Schultheiss, et al. &#39;
&#39;Tracking the 2022 Monkeypox Outbreak with Epidemiological Data in Real-Time. &#39;
&#39;The Lancet Infectious Diseases. &#39;
&#39;https://doi.org/10.1016/S1473-3099(22)00359-0.\n&#39;
&#39;European Centre for Disease Prevention and Control/WHO Regional Office for &#39;
&#39;Europe. Monkeypox, Joint Epidemiological overview, 20 7, 2022&#39;]
&#39;Europe. Monkeypox, Joint Epidemiological overview, 25 8, 2022&#39;]
</pre></div></div>
</div>
<p>Move forward to <a class="reference external" href="https://lisphilar.github.io/covid19-sir/02_data_engineering.html">Tutorial: Data engineering</a>.</p>