Edit notebook 1
- Added solutions to all exercises using `%load`
- Some minor edits to the code

We probably want to add prose around the code bits, but I left that for
later.
jcrist authored and gforsyth committed Aug 7, 2023
1 parent 5ba9432 commit 5a44265
Showing 6 changed files with 146 additions and 82 deletions.
1 change: 1 addition & 0 deletions .gitignore
@@ -1 +1,2 @@
*.ipynb_checkpoints
*.ddb
174 changes: 92 additions & 82 deletions 01 - Getting Started.ipynb
@@ -106,16 +106,6 @@
"penguins = con.table(\"penguins\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "8d7f4471-2387-4247-8ef1-f3ac3a5011b3",
"metadata": {},
"outputs": [],
"source": [
"penguins = con.tables.penguins"
]
},
{
"cell_type": "code",
"execution_count": null,
@@ -290,114 +280,76 @@
"id": "ed38a0aa-c7f2-4da1-8caf-c60a70e1686b",
"metadata": {},
"source": [
"### Exercise\n",
"### Exercise 1\n",
"\n",
"Your PI is a cranky American biologist who thinks the metric system is for suckers (oh no).\n",
"\n",
"He demands that we convert all of the existing measurements (`mm` and `g`) to inches and lbs, respectively (I am so sorry).\n",
"He demands that we convert all of the existing measurements (`mm` and `g`) to inches and lbs, respectively (I am so sorry). The PI is extra cranky this morning, so we had better make sure that ONLY the silly units are visible in the output.\n",
"\n",
"Some ~helpful~ cursed conversion factors:\n",
"\n",
"| | |\n",
"| -- | -- |\n",
"| mm -> in | divide by 25.4 |\n",
"| g -> lb | divide by 453.6 |"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "0923065d-7062-4fb0-b600-817bed375234",
"metadata": {},
"outputs": [],
"source": [
"# TODO: hide this\n",
"penguins.mutate(\n",
" bill_length_in=penguins.bill_length_mm / 25.4,\n",
" bill_depth_in=penguins.bill_depth_mm / 25.4,\n",
" flipper_length_in=penguins.flipper_length_mm / 25.4,\n",
" body_weight_lb=penguins.body_mass_g / 453.6,\n",
")"
]
},
{
"cell_type": "markdown",
"id": "e2096c09-a6ab-4c4a-be9f-09be2ff8067d",
"metadata": {},
"source": [
"| g -> lb | divide by 453.6 |\n",
"\n",
"Two ways you might accomplish this:\n",
"- Chaining `.mutate` to create new columns and `.drop` to drop the old metric columns\n",
"- Using a single `.select` to create the new columns as well as select the relevant older columns\n",
"\n",
"And the PI is extra cranky this morning, so we had better make sure that ONLY the silly units are visible in the output."
"Try both ways below! How do they compare?"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "8e4a4ae5-a546-4a85-a8b3-45775413b261",
"id": "fb031827",
"metadata": {},
"outputs": [],
"source": [
"# TODO: hide this\n",
"penguins.mutate(\n",
" bill_length_in=penguins.bill_length_mm / 25.4,\n",
" bill_depth_in=penguins.bill_depth_mm / 25.4,\n",
" flipper_length_in=penguins.flipper_length_mm / 25.4,\n",
" body_weight_lb=penguins.body_mass_g / 453.6,\n",
").drop(\n",
" \"bill_length_mm\",\n",
" \"bill_depth_mm\",\n",
" \"flipper_length_mm\",\n",
" \"body_mass_g\",\n",
")"
"# Convert the metric units to imperial, and drop the metric columns.\n",
"# Try this using a `.mutate` and `.drop` call.\n",
"penguins_imperial = ..."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e14cd32b-6d0b-423a-807e-1c8b73e8a50a",
"id": "a0f6d6e3",
"metadata": {},
"outputs": [],
"source": [
"# TODO: hide this\n",
"penguins.select(\n",
" \"species\",\n",
" \"island\",\n",
" \"sex\",\n",
" \"year\",\n",
" bill_length_in=penguins.bill_length_mm / 25.4,\n",
" bill_depth_in=penguins.bill_depth_mm / 25.4,\n",
" flipper_length_in=penguins.flipper_length_mm / 25.4,\n",
" body_weight_lb=penguins.body_mass_g / 453.6,\n",
")"
"# Convert the metric units to imperial, and drop the metric columns.\n",
"# Try this using a single `.select` call.\n",
"penguins_imperial = ..."
]
},
{
"cell_type": "markdown",
"id": "e0c9b2e7-e33f-422d-8f87-b055f659d7a2",
"id": "66b5cf2c",
"metadata": {},
"source": [
"We won't save these conversions over our existing data because\n",
"\n",
"* a) the metric system is objectively better and\n",
"* b) I didn't do well in grad school and I'm not about to start now"
"#### Solutions"
]
},
{
"cell_type": "markdown",
"id": "30f01944-6d59-4e31-895d-93c18e921c37",
"cell_type": "code",
"execution_count": null,
"id": "90345bf7",
"metadata": {},
"outputs": [],
"source": [
"**Note**: If you did save the penguins expression with the imperial units, you can uncomment the cell below and run it to reload:"
"%load solutions/nb01-ex01-mutate-drop.py"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "77109bb4-da53-438b-993b-fc6fb2662482",
"id": "2c3f1135",
"metadata": {},
"outputs": [],
"source": [
"# con = ibis.duckdb.connect(\"palmer_penguins.ddb\")\n",
"# penguins = con.tables.penguins"
"%load solutions/nb01-ex01-select.py"
]
},
{
@@ -431,8 +383,8 @@
"metadata": {},
"outputs": [],
"source": [
"penguins.order_by(penguins.flipper_length_cm).select(\n",
" \"species\", \"island\", \"flipper_length_cm\"\n",
"penguins.order_by(penguins.flipper_length_mm).select(\n",
" \"species\", \"island\", \"flipper_length_mm\"\n",
")"
]
},
@@ -451,8 +403,8 @@
"metadata": {},
"outputs": [],
"source": [
"penguins.order_by(penguins.flipper_length_cm.desc()).select(\n",
" \"species\", \"island\", \"flipper_length_cm\"\n",
"penguins.order_by(penguins.flipper_length_mm.desc()).select(\n",
" \"species\", \"island\", \"flipper_length_mm\"\n",
")"
]
},
@@ -471,8 +423,8 @@
"metadata": {},
"outputs": [],
"source": [
"penguins.order_by(ibis.desc(\"flipper_length_cm\")).select(\n",
" \"species\", \"island\", \"flipper_length_cm\"\n",
"penguins.order_by(ibis.desc(\"flipper_length_mm\")).select(\n",
" \"species\", \"island\", \"flipper_length_mm\"\n",
")"
]
},
@@ -645,15 +597,73 @@
"id": "899237e9-6376-4bff-908c-5a1d163d3c3c",
"metadata": {},
"source": [
"### What was the largest female penguin (by body mass) on each island in the year 2008?"
"### Exercise 2\n",
"\n",
"What was the largest female penguin (by body mass) on each island in the year 2008?"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "eda8ee1b",
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "markdown",
"id": "97a83eb9",
"metadata": {},
"source": [
"#### Solution\n",
"\n",
"Note that there are several ways these queries could be written - it's fine if your solution doesn't look like ours, what's important is that the results are the same."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "ef097744",
"metadata": {},
"outputs": [],
"source": [
"%load solutions/nb01-ex02.py"
]
},
{
"cell_type": "markdown",
"id": "1c4625b4-25dd-45b5-b151-4e42a77638d1",
"metadata": {},
"source": [
"### What was the largest male penguin (by body mass) on each island for each year of data collection?"
"### Exercise 3\n",
"\n",
"What was the largest male penguin (by body mass) on each island for each year of data collection?"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "9ffe89c3",
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "markdown",
"id": "9aa56930",
"metadata": {},
"source": [
"#### Solution"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "23451783",
"metadata": {},
"outputs": [],
"source": [
"%load solutions/nb01-ex03.py"
]
},
{
@@ -825,7 +835,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.4"
"version": "3.10.10"
}
},
"nbformat": 4,
18 changes: 18 additions & 0 deletions solutions/nb01-ex01-mutate-drop.py
@@ -0,0 +1,18 @@
# Convert the metric units to imperial, and drop the metric columns.
penguins_imperial = (
    penguins
    .mutate(
        bill_length_in=penguins.bill_length_mm / 25.4,
        bill_depth_in=penguins.bill_depth_mm / 25.4,
        flipper_length_in=penguins.flipper_length_mm / 25.4,
        body_weight_lb=penguins.body_mass_g / 453.6,
    )
    .drop(
        "bill_length_mm",
        "bill_depth_mm",
        "flipper_length_mm",
        "body_mass_g",
    )
)

penguins_imperial
13 changes: 13 additions & 0 deletions solutions/nb01-ex01-select.py
@@ -0,0 +1,13 @@
# Convert the metric units to imperial, and drop the metric columns.
penguins_imperial = penguins.select(
    "species",
    "island",
    "sex",
    "year",
    bill_length_in=penguins.bill_length_mm / 25.4,
    bill_depth_in=penguins.bill_depth_mm / 25.4,
    flipper_length_in=penguins.flipper_length_mm / 25.4,
    body_weight_lb=penguins.body_mass_g / 453.6,
)

penguins_imperial
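
One way to convince yourself that the two Exercise 1 approaches produce the same result is to mirror them on a plain Python dict. The sketch below is not ibis code: `imperial_columns`, `mutate_then_drop`, `select_only`, and the sample row are hypothetical stand-ins invented for illustration. Only the column names and the conversion factors (25.4 and 453.6) come from the solution files above.

```python
MM_PER_INCH = 25.4
GRAMS_PER_POUND = 453.6  # rounded factor used in the notebook

METRIC_COLUMNS = (
    "bill_length_mm",
    "bill_depth_mm",
    "flipper_length_mm",
    "body_mass_g",
)

# Hypothetical row; the measurements are made up for illustration.
sample_row = {
    "species": "Adelie",
    "island": "Torgersen",
    "bill_length_mm": 38.1,
    "bill_depth_mm": 17.78,
    "flipper_length_mm": 190.5,
    "body_mass_g": 3628.8,
    "sex": "male",
    "year": 2007,
}


def imperial_columns(row):
    """Compute the four converted columns from the metric ones."""
    return {
        "bill_length_in": row["bill_length_mm"] / MM_PER_INCH,
        "bill_depth_in": row["bill_depth_mm"] / MM_PER_INCH,
        "flipper_length_in": row["flipper_length_mm"] / MM_PER_INCH,
        "body_weight_lb": row["body_mass_g"] / GRAMS_PER_POUND,
    }


def mutate_then_drop(row):
    """Mirror of `.mutate(...).drop(...)`: add columns, then remove the metric ones."""
    widened = {**row, **imperial_columns(row)}
    return {
        name: value
        for name, value in widened.items()
        if name not in METRIC_COLUMNS
    }


def select_only(row):
    """Mirror of a single `.select(...)`: name every column to keep."""
    kept = {name: row[name] for name in ("species", "island", "sex", "year")}
    return {**kept, **imperial_columns(row)}


via_mutate = mutate_then_drop(sample_row)
via_select = select_only(sample_row)
print(via_mutate == via_select)
print(list(via_select))
```

The difference between the two is in what you have to spell out: the mutate/drop version lists the columns to remove, so any other column passes through untouched, while the select version lists the columns to keep, so anything not named is left out.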
14 changes: 14 additions & 0 deletions solutions/nb01-ex02.py
@@ -0,0 +1,14 @@
# What was the largest female penguin (by body mass) on each island in the year 2008
sol2 = (
    penguins
    .filter(
        [
            penguins.sex == "female",
            penguins.year == 2008,
        ]
    )
    .group_by("island")
    .body_mass_g.max()
)

sol2
8 changes: 8 additions & 0 deletions solutions/nb01-ex03.py
@@ -0,0 +1,8 @@
sol3 = (
    penguins
    .filter(penguins.sex == "male")
    .group_by(["island", "year"])
    .body_mass_g.max()
)

sol3
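
Exercises 2 and 3 share one shape: filter rows, group by one or more keys, take the maximum body mass per group. A minimal sketch of that logic in plain Python follows, as a way to cross-check results by hand. It is not the ibis implementation: `max_body_mass_by` and the rows are hypothetical and made up for illustration. Missing masses are skipped because SQL-style `MAX` ignores NULLs.

```python
# Hypothetical rows; values are made up for illustration.
rows = [
    {"island": "Biscoe", "sex": "female", "year": 2008, "body_mass_g": 4500},
    {"island": "Biscoe", "sex": "female", "year": 2008, "body_mass_g": 4200},
    {"island": "Biscoe", "sex": "male", "year": 2008, "body_mass_g": 5900},
    {"island": "Dream", "sex": "female", "year": 2008, "body_mass_g": 3700},
    {"island": "Dream", "sex": "female", "year": 2007, "body_mass_g": 3900},
    {"island": "Dream", "sex": "male", "year": 2009, "body_mass_g": 4300},
    {"island": "Torgersen", "sex": "female", "year": 2008, "body_mass_g": None},
    {"island": "Torgersen", "sex": "male", "year": 2007, "body_mass_g": 4100},
]


def max_body_mass_by(rows, keys, **conditions):
    """Filter rows on equality conditions, then take max body mass per group.

    Rows with a missing mass are skipped. Unlike SQL, a group whose masses
    are all missing is simply absent from the result in this sketch.
    """
    best = {}
    for row in rows:
        if any(row[column] != wanted for column, wanted in conditions.items()):
            continue
        mass = row["body_mass_g"]
        if mass is None:
            continue
        group = tuple(row[key] for key in keys)
        if group not in best or mass > best[group]:
            best[group] = mass
    return best


# Exercise 2 shape: females in 2008, grouped by island.
ex2 = max_body_mass_by(rows, ["island"], sex="female", year=2008)

# Exercise 3 shape: males, grouped by island and year.
ex3 = max_body_mass_by(rows, ["island", "year"], sex="male")

print(ex2)
print(ex3)
```

Moving `year` from a filter condition (Exercise 2) to a grouping key (Exercise 3) is the only structural change between the two queries.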
