Update query lab with latest environment and execution scripts #10
Merged
5 commits
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,11 @@ | ||
#!/bin/bash | ||
|
||
# base.sh | ||
# | ||
# Sets a few useful global variables used by other scripts in the lab bin directory | ||
|
||
# Absolute paths to lab directories | ||
SCRIPT_DIR=$(dirname "$0") | ||
export LAB_BASE_DIR=$(cd "$SCRIPT_DIR/.." && pwd) | ||
export LAB_ENV_DIR="$LAB_BASE_DIR/lab_env" | ||
export LAB_BIN_DIR="$LAB_BASE_DIR/bin" |
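Because base.sh is sourced rather than executed, `$0` names the calling script, so these paths resolve correctly only while every caller lives in the lab's bin directory. A minimal sketch of the same parent-directory resolution as a reusable function follows; `lab_base_dir_for` is a hypothetical name and is not part of this PR.

```shell
# Hypothetical helper (not part of this PR): print the absolute path of the
# directory one level above the directory that contains the given script.
lab_base_dir_for() {
    script_path=$1
    script_dir=$(dirname "$script_path")
    # The subshell leaves the caller's working directory unchanged, and `&&`
    # ensures pwd only runs if cd succeeded.
    (cd "$script_dir/.." && pwd)
}

# Example: for a script at /path/to/lab/bin/run.sh this prints /path/to/lab
# LAB_BASE_DIR=$(lab_base_dir_for "$0")
```

The function returns a non-zero status when the directory does not exist, so a caller can stop early instead of exporting an empty path.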
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,52 @@ | ||
#!/bin/bash | ||
|
||
set -e | ||
|
||
USAGE=" | ||
build.sh your-lab-notebook.ipynb | ||
|
||
A handy script for building the lab directory. Does the following: | ||
1. Remove outputs and cell metadata from the passed notebook | ||
2. Configure the default Jupyter kernel used by the passed notebook | ||
3. Use nbconvert to build index.md from the passed notebook | ||
" | ||
|
||
if [[ "$#" != 1 || "$1" == "help" ]]; then | ||
echo "$USAGE" | ||
exit 0 | ||
fi | ||
|
||
SCRIPT_DIR=$(dirname "$0") | ||
. "$SCRIPT_DIR/base.sh" | ||
LAB_BUILD_DIR=$LAB_BASE_DIR/build | ||
|
||
# Ensure the passed notebook exists | ||
NB="$1" | ||
if [[ ! -f "$NB" ]]; then | ||
echo "Error: $NB does not exist" | ||
exit 1 | ||
fi | ||
|
||
# Create conda build environment | ||
export INSTALL_BUILD_DEPENDENCIES=true | ||
. "$LAB_BIN_DIR/env.sh" | ||
|
||
# Backup passed notebook | ||
echo "Backing up $NB to $LAB_BUILD_DIR/backup.ipynb" | ||
mkdir -p "$LAB_BUILD_DIR" | ||
cp "$NB" "$LAB_BUILD_DIR/backup.ipynb" | ||
|
||
# 1. Remove outputs and cell metadata from the passed notebook | ||
echo "Removing outputs and cell metadata from $NB" | ||
nbstripout "$NB" | ||
|
||
# 2. Configure the default Jupyter kernel used by $NB | ||
echo "Configuring the default Jupyter kernel used by $NB" | ||
KERNELSPEC_PATH=".metadata.kernelspec" | ||
KERNELSPEC='{"name":"optimizelylabs", "language":"python", "display_name":"Python 3 (Optimizely Labs)"}' | ||
UPDATED_TEMP_NB="$LAB_BUILD_DIR/with_kernelspec_updated.ipynb" | ||
jq "$KERNELSPEC_PATH = $KERNELSPEC" "$NB" > "$UPDATED_TEMP_NB" | ||
cp "$UPDATED_TEMP_NB" "$NB" | ||
|
||
# 3. Use nbconvert to build index.md from the passed notebook | ||
jupyter nbconvert --execute --to markdown --output "$LAB_BASE_DIR/index.md" "$NB" |
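Step 2 above builds the jq program by interpolating shell variables into the program text, which works here because both values are fixed strings. As a hedged alternative, the sketch below passes the kernelspec to jq as data with `--argjson` (available in jq 1.5 and later); `set_kernelspec` is a hypothetical helper name, not something this PR defines.

```shell
# Hypothetical helper (not part of this PR): set .metadata.kernelspec in a
# notebook file. Assumes jq 1.5 or newer is installed (for --argjson).
set_kernelspec() {
    nb=$1
    kernelspec=$2
    tmp_nb=$(mktemp) || return 1
    # --argjson parses the kernelspec as JSON and binds it to $ks, so the jq
    # program stays a fixed string instead of being built by interpolation.
    if jq --argjson ks "$kernelspec" '.metadata.kernelspec = $ks' "$nb" > "$tmp_nb"; then
        # Copy over the original (as build.sh does), then drop the temp file.
        cp "$tmp_nb" "$nb" && rm -f "$tmp_nb"
    else
        # Leave the original notebook untouched if jq fails.
        rm -f "$tmp_nb"
        return 1
    fi
}

# Example:
# set_kernelspec notebook.ipynb '{"name":"optimizelylabs","language":"python"}'
```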
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,45 @@ | ||
#!/bin/bash | ||
|
||
# env.sh | ||
# | ||
# A handy script for building and activating the conda environment required to run | ||
# the lab notebook | ||
|
||
SCRIPT_DIR=$(dirname "$0") | ||
. "$SCRIPT_DIR/base.sh" | ||
|
||
CONDA_ENV_NAME=optimizelylabs | ||
BASE_ENV="$LAB_ENV_DIR/base.yml" | ||
DOCKER_BASE_ENV="$LAB_ENV_DIR/docker_base.yml" | ||
LABS_ENV="$LAB_ENV_DIR/labs.yml" | ||
BUILD_ENV="$LAB_ENV_DIR/build.yml" | ||
|
||
# Ensure we can use conda activate | ||
CONDA_BASE=$(conda info --base) | ||
source "$CONDA_BASE/etc/profile.d/conda.sh" | ||
|
||
# Create or update the conda environment | ||
echo "Creating conda environment $CONDA_ENV_NAME" | ||
if [[ -n "${IN_DOCKER_CONTAINER:-}" ]]; then | ||
echo "Running in a docker container; installing docker base dependencies" | ||
conda env update --file "$DOCKER_BASE_ENV" --name "$CONDA_ENV_NAME" | ||
else | ||
echo "Not running in a docker container; installing base dependencies" | ||
conda env update --file "$BASE_ENV" --name "$CONDA_ENV_NAME" | ||
fi | ||
|
||
echo "Installing Optimizely Labs dependencies" | ||
conda env update --file "$LABS_ENV" --name "$CONDA_ENV_NAME" | ||
|
||
if [[ -n "${INSTALL_BUILD_DEPENDENCIES:-}" ]]; then | ||
echo "Installing build dependencies" | ||
conda env update --file "$BUILD_ENV" --name "$CONDA_ENV_NAME" | ||
fi | ||
|
||
# Activate conda environment | ||
echo "Activating conda environment $CONDA_ENV_NAME" | ||
conda activate "$CONDA_ENV_NAME" | ||
|
||
# Install an ipython kernel | ||
echo "Installing ipython kernel $CONDA_ENV_NAME" | ||
python -m ipykernel install --user --name "$CONDA_ENV_NAME" --display-name="Python 3 (Optimizely Labs Environment)" |
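The choice between the two base environment files above depends only on whether `IN_DOCKER_CONTAINER` is set to a non-empty value. A small sketch of that selection in isolation, with `select_base_env` as a hypothetical function name that is not part of this PR:

```shell
# Hypothetical helper (not part of this PR): print the base environment file
# to install, given the path of the lab_env directory.
select_base_env() {
    env_dir=$1
    # ${VAR:-} expands to an empty string when VAR is unset, so the test is
    # safe even under `set -u`.
    if [ -n "${IN_DOCKER_CONTAINER:-}" ]; then
        printf '%s\n' "$env_dir/docker_base.yml"
    else
        printf '%s\n' "$env_dir/base.yml"
    fi
}

# Example:
# conda env update --file "$(select_base_env "$LAB_ENV_DIR")" --name optimizelylabs
```

Note that the `-n` test treats any non-empty value as true, so `IN_DOCKER_CONTAINER=false` would still select the docker file.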
42 changes: 42 additions & 0 deletions
labs/query-enriched-event-data-with-spark/bin/run-docker.sh
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,42 @@ | ||
#!/bin/bash | ||
|
||
# run-docker.sh | ||
# | ||
# A handy script for running the lab notebook locally in a docker container | ||
|
||
set -e | ||
|
||
# Use the script path to build an absolute path for the Lab's base directory | ||
SCRIPT_DIR=$(dirname "$0") | ||
. "$SCRIPT_DIR/base.sh" | ||
|
||
# The Lab directory should be mounted in ~/lab in the container | ||
CONTAINER_HOME=/home/jovyan | ||
CONTAINER_LAB_BASE_DIR="$CONTAINER_HOME/lab" | ||
CONTAINER_LAB_BIN_DIR="$CONTAINER_LAB_BASE_DIR/bin" | ||
|
||
# If OPTIMIZELY_DATA_DIR is defined, mount the specified data directory in | ||
# the container and set the container OPTIMIZELY_DATA_DIR envar accordingly | ||
echo "Starting docker container" | ||
if [[ -n "${OPTIMIZELY_DATA_DIR:-}" ]]; then | ||
CONTAINER_DATA_DIR="$CONTAINER_HOME/optimizely_data" | ||
echo "OPTIMIZELY_DATA_DIR envar set. Mapping to $CONTAINER_DATA_DIR" | ||
|
||
docker run -it --rm \ | ||
-p 8888:8888 \ | ||
-v "$LAB_BASE_DIR:$CONTAINER_LAB_BASE_DIR" \ | ||
-v "$OPTIMIZELY_DATA_DIR:$CONTAINER_DATA_DIR" \ | ||
-e "IN_DOCKER_CONTAINER=true" \ | ||
-e "OPTIMIZELY_DATA_DIR=$CONTAINER_DATA_DIR" \ | ||
jupyter/pyspark-notebook \ | ||
bash "$CONTAINER_LAB_BIN_DIR/run.sh" | ||
else | ||
docker run -it --rm \ | ||
-p 8888:8888 \ | ||
-v "$LAB_BASE_DIR:$CONTAINER_LAB_BASE_DIR" \ | ||
-e "IN_DOCKER_CONTAINER=true" \ | ||
jupyter/pyspark-notebook \ | ||
bash "$CONTAINER_LAB_BIN_DIR/run.sh" | ||
fi | ||
|
||
|
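The two `docker run` branches above differ only in the data-directory mount and its environment variable. One possible way to avoid the duplication is to build the argument list once and append the optional parts, sketched here with the positional parameters. `docker_run_args` is a hypothetical name, and the function only prints the arguments; it does not call docker.

```shell
# Hypothetical helper (not part of this PR): print, one per line, the
# arguments that would follow `docker` for the given host lab directory.
docker_run_args() {
    lab_dir=$1
    container_home=/home/jovyan
    # Arguments common to both cases.
    set -- run -it --rm -p 8888:8888 \
        -v "$lab_dir:$container_home/lab" \
        -e "IN_DOCKER_CONTAINER=true"
    # Add the data mount only when OPTIMIZELY_DATA_DIR is set and non-empty.
    if [ -n "${OPTIMIZELY_DATA_DIR:-}" ]; then
        set -- "$@" \
            -v "$OPTIMIZELY_DATA_DIR:$container_home/optimizely_data" \
            -e "OPTIMIZELY_DATA_DIR=$container_home/optimizely_data"
    fi
    set -- "$@" jupyter/pyspark-notebook bash "$container_home/lab/bin/run.sh"
    printf '%s\n' "$@"
}

# Example: inspect the arguments without starting a container
# docker_run_args "$LAB_BASE_DIR"
```

In a bash-only script, an array would serve the same purpose as the positional parameters used here.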
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,21 @@ | ||
#!/bin/bash | ||
|
||
# run.sh | ||
# | ||
# A handy script for running the Lab notebook locally using a conda environment | ||
|
||
set -e | ||
|
||
SCRIPT_DIR=$(dirname "$0") | ||
. "$SCRIPT_DIR/base.sh" | ||
. "$LAB_BIN_DIR/env.sh" | ||
|
||
if [[ -z "${OPTIMIZELY_DATA_DIR:-}" ]]; then | ||
echo "Note: If you'd like to run this notebook using data stored in a different directory, make sure" | ||
echo " to set the OPTIMIZELY_DATA_DIR environment variable first. For example:" | ||
echo " export OPTIMIZELY_DATA_DIR=~/optimizely_data" | ||
fi | ||
|
||
# Run Jupyter Lab | ||
echo "Running Jupyter Lab in $LAB_BASE_DIR" | ||
jupyter lab "$LAB_BASE_DIR" |
This file was deleted.
File renamed without changes.
File renamed without changes.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,9 @@ | ||
dependencies: | ||
- python=3.7.6 | ||
- openjdk=8 | ||
- ipykernel=5.3.4 | ||
- jupyter=1.0.0 | ||
- jupyterlab=2.2.2 | ||
- pyspark=3.0.0 | ||
channels: | ||
- conda-forge |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
dependencies: | ||
- nbstripout | ||
channels: | ||
- conda-forge |
5 changes: 5 additions & 0 deletions
labs/query-enriched-event-data-with-spark/lab_env/docker_base.yml
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,5 @@ | ||
dependencies: | ||
- python=3.7.6 | ||
- ipykernel=5.3.4 | ||
channels: | ||
- conda-forge |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,7 @@ | ||
dependencies: | ||
- pip=20.2.1 | ||
- plotly=4.9.0 | ||
- pip: | ||
- ssrm-test | ||
channels: | ||
- conda-forge |
It took me like a second to realize that once Jupyter was running, I should focus on the query_enriched_event_data_with_spark.ipynb file. I'd be curious if this should be called out specifically or if anyone who knows what Jupyter is will know where to go.
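If calling the notebook out turns out to be worthwhile, one option would be for run.sh to print a hint before starting Jupyter Lab. A rough sketch, assuming the notebook sits at the top level of the lab directory; `print_notebook_hint` is a hypothetical name and is not part of this PR:

```shell
# Hypothetical helper (not part of this PR): name each notebook found in the
# lab directory so the user knows which file to open in Jupyter Lab.
print_notebook_hint() {
    lab_dir=$1
    for nb in "$lab_dir"/*.ipynb; do
        # When nothing matches, the glob is left unexpanded; skip it.
        [ -e "$nb" ] || continue
        echo "Open $(basename "$nb") in Jupyter Lab to start the lab"
    done
}

# Example:
# print_notebook_hint "$LAB_BASE_DIR"
```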