|
1 |
| -GeoDocker GeoTrellis Jupyter Notebook |
2 |
| -===================================== |
| 1 | +# GeoDocker GeoTrellis Jupyter Notebook # |
3 | 2 |
|
4 | 3 | A Docker container to provide a Jupyter notebook instance with GeoTrellis functionality to [GeoDocker](https://github.com/geodocker/geodocker).
|
5 | 4 |
|
6 |
| -Configuration and Usage |
7 |
| ------------------------ |
| 5 | +## Configuring and Starting ## |
8 | 6 |
|
9 |
| -Start this container using the `--net=host` option and point a browser on the host machine to http://localhost:7777 to enter the interface. Select `Apache Toree - Scala` from the New dropdown menu and start typing Scala commands. Note that `sc` is pre-initialized to a live `SparkContext` instance. |
| 7 | +### Toy Settings ### |
| 8 | + |
| 9 | +Starting this image in toy settings (e.g. with a "cluster" confined entirely to one physical computer) is very easy. |
| 10 | + |
| 11 | +#### Self-Contained #### |
| 12 | + |
| 13 | +To use the image with a self-contained local master, type |
| 14 | + |
| 15 | +```bash |
| 16 | +docker run -it --rm -p 8000:8000 quay.io/geodocker/geotrellis-jupyter:9b577f1 |
| 17 | +``` |
| 18 | + |
| 19 | +After a few moments, the server should be available at [`localhost:8000`](http://localhost:8000). |
| 20 | + |
| 21 | +#### With GeoDocker #### |
| 22 | + |
| 23 | +First ensure that the `docker-compose` command is installed and working. |
| 24 | +With that command present, simply navigate into a directory containing the appropriate [`docker-compose.yml`](docker-compose.yml) file and bring the "cluster" up |
| 25 | + |
| 26 | +```bash |
| 27 | +cd ~/local/src/geodocker-geotrellis-jupyter |
| 28 | +docker-compose up |
| 29 | +``` |
| 30 | + |
| 31 | +As before, the server should be available at [`localhost:8000`](http://localhost:8000). |
| 32 | + |
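Because the server takes "a few moments" to come up in either toy setting, a small polling helper can be handy in scripts. The following is a minimal sketch, not part of this repository: the function name `wait_for_http` and its defaults are made up for illustration, and it assumes `curl` is installed on the host.

```bash
#!/usr/bin/env bash
# Hypothetical helper: poll a URL until it answers or the attempts run out.
# $1 = URL, $2 = maximum attempts (default 30), $3 = seconds between attempts (default 2)
wait_for_http() {
  local url="$1" attempts="${2:-30}" delay="${3:-2}" i
  for ((i = 1; i <= attempts; i++)); do
    # Any HTTP response (even a redirect to a login page) counts as "up".
    if curl --silent --output /dev/null --max-time 5 "$url"; then
      return 0
    fi
    sleep "$delay"
  done
  return 1
}
```

For example, `wait_for_http http://localhost:8000 && echo "server is up"` could be run in a second terminal after starting the container.
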
| 33 | +### Serious Settings ### |
| 34 | + |
| 35 | +The two most immediate issues with using this image in a more serious setting (with a real cluster) are |
| 36 | + - properly configuring the Spark master, and |
| 37 | + - enabling SSL. |
| 38 | + |
| 39 | +To use the image with a YARN master, the appropriate configuration files must be copied to the image |
| 40 | +(the precise details of how to do that are left as an exercise to the reader). |
| 41 | +Once everything is set up so that it is possible to run jobs with a YARN master from within the container, Toree must be reinstalled with the appropriate settings. |
| 42 | +The command to do that might look something like this |
| 43 | + |
| 44 | +```bash |
| 45 | +scl enable python33 'jupyter toree install --spark_opts="--master yarn --jars file:///tmp/geotrellis-uberjar-assembly-1.0.0-RC1.jar"' |
| 46 | +``` |
| 47 | + |
| 48 | +To use the image with Spark in stand-alone mode, Toree must be reinstalled with the appropriate settings. |
| 49 | +The command to do that might look something like this |
| 50 | + |
| 51 | +```bash |
| 52 | +scl enable python33 'jupyter toree install --spark_opts="--master spark://10.0.1.3:7077 --jars file:///tmp/geotrellis-uberjar-assembly-1.0.0-RC1.jar"' |
| 53 | +``` |
| 54 | + |
| 55 | +In stand-alone mode, the version of Spark in the image (currently 2.0.0) must match the version installed on the cluster. |
| 56 | +If that is not the case, then it will be necessary to create a new image |
| 57 | +(either derived from this one [in the Docker sense] or built from a fork of this source distribution) |
| 58 | +with the appropriate version of Spark installed. |
| 59 | + |
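The version requirement above can be checked before reinstalling Toree. The sketch below is only an illustration: the function names are invented here, and it assumes that `spark-submit` is on the `PATH` inside the container and prints its usual banner containing `version X.Y.Z`.

```bash
#!/usr/bin/env bash
# Hypothetical helpers for comparing Spark versions; not part of this repository.

# Read a Spark version banner on stdin and print the first "version X.Y.Z" found.
extract_spark_version() {
  sed -n 's/.*version \([0-9][0-9.]*[0-9]\).*/\1/p' | head -n 1
}

# Succeed only when both version strings are non-empty and identical.
spark_versions_match() {
  [ -n "$1" ] && [ "$1" = "$2" ]
}
```

Inside the container, the image's version could be captured with `image_version=$(spark-submit --version 2>&1 | extract_spark_version)` and then compared against the cluster's version, for instance `spark_versions_match "$image_version" "2.0.0" || echo "Spark version mismatch"`.
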
| 60 | +To run `jupyterhub` with SSL enabled, the [JupyterHub documentation](https://github.com/jupyterhub/jupyterhub) suggests something like this |
| 61 | + |
| 62 | +```bash |
| 63 | +jupyterhub --ip 10.0.1.2 --port 443 --ssl-key my_ssl.key --ssl-cert my_ssl.cert |
| 64 | +``` |
| 65 | + |
| 66 | +Please see the JupyterHub documentation for a more detailed discussion; |
| 67 | +the steps and suggestions given here are probably necessary but almost certainly not sufficient to produce a working setup. |
| 68 | + |
| 69 | +The [`geodocker.sh`](scripts/geodocker.sh) script is an example of a script that reinstalls Toree and then launches JupyterHub. |
| 70 | +For serious usage, it will probably be necessary to create another Docker image derived from this one. |
| 71 | +That image should contain site-specific configuration files and a script similar to `scripts/geodocker.sh` with the appropriate configuration and launch commands encapsulated within. |
| 72 | + |
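A site-specific launch script of the kind described above might be organized roughly as follows. This is a sketch under stated assumptions, not the contents of `scripts/geodocker.sh`: the `SPARK_MASTER` and `GEOTRELLIS_JAR` variables and both function names are hypothetical, and the defaults simply repeat the example values used earlier in this README.

```bash
#!/usr/bin/env bash
# Sketch of a site-specific launch script; names and defaults are illustrative only.

# Hypothetical settings, overridable from the environment.
SPARK_MASTER="${SPARK_MASTER:-yarn}"
GEOTRELLIS_JAR="${GEOTRELLIS_JAR:-file:///tmp/geotrellis-uberjar-assembly-1.0.0-RC1.jar}"

# Build the value passed to --spark_opts of `jupyter toree install`.
# $1 = Spark master, $2 = jar URL
toree_spark_opts() {
  printf -- '--master %s --jars %s' "$1" "$2"
}

# Reinstall Toree for the chosen master, then replace this shell with JupyterHub.
# Any arguments (e.g. --ip, --port, --ssl-key, --ssl-cert) are passed to jupyterhub.
launch() {
  local opts
  opts="$(toree_spark_opts "$SPARK_MASTER" "$GEOTRELLIS_JAR")"
  scl enable python33 "jupyter toree install --spark_opts=\"${opts}\"" || return 1
  exec jupyterhub "$@"
}
```

The script would end with a call such as `launch --ip 10.0.1.2 --port 443 --ssl-key my_ssl.key --ssl-cert my_ssl.cert`. Keeping the master and jar in variables means the same derived image can be pointed at YARN or at a stand-alone master without being rebuilt.
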
| 73 | +## Usage ## |
| 74 | + |
| 75 | +The default username and password are both `jack`. |
| 76 | +The default account is suitable for local use, |
| 77 | +but if the image is going to be used in a more serious setting, be sure to disable that account and enable some other login mechanism. |
| 78 | + |
| 79 | +To make use of GeoTrellis, create a new "Apache Toree - Scala" notebook (or use an existing one). |
| 80 | + |
| 81 | + |