Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove .ipython dir and introduce kedro Jupyter kernel #1355

Merged
merged 32 commits into from
Mar 30, 2022
Merged
Show file tree
Hide file tree
Changes from 9 commits
Commits
Show all changes
32 commits
Select commit Hold shift + click to select a range
21c6c2b
Tidy requirements.txt
antonymilne Mar 17, 2022
caaead1
Tidy requirements.txt
antonymilne Mar 17, 2022
6523d4d
Modify ipython command
antonymilne Mar 17, 2022
11d7450
Remove ipython_loader
antonymilne Mar 17, 2022
f8ecf3f
Simplify jupyter notebook commands
antonymilne Mar 18, 2022
778c85a
Further simplify and remove ipython_message
antonymilne Mar 18, 2022
dc5bb19
Add kernel installation
antonymilne Mar 18, 2022
f39bf2e
Switch from MappingKernelManager to MultiKernelManager
antonymilne Mar 18, 2022
a782657
Add message for when kernel already exists
antonymilne Mar 18, 2022
4de7d1c
Add comment
antonymilne Mar 18, 2022
32911be
Lint
antonymilne Mar 18, 2022
a82bc70
Add to MANIFEST and update API docs
antonymilne Mar 24, 2022
e88bd82
Move try/except inside _create_kernel
antonymilne Mar 24, 2022
7c05f9b
Correct imports checks
antonymilne Mar 24, 2022
fa5f5f3
ipython -> IPython
antonymilne Mar 24, 2022
87a3849
Tidy requirements
antonymilne Mar 24, 2022
d0e82cc
Fix jupyter notebook tests
antonymilne Mar 24, 2022
35868bb
Fix jupyter lab tests and add _create_kernel tests
antonymilne Mar 24, 2022
292b0eb
Fix stuff
antonymilne Mar 24, 2022
2846f02
Update logos
antonymilne Mar 24, 2022
fbec113
Fix more stuff
antonymilne Mar 24, 2022
6d08da8
Start changes to docs
antonymilne Mar 24, 2022
f9b1380
More changes to docs
antonymilne Mar 25, 2022
9829046
Finish docs changes
antonymilne Mar 25, 2022
d4357fc
Fix stuff
antonymilne Mar 25, 2022
b49142f
Apply review suggestions
Mar 30, 2022
6803d70
Bump ipython
Mar 30, 2022
25fcd62
Add link to Yetu's talk
Mar 30, 2022
d6ee5bf
Merge branch 'develop' into remove-ipython-dir
Mar 30, 2022
443e99d
Undo ipython bump
Mar 30, 2022
a733323
Fix lint
Mar 30, 2022
dc8bec9
Fix test
Mar 30, 2022
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view

This file was deleted.

Binary file added kedro/extras/extensions/logo-32x32.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added kedro/extras/extensions/logo-64x64.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
231 changes: 91 additions & 140 deletions kedro/framework/cli/jupyter.py
Original file line number Diff line number Diff line change
Expand Up @@ -3,79 +3,36 @@
"""
import json
import os
import re
import shutil
import sys
from collections import Counter
from glob import iglob
from pathlib import Path
from typing import Any, Dict, Iterable, List
from typing import Any, Dict
from warnings import warn

from ipykernel.kernelspec import install
antonymilne marked this conversation as resolved.
Show resolved Hide resolved
from jupyter_client.kernelspec import find_kernel_specs
import click
from click import secho
from jupyter_client.kernelspec import NATIVE_KERNEL_NAME, KernelSpecManager
from traitlets import Unicode

from kedro.framework.cli.utils import (
KedroCliError,
_check_module_importable,
command_with_verbosity,
env_option,
forward_command,
ipython_message,
load_entry_points,
python_call,
)
from kedro.framework.project import validate_settings
from kedro.framework.startup import ProjectMetadata

JUPYTER_IP_HELP = "IP address of the Jupyter server."
JUPYTER_ALL_KERNELS_HELP = "Display all available Python kernels."
JUPYTER_IDLE_TIMEOUT_HELP = """When a notebook is closed, Jupyter server will
terminate its kernel after so many seconds of inactivity. This does not affect
any open notebooks."""

CONVERT_ALL_HELP = """Extract the nodes from all notebooks in the Kedro project directory,
including sub-folders."""

OVERWRITE_HELP = """If Python file already exists for the equivalent notebook,
overwrite its contents."""


def collect_line_magic():
"""Interface function for collecting line magic functions from plugin entry points."""
return load_entry_points("line_magic")
antonymilne marked this conversation as resolved.
Show resolved Hide resolved


class SingleKernelSpecManager(KernelSpecManager):
"""A custom KernelSpec manager to be used by Kedro projects.
It limits the kernels to the default one only,
to make it less confusing for users, and gives it a sensible name.
"""

default_kernel_name = Unicode(
"Kedro", config=True, help="Alternative name for the default kernel"
)
whitelist = [NATIVE_KERNEL_NAME]

def get_kernel_spec(self, kernel_name):
"""
This function will only be called by Jupyter to get a KernelSpec
for the default kernel.
We replace the name by something sensible here.
"""
kernelspec = super().get_kernel_spec(kernel_name)

if kernel_name == NATIVE_KERNEL_NAME:
kernelspec.display_name = self.default_kernel_name

return kernelspec


def _update_ipython_dir(project_path: Path) -> None:
os.environ["IPYTHONDIR"] = str(project_path / ".ipython")


# pylint: disable=missing-function-docstring
@click.group(name="Kedro")
def jupyter_cli(): # pragma: no cover
Expand All @@ -90,87 +47,123 @@ def jupyter():


@forward_command(jupyter, "notebook", forward_help=True)
@click.option(
"--ip",
"ip_address",
type=str,
default="127.0.0.1",
help="IP address of the Jupyter server.",
)
@click.option(
"--all-kernels", is_flag=True, default=False, help=JUPYTER_ALL_KERNELS_HELP
)
@click.option("--idle-timeout", type=int, default=30, help=JUPYTER_IDLE_TIMEOUT_HELP)
@env_option
@click.pass_obj # this will pass the metadata as first argument
def jupyter_notebook(
metadata: ProjectMetadata,
ip_address,
all_kernels,
env,
idle_timeout,
args,
**kwargs,
): # pylint: disable=unused-argument,too-many-arguments
): # pylint: disable=unused-argument
"""Open Jupyter Notebook with project specific variables loaded."""
_check_module_importable("jupyter_core")

validate_settings()

if "-h" not in args and "--help" not in args:
ipython_message(all_kernels)

_update_ipython_dir(metadata.project_path)
arguments = _build_jupyter_command(
"notebook",
ip_address=ip_address,
all_kernels=all_kernels,
args=args,
idle_timeout=idle_timeout,
project_name=metadata.project_name,
)
kernel_name = f"kedro_{metadata.package_name}"
try:
_create_kernel(kernel_name, f"Kedro ({metadata.package_name})")
except Exception as exc:
antonymilne marked this conversation as resolved.
Show resolved Hide resolved
raise KedroCliError(
f"Cannot setup kedro kernel for Jupyter.\nError: {exc}"
) from exc

if env:
os.environ["KEDRO_ENV"] = env

python_call_kwargs = _build_jupyter_env(env)
python_call("jupyter", arguments, **python_call_kwargs)
python_call(
"jupyter",
("notebook", f"--MappingKernelManager.default_kernel_name={kernel_name}")
+ args,
)


@forward_command(jupyter, "lab", forward_help=True)
@click.option("--ip", "ip_address", type=str, default="127.0.0.1", help=JUPYTER_IP_HELP)
@click.option(
"--all-kernels", is_flag=True, default=False, help=JUPYTER_ALL_KERNELS_HELP
)
@click.option("--idle-timeout", type=int, default=30, help=JUPYTER_IDLE_TIMEOUT_HELP)
@env_option
@click.pass_obj # this will pass the metadata as first argument
def jupyter_lab(
metadata: ProjectMetadata,
ip_address,
all_kernels,
env,
idle_timeout,
args,
**kwargs,
): # pylint: disable=unused-argument,too-many-arguments
): # pylint: disable=unused-argument
"""Open Jupyter Lab with project specific variables loaded."""
_check_module_importable("jupyter_core")

validate_settings()

if "-h" not in args and "--help" not in args:
ipython_message(all_kernels)

_update_ipython_dir(metadata.project_path)
arguments = _build_jupyter_command(
"lab",
ip_address=ip_address,
all_kernels=all_kernels,
args=args,
idle_timeout=idle_timeout,
project_name=metadata.project_name,
kernel_name = f"kedro_{metadata.package_name}"
try:
_create_kernel(kernel_name, f"Kedro ({metadata.package_name})")
except Exception as exc:
raise KedroCliError(
f"Cannot setup kedro kernel for Jupyter.\nError: {exc}"
) from exc

if env:
os.environ["KEDRO_ENV"] = env

python_call(
"jupyter",
("lab", f"--MultiKernelManager.default_kernel_name={kernel_name}") + args,
)


def _create_kernel(kernel_name: str, display_name: str):
"""Creates an ipython kernel for the kedro project, if one does not already exist.
antonymilne marked this conversation as resolved.
Show resolved Hide resolved

Installs the default ipython kernel (which points towards `sys.executable`)
antonymilne marked this conversation as resolved.
Show resolved Hide resolved
and customises it to make the launch command load the kedro extension.

On linux this creates a directory ~/.local/share/jupyter/kernels/{kernel_name}
antonymilne marked this conversation as resolved.
Show resolved Hide resolved
containing kernel.json, logo-32x32.png and logo-64x64.png. An example kernel.json
looks as follows:

{
"argv": [
"/Users/antony_milne/miniconda3/envs/spaceflights/bin/python",
"-m",
"ipykernel_launcher",
"-f",
"{connection_file}",
"--ext",
"kedro.extras.extensions.ipython"
],
"display_name": "Kedro (spaceflights)",
"language": "python",
"metadata": {
"debugger": false
}
}

Args:
kernel_name: Name of the kernel to create.
display_name: Kernel name as it is displayed in the UI.
"""
antonymilne marked this conversation as resolved.
Show resolved Hide resolved
if kernel_name in find_kernel_specs():
secho(
f"Jupyter kernel {kernel_name} already exists and will be used.",
fg="green",
)
return

# Install with user=True rather than system-wide to minimise footprint and
# ensure that we have permissions to write there.
kernel_path = install(
user=True,
kernel_name=kernel_name,
display_name=display_name,
)

python_call_kwargs = _build_jupyter_env(env)
python_call("jupyter", arguments, **python_call_kwargs)
kernel_json = Path(kernel_path) / "kernel.json"
kernel_spec = json.loads(kernel_json.read_text())
kernel_spec["argv"].extend(["--ext", "kedro.extras.extensions.ipython"])
# indent=1 is to match the default ipykernel style (see ipykernel.write_kernel_spec).
kernel_json.write_text(json.dumps(kernel_spec, indent=1))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we use json.dump here instead?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not while using pathlib (which is what we generally use now), because json.dump needs a file pointer rather than a path object.


kedro_extensions_dir = Path(__file__).parents[2] / "extras" / "extensions"
shutil.copy(kedro_extensions_dir / "logo-32x32.png", kernel_path)
shutil.copy(kedro_extensions_dir / "logo-64x64.png", kernel_path)


@command_with_verbosity(jupyter, "convert")
Expand Down Expand Up @@ -200,8 +193,6 @@ def convert_notebook(
source_path = metadata.source_dir
package_name = metadata.package_name

_update_ipython_dir(project_path)

if not filepath and not all_flag:
secho(
"Please specify a notebook filepath "
Expand Down Expand Up @@ -247,46 +238,6 @@ def convert_notebook(
secho("Done!", color="green") # type: ignore


def _build_jupyter_command( # pylint: disable=too-many-arguments
antonymilne marked this conversation as resolved.
Show resolved Hide resolved
base: str,
ip_address: str,
all_kernels: bool,
args: Iterable[str],
idle_timeout: int,
project_name: str = "Kedro",
) -> List[str]:
cmd = [
base,
"--ip",
ip_address,
f"--MappingKernelManager.cull_idle_timeout={idle_timeout}",
f"--MappingKernelManager.cull_interval={idle_timeout}",
]

if not all_kernels:
kernel_name = re.sub(r"[^\w]+", "", project_name).strip() or "Kedro"

cmd += [
"--NotebookApp.kernel_spec_manager_class="
"kedro.framework.cli.jupyter.SingleKernelSpecManager",
f"--KernelSpecManager.default_kernel_name='{kernel_name}'",
]

return cmd + list(args)


def _build_jupyter_env(kedro_env: str) -> Dict[str, Any]:
"""Build the environment dictionary that gets injected into the subprocess running
Jupyter. Since the subprocess has access only to the environment variables passed
in, we need to copy the current environment and add ``KEDRO_ENV``.
"""
if not kedro_env:
return {}
jupyter_env = os.environ.copy()
jupyter_env["KEDRO_ENV"] = kedro_env
return {"env": jupyter_env}

antonymilne marked this conversation as resolved.
Show resolved Hide resolved

def _export_nodes(filepath: Path, output_path: Path) -> None:
"""Copy code from Jupyter cells into nodes in src/<package_name>/nodes/,
under filename with same name as notebook.
Expand Down
6 changes: 1 addition & 5 deletions kedro/framework/cli/project.py
Original file line number Diff line number Diff line change
Expand Up @@ -21,7 +21,6 @@
command_with_verbosity,
env_option,
forward_command,
ipython_message,
python_call,
split_string,
)
Expand Down Expand Up @@ -124,12 +123,9 @@ def ipython(
"""Open IPython with project specific variables loaded."""
_check_module_importable("IPython")

os.environ["IPYTHONDIR"] = str(metadata.project_path / ".ipython")
if env:
os.environ["KEDRO_ENV"] = env
if "-h" not in args and "--help" not in args:
ipython_message()
call(["ipython"] + list(args))
call(["ipython", "--ext", "kedro.extras.extensions.ipython"] + list(args))


@project_group.command()
Expand Down
17 changes: 0 additions & 17 deletions kedro/framework/cli/utils.py
Original file line number Diff line number Diff line change
Expand Up @@ -297,23 +297,6 @@ def env_option(func_=None, **kwargs):
return opt(func_) if func_ else opt


def ipython_message(all_kernels=True):
"""Show a message saying how we have configured the IPython env."""
ipy_vars = ["startup_error", "context"]
click.secho("-" * 79, fg="cyan")
click.secho("Starting a Kedro session with the following variables in scope")
click.secho(", ".join(ipy_vars), fg="green")
line_magic = click.style("%reload_kedro", fg="green")
click.secho(f"Use the line magic {line_magic} to refresh them")
click.secho("or to see the error message if they are undefined")

if not all_kernels:
click.secho("The choice of kernels is limited to the default one.", fg="yellow")
click.secho("(restart with --all-kernels to get access to others)", fg="yellow")

click.secho("-" * 79, fg="cyan")


@contextmanager
def _filter_deprecation_warnings():
"""Temporarily suppress all DeprecationWarnings."""
Expand Down

This file was deleted.

Original file line number Diff line number Diff line change
@@ -1,6 +1,5 @@
black==21.5b1
flake8>=3.7.9, <4.0
ipython~=7.10
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was just left over from a bad merge - the specification on the next line is the right one.

ipython>=7.31.1, <8.0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we relax this one further, e.g. <9.0? The most recent ipython version atm is 8.1.1.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could, but ipython 8 requires Python >= 3.8 and we still support 3.7. Happy to bump it though if you like and let pip work out which version to install for the user's given version of Python. Or maybe even just remove it from the requirements given it's a core dependency of jupyter and jupyterlab anyway?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think if we can support both (i.e. no breaking changes that make us have different code for different python versions), we can bump the requirement. I'm slightly more in favour of that over removing it completely, just for users that want to use ipython over jupyter and only need to remove the relevant lines. I don't have any strong opinions on this though.

isort~=5.0
jupyter~=1.0
Expand Down
1 change: 0 additions & 1 deletion requirements.txt
Original file line number Diff line number Diff line change
Expand Up @@ -6,7 +6,6 @@ dynaconf>=3.1.2,<4.0.0
fsspec>=2021.4, <=2022.1
gitpython~=3.0
jmespath>=0.9.5, <1.0
jupyter_client>=5.1, <8.0
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is already in the project template requirements, which is where it really belongs.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As noted elsewhere I've actually removed jupyter_client altogether:

This is a core dependence of jupyter so I've decided to not list it explicitly so we don't have to keep bumping the version requirements.

pip-tools~=6.5
pluggy~=1.0.0
python-json-logger~=2.0
Expand Down
Loading