Skip to content

Client flags for writing environment.environment_management fields to the bundle's manifest.json #461

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 10 commits into from
Aug 18, 2023

Conversation

dbkegley
Copy link
Collaborator

@dbkegley dbkegley commented Aug 4, 2023

Intent

Add --disable-env-management, --disable-env-management-py and --disable-env-management-r flags for all content types that support environment restores. These flags indicate to Connect that the user is responsible for Python/R package installation, and Connect should not install packages during the build. The Python/R packages must still be available in the runtime environment in order to run the content. This is especially useful if off-host execution is enabled when the execution environment (specified by --image) already contains the required packages. Requires Posit Connect >=2023.07.0.

Type of Change

  • Bug Fix
  • New Feature
  • Breaking Change

Approach

Adds 3 new flags for all sub-commands that might write a manifest.json:

  • --disable-env-management - shorthand flag to disable both R and Python env management
  • --disable-env-management-r - disable only R env management
  • --disable-env-management-py - disable only Python env management.

All 3 flags are Null-able and default to None. None indicates that no value should be written to the manifest.
Disabling environment management means that the respective value in the manifest should be False

Note: There is currently no way to explicitly "enable" env management. "enabled" is the current default behavior so users should rely on None to mean enabled. This is a potential shortcoming of the current --disable-style flag implementation.

Because of the nature of CLI flags (present == True), we use a callback to invert the value of the flag. The value is then stored on a non-negative variable which we pass around as an argument. (e.g., --disable-env-management-py == env_managment_py=False). This logic is controlled by the func env_management_callback(ctx, param, value), which is shared by --disable-env-management-r and --disable-env-management-py.

All 3 new flags are added via the new @runtime_environment_args helper annotation. The existing --image flag was also added to this helper annotation. All @runtime_environment_args flags are used to write values to the environment field in the bundle's manifest.json

Automated Tests

  • Updated/Added manifest-writer tests in bundles_test.py

Directions for Reviewers

We should validate all commands that write manifests have --disable-env-management flags in the help options. These include (flags listed for api are also shared by fastapi,dash,streamlit,bokeh,shiny so those don't need to be validated separately):

rsconnect write-manifest api       --help  
rsconnect write-manifest notebook  --help  
rsconnect write-manifest quarto    --help  
rsconnect write-manifest voila     --help

rsconnect deploy api       --help  
rsconnect deploy notebook  --help  
rsconnect deploy quarto    --help  
rsconnect deploy voila     --help

There are 3 helpers that write manifests in bundle.py:

make_html_manifest()
make_source_manifest()
class Manifest.__init__()

The 3 helpers above can all be exercised by one of the following commands. They should be run with different combinations of the 3 --disable-env-management flags to make sure the right values are written to the manifest.

  • rsconnect write-manifest api - calls make_source_manifest()
  • rsconnect write-manifest voila - calls class Manifest.__init__()
  • rsconnect deploy notebook --static - calls make_html_manifest()

When testing rsconnect deploy notebook --static, the resulting manifest will only be available on the connect server in the data dir

Checklist

  • I have updated CHANGELOG.md to cover notable changes.
  • I have updated all related GitHub issues to reflect their current state.

@dbkegley dbkegley force-pushed the kegs-skip-restore branch from 9a3a8d3 to 4a43eae Compare August 4, 2023 20:11
@github-actions
Copy link

github-actions bot commented Aug 4, 2023

☂️ Python Coverage

current status: ✅

Overall Coverage

Lines Covered Coverage Threshold Status
4284 2761 64% 0% 🟢

New Files

No new covered files...

Modified Files

File Coverage Status
rsconnect/actions.py 33% 🟢
rsconnect/bundle.py 80% 🟢
rsconnect/main.py 55% 🟢
TOTAL 56% 🟢

updated for commit: 568e217 by action🐍

Copy link
Contributor

@toph-allen toph-allen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This generally looks good! My comments are just nitpicks or thoughts I had in the moment.

I didn't run through the instructions for exercising the code — I'm assuming we'll perform QA on this separately.

There are some areas of the code where I don't have insight into the full context so bear that in mind. Another review wouldn't hurt.

@@ -244,6 +244,58 @@ def wrapper(*args, **kwargs):
return wrapper


# inverts bool args if they are provided, otherwise returns None
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't just invert the bool args, but also sets them from the shorthand/"both" flag if that's present.

Comment on lines +123 to +133

if image or env_management_py is not None or env_management_r is not None:
self.data["environment"] = {}
if image:
self.data["environment"]["image"] = image
if env_management_py is not None or env_management_r is not None:
self.data["environment"]["environment_management"] = {}
if env_management_py is not None:
self.data["environment"]["environment_management"]["python"] = env_management_py
if env_management_r is not None:
self.data["environment"]["environment_management"]["r"] = env_management_r
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This block seems to be present 3 (as far as I've seen) times. Is it worth putting it in a small helper function e.g. make_environment_data(image, env_management_py, env_management_r)? I don't have the larger context of this file and how manifest generation is factored out so it may not make sense, but worth asking.

Comment on lines 279 to 280
help="Disable Python environment management. Connect will not perform any Python package installation. "
"It is the user's responsibility to ensure required packages are installed in the runtime environment.",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there are parts of this which might be unclear to end users. Going to think through it in this comment.

Disable Python environment management. Connect will not perform any Python package installation. It is the user's responsibility to ensure required packages are installed in the runtime environment.

  • "the user's responsibility" — actually probably the server admin's responsibility
  • "the runtime environment" — might be unclear for users without a clear mental model of the runtime environment on the Connect server that we're talking about, who might think this is referring to their local runtime environment.

Maybe something like

Disable Python environment management for this bundle. Connect will not create an environment or instal packages. An administrator must install any required packages in the correct Python environment on the Connect server.

And similar for the R help.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this; I think the "for this bundle" distinction is also really important. Since the setting in the manifest.json, and thus the bundle, have the highest precedence it makes sense to explicitly say that these changes must be applied to every bundle for them to take effect.

If the user wants to apply these changes to all bundles then they'll need to use the server API to set an environment management strategy at the Application-level.

@@ -538,6 +540,10 @@ def write_quarto_manifest_json(
:param extra_files: Any extra files to include in the manifest.
:param excludes: A sequence of glob patterns to exclude when enumerating files to bundle.
:param image: the optional docker image to be specified for off-host execution. Default = None.
:param env_management_py: False indicates that the user is responsible for Python package installation
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
:param env_management_py: False indicates that the user is responsible for Python package installation
:param env_management_py: False prevents Connect from managing the Python environment for this bundle. The server administrator is responsible for installing packages in runtime environment on the server.

Something like that maybe? I wanna get away from saying it's the user's responsibility, because users likely won't have permission to do it.

@@ -656,7 +693,7 @@ def make_notebook_html_bundle(
bundle_add_buffer(bundle, filename, output)

# manifest
manifest = make_html_manifest(filename, image)
manifest = make_html_manifest(filename, image, env_management_py, env_management_r)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a pity that there are so many functions that this change needs to affect. Not for this PR, but I wonder if there's any way to reduce repetition here, or if it's already as clean as it can be. Just adds more surface area for errors, I feel like. ¯\_(ツ)_/¯

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Completely agree. It would be nice if we relied more on **kwargs throughout. There are a couple places we do this already with **locals() but it’d be nice to be more consistent

@@ -1443,6 +1492,9 @@ def write_manifest_notebook(
file,
extra_files,
image,
disable_env_management,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Checking my understanding: this var isn't actually needed in the function body because the callback that consumes it has already been called before line 1501?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right that it's only required by the callback function. click will complain if it doesn't know how to invoke write_manifest_notebook which is why it needs to be included in the function signature even though it isn't used.

TypeError: write_manifest_notebook() got an unexpected keyword argument 'disable_env_management'

self.assertEqual(
manifest,
{
"version": 1,
"metadata": {"appmode": "python-api"},
"environment": {"image": "rstudio/connect:bionic"},
"environment": {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we wanna include a test for one of these that uses the shorthand flag?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a test case in main_test.py that covers the callback func. It's not exactly what you're asking for but it's pretty close without actually having to print/parse a manifest.

7e3b4c6

@toph-allen
Copy link
Contributor

toph-allen commented Aug 4, 2023

CI failures are not related to this PR.

@@ -244,6 +244,58 @@ def wrapper(*args, **kwargs):
return wrapper


# inverts bool args if they are provided, otherwise returns None
def env_management_callback(ctx, param, value):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit; type hinting is always appreciated.

Suggested change
def env_management_callback(ctx, param, value):
def env_management_callback(ctx, param, value) -> Optional[bool]:

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I need to make this a habit, thanks for the reminder!

@ChaitaC
Copy link

ChaitaC commented Aug 18, 2023

verified on kegs-skip-restore

  • Validate --help Options
rsconnect write-manifest api       --help  
rsconnect write-manifest notebook  --help  
rsconnect write-manifest quarto    --help  
rsconnect write-manifest voila     --help

rsconnect deploy api       --help  
rsconnect deploy notebook  --help  
rsconnect deploy quarto    --help  
rsconnect deploy voila     --help

all commands now include the following flags

--disable-env-management     Shorthand to disable environment management for
                               both Python and R.
  --disable-env-management-py  Disable Python environment management for this
                               bundle. Connect will not create an environment
                               or install packages. An administrator must
                               install the  required packages in the correct
                               Python environment on the Connect server.
  --disable-env-management-r   Disable R environment management for this
                               bundle. Connect will not create an environment
                               or install packages. An administrator must
                               install the  required packages in the correct R
                               environment on the Connect server.
  • run with different combinations of the 3 --disable-env-management flags to ensure the right values are written to the manifest.

rsconnect write-manifest bokeh --overwrite --disable-env-management-r .

 "environment": {
    "environment_management": {
      "r": false
    }

rsconnect write-manifest bokeh --overwrite --disable-env-management-py .

  "environment": {
    "environment_management": {
      "python": false
    }

rsconnect write-manifest bokeh --overwrite --disable-env-management .

 "environment": {
    "environment_management": {
      "python": false,
      "r": false
    }

@ChaitaC
Copy link

ChaitaC commented Aug 18, 2023

  • Verified Deploy and manifest options with all 3 flags
    example:

rsconnect deploy notebook --static

  • rsconnect deploy notebook --disable-env-management --new --server https://rsc.radixu.com/ --api-key <you_api_key> ./stock-report-jupyter.ipynb quandl-wiki-tsla.json.gz
    rsconnect deploy notebook --disable-env-management-r --new --server https://rsc.radixu.com/ --api-key <you_api_key> ./stock-report-jupyter.ipynb quandl-wiki-tsla.json.gz
    rsconnect deploy notebook --disable-env-management-py --new --server https://rsc.radixu.com/ --api-key <you_api_key> ./stock-report-jupyter.ipynb quandl-wiki-tsla.json.gz

  • rsconnect write-manifest notebook --disable-env-management-py --overwrite ./stock-report-jupyter.ipynb quandl-wiki-tsla.json.gz
    rsconnect write-manifest notebook --disable-env-management-r ./stock-report-jupyter.ipynb quandl-wiki-tsla.json.gz
    rsconnect write-manifest notebook --disable-env-management ./stock-report-jupyter.ipynb quandl-wiki-tsla.json.gz

rsconnect write-manifest api

rsconnect write-manifest voila

@dbkegley dbkegley merged commit cdbe078 into master Aug 18, 2023
@dbkegley dbkegley deleted the kegs-skip-restore branch August 18, 2023 21:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants