Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Weekly patch release v1.9.3 #16790

Merged
merged 10 commits into from
Feb 21, 2023
2 changes: 1 addition & 1 deletion .github/workflows/ci-schema.yml
Original file line number Diff line number Diff line change
Expand Up @@ -8,7 +8,7 @@ on:

jobs:
check:
uses: Lightning-AI/utilities/.github/workflows/check-schema.yml@v0.4.1
uses: Lightning-AI/utilities/.github/workflows/check-schema.yml@v0.6.0
with:
# skip azure due to the wrong schema file by MSFT
# https://github.com/Lightning-AI/lightning-flash/pull/1455#issuecomment-1244793607
Expand Down
35 changes: 19 additions & 16 deletions .github/workflows/ci-tests-pytorch.yml
Original file line number Diff line number Diff line change
Expand Up @@ -142,22 +142,25 @@ jobs:
# pip install -r requirements/pytorch/strategies.txt --find-links ${TORCH_URL}
pip list

- name: Reinstall Horovod if necessary
if: runner.os != 'windows'
env:
HOROVOD_BUILD_ARCH_FLAGS: "-mfma"
HOROVOD_WITHOUT_MXNET: 1
HOROVOD_WITHOUT_TENSORFLOW: 1
run: |
HOROVOD_BUILT=$(python -c "import horovod.torch; horovod.torch.nccl_built(); print('SUCCESS')" || true)
if [[ $HOROVOD_BUILT != "SUCCESS" ]]; then
pip uninstall -y horovod
grep "horovod" requirements/pytorch/strategies.txt > requirements/pytorch/horovod.txt
pip install --no-cache-dir -r requirements/pytorch/horovod.txt
fi
horovodrun --check-build
python -c "import horovod.torch"
pip list
# - name: Reinstall Horovod if necessary
# if: runner.os != 'windows'
# env:
# HOROVOD_BUILD_ARCH_FLAGS: "-mfma"
# HOROVOD_WITHOUT_MXNET: 1
# HOROVOD_WITHOUT_TENSORFLOW: 1
# run: |
# HOROVOD_BUILT=$(python -c "import horovod.torch; horovod.torch.nccl_built(); print('SUCCESS')" || true)
# if [[ $HOROVOD_BUILT != "SUCCESS" ]]; then
# pip uninstall -y horovod
# # grep "horovod" requirements/pytorch/strategies.txt > requirements/pytorch/horovod.txt
# # pip install --no-cache-dir -r requirements/pytorch/horovod.txt
#
# # we hardcode the version for testing due to difficulties installing and compiling the package
# pip install --no-cache-dir horovod==0.21.2
# fi
# horovodrun --check-build
# python -c "import horovod.torch"
# pip list

- name: Cache datasets
uses: actions/cache@v3
Expand Down
4 changes: 2 additions & 2 deletions .github/workflows/schedule-clear-cache.yml
Original file line number Diff line number Diff line change
Expand Up @@ -16,12 +16,12 @@ jobs:

cron-clear:
if: github.event_name == 'schedule'
uses: Lightning-AI/utilities/.github/workflows/clear-cache.yml@v0.5.0
uses: Lightning-AI/utilities/.github/workflows/clear-cache.yml@v0.6.0
with:
pattern: 'latest|docs'

direct-clear:
if: github.event_name == 'workflow_dispatch'
uses: Lightning-AI/utilities/.github/workflows/clear-cache.yml@v0.5.0
uses: Lightning-AI/utilities/.github/workflows/clear-cache.yml@v0.6.0
with:
pattern: ${{ inputs.pattern }}
2 changes: 1 addition & 1 deletion dockers/base-cuda/Dockerfile
Original file line number Diff line number Diff line change
Expand Up @@ -152,7 +152,7 @@ RUN \
# install ColossalAI
# TODO: 1.13 wheels are not released, remove skip once they are
if [[ $PYTORCH_VERSION != "1.13" ]]; then \
pip install "colossalai==0.2.3"; \
pip install "colossalai==0.2.4"; \
python -c "import colossalai; print(colossalai.__version__)" ; \
fi

Expand Down
6 changes: 6 additions & 0 deletions src/lightning_app/CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -4,6 +4,12 @@ All notable changes to this project will be documented in this file.

The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/).

## [1.9.3] - 2023-02-21

### Fixed

- Fixed `lightning open` command and improved redirects ([#16794](https://github.com/Lightning-AI/lightning/pull/16794))


## [1.9.2] - 2023-02-15

Expand Down
4 changes: 3 additions & 1 deletion src/lightning_app/components/serve/auto_scaler.py
Original file line number Diff line number Diff line change
Expand Up @@ -453,7 +453,9 @@ def _get_endpoint_info_page(self) -> Optional["APIAccessFrontend"]: # noqa: F82
try:
from lightning_api_access import APIAccessFrontend
except ModuleNotFoundError:
logger.warn("APIAccessFrontend not found. Please install lightning-api-access to enable the UI")
logger.warn(
"Some dependencies to run the UI are missing. To resolve, run `pip install lightning-api-access`"
)
return

if is_running_in_cloud():
Expand Down
4 changes: 3 additions & 1 deletion src/lightning_app/components/serve/python_server.py
Original file line number Diff line number Diff line change
Expand Up @@ -293,7 +293,9 @@ def configure_layout(self) -> Optional["Frontend"]:
try:
from lightning_api_access import APIAccessFrontend
except ModuleNotFoundError:
logger.warn("APIAccessFrontend not found. Please install lightning-api-access to enable the UI")
logger.warn(
"Some dependencies to run the UI are missing. To resolve, run `pip install lightning-api-access`"
)
return

class_name = self.__class__.__name__
Expand Down
17 changes: 0 additions & 17 deletions src/lightning_app/core/flow.py
Original file line number Diff line number Diff line change
Expand Up @@ -21,7 +21,6 @@

from deepdiff import DeepHash

from lightning_app.core.plugin import Plugin
from lightning_app.core.work import LightningWork
from lightning_app.frontend import Frontend
from lightning_app.storage import Path
Expand Down Expand Up @@ -741,22 +740,6 @@ def configure_api(self):
"""
raise NotImplementedError

def configure_plugins(self) -> Optional[List[Dict[str, Plugin]]]:
"""Configure the plugins of this LightningFlow.

Returns a list of dictionaries mapping a plugin name to a :class:`lightning_app.core.plugin.Plugin`.

.. code-block:: python

class Flow(LightningFlow):
def __init__(self):
super().__init__()

def configure_plugins(self):
return [{"my_plugin_name": MyPlugin()}]
"""
pass

def state_dict(self):
"""Returns the current flow state but not its children."""
return {
Expand Down
189 changes: 100 additions & 89 deletions src/lightning_app/core/plugin.py
Original file line number Diff line number Diff line change
Expand Up @@ -12,9 +12,11 @@
# See the License for the specific language governing permissions and
# limitations under the License.
import os
import tarfile
import tempfile
from pathlib import Path
from typing import Any, Dict, Optional
from typing import Dict, List, Optional
from urllib.parse import urlparse

import requests
import uvicorn
Expand All @@ -23,77 +25,39 @@
from pydantic import BaseModel

from lightning_app.utilities.app_helpers import Logger
from lightning_app.utilities.cloud import _get_project
from lightning_app.utilities.component import _set_flow_context
from lightning_app.utilities.enum import AppStage
from lightning_app.utilities.network import LightningClient
from lightning_app.utilities.load_app import _load_plugin_from_file

logger = Logger(__name__)


class Plugin:
"""A ``Plugin`` is a single-file Python class that can be executed within a cloudspace to perform actions."""
class LightningPlugin:
"""A ``LightningPlugin`` is a single-file Python class that can be executed within a cloudspace to perform
actions."""

def __init__(self) -> None:
self.app_url = None
self.project_id = ""
self.cloudspace_id = ""
self.cluster_id = ""

def run(self, name: str, entrypoint: str) -> None:
"""Override with the logic to execute on the client side."""
def run(self, *args: str, **kwargs: str) -> None:
"""Override with the logic to execute on the cloudspace."""

def run_app_command(self, command_name: str, config: Optional[BaseModel] = None) -> Dict[str, Any]:
"""Run a command on the app associated with this plugin.
def run_job(self, name: str, app_entrypoint: str, env_vars: Optional[Dict[str, str]] = None) -> None:
"""Run a job in the cloudspace associated with this plugin.

Args:
command_name: The name of the command to run.
config: The command config or ``None`` if the command doesn't require configuration.
name: The name of the job.
app_entrypoint: The path of the file containing the app to run.
env_vars: Additional env vars to set when running the app.
"""
if self.app_url is None:
raise RuntimeError("The plugin must be set up before `run_app_command` can be called.")

command = command_name.replace(" ", "_")
resp = requests.post(self.app_url + f"/command/{command}", data=config.json() if config else None)
if resp.status_code != 200:
try:
detail = str(resp.json())
except Exception:
detail = "Internal Server Error"
raise RuntimeError(f"Failed with status code {resp.status_code}. Detail: {detail}")

return resp.json()

def _setup(self, app_id: str) -> None:
client = LightningClient()
project_id = _get_project(client).project_id
response = client.lightningapp_instance_service_list_lightningapp_instances(
project_id=project_id, app_id=app_id
)
if len(response.lightningapps) > 1:
raise RuntimeError(f"Found multiple apps with ID: {app_id}")
if len(response.lightningapps) == 0:
raise RuntimeError(f"Found no apps with ID: {app_id}")
self.app_url = response.lightningapps[0].status.url


class _Run(BaseModel):
plugin_name: str
project_id: str
cloudspace_id: str
name: str
entrypoint: str
cluster_id: Optional[str] = None
app_id: Optional[str] = None


def _run_plugin(run: _Run) -> None:
"""Create a run with the given name and entrypoint under the cloudspace with the given ID."""
if run.app_id is None and run.plugin_name == "app":
from lightning_app.runners.cloud import CloudRuntime

# TODO: App dispatch should be a plugin
# Dispatch the run
# Dispatch the job
_set_flow_context()

entrypoint_file = Path("/content") / run.entrypoint
entrypoint_file = Path(app_entrypoint)

app = CloudRuntime.load_app_from_file(str(entrypoint_file.resolve().absolute()))

Expand All @@ -103,54 +67,101 @@ def _run_plugin(run: _Run) -> None:
app=app,
entrypoint=entrypoint_file,
start_server=True,
env_vars={},
env_vars=env_vars if env_vars is not None else {},
secrets={},
run_app_comment_commands=True,
)
# Used to indicate Lightning has been dispatched
os.environ["LIGHTNING_DISPATCHED"] = "1"

runtime.cloudspace_dispatch(
project_id=self.project_id,
cloudspace_id=self.cloudspace_id,
name=name,
cluster_id=self.cluster_id,
)

def _setup(
self,
project_id: str,
cloudspace_id: str,
cluster_id: str,
) -> None:
self.project_id = project_id
self.cloudspace_id = cloudspace_id
self.cluster_id = cluster_id


class _Run(BaseModel):
plugin_entrypoint: str
source_code_url: str
project_id: str
cloudspace_id: str
cluster_id: str
plugin_arguments: Dict[str, str]


def _run_plugin(run: _Run) -> List:
"""Create a run with the given name and entrypoint under the cloudspace with the given ID."""
with tempfile.TemporaryDirectory() as tmpdir:
download_path = os.path.join(tmpdir, "source.tar.gz")
source_path = os.path.join(tmpdir, "source")
os.makedirs(source_path)

# Download the tarball
try:
runtime.cloudspace_dispatch(
project_id=run.project_id,
cloudspace_id=run.cloudspace_id,
name=run.name,
cluster_id=run.cluster_id,
)
# Sometimes the URL gets encoded, so we parse it here
source_code_url = urlparse(run.source_code_url).geturl()

response = requests.get(source_code_url)

with open(download_path, "wb") as f:
f.write(response.content)
except Exception as e:
raise HTTPException(status_code=status.HTTP_500_INTERNAL_SERVER_ERROR, detail=str(e))
elif run.app_id is not None:
from lightning_app.utilities.cli_helpers import _LightningAppOpenAPIRetriever
from lightning_app.utilities.commands.base import _download_command
raise HTTPException(
status_code=status.HTTP_500_INTERNAL_SERVER_ERROR,
detail=f"Error downloading plugin source: {str(e)}.",
)

retriever = _LightningAppOpenAPIRetriever(run.app_id)
# Extract
try:
with tarfile.open(download_path, "r:gz") as tf:
tf.extractall(source_path)
except Exception as e:
raise HTTPException(
status_code=status.HTTP_500_INTERNAL_SERVER_ERROR,
detail=f"Error extracting plugin source: {str(e)}.",
)

metadata = retriever.api_commands[run.plugin_name] # type: ignore
# Import the plugin
try:
plugin = _load_plugin_from_file(os.path.join(source_path, run.plugin_entrypoint))
except Exception as e:
raise HTTPException(
status_code=status.HTTP_500_INTERNAL_SERVER_ERROR, detail=f"Error loading plugin: {str(e)}."
)

with tempfile.TemporaryDirectory() as tmpdir:
# Ensure that apps are dispatched from the temp directory
cwd = os.getcwd()
os.chdir(source_path)

target_file = os.path.join(tmpdir, f"{run.plugin_name}.py")
plugin = _download_command(
run.plugin_name,
metadata["cls_path"],
metadata["cls_name"],
run.app_id,
target_file=target_file,
# Setup and run the plugin
try:
plugin._setup(
project_id=run.project_id,
cloudspace_id=run.cloudspace_id,
cluster_id=run.cluster_id,
)
plugin.run(**run.plugin_arguments)
except Exception as e:
raise HTTPException(
status_code=status.HTTP_500_INTERNAL_SERVER_ERROR, detail=f"Error running plugin: {str(e)}."
)
finally:
os.chdir(cwd)

if isinstance(plugin, Plugin):
plugin._setup(app_id=run.app_id)
plugin.run(run.name, run.entrypoint)
else:
# This should never be possible but we check just in case
raise HTTPException(
status_code=status.HTTP_500_INTERNAL_SERVER_ERROR,
detail=f"The plugin {run.plugin_name} is an incorrect type.",
)
else:
raise HTTPException(
status_code=status.HTTP_400_BAD_REQUEST, detail="App ID must be specified unless `plugin_name='app'`."
)
# TODO: Return actions from the plugin here
return []


def _start_plugin_server(host: str, port: int) -> None:
Expand Down
5 changes: 4 additions & 1 deletion src/lightning_app/core/work.py
Original file line number Diff line number Diff line change
Expand Up @@ -639,7 +639,10 @@ def _aggregate_status_timeout(self, statuses: List[Dict]) -> WorkStatus:
return WorkStatus(**status, count=len(timeout_statuses))

def on_exit(self):
"""Override this hook to add your logic when the work is exiting."""
"""Override this hook to add your logic when the work is exiting.

Note: This hook is not guaranteed to be called when running in the cloud.
"""
pass

def stop(self):
Expand Down
Loading