[ci] add dependabot, update pre-commit hooks
jameslamb committed Sep 2, 2024
1 parent a9352b1 commit 7ad4719
Showing 10 changed files with 50 additions and 32 deletions.
15 changes: 15 additions & 0 deletions .github/dependabot.yml
@@ -0,0 +1,15 @@
---
version: 2
updates:
- package-ecosystem: github-actions
directory: /
schedule:
interval: monthly
groups:
ci-dependencies:
patterns:
- "*"
commit-message:
prefix: "[ci]"
labels:
- maintenance
18 changes: 5 additions & 13 deletions .github/workflows/main.yml
@@ -8,18 +8,8 @@ jobs:
name: lint
runs-on: ubuntu-latest
steps:
- name: Checkout repository
uses: actions/checkout@v4
- name: Set up Python
uses: conda-incubator/setup-miniconda@v3
with:
python-version: 3.11
- name: linting
if: matrix.task == 'linting'
shell: bash
run: |
pip install --upgrade pre-commit
pre-commit run --all-files
- uses: actions/checkout@v4
- uses: pre-commit/action@v3.0.1
build:
name: build
needs: [lint]
@@ -38,7 +28,9 @@ jobs:
all-tests-successful:
if: always()
runs-on: ubuntu-latest
needs: [build, lint]
needs:
- build
- lint
steps:
- name: Decide whether the needed jobs succeeded or failed
uses: re-actors/alls-green@v1.2.2
1 change: 1 addition & 0 deletions .gitignore
@@ -34,6 +34,7 @@ profiling-output/
__pycache/
*.query
*.rsa
.ruff_cache/
*.so
*.sqlite
*.tar.gz
4 changes: 2 additions & 2 deletions .pre-commit-config.yaml
@@ -18,7 +18,7 @@ repos:
name: isort (python)
args: ["--settings-path", "pyproject.toml"]
- repo: https://github.com/pre-commit/mirrors-mypy
rev: v1.10.0
rev: v1.11.2
hooks:
- id: mypy
args: ["--config-file", "pyproject.toml"]
@@ -27,7 +27,7 @@ repos:
- types-requests
- repo: https://github.com/astral-sh/ruff-pre-commit
# Ruff version.
rev: v0.4.10
rev: v0.6.3
hooks:
# Run the linter.
- id: ruff
2 changes: 1 addition & 1 deletion Dockerfile-cluster
@@ -1,4 +1,4 @@
ARG BASE_IMAGE
ARG BASE_IMAGE=unset
ARG INSTALL_DIR=/opt/LightGBM

# hadolint ignore=DL3006
2 changes: 1 addition & 1 deletion Dockerfile-cluster-base
@@ -12,7 +12,7 @@ RUN apt-get update && \
build-essential \
cmake \
libomp-dev && \
pip install --no-cache-dir \
pip install --no-cache-dir --prefer-binary \
blosc \
bokeh \
dask==${DASK_VERSION} \
2 changes: 1 addition & 1 deletion Dockerfile-notebook
@@ -1,4 +1,4 @@
ARG BASE_IMAGE
ARG BASE_IMAGE=unset
ARG INSTALL_DIR=/root/testing/LightGBM

# hadolint ignore=DL3006
4 changes: 2 additions & 2 deletions Dockerfile-notebook-base
@@ -1,6 +1,6 @@
FROM python:3.11-slim

ARG DASK_VERSION
ARG DASK_VERSION=unset

ENV \
DASK_VERSION=${DASK_VERSION} \
@@ -14,7 +14,7 @@ RUN apt-get update && \
cmake \
libomp-dev \
ninja-build && \
pip install --no-cache-dir \
pip install --no-cache-dir --prefer-binary \
'aiobotocore[awscli,boto3]>=2.5.0' \
blosc \
bokeh \
4 changes: 2 additions & 2 deletions Dockerfile-profiling
@@ -1,9 +1,9 @@
ARG BASE_IMAGE
ARG BASE_IMAGE=unset

# hadolint ignore=DL3006
FROM ${BASE_IMAGE}

RUN pip install --no-cache-dir \
RUN pip install --no-cache-dir --prefer-binary \
memray \
pytest \
pytest-memray \
30 changes: 20 additions & 10 deletions README.md
@@ -2,7 +2,8 @@

[![GitHub Actions status](https://github.com/jameslamb/lightgbm-dask-testing/workflows/Continuous%20Integration/badge.svg?branch=main)](https://github.com/jameslamb/lightgbm-dask-testing/actions)

This repository can be used to test and develop changes to LightGBM's Dask integration. It contains the following useful features:
This repository can be used to test and develop changes to LightGBM's Dask integration.
It contains the following useful features:

* `make` recipes for building a local development image with `lightgbm` installed from a local copy, and Jupyter Lab running for interactive development
* Jupyter notebooks for testing `lightgbm.dask` against a `LocalCluster` (multi-worker, single-machine) and a `dask_cloudprovider.aws.FargateCluster` (multi-worker, multi-machine)
@@ -22,7 +23,8 @@ This repository can be used to test and develop changes to LightGBM's Dask integ

## Getting Started

To begin, clone a copy of LightGBM to a folder `LightGBM` at the root of this repo. You can do this however you want, for example:
To begin, clone a copy of LightGBM to a folder `LightGBM` at the root of this repo.
You can do this however you want, for example:

```shell
git clone --recursive git@github.com:microsoft/LightGBM.git LightGBM
@@ -36,7 +38,6 @@ If you're developing a reproducible example for [an issue](https://github.com/mi

This section describes how to test a version of LightGBM in Jupyter.


#### 1. Build the notebook image

Run the following to build an image that includes `lightgbm`, all its dependencies, and a JupyterLab setup.
@@ -80,27 +81,31 @@ To test `lightgbm.dask` on a `LocalCluster`, run the steps in ["Develop in Jupyt

## Test with a `FargateCluster`

There are some problems with Dask code which only arise in a truly distributed, multi-machine setup. To test for these sorts of issues, I like to use [`dask-cloudprovider`](https://github.com/dask/dask-cloudprovider).
There are some problems with Dask code which only arise in a truly distributed, multi-machine setup.
To test for these sorts of issues, I like to use [`dask-cloudprovider`](https://github.com/dask/dask-cloudprovider).

The steps below describe how to test a local copy of LightGBM on a `FargateCluster` from `dask-cloudprovider`.

#### 1. Build the cluster image

Build an image that can be used for the scheduler and works in the Dask cluster you'll create on AWS Fargate. This image will have your local copy of LightGBM installed in it.
Build an image that can be used for the scheduler and works in the Dask cluster you'll create on AWS Fargate.
This image will have your local copy of LightGBM installed in it.

```shell
make cluster-image
```

#### 2. Install and configure the AWS CLI

For the rest of the steps in this section, you'll need access to AWS resources. To begin, install the AWS CLI if you don't already have it.
For the rest of the steps in this section, you'll need access to AWS resources.
To begin, install the AWS CLI if you don't already have it.

```shell
pip install --upgrade awscli
```

Next, configure your shell to make authenticated requests to AWS. If you've never done this, you can see [the AWS CLI docs](https://docs.aws.amazon.com/cli/latest/userguide/cli-chap-configure.html).
Next, configure your shell to make authenticated requests to AWS.
If you've never done this, you can see [the AWS CLI docs](https://docs.aws.amazon.com/cli/latest/userguide/cli-chap-configure.html).

The rest of this section assumes that the shell variables `AWS_SECRET_ACCESS_KEY` and `AWS_ACCESS_KEY_ID` have been set.
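One common way to set those variables is to keep them in an env file and export them with the shell's `allexport` option. The sketch below illustrates that pattern only; the file path and credential values are placeholders, not anything from this repository.

```shell
# Sketch only: the path and credential values below are placeholders.
cat > /tmp/aws-credentials.env <<'EOF'
AWS_ACCESS_KEY_ID=AKIAEXAMPLEKEY
AWS_SECRET_ACCESS_KEY=example-secret-key
EOF

# While 'allexport' is on, every variable assigned is automatically exported.
set -o allexport
. /tmp/aws-credentials.env
set +o allexport

echo "${AWS_ACCESS_KEY_ID}"
```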

@@ -122,9 +127,12 @@ set +o allexport

#### 3. Push the cluster image to ECR

To use the cluster image in the containers you spin up on Fargate, it has to be available in a container registry. This project uses the free AWS Elastic Container Registry (ECR) Public. For more information on ECR Public, see [the AWS docs](https://docs.amazonaws.cn/en_us/AmazonECR/latest/public/docker-push-ecr-image.html).
To use the cluster image in the containers you spin up on Fargate, it has to be available in a container registry.
This project uses the free AWS Elastic Container Registry (ECR) Public.
For more information on ECR Public, see [the AWS docs](https://docs.amazonaws.cn/en_us/AmazonECR/latest/public/docker-push-ecr-image.html).

The command below will create a new repository on ECR Public, store the details of that repository in a file `ecr-details.json`, and push the cluster image to it. The cluster image will not contain your credentials, notebooks, or other local files.
The command below will create a new repository on ECR Public, store the details of that repository in a file `ecr-details.json`, and push the cluster image to it.
The cluster image will not contain your credentials, notebooks, or other local files.

```shell
make push-image
@@ -134,7 +142,9 @@

#### 4. Run the AWS notebook

Follow the steps in ["Develop in Jupyter"](#develop-in-jupyter) to get a local Jupyter Lab running. Open [`aws.ipynb`](./notebooks/fargate-cluster.ipynb). That notebook contains sample code that uses `dask-cloudprovider` to provision a Dask cluster on AWS Fargate.
Follow the steps in ["Develop in Jupyter"](#develop-in-jupyter) to get a local Jupyter Lab running.
Open [`aws.ipynb`](./notebooks/fargate-cluster.ipynb).
That notebook contains sample code that uses `dask-cloudprovider` to provision a Dask cluster on AWS Fargate.

You can view the cluster's current state and its logs by navigating to the Elastic Container Service (ECS) section of the AWS console.

Expand Down
