ci: add automated and on demand testing of fluence #49

Merged: 1 commit, merged on Jan 5, 2024
97 changes: 97 additions & 0 deletions .github/test.sh
@@ -0,0 +1,97 @@
#!/bin/bash

# This will test fluence with two jobs.
# We use Jobs because they generate output and complete, whereas bare Pods
# are expected to keep running (and would then error)

set -eEu -o pipefail

# ensure upstream exists
# This test script assumes fluence image and sidecar are already built
make prepare

# Keep track of root directory to return to
here=$(pwd)

# The Never pull policy ensures we use our loaded (just built) images
cd upstream/manifests/install/charts
helm install \
--set scheduler.image=ghcr.io/flux-framework/fluence:latest \
--set scheduler.sidecarimage=ghcr.io/flux-framework/fluence-sidecar:latest \
--set scheduler.pullPolicy=Never \
--set scheduler.sidecarPullPolicy=Never \
schedscheduler-plugins as-a-second-scheduler/

# These containers should already be loaded into kind (or your local cluster)
echo "Sleeping 10 seconds waiting for scheduler deploy"
sleep 10
kubectl get pods

# This will get the fluence pod (which has the scheduler and sidecar containers); it should be listed first
fluence_pod=$(kubectl get pods -o json | jq -r .items[0].metadata.name)
echo "Found fluence pod ${fluence_pod}"

# Show logs for debugging, if needed
echo
echo "⭐️ kubectl logs ${fluence_pod} -c sidecar"
kubectl logs ${fluence_pod} -c sidecar
echo
echo "⭐️ kubectl logs ${fluence_pod} -c scheduler-plugins-scheduler"
kubectl logs ${fluence_pod} -c scheduler-plugins-scheduler

# We now want to apply the examples
cd ${here}/examples/test_example

# Apply both example jobs
kubectl apply -f fluence-job.yaml
kubectl apply -f default-job.yaml
Comment on lines +46 to +47

Member commented:

Note that scheduling pods with kube-scheduler and Fluence on the same cluster isn't supported. There isn't currently any way to propagate pod-to-node mappings generated by kube-scheduler to Fluence.

It's important that kubectl apply -f fluence-job.yaml is executed before kubectl apply -f default-job.yaml, and that they don't specify limits or requests so they could be scheduled on the same node. That's currently the case in this PR, but I'm emphasizing it for posterity.

Regardless, there still may be some funky race condition that occurs and results in unschedulable pods.

Member Author replied:

> It's important that kubectl apply -f fluence-job.yaml is executed before kubectl apply -f default-job.yaml, and that they don't specify limits or requests so they could be scheduled on the same node. That's currently the case in this PR, but I'm emphasizing it for posterity.

Gotcha - I think for the testing cluster (and the example we already had in main) we are just doing that, putting them on the same node, and since it's a tiny kind or otherwise local cluster, there hasn't been an issue. If we extended this to an actual setup, there would be. This is an important point and I've opened an issue to emphasize it in future docs: #53. Maybe we can also think of a creative way to allow for both, possibly with kueue resource flavors that create distinct (separate) resources labeled for each scheduler.
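
To make that idea concrete, here is a purely hypothetical sketch (not something this PR implements; the node names and the label key are invented) of giving each scheduler its own labeled node pool so the two never compete for the same nodes. A kueue resource flavor would be a more complete version of the same labeling idea.

```bash
# Hypothetical sketch only: partition nodes into labeled pools, one per scheduler.
# Node names (kind-worker, kind-worker2) and the label key are invented for illustration.
kubectl label node kind-worker  scheduler-pool=fluence
kubectl label node kind-worker2 scheduler-pool=default

# Each Job template would then pin itself to its pool, for example in fluence-job.yaml:
#   spec:
#     template:
#       spec:
#         schedulerName: fluence
#         nodeSelector:
#           scheduler-pool: fluence
```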


# Get them based on associated job
fluence_job_pod=$(kubectl get pods --selector=job-name=fluence-job -o json | jq -r .items[0].metadata.name)
default_job_pod=$(kubectl get pods --selector=job-name=default-job -o json | jq -r .items[0].metadata.name)

echo
echo "Fluence job pod is ${fluence_job_pod}"
echo "Default job pod is ${default_job_pod}"
sleep 10

# Shared function to check output
function check_output {
  check_name="$1"
  actual="$2"
  expected="$3"
  if [[ "${expected}" != "${actual}" ]]; then
    echo "Expected output is ${expected}"
    echo "Actual output is ${actual}"
    exit 1
  fi
}

# Get output (and show)
default_output=$(kubectl logs ${default_job_pod})
default_scheduled_by=$(kubectl get pod ${default_job_pod} -o json | jq -r .spec.schedulerName)
echo
echo "Default scheduler pod output: ${default_output}"
echo " Scheduled by: ${default_scheduled_by}"

fluence_output=$(kubectl logs ${fluence_job_pod})
fluence_scheduled_by=$(kubectl get pod ${fluence_job_pod} -o json | jq -r .spec.schedulerName)
echo
echo "Fluence scheduler pod output: ${fluence_output}"
echo " Scheduled by: ${fluence_scheduled_by}"

# Check output explicitly
check_output 'check-fluence-output' "${fluence_output}" "potato"
check_output 'check-default-output' "${default_output}" "not potato"
check_output 'check-default-scheduled-by' "${default_scheduled_by}" "default-scheduler"
check_output 'check-fluence-scheduled-by' "${fluence_scheduled_by}" "fluence"

# But events tell us what actually happened - let's parse through them and find our pods.
# This tells us the Event -> reason "Scheduled" and who it was reported by.
reported_by=$(kubectl events --for pod/${fluence_job_pod} -o json | jq -c '[ .items[] | select( .reason | contains("Scheduled")) ]' | jq -r .[0].reportingComponent)
check_output 'reported-by-fluence' "${reported_by}" "fluence"

# The default job should be scheduled by the default scheduler, but its reportingComponent is
# empty, so we check the result in source -> component instead
reported_by=$(kubectl events --for pod/${default_job_pod} -o json | jq -c '[ .items[] | select( .reason | contains("Scheduled")) ]' | jq -r .[0].source.component)
check_output 'reported-by-default' "${reported_by}" "default-scheduler"
139 changes: 139 additions & 0 deletions .github/workflows/test.yaml
@@ -0,0 +1,139 @@
name: fluence build test

on:
  pull_request: []
  # Test on demand (dispatch) or once a week, Sunday.
  # We combine the builds and test into one workflow to simplify not needing to share
  # containers between workflows. We also don't want to push unless the tests pass.
  workflow_dispatch:
  schedule:
    - cron: '0 0 * * 0'

jobs:
  build-fluence:
    env:
      container: ghcr.io/flux-framework/fluence
    runs-on: ubuntu-latest
    name: build fluence
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-go@v3
        with:
          go-version: ^1.19

      - name: Build Containers
        run: |
          make prepare
          make build REGISTRY=ghcr.io/flux-framework SCHEDULER_IMAGE=fluence

      - name: Save Container
        run: docker save ${{ env.container }} | gzip > fluence_latest.tar.gz

      - name: Upload container artifact
        uses: actions/upload-artifact@v4
        with:
          name: fluence
          path: fluence_latest.tar.gz

  build-sidecar:
    env:
      container: ghcr.io/flux-framework/fluence-sidecar
    runs-on: ubuntu-latest
    name: build sidecar
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-go@v3
        with:
          go-version: ^1.19

      - name: Build Container
        run: |
          make prepare
          make build-sidecar REGISTRY=ghcr.io/flux-framework SIDECAR_IMAGE=fluence-sidecar

      - name: Save Container
        run: docker save ${{ env.container }} | gzip > fluence_sidecar_latest.tar.gz

      - name: Upload container artifact
        uses: actions/upload-artifact@v4
        with:
          name: fluence_sidecar
          path: fluence_sidecar_latest.tar.gz

  test-fluence:
    needs: [build-fluence, build-sidecar]
    permissions:
      packages: write
    env:
      fluence_container: ghcr.io/flux-framework/fluence
      sidecar_container: ghcr.io/flux-framework/fluence-sidecar

    runs-on: ubuntu-latest
    name: test fluence
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-go@v3
        with:
          go-version: ^1.20

      - name: Download fluence artifact
        uses: actions/download-artifact@v4
        with:
          name: fluence
          path: /tmp

      - name: Download fluence_sidecar artifact
        uses: actions/download-artifact@v4
        with:
          name: fluence_sidecar
          path: /tmp

      - name: Load Docker images
        run: |
          ls /tmp/*.tar.gz
          docker load --input /tmp/fluence_sidecar_latest.tar.gz
          docker load --input /tmp/fluence_latest.tar.gz
          docker image ls -a | grep fluence

      - name: Create Kind Cluster
        uses: helm/kind-action@v1.5.0
        with:
          cluster_name: kind
          kubectl_version: v1.28.2
          version: v0.20.0

      - name: Load Docker Containers into Kind
        env:
          fluence: ${{ env.fluence_container }}
          sidecar: ${{ env.sidecar_container }}
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
        run: |
          kind load docker-image ${fluence}
          kind load docker-image ${sidecar}

      - name: Test Fluence
        run: /bin/bash ./.github/test.sh

      - name: Tag Weekly Images
        run: |
          # YEAR-MONTH-DAY, i.e., YYYY-MM-DD
          tag=$(echo $(date +%Y-%m-%d))
          echo "Tagging and releasing ${{ env.fluence_container }}:${tag}"
          docker tag ${{ env.fluence_container }}:latest ${{ env.fluence_container }}:${tag}
          echo "Tagging and releasing ${{ env.sidecar_container }}:${tag}"
          docker tag ${{ env.sidecar_container }}:latest ${{ env.sidecar_container }}:${tag}

      # If we get here, the tests pass, and we can deploy
      - name: GHCR Login
        if: (github.event_name != 'pull_request')
        uses: docker/login-action@v2
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}

      - name: Deploy Containers
        if: (github.event_name != 'pull_request')
        run: |
          docker push ${{ env.fluence_container }} --all-tags
          docker push ${{ env.sidecar_container }} --all-tags
55 changes: 42 additions & 13 deletions README.md
@@ -4,8 +4,6 @@

Fluence enables HPC-grade pod scheduling in Kubernetes via the [Kubernetes Scheduling Framework](https://kubernetes.io/docs/concepts/scheduling-eviction/scheduling-framework/). Fluence uses the directed-graph based [Fluxion scheduler](https://github.com/flux-framework/flux-sched) to map pods or [podgroups](https://github.com/kubernetes-sigs/scheduler-plugins/tree/master/pkg/coscheduling) to nodes. Fluence supports all the Fluxion scheduling algorithms (e.g., `hi`, `low`, `hinode`, etc.). Note that Fluence does not currently support use in conjunction with the kube-scheduler. Pods must all be scheduled by Fluence.

🚧️ Under Construction! 🚧️

## Getting started

For instructions on how to start Fluence on a K8s cluster, see [examples](examples/). Documentation and instructions for reproducing our CANOPIE2022 paper (citation below) can be found in the [canopie22-artifacts branch](https://github.com/flux-framework/flux-k8s/tree/canopie22-artifacts).
@@ -184,13 +182,14 @@ docker push docker.io/vanessa/fluence

> Prepare a cluster and install the Kubernetes scheduling plugins framework

These steps will require a Kubernetes cluster to install to, and having pushed the plugin container to a registry. If you aren't using a cloud provider, you can create a local one with `kind`:

```bash
kind create cluster
```

**Important**: if you are developing or testing fluence, note that custom scheduler plugins don't seem to work out of the box with MiniKube (but everything works with kind). There are likely extensions or similar that need to be configured for MiniKube (that we have not looked into).

### Install Fluence

For some background, the [Scheduling Framework](https://kubernetes.io/docs/concepts/scheduling-eviction/scheduling-framework/) provided by
@@ -220,33 +219,45 @@ helm show values as-a-second-scheduler/

scheduler:
  name: fluence
  namespace: scheduler-plugins
  image: registry.k8s.io/scheduler-plugins/kube-scheduler:v0.27.8
  replicaCount: 1
  leaderElect: false
  sidecarimage: ghcr.io/flux-framework/fluence-sidecar:latest
  policy: lonode
  pullPolicy: Always
  sidecarPullPolicy: Always

controller:
  name: scheduler-plugins-controller
  namespace: scheduler-plugins
  image: registry.k8s.io/scheduler-plugins/controller:v0.27.8
  replicaCount: 1
  pullPolicy: IfNotPresent

# LoadVariationRiskBalancing and TargetLoadPacking are not enabled by default
# as they need extra RBAC privileges on metrics.k8s.io.

plugins:
  enabled: ["Fluence"]
  disabled: ["CapacityScheduling","NodeResourceTopologyMatch","NodeResourcesAllocatable","PrioritySort","Coscheduling"] # only in-tree plugins need to be defined here

# Customize the enabled plugins' config.
# Refer to the "pluginConfig" section of manifests/<plugin>/scheduler-config.yaml.
# For example, for Coscheduling plugin, you want to customize the permit waiting timeout to 10 seconds:
pluginConfig:
- name: Coscheduling
  args:
    permitWaitingTimeSeconds: 10 # default is 60
# Or, customize the other plugins
# - name: NodeResourceTopologyMatch
#   args:
#     scoringStrategy:
#       type: MostAllocated # default is LeastAllocated
```

</details>

Note that this plugin is going to allow us to create a Deployment with our plugin to be used as a scheduler!
The `helm install` shown under [deploy](#deploy) is how you can install to your cluster, and then proceed to testing below. Here would be an example using custom images:

```bash
cd upstream/manifests/install/charts
@@ -256,6 +267,22 @@ helm install \
schedscheduler-plugins as-a-second-scheduler/
```

If you load your images into your testing environment and don't need to pull, you can change the pull policy too:

```bash
helm install \
--set scheduler.image=vanessa/fluence:latest \
--set scheduler.sidecarimage=vanessa/fluence-sidecar \
--set scheduler.sidecarPullPolicy=IfNotPresent \
schedscheduler-plugins as-a-second-scheduler/
```

If you need to uninstall (e.g., to redo something):

```bash
helm uninstall schedscheduler-plugins
```

Next you can move down to testing the install.

### Testing Install
@@ -400,7 +427,9 @@ pod/fluence-scheduled-pod spec.containers{fluence-scheduled-container} kubelet
...
```

For the above, I found [this page](https://kubernetes.io/docs/tasks/extend-kubernetes/configure-multiple-schedulers/#enable-leader-election) very helpful.

Finally, note that we also have a more appropriate example with jobs under [examples/test_example](examples/test_example). It's a bit more sane because it uses Jobs, and jobs are expected to complete (whereas bare pods are not, and will get into crash loop backoffs, etc.). For an example of how to programmatically interact with the job pods and check states and events, see the [test.sh](.github/test.sh) script.
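
As a quick orientation, here is a condensed sketch of the checks that script performs (the commands mirror test.sh, assuming the example jobs have already been applied):

```bash
# Condensed from .github/test.sh: confirm which scheduler placed the fluence example pod
fluence_job_pod=$(kubectl get pods --selector=job-name=fluence-job -o json | jq -r .items[0].metadata.name)

# The pod spec records the requested scheduler
kubectl get pod ${fluence_job_pod} -o json | jq -r .spec.schedulerName    # expect "fluence"

# The Scheduled event records which component actually did the scheduling
kubectl events --for pod/${fluence_job_pod} -o json | \
  jq -r '[ .items[] | select( .reason | contains("Scheduled")) ][0].reportingComponent'    # expect "fluence"
```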


## Papers
14 changes: 14 additions & 0 deletions examples/test_example/default-job.yaml
@@ -0,0 +1,14 @@
apiVersion: batch/v1
kind: Job
metadata:
  name: default-job
spec:
  template:
    spec:
      schedulerName: default-scheduler
      containers:
      - name: default-job
        image: busybox
        command: [echo, not, potato]
      restartPolicy: Never
  backoffLimit: 4
14 changes: 14 additions & 0 deletions examples/test_example/fluence-job.yaml
@@ -0,0 +1,14 @@
apiVersion: batch/v1
kind: Job
metadata:
  name: fluence-job
spec:
  template:
    spec:
      schedulerName: fluence
      containers:
      - name: fluence-job
        image: busybox
        command: [echo, potato]
      restartPolicy: Never
  backoffLimit: 4