Support running pipelines in scheduled task queue #1871

Merged
merged 51 commits into from
May 28, 2025
Commits
380cd6f
Add PipelineSchedule and PipelineRun models
keshav-space May 5, 2025
0b7f47c
Add pipeline execution task
keshav-space May 5, 2025
6e48ddc
Add scheduler for pipeline tasks
keshav-space May 5, 2025
21d9d5a
Add management command to init schedule
keshav-space May 5, 2025
7bc98b3
Configure RQ settings
keshav-space May 5, 2025
efd8d2c
Add utility to get latest commit hash
keshav-space May 5, 2025
656ef3c
Update logs in PipelineRun instance
keshav-space May 5, 2025
7a79948
Add API endpoint for Pipeline schedule
keshav-space May 5, 2025
202db9d
Add docker service for pipeline schedule
keshav-space May 5, 2025
dd8b252
Add list view for pipeline schedules
keshav-space May 5, 2025
baae4a8
Add css and js for log highlighting
keshav-space May 5, 2025
ba3879f
Add pipeline run list view
keshav-space May 5, 2025
e8b1cda
Add detail view for pipeline run
keshav-space May 5, 2025
d616ce6
Show default values for empty fields
keshav-space May 6, 2025
0ba0549
Enable datetime localization for client
keshav-space May 7, 2025
34dcd72
Allow temporary copy of .git to extract commit hash
keshav-space May 9, 2025
1936d9f
Populate tag and commit on pulling git archive
keshav-space May 12, 2025
354c605
Extract commit hash from git archive and local docker deployment
keshav-space May 12, 2025
ae1a260
Use uuid to track pipeline job id
keshav-space May 12, 2025
a96f775
Use scheduler to explicitly queue pipeline execution jobs
keshav-space May 12, 2025
279bd06
Handle the stats for queued pipeline
keshav-space May 13, 2025
472edc2
Add pipeline_url property to construct pipeline URL
keshav-space May 13, 2025
0df4d48
Show execution_time for running jobs
keshav-space May 13, 2025
e7c8cba
Do not humanize execution time in api response
keshav-space May 14, 2025
54a1911
Return only the latest pipeline run in API
keshav-space May 14, 2025
deda888
Truncate log to 5000 characters in API response
keshav-space May 14, 2025
edb6b61
Restrict modifications to admin users
keshav-space May 14, 2025
4195e76
Prefix task queue services with vulnerablecode
keshav-space May 14, 2025
d074f6d
Add pipeline schedule to navbar
keshav-space May 16, 2025
129bd9b
Highlight active navbar items
keshav-space May 16, 2025
5c4e7c1
Pass redis hostname to docker image
keshav-space May 16, 2025
c1d6375
Defer unused fields to optimize pipeline queries
keshav-space May 16, 2025
2bfbc46
Render stopped and stale pipeline statuses
keshav-space May 19, 2025
84efc89
Dequeue job awaiting execution when stop job is requested
keshav-space May 19, 2025
7879fb5
Skip linkcheck for unresponsive URL
keshav-space May 20, 2025
fd28465
Update failing view tests
keshav-space May 20, 2025
2ffbb24
Use fa arrow icon for back buttons
keshav-space May 20, 2025
a34f485
Add tests for schedule and run model
keshav-space May 20, 2025
9bdeba6
Ensure run fields are reset before job requeue
keshav-space May 21, 2025
263f503
Add method to reset and requeue pipeline
keshav-space May 21, 2025
6130597
Track execution timeout in schedule model
keshav-space May 21, 2025
fefe611
Show timezone in logs and stats
keshav-space May 22, 2025
587328e
Add clientside pagination for bigger log snippets
keshav-space May 22, 2025
312674f
Handle boundary condition in snippet navigation
keshav-space May 23, 2025
c67cf66
Add setting to toggle live logging for pipelines
keshav-space May 24, 2025
217a466
Add copy button for pipeline and job ids
keshav-space May 26, 2025
54ec053
Restrict modifications to staff users authed via session
keshav-space May 27, 2025
9d8d22e
Add tests for /api/v2/schedule endpoint
keshav-space May 27, 2025
5f2eb94
Display pipeline description in UI
keshav-space May 27, 2025
0db4727
Reset throttling to properly test api rate limits
keshav-space May 27, 2025
3c7def2
Add captcha challenge to staff login page
keshav-space May 27, 2025
2 changes: 1 addition & 1 deletion .VERSION
@@ -1,3 +1,3 @@
 refs=$Format:%D$
-commit=$Format:%H$
+commit=$Format:%h$
 abbrev_commit=$Format:%H$
1 change: 0 additions & 1 deletion .dockerignore
@@ -6,7 +6,6 @@ docker-compose.yml


 # Ignore Git directory and files and github directory.
-**/.git
 **/.gitignore
 **/.gitattributes
 **/.gitmodules
1 change: 1 addition & 0 deletions .gitattributes
@@ -0,0 +1 @@
.VERSION export-subst
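
Reviewer note: with `export-subst` set, `git archive` expands the `$Format:...$` placeholders in `.VERSION`, while a plain checkout leaves them unexpanded. A minimal sketch of how such a file could be read follows. The helper name `parse_version_file` and the sample values are hypothetical and are not taken from this PR.

```python
def parse_version_file(text):
    """Parse `key=value` lines from a .VERSION-style file (hypothetical helper).

    A value that still starts with `$Format:` was never expanded by
    `git archive`, so it is reported as None instead of a bogus string.
    """
    data = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or "=" not in line:
            continue
        # Split on the first "=" only; the value itself may contain "=".
        key, _, value = line.partition("=")
        if value.startswith("$Format:"):
            value = None
        data[key.strip()] = value
    return data


# Unexpanded placeholders (plain checkout) versus expanded values (archive).
print(parse_version_file("refs=$Format:%D$\ncommit=$Format:%h$\n"))
print(parse_version_file("refs=tag: v1.0\ncommit=abc1234\n"))
```

A caller can treat a `None` commit as "not an archive" and fall back to another source, such as the value written by the Dockerfile step in this PR.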
13 changes: 13 additions & 0 deletions Dockerfile
@@ -17,8 +17,21 @@ ENV PYTHONDONTWRITEBYTECODE 1

RUN mkdir -p /var/vulnerablecode/static

RUN apt-get update \
 && apt-get install -y --no-install-recommends \
    wait-for-it \
 && apt-get clean \
 && rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*

# Keep the dependencies installation before the COPY of the app/ for proper caching
COPY setup.cfg setup.py requirements.txt pyproject.toml /app/
RUN pip install . -c requirements.txt

COPY . /app

# Store commit hash for docker deployment from local checkout.
RUN if [ -d ".git" ]; then \
        GIT_COMMIT=$(git rev-parse --short HEAD) && \
        echo "VULNERABLECODE_GIT_COMMIT=\"$GIT_COMMIT\"" >> /app/vulnerablecode/settings.py; \
        rm -rf .git; \
    fi
35 changes: 35 additions & 0 deletions docker-compose.yml
@@ -10,6 +10,15 @@ services:
      - db_data:/var/lib/postgresql/data/
      - ./etc/postgresql/postgresql.conf:/etc/postgresql/postgresql.conf

  vulnerablecode_redis:
    image: redis
    # Enable redis data persistence using the "Append Only File" with the
    # default policy of fsync every second. See https://redis.io/topics/persistence
    command: redis-server --appendonly yes
    volumes:
      - vulnerablecode_redis_data:/data
    restart: always

  vulnerablecode:
    build: .
    command: /bin/sh -c "
@@ -26,6 +35,31 @@ services:
    depends_on:
      - db

  vulnerablecode_scheduler:
    build: .
    command: wait-for-it web:8000 -- python ./manage.py run_scheduler
    env_file:
      - docker.env
    volumes:
      - /etc/vulnerablecode/:/etc/vulnerablecode/
    depends_on:
      - vulnerablecode_redis
      - db
      - vulnerablecode

  vulnerablecode_rqworker:
    build: .
    command: wait-for-it web:8000 -- python ./manage.py rqworker default
    env_file:
      - docker.env
    volumes:
      - /etc/vulnerablecode/:/etc/vulnerablecode/
    depends_on:
      - vulnerablecode_redis
      - db
      - vulnerablecode


  nginx:
    image: nginx
    ports:
@@ -44,4 +78,5 @@ services:
volumes:
  db_data:
  static:
  vulnerablecode_redis_data:

2 changes: 2 additions & 0 deletions docker.env
@@ -4,3 +4,5 @@ POSTGRES_PASSWORD=vulnerablecode

VULNERABLECODE_DB_HOST=db
VULNERABLECODE_STATIC_ROOT=/var/vulnerablecode/static/

VULNERABLECODE_REDIS_HOST=vulnerablecode_redis
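
Reviewer note: the settings change that consumes `VULNERABLECODE_REDIS_HOST` is not part of this excerpt. The sketch below shows one way a django-rq `RQ_QUEUES` mapping could be built from it. The helper `build_rq_queues`, the `localhost` default, and the `VULNERABLECODE_REDIS_PORT` and `VULNERABLECODE_REDIS_DB` variables are assumptions made for illustration only.

```python
import os


def build_rq_queues(environ=os.environ):
    """Return a django-rq style RQ_QUEUES mapping (illustrative sketch)."""
    return {
        "default": {
            # Set in docker.env by this PR.
            "HOST": environ.get("VULNERABLECODE_REDIS_HOST", "localhost"),
            # The two variables below are assumptions, not taken from the PR.
            "PORT": int(environ.get("VULNERABLECODE_REDIS_PORT", "6379")),
            "DB": int(environ.get("VULNERABLECODE_REDIS_DB", "0")),
        }
    }


# In a Django settings module this would be assigned to RQ_QUEUES.
print(build_rq_queues({"VULNERABLECODE_REDIS_HOST": "vulnerablecode_redis"}))
```

The queue is named `default` to match the `rqworker default` command used by the worker service in docker-compose.yml.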
2 changes: 2 additions & 0 deletions docs/source/conf.py
@@ -38,6 +38,8 @@
    "https://example.org/api/non-existent-packages",
    "https://github.com/aboutcode-org/vulnerablecode/pull/495/commits",
    "https://nvd.nist.gov/products/cpe",
    "https://ftp.suse.com/pub/projects/security/yaml/suse-cvss-scores.yaml",
    "http://ftp.suse.com/pub/projects/security/yaml/",
]

# Add any Sphinx extension module names here, as strings. They can be
2 changes: 2 additions & 0 deletions requirements.txt
@@ -92,8 +92,10 @@ python-dateutil==2.8.2
python-dotenv==0.20.0
pytz==2022.1
PyYAML==6.0.1
redis==5.0.1
requests==2.32.0
restructuredtext-lint==1.4.0
rq==1.15.1
saneyaml==0.6.0
semantic-version==2.9.0
six==1.16.0
2 changes: 2 additions & 0 deletions setup.cfg
@@ -94,6 +94,8 @@ install_requires =

    #pipeline
    aboutcode.pipeline>=0.1.0
    django-rq==2.10.1
    rq-scheduler==0.13.1

    #vulntotal
    python-dotenv
148 changes: 148 additions & 0 deletions vulnerabilities/api_v2.py
@@ -14,15 +14,20 @@
from drf_spectacular.utils import extend_schema
from drf_spectacular.utils import extend_schema_view
from packageurl import PackageURL
from rest_framework import mixins
from rest_framework import serializers
from rest_framework import status
from rest_framework import viewsets
from rest_framework.authentication import SessionAuthentication
from rest_framework.decorators import action
from rest_framework.permissions import BasePermission
from rest_framework.response import Response
from rest_framework.reverse import reverse

from vulnerabilities.models import CodeFix
from vulnerabilities.models import Package
from vulnerabilities.models import PipelineRun
from vulnerabilities.models import PipelineSchedule
from vulnerabilities.models import Vulnerability
from vulnerabilities.models import VulnerabilityReference
from vulnerabilities.models import VulnerabilitySeverity
@@ -606,3 +611,146 @@ def get_queryset(self):
                affected_package_vulnerability__vulnerability__vulnerability_id=vulnerability_id
            )
        return queryset


class CreateListRetrieveUpdateViewSet(
    mixins.CreateModelMixin,
    mixins.ListModelMixin,
    mixins.RetrieveModelMixin,
    mixins.UpdateModelMixin,
    viewsets.GenericViewSet,
):
    """
    A viewset that provides `create`, `list`, `retrieve`, and `update` actions.
    To use it, override the class and set the `.queryset` and
    `.serializer_class` attributes.
    """

    pass


class IsAdminWithSessionAuth(BasePermission):
    """Permit only staff users authenticated via session (not token)."""

    def has_permission(self, request, view):
        is_authenticated = request.user and request.user.is_authenticated
        is_staff = request.user and request.user.is_staff
        is_session_auth = isinstance(request.successful_authenticator, SessionAuthentication)

        return is_authenticated and is_staff and is_session_auth


class PipelineRunAPISerializer(serializers.HyperlinkedModelSerializer):
    status = serializers.SerializerMethodField()
    execution_time = serializers.SerializerMethodField()
    log = serializers.SerializerMethodField()

    class Meta:
        model = PipelineRun
        fields = [
            "run_id",
            "status",
            "execution_time",
            "run_start_date",
            "run_end_date",
            "run_exitcode",
            "run_output",
            "created_date",
            "vulnerablecode_version",
            "vulnerablecode_commit",
            "log",
        ]

    def get_status(self, run):
        return run.status

    def get_execution_time(self, run):
        if run.execution_time:
            return round(run.execution_time, 2)

    def get_log(self, run):
        """Return only the last 5000 characters of the log."""
        return run.log[-5000:]


class PipelineScheduleAPISerializer(serializers.HyperlinkedModelSerializer):
    url = serializers.HyperlinkedIdentityField(
        view_name="schedule-detail", lookup_field="pipeline_id"
    )
    latest_run = serializers.SerializerMethodField()
    next_run_date = serializers.SerializerMethodField()

    class Meta:
        model = PipelineSchedule
        fields = [
            "url",
            "pipeline_id",
            "is_active",
            "live_logging",
            "run_interval",
            "execution_timeout",
            "created_date",
            "schedule_work_id",
            "next_run_date",
            "latest_run",
        ]

    def get_next_run_date(self, schedule):
        return schedule.next_run_date

    def get_latest_run(self, schedule):
        if latest := schedule.pipelineruns.first():
            return PipelineRunAPISerializer(latest).data
        return None


class PipelineScheduleCreateSerializer(serializers.ModelSerializer):
    class Meta:
        model = PipelineSchedule
        fields = [
            "pipeline_id",
            "is_active",
            "run_interval",
            "live_logging",
            "execution_timeout",
        ]
        extra_kwargs = {
            field: {"initial": PipelineSchedule._meta.get_field(field).get_default()}
            for field in [
                "is_active",
                "run_interval",
                "live_logging",
                "execution_timeout",
            ]
        }


class PipelineScheduleUpdateSerializer(serializers.ModelSerializer):
    class Meta:
        model = PipelineSchedule
        fields = [
            "is_active",
            "run_interval",
            "live_logging",
            "execution_timeout",
        ]


class PipelineScheduleV2ViewSet(CreateListRetrieveUpdateViewSet):
    queryset = PipelineSchedule.objects.prefetch_related("pipelineruns").all()
    serializer_class = PipelineScheduleAPISerializer
    lookup_field = "pipeline_id"
    lookup_value_regex = r"[\w.]+"

    def get_serializer_class(self):
        if self.action == "create":
            return PipelineScheduleCreateSerializer
        elif self.action == "update":
            return PipelineScheduleUpdateSerializer
        return super().get_serializer_class()

    def get_permissions(self):
        """Restrict creation and modification to staff users authenticated via session."""
        if self.action not in ["list", "retrieve"]:
            return [IsAdminWithSessionAuth()]
        return super().get_permissions()
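
Reviewer note: `IsAdminWithSessionAuth` combines three conditions, and the third one is what rejects staff users who authenticate with an API token. The sketch below reproduces that decision outside Django so each case can be checked in isolation. All class and function names in it are hypothetical stand-ins, not Django REST framework classes.

```python
from dataclasses import dataclass


class SessionAuthStandIn:
    """Stand-in for a session-based authenticator (not the DRF class)."""


class TokenAuthStandIn:
    """Stand-in for a token-based authenticator (not the DRF class)."""


@dataclass
class UserStandIn:
    is_authenticated: bool
    is_staff: bool


def may_modify(user, authenticator):
    """Mirror the three-way check: authenticated, staff, session-authenticated."""
    return bool(
        user
        and user.is_authenticated
        and user.is_staff
        and isinstance(authenticator, SessionAuthStandIn)
    )


staff = UserStandIn(is_authenticated=True, is_staff=True)
print(may_modify(staff, SessionAuthStandIn()))  # staff user with a session
print(may_modify(staff, TokenAuthStandIn()))    # same user with a token
```

Wrapping the result in `bool()` is a choice made for the sketch; the permission class in the diff returns the raw `and` chain, which DRF evaluates for truthiness.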
23 changes: 23 additions & 0 deletions vulnerabilities/forms.py
@@ -8,6 +8,7 @@
#

from django import forms
from django.contrib.admin.forms import AdminAuthenticationForm
from django.core.validators import validate_email
from django_recaptcha.fields import ReCaptchaField
from django_recaptcha.widgets import ReCaptchaV2Checkbox
@@ -85,3 +86,25 @@ def clean_username(self):

    def save_m2m(self):
        pass


class PipelineSchedulePackageForm(forms.Form):
    search = forms.CharField(
        required=True,
        label=False,
        widget=forms.TextInput(
            attrs={
                "placeholder": "Search a pipeline...",
                "class": "input ",
            },
        ),
    )


class AdminLoginForm(AdminAuthenticationForm):
    captcha = ReCaptchaField(
        error_messages={
            "required": ("Captcha is required"),
        },
        widget=ReCaptchaV2Checkbox(),
    )
37 changes: 37 additions & 0 deletions vulnerabilities/management/commands/run_scheduler.py
@@ -0,0 +1,37 @@
#
# Copyright (c) nexB Inc. and others. All rights reserved.
# VulnerableCode is a trademark of nexB Inc.
# SPDX-License-Identifier: Apache-2.0
# See http://www.apache.org/licenses/LICENSE-2.0 for the license text.
# See https://github.com/aboutcode-org/vulnerablecode for support or download.
# See https://aboutcode.org for more information about nexB OSS projects.
#


from django_rq.management.commands import rqscheduler

from vulnerabilities import models
from vulnerabilities.schedules import clear_zombie_pipeline_schedules
from vulnerabilities.schedules import scheduled_job_exists
from vulnerabilities.schedules import update_pipeline_schedule


def init_pipeline_scheduled():
    """Initialize schedule jobs for active PipelineSchedule."""
    active_pipeline_qs = models.PipelineSchedule.objects.filter(is_active=True).order_by(
        "created_date"
    )
    for pipeline_schedule in active_pipeline_qs:
        if scheduled_job_exists(pipeline_schedule.schedule_work_id):
            continue
        new_id = pipeline_schedule.create_new_job()
        pipeline_schedule.schedule_work_id = new_id
        pipeline_schedule.save(update_fields=["schedule_work_id"])


class Command(rqscheduler.Command):
    def handle(self, *args, **kwargs):
        clear_zombie_pipeline_schedules()
        update_pipeline_schedule()
        init_pipeline_scheduled()
        super(Command, self).handle(*args, **kwargs)
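
Reviewer note: `init_pipeline_scheduled` is meant to be safe to run on every scheduler start, because it creates a job only for schedules whose `schedule_work_id` no longer maps to a queued job. The sketch below models that idempotency with an in-memory set in place of Redis. `ScheduleStandIn` and `init_schedules` are hypothetical names, and the use of `uuid4` for job ids is an assumption based on the commit "Use uuid to track pipeline job id".

```python
import uuid
from dataclasses import dataclass
from typing import Optional


@dataclass
class ScheduleStandIn:
    """In-memory stand-in for an active PipelineSchedule row."""

    pipeline_id: str
    schedule_work_id: Optional[str] = None


def init_schedules(schedules, existing_job_ids):
    """Create a job only for schedules without a live job; return the new ids."""
    created = []
    for schedule in schedules:
        if schedule.schedule_work_id in existing_job_ids:
            continue  # A job is already queued for this schedule.
        new_id = str(uuid.uuid4())
        existing_job_ids.add(new_id)
        schedule.schedule_work_id = new_id
        created.append(new_id)
    return created


jobs = {"job-1"}
schedules = [ScheduleStandIn("pipeline_a", "job-1"), ScheduleStandIn("pipeline_b")]
print(len(init_schedules(schedules, jobs)))  # only pipeline_b needs a job
print(len(init_schedules(schedules, jobs)))  # a second run creates nothing
```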
30 changes: 30 additions & 0 deletions vulnerabilities/middleware/timezone.py
@@ -0,0 +1,30 @@
#
# Copyright (c) nexB Inc. and others. All rights reserved.
# VulnerableCode is a trademark of nexB Inc.
# SPDX-License-Identifier: Apache-2.0
# See http://www.apache.org/licenses/LICENSE-2.0 for the license text.
# See https://github.com/aboutcode-org/vulnerablecode for support or download.
# See https://aboutcode.org for more information about nexB OSS projects.
#

import zoneinfo

from django.utils import timezone


class UserTimezoneMiddleware:
    def __init__(self, get_response):
        self.get_response = get_response

    def __call__(self, request):
        try:
            # Activate the user's local timezone from the "user_timezone" cookie.
            tzname = request.COOKIES.get("user_timezone")
            if tzname:
                timezone.activate(zoneinfo.ZoneInfo(tzname))
            else:
                timezone.deactivate()
        except Exception:
            # An unknown or malformed timezone name falls back to the default.
            timezone.deactivate()

        return self.get_response(request)
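
Reviewer note: the broad `except` in the middleware matters because the cookie value is client-controlled and `zoneinfo.ZoneInfo` raises on names it cannot resolve. A minimal standalone sketch of the same fallback follows. The helper name `resolve_timezone` is hypothetical and is not part of this PR.

```python
import zoneinfo


def resolve_timezone(tzname):
    """Return a ZoneInfo for `tzname`, or None when it is missing or invalid."""
    if not tzname:
        return None
    try:
        return zoneinfo.ZoneInfo(tzname)
    except Exception:
        # Cookie values are untrusted: unknown keys and malformed paths both
        # raise, so any failure is treated as "no timezone requested".
        return None


print(resolve_timezone("Not/A_Real_Zone"))
print(resolve_timezone("../../etc/passwd"))
```

Returning `None` here corresponds to the middleware calling `timezone.deactivate()`, which leaves Django on its default timezone.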