Commit

Merge pull request #661 from roboflow/florence2-workflows-block
Florence2 workflows block
PawelPeczek-Roboflow authored Sep 23, 2024
2 parents e22b900 + 11f10ad commit 2d0a493
Showing 22 changed files with 1,942 additions and 75 deletions.
6 changes: 3 additions & 3 deletions .github/workflows/integration_tests_workflows_x86.yml
@@ -13,7 +13,7 @@ jobs:
strategy:
matrix:
python-version: ["3.8", "3.9", "3.10", "3.11"]
- timeout-minutes: 10
+ timeout-minutes: 15
steps:
- name: 🛎️ Checkout
uses: actions/checkout@v4
@@ -30,6 +30,6 @@ jobs:
run: |
python -m pip install --upgrade pip
pip install --upgrade setuptools
- pip install --extra-index-url https://download.pytorch.org/whl/cpu -r requirements/_requirements.txt -r requirements/requirements.cpu.txt -r requirements/requirements.sdk.http.txt -r requirements/requirements.test.unit.txt -r requirements/requirements.http.txt -r requirements/requirements.yolo_world.txt -r requirements/requirements.doctr.txt -r requirements/requirements.sam.txt
+ pip install --extra-index-url https://download.pytorch.org/whl/cpu -r requirements/_requirements.txt -r requirements/requirements.cpu.txt -r requirements/requirements.sdk.http.txt -r requirements/requirements.test.unit.txt -r requirements/requirements.http.txt -r requirements/requirements.yolo_world.txt -r requirements/requirements.doctr.txt -r requirements/requirements.sam.txt -r requirements/requirements.transformers.txt
- name: 🧪 Integration Tests of Workflows
- run: ROBOFLOW_API_KEY=${{ secrets.API_KEY }} python -m pytest tests/workflows/integration_tests
+ run: ROBOFLOW_API_KEY=${{ secrets.API_KEY }} SKIP_FLORENCE2_TEST=FALSE python -m pytest tests/workflows/integration_tests
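
Note on `SKIP_FLORENCE2_TEST`: the workflow sets the variable to `FALSE` so that the Florence-2 integration tests run in CI. How the test suite reads it is not shown in this excerpt; a minimal sketch of such a flag check, assuming the tests are skipped unless the variable is explicitly `FALSE` (the helper name and the default are hypothetical), might look like this:

```python
import os
from typing import Mapping, Optional


def florence2_tests_skipped(env: Optional[Mapping[str, str]] = None) -> bool:
    """Return True when the Florence-2 tests should be skipped.

    Assumption: skipping is the default, and only an explicit "FALSE"
    (case-insensitive) enables the tests. The real suite may differ.
    """
    env = os.environ if env is None else env
    return env.get("SKIP_FLORENCE2_TEST", "TRUE").strip().upper() != "FALSE"
```

A test module could then pass the result to `pytest.mark.skipif(...)` so that the heavy model download only happens where the flag is set.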
82 changes: 82 additions & 0 deletions docker/dockerfiles/Dockerfile.onnx.gpu.dev
@@ -0,0 +1,82 @@
FROM nvcr.io/nvidia/cuda:11.8.0-cudnn8-devel-ubuntu22.04 as base

WORKDIR /app

RUN rm -rf /var/lib/apt/lists/* && apt-get clean && apt-get update -y && DEBIAN_FRONTEND=noninteractive apt-get install -y \
ffmpeg \
libxext6 \
libopencv-dev \
uvicorn \
python3-pip \
git \
libgdal-dev \
wget \
&& rm -rf /var/lib/apt/lists/*

COPY requirements/requirements.sam.txt \
requirements/requirements.clip.txt \
requirements/requirements.http.txt \
requirements/requirements.gpu.txt \
requirements/requirements.waf.txt \
requirements/requirements.gaze.txt \
requirements/requirements.doctr.txt \
requirements/requirements.groundingdino.txt \
requirements/requirements.cogvlm.txt \
requirements/requirements.yolo_world.txt \
requirements/_requirements.txt \
requirements/requirements.transformers.txt \
requirements/requirements.pali.flash_attn.txt \
requirements/requirements.sdk.http.txt \
requirements/requirements.cli.txt \
./

RUN python3 -m pip install -U pip
RUN python3 -m pip install --extra-index-url https://download.pytorch.org/whl/cu118 \
-r _requirements.txt \
-r requirements.sam.txt \
-r requirements.clip.txt \
-r requirements.http.txt \
-r requirements.gpu.txt \
-r requirements.waf.txt \
-r requirements.gaze.txt \
-r requirements.groundingdino.txt \
-r requirements.doctr.txt \
-r requirements.cogvlm.txt \
-r requirements.yolo_world.txt \
-r requirements.transformers.txt \
-r requirements.sdk.http.txt \
-r requirements.cli.txt \
jupyterlab \
--upgrade \
&& rm -rf ~/.cache/pip

# Install setup.py requirements for flash_attn
RUN python3 -m pip install packaging==24.1 && rm -rf ~/.cache/pip

# Install flash_attn required for Paligemma and Florence2
RUN python3 -m pip install -r requirements.pali.flash_attn.txt --no-build-isolation && rm -rf ~/.cache/pip

FROM scratch
COPY --from=base / /

WORKDIR /app/
COPY inference inference
COPY inference_sdk inference_sdk
COPY inference_cli inference_cli
ENV PYTHONPATH=/app/
COPY docker/config/gpu_http.py gpu_http.py

ENV PYTHONPATH=/app/
ENV VERSION_CHECK_MODE=continuous
ENV PROJECT=roboflow-platform
ENV NUM_WORKERS=1
ENV HOST=0.0.0.0
ENV PORT=9001
ENV WORKFLOWS_STEP_EXECUTION_MODE=local
ENV WORKFLOWS_MAX_CONCURRENT_STEPS=1
ENV API_LOGGING_ENABLED=True
ENV LMM_ENABLED=True
ENV CORE_MODEL_SAM2_ENABLED=True
ENV CORE_MODEL_OWLV2_ENABLED=True

ENTRYPOINT uvicorn gpu_http:app --workers $NUM_WORKERS --host $HOST --port $PORT
7 changes: 7 additions & 0 deletions docs/workflows/blocks.md
@@ -13,6 +13,8 @@ hide:
</div>
<div class="custom-grid">
<!--- AUTOGENERATED_BLOCKS_LIST -->
+ <p class="card block-card" data-url="timeinzone" data-name="Time in zone" data-desc="Track duration of time spent by objects in zone" data-labels="ANALYTICS, APACHE-2.0" data-author="dummy"></p>
+ <p class="card block-card" data-url="bounding_rectangle" data-name="Bounding Rectangle" data-desc="Find minimal bounding rectangle surrounding detection contour" data-labels="TRANSFORMATION, APACHE-2.0" data-author="dummy"></p>
<p class="card block-card" data-url="segment_anything2_model" data-name="Segment Anything 2 Model" data-desc="Convert bounding boxes to polygons, or run SAM2 on an entire image to generate a mask." data-labels="MODEL, APACHE-2.0" data-author="dummy"></p>
<p class="card block-card" data-url="detections_consensus" data-name="Detections Consensus" data-desc="Combine predictions from multiple detections models to make a decision about object presence." data-labels="FUSION, APACHE-2.0" data-author="dummy"></p>
<p class="card block-card" data-url="clip_comparison" data-name="Clip Comparison" data-desc="Compare CLIP image and text embeddings." data-labels="MODEL, APACHE-2.0" data-author="dummy"></p>
@@ -33,6 +35,7 @@ hide:
<p class="card block-card" data-url="dynamic_crop" data-name="Dynamic Crop" data-desc="Crop an image using bounding boxes from a detection model." data-labels="TRANSFORMATION, APACHE-2.0" data-author="dummy"></p>
<p class="card block-card" data-url="detections_filter" data-name="Detections Filter" data-desc="Conditionally filter out model predictions." data-labels="TRANSFORMATION, APACHE-2.0" data-author="dummy"></p>
<p class="card block-card" data-url="detection_offset" data-name="Detection Offset" data-desc="Apply a padding around the width and height of detections." data-labels="TRANSFORMATION, APACHE-2.0" data-author="dummy"></p>
+ <p class="card block-card" data-url="byte_tracker" data-name="Byte Tracker" data-desc="Track and update object positions across video frames using ByteTrack." data-labels="TRANSFORMATION, APACHE-2.0" data-author="dummy"></p>
<p class="card block-card" data-url="relative_static_crop" data-name="Relative Static Crop" data-desc="Crop an image proportional (%) to its dimensions." data-labels="TRANSFORMATION, APACHE-2.0" data-author="dummy"></p>
<p class="card block-card" data-url="detections_transformation" data-name="Detections Transformation" data-desc="Apply transformations on detected bounding boxes." data-labels="TRANSFORMATION, APACHE-2.0" data-author="dummy"></p>
<p class="card block-card" data-url="roboflow_dataset_upload" data-name="Roboflow Dataset Upload" data-desc="Save images and predictions in your Roboflow Dataset" data-labels="SINK, APACHE-2.0" data-author="dummy"></p>
@@ -58,6 +61,7 @@ hide:
<p class="card block-card" data-url="mask_visualization" data-name="Mask Visualization" data-desc="Paints a mask over detected objects in an image." data-labels="VISUALIZATION, APACHE-2.0" data-author="dummy"></p>
<p class="card block-card" data-url="pixelate_visualization" data-name="Pixelate Visualization" data-desc="Pixelates detected objects in an image." data-labels="VISUALIZATION, APACHE-2.0" data-author="dummy"></p>
<p class="card block-card" data-url="polygon_visualization" data-name="Polygon Visualization" data-desc="Draws a polygon around detected objects in an image." data-labels="VISUALIZATION, APACHE-2.0" data-author="dummy"></p>
+ <p class="card block-card" data-url="line_counter_visualization" data-name="Line Counter Visualization" data-desc="Paints a mask over line zone in an image." data-labels="VISUALIZATION, APACHE-2.0" data-author="dummy"></p>
<p class="card block-card" data-url="triangle_visualization" data-name="Triangle Visualization" data-desc="Draws triangle markers on an image at specific coordinates based on provided detections." data-labels="VISUALIZATION, APACHE-2.0" data-author="dummy"></p>
<p class="card block-card" data-url="roboflow_custom_metadata" data-name="Roboflow Custom Metadata" data-desc="Add custom metadata to Roboflow Model Monitoring dashboard" data-labels="SINK, APACHE-2.0" data-author="dummy"></p>
<p class="card block-card" data-url="detections_stitch" data-name="Detections Stitch" data-desc="Merges detections made against multiple pieces of input image into single detection." data-labels="FUSION, APACHE-2.0" data-author="dummy"></p>
@@ -77,6 +81,9 @@ hide:
<p class="card block-card" data-url="google_gemini" data-name="Google Gemini" data-desc="Run Google's Gemini model with vision capabilities" data-labels="MODEL, APACHE-2.0" data-author="dummy"></p>
<p class="card block-card" data-url="vl_mas_detector" data-name="VLM as Detector" data-desc="Parses raw string into object-detection prediction." data-labels="FORMATTER, APACHE-2.0" data-author="dummy"></p>
<p class="card block-card" data-url="anthropic_claude" data-name="Anthropic Claude" data-desc="Run Anthropic Claude model with vision capabilities" data-labels="MODEL, APACHE-2.0" data-author="dummy"></p>
+ <p class="card block-card" data-url="line_counter" data-name="Line Counter" data-desc="Count detections passing line" data-labels="ANALYTICS, APACHE-2.0" data-author="dummy"></p>
+ <p class="card block-card" data-url="polygon_zone_visualization" data-name="Polygon Zone Visualization" data-desc="Paints a mask over polygon zone in an image." data-labels="VISUALIZATION, APACHE-2.0" data-author="dummy"></p>
+ <p class="card block-card" data-url="florence2_model" data-name="Florence-2 Model" data-desc="Run Florence-2 on an image" data-labels="MODEL, APACHE-2.0" data-author="dummy"></p>
<!--- AUTOGENERATED_BLOCKS_LIST -->
</div>
</div>
2 changes: 1 addition & 1 deletion docs/workflows/gallery_index.md
@@ -6,9 +6,9 @@ Browse through the various categories to find inspiration and ideas for building
<ul id="workflows-gallery">
<li><a href="/workflows/gallery/workflows_with_multiple_models">Workflows with multiple models</a></li>
<li><a href="/workflows/gallery/workflows_enhanced_by_roboflow_platform">Workflows enhanced by Roboflow Platform</a></li>
- <li><a href="/workflows/gallery/basic_workflows">Basic Workflows</a></li>
<li><a href="/workflows/gallery/workflows_with_classical_computer_vision_methods">Workflows with classical Computer Vision methods</a></li>
<li><a href="/workflows/gallery/workflows_with_visual_language_models">Workflows with Visual Language Models</a></li>
+ <li><a href="/workflows/gallery/basic_workflows">Basic Workflows</a></li>
<li><a href="/workflows/gallery/workflows_with_dynamic_python_blocks">Workflows with dynamic Python Blocks</a></li>
<li><a href="/workflows/gallery/workflows_with_data_transformations">Workflows with data transformations</a></li>
<li><a href="/workflows/gallery/workflows_with_flow_control">Workflows with flow control</a></li>
46 changes: 23 additions & 23 deletions docs/workflows/kinds.md
@@ -37,36 +37,36 @@ for the presence of a mask in the input.

## Kinds declared in Roboflow plugins
<!--- AUTOGENERATED_KINDS_LIST -->
* [`instance_segmentation_prediction`](/workflows/kinds/instance_segmentation_prediction): Prediction with detected bounding boxes and segmentation masks in form of sv.Detections(...) object
* [`list_of_values`](/workflows/kinds/list_of_values): List of values of any type
* [`prediction_type`](/workflows/kinds/prediction_type): String value with type of prediction
* [`zone`](/workflows/kinds/zone): Definition of polygon zone
* [`image_keypoints`](/workflows/kinds/image_keypoints): Image keypoints detected by classical Computer Vision method
* [`serialised_payloads`](/workflows/kinds/serialised_payloads): Serialised element that is usually accepted by sink
* [`detection`](/workflows/kinds/detection): Single element of detections-based prediction (like `object_detection_prediction`)
* [`bar_code_detection`](/workflows/kinds/bar_code_detection): Prediction with barcode detection
* [`language_model_output`](/workflows/kinds/language_model_output): LLM / VLM output
* [`video_metadata`](/workflows/kinds/video_metadata): Video image metadata
* [`rgb_color`](/workflows/kinds/rgb_color): RGB color
* [`float`](/workflows/kinds/float): Float value
* [`top_class`](/workflows/kinds/top_class): String value representing top class predicted by classification model
* [`prediction_type`](/workflows/kinds/prediction_type): String value with type of prediction
* [`object_detection_prediction`](/workflows/kinds/object_detection_prediction): Prediction with detected bounding boxes in form of sv.Detections(...) object
* [`qr_code_detection`](/workflows/kinds/qr_code_detection): Prediction with QR code detection
* [`image_metadata`](/workflows/kinds/image_metadata): Dictionary with image metadata required by supervision
* [`numpy_array`](/workflows/kinds/numpy_array): Numpy array
* [`roboflow_model_id`](/workflows/kinds/roboflow_model_id): Roboflow model id
* [`roboflow_api_key`](/workflows/kinds/roboflow_api_key): Roboflow API key
* [`integer`](/workflows/kinds/integer): Integer value
* [`boolean`](/workflows/kinds/boolean): Boolean flag
* [`language_model_output`](/workflows/kinds/language_model_output): LLM / VLM output
* [`qr_code_detection`](/workflows/kinds/qr_code_detection): Prediction with QR code detection
* [`point`](/workflows/kinds/point): Single point in 2D
* [`float_zero_to_one`](/workflows/kinds/float_zero_to_one): `float` value in range `[0.0, 1.0]`
* [`dictionary`](/workflows/kinds/dictionary): Dictionary
* [`parent_id`](/workflows/kinds/parent_id): Identifier of parent for step output
* [`keypoint_detection_prediction`](/workflows/kinds/keypoint_detection_prediction): Prediction with detected bounding boxes and detected keypoints in form of sv.Detections(...) object
* [`float`](/workflows/kinds/float): Float value
* [`*`](/workflows/kinds/*): Equivalent of any element
* [`contours`](/workflows/kinds/contours): List of numpy arrays where each array represents contour points
* [`boolean`](/workflows/kinds/boolean): Boolean flag
* [`detection`](/workflows/kinds/detection): Single element of detections-based prediction (like `object_detection_prediction`)
* [`roboflow_project`](/workflows/kinds/roboflow_project): Roboflow project name
* [`dictionary`](/workflows/kinds/dictionary): Dictionary
* [`numpy_array`](/workflows/kinds/numpy_array): Numpy array
* [`roboflow_api_key`](/workflows/kinds/roboflow_api_key): Roboflow API key
* [`string`](/workflows/kinds/string): String value
* [`roboflow_model_id`](/workflows/kinds/roboflow_model_id): Roboflow model id
* [`list_of_values`](/workflows/kinds/list_of_values): List of values of any types
* [`instance_segmentation_prediction`](/workflows/kinds/instance_segmentation_prediction): Prediction with detected bounding boxes and segmentation masks in form of sv.Detections(...) object
* [`object_detection_prediction`](/workflows/kinds/object_detection_prediction): Prediction with detected bounding boxes in form of sv.Detections(...) object
* [`roboflow_project`](/workflows/kinds/roboflow_project): Roboflow project name
* [`image`](/workflows/kinds/image): Image in workflows
* [`video_metadata`](/workflows/kinds/video_metadata): Video image metadata
* [`serialised_payloads`](/workflows/kinds/serialised_payloads): Serialised element that is usually accepted by sink
* [`integer`](/workflows/kinds/integer): Integer value
* [`rgb_color`](/workflows/kinds/rgb_color): RGB color
* [`*`](/workflows/kinds/*): Equivalent of any element
* [`classification_prediction`](/workflows/kinds/classification_prediction): Predictions from classifier
* [`image_keypoints`](/workflows/kinds/image_keypoints): Image keypoints detected by classical Computer Vision method
* [`point`](/workflows/kinds/point): Single point in 2D
* [`zone`](/workflows/kinds/zone): Definition of polygon zone
* [`keypoint_detection_prediction`](/workflows/kinds/keypoint_detection_prediction): Prediction with detected bounding boxes and detected keypoints in form of sv.Detections(...) object
<!--- AUTOGENERATED_KINDS_LIST -->
4 changes: 3 additions & 1 deletion inference/core/entities/responses/inference.py
@@ -291,7 +291,9 @@ class MultiLabelClassificationInferenceResponse(


class LMMInferenceResponse(CvInferenceResponse):
-    response: str = Field(description="Text generated by PaliGemma")
+    response: Union[str, dict] = Field(
+        description="Text/structured response generated by model"
+    )


class FaceDetectionPrediction(ObjectDetectionPrediction):
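
Note on the `LMMInferenceResponse` change: widening `response` from `str` to `Union[str, dict]` means callers that previously assumed plain text now need to handle a structured payload as well. A minimal sketch of one way a consumer could normalise both shapes follows; the helper name is hypothetical and is not part of this commit:

```python
import json
from typing import Union


def lmm_response_to_text(response: Union[str, dict]) -> str:
    """Normalise an LMM response to a string.

    Plain-text responses are returned unchanged; structured (dict)
    responses are serialised to JSON with sorted keys so the output
    is deterministic.
    """
    if isinstance(response, dict):
        return json.dumps(response, sort_keys=True)
    return response
```

Downstream code that needs the structured form can instead branch on `isinstance(response, dict)` and read the keys directly; which keys a given model returns is not specified in this excerpt.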
2 changes: 1 addition & 1 deletion inference/core/version.py
@@ -1,4 +1,4 @@
- __version__ = "0.19.0"
+ __version__ = "0.20.0"


if __name__ == "__main__":