Improve documentation on Yolov8 Detector #2310

2649 · 2024-01-21T21:16:22Z

Issue Type

Documentation Feature Request

Source

source

Keras Version

Keras 3

Custom Code

Yes

OS Platform and Distribution

No response

Python version

No response

GPU model and memory

No response

Current Behavior?

I'm tinkering with Keras 3. Nice work!

I'm having a hard time training YOLOv8 because the documentation only shows training with a batch size of 1, and it is unclear how to create batches with a variable number of bounding boxes per image.

So far, I have tried using a ragged tensor as input, as the Keras 2 tutorial recommends, but it throws an error.

I can run it with a batch size of one. However, it throws an error when the number of boxes changes. So, I guess it needs to be batched with padding. Can the documentation specify how the padding should be created?

Standalone code to reproduce the issue or tutorial link

import tensorflow as tf
import keras_cv

# Create 2 images
images = tf.ones(shape=(2, 512, 512, 3))
labels = {
    "boxes": tf.ragged.constant([
        [
            [0, 0, 100, 100],
            [100, 100, 200, 200],
            [300, 300, 100, 100],
        ],
        # Add a second image with one bbox
        [
            [0, 0, 100, 100]
        ], 
    ], dtype=tf.float32),
    
    "classes": tf.ragged.constant([[1, 1, 1], [1]], dtype=tf.int64),
}

model = keras_cv.models.YOLOV8Detector(
    num_classes=20,
    bounding_box_format="xywh",
    backbone=keras_cv.models.YOLOV8Backbone.from_preset(
        "yolo_v8_m_backbone_coco"
    ),
    fpn_depth=2
)

# Evaluate model without box decoding and NMS
model(images)

# Prediction with box decoding and NMS
model.predict(images)

# Train model
model.compile(
    classification_loss='binary_crossentropy',
    box_loss='ciou',
    optimizer=tf.optimizers.SGD(global_clipnorm=10.0),
    jit_compile=False,
)
model.fit(images, labels)

Relevant log output

TypeError                                 Traceback (most recent call last)
Cell In[4], line 42
     35 # Train model
     36 model.compile(
     37     classification_loss='binary_crossentropy',
     38     box_loss='ciou',
     39     optimizer=tf.optimizers.SGD(global_clipnorm=10.0),
     40     jit_compile=False,
     41 )
---> 42 model.fit(images, labels)

File ~/Desktop/repos/stratai/ai/.keras-venv/lib/python3.11/site-packages/keras/src/utils/traceback_utils.py:123, in filter_traceback.<locals>.error_handler(*args, **kwargs)
    120     filtered_tb = _process_traceback_frames(e.__traceback__)
    121     # To get the full stack trace, call:
    122     # `keras.config.disable_traceback_filtering()`
--> 123     raise e.with_traceback(filtered_tb) from None
    124 finally:
    125     del filtered_tb

File ~/Desktop/repos/stratai/ai/.keras-venv/lib/python3.11/site-packages/keras_cv/src/models/object_detection/yolo_v8/yolo_v8_detector.py:526, in YOLOV8Detector.train_step(self, *args)
    524 args = args[:-1]
    525 x, y = unpack_input(data)
--> 526 return super().train_step(*args, (x, y))

File ~/Desktop/repos/stratai/ai/.keras-venv/lib/python3.11/site-packages/keras_cv/src/models/object_detection/yolo_v8/yolo_v8_detector.py:545, in YOLOV8Detector.compute_loss(self, x, y, y_pred, sample_weight, **kwargs)
    541 stride_tensor = ops.expand_dims(stride_tensor, axis=-1)
    543 gt_labels = y["classes"]
--> 545 mask_gt = ops.all(y["boxes"] > -1.0, axis=-1, keepdims=True)
    546 gt_bboxes = bounding_box.convert_format(
    547     y["boxes"],
    548     source=self.bounding_box_format,
    549     target="xyxy",
    550     images=x,
    551 )
    553 pred_bboxes = dist2bbox(pred_boxes, anchor_points)

TypeError: Failed to convert elements of tf.RaggedTensor(values=tf.RaggedTensor(values=Tensor("Greater:0", shape=(None,), dtype=bool), row_splits=Tensor("data_3:0", shape=(None,), dtype=int64)), row_splits=Tensor("data_2:0", shape=(None,), dtype=int64)) to Tensor. Consider casting elements to a supported type. See https://www.tensorflow.org/api_docs/python/tf/dtypes for supported TF dtypes.
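
The failing line in the traceback, `mask_gt = ops.all(y["boxes"] > -1.0, axis=-1, keepdims=True)`, suggests that the loss expects a dense box tensor in which missing boxes are filled with -1. The sketch below re-enacts that one line in NumPy on a hand-padded batch to show which rows would be treated as real boxes. The -1 sentinel is inferred from the traceback, not from documentation, so treat it as an assumption.

```python
import numpy as np

# Two images, padded by hand to 3 boxes each (xywh).
# Assumption: -1 marks a missing box, inferred from the `> -1.0` check above.
boxes = np.array(
    [
        [[0, 0, 100, 100], [100, 100, 200, 200], [300, 300, 100, 100]],
        [[0, 0, 100, 100], [-1, -1, -1, -1], [-1, -1, -1, -1]],
    ],
    dtype="float32",
)

# NumPy equivalent of the mask computed in compute_loss:
# a box counts as valid only if all four coordinates are greater than -1.
mask_gt = np.all(boxes > -1.0, axis=-1, keepdims=True)

print(mask_gt.shape)     # (2, 3, 1)
print(mask_gt[..., 0])   # one True/False flag per box slot
```

A ragged tensor has no fixed size along the box axis, which is why the same comparison could not be converted to a regular tensor in the log above.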

innat · 2024-01-24T16:13:33Z

https://keras.io/guides/keras_cv/object_detection_keras_cv/

sachinprasadhs · 2024-02-08T22:59:50Z

@2649 , Could you please refer to the above linked example and let us know if you have any additional questions.

andreaallegr · 2024-02-15T21:47:00Z

I have the same problem.
tensorflow 2.15.0.post1
keras 2.15.0
keras-core 0.1.7
keras-cv 0.8.3

kvlsky · 2024-05-27T08:57:39Z

> @2649 , Could you please refer to the above linked example and let us know if you have any additional questions.

Using the approach from the example:

model = keras_cv.models.YOLOV8Detector(
    num_classes=20,
    bounding_box_format="xywh",
    backbone=keras_cv.models.YOLOV8Backbone.from_preset(
        "yolo_v8_m_backbone_coco"
    ),
    fpn_depth=2
)

model.compile(
    classification_loss='binary_crossentropy',
    box_loss='ciou',
    optimizer=tf.optimizers.SGD(global_clipnorm=10.0),
    jit_compile=False,
)

images = tf.ones(shape=(2, 512, 512, 3))
bbox = [
    [
        [0, 0, 100, 100],
        [100, 100, 200, 200],
        [300, 300, 100, 100],
    ],
    [
        [0, 0, 100, 100],
        [100, 100, 200, 200],
        [300, 300, 100, 100],
    ]
]
classes = [[1, 1, 1], [1, 1, 1]]

def load_dataset(image, classes, bbox):
    # Read Image
    bounding_boxes = {
        "classes": tf.cast(classes, dtype=tf.float32),
        "boxes": bbox,
    }
    return {"images": tf.cast(image, tf.float32), "bounding_boxes": bounding_boxes}

bbox = tf.ragged.constant(bbox, dtype=tf.float32)
classes = tf.ragged.constant(classes, dtype=tf.float32)

BATCH_SIZE = 1

train_data = tf.data.Dataset.from_tensor_slices((images, classes, bbox))
train_ds = train_data.map(load_dataset, num_parallel_calls=tf.data.AUTOTUNE)
train_ds = train_ds.shuffle(BATCH_SIZE * 4)
train_ds = train_ds.ragged_batch(BATCH_SIZE, drop_remainder=True)

def dict_to_tuple(inputs):
    return inputs["images"], inputs["bounding_boxes"]

train_ds = train_ds.map(dict_to_tuple, num_parallel_calls=tf.data.AUTOTUNE)
train_ds = train_ds.prefetch(tf.data.AUTOTUNE)

for x, y in train_ds.take(1):
    print(x.shape, y['classes'].shape, y['boxes'].shape)
    # returns: (1, 512, 512, 3) (1, None) (1, None, None)

model.fit(train_ds)

throws an error:

ValueError                                Traceback (most recent call last)
Cell In[22], line 65
     62     print(x.shape, y['classes'].shape, y['boxes'].shape)
     64 # model.fit(images, labels)
---> 65 model.fit(train_ds)

File ~/work/.venv/lib/python3.12/site-packages/keras/src/utils/traceback_utils.py:122, in filter_traceback.<locals>.error_handler(*args, **kwargs)
    119     filtered_tb = _process_traceback_frames(e.__traceback__)
    120     # To get the full stack trace, call:
    121     # `keras.config.disable_traceback_filtering()`
--> 122     raise e.with_traceback(filtered_tb) from None
    123 finally:
    124     del filtered_tb

File ~/work/.venv/lib/python3.12/site-packages/keras_cv/src/models/object_detection/yolo_v8/yolo_v8_detector.py:526, in YOLOV8Detector.train_step(self, *args)
    524 args = args[:-1]
    525 x, y = unpack_input(data)
--> 526 return super().train_step(*args, (x, y))

File ~/work/.venv/lib/python3.12/site-packages/keras_cv/src/models/object_detection/yolo_v8/yolo_v8_detector.py:546, in YOLOV8Detector.compute_loss(self, x, y, y_pred, sample_weight, **kwargs)
    543 gt_labels = y["classes"]
    545 mask_gt = ops.all(y["boxes"] > -1.0, axis=-1, keepdims=True)
--> 546 gt_bboxes = bounding_box.convert_format(
    547     y["boxes"],
...
    142 def _xywh_to_xyxy(boxes, images=None, image_shape=None):
--> 143     x, y, width, height = ops.split(boxes, ALL_AXES, axis=-1)
    144     return ops.concatenate([x, y, x + width, y + height], axis=-1)

ValueError: Cannot split a ragged dimension. Got `value` with shape <DynamicRaggedShape lengths=[1, None, None] num_row_partitions=2> and `axis` 2.

But it works fine if we change

bbox = tf.ragged.constant(bbox, dtype=tf.float32)
classes = tf.ragged.constant(classes, dtype=tf.float32)

to

bbox = tf.constant(bbox, dtype=tf.float32)
classes = tf.constant(classes, dtype=tf.float32)

But the problem is that you cannot use tf.constant when the number of boxes differs between images:

bbox = [
    [
        [0, 0, 100, 100],
        [100, 100, 200, 200],
        [300, 300, 100, 100],
    ],
    [
        [0, 0, 100, 100],
    ]
]

bbox = tf.constant(bbox, dtype=tf.float32)
# --> ValueError: Can't convert non-rectangular Python sequence to Tensor.
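
One possible way around the non-rectangular limitation is to build the labels as ragged tensors first and then pad them to a dense tensor before they reach the model. The sketch below uses only core TensorFlow (`tf.ragged.constant` and `RaggedTensor.to_tensor`). The -1 fill value is an assumption based on the `y["boxes"] > -1.0` check in the traceback. KerasCV also appears to provide a helper for this step, `keras_cv.bounding_box.to_dense`, which the linked guide applies inside `dict_to_tuple`; check its documentation for the exact arguments.

```python
import tensorflow as tf

# Variable number of boxes per image: 3 for the first image, 1 for the second.
# ragged_rank=1 keeps the last axis (4 coordinates) as a regular dimension.
boxes = tf.ragged.constant(
    [
        [[0, 0, 100, 100], [100, 100, 200, 200], [300, 300, 100, 100]],
        [[0, 0, 100, 100]],
    ],
    ragged_rank=1,
    dtype=tf.float32,
)
classes = tf.ragged.constant([[1, 1, 1], [1]], dtype=tf.float32)

# Pad every image to the largest box count in the batch.
# Assumption: -1 is the padding value the detector's loss masks out.
dense_boxes = boxes.to_tensor(default_value=-1.0)
dense_classes = classes.to_tensor(default_value=-1.0)

labels = {"boxes": dense_boxes, "classes": dense_classes}
print(dense_boxes.shape, dense_classes.shape)  # (2, 3, 4) (2, 3)
```

In a `tf.data` pipeline the same two `to_tensor` calls could be applied in a `map` step after `ragged_batch`, so each batch is padded only to its own largest box count.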

kvlsky · 2024-05-27T10:33:12Z

Found a solution: use bounding_box_format='xyxy' when initializing YOLOV8Detector.
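
If the detector is switched to `bounding_box_format='xyxy'`, the labels presumably need to be in corner form as well. The sample boxes in this thread look like `xywh` (for example, `[300, 300, 100, 100]` would have `x2 < x1` if read as `xyxy`). The NumPy sketch below mirrors the `_xywh_to_xyxy` arithmetic shown in the traceback (`[x, y, x + width, y + height]`). The function name and the handling of -1 padding rows are illustrative assumptions, not part of KerasCV.

```python
import numpy as np

def xywh_to_xyxy(boxes, pad_value=-1.0):
    """Convert [x, y, w, h] boxes to [x1, y1, x2, y2], leaving padded rows as-is.

    Hypothetical helper for illustration; `pad_value` assumes -1 padding.
    """
    boxes = np.asarray(boxes, dtype="float32")
    x, y, w, h = np.split(boxes, 4, axis=-1)
    converted = np.concatenate([x, y, x + w, y + h], axis=-1)
    # Rows made up entirely of pad_value are padding, not real boxes.
    is_padding = np.all(boxes == pad_value, axis=-1, keepdims=True)
    return np.where(is_padding, boxes, converted)

xyxy = xywh_to_xyxy(
    [
        [[0, 0, 100, 100], [100, 100, 200, 200], [300, 300, 100, 100]],
        [[0, 0, 100, 100], [-1, -1, -1, -1], [-1, -1, -1, -1]],
    ]
)
print(xyxy[0])  # third box becomes [300, 300, 400, 400]
```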

sachinprasadhs self-assigned this Feb 8, 2024

sachinprasadhs added the stat:awaiting response from contributor label Feb 8, 2024

sachinprasadhs added the type:docs Improvements or additions to documentation label Apr 22, 2024
