api/ml/{id}/train does not trigger training #2578

Describe the bug
Per the documentation, a POST to api/ml/{id}/train should trigger the training process on the ML backend, which in turn should call the fit function of the LabelStudioMLBase model. However, when I hit this endpoint, even directly via curl, there is no response from the label-studio server, and the fit method never runs.
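For concreteness, here is the same call expressed with the requests library (a sketch mirroring the curl command in the reproduction steps below; the project id 1 and the token are placeholders for my real values):

import requests

# Same request as the curl command in the reproduction steps below;
# project id (1) and token are placeholders for my real values.
response = requests.post(
    "http://localhost:8080/api/ml/1/train",
    headers={"Authorization": "Token <token>"},
)
print(response.status_code, response.text)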

Here is the current version of the docker-compose.yml file for my project:

version: "3.8"

services:
  redis:
    image: redis:alpine
    container_name: redis
    hostname: redis
    volumes:
      - "./data/redis:/data"
    expose:
      - 6379
  labeling:
    container_name: labeling_container
    image: heartexlabs/label-studio:v1.5.0
    ports: 
      - 8080:8080
    depends_on:
      - modeling
    volumes: 
      - ./data:/label-studio/data
    environment:
      - LABEL_STUDIO_LOCAL_FILES_SERVING_ENABLED=true 
      - LABEL_STUDIO_LOCAL_FILES_DOCUMENT_ROOT=/label-studio/data/media
    command: > 
      bash -c "
      label-studio start  
      --log-level DEBUG
      --sampling prediction-score-min 
      --ml-backends http://modeling_container:9090"
    restart: always
  modeling:
    container_name: modeling_container
    build: 
      context: ./modeling
    command: >
      bash -c "
      label-studio-ml init modeling_backend 
      --script tools/${MODEL:-model.py}
      --force true
      &&
      label-studio-ml start ./modeling_backend 
      --port 9090
      --debug "
    restart: always
    volumes: 
      - ./data/media:/data/
    environment:
      - MODEL_DIR=/data/models
      - RQ_QUEUE_NAME=default
      - REDIS_HOST=redis
      - REDIS_PORT=6379
      - USE_REDIS=true
    ports:
      - 9090:9090
    depends_on:
      - redis
    links:
      - redis
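Since USE_REDIS=true, my understanding is that training jobs are enqueued to RQ rather than run inline. A rough way to check whether a training job ever reaches the queue, run from inside modeling_container (a sketch using the redis and rq packages that label-studio-ml depends on; the host, port, and queue name come from the environment above):

import redis
from rq import Queue

# Connect to the same Redis instance wired up in docker-compose.
conn = redis.Redis(host="redis", port=6379)
queue = Queue("default", connection=conn)

# If "Start Training" ever reached the backend, a job should show up here.
print("pending jobs:", queue.count)
print("job ids:", queue.job_ids)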

Here is my model.py file for the ML backend.

import torch
import torch.nn as nn
import torch.optim as optim
import time
import os
import numpy as np
import requests
import hashlib
import cv2
import pathlib
import urllib.parse as urlparse
from skimage import io, color  # note: `io` here is skimage.io, used by imread below

from PIL import Image
from torch.utils.data import Dataset, DataLoader
from torchvision import models, transforms

from label_studio_ml.model import LabelStudioMLBase
from label_studio_ml.utils import get_single_tag_keys, get_choice, is_skipped


device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')


import layoutparser as lp

image_cache_dir = os.path.join(os.path.dirname(__file__), 'image-cache')
os.makedirs(image_cache_dir, exist_ok=True)


def load_image_from_url(url):
    # Local paths arrive without a scheme; convert them to file:// URIs so
    # skimage.io.imread can handle local files and http(s) URLs alike.
    pres = urlparse.urlparse(url)
    if pres.scheme == '':
        purl = pathlib.Path(url)
        url = purl.as_uri()

    im = io.imread(url)
    if len(im.shape) < 3:
        # grayscale image: convert to RGB
        im = color.gray2rgb(im)
    return im



def convert_block_to_value(block, image_height, image_width):
    # Label Studio expects rectangle geometry as percentages of the
    # original image dimensions.
    return {
        "height": block.height / image_height * 100,
        "choices": [str(block.type)],
        "rotation": 0,
        "width": block.width / image_width * 100,
        "x": block.coordinates[0] / image_width * 100,
        "y": block.coordinates[1] / image_height * 100,
        "score": block.score,
    }
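

# For illustration (hypothetical numbers): a block with coordinates
# (x1=100, y1=50, x2=300, y2=150) in a 1000x800 image converts to
#   {"x": 10.0, "y": 6.25, "width": 20.0, "height": 12.5, ...}
# i.e. all geometry is expressed as a percentage of the original image size.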


class ObjectDetectionAPI(LabelStudioMLBase):

    def __init__(self, freeze_extractor=False, **kwargs):

        super(ObjectDetectionAPI, self).__init__(**kwargs)

        # label_map_list = os.environ['LABEL_MAP'].split()
        # {int(label_map_list[i]): str(label_map_list[i+1]) for i in range(0, len(label_map_list), 2)}

        print('parsed label config:\n ')
        print(self.parsed_label_config)
        
        self.from_name, self.to_name, self.value, self.classes =\
            get_single_tag_keys(self.parsed_label_config, 'RectangleLabels', 'Image')
        self.freeze_extractor = freeze_extractor
    
        self.model = lp.Detectron2LayoutModel(
            config_path = 'lp://detectron2/PrimaLayout/mask_rcnn_R_50_FPN_3x/config',
            # model_path  = 'https://www.dropbox.com/s/bitxe8occzb865u/model_final.pth?dl=1',
            ### PLEASE REMEMBER TO CHANGE `dl=0` INTO `dl=1` IN THE END 
            ### OF DROPBOX LINKS 
            extra_config=["MODEL.ROI_HEADS.NMS_THRESH_TEST", 0.2,
                          "MODEL.ROI_HEADS.SCORE_THRESH_TEST", 0.8],
            label_map={0: "text"}
        )

    def reset_model(self):
        # self.model = ImageClassifier(len(self.classes), self.freeze_extractor)
        pass

    def predict(self, tasks, **kwargs):

        # print('tasks: ', tasks)
        print(kwargs)
        print('self.value: ', self.value)

        image_urls = [task['data'][self.value] for task in tasks]
        print('image urls: ', image_urls)
        images = [load_image_from_url(url) for url in image_urls]
        print('im sizes: ', [im.shape for im in images])
        layouts = [self.model.detect(image) for image in images]  
        print('label config: ', self.parsed_label_config)
        print('layouts: ', layouts)
        predictions = []
        for image, layout in zip(images, layouts):
            height, width = image.shape[:2]

            result = [
                {
                'from_name': self.from_name,
                'to_name': self.to_name,
                "original_height": height,
                "original_width": width,
                "source": "$image",
                'type': 'rectanglelabels',
                "value": convert_block_to_value(block, height, width),
                } for block in layout
            ]

            predictions.append({'result': result})

        return predictions
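
    # For reference, the payload Label Studio receives from predict() has this
    # shape (abridged; numbers hypothetical):
    # [{"result": [{"from_name": ..., "to_name": ..., "type": "rectanglelabels",
    #               "value": {"x": 10.0, "y": 6.25, "width": 20.0,
    #                         "height": 12.5, "choices": ["text"],
    #                         "score": 0.97}}]}]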

    def fit(self, tasks, workdir=None, 
            batch_size=32, num_epochs=10, **kwargs):
        print("now running the fit function....")
        image_urls, image_classes = [], []
        # print('Collecting completions...')
        # for completion in completions:
        #     if is_skipped(completion):
        #         continue
        #     image_urls.append(completion['data'][self.value])
        #     image_classes.append(get_choice(completion))
        print('tasks: ', tasks)


        print('image urls: ', image_urls)
        print('image classes: ', image_classes)

        # print('Creating dataset...')
        # dataset = ImageClassifierDataset(image_urls, image_classes)
        # dataloader = DataLoader(dataset, shuffle=True, batch_size=batch_size)

        # print('Train model...')
        # # self.reset_model()
        # self.model.train(dataloader, num_epochs=num_epochs)

        # print('Save model...')
        # model_path = os.path.join(workdir, 'model.pt')
        # self.model.save(model_path)

        return {'model_path': None, 'classes': None}

Right now there isn't much in the fit function; I just wanted to make sure it gets called. However, nothing is ever printed to the logs of the modeling_container.
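
As a sanity check that the class itself works, fit can be called directly, outside Label Studio. A rough sketch (the label config below is a minimal stand-in for my real project config, and instantiating the class also loads the Detectron2 model):

from model import ObjectDetectionAPI

# Minimal config matching the RectangleLabels/Image tags the class expects;
# a stand-in for my real project config.
label_config = """
<View>
  <Image name="image" value="$image"/>
  <RectangleLabels name="label" toName="image">
    <Label value="text"/>
  </RectangleLabels>
</View>
"""

model = ObjectDetectionAPI(label_config=label_config)
print(model.fit(tasks=[]))  # should print "now running the fit function...."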

To Reproduce
Steps to reproduce the behavior:

  1. Log in to http://localhost:8080
  2. Create a new project (test)
  3. Add data and configuration. In my case I'm using rectangular bounding boxes.
  4. Add the ML backend in settings. It needs to be http://modeling_container:9090 since all containers are on the same docker-compose network.
  5. Add data/annotations
  6. The auto-predictions in this case do work, triggering the predict function specified in model.py
  7. Go to Settings->Machine Learning and click Start Training on the connected ML backend
  8. curl -X POST http://localhost:8080/api/ml/{id}/train -H 'Authorization: Token <token>' also does nothing (see the connectivity check below).
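
To rule out a networking problem between the host and the backend, the backend can be probed directly (a sketch; I am assuming this version of label-studio-ml exposes a /health route on its Flask server):

import requests

# Port 9090 is published by docker-compose; the /health route is an
# assumption about this label-studio-ml version.
print(requests.get("http://localhost:9090/health").text)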

Expected behavior
The code in the fit function should run when the curl command is issued or when the "Start Training" button is clicked.

Screenshots
Can provide if needed.

Environment (please complete the following information):

  • OS: Ubuntu 18.04, running Docker 20.10.17 (build 100c701) and docker-compose 1.29.1 (build c34c88b)
  • Label Studio Version 1.5.0

Additional context
It's entirely possible that I'm not configuring the project correctly, so please let me know.

Labels

ML (Model-assisted labeling / pre-annotations), problem (bug or something isn't working)
