UTF-8 letters support #4351
Comments
👋 Hello @lavrenkov-sketch, thank you for your interest in YOLOv5 🚀! Please visit our ⭐️ Tutorials to get started, where you can find quickstart guides for simple tasks like Custom Data Training all the way to advanced concepts like Hyperparameter Evolution.
If this is a 🐛 Bug Report, please provide screenshots and minimum viable code to reproduce your issue, otherwise we can not help you.
If this is a custom training ❓ Question, please provide as much information as possible, including dataset images, training logs, screenshots, and a public link to online W&B logging if available.
For business inquiries or professional support requests please visit https://ultralytics.com or email Glenn Jocher at glenn.jocher@ultralytics.com.
Requirements
Python>=3.6.0 with all requirements.txt installed including PyTorch>=1.7. To get started:
$ git clone https://github.com/ultralytics/yolov5
$ cd yolov5
$ pip install -r requirements.txt
Environments
YOLOv5 may be run in any of the following up-to-date verified environments (with all dependencies including CUDA/CUDNN, Python and PyTorch preinstalled):
Status
If this badge is green, all YOLOv5 GitHub Actions Continuous Integration (CI) tests are currently passing. CI tests verify correct operation of YOLOv5 training (train.py), validation (val.py), inference (detect.py) and export (export.py) on MacOS, Windows, and Ubuntu every 24 hours and on every commit. |
I'm interested too. The See: http://fireant.github.io/misc/2017/01/28/ttf-opencv.html There is |
By the way, not only labels containing UTF-8 characters such as accents don't appear on images, the results ( Example: "Société France" would become "Socit France" |
@kinoute @lavrenkov-sketch we have a PIL-based plotter that might be better for UTF-8: Lines 82 to 94 in e96c74b
|
@glenn-jocher how can I use plot_one_box_PIL? It returns an array and doesn't draw any labels on the image. |
im0 = plot_one_box_PIL(xyxy, im0, label=label, color=colors(c, True), line_thickness=line_thickness) |
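The call above works because a PIL-based plotter draws on a copy of the image and returns the result, so the return value has to be assigned back. A minimal sketch of that pattern follows. It is not the YOLOv5 implementation: `draw_label_pil` and its `font_path` argument are hypothetical names, and it assumes Pillow and NumPy are installed.

```python
import numpy as np
from PIL import Image, ImageDraw, ImageFont


def draw_label_pil(im, box, label, color=(255, 0, 0), font_path=None, font_size=16):
    """Draw a box and a text label on a copy of `im` and return the copy.

    `font_path` is a hypothetical argument: pass a TrueType font that covers
    the characters in `label` (for example a Cyrillic-capable font).
    """
    pil_im = Image.fromarray(im)  # PIL works on its own image object
    draw = ImageDraw.Draw(pil_im)
    try:
        font = ImageFont.truetype(font_path, font_size) if font_path else ImageFont.load_default()
    except OSError:
        # Font file not found: fall back to Pillow's built-in font
        font = ImageFont.load_default()
    x1, y1, x2, y2 = box
    draw.rectangle([x1, y1, x2, y2], outline=color, width=2)
    draw.text((x1 + 3, y1 + 3), label, fill=color, font=font)
    return np.asarray(pil_im)  # a new array; the input array is left untouched
```

Usage mirrors the line above: `im0 = draw_label_pil(im0, xyxy, label)`. Calling it without assigning the result leaves `im0` unchanged, which matches the "doesn't draw any labels" symptom. Note that Pillow's built-in fallback font may not cover non-Latin scripts, so a suitable font file is needed for labels such as Russian text.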
@lavrenkov-sketch I've opened up PR #4407 to assign a default line width with PIL plotting, not sure why it was missing. Can you supply an example class name that is failing with the existing cv2 plotting function? We should update this to switch to PIL automatically on detection of UTF-8 characters. |
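Since plain ASCII text is also valid UTF-8, "detection of UTF-8 characters" in practice means detecting non-ASCII characters. One way this check could look is sketched below; `is_ascii` and `choose_plotter` are illustrative names, not functions confirmed by this thread.

```python
def is_ascii(text):
    """Return True if `text` contains only ASCII characters."""
    text = str(text)
    # Non-ASCII characters are dropped by errors="ignore", shortening the result
    return len(text.encode("ascii", "ignore")) == len(text)


def choose_plotter(label):
    """Pick a plotting backend name for `label` (names are illustrative)."""
    return "cv2" if is_ascii(label) else "pil"
```

With this check, a label like `'tie'` would stay on the cv2 path, while `'Société Générale'` or a Russian class name would be routed to the PIL plotter.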
@kinoute perfect thanks! I'll see if we can merge a more comprehensive fix this weekend. But wait in this case the cv2 label function works with a silent error of not displaying the é characters? Is this master code or did you modify this to remove these characters? |
I didn't modify anything, it comes from a training I did 3 days ago. I
train with the official docker image.
To avoid this problem, I temporarily removed the accents from my yaml file.
Do note that accents don't show up in "results" at inference time too. It's
not only a CV2 problem.
|
@kinoute I see correct console printout but incorrect cv2 plotting in Colab. Can you run this code to reproduce in your environment:
# PyTorch Hub
import torch
# Model
model = torch.hub.load('ultralytics/yolov5', 'yolov5s')
model.names[0] = 'Société Générale'
# Images
dir = 'https://ultralytics.com/images/'
imgs = [dir + f for f in ('zidane.jpg', 'bus.jpg')] # batch of images
# Inference
results = model(imgs)
results.print()
results.save()
results.pandas().xyxy[0] |
@glenn-jocher It works like this because you overwrite the class name right before the inference, but I think the problem comes from the time the classes are loaded from the yaml dataset file. I was able to reproduce the issue like this:
# download yolov5 master
git clone https://github.com/ultralytics/yolov5
cd yolov5
# download coco128 dataset
chmod +x data/scripts/get_coco128.sh && ./data/scripts/get_coco128.sh
# change first class name in data/coco128.yaml from "person" to "person Société Générale"
vi data/coco128.yaml
# train
python train.py --img 640 --batch 16 --epochs 3 --data coco128.yaml --weights yolov5s.pt --cache
After the training, we can confirm some images have accents removed, such as
Now, on the detection/inference, running:
python detect.py --weights runs/train/exp/weights/best.pt --img 640 --conf 0.25 --source data/images/ --save-txt
And finally, using the Docker Hub for inference:
/sites/yolov5 python3
Python 3.8.9 (default, Apr 3 2021, 01:49:17)
[Clang 11.0.0 (clang-1100.0.33.17)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> model = torch.hub.load('ultralytics/yolov5', 'custom', path="runs/train/exp/weights/best.pt")
Using cache found in /Users/kinoute/.cache/torch/hub/ultralytics_yolov5_master
YOLOv5 🚀 2021-8-14 torch 1.7.0 CPU
Fusing layers...
Model Summary: 224 layers, 7266973 parameters, 0 gradients, 17.0 GFLOPs
Adding AutoShape...
>>> dir = 'https://ultralytics.com/images/'
>>> imgs = [dir + f for f in ('zidane.jpg', 'bus.jpg')] # batch of images
>>> results = model(imgs)
>>> results.print()
image 1/2: 720x1280 2 person Socit Gnrales, 2 ties
image 2/2: 1080x810 4 person Socit Gnrales, 1 bus
Speed: 536.4ms pre-process, 403.4ms inference, 2.1ms NMS per image at shape (2, 3, 640, 640)
>>> results.save()
Saved 2 images to 'runs/detect/exp3'
>>> results.pandas().xyxy[0]
xmin ymin xmax ymax confidence class \
0 748.832520 38.970703 1150.665039 711.768921 0.866193 0
1 113.337891 193.695984 1127.441406 712.344666 0.676592 0
2 433.699280 433.693848 517.761475 715.654663 0.659638 27
3 986.345520 303.328003 1027.942627 420.201050 0.278434 27
name
0 person Socit Gnrale
1 person Socit Gnrale
2 tie
3 tie
>>>
We can see that in the "textual" result, the accents don't appear either. |
@kinoute good news 😃! Your original issue may now be partially fixed ✅ in PR #4413. This PR removes 'ascii' encoding that had been in place when reading yaml files, so now the UTF-8 characters are saved correctly in the model names list. Unfortunately the cv2 plotting issue remains though, which will require separate additional fixes. In this example I trained a new model with an updated coco128.yaml as in your above example. The model best.pt correctly saves the UTF-8 characters: To receive this update:
Thank you for spotting this issue and informing us of the problem. Please let us know if this update resolves the issue for you, and feel free to inform us of any other issues you discover or feature requests that come to mind. Happy trainings with YOLOv5 🚀! |
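The effect of the 'ascii' encoding described in the PR #4413 comment can be reproduced with the standard library alone. The sketch below assumes the file was opened with `encoding="ascii"` together with `errors="ignore"` (the error mode is an assumption, chosen because it matches the silent dropping seen earlier in the thread), and uses a made-up one-line file rather than a real dataset yaml.

```python
import os
import tempfile

# A single yaml-like line with accented class names (illustrative content)
names_line = "names: ['person Société Générale', 'tie']\n"

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "data.yaml")
    with open(path, "w", encoding="utf-8") as f:
        f.write(names_line)

    # Assumed old behaviour: bytes outside ASCII are silently discarded
    with open(path, encoding="ascii", errors="ignore") as f:
        lossy = f.read()

    # Reading as UTF-8 keeps the accents
    with open(path, encoding="utf-8") as f:
        exact = f.read()

print(lossy)                # accents are missing here
print(exact == names_line)  # UTF-8 read round-trips the original text
```

Each accented character is encoded in UTF-8 as bytes outside the ASCII range, so the lossy read removes the whole character, which is how "Société Générale" ends up as "Socit Gnrale".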
TODO: UTF-8 image plotting fix |
I will have a look this afternoon and report to you if the fix works, thanks! Regarding PIL UTF-8, I think the Nvidia docker image is based on Ubuntu 20.04. We could switch from Arial.ttf to another built-in font such as the Ubuntu font, which is unicode by default. But we have to find a way to deal with people that don't run YOLOv5 through the docker image as well. I think there is a way to fix CV2 too, but it will need way more work and changes. |
@kinoute I think we need to merge the PIL and cv2 plotting functions into one, which accepts a |
How can I add support for UTF-8 fonts? I want to display predictions in Russian, and the OpenCV library itself seems to be able to do this.