Skip to content

Commit

Permalink
Merge pull request #10 from PaddlePaddle/develop
Browse files Browse the repository at this point in the history
mer upstream
  • Loading branch information
WenmuZhou authored Nov 27, 2020
2 parents 3fb28d2 + e6f89bd commit e51045c
Show file tree
Hide file tree
Showing 274 changed files with 88,065 additions and 9,773 deletions.
2,053 changes: 2,053 additions & 0 deletions PPOCRLabel/PPOCRLabel.py

Large diffs are not rendered by default.

80 changes: 80 additions & 0 deletions PPOCRLabel/README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,80 @@
# PPOCRLabel

PPOCRLabel是一款适用于OCR领域的半自动化图形标注工具,使用python3和pyqt5编写,支持矩形框标注和四点标注模式,导出格式可直接用于PPOCR检测和识别模型的训练。

<img src="./data/gif/steps.gif" width="100%"/>

## 安装

### 1. 安装PaddleOCR
参考[PaddleOCR安装文档](https://github.com/PaddlePaddle/PaddleOCR/blob/develop/doc/doc_ch/installation.md)准备好PaddleOCR

### 2. 安装PPOCRLabel
#### Windows + Anaconda

下载安装[Anaconda](https://www.anaconda.com/download/#download) (Python 3+)

```
conda install pyqt=5
cd ./PPOCRLabel # 将目录切换到PPOCRLabel文件夹下
pyrcc5 -o libs/resources.py resources.qrc
python PPOCRLabel.py
```

#### Ubuntu Linux

```
sudo apt-get install pyqt5-dev-tools
sudo apt-get install trash-cli
cd ./PPOCRLabel # 将目录切换到PPOCRLabel文件夹下
sudo pip3 install -r requirements/requirements-linux-python3.txt
make qt5py3
python3 PPOCRLabel.py
```

#### macOS
```
pip3 install pyqt5
pip3 uninstall opencv-python # 由于mac版本的opencv与pyqt有冲突,需先手动卸载opencv
pip3 install opencv-contrib-python-headless # 安装headless版本的open-cv
cd ./PPOCRLabel # 将目录切换到PPOCRLabel文件夹下
make qt5py3
python3 PPOCRLabel.py
```

## 使用

### 操作步骤

1. 安装与运行:使用上述命令安装与运行程序。
2. 打开文件夹:在菜单栏点击 “文件” - "打开目录" 选择待标记图片的文件夹<sup>[1]</sup>.
3. 自动标注:点击 ”自动标注“,使用PPOCR超轻量模型对图片文件名前图片状态<sup>[2]</sup>为 “X” 的图片进行自动标注。
4. 手动标注:点击 “矩形标注”(推荐直接在英文模式下点击键盘中的 “W”),用户可对当前图片中模型未检出的部分进行手动绘制标记框。点击键盘P,则使用四点标注模式(或点击“编辑” - “四点标注”),用户依次点击4个点后,双击左键表示标注完成。
5. 标记框绘制完成后,用户点击 “确认”,检测框会先被预分配一个 “待识别” 标签。
6. 重新识别:将图片中的所有检测画绘制/调整完成后,点击 “重新识别”,PPOCR模型会对当前图片中的**所有检测框**重新识别<sup>[3]</sup>。
7. 内容更改:双击识别结果,对不准确的识别结果进行手动更改。
8. 保存:点击 “保存”,图片状态切换为 “√”,跳转至下一张。
9. 删除:点击 “删除图像”,图片将会被删除至回收站。
10. 标注结果:关闭应用程序或切换文件路径后,手动保存过的标签将会被存放在所打开图片文件夹下的*Label.txt*中。在菜单栏点击 “PaddleOCR” - "保存识别结果"后,会将此类图片的识别训练数据保存在*crop_img*文件夹下,识别标签保存在*rec_gt.txt*中<sup>[4]</sup>。

### 注意

[1] PPOCRLabel以文件夹为基本标记单位,打开待标记的图片文件夹后,不会在窗口栏中显示图片,而是在点击 "选择文件夹" 之后直接将文件夹下的图片导入到程序中。

[2] 图片状态表示本张图片用户是否手动保存过,未手动保存过即为 “X”,手动保存过为 “√”。点击 “自动标注”按钮后,PPOCRLabel不会对状态为 “√” 的图片重新标注。

[3] 点击“重新识别”后,模型会对图片中的识别结果进行覆盖。因此如果在此之前手动更改过识别结果,有可能在重新识别后产生变动。

[4] PPOCRLabel产生的文件包括一下几种,请勿手动更改其中内容,否则会引起程序出现异常。

| 文件名 | 说明 |
| :-----------: | :----------------------------------------------------------: |
| Label.txt | 检测标签,可直接用于PPOCR检测模型训练。用户每保存10张检测结果后,程序会进行自动写入。当用户关闭应用程序或切换文件路径后同样会进行写入。 |
| fileState.txt | 图片状态标记文件,保存当前文件夹下已经被用户手动确认过的图片名称。 |
| Cache.cach | 缓存文件,保存模型自动识别的结果。 |
| rec_gt.txt | 识别标签。可直接用于PPOCR识别模型训练。需用户手动点击菜单栏“PaddleOCR” - "保存识别结果"后产生。 |
| crop_img | 识别数据。按照检测框切割后的图片。与rec_gt.txt同时产生。 |

### 参考资料

1.[Tzutalin. LabelImg. Git code (2015)](https://github.com/tzutalin/labelImg)
Empty file added PPOCRLabel/__init__.py
Empty file.
12 changes: 12 additions & 0 deletions PPOCRLabel/build-tools/.gitignore
Original file line number Diff line number Diff line change
@@ -0,0 +1,12 @@
*.spec
build
dist
pyinstaller
python-2.*
pywin32*
virtual-wine
venv_wine
PyQt4-*
lxml-*
windows_v*
linux_v*
35 changes: 35 additions & 0 deletions PPOCRLabel/build-tools/README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,35 @@
### Deploy to PyPI

```
cd [ROOT]
sh build-tools/build-for-pypi.sh
```

### Build for Ubuntu

```
cd build-tools
sh run-in-container.sh
sh envsetup.sh
sh build-ubuntu-binary.sh
```

### Build for Windows

```
cd build-tools
sh run-in-container.sh
sh envsetup.sh
sh build-windows-binary.sh
```

### Build for macOS High Sierra
```
cd build-tools
./build-for-macos.sh
```

Note: If there are some problems, try to
```
sudo rm -rf virtual-wne venv_wine
```
30 changes: 30 additions & 0 deletions PPOCRLabel/build-tools/build-for-macos.sh
Original file line number Diff line number Diff line change
@@ -0,0 +1,30 @@
#!/bin/sh

brew install python@2
pip install --upgrade virtualenv

# clone labelimg source
rm -rf /tmp/labelImgSetup
mkdir /tmp/labelImgSetup
cd /tmp/labelImgSetup
curl https://codeload.github.com/tzutalin/labelImg/zip/master --output labelImg.zip
unzip labelImg.zip
rm labelImg.zip

# setup python3 space
virtualenv --system-site-packages -p python3 /tmp/labelImgSetup/labelImg-py3
source /tmp/labelImgSetup/labelImg-py3/bin/activate
cd labelImg-master

# build labelImg app
pip install py2app
pip install PyQt5 lxml
make qt5py3
rm -rf build dist
python setup.py py2app -A
mv "/tmp/labelImgSetup/labelImg-master/dist/labelImg.app" /Applications
# deactivate python3
deactivate
cd ../
rm -rf /tmp/labelImgSetup
echo 'DONE'
17 changes: 17 additions & 0 deletions PPOCRLabel/build-tools/build-for-pypi.sh
Original file line number Diff line number Diff line change
@@ -0,0 +1,17 @@
#!/bin/sh
# Packaging and Release
docker run --workdir=$(pwd)/ --volume="/home/$USER:/home/$USER" tzutalin/py2qt4 /bin/sh -c 'make qt4py2; make test;sudo python setup.py sdist;sudo python setup.py install'

while true; do
read -p "Do you wish to deploy this to PyPI(twine upload dist/* or pip install dist/*)?" yn
case $yn in
[Yy]* ) docker run -it --rm --workdir=$(pwd)/ --volume="/home/$USER:/home/$USER" tzutalin/py2qt4; break;;
[Nn]* ) exit;;
* ) echo "Please answer yes or no.";;
esac
done
# python setup.py register
# python setup.py sdist upload
# Net pypi: twine upload dist/*

# Test before upladoing: pip install dist/labelImg.tar.gz
24 changes: 24 additions & 0 deletions PPOCRLabel/build-tools/build-ubuntu-binary.sh
Original file line number Diff line number Diff line change
@@ -0,0 +1,24 @@
#!/bin/bash
### Ubuntu use pyinstall v3.0
THIS_SCRIPT_PATH=`readlink -f $0`
THIS_SCRIPT_DIR=`dirname ${THIS_SCRIPT_PATH}`
cd pyinstaller
git checkout v3.2
cd ${THIS_SCRIPT_DIR}

rm -r build
rm -r dist
rm labelImg.spec
python pyinstaller/pyinstaller.py --hidden-import=xml \
--hidden-import=xml.etree \
--hidden-import=xml.etree.ElementTree \
--hidden-import=lxml.etree \
-D -F -n labelImg -c "../labelImg.py" -p ../libs -p ../

FOLDER=$(git describe --abbrev=0 --tags)
FOLDER="linux_"$FOLDER
rm -rf "$FOLDER"
mkdir "$FOLDER"
cp dist/labelImg $FOLDER
cp -rf ../data $FOLDER/data
zip "$FOLDER.zip" -r $FOLDER
32 changes: 32 additions & 0 deletions PPOCRLabel/build-tools/build-windows-binary.sh
Original file line number Diff line number Diff line change
@@ -0,0 +1,32 @@
#!/bin/bash
### Window requires pyinstall v2.1
wine msiexec -i python-2.7.8.msi
wine pywin32-218.win32-py2.7.exe
wine PyQt4-4.11.4-gpl-Py2.7-Qt4.8.7-x32.exe
wine lxml-3.7.3.win32-py2.7.exe

THIS_SCRIPT_PATH=`readlink -f $0`
THIS_SCRIPT_DIR=`dirname ${THIS_SCRIPT_PATH}`
cd pyinstaller
git checkout v2.1
cd ${THIS_SCRIPT_DIR}
echo ${THIS_SCRIPT_DIR}

#. venv_wine/bin/activate
rm -r build
rm -r dist
rm labelImg.spec

wine c:/Python27/python.exe pyinstaller/pyinstaller.py --hidden-import=xml \
--hidden-import=xml.etree \
--hidden-import=xml.etree.ElementTree \
--hidden-import=lxml.etree \
-D -F -n labelImg -c "../labelImg.py" -p ../libs -p ../

FOLDER=$(git describe --abbrev=0 --tags)
FOLDER="windows_"$FOLDER
rm -rf "$FOLDER"
mkdir "$FOLDER"
cp dist/labelImg.exe $FOLDER
cp -rf ../data $FOLDER/data
zip "$FOLDER.zip" -r $FOLDER
53 changes: 53 additions & 0 deletions PPOCRLabel/build-tools/envsetup.sh
Original file line number Diff line number Diff line change
@@ -0,0 +1,53 @@
#!/bin/sh

THIS_SCRIPT_PATH=`readlink -f $0`
THIS_SCRIPT_DIR=`dirname ${THIS_SCRIPT_PATH}`
#OS Ubuntu 14.04
### Common packages for linux/windows
if [ ! -e "pyinstaller" ]; then
git clone https://github.com/pyinstaller/pyinstaller
cd pyinstaller
git checkout v2.1 -b v2.1
cd ${THIS_SCRIPT_DIR}
fi

echo "Going to clone and download packages for building windows"
#Pacakges
#> pyinstaller (2.1)
#> wine (1.6.2)
#> virtual-wine (0.1)
#> python-2.7.8.msi
#> pywin32-218.win32-py2.7.exe

## tool to install on Ubuntu
#$ sudo apt-get install wine

### Clone a repo to create virtual wine env
if [ ! -e "virtual-wine" ]; then
git clone https://github.com/htgoebel/virtual-wine.git
fi

apt-get install scons
### Create virtual env
rm -rf venv_wine
./virtual-wine/vwine-setup venv_wine
#### Active virutal env
. venv_wine/bin/activate

### Use wine to install packages to virtual env
if [ ! -e "python-2.7.8.msi" ]; then
wget "https://www.python.org/ftp/python/2.7.8/python-2.7.8.msi"
fi

if [ ! -e "pywin32-218.win32-py2.7.exe" ]; then
wget "http://nchc.dl.sourceforge.net/project/pywin32/pywin32/Build%20218/pywin32-218.win32-py2.7.exe"
fi

if [ ! -e "PyQt4-4.11.4-gpl-Py2.7-Qt4.8.7-x32.exe" ]; then
wget "http://nchc.dl.sourceforge.net/project/pyqt/PyQt4/PyQt-4.11.4/PyQt4-4.11.4-gpl-Py2.7-Qt4.8.7-x32.exe"
fi

if [ ! -e "lxml-3.7.3.win32-py2.7.exe" ]; then
wget "https://pypi.python.org/packages/a3/f6/a28c5cf63873f6c55a3eb7857b736379229b85ba918261d2e88cf886905e/lxml-3.7.3.win32-py2.7.exe#md5=a0f746355876aca4ca5371cb0f1d13ce"
fi

13 changes: 13 additions & 0 deletions PPOCRLabel/build-tools/run-in-container.sh
Original file line number Diff line number Diff line change
@@ -0,0 +1,13 @@
#!/bin/sh
docker run -it \
--user $(id -u) \
-e DISPLAY=unix$DISPLAY \
--workdir=$(pwd) \
--volume="/home/$USER:/home/$USER" \
--volume="/etc/group:/etc/group:ro" \
--volume="/etc/passwd:/etc/passwd:ro" \
--volume="/etc/shadow:/etc/shadow:ro" \
--volume="/etc/sudoers.d:/etc/sudoers.d:ro" \
-v /tmp/.X11-unix:/tmp/.X11-unix \
tzutalin/py2qt4

46 changes: 46 additions & 0 deletions PPOCRLabel/combobox.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,46 @@
# Copyright (c) <2015-Present> Tzutalin
# Copyright (C) 2013 MIT, Computer Science and Artificial Intelligence Laboratory. Bryan Russell, Antonio Torralba,
# William T. Freeman. Permission is hereby granted, free of charge, to any person obtaining a copy of this software and
# associated documentation files (the "Software"), to deal in the Software without restriction, including without
# limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the
# Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
# The above copyright notice and this permission notice shall be included in all copies or substantial portions of
# the Software. THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT
# NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT
# SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF
# CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
# THE SOFTWARE.

import sys
try:
from PyQt5.QtWidgets import QWidget, QHBoxLayout, QComboBox
except ImportError:
# needed for py3+qt4
# Ref:
# http://pyqt.sourceforge.net/Docs/PyQt4/incompatible_apis.html
# http://stackoverflow.com/questions/21217399/pyqt4-qtcore-qvariant-object-instead-of-a-string
if sys.version_info.major >= 3:
import sip
sip.setapi('QVariant', 2)
from PyQt4.QtGui import QWidget, QHBoxLayout, QComboBox


class ComboBox(QWidget):
def __init__(self, parent=None, items=[]):
super(ComboBox, self).__init__(parent)

layout = QHBoxLayout()
self.cb = QComboBox()
self.items = items
self.cb.addItems(self.items)

self.cb.currentIndexChanged.connect(parent.comboSelectionChanged)

layout.addWidget(self.cb)
self.setLayout(layout)

def update_items(self, items):
self.items = items

self.cb.clear()
self.cb.addItems(self.items)
Binary file added PPOCRLabel/data/gif/steps.gif
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added PPOCRLabel/data/paddle.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Empty file.
2 changes: 2 additions & 0 deletions PPOCRLabel/libs/__init__.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,2 @@
__version_info__ = ('1', '0', '0')
__version__ = '.'.join(__version_info__)
Loading

0 comments on commit e51045c

Please sign in to comment.