feat: python 3.11 support #1681

Merged: 45 commits, Mar 12, 2024

Commits
d6cb2cf
feat: python 3.11 tests
IgnatovFedor Nov 7, 2023
f9c08c9
feat: py311 in jenkinsfile
IgnatovFedor Nov 7, 2023
70ec412
fix: Jenkinsfile
IgnatovFedor Nov 9, 2023
c83957f
Merge branch 'dev' into feat/py311
optk2k Jan 17, 2024
c47c0b2
scikit-learn
optk2k Jan 23, 2024
7af9877
sphinx
optk2k Jan 23, 2024
c99e203
scikit-learn 2
optk2k Jan 24, 2024
737ce18
sphinx 2
optk2k Jan 24, 2024
0c3bb79
docutils
optk2k Jan 29, 2024
a70611d
docutils 2
optk2k Jan 29, 2024
e6ff381
sphinx-rtd-theme
optk2k Jan 29, 2024
40db39e
check correct label
optk2k Feb 1, 2024
d4757c6
faiss-cpu
optk2k Feb 1, 2024
fa73ffb
kenlm
optk2k Feb 1, 2024
e3f638e
datasets
optk2k Feb 6, 2024
8f1dcbf
datasets down
optk2k Feb 8, 2024
9434b4c
sphinx 3
optk2k Feb 8, 2024
46388e8
sphinx 4
optk2k Feb 14, 2024
f5d58c3
clear setup.py
optk2k Feb 14, 2024
5b1bd21
up ci
optk2k Feb 14, 2024
3182dee
sphinx for low python
optk2k Feb 14, 2024
0eb9837
sphinx_rtd_theme for low python
optk2k Feb 14, 2024
1ac8129
sphinx_rtd_theme for low python 2
optk2k Feb 14, 2024
1a36d72
setup.py for low python
optk2k Feb 14, 2024
f185efc
scikit-learn for low python
optk2k Feb 14, 2024
14d1268
requirements for low python
optk2k Feb 15, 2024
c7f7bb5
faiss-cpu down
optk2k Feb 15, 2024
a0ee765
sphinx for python > 9
optk2k Feb 16, 2024
f0ba025
sphinx for python > 9 2
optk2k Feb 16, 2024
914e12d
sphinx for python > 9 3
optk2k Feb 16, 2024
042e053
faiss-cpu 3.10 and 3.11
optk2k Feb 19, 2024
907356d
feat: retries in simple_download
IgnatovFedor Feb 21, 2024
edcc0b0
sphinx for 3.7
optk2k Feb 27, 2024
a1ae370
sphinx for 3.8
optk2k Feb 27, 2024
246ff23
sphinx python<3.10
optk2k Feb 27, 2024
89ce6c5
back to sphinx 3.7
optk2k Feb 28, 2024
e847248
up sphinx to 5.0.0
optk2k Feb 29, 2024
1a65905
down sphinx for 3.7
optk2k Mar 4, 2024
8d4de77
update sphinx
optk2k Mar 4, 2024
c35a429
refactor: fixes in documentation and small improvements
IgnatovFedor Mar 7, 2024
631f7a2
docs: edit kbqa section names
IgnatovFedor Mar 11, 2024
626080c
deleted broken link
optk2k Mar 12, 2024
7e4c906
fix: relation extraction documentation page
IgnatovFedor Mar 12, 2024
6f1e71c
Merge branch 'feat/python3.11' of https://github.com/deeppavlov/DeepP…
IgnatovFedor Mar 12, 2024
0173ed2
Revert "deleted broken link"
IgnatovFedor Mar 12, 2024
2 changes: 1 addition & 1 deletion Jenkinsfile
@@ -19,7 +19,7 @@ node('cuda-module') {
docker-compose -f utils/Docker/docker-compose.yml -p $BUILD_TAG ps | grep Exit | grep -v 'Exit 0' && exit 1
docker-compose -f utils/Docker/docker-compose.yml -p $BUILD_TAG up py38 py39
docker-compose -f utils/Docker/docker-compose.yml -p $BUILD_TAG ps | grep Exit | grep -v 'Exit 0' && exit 1
docker-compose -f utils/Docker/docker-compose.yml -p $BUILD_TAG up py310
docker-compose -f utils/Docker/docker-compose.yml -p $BUILD_TAG up py310 py311
docker-compose -f utils/Docker/docker-compose.yml -p $BUILD_TAG ps | grep Exit | grep -v 'Exit 0' && exit 1 || exit 0
"""
currentBuild.result = 'SUCCESS'
2 changes: 1 addition & 1 deletion README.md
@@ -1,5 +1,5 @@
[![License Apache 2.0](https://img.shields.io/badge/license-Apache%202.0-blue.svg)](LICENSE)
![Python 3.6, 3.7, 3.8, 3.9, 3.10](https://img.shields.io/badge/python-3.6%20%7C%203.7%20%7C%203.8%20%7C%203.9%20%7C%203.10-green.svg)
![Python 3.6, 3.7, 3.8, 3.9, 3.10, 3.11](https://img.shields.io/badge/python-3.6%20%7C%203.7%20%7C%203.8%20%7C%203.9%20%7C%203.10%20%7C%203.11-green.svg)
[![Downloads](https://pepy.tech/badge/deeppavlov)](https://pepy.tech/project/deeppavlov)
<img align="right" height="27%" width="27%" src="docs/_static/deeppavlov_logo.png"/>

2 changes: 1 addition & 1 deletion deeppavlov/_meta.py
@@ -1,4 +1,4 @@
__version__ = '1.5.0'
__version__ = '1.6.0'
__author__ = 'Neural Networks and Deep Learning lab, MIPT'
__description__ = 'An open source library for building end-to-end dialog systems and training chatbots.'
__keywords__ = ['NLP', 'NER', 'SQUAD', 'Intents', 'Chatbot']
110 changes: 59 additions & 51 deletions deeppavlov/core/data/utils.py
@@ -78,7 +78,7 @@ def s3_download(url: str, destination: str) -> None:
file_object.download_file(destination, Callback=pbar.update)


def simple_download(url: str, destination: Union[Path, str], headers: Optional[dict] = None) -> None:
def simple_download(url: str, destination: Union[Path, str], headers: Optional[dict] = None, n_tries: int = 3) -> None:
"""Download a file from URL to target location.

Displays a progress bar to the terminal during the download process.
@@ -87,58 +87,66 @@ def simple_download(url: str, destination: Union[Path, str], headers: Optional[d
url: The source URL.
destination: Path to the file destination (including file name).
headers: Headers for file server.
n_tries: Number of retries if download fails.

"""
destination = Path(destination)
destination.parent.mkdir(parents=True, exist_ok=True)

log.info('Downloading from {} to {}'.format(url, destination))

if url.startswith('s3://'):
return s3_download(url, str(destination))

chunk_size = 32 * 1024
temporary = destination.with_suffix(destination.suffix + '.part')

r = requests.get(url, stream=True, headers=headers)
if r.status_code != 200:
raise RuntimeError(f'Got status code {r.status_code} when trying to download {url}')
total_length = int(r.headers.get('content-length', 0))

if temporary.exists() and temporary.stat().st_size > total_length:
temporary.write_bytes(b'') # clearing temporary file when total_length is inconsistent

with temporary.open('ab') as f:
downloaded = f.tell()
if downloaded != 0:
log.warning(f'Found a partial download {temporary}')
with tqdm(initial=downloaded, total=total_length, unit='B', unit_scale=True) as pbar:
while True:
if downloaded != 0:
log.warning(f'Download stopped abruptly, trying to resume from {downloaded} '
f'to reach {total_length}')
headers['Range'] = f'bytes={downloaded}-'
r = requests.get(url, headers=headers, stream=True)
if 'content-length' not in r.headers or \
total_length - downloaded != int(r.headers['content-length']):
raise RuntimeError('It looks like the server does not support resuming downloads.')

try:
for chunk in r.iter_content(chunk_size=chunk_size):
if chunk: # filter out keep-alive new chunks
downloaded += len(chunk)
pbar.update(len(chunk))
f.write(chunk)
except requests.exceptions.ChunkedEncodingError:
if downloaded == 0:
r = requests.get(url, stream=True, headers=headers)

if downloaded >= total_length:
# Note that total_length is 0 if the server didn't return the content length,
# in this case we perform just one iteration and assume that we are done.
break

temporary.rename(destination)
try:
destination = Path(destination)
destination.parent.mkdir(parents=True, exist_ok=True)

log.info('Downloading from {} to {}'.format(url, destination))

if url.startswith('s3://'):
return s3_download(url, str(destination))

chunk_size = 32 * 1024
temporary = destination.with_suffix(destination.suffix + '.part')

r = requests.get(url, stream=True, headers=headers)
if r.status_code != 200:
raise RuntimeError(f'Got status code {r.status_code} when trying to download {url}')
total_length = int(r.headers.get('content-length', 0))

if temporary.exists() and temporary.stat().st_size > total_length:
temporary.write_bytes(b'') # clearing temporary file when total_length is inconsistent

with temporary.open('ab') as f:
downloaded = f.tell()
if downloaded != 0:
log.warning(f'Found a partial download {temporary}')
with tqdm(initial=downloaded, total=total_length, unit='B', unit_scale=True) as pbar:
while True:
if downloaded != 0:
log.warning(f'Download stopped abruptly, trying to resume from {downloaded} '
f'to reach {total_length}')
headers['Range'] = f'bytes={downloaded}-'
r = requests.get(url, headers=headers, stream=True)
if 'content-length' not in r.headers or \
total_length - downloaded != int(r.headers['content-length']):
raise RuntimeError('It looks like the server does not support resuming downloads.')

try:
for chunk in r.iter_content(chunk_size=chunk_size):
if chunk: # filter out keep-alive new chunks
downloaded += len(chunk)
pbar.update(len(chunk))
f.write(chunk)
except requests.exceptions.ChunkedEncodingError:
if downloaded == 0:
r = requests.get(url, stream=True, headers=headers)

if downloaded >= total_length:
# Note that total_length is 0 if the server didn't return the content length,
# in this case we perform just one iteration and assume that we are done.
break

temporary.rename(destination)
except Exception as e:
if n_tries > 0:
log.warning(f'Download failed: {e}, retrying')
simple_download(url, destination, headers, n_tries - 1)
else:
raise e


def download(dest_file_path: [List[Union[str, Path]]], source_url: str, force_download: bool = True,
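Note on the change above: `simple_download` now accepts an `n_tries` argument and re-invokes itself on any exception until the retry budget is exhausted. A minimal usage sketch, with placeholder URL and destination:

```python
from pathlib import Path

from deeppavlov.core.data.utils import simple_download

# Placeholder URL and destination; on any exception the helper retries the
# whole download up to n_tries more times before re-raising.
simple_download(
    url='http://example.com/files/model.tar.gz',
    destination=Path('~/.deeppavlov/downloads/model.tar.gz').expanduser(),
    n_tries=3,
)
```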
3 changes: 2 additions & 1 deletion deeppavlov/requirements/datasets.txt
@@ -1 +1,2 @@
datasets>=1.16.0,<2.5.0
datasets>=1.16.0,<2.5.0;python_version<="3.10"
datasets==2.2.*;python_version=="3.11.*"
3 changes: 2 additions & 1 deletion deeppavlov/requirements/faiss.txt
@@ -1 +1,2 @@
faiss-cpu==1.7.2
faiss-cpu==1.7.2;python_version<="3.10"
faiss-cpu==1.7.4;python_version=="3.11.*"
3 changes: 2 additions & 1 deletion deeppavlov/requirements/kenlm.txt
@@ -1 +1,2 @@
pypi-kenlm==0.1.20220713
pypi-kenlm==0.1.20220713;python_version<="3.10"
kenlm==0.2.*;python_version=="3.11.*"
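The pins in deeppavlov/requirements/*.txt above (and in requirements.txt and setup.py further down) use PEP 508 environment markers, so pip selects a different version depending on the interpreter. A quick way to check which marker matches a given Python version, assuming the `packaging` library is installed:

```python
from packaging.markers import Marker

# Markers copied from the requirement files in this PR.
legacy = Marker('python_version <= "3.10"')
py311 = Marker('python_version == "3.11.*"')

# Evaluate against an explicit environment instead of the running interpreter.
env = {'python_version': '3.11'}
print(legacy.evaluate(env))  # False -> the <=3.10 pin is skipped on 3.11
print(py311.evaluate(env))   # True  -> the 3.11-specific pin is selected
```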
6 changes: 3 additions & 3 deletions docs/features/models/KBQA.ipynb
@@ -22,13 +22,13 @@
" \n",
" 4.2. [Predict using CLI](#4.2-Predict-using-CLI)\n",
"\n",
" 4.3. [Using entity linking and Wiki parser as standalone services for KBQA](#4.3-Using-entity-linking-and-Wiki-parser-as-standalone-services-for-KBQA)\n",
" 4.3. [Using entity linking and Wiki parser as standalone services for KBQA](#4.3-Using-entity-linking-and-Wiki-parser-as-standalone-tools-for-KBQA)\n",
" \n",
"5. [Customize the model](#5.-Customize-the-model)\n",
" \n",
" 5.1. [Train your model from Python](#5.1-Train-your-model-from-Python)\n",
" 5.1. [Description of config parameters](#5.1-Description-of-config-parameters)\n",
" \n",
" 5.2. [Train your model from CLI](#5.2-Train-your-model-from-CLI)\n",
" 5.2. [Train KBQA components](#5.2-Train-KBQA-components)\n",
"\n",
"# 1. Introduction to the task\n",
"\n",
2 changes: 1 addition & 1 deletion docs/features/models/NER.ipynb
@@ -22,7 +22,7 @@
" \n",
" 4.2. [Predict using CLI](#4.2-Predict-using-CLI)\n",
" \n",
"5. [Evaluate](#6.-Evaluate)\n",
"5. [Evaluate](#5.-Evaluate)\n",
" \n",
" 5.1. [Evaluate from Python](#5.1-Evaluate-from-Python)\n",
" \n",
2 changes: 1 addition & 1 deletion docs/features/models/SQuAD.ipynb
@@ -105,7 +105,7 @@
"`squad_bert` is the name of the model's *config_file*. [What is a Config File?](http://docs.deeppavlov.ai/en/master/intro/configuration.html) \n",
"\n",
"Configuration file defines the model and describes its hyperparameters. To use another model, change the name of the *config_file* here and further.\n",
"The full list of the models with their config names can be found in the [table](#6.-Models-list).\n",
"The full list of the models with their config names can be found in the [table](#3.-Models-list).\n",
"\n",
"# 3. Models list\n",
"\n",
14 changes: 7 additions & 7 deletions docs/features/models/classification.ipynb
@@ -162,7 +162,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"## 3.2 Predict using CLI\n",
"## 4.2 Predict using CLI\n",
"\n",
"You can also get predictions in an interactive mode through CLI (Command Line Interface)."
]
@@ -198,9 +198,9 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"# 4. Evaluation\n",
"# 5. Evaluation\n",
"\n",
"## 4.1 Evaluate from Python"
"## 5.1 Evaluate from Python"
]
},
{
@@ -218,7 +218,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"## 4.2 Evaluate from CLI"
"## 5.2 Evaluate from CLI"
]
},
{
@@ -234,9 +234,9 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"# 5. Customize the model\n",
"# 6. Train the model on your data\n",
"\n",
"## 5.1 Train your model from Python\n",
"## 6.1 Train your model from Python\n",
"\n",
"### Provide your data path\n",
"\n",
@@ -346,7 +346,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"## 5.2 Train your model from CLI\n",
"## 6.2 Train your model from CLI\n",
"\n",
"To train the model on your data, create a copy of a config file and change the *data_path* variable in it. After that, train the model using your new *config_file*. You can also change any of the hyperparameters of the model."
]
4 changes: 2 additions & 2 deletions docs/features/models/few_shot_classification.ipynb
@@ -119,7 +119,7 @@
"\n",
"## 4.2 Predict using Python\n",
"\n",
"After [installing](#4.-Get-started-with-the-model) the model, build it from the config and predict."
"After [installing](#2.-Get-started-with-the-model) the model, build it from the config and predict."
]
},
{
@@ -192,7 +192,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"## 4.2 Predict using CLI\n",
"## 4.3 Predict using CLI\n",
"\n",
"You can also get predictions in an interactive mode through CLI (Сommand Line Interface)."
]
2 changes: 1 addition & 1 deletion docs/features/models/morpho_tagger.ipynb
@@ -22,7 +22,7 @@
"\n",
" 4.2. [Predict using CLI](#4.2-Predict-using-CLI)\n",
"\n",
"5. [Customize the model](#4.-Customize-the-model)\n",
"5. [Customize the model](#5.-Customize-the-model)\n",
"\n",
"# 1. Introduction to the task\n",
"\n",
30 changes: 28 additions & 2 deletions docs/features/models/relation_extraction.ipynb
@@ -198,7 +198,7 @@
"|NUM | Percents, money, quantities |\n",
"|MISC | Products, including vehicles, weapons, etc. <br> Events, including elections, battles, sporting MISC events, etc. Laws, cases, languages, etc. |\n",
"\n",
"**Model Output**: one or several of the [97 relations](#5.1-Relations-used-in-English-model) found between the given entities; relation id in [Wikidata](https://www.wikidata.org/wiki/Wikidata:Main_Page) (e.g. 'P26') and relation name ('spouse').\n",
"**Model Output**: one or several of the [97 relations](#6.1-Relations-used-in-English-model) found between the given entities; relation id in [Wikidata](https://www.wikidata.org/wiki/Wikidata:Main_Page) (e.g. 'P26') and relation name ('spouse').\n",
"\n",
"### Russian"
]
@@ -244,8 +244,34 @@
"- list of entities positions (i.e. all start and end positions of both entities' mentions)\n",
"- list of NER tags of both entities.\n",
"\n",
"**Model Output**: one or several of the [30 relations](#5.2-Relations-used-in-Russian-model) found between the given entities; a Russian relation name (e.g. \"участник\") or an English one, if Russian one is unavailable, and, if applicable, its id in [Wikidata](https://www.wikidata.org/wiki/Wikidata:Main_Page) (e.g. 'P710').\n",
"**Model Output**: one or several of the [30 relations](#6.2-Relations-used-in-Russian-model) found between the given entities; a Russian relation name (e.g. \"участник\") or an English one, if Russian one is unavailable, and, if applicable, its id in [Wikidata](https://www.wikidata.org/wiki/Wikidata:Main_Page) (e.g. 'P710').\n",
"\n",
"## 4.2 Predict using CLI\n",
"\n",
"You can also get predictions in an interactive mode through CLI."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"! python -m deeppavlov interact re_docred [-d]\n",
"! python -m deeppavlov interact re_rured [-d]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"`-d` is an optional download key (alternative to `download=True` in Python code). It is used to download the pre-trained model along with embeddings and all other files needed to run the model."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# 5. Customize the model\n",
"\n",
"## 5.1 Description of config parameters\n",
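A Python-API counterpart to the new CLI cells above might look like the sketch below; the commented prediction call is illustrative only, since the exact batch format follows the input description earlier in the notebook:

```python
from deeppavlov import build_model

# download=True fetches the pretrained files, like the -d flag of the CLI command.
re_model = build_model('re_docred', download=True)

# Illustrative only: the model consumes batches of tokens, entity mention
# positions and NER tags, as described in the model-input section above.
# relations = re_model([tokens], [entity_positions], [entity_tags])
```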
2 changes: 1 addition & 1 deletion docs/features/models/spelling_correction.ipynb
@@ -22,7 +22,7 @@
"\n",
" 4.2. [Predict using CLI](#4.2-Predict-using-CLI)\n",
"\n",
"5. [Customize the model](#4.-Customize-the-model)\n",
"5. [Customize the model](#5.-Customize-the-model)\n",
"\n",
" 5.1. [Training configuration](#5.1-Training-configuration)\n",
"\n",
2 changes: 1 addition & 1 deletion docs/features/models/syntax_parser.ipynb
@@ -22,7 +22,7 @@
"\n",
" 4.2. [Predict using CLI](#4.2-Predict-using-CLI)\n",
"\n",
"5. [Customize the model](#4.-Customize-the-model)\n",
"5. [Customize the model](#5.-Customize-the-model)\n",
"\n",
"# 1. Introduction to the task\n",
"\n",
2 changes: 1 addition & 1 deletion docs/intro/installation.rst
@@ -1,7 +1,7 @@
Installation
============

DeepPavlov supports **Linux**, **Windows 10+** (through WSL/WSL2), **MacOS** (Big Sur+) platforms, **Python 3.6-3.10**.
DeepPavlov supports **Linux**, **Windows 10+** (through WSL/WSL2), **MacOS** (Big Sur+) platforms, **Python 3.6-3.11**.
Depending on the model used, you may need from 4 to 16 GB RAM.

Install with pip
2 changes: 1 addition & 1 deletion docs/intro/quick_start.rst
@@ -2,7 +2,7 @@ QuickStart
------------

First, follow instructions on :doc:`Installation page </intro/installation>`
to install ``deeppavlov`` package for Python 3.6/3.7/3.8/3.9/3.10.
to install ``deeppavlov`` package for Python 3.6-3.11.

DeepPavlov contains a bunch of great pre-trained NLP models. Each model is
determined by its config file. List of models is available on
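As a quick smoke test of the widened version range, the standard quickstart flow works unchanged under Python 3.11; the config name below is just one example from the models list:

```python
from deeppavlov import build_model

# 'ner_ontonotes_bert' is one example config; any config from the models list
# can be substituted. download=True fetches the pretrained files on first run.
ner = build_model('ner_ontonotes_bert', download=True)
print(ner(['DeepPavlov now supports Python 3.11']))
```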
3 changes: 2 additions & 1 deletion requirements.txt
@@ -7,7 +7,8 @@ prometheus-client>=0.13.0,<=1.16.0
pydantic<2
pybind11==2.10.3
requests>=2.19.0,<3.0.0
scikit-learn>=0.24,<1.1.0
scikit-learn>=0.24,<1.1.0;python_version<="3.10"
scikit-learn==1.4.0;python_version=="3.11.*"
tqdm>=4.42.0,<4.65.0
uvicorn>=0.13.0,<0.19.0
wheel
16 changes: 11 additions & 5 deletions setup.py
@@ -68,11 +68,17 @@ def readme():
'pexpect'
],
'docs': [
'sphinx==3.5.4;python_version<"3.10"',
'sphinx==4.5.0;python_version>="3.10"',
'sphinx_rtd_theme==0.5.2',
'docutils<0.17,>=0.12',
'nbsphinx==0.8.4',
'sphinx==3.5.4;python_version<="3.7"',
'sphinx==5.0.0;python_version=="3.8"',
'sphinx==5.0.0;python_version=="3.9"',
'sphinx==5.0.0;python_version=="3.10"',
'sphinx==7.2.*;python_version=="3.11.*"',
'sphinx_rtd_theme==0.5.2;python_version<="3.10"',
'sphinx_rtd_theme==2.0.0;python_version=="3.11.*"',
'docutils<0.17,>=0.12;python_version<="3.10"',
'docutils==0.20.1;python_version=="3.11.*"',
'nbsphinx==0.8.4;python_version<="3.10"',
'nbsphinx==0.9.3;python_version=="3.11.*"',
'ipykernel==5.5.4',
'jinja2<=3.0.3',
'sphinx-copybutton==0.5.0',