Python: dev_setup and readme updates (#5770)
This pull request primarily updates the `DEV_SETUP.md` and `README.md`
files in the `python` directory. The changes focus on improving the
setup instructions, updating the code snippets, and providing additional
information on running tests, code coverage, and keeping the local
repository up-to-date.

Changes to setup instructions:

* [`python/DEV_SETUP.md`](diffhunk://#diff-22a026622159be328328c5bb6fef874e8aa799efb6ab8844d0c94ad051b08f3cR75-R90):
Added instructions on how to use Python 3.11 with Poetry, how to install
pre-commit hooks, and how to run tests using VSCode Tasks.
[[1]](diffhunk://#diff-22a026622159be328328c5bb6fef874e8aa799efb6ab8844d0c94ad051b08f3cR75-R90)
[[2]](diffhunk://#diff-22a026622159be328328c5bb6fef874e8aa799efb6ab8844d0c94ad051b08f3cR112-R114)
[[3]](diffhunk://#diff-22a026622159be328328c5bb6fef874e8aa799efb6ab8844d0c94ad051b08f3cR131-R133)
[[4]](diffhunk://#diff-22a026622159be328328c5bb6fef874e8aa799efb6ab8844d0c94ad051b08f3cL287-R252)

Code snippets updates:

* [`python/DEV_SETUP.md`](diffhunk://#diff-22a026622159be328328c5bb6fef874e8aa799efb6ab8844d0c94ad051b08f3cL140-L227):
Updated the code snippets to reflect the latest changes in the codebase.
This includes changes in the use of Pydantic and the use of
KernelBaseModel.
[[1]](diffhunk://#diff-22a026622159be328328c5bb6fef874e8aa799efb6ab8844d0c94ad051b08f3cL140-L227)
[[2]](diffhunk://#diff-22a026622159be328328c5bb6fef874e8aa799efb6ab8844d0c94ad051b08f3cL244-R177)
* [`python/README.md`](diffhunk://#diff-217ed82f87b78b399e304b07a159b27dff327c0f83adf4a2fc30b03bcbf84b01L58-R66):
Updated the code snippet to reflect the change in getting request
settings from a service ID.

Additional information:

* [`python/DEV_SETUP.md`](diffhunk://#diff-22a026622159be328328c5bb6fef874e8aa799efb6ab8844d0c94ad051b08f3cL287-R252):
Added information on how to run code coverage, how to keep the local
repository up-to-date, and how to resolve conflicts after a rebase.
* [`python/README.md`](diffhunk://#diff-217ed82f87b78b399e304b07a159b27dff327c0f83adf4a2fc30b03bcbf84b01R7-R14):
Added instructions on how to install optional dependencies.
eavanvalkenburg authored Apr 4, 2024
1 parent d050f8d commit 0a75ba8
Showing 3 changed files with 69 additions and 90 deletions.
2 changes: 1 addition & 1 deletion python/.vscode/tasks.json
@@ -32,7 +32,7 @@
            }
        },
        {
            "label": "Python: Run Checks - PR",
            "label": "Python: Run Checks - Staged",
            "type": "shell",
            "command": "poetry",
            "args": [
147 changes: 59 additions & 88 deletions python/DEV_SETUP.md
@@ -72,15 +72,22 @@ Note: SK requires at least Poetry 1.2.0.

```bash
# Install poetry package
pip3 install poetry

# optionally, define which python version you want to use
poetry env use python3.11

# Use poetry to install base project dependencies
poetry install

# If you want to use connectors such as hugging face
# poetry install --with <connector group name>
# If you want to get all dependencies for tests installed, use
# poetry install --with tests
# example: poetry install --with hugging_face

# Use poetry to activate project venv
poetry shell

# Optionally, you can install the pre-commit hooks
poetry run pre-commit install
# this will run linters and mypy checks on all the changed code.
```

## VSCode Setup
@@ -90,8 +97,7 @@ command from the command palette. Make sure the virtual env (venv) created by
`poetry` is selected.
The python you're looking for should be under `~/.cache/pypoetry/virtualenvs/semantic-kernel-.../bin/python`.

If prompted, install `black` and `flake8` (if VSCode doesn't find those packages,
it will prompt you to install them).
If prompted, install `ruff` and `black` (these should have been installed as part of `poetry install`).

## Tests

@@ -103,6 +109,9 @@ You can run the unit tests under the [tests/unit](tests/unit/) folder.

```bash
poetry run pytest tests/unit
```

Alternatively, you can run them using VSCode Tasks. Open the command palette
(`Ctrl+Shift+P`) and type `Tasks: Run Task`. Select `Python: Tests - Unit` or `Python: Tests - Code Coverage` from the list.

You can run the integration tests under the [tests/integration](tests/integration/) folder.

```bash
poetry run pytest tests/integration
```

@@ -119,6 +128,9 @@ You can also run all the tests together under the [tests](tests/) folder.

```bash
poetry run pytest tests
```

Alternatively, you can run them using VSCode Tasks. Open the command palette
(`Ctrl+Shift+P`) and type `Tasks: Run Task`. Select `Python: Tests - All` from the list.

## Tools and scripts

## Implementation Decisions
@@ -137,94 +149,13 @@ with either `async def` or `def` to understand if something is asynchronous or not

This section describes how one can enable serialization for their class using Pydantic.

IMPORTANT: This document (and SemanticKernel) currently use Pydantic 1.x. When SK is upgraded
to use Pydantic 2.x, this document will be upgraded accordingly.

### Terminology

There are 3 types of classes you need to be aware of when enabling serialization with Pydantic:

1. Classes which contain no data - examples are Protocols, ABC subclasses and any other classes
that don't contain any data that needs to be serialized.
2. Classes which contain data that need to be serialized, but don't contain any generic classes.
3. Classes which contain data that need to be serialized, AND contain generic classes.

### Upgrading existing classes to use Pydantic

#### Classes without any data

Let's take the following classes as examples - 1 ABC, 1 Protocol, and 1 class that only contains
data that doesn't need to be serialized.

```python
class A(Protocol):
    def some_method(self, *args, **kwargs): ...

class B(ABC):
    def some_method(self, *args, **kwargs): ...

class C:
    def __init__(self):
        # IMPORTANT: These variables are NOT being passed into the initializer
        # so they don't need to be serialized. If they are though, you'll have
        # to treat this as a class that contains data that needs to be serialized
        self._a = ...
```

For `Protocol` subclasses, nothing needs to be done, and they can be left as is.

For the remaining types, SemanticKernel provides a class named `PydanticField`. Subclassing
from this field is sufficient to have these types of classes as valid Pydantic fields, and allows
any class using them as attributes to be serialized.

```python
from semantic_kernel.kernel_pydantic import PydanticField

class B(PydanticField): ... # correct, B is still an ABC because PydanticField subclasses ABC
class B(PydanticField, ABC): ... # Also correct
class B(ABC, PydanticField): ... # ERROR: Python cannot find a valid super class ordering.

class C(PydanticField): ... # No other changes needed
```

The classes B and C can now be used as valid Pydantic Field annotations.

```python
from pydantic import BaseModel

class MyModel(BaseModel):
    b: B
    c: C
```

Class A can only be used as a Pydantic Field annotation for a Pydantic BaseModel subclass
which is configured to allow arbitrary field types like so:

```python
from pydantic import BaseModel

class IncorrectModel(BaseModel):
    a: A  # Pydantic error

class CorrectModel(BaseModel):
    a: A  # Okay
    class Config:  # Configuration that tells Pydantic to allow field types that it can't serialize
        arbitrary_types_allowed = True
```
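
The snippet above uses the Pydantic 1-style inner `Config` class. Since this PR moves the other snippets toward Pydantic 2 and `KernelBaseModel`, the equivalent configuration in Pydantic 2 is `model_config`; a minimal, illustrative sketch (the model name is made up for this example):

```python
from pydantic import BaseModel, ConfigDict

class CorrectModelV2(BaseModel):
    # Pydantic 2 replaces the inner Config class with model_config.
    model_config = ConfigDict(arbitrary_types_allowed=True)

    # `A` is the data-less Protocol class from the example earlier in this section.
    a: A  # Okay: the field type no longer has to be something Pydantic can serialize
```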

#### Classes with data, but no Generic types that need to be serialized

If your class has any data that needs to be serialized, but the field annotation for that data type
in your class is not a Generic type, this section applies to you.

Let's take the following example:

```python
class A:
    def __init__(self, a: int, b: float, c: List[float], d: dict[str, tuple[float, str]] = {}):
        # Since a, b, c and d are needed to initialize this class, they need to be serialized
        # if they can be serialized.
        # Although a, b, c and d are builtin python types, any valid pydantic field can be used
        # here. This includes the classes defined in the previous category.
        self.a = a
        self.b = b
        self.c = c
        self.d = d
```

@@ -241,9 +172,9 @@

```python
class A(KernelBaseModel):
    # The notation for the fields is similar to dataclasses.
    a: int
    b: float
    c: List[float]
    c: list[float]
    # Only, instead of using dataclasses.field, you would use pydantic.Field
    d: dict[str, tuple[flost, str]] = Field(default_factory=dict)
    d: dict[str, tuple[float, str]] = Field(default_factory=dict)
```
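
To make the benefit concrete, here is a small round-trip sketch based on the class above (assuming `KernelBaseModel` behaves like a plain Pydantic v2 `BaseModel`, i.e. it exposes `model_dump_json` and `model_validate_json`):

```python
from pydantic import Field

from semantic_kernel.kernel_pydantic import KernelBaseModel


class A(KernelBaseModel):
    a: int
    b: float
    c: list[float]
    d: dict[str, tuple[float, str]] = Field(default_factory=dict)


# Serialize to JSON and rebuild an equal instance from that JSON.
original = A(a=1, b=2.5, c=[0.1, 0.2], d={"x": (1.0, "one")})
payload = original.model_dump_json()
restored = A.model_validate_json(payload)
assert restored == original
```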

#### Classes with data that need to be serialized, and some of them are Generic types
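
The body of this subsection is collapsed in the diff above. As a rough, hedged sketch of the general idea only (not necessarily the exact guidance this section gives), a generic model can be written by parameterizing the class with `typing.Generic`, assuming `KernelBaseModel` supports standard Pydantic generic models:

```python
from typing import Generic, TypeVar

from semantic_kernel.kernel_pydantic import KernelBaseModel

T = TypeVar("T")


class Wrapper(KernelBaseModel, Generic[T]):
    # The generic payload is an ordinary field; the concrete type is bound at use time.
    value: T


# Bind the type parameter when annotating or constructing.
ints = Wrapper[int](value=3)
floats = Wrapper[list[float]](value=[0.1, 0.2])
```
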
@@ -284,5 +215,45 @@ To run the same checks that run during the GitHub Action build, you can use
this command, from the [python](../python) folder:

```bash
poetry run pre-commit run -c ../.pre-commit-config.yaml -a
poetry run pre-commit run -a
```

or use the following task (using `Ctrl+Shift+P`):
- `Python - Run Checks` to run the checks on the whole project.
- `Python - Run Checks - Staged` to run the checks on the currently staged files only.

Ideally you should run these checks before committing any changes; use `poetry run pre-commit install` to set that up.

## Code Coverage

We try to maintain a high code coverage for the project. To run the code coverage on the unit tests, you can use the following command:

```bash
cd python
poetry run pytest --cov=semantic_kernel --cov-report=term-missing:skip-covered tests/unit/
```
or use the following task (using `Ctrl+Shift+P`):
- `Python: Tests - Code Coverage` to run the code coverage on the whole project.

This will show you which files are not covered by the tests, including the specific lines not covered.

## Catching up with the latest changes

There are many people committing to Semantic Kernel, so it is important to keep your local repository up to date. To do this, you can run the following commands:

```bash
git fetch upstream main
git rebase upstream/main
git push --force-with-lease
```
or:

```bash
git fetch upstream main
git merge upstream/main
git push
```

This assumes the `upstream` remote points to the main repository. If your remote for the main repository has a different name, replace `upstream` with that name.

After running the rebase command, you may need to resolve any conflicts that arise. If you are unsure how to resolve a conflict, please refer to [GitHub's documentation on resolving conflicts](https://docs.github.com/en/get-started/using-git/resolving-merge-conflicts-after-a-git-rebase), or to the [VSCode documentation](https://code.visualstudio.com/docs/sourcecontrol/overview#_merge-conflicts).

10 changes: 9 additions & 1 deletion python/README.md
@@ -4,6 +4,14 @@ Install the latest package:

python -m pip install --upgrade semantic-kernel

If you want to use some of the optional dependencies (OpenAI is installed by default), you can install them with:

python -m pip install --upgrade semantic-kernel[hugging_face]

or all of them:

python -m pip install --upgrade semantic-kernel[all]

# AI Services

## OpenAI / Azure OpenAI API keys
@@ -55,7 +63,7 @@ kernel.add_service(
# )

# Define the request settings
req_settings = kernel.get_service(service_id).get_prompt_execution_settings_class()(service_id=service_id)
req_settings = kernel.get_prompt_execution_settings_from_service_id(service_id)
req_settings.max_tokens = 2000
req_settings.temperature = 0.7
req_settings.top_p = 0.8
