This document describes how to setup your environment with Python and Poetry, if you're working on new features or a bug fix for Semantic Kernel, or simply want to run the tests included.
Make sure you have an OpenAI API Key or Azure OpenAI service key
There are two methods to manage keys, secrets, and endpoints:
-
Store them in environment variables. SK Python leverages pydantic settings to load keys, secrets, and endpoints. This means that there is a first attempt to load them from environment variables. The
.env
file naming applies to how the names should be stored as environment variables. -
If you'd like to use the
.env
file, you will need to configure the.env
file with the following keys into a.env
file (see the.env.example
file):
OPENAI_API_KEY=""
OPENAI_ORG_ID=""
AZURE_OPENAI_CHAT_DEPLOYMENT_NAME=""
AZURE_OPENAI_TEXT_DEPLOYMENT_NAME=""
AZURE_OPENAI_EMBEDDING_DEPLOYMENT_NAME=""
AZURE_OPENAI_ENDPOINT=""
AZURE_OPENAI_API_KEY=""
You will then configure the Text/ChatCompletion class with the keyword argument env_file_path
:
chat_completion = OpenAIChatCompletion(service_id="test", env_file_path=<path_to_file>)
This optional env_file_path
parameter will allow pydantic settings to use the .env
file as a fallback to read the settings.
If using the second method, we suggest adding a copy of the .env
file under these folders:
To get started, you'll need VSCode and a local installation of Python 3.8+.
You can run:
python3 --version ; pip3 --version ; code -v
to verify that you have the required dependencies.
Check that you've cloned the repository to ~/workspace
or a similar folder.
Avoid /mnt/c/
and prefer using your WSL user's home directory.
Ensure you have the WSL extension for VSCode installed (and the Python extension for VSCode installed).
You'll also need pip3
installed. If you don't yet have a python3
install in WSL,
you can run:
sudo apt-get update && sudo apt-get install python3 python3-pip
ℹ️ Note: if you don't have your PATH setup to find executables installed by pip3
,
you may need to run ~/.local/bin/poetry install
and ~/.local/bin/poetry shell
instead. You can fix this by adding export PATH="$HOME/.local/bin:$PATH"
to
your ~/.bashrc
and closing/re-opening the terminal._
Poetry allows to use SK from the local files, without worrying about paths, as if you had SK pip package installed.
To install Poetry in your system, first, navigate to the directory containing this README using your chosen shell. You will need to have Python 3.10, 3.11, or 3.12 installed.
Install the Poetry package manager and create a project virtual environment. Note: SK requires at least Poetry 1.2.0.
It is best to install Poetry using their official installer.
On MacOS, you might find that python
commands are not recognized by default,
and you can only use python3
. To make it easier to run python ...
commands
(which Poetry requires), you can create an alias in your shell configuration file.
Follow these steps:
-
Open your shell configuration file:
- For Bash:
nano ~/.bash_profile
ornano ~/.bashrc
- For Zsh (default on macOS Catalina and later):
nano ~/.zshrc
- For Bash:
-
Add the alias:
alias python='python3'
-
Save the file and exit:
- In
nano
, pressCTRL + X
, thenY
, and hitEnter
.
- In
-
Apply the changes:
- For Bash:
source ~/.bash_profile
orsource ~/.bashrc
- For Zsh:
source ~/.zshrc
- For Bash:
After these steps, you should be able to use python
in your terminal to run
Python 3 commands.
# Install poetry package if not choosing to install via their official installer
pip3 install poetry
# optionally, define which python version you want to use
poetry env use python3.11
# Use poetry to install base project dependencies
poetry install
# If you want to get all dependencies for tests installed, use
# poetry install --with tests
# example: poetry install --with hugging_face
# Use poetry to activate project venv
poetry shell
# Optionally, you can install the pre-commit hooks
poetry run pre-commit install
# this will run linters and mypy checks on all the changed code.
Open the workspace in VSCode.
The Python workspace is the
./python
folder if you are at the root of the repository.
Open any of the .py
files in the project and run the Python: Select Interpreter
command from the command palette. Make sure the virtual env (venv) created by
poetry
is selected.
The python you're looking for should be under ~/.cache/pypoetry/virtualenvs/semantic-kernel-.../bin/python
.
If prompted, install ruff
. (It should have been installed as part of poetry install
).
You also need to install the ruff
extension in VSCode so that auto-formatting uses the ruff
formatter on save.
Read more about the extension here: https://github.com/astral-sh/ruff-vscode
You can run the unit tests under the tests/unit folder.
poetry install
poetry run pytest tests/unit
Alternatively, you can run them using VSCode Tasks. Open the command palette
(Ctrl+Shift+P
) and type Tasks: Run Task
. Select Python: Tests - Unit
or Python: Tests - Code Coverage
from the list.
You can run the integration tests under the tests/integration folder.
poetry install
poetry run pytest tests/integration
You can also run all the tests together under the tests folder.
poetry install
poetry run pytest tests
Alternatively, you can run them using VSCode Tasks. Open the command palette
(Ctrl+Shift+P
) and type Tasks: Run Task
. Select Python: Tests - All
from the list.
It's important to note that most of this library is written with asynchronous in mind. The
developer should always assume everything is asynchronous. One can use the function signature
with either async def
or def
to understand if something is asynchronous or not.
Each file should have a single first line containing: # Copyright (c) Microsoft. All rights reserved.
We follow the Google Docstring style guide for functions and methods. They are currently not checked for private functions (functions starting with '_').
They should contain:
- Single line explaining what the function does, ending with a period.
- If necessary to further explain the logic a newline follows the first line and then the explanation is given.
- The following three sections are optional, and if used should be separated by a single empty line.
- Arguments are then specified after a header called
Args:
, with each argument being specified in the following format:arg_name
(arg_type
): Explanation of the argument, arg_type is optional, as long as you are consistent.- if a longer explanation is needed for a argument, it should be placed on the next line, indented by 4 spaces.
- Default values do not have to be specified, they will be pulled from the definition.
- Returns are specified after a header called
Returns:
orYields:
, with the return type and explanation of the return value. - Finally, a header for exceptions can be added, called
Raises:
, with each exception being specified in the following format:ExceptionType
: Explanation of the exception.- if a longer explanation is needed for a exception, it should be placed on the next line, indented by 4 spaces.
Putting them all together, gives you at minimum this:
def equal(arg1: str, arg2: str) -> bool:
"""Compares two strings and returns True if they are the same."""
...
Or a complete version of this:
def equal(arg1: str, arg2: str) -> bool:
"""Compares two strings and returns True if they are the same.
Here is extra explanation of the logic involved.
Args:
arg1 (str): The first string to compare.
arg2 (str): The second string to compare.
This string requires extra explanation.
Returns:
bool: True if the strings are the same, False otherwise.
Raises:
ValueError: If one of the strings is empty.
"""
...
If in doubt, use the link above to read much more considerations of what to do and when, or use common sense.
This section describes how one can enable serialization for their class using Pydantic.
Let's take the following example:
class A:
def __init__(self, a: int, b: float, c: List[float], d: dict[str, tuple[float, str]] = {}):
self.a = a
self.b = b
self.c = c
self.d = d
You would convert this to a Pydantic class by subclassing from the KernelBaseModel
class.
from pydantic import Field
from semantic_kernel.kernel_pydantic import KernelBaseModel
class A(KernelBaseModel):
# The notation for the fields is similar to dataclasses.
a: int
b: float
c: list[float]
# Only, instead of using dataclasses.field, you would use pydantic.Field
d: dict[str, tuple[float, str]] = Field(default_factory=dict)
Let's take the following example:
from typing import TypeVar
T1 = TypeVar("T1")
T2 = TypeVar("T2", bound=<some class>)
class A:
def __init__(a: int, b: T1, c: T2):
self.a = a
self.b = b
self.c = c
You can use the KernelBaseModel
to convert these to pydantic serializable classes.
from typing import Generic
from semantic_kernel.kernel_pydantic import KernelBaseModel
class A(KernelBaseModel, Generic[T1, T2]):
# T1 and T2 must be specified in the Generic argument otherwise, pydantic will
# NOT be able to serialize this class
a: int
b: T1
c: T2
To run the same checks that run during the GitHub Action build, you can use this command, from the python folder:
poetry run pre-commit run -a
or use the following task (using Ctrl+Shift+P
):
Python - Run Checks
to run the checks on the whole project.Python - Run Checks - Staged
to run the checks on the currently staged files only.
Ideally you should run these checks before committing any changes, use poetry run pre-commit install
to set that up.
We try to maintain a high code coverage for the project. To run the code coverage on the unit tests, you can use the following command:
poetry run pytest --cov=semantic_kernel --cov-report=term-missing:skip-covered tests/unit/
or use the following task (using Ctrl+Shift+P
):
Python: Tests - Code Coverage
to run the code coverage on the whole project.
This will show you which files are not covered by the tests, including the specific lines not covered.
There are many people committing to Semantic Kernel, so it is important to keep your local repository up to date. To do this, you can run the following commands:
git fetch upstream main
git rebase upstream/main
git push --force-with-lease
or:
git fetch upstream main
git merge upstream/main
git push
This is assuming the upstream branch refers to the main repository. If you have a different name for the upstream branch, you can replace upstream
with the name of your upstream branch.
After running the rebase command, you may need to resolve any conflicts that arise. If you are unsure how to resolve a conflict, please refer to the GitHub's documentation on resolving conflicts, or for VSCode.