This document describes how to set up your environment with Python and Poetry if you're working on new features or a bug fix for Semantic Kernel, or if you simply want to run the included tests.
Make sure you have an OpenAI API Key or Azure OpenAI service key. Copy those keys into a `.env` file (see the `.env.example` file):
```
OPENAI_API_KEY=""
OPENAI_ORG_ID=""
AZURE_OPENAI_DEPLOYMENT_NAME=""
AZURE_OPENAI_ENDPOINT=""
AZURE_OPENAI_API_KEY=""
```
We suggest adding a copy of the `.env` file under these folders:
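If you want to sanity-check that your keys are picked up, here is a minimal sketch using the `python-dotenv` package (an assumption for illustration; SK's own configuration loading may differ):

```python
import os

from dotenv import load_dotenv  # provided by the python-dotenv package

# Load variables from the nearest .env file into the process environment.
load_dotenv()

# Fail fast if neither provider is configured.
if not (os.getenv("OPENAI_API_KEY") or os.getenv("AZURE_OPENAI_API_KEY")):
    raise RuntimeError("Add OPENAI_API_KEY or AZURE_OPENAI_API_KEY to your .env file")
```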
To get started, you'll need VSCode and a local installation of Python 3.8+.

You can run:

```bash
python3 --version ; pip3 --version ; code -v
```

to verify that you have the required dependencies.
Check that you've cloned the repository to `~/workspace` or a similar folder. Avoid `/mnt/c/` and prefer using your WSL user's home directory.

Ensure you have the WSL and Python extensions for VSCode installed.
You'll also need `pip3` installed. If you don't yet have a `python3` installation in WSL, you can run:

```bash
sudo apt-get update && sudo apt-get install python3 python3-pip
```
ℹ️ Note: if you don't have your PATH set up to find executables installed by `pip3`, you may need to run `~/.local/bin/poetry install` and `~/.local/bin/poetry shell` instead. You can fix this by adding `export PATH="$HOME/.local/bin:$PATH"` to your `~/.bashrc` and closing/re-opening the terminal.
Poetry allows you to use SK from the local files, without worrying about paths, as if you had the SK pip package installed.
To install Poetry on your system, first navigate to the directory containing this README using your chosen shell. You will need Python 3.8+ installed.

Install the Poetry package manager and create a project virtual environment. Note: SK requires at least Poetry 1.2.0.
```bash
# Install the Poetry package manager
pip3 install poetry

# Use Poetry to install the base project dependencies
poetry install

# If you want to use connectors such as Hugging Face:
# poetry install --with <connector group name>
# example: poetry install --with hugging_face

# Use Poetry to activate the project venv
poetry shell
```
Open any of the `.py` files in the project and run the `Python: Select Interpreter` command from the command palette. Make sure the virtual env (venv) created by `poetry` is selected. The Python you're looking for should be under `~/.cache/pypoetry/virtualenvs/semantic-kernel-.../bin/python`.
If prompted, install `black` and `flake8` (if VSCode doesn't find those packages, it will prompt you to install them).
You can run the unit tests under the `tests/unit` folder.

```bash
cd python
poetry install
poetry run pytest tests/unit
```
You can run the integration tests under the `tests/integration` folder.

```bash
cd python
poetry install
poetry run pytest tests/integration
```
You can also run all the tests together under the `tests` folder.

```bash
cd python
poetry install
poetry run pytest tests
```
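For reference, tests in these folders are plain `pytest` functions; a hypothetical example (not an actual file in the repo) might look like this:

```python
# tests/unit/test_example.py -- hypothetical file, for illustration only
import pytest


def test_addition_is_commutative():
    assert 1 + 2 == 2 + 1


@pytest.mark.asyncio  # requires the pytest-asyncio plugin
async def test_async_code():
    async def double(x: int) -> int:
        return 2 * x

    assert await double(21) == 42
```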
It's important to note that most of this library is written with asynchrony in mind. The developer should always assume everything is asynchronous. You can use the function signature, `async def` versus `def`, to tell whether something is asynchronous or not.
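For example (with illustrative function names, not SK APIs), a plain `def` is called directly, while an `async def` must be awaited or driven by `asyncio.run` from synchronous code:

```python
import asyncio
from typing import List


def tokenize(text: str) -> List[str]:
    # Plain `def`: call it directly.
    return text.split()


async def complete(prompt: str) -> str:
    # `async def`: must be awaited (or run via asyncio.run from sync code).
    await asyncio.sleep(0)  # stand-in for a real network call
    return prompt.upper()


print(tokenize("hello world"))
print(asyncio.run(complete("hello")))
```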
This section describes how you can enable serialization for your classes using Pydantic.

IMPORTANT: This document (and SemanticKernel) currently uses Pydantic 1.x. When SK is upgraded to Pydantic 2.x, this document will be updated accordingly.
There are three types of classes you need to be aware of when enabling serialization with Pydantic:

- Classes which contain no data: examples are Protocols, ABC subclasses, and any other classes that don't contain any data that needs to be serialized.
- Classes which contain data that needs to be serialized, but don't contain any generic classes.
- Classes which contain data that needs to be serialized AND contain generic classes.
Let's take the following classes as examples: one ABC, one Protocol, and one class that only contains data that doesn't need to be serialized.
```python
from abc import ABC
from typing import Protocol


class A(Protocol):
    def some_method(self, *args, **kwargs): ...


class B(ABC):
    def some_method(self, *args, **kwargs): ...


class C:
    def __init__(self):
        # IMPORTANT: These variables are NOT being passed into the initializer,
        # so they don't need to be serialized. If they are, though, you'll have
        # to treat this as a class that contains data that needs to be serialized.
        self._a = ...
```
For `Protocol` subclasses, nothing needs to be done, and they can be left as is. For the remaining types, SemanticKernel provides a class named `PydanticField`. Subclassing from this class is sufficient to make these types valid Pydantic fields, and it allows any class using them as attributes to be serialized.
```python
from abc import ABC

from semantic_kernel.kernel_pydantic import PydanticField


class B(PydanticField): ...      # Correct: B is still an ABC because PydanticField subclasses ABC
class B(PydanticField, ABC): ... # Also correct
class B(ABC, PydanticField): ... # ERROR: Python cannot find a valid superclass ordering (MRO conflict)
class C(PydanticField): ...      # No other changes needed
```
The classes `B` and `C` can now be used as valid Pydantic field annotations.
```python
from pydantic import BaseModel


class MyModel(BaseModel):
    b: B
    c: C
```
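Serialization then works the usual Pydantic 1.x way; a quick round-trip sketch, relying on the statement above that `PydanticField` subclasses are valid, serializable fields (and on `B` and `C` being concrete, since they declare no abstract methods):

```python
model = MyModel(b=B(), c=C())

json_str = model.json()                 # serialize to a JSON string
restored = MyModel.parse_raw(json_str)  # deserialize back into a MyModel
```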
Class `A` can only be used as a Pydantic field annotation for a Pydantic `BaseModel` subclass that is configured to allow arbitrary field types, like so:
```python
from pydantic import BaseModel


class IncorrectModel(BaseModel):
    a: A  # Pydantic error


class CorrectModel(BaseModel):
    a: A  # Okay

    class Config:  # Configuration that tells Pydantic to allow field types that it can't serialize
        arbitrary_types_allowed = True
```
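Note that in Pydantic 1.x the `IncorrectModel` failure happens at class-definition time, not at validation time; a sketch of what you'd see:

```python
try:
    class IncorrectModel(BaseModel):
        a: A
except RuntimeError as exc:
    # Pydantic 1.x raises something like:
    # "no validator found for <class 'A'>, see `arbitrary_types_allowed` in Config"
    print(exc)
```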
If your class has any data that needs to be serialized, but the field annotation for that data type in your class is not a Generic type, this section applies to you.
Let's take the following example:
```python
from typing import Dict, List, Tuple


class A:
    def __init__(self, a: int, b: float, c: List[float], d: Dict[str, Tuple[float, str]] = {}):
        # Since a, b, c and d are needed to initialize this class, they need to be
        # serialized, if they can be serialized.
        # Although a, b, c and d are builtin python types, any valid pydantic field
        # can be used here. This includes the classes defined in the previous category.
        self.a = a
        self.b = b
        self.c = c
        self.d = d
```
You would convert this to a Pydantic class by subclassing from the `KernelBaseModel` class:
```python
from typing import Dict, List, Tuple

from pydantic import Field

from semantic_kernel.kernel_pydantic import KernelBaseModel


class A(KernelBaseModel):
    # The notation for the fields is similar to dataclasses.
    a: int
    b: float
    c: List[float]
    # Only, instead of using dataclasses.field, you use pydantic.Field.
    d: Dict[str, Tuple[float, str]] = Field(default_factory=dict)
```
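Since `KernelBaseModel` is a Pydantic model, instances of `A` now validate their fields and round-trip through JSON; for example:

```python
a = A(a=1, b=2.0, c=[0.5, 0.25], d={"key": (1.0, "value")})

json_str = a.json()              # serialize to a JSON string
restored = A.parse_raw(json_str) # deserialize back into an A
assert restored == a             # Pydantic models compare field-by-field
```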
Let's take the following example:
```python
from typing import TypeVar

T1 = TypeVar("T1")
T2 = TypeVar("T2", bound=<some class>)  # placeholder: bind T2 to your base class


class A:
    def __init__(self, a: int, b: T1, c: T2):
        self.a = a
        self.b = b
        self.c = c
```
You can use `KernelBaseModel` to convert these to Pydantic serializable classes:
```python
from typing import Generic

from semantic_kernel.kernel_pydantic import KernelBaseModel


class A(KernelBaseModel, Generic[T1, T2]):
    # T1 and T2 must be specified in the Generic argument; otherwise,
    # pydantic will NOT be able to serialize this class.
    a: int
    b: T1
    c: T2
```
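As an illustrative sketch of usage, with a hypothetical `Base` class standing in for whatever `T2` is bound to (exact behavior for `TypeVar` fields can vary across Pydantic 1.x versions):

```python
# Hypothetical concrete class standing in for T2's bound.
class Base(KernelBaseModel):
    x: int = 0


a = A(a=1, b="any value", c=Base())  # b: T1 is unbound, so it accepts any value
print(a.json())
```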
To run the same checks that run during the GitHub Action build, you can use this command from the `python` folder:

```bash
poetry run pre-commit run -c .conf/.pre-commit-config.yaml -a
```