The ConfigLoader and TemplatedConfigLoader classes have been deprecated since Kedro 0.18.12 and were removed in Kedro 0.19.0. To use that release or later, you must adopt the OmegaConfigLoader.
This migration guide outlines the primary distinctions between the old loaders and the OmegaConfigLoader, providing step-by-step instructions on updating your code base to utilise the new class effectively.
The OmegaConfigLoader was introduced in Kedro 0.18.5 and is based on OmegaConf. To use it you need Kedro (version 0.18.5 or later) and omegaconf installed.
You can install both using pip:
pip install kedro==0.18.5
This is the minimum required Kedro version, and it includes omegaconf as a dependency.
Or you can run:
pip install -U kedro
This command installs the most recent version of Kedro, which also includes omegaconf as a dependency.
To use OmegaConfigLoader in your project, set the CONFIG_LOADER_CLASS constant in your src/<package_name>/settings.py:
+ from kedro.config import OmegaConfigLoader # new import
+ CONFIG_LOADER_CLASS = OmegaConfigLoader
Replace the import statement for ConfigLoader with the one for OmegaConfigLoader:
- from kedro.config import ConfigLoader
+ from kedro.config import OmegaConfigLoader
OmegaConfigLoader supports only yaml and json file formats. Make sure that all your configuration files are in one of these formats. If you previously used other formats with ConfigLoader, convert them to yaml or json.
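Before switching loaders, it can help to list the files that would need converting. The following is a minimal sketch using only the standard library. The helper name find_unsupported_config_files is hypothetical, and the suffix set (.yml, .yaml, .json) and the skipping of hidden files are assumptions about what the loader accepts, so adjust them to your project.

```python
from pathlib import Path

# Assumed set of suffixes accepted by OmegaConfigLoader.
SUPPORTED_SUFFIXES = {".yml", ".yaml", ".json"}


def find_unsupported_config_files(conf_source):
    """Return config files under conf_source that are neither yaml nor json."""
    root = Path(conf_source)
    return sorted(
        path
        for path in root.rglob("*")
        if path.is_file()
        and not path.name.startswith(".")  # skip hidden files such as .gitkeep
        and path.suffix.lower() not in SUPPORTED_SUFFIXES
    )


if __name__ == "__main__":
    # "conf" is the conventional Kedro configuration folder; change as needed.
    for path in find_unsupported_config_files("conf"):
        print(f"needs conversion: {path}")
```

Any path this prints is a candidate for conversion to yaml or json.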
Loading configuration with OmegaConfigLoader differs slightly from ConfigLoader, which exposed configuration through the .get() method and required patterns as an argument.
OmegaConfigLoader instead requires you to fetch configuration through a configuration key that points to configuration patterns specified in the loader class or provided in CONFIG_LOADER_ARGS in settings.py.
- conf_path = str(project_path / settings.CONF_SOURCE)
- conf_loader = ConfigLoader(conf_source=conf_path, env="local")
- catalog = conf_loader.get("catalog*")
+ conf_path = str(project_path / settings.CONF_SOURCE)
+ config_loader = OmegaConfigLoader(conf_source=conf_path, env="local")
+ catalog = config_loader["catalog"]
In this example, "catalog" is the key to the default catalog patterns specified in the OmegaConfigLoader class.
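For a key that has no default patterns, the patterns can be supplied through CONFIG_LOADER_ARGS. The fragment below is a sketch of that option. The "spark" key and its patterns are hypothetical and chosen only for illustration, and the fragment assumes the config_patterns entry is merged with the loader's default patterns.

```python
# src/<package_name>/settings.py (fragment)
CONFIG_LOADER_ARGS = {
    "config_patterns": {
        # Hypothetical extra key: files such as conf/base/spark.yml
        # would then be reachable as config_loader["spark"].
        "spark": ["spark*", "spark*/**"],
    }
}
```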
Most errors and exceptions are the same in both loaders. The differences between the original ConfigLoader and OmegaConfigLoader that you need to be aware of are as follows:
- OmegaConfigLoader throws a MissingConfigException when configuration paths don't exist, rather than the ValueError used in ConfigLoader.
- In OmegaConfigLoader, bad syntax in your configuration files triggers a ParserError instead of the BadConfigException used in ConfigLoader.
The OmegaConfigLoader was introduced in Kedro 0.18.5 and is based on OmegaConf. Features that replace TemplatedConfigLoader functionality have been released in later versions, so we recommend users install Kedro version 0.18.13 or later to properly replace the TemplatedConfigLoader with OmegaConfigLoader.
You can install both this Kedro version and omegaconf using pip:
pip install "kedro>=0.18.13"
This is the minimum required Kedro version that includes omegaconf as a dependency and the functionality needed to replace TemplatedConfigLoader.
Or you can run:
pip install -U kedro
This command installs the most recent version of Kedro, which also includes omegaconf as a dependency.
To use OmegaConfigLoader in your project, set the CONFIG_LOADER_CLASS constant in your src/<package_name>/settings.py:
+ from kedro.config import OmegaConfigLoader # new import
+ CONFIG_LOADER_CLASS = OmegaConfigLoader
Replace the import statement for TemplatedConfigLoader with the one for OmegaConfigLoader:
- from kedro.config import TemplatedConfigLoader
+ from kedro.config import OmegaConfigLoader
OmegaConfigLoader supports only yaml and json file formats. Make sure that all your configuration files are in one of these formats. If you were using other formats with TemplatedConfigLoader, convert them to yaml or json.
Loading configuration with OmegaConfigLoader differs slightly from TemplatedConfigLoader, which exposed configuration through the .get() method and required patterns as an argument.
OmegaConfigLoader instead requires you to fetch configuration through a configuration key that points to configuration patterns specified in the loader class or provided in CONFIG_LOADER_ARGS in settings.py.
- conf_path = str(project_path / settings.CONF_SOURCE)
- conf_loader = TemplatedConfigLoader(conf_source=conf_path, env="local")
- catalog = conf_loader.get("catalog*")
+ conf_path = str(project_path / settings.CONF_SOURCE)
+ config_loader = OmegaConfigLoader(conf_source=conf_path, env="local")
+ catalog = config_loader["catalog"] # note the key accessor syntax
In this example, the "catalog" key points to the default catalog patterns specified in the OmegaConfigLoader class.
Templating of values is done through native variable interpolation in OmegaConfigLoader. Where TemplatedConfigLoader required you to provide the template values in a globals file or dictionary, OmegaConfigLoader lets you provide these values within the same file that has the placeholders, or in a file whose name follows the same config pattern.
The variable interpolation is scoped to a specific configuration type and environment. If you want to share templated values across configuration types and environments, you will need to use globals.
Suppose you are migrating a templated catalog file from TemplatedConfigLoader to OmegaConfigLoader. You would do the following:
- Rename conf/base/globals.yml to match the patterns specified for catalog (["catalog*", "catalog*/**", "**/catalog*"]), for example conf/base/catalog_globals.yml
- Add an underscore _ to the beginning of any catalog template value names. This is needed because of how catalog entries are validated.
- bucket_name: "my_s3_bucket"
+ _bucket_name: "my_s3_bucket" # kedro requires `_` to mark templatable keys
- key_prefix: "my/key/prefix/"
+ _key_prefix: "my/key/prefix/"
- datasets:
+ _datasets:
    csv: "pandas.CSVDataset"
    spark: "spark.SparkDataset"
- Update catalog.yml with the underscores _ at the beginning of the templated value names.
raw_boat_data:
-   type: "${datasets.spark}"
+   type: "${_datasets.spark}"
-   filepath: "s3a://${bucket_name}/${key_prefix}/raw/boats.csv"
+   filepath: "s3a://${_bucket_name}/${_key_prefix}/raw/boats.csv"
    file_format: parquet
raw_car_data:
-   type: "${datasets.csv}"
+   type: "${_datasets.csv}"
-   filepath: "s3://${bucket_name}/data/${key_prefix}/raw/cars.csv"
+   filepath: "s3://${_bucket_name}/data/${_key_prefix}/raw/cars.csv"
To provide a default for a template value, use the OmegaConf oc.select resolver.
boats:
  users:
    - fred
-   - "${write_only_user|ron}"
+   - "${oc.select:write_only_user,ron}"
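If a project has many placeholders with defaults, the rewrite can be scripted. The following is a minimal sketch, not part of Kedro: the helper to_oc_select and its regular expression are hypothetical, and they assume simple, non-nested placeholders of the form ${name|default}.

```python
import re

# Matches TemplatedConfigLoader placeholders that carry a default,
# for example ${write_only_user|ron}. Nested placeholders are not handled.
_DEFAULT_PLACEHOLDER = re.compile(r"\$\{([^{}|]+)\|([^{}]*)\}")


def to_oc_select(text):
    """Rewrite ${name|default} as ${oc.select:name,default}."""

    def _replace(match):
        name = match.group(1).strip()
        default = match.group(2).strip()
        return f"${{oc.select:{name},{default}}}"

    return _DEFAULT_PLACEHOLDER.sub(_replace, text)


if __name__ == "__main__":
    print(to_oc_select('- "${write_only_user|ron}"'))
```

Placeholders without a default, such as ${_bucket_name}, are left unchanged by this sketch.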
If you want to share variables across configuration types (for example parameters and catalog) and environments, you need to use the custom globals resolver with the OmegaConfigLoader.
The OmegaConfigLoader requires global values to be provided in a globals.yml file. Note that using a globals_dict to provide globals is not supported with OmegaConfigLoader. The following section explains the differences between using globals with TemplatedConfigLoader and the OmegaConfigLoader.
Let's assume your project contains a conf/base/globals.yml file with the following contents:
bucket_name: "my_s3_bucket"
key_prefix: "my/key/prefix/"
datasets:
  csv: "pandas.CSVDataset"
  spark: "spark.SparkDataset"
folders:
  raw: "01_raw"
  int: "02_intermediate"
  pri: "03_primary"
  fea: "04_feature"
You no longer need to set the CONFIG_LOADER_ARGS variable in src/<package_name>/settings.py to find this globals.yml file, because the OmegaConfigLoader is configured to pick up files named globals.yml by default.
- CONFIG_LOADER_ARGS = {"globals_pattern": "*globals.yml"}
The globals templating in your catalog configuration will need to be updated to use the globals resolver as follows:
raw_boat_data:
-   type: "${datasets.spark}"
+   type: "${globals:datasets.spark}" # nested paths into global dict are allowed
-   filepath: "s3a://${bucket_name}/${key_prefix}/${folders.raw}/boats.csv"
+   filepath: "s3a://${globals:bucket_name}/${globals:key_prefix}/${globals:folders.raw}/boats.csv"
    file_format: parquet
raw_car_data:
-   type: "${datasets.csv}"
+   type: "${globals:datasets.csv}"
-   filepath: "s3://${bucket_name}/data/${key_prefix}/${folders.raw}/${filename|cars.csv}" # default to 'cars.csv' if the 'filename' key is not found in the global dict
+   filepath: "s3://${globals:bucket_name}/data/${globals:key_prefix}/${globals:folders.raw}/${globals:filename,'cars.csv'}" # default to 'cars.csv' if the 'filename' key is not found in the global dict
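To make the lookup behaviour concrete, here is a small pure-Python sketch of how a dotted path with an optional default could be resolved against the globals.yml contents shown earlier, copied into a dict. This is only an illustration of the semantics, not Kedro's implementation. The function lookup_global is hypothetical, and raising KeyError for a missing key without a default is a stand-in for whatever error the real resolver raises.

```python
_MISSING = object()  # sentinel so that None can be used as a real default

# The globals.yml example from above, as a plain dictionary.
GLOBALS = {
    "bucket_name": "my_s3_bucket",
    "key_prefix": "my/key/prefix/",
    "datasets": {"csv": "pandas.CSVDataset", "spark": "spark.SparkDataset"},
    "folders": {"raw": "01_raw", "int": "02_intermediate"},
}


def lookup_global(globals_dict, dotted_key, default=_MISSING):
    """Walk a dotted path such as 'datasets.spark' through nested dicts."""
    node = globals_dict
    for part in dotted_key.split("."):
        if isinstance(node, dict) and part in node:
            node = node[part]
        elif default is not _MISSING:
            return default  # mirrors ${globals:filename,'cars.csv'}
        else:
            raise KeyError(f"'{dotted_key}' not found in globals")
    return node


if __name__ == "__main__":
    print(lookup_global(GLOBALS, "datasets.spark"))
    print(lookup_global(GLOBALS, "filename", "cars.csv"))
```

The first call mirrors ${globals:datasets.spark}, and the second mirrors ${globals:filename,'cars.csv'}, where the default is used because filename is not defined.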
OmegaConfigLoader does not support Jinja2 syntax in configuration. However, users can achieve similar functionality with the OmegaConfigLoader in combination with dataset factories.
The following example shows how you can rewrite your Jinja2 configuration to work with OmegaConfigLoader:
# catalog.yml
- {% for speed in ['fast', 'slow'] %}
- {{ speed }}-trains:
+ "{speed}-trains":
    type: MemoryDataset
- {{ speed }}-cars:
+ "{speed}-cars":
    type: pandas.CSVDataset
-   filepath: s3://${bucket_name}/{{ speed }}-cars.csv
+   filepath: s3://${bucket_name}/{speed}-cars.csv
    save_args:
      index: true
- {% endfor %}
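The idea behind a factory pattern such as "{speed}-cars" is that a dataset name is matched against the pattern and the captured value is substituted into the entry. The sketch below illustrates that idea with the standard library only. It is not how Kedro implements dataset factories, and both helper names are hypothetical.

```python
import re


def match_factory_pattern(pattern, dataset_name):
    """Return the placeholder values if dataset_name fits pattern, else None."""
    # Build a regex from the pattern: "{speed}-cars" becomes "(?P<speed>.+)\-cars".
    regex = ""
    position = 0
    for placeholder in re.finditer(r"\{(\w+)\}", pattern):
        regex += re.escape(pattern[position:placeholder.start()])
        regex += f"(?P<{placeholder.group(1)}>.+)"
        position = placeholder.end()
    regex += re.escape(pattern[position:])
    found = re.fullmatch(regex, dataset_name)
    return found.groupdict() if found else None


def fill_placeholders(value, values):
    """Replace {name} with its value, leaving ${name} interpolations untouched."""
    return re.sub(r"(?<!\$)\{(\w+)\}", lambda m: values[m.group(1)], value)


if __name__ == "__main__":
    captured = match_factory_pattern("{speed}-cars", "fast-cars")
    print(captured)
    print(fill_placeholders("s3://${bucket_name}/{speed}-cars.csv", captured))
```

In this sketch a single pattern entry covers both fast-cars and slow-cars, which is what the Jinja2 loop previously generated as two separate entries.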
Most errors and exceptions are the same in both loaders. The differences between the original TemplatedConfigLoader and OmegaConfigLoader that you need to be aware of are as follows:
- For missing template values OmegaConfigLoader throws omegaconf.errors.InterpolationKeyError.