Description
The title is pretty self-explanatory.
Context
By default, TensorFlowModelDataset saves a model using TensorFlow's native format. This works as expected. Saving the model as an HDF5 file also works as expected, provided you don't have versioning enabled.
Steps to Reproduce
This failure happens with a catalog configuration like the following:
Expected Result
The model should be saved successfully as a versioned DataSet.
Actual Result
You get an error like the following:
Traceback (most recent call last):
File "/home/daniel/git/cotton_counter/.venv/lib/python3.7/site-packages/kedro/io/core.py", line 240, in save
self._save(data)
File "/home/daniel/git/cotton_counter/.venv/lib/python3.7/site-packages/kedro/extras/datasets/tensorflow/tensorflow_model_dataset.py", line 167, in _save
self._fs.copy(path, save_path)
File "/home/daniel/git/cotton_counter/.venv/lib/python3.7/site-packages/fsspec/implementations/local.py", line 90, in copy
shutil.copyfile(path1, path2)
File "/home/daniel/.pyenv/versions/3.7.7/lib/python3.7/shutil.py", line 121, in copyfile
with open(dst, 'wb') as fdst:
FileNotFoundError: [Errno 2] No such file or directory: '/home/daniel/git/cotton_counter/data/06_models/fully_trained.hd5/2020-09-19T16.20.54.312Z/fully_trained.hd5'
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/daniel/.pyenv/versions/3.7.7/lib/python3.7/runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "/home/daniel/.pyenv/versions/3.7.7/lib/python3.7/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/home/daniel/git/cotton_counter/.venv/lib/python3.7/site-packages/kedro/__main__.py", line 38, in <module>
main()
File "/home/daniel/git/cotton_counter/.venv/lib/python3.7/site-packages/kedro/framework/cli/cli.py", line 724, in main
cli_collection()
File "/home/daniel/git/cotton_counter/.venv/lib/python3.7/site-packages/click/core.py", line 829, in __call__
return self.main(*args, **kwargs)
File "/home/daniel/git/cotton_counter/.venv/lib/python3.7/site-packages/click/core.py", line 782, in main
rv = self.invoke(ctx)
File "/home/daniel/git/cotton_counter/.venv/lib/python3.7/site-packages/click/core.py", line 1259, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/home/daniel/git/cotton_counter/.venv/lib/python3.7/site-packages/click/core.py", line 1066, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/home/daniel/git/cotton_counter/.venv/lib/python3.7/site-packages/click/core.py", line 610, in invoke
return callback(*args, **kwargs)
File "/home/daniel/git/cotton_counter/kedro_cli.py", line 263, in run
pipeline_name=pipeline,
File "/home/daniel/git/cotton_counter/.venv/lib/python3.7/site-packages/kedro/framework/context/context.py", line 767, in run
raise exc
File "/home/daniel/git/cotton_counter/.venv/lib/python3.7/site-packages/kedro/framework/context/context.py", line 759, in run
run_result = runner.run(filtered_pipeline, catalog, run_id)
File "/home/daniel/git/cotton_counter/.venv/lib/python3.7/site-packages/kedro/runner/runner.py", line 101, in run
self._run(pipeline, catalog, run_id)
File "/home/daniel/git/cotton_counter/.venv/lib/python3.7/site-packages/kedro/runner/sequential_runner.py", line 90, in _run
run_node(node, catalog, self._is_async, run_id)
File "/home/daniel/git/cotton_counter/.venv/lib/python3.7/site-packages/kedro/runner/runner.py", line 213, in run_node
node = _run_node_sequential(node, catalog, run_id)
File "/home/daniel/git/cotton_counter/.venv/lib/python3.7/site-packages/kedro/runner/runner.py", line 249, in _run_node_sequential
catalog.save(name, data)
File "/home/daniel/git/cotton_counter/.venv/lib/python3.7/site-packages/kedro/io/data_catalog.py", line 439, in save
func(data)
File "/home/daniel/git/cotton_counter/.venv/lib/python3.7/site-packages/kedro/io/core.py", line 625, in save
super().save(data)
File "/home/daniel/git/cotton_counter/.venv/lib/python3.7/site-packages/kedro/io/core.py", line 247, in save
raise DataSetError(message) from exc
kedro.io.core.DataSetError: Failed while saving data to data set TensorFlowModelDataset(filepath=/home/daniel/git/cotton_counter/data/06_models/fully_trained.hd5, load_args={'compile': False}, protocol=file, save_args={'save_format': h5}, version=Version(load=None, save='2020-09-19T16.20.54.312Z')).
[Errno 2] No such file or directory: '/home/daniel/git/cotton_counter/data/06_models/fully_trained.hd5/2020-09-19T16.20.54.312Z/fully_trained.hd5'
Digging deeper, it appears that this issue is caused by TensorFlowModelDataset not creating the intermediate directories of the versioned path before it copies the saved model there, so the copy in _save() fails with FileNotFoundError. I was able to fix it by adding two lines to the _save() method:
    def _save(self, data: tf.keras.Model) -> None:
        save_path = get_filepath_str(self._get_save_path(), self._protocol)

        # New lines are here.
        save_dir = Path(save_path).parent
        save_dir.mkdir(parents=True, exist_ok=True)
I can submit this as a PR also.
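For anyone who wants to see the failure mode without Kedro or TensorFlow installed, here is a minimal standard-library sketch. It assumes the failing step is the shutil.copyfile call shown in the traceback. The helper name copy_into_versioned_path and the placeholder source file are made up for illustration and are not part of Kedro.

```python
import shutil
import tempfile
from pathlib import Path


def copy_into_versioned_path(src: Path, dst: Path, create_parents: bool) -> bool:
    """Copy src to dst. Return False if the copy fails because a directory is missing."""
    if create_parents:
        # Same idea as the two added lines: create every intermediate directory first.
        dst.parent.mkdir(parents=True, exist_ok=True)
    try:
        shutil.copyfile(src, dst)
    except FileNotFoundError:
        return False
    return True


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    src = root / "model.h5"
    src.write_bytes(b"placeholder for a saved model")

    # Same layout as the path in the error message: <filepath>/<version>/<filename>.
    dst = root / "fully_trained.hd5" / "2020-09-19T16.20.54.312Z" / "fully_trained.hd5"

    failed_without_fix = not copy_into_versioned_path(src, dst, create_parents=False)
    succeeded_with_fix = copy_into_versioned_path(src, dst, create_parents=True)
    copied_bytes = dst.read_bytes()

print(failed_without_fix, succeeded_with_fix)
```

The first call fails because neither the fully_trained.hd5 directory nor the version directory exists yet. The second call succeeds once mkdir(parents=True, exist_ok=True) has created them.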
Your Environment
Include as many relevant details about the environment in which you experienced the bug:
Kedro version used (pip show kedro or kedro -V): 0.16.5
Python version used (python -V): 3.7.7
Operating system and version: Ubuntu 20.04
lorenabalan changed the title from "TensorflowModelDataset save fails with hdf5 model when versioning is enabled." to "[KED-2140] TensorflowModelDataset save fails with hdf5 model when versioning is enabled." on Oct 6, 2020
Hello! I was having the same error ([Errno 2] No such file or directory) with a custom JSON dataset that I had created, and I could only fix the problem with the lines mentioned by @djpetti (thank you).
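As a rough sketch of what the same two lines might look like in the save logic of a custom JSON dataset: this is not the commenter's actual code, which isn't shown here. The function name save_json, the example path, and the example version string are all hypothetical. In a real Kedro dataset the body would presumably live inside _save().

```python
import json
import tempfile
from pathlib import Path
from typing import Any


def save_json(data: Any, save_path: str) -> None:
    """Write data as JSON, creating any missing parent directories first."""
    path = Path(save_path)
    # Without this, open() raises FileNotFoundError when the version directory is missing.
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("w", encoding="utf-8") as f:
        json.dump(data, f)


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical versioned layout: <filepath>/<version>/<filename>.
    target = Path(tmp) / "example.json" / "2020-10-06T00.00.00.000Z" / "example.json"
    save_json({"example": 1}, str(target))
    round_tripped = json.loads(target.read_text(encoding="utf-8"))

print(round_tripped)
```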