Skip to content

build_image_dataset.py crashes for delf training #10474

Open
@avidullu

Description

@avidullu

Running instructions from https://github.com/tensorflow/models/blob/master/research/delf/delf/python/training/README.md#prepare-the-data-for-training without any GPUs on a Google Cloud VM encounters an error.

Below is the command with the error
python3 build_image_dataset.py --train_csv_path=$LANDMARK_DATA/train/train.csv --train_clean_csv_path=$LANDMARK_DATA/train/train_clean.csv --train_directory=$LANDMARK_DATA/train//// --output_directory=$LANDMARK_DATA/tfrecord/ --num_shards=128 --generate_train_validation_splits --validation_split_size=0.2 --test_csv_path=$LANDMARK_DATA/train/test.csv --test_directory=$LANDMARK_DATA/test////
2022-01-27 22:07:03.277440: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory
2022-01-27 22:07:03.277495: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
2022-01-27 22:07:05.252430: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory
2022-01-27 22:07:05.252502: W tensorflow/stream_executor/cuda/cuda_driver.cc:269] failed call to cuInit: UNKNOWN ERROR (303)
2022-01-27 22:07:05.252525: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:156] kernel driver does not appear to be running on this host (gcsfuse-experiment): /proc/driver/nvidia/version does not exist
/home/avidullu/mldata/cvdfoundation/google-landmark/train/train_clean.csv
Traceback (most recent call last):
File "build_image_dataset.py", line 491, in
app.run(main)
File "/home/avidullu/.local/lib/python3.7/site-packages/absl/app.py", line 312, in run
_run_main(main, args)
File "/home/avidullu/.local/lib/python3.7/site-packages/absl/app.py", line 258, in _run_main
sys.exit(main(argv))
File "build_image_dataset.py", line 485, in main
FLAGS.seed)
File "build_image_dataset.py", line 439, in _build_train_tfrecord_dataset
image_dir)
File "build_image_dataset.py", line 144, in _get_clean_train_image_files_and_labels
df = pd.read_csv(csv_file)
File "/home/avidullu/.local/lib/python3.7/site-packages/pandas/util/_decorators.py", line 311, in wrapper
return func(*args, **kwargs)
File "/home/avidullu/.local/lib/python3.7/site-packages/pandas/io/parsers/readers.py", line 586, in read_csv
return _read(filepath_or_buffer, kwds)
File "/home/avidullu/.local/lib/python3.7/site-packages/pandas/io/parsers/readers.py", line 482, in _read
parser = TextFileReader(filepath_or_buffer, **kwds)
File "/home/avidullu/.local/lib/python3.7/site-packages/pandas/io/parsers/readers.py", line 811, in init
self._engine = self._make_engine(self.engine)
File "/home/avidullu/.local/lib/python3.7/site-packages/pandas/io/parsers/readers.py", line 1040, in _make_engine
return mapping[engine](self.f, **self.options) # type: ignore[call-arg]
File "/home/avidullu/.local/lib/python3.7/site-packages/pandas/io/parsers/c_parser_wrapper.py", line 51, in init
self._open_handles(src, kwds)
File "/home/avidullu/.local/lib/python3.7/site-packages/pandas/io/parsers/base_parser.py", line 229, in _open_handles
errors=kwds.get("encoding_errors", "strict"),
File "/home/avidullu/.local/lib/python3.7/site-packages/pandas/io/common.py", line 724, in get_handle
newline="",
AttributeError: 'GFile' object has no attribute 'readable'

with tf.io.gfile.GFile(csv_path, 'rb') as csv_file:
seems to be using a binary mode for read. On removing the 'b' from here and L116 the script makes progress.

Metadata

Metadata

Assignees

Labels

models:researchmodels that come under research directorytype:bugBug in the code

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions