
Trained checkpoints on waymo open dataset #8

Closed
Shiming94 opened this issue Nov 7, 2024 · 15 comments

@Shiming94

Hi,

Could you please share the Waymo pre-trained weights?

Best,
Shiming

@Shiming94
Author

Hi @Kin-Zhang,

While trying to process the Waymo Open Dataset, I encountered an error:

==> Scene 14810689888487451189_720_000_740_000 has plus label, skip.
==> Scene 1758724094753801109_1251_037_1271_037 has plus label, skip.
==> Scene 15644354861949427452_3645_350_3665_350 has plus label, skip.
==> Scene 6229371035421550389_2220_000_2240_000 has plus label, skip.
==> Scene 15331851695963211598_1620_000_1640_000 has plus label, skip.
==> Scene 169115044301335945_480_000_500_000 has plus label, skip.
==> Scene 4960194482476803293_4575_960_4595_960 has plus label, skip.
==> Scene 4880464427217074989_4680_000_4700_000 has plus label, skip.
==> Scene 175830748773502782_1580_000_1600_000 has plus label, skip.
Start Plus Cluster: 502/798:  98%|███████████▊| 191/194 [00:08<00:00, 22.70it/s]
Traceback (most recent call last):
  File "process.py", line 148, in <module>
    fire.Fire(run_cluster)
  File "/home/shimingwang/miniconda3/envs/dataprocess/lib/python3.8/site-packages/fire/core.py", line 135, in Fire
    component_trace = _Fire(component, args, parsed_flag_args, context, name)
  File "/home/shimingwang/miniconda3/envs/dataprocess/lib/python3.8/site-packages/fire/core.py", line 468, in _Fire
    component, remaining_args = _CallAndUpdateTrace(
  File "/home/shimingwang/miniconda3/envs/dataprocess/lib/python3.8/site-packages/fire/core.py", line 684, in _CallAndUpdateTrace
    component = fn(*varargs, **kwargs)
  File "process.py", line 62, in run_cluster
    hdb.fit(pc0[data["dufo_label"]==1])
  File "/home/shimingwang/miniconda3/envs/dataprocess/lib/python3.8/site-packages/hdbscan/hdbscan_.py", line 1152, in fit
    X = check_array(X, accept_sparse="csr", force_all_finite=False)
  File "/home/shimingwang/miniconda3/envs/dataprocess/lib/python3.8/site-packages/sklearn/utils/validation.py", line 931, in check_array
    raise ValueError(
ValueError: Found array with 0 sample(s) (shape=(0, 3)) while a minimum of 1 is required.

Do you have any idea how to solve it? Or did I do something wrong when preprocessing the data?

@Kin-Zhang
Member

> Hi,
>
> Could you please share the Waymo pre-trained weights?
>
> Best, Shiming

Unfortunately, I also tried to share the Waymo weights before, but another author reminded me that, according to the Waymo terms, I am not allowed to share weights trained on Waymo data. I don't have a good idea of how to deal with this sharing part.

> Additional Restrictions: To ensure the Dataset is only used for Non-Commercial Purposes, You further agree (a) not to distribute or publish any models trained on or refined using the Dataset, or the weights or biases from such trained models, in whole or in part; and (b) not to use or deploy the Dataset, any models trained on or refined using the Dataset, or the weights or biases from such trained models, in whole or in part, (i) in operation of a vehicle or to assist in the operation of a vehicle, (ii) in any Production Systems, or (iii) for any other primarily commercial purposes.

@Kin-Zhang
Member

> Do you have any idea how to solve it? Or did I do something wrong when preprocessing the data?

I haven't hit this error before, but I know the reason; I pushed a commit to handle it. Please pull and run again; it will skip the already-finished scenes, so it should be quick. If possible, could you tell me which scene id triggered this error?
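
For reference, the guard amounts to something along these lines (a minimal sketch using the variable names from your traceback; the actual commit may differ, and scene_id/timestamp are only illustrative names for the warning message):

```python
import numpy as np
from hdbscan import HDBSCAN

def fit_dynamic_clusters(pc0: np.ndarray, dufo_label: np.ndarray, hdb: HDBSCAN,
                         scene_id: str, timestamp: int):
    """Fit HDBSCAN only on frames that actually contain dufo-dynamic points."""
    dynamic_pts = pc0[dufo_label == 1]
    if dynamic_pts.shape[0] == 0:
        # Previously hdb.fit() was called on an empty (0, 3) array -> ValueError.
        print(f"Warning: {scene_id} {timestamp} has no dynamic points, "
              f"will be skipped. Better to check this scene.")
        return None
    hdb.fit(dynamic_pts)
    return hdb.labels_
```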

@Shiming94
Author

Hi @Kin-Zhang,

Thank you so much for your quick reply.

Will skipping this scene influence the final results on the Waymo dataset?

Does dufo_label == 0 mean that I processed the data incorrectly at the beginning?

I will let you know the scene_id.

Thank you once again for your help!

@Kin-Zhang
Member

I don't think it's a common issue; it may be only a few scenes that don't have dynamic points (which can genuinely happen sometimes, for example when the vehicle is in a parking area). But I'd like to double-check.

You are free to check the visualization through the tools/visualization.py script with --res_name dufo_label or --res_name label.

@Shiming94
Author

Thank you so much!

I have an additional question: I would like to confirm that the lack of dufo labels for some samples will not influence the training process, especially the computation of the SeFlow loss, right?

@Kin-Zhang
Member

If only a few scenes lack the dufo label, it should be fine. But if the no-dufo-label warning is common, then there will be a problem.

When I ran Waymo for the SeFlow paper experiments, I didn't have any scenes lacking the dufo label.
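
If you want to check how widespread it is on your side, a small script along these lines should work (assuming the processed per-scene .h5 layout implied by process.py and the dataloader, i.e. one group per timestamp holding a dufo_label dataset; the path is a placeholder):

```python
from pathlib import Path

import h5py

data_dir = Path("/path/to/processed/waymo/train")  # placeholder, adjust to your setup
for scene_file in sorted(data_dir.glob("*.h5")):
    with h5py.File(scene_file, "r") as f:
        # Count frames whose dufo_label mask selects zero dynamic points.
        empty = [ts for ts in f.keys()
                 if "dufo_label" in f[ts] and (f[ts]["dufo_label"][:] == 1).sum() == 0]
    if empty:
        print(f"{scene_file.stem}: {len(empty)} frame(s) with no dynamic points")
```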

@Shiming94
Author

Thank you so much! I will try it and let you know!

@Shiming94
Author

Hi, I just got the results. The scene is 15367782110311024266_2103_310_2123_310, and the timestamps are 1516410982971073 and 1516410982871081.
But after fixing this error, another error from hdbscan.fit popped up:

==> Scene 175830748773502782_1580_000_1600_000 has plus label, skip.
Start Plus Cluster: 502/798:  98%|███████████▊| 191/194 [00:09<00:00, 63.43it/s]Warning: 15367782110311024266_2103_310_2123_310 1516410982871081 has no dynamic points, will be skipped. Better to check this scene.
Warning: 15367782110311024266_2103_310_2123_310 1516410982971073 has no dynamic points, will be skipped. Better to check this scene.
Start Plus Cluster: 502/798:  99%|███████████▉| 193/194 [00:09<00:00, 20.56it/s]
Traceback (most recent call last):
  File "process.py", line 152, in <module>
    fire.Fire(run_cluster)
  File "/home/shimingwang/miniconda3/envs/dataprocess/lib/python3.8/site-packages/fire/core.py", line 135, in Fire
    component_trace = _Fire(component, args, parsed_flag_args, context, name)
  File "/home/shimingwang/miniconda3/envs/dataprocess/lib/python3.8/site-packages/fire/core.py", line 468, in _Fire
    component, remaining_args = _CallAndUpdateTrace(
  File "/home/shimingwang/miniconda3/envs/dataprocess/lib/python3.8/site-packages/fire/core.py", line 684, in _CallAndUpdateTrace
    component = fn(*varargs, **kwargs)
  File "process.py", line 66, in run_cluster
    hdb.fit(pc0[data["dufo_label"]==1])
  File "/home/shimingwang/miniconda3/envs/dataprocess/lib/python3.8/site-packages/hdbscan/hdbscan_.py", line 1190, in fit
    ) = hdbscan(clean_data, **kwargs)
  File "/home/shimingwang/miniconda3/envs/dataprocess/lib/python3.8/site-packages/hdbscan/hdbscan_.py", line 822, in hdbscan
    (single_linkage_tree, result_min_span_tree) = memory.cache(
  File "/home/shimingwang/miniconda3/envs/dataprocess/lib/python3.8/site-packages/joblib/memory.py", line 312, in __call__
    return self.func(*args, **kwargs)
  File "/home/shimingwang/miniconda3/envs/dataprocess/lib/python3.8/site-packages/hdbscan/hdbscan_.py", line 325, in _hdbscan_boruvka_kdtree
    alg = KDTreeBoruvkaAlgorithm(
  File "hdbscan/_hdbscan_boruvka.pyx", line 392, in hdbscan._hdbscan_boruvka.KDTreeBoruvkaAlgorithm.__init__
  File "hdbscan/_hdbscan_boruvka.pyx", line 434, in hdbscan._hdbscan_boruvka.KDTreeBoruvkaAlgorithm._compute_bounds
  File "sklearn/neighbors/_binary_tree.pxi", line 1127, in sklearn.neighbors._kd_tree.BinaryTree.query
ValueError: k must be less than or equal to the number of training points

@Kin-Zhang
Member

Kin-Zhang commented Nov 8, 2024

Thanks for telling me the scene. I checked, and it's true that sometimes there are no dynamic points in this scene. I pushed a new commit to change the condition and ran it successfully on my machine; I think it won't affect training on this scene.

Let me know if any other errors come up.

@Shiming94
Author

Shiming94 commented Nov 8, 2024

Thank you so much! I will try! Could you kindly explain why changing the condition to <20 could solve the problem?

@Shiming94
Author

Shiming94 commented Nov 8, 2024

Hi @Kin-Zhang,

I was wondering if you could kindly share all the metrics of SeFlow on the Waymo validation set. Currently we can only find the numbers for the 3-way EPE. Could you also share the bucketed EPE performance?

Thank you in advance!

@Kin-Zhang
Member

> Could you kindly explain why changing the condition to <20 could solve the problem?

Since we set the HDBSCAN parameter min_num_cluster = 20.
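
In other words, a frame with fewer than 20 dufo-dynamic points cannot provide enough neighbours for HDBSCAN's internal KD-tree queries, which is exactly the "k must be less than or equal to the number of training points" error above. A minimal snippet that should reproduce it on synthetic points (assuming min_num_cluster maps to HDBSCAN's min_cluster_size; this is not SeFlow code):

```python
import numpy as np
from hdbscan import HDBSCAN

pts = np.random.rand(10, 3)  # fewer points than min_cluster_size
try:
    # min_samples defaults to min_cluster_size, so HDBSCAN queries more
    # neighbours than a tree built on only 10 points can return.
    HDBSCAN(min_cluster_size=20).fit(pts)
except ValueError as err:
    print(err)  # k must be less than or equal to the number of training points
```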

> Now we can only find the numbers for 3-way. Could you also share the performance of the bucketed EPE?

When we wrote the SeFlow paper, the bucketed EPE metric did not exist yet; I added it afterward but never ran it on Waymo, so I don't have bucketed results either. Also, the Waymo ground truth does not follow the same instance strategy as AV2, so I would recommend keeping EPE 3-Way as the main metric, since it is applicable across all datasets.

@Shiming94
Author

Hi @Kin-Zhang,

Yesterday, I tried to train SeFlow with the processed Waymo data. Unfortunately, I encountered an error during training. I guess it was caused by the missing labels in some samples. Do you have any ideas on how to avoid this kind of error?

    out[i] = next(self.iterators[i])
  File "/home/shimingwang/miniconda3/envs/sf_tv/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 634, in __next__
    data = self._next_data()
  File "/home/shimingwang/miniconda3/envs/sf_tv/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1346, in _next_data
    return self._process_data(data)
  File "/home/shimingwang/miniconda3/envs/sf_tv/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1372, in _process_data
    data.reraise()
  File "/home/shimingwang/miniconda3/envs/sf_tv/lib/python3.8/site-packages/torch/_utils.py", line 644, in reraise
    raise exception
KeyError: Caught KeyError in DataLoader worker process 10.   
Original Traceback (most recent call last):
  File "/home/shimingwang/miniconda3/envs/sf_tv/lib/python3.8/site-packages/torch/utils/data/_utils/worker.py", line 308, in _worker_loop
    data = fetcher.fetch(index)
  File "/home/shimingwang/miniconda3/envs/sf_tv/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 51, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/shimingwang/miniconda3/envs/sf_tv/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 51, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/shimingwang/workspace/sf_tv/sceneflow_tv/src/dataset.py", line 197, in __getitem__
    res_dict['pc0_dynamic'] = torch.tensor(f[key]['label'][:].astype('int16'))
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "/home/shimingwang/miniconda3/envs/sf_tv/lib/python3.8/site-packages/h5py/_hl/group.py", line 357, in __getitem__
    oid = h5o.open(self.id, self._e(name), lapl=self._lapl)  
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "h5py/h5o.pyx", line 241, in h5py.h5o.open
KeyError: "Unable to synchronously open object (object 'label' doesn't exist)"

Thank you so much in advance!

@Kin-Zhang
Member

Aha, thanks for reporting. Could you please pull again and rerun process.py? It will skip the already-finished scenes, so it won't take long; only the sequences lacking labels will be reprocessed.

Let me know if the problem still occurs.
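
If you need a stop-gap on the dataloader side while reprocessing, a hypothetical fallback like the sketch below would treat frames without a label dataset as all static. This is not the repo's fix, and the point-cloud dataset name used to get the point count is only an assumption:

```python
import h5py
import torch

def read_dynamic_label(frame: h5py.Group, pc_key: str = "lidar") -> torch.Tensor:
    """Read a frame's per-point dynamic label, falling back to all zeros.

    `pc_key` is an assumed name for the per-frame point-cloud dataset and
    may differ in the actual data layout.
    """
    if "label" in frame:
        return torch.tensor(frame["label"][:].astype("int16"))
    num_pts = frame[pc_key][:].shape[0]
    return torch.zeros(num_pts, dtype=torch.int16)
```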
