Releases: SenseTime-FVG/OpenDWM
v0.6.0
- Release LiDAR Diffusion models along with the technical report.
- Add LiDAR visualization tool.
v0.5.1
v0.5.0
v0.4.0
v0.3.3
Functionality
- CTSD pipeline supports action control (training data is converted from ego transform) and PyTorch distributed checkpoint.
- Distributed checkpointing reduces peak memory and GPU memory usage during the loading, resuming, and saving stages, which makes it friendlier to distributed systems with limited memory or GPU memory.
- Warning: incompatible with optimizer checkpoints from previous versions. To resume from a checkpoint saved by a previous version (<0.3.3), launch a new training stage using only the most recently saved model checkpoint.
- Update LiDAR VQVAE and Maskgit pipelines for temporal and auto-regressive generation.
- Release Diffusion Forcing Transformer (DFoT) on CTSD 3.5 model config and checkpoint. DFoT is a kind of self-supervision target with a "soft mask", which allows the model to reduce accumulated degradation during auto-regressive generation. The interactive generation pipeline prefers a DFoT model.
- Release LiDAR VQVAE and Maskgit model config and checkpoint trained with KITTI-360 included.
- Add scripts to make blank code (LiDAR VQVAE), make CARLA camera parameters (interactive generation), and derive CARLA control from steering (interactive generation).
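The "soft mask" idea behind DFoT can be illustrated with per-frame noise levels: rather than hiding a frame entirely or keeping it fully clean, each frame in a clip receives its own independently sampled noise level. The sketch below is only an illustration of that idea, not the OpenDWM implementation. The function names, the linear blend between data and Gaussian noise, and the plain-list frame representation are all assumptions.

```python
import random


def sample_noise_levels(num_frames, rng):
    """Draw one independent noise level in [0, 1) per frame (hypothetical helper)."""
    return [rng.random() for _ in range(num_frames)]


def soft_mask(frames, levels, rng):
    """Blend each frame with Gaussian noise according to its own level.

    Level 0 keeps the frame clean; levels near 1 are almost pure noise.
    The linear blend is an assumption made for this sketch.
    """
    noisy = []
    for frame, t in zip(frames, levels):
        noisy.append([(1.0 - t) * x + t * rng.gauss(0.0, 1.0) for x in frame])
    return noisy


rng = random.Random(0)
clip = [[1.0, 2.0, 3.0] for _ in range(4)]  # 4 toy "frames" of 3 values each
levels = sample_noise_levels(len(clip), rng)
noisy_clip = soft_mask(clip, levels, rng)
```

Under this view, frames that were generated earlier can be treated as slightly noisy context instead of perfectly clean input, which is one intuition for why degradation accumulates less over long roll-outs.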
Fixes
- Fix the script that exports data in nuScenes format.
- Other minor fixes to the models, datasets, and metrics.
v0.3.1
Functionality
- Add experimental interactive generation pipeline code, config, and documentation.
- CTSD pipeline supports actions (speed, steering) as additional conditions.
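Action conditions such as speed and steering can be derived from consecutive ego poses. The following is a minimal sketch under stated assumptions, not the OpenDWM conversion code: it takes planar poses `(x, y, yaw)`, a hypothetical `wheelbase` parameter, and a kinematic bicycle model to turn the yaw rate into a steering angle.

```python
import math


def ego_poses_to_action(pose_a, pose_b, dt, wheelbase=2.8):
    """Estimate (speed, steering) between two planar ego poses.

    pose_a, pose_b: (x, y, yaw) in meters and radians, dt seconds apart.
    wheelbase: hypothetical vehicle wheelbase in meters.
    """
    dx = pose_b[0] - pose_a[0]
    dy = pose_b[1] - pose_a[1]
    speed = math.hypot(dx, dy) / dt

    # Wrap the yaw difference into [-pi, pi) before dividing by dt.
    dyaw = (pose_b[2] - pose_a[2] + math.pi) % (2.0 * math.pi) - math.pi
    yaw_rate = dyaw / dt

    # Kinematic bicycle model: yaw_rate = speed * tan(steering) / wheelbase.
    steering = math.atan2(wheelbase * yaw_rate, speed) if speed > 1e-6 else 0.0
    return speed, steering
```

For example, `ego_poses_to_action((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), 0.5)` describes straight driving, so the steering term is zero and only the speed is non-zero.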
Fix & minor updates
- Fix a dataset text condition bug in random partial text drop (triggered when a seed is given in the image_description_settings).
- Update dependencies to address a security issue.
- Update metric record for released LiDAR models.
v0.3.0
v0.2.1
Functionality
- Release layout-conditioned CTSD 3.5 config, checkpoint, and example.
- Trained with 4 datasets (nuScenes, Waymo, Argoverse, OpenDV) in a single stage for better generalization.
- Bucket for multi-resolution, following OpenSora ...
- The nuScenes dataset can be configured to draw the BEV condition as solid shapes or outlines.
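Multi-resolution bucketing groups samples with similar aspect ratios so that each batch shares one resolution. A minimal sketch of the assignment step follows. The bucket list and function names are illustrative assumptions, not the OpenDWM or OpenSora configuration.

```python
import math
from collections import defaultdict

# Hypothetical (width, height) buckets; real configs will differ.
BUCKETS = [(448, 256), (384, 384), (256, 448)]


def nearest_bucket(width, height, buckets=BUCKETS):
    """Pick the bucket whose aspect ratio is closest in log space."""
    ratio = math.log(width / height)
    return min(buckets, key=lambda b: abs(math.log(b[0] / b[1]) - ratio))


def group_by_bucket(sizes):
    """Map each bucket to the indices of the samples assigned to it."""
    groups = defaultdict(list)
    for index, (width, height) in enumerate(sizes):
        groups[nearest_bucket(width, height)].append(index)
    return dict(groups)


sizes = [(1600, 900), (1920, 1280), (1024, 1024), (720, 1280)]
groups = group_by_bucket(sizes)
```

Comparing ratios in log space treats a 2:1 and a 1:2 image symmetrically, which is a common choice when landscape and portrait sources are mixed.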
Fix
- Fix S3FS being unable to list more than 1k items.
- Fix reshape error for CTSD 3.x pointwise temporal attention.
- Fix link to UniMLVG config in the README.
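The 1k limit in the S3FS fix most likely corresponds to S3-style listing APIs, which return at most 1000 keys per response and expect the caller to follow a continuation token. The sketch below shows such a pagination loop against a stand-in `fetch_page` callable. That callable and the in-memory fake store are hypothetical and do not reflect the S3FS or OpenDWM code.

```python
def list_all_keys(fetch_page):
    """Collect keys from a paginated listing.

    fetch_page(token) is a hypothetical callable returning
    (keys, next_token); next_token is None on the last page.
    """
    keys = []
    token = None
    while True:
        page, token = fetch_page(token)
        keys.extend(page)
        if token is None:
            return keys


def make_fake_store(all_keys, page_size=1000):
    """In-memory stand-in for a listing API capped at page_size keys."""
    def fetch_page(token):
        start = token or 0
        end = start + page_size
        next_token = end if end < len(all_keys) else None
        return all_keys[start:end], next_token
    return fetch_page


store = make_fake_store([f"frame_{i:05d}.jpg" for i in range(2500)])
keys = list_all_keys(store)
```

A caller that reads only the first response would see 1000 of the 2500 keys here, which matches the symptom described in the fix.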