RTDT-3331_improve_regression_model_training #1
base: main
Conversation
# def load_data(train_dir, val_dir, args):
#     # Define transforms
#     train_transforms = torchvision.transforms.Compose([
#         torchvision.transforms.RandomResizedCrop(224),
antialias=False
#     ])
#
#     val_transforms = torchvision.transforms.Compose([
#         torchvision.transforms.Resize(256),
antialias=False
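A minimal sketch of both pipelines with the suggested flag, assuming a torchvision version (>= 0.15) where Resize and RandomResizedCrop accept antialias. Note the flag only affects tensor inputs; PIL images are always antialiased.

import torchvision

# Illustrative only; the sizes are taken from the diff above.
train_transforms = torchvision.transforms.Compose([
    # antialias=False skips the anti-aliasing filter when downscaling,
    # which is faster but slightly noisier.
    torchvision.transforms.RandomResizedCrop(224, antialias=False),
])

val_transforms = torchvision.transforms.Compose([
    torchvision.transforms.Resize(256, antialias=False),
])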
scaled_loaded_loss = loaded_loss * target[:, 1]

penalties = torch.zeros_like(loaded_loss)
why do we need a penalty?
self.annotations = json.load(f)

self.image_files = [f for f in os.listdir(self.root_dir)
                    if f.lower().endswith(('.png', '.jpg', '.jpeg'))
you can use IMG_EXTENSIONS
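For reference, torchvision ships the IMG_EXTENSIONS tuple, and str.endswith accepts a tuple directly, so the filter collapses to:

from torchvision.datasets.folder import IMG_EXTENSIONS
# IMG_EXTENSIONS is ('.jpg', '.jpeg', '.png', '.ppm', '.bmp', '.pgm',
# '.tif', '.tiff', '.webp')

self.image_files = [f for f in os.listdir(self.root_dir)
                    if f.lower().endswith(IMG_EXTENSIONS)]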
def forward(self, pred, target):

    loaded_loss = (pred[:, 0] - target[:, 0]) ** 2
If you are inheriting from nn.Module instead of Function, you can compute internal losses like this:
loaded_loss = nn.MSELoss()(output[:, 0], target[:, 0])
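A fuller sketch of that suggestion, with a hypothetical module name; inheriting from nn.Module lets forward compose built-in losses while autograd derives the backward pass automatically:

import torch.nn as nn

class LoadedLoss(nn.Module):  # hypothetical name, for illustration only
    def __init__(self):
        super().__init__()
        self.mse = nn.MSELoss()

    def forward(self, pred, target):
        # Column 0 is the regression value, column 1 the confidence
        # (assumed from the two-column layout in the diff).
        loaded_loss = self.mse(pred[:, 0], target[:, 0])
        confidence_loss = self.mse(pred[:, 1], target[:, 1])
        return loaded_loss + confidence_loss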
with torch.cuda.amp.autocast(enabled=scaler is not None):
    output = model(image)  # output shape: [batch_size, 2]
    loss = criterion(output, target)
    loaded_loss = nn.MSELoss()(output[:, 0], target[:, 0])
why do we need to calculate and save this loss during training?
for pred, true in zip(output.cpu().numpy(), target.cpu().numpy()):
    if threshold_accuracy(true, pred, threshold):
        correct += 1
In general, the metrics can be calculated for both outputs simultaneously, since we do not use these parameters separately anywhere. The idea is good, but I would still add total loss, MAE, and R²; see the sketch below.
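A sketch of those metrics computed over both output columns at once; it assumes predictions and targets have been accumulated into two tensors during validation (names here are illustrative):

import torch

def regression_metrics(preds, targets):
    # Total (MSE) loss, MAE, and R^2 over all outputs jointly.
    mse = torch.mean((preds - targets) ** 2)
    mae = torch.mean(torch.abs(preds - targets))
    ss_res = torch.sum((targets - preds) ** 2)
    ss_tot = torch.sum((targets - targets.mean()) ** 2)
    r2 = 1.0 - ss_res / ss_tot  # R^2 = 1 - SS_res / SS_tot
    return {"mse": mse.item(), "mae": mae.item(), "r2": r2.item()}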
total_loss = scaled_loaded_loss.mean() + confidence_loss.mean() + penalty_term
return total_loss
Why don't you use the already written backward method? During our research, we found that the custom backward reduces the loss by about 20% less than standard autograd does.
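If the custom backward is kept, it can at least be checked numerically against autograd; torch.autograd.gradcheck does this on double-precision inputs (custom_loss_fn below is a placeholder for the custom Function's apply):

import torch

pred = torch.randn(8, 2, dtype=torch.double, requires_grad=True)
target = torch.randn(8, 2, dtype=torch.double)

# Raises if the hand-written backward disagrees with numerical gradients.
torch.autograd.gradcheck(lambda p: custom_loss_fn(p, target), (pred,))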
The regression training is based on the classification training, with an added last layer. There is also a new dataset class that expects three folders (train, val, and test) plus an annotations.json file; a minimal sketch of such a class is given at the end. The command to run is:
python -m torch.distributed.run /data/vision/references/classification/train.py --model resnet18 --batch-size 32 --lr 0.01 --lr-scheduler cosineannealinglr --lr-warmup-epochs 5 --lr-warmup-method linear --auto-augment ta_wide --epochs 500 --random-erase 0.1 --weight-decay 0.00002 --norm-weight-decay 0.0 --train-crop-size 220 --val-resize-size 232 --ra-sampler --ra-reps 4 --data /data/bruggen_regression --annotations_file /data/bruggen_regression/annotations.json --output-dir /data/bruggen_regression/regression
resnet50 or resnet101 can also be used here; the paths (--data and --annotations_file) should be adjusted.
TensorBoard logging is also added; run tensorboard --logdir=/data/bruggen_regression/ (adjust the path).
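For reference, a minimal sketch of a dataset class matching the described layout; the annotations.json schema (file name -> [value, confidence]) and the class name are assumptions:

import json
import os

import torch
from PIL import Image
from torch.utils.data import Dataset
from torchvision.datasets.folder import IMG_EXTENSIONS

class RegressionImageDataset(Dataset):  # hypothetical name
    def __init__(self, root_dir, annotations_file, transform=None):
        self.root_dir = root_dir
        self.transform = transform
        with open(annotations_file) as f:
            # Assumed schema: file name -> [value, confidence]
            self.annotations = json.load(f)
        self.image_files = [f for f in os.listdir(root_dir)
                            if f.lower().endswith(IMG_EXTENSIONS)
                            and f in self.annotations]

    def __len__(self):
        return len(self.image_files)

    def __getitem__(self, idx):
        name = self.image_files[idx]
        image = Image.open(os.path.join(self.root_dir, name)).convert("RGB")
        if self.transform:
            image = self.transform(image)
        target = torch.tensor(self.annotations[name], dtype=torch.float32)
        return image, target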