
Support channels_last with training #15175

Open
Queuecumber opened this issue Oct 18, 2022 · 16 comments · May be fixed by #17680
Labels
feature Is an improvement or enhancement lightningmodule pl.LightningModule trainer: argument

Comments

@Queuecumber
Contributor

Queuecumber commented Oct 18, 2022

🚀 Feature

I'd like to try out some channels_last training to see if it improves performance (https://pytorch.org/tutorials/intermediate/memory_format_tutorial.html)

I'm not entirely sure what the best way to do this with lightning is, but I also think it should probably be one of those features that you set on the trainer and it just magically works.

Motivation

Using the channels_last memory format can improve performance in some cases

Pitch

Add a trainer flag that does whatever is needed for channels_last so

trainer = pl.Trainer( ..., memory_format='channels_last')

or something like that, and then before anything happens with training/testing you need to convert the module:

if self.memory_format == 'channels_last':
    lightning_module = lightning_module.to(memory_format=torch.channels_last)

and convert each batch in the train/test/val loops:

if self.memory_format == 'channels_last':
    batch = batch.to(memory_format=torch.channels_last)
lightning_module(batch)

Alternatives

I have no idea, but I assume I could do this without changing lightning, although I'm not sure how yet.
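
One rough idea (an untested sketch; it assumes current Lightning import paths, older releases would use pytorch_lightning instead) would be a small Callback that converts the module, with batch conversion still handled separately:

import lightning.pytorch as pl
import torch
from lightning.pytorch.callbacks import Callback

class ChannelsLast(Callback):
    def setup(self, trainer, pl_module, stage=None):
        # nn.Module.to() converts parameters and buffers in place
        pl_module.to(memory_format=torch.channels_last)

trainer = pl.Trainer(callbacks=[ChannelsLast()])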

Additional context

I am not sure I will be able to PR this one but I'm not opposed to trying

cc @Borda @carmocca @justusschock @awaelchli

@Queuecumber Queuecumber added the needs triage Waiting to be triaged by maintainers label Oct 18, 2022
@awaelchli awaelchli added feature Is an improvement or enhancement lightningmodule pl.LightningModule trainer: argument and removed needs triage Waiting to be triaged by maintainers labels Oct 18, 2022
@rohitgr7
Contributor

how about doing it manually?

class LitModel(LightningModule):
    def __init__(self):
        ...

    def on_fit_start(self):
        # nn.Module.to() converts parameters and buffers in place
        self = self.to(memory_format=torch.channels_last)

@Queuecumber
Contributor Author

I think it would probably work but:

I also think it should probably be one of those features that you set on the trainer and it just magically works.

@rohitgr7
Copy link
Contributor

I think it's very domain specific, plus it's just a one-liner, so I don't think this should be added to the trainer

@Queuecumber
Contributor Author

How about a plugin?

@Queuecumber
Contributor Author

Keep in mind you don't just call .to() on the model itself, you need to call it on all the inputs too

@rohitgr7
Contributor

class LitModel(LightningModule):
    def __init__(self):
        ...

    def on_fit_start(self):
        self = self.to(memory_format=torch.channels_last)

    def on_after_batch_transfer(self, batch, *args, **kwargs):
        batch = batch.to(memory_format=torch.channels_last)
        return batch

haven't checked in more detail, but based on your info, I guess this should work
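
One caveat with batch.to(...): it assumes the batch is a single 4D tensor. For dict/tuple batches (and to skip tensors where channels_last doesn't apply), a small recursive helper could be used instead, roughly like this (a sketch only):

import torch

def to_channels_last(batch):
    # channels_last is only defined for 4D (NCHW) tensors, so leave everything else untouched
    if isinstance(batch, torch.Tensor):
        return batch.to(memory_format=torch.channels_last) if batch.dim() == 4 else batch
    if isinstance(batch, (list, tuple)):
        return type(batch)(to_channels_last(b) for b in batch)
    if isinstance(batch, dict):
        return {k: to_channels_last(v) for k, v in batch.items()}
    return batch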

@Queuecumber
Contributor Author

Yeah you could also write

class LitModel(LightningModule):
    def __init__(self):
        ...

    def training_step(self, batch, batch_idx):
        with torch.autocast(device_type=self.device.type):
            ...

Right? That's not really the point though

This isn't some niche technology; huggingface used it to speed up Stable Diffusion inference by ~2.8x (https://huggingface.co/docs/diffusers/optimization/fp16). I assume that's a pie lightning wants a piece of.

So does anyone besides you have opinions on this?

@stale stale bot added the won't fix This will not be worked on label Apr 15, 2023
@Borda Borda changed the title Support channels_last Support channels_last with training Apr 17, 2023
@stale stale bot removed the won't fix This will not be worked on label Apr 17, 2023
@Lightning-AI Lightning-AI deleted a comment from stale bot Apr 17, 2023
@Borda
Member

Borda commented Apr 17, 2023

This isn't some niche technology, huggingface used it to speed up stable diffusion inference by ~2.8x

cc: @JustinGoheen @justusschock

@Pedrexus

Pedrexus commented May 1, 2023

Hello all. Any update on this? I've been using a simple callback and getting around 30%-40% speedup while training a torchvision ResNet50. I could open a PR if anyone is interested.

@HelixPiano

Sounds good to me

@Pedrexus Pedrexus linked a pull request May 23, 2023 that will close this issue
@awaelchli
Contributor

@Pedrexus Should the callback also do the following?

  • undo the transformation in teardown
  • handle input conversion
  • warn if no conv layers in model?

If the callback handles this boilerplate, then I see value in a callback like this, but otherwise, model.to(memory_format=torch.channels_last) is a one-liner and should work as expected already.
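
For reference, a sketch of what such a callback could look like with the teardown/warning points covered (hypothetical, assumes current Lightning import paths, and is not the code from the linked PR):

import torch
import torch.nn as nn
from lightning.pytorch.callbacks import Callback
from lightning.pytorch.utilities import rank_zero_warn

class ChannelsLastCallback(Callback):
    def setup(self, trainer, pl_module, stage=None):
        # warn if the model has no conv layers, where channels_last brings no benefit
        if not any(isinstance(m, (nn.Conv2d, nn.ConvTranspose2d)) for m in pl_module.modules()):
            rank_zero_warn("Model has no 2D convolution layers; channels_last is unlikely to help.")
        pl_module.to(memory_format=torch.channels_last)

    def teardown(self, trainer, pl_module, stage=None):
        # undo the conversion so the module is left in the default memory format
        pl_module.to(memory_format=torch.contiguous_format)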

@carmocca
Contributor

Another alternative would be to have this in the docs as an example of a callback

@Pedrexus

Pedrexus commented May 30, 2023

@Pedrexus Should the callback also do the following?

  • undo the transformation in teardown
  • handle input conversion
  • warn if no conv layers in model?

If the callback handles this boilerplate, then I see value in a callback like this, but otherwise, model.to(memory_format=torch.channels_last) is a one-liner and should work as expected already.

Sure, I agree with you. The one I use in my codebase does a bit more than just the one-liner; I think I can add these soon.

However, I'm not sure input conversion is necessary. I tried this and saw no positive performance effect. I could add it behind a feature flag anyway.

BTW, I could make it a bit more general as a "MemoryFormat" callback and not only "ChannelsLast". I don't think that would be over-engineering.

@TezRomacH
Contributor

@Pedrexus
Hi! Any updates on the callback? :)

@Pedrexus

@TezRomacH I'm sorry. I got busy with things, but I will try to work on the failing checks soon.

@Pedrexus

Pedrexus commented Dec 13, 2023

Ok, I fixed some problems, but it seems mypy is failing. From a quick glance, it seems it is picking the wrong overload of torch.Tensor.to() (https://pytorch.org/docs/stable/generated/torch.Tensor.to.html).
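
If the complaint is a call-overload error, one common workaround (just a suggestion, not necessarily the right fix for the PR) is a targeted ignore on that line:

batch = batch.to(memory_format=torch.channels_last)  # type: ignore[call-overload]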
