Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Supporting Tensorflow 2 #275

Merged
merged 6 commits into from
Oct 18, 2021
Merged

Supporting Tensorflow 2 #275

merged 6 commits into from
Oct 18, 2021

Conversation

duhaime
Copy link
Contributor

@duhaime duhaime commented Sep 3, 2021

Hello! The folks at @YaleDHLab are huge fans of your work @minimaxir. I wanted to send on the first draft of a pull request to make the library function in Tensorflow 1 or 2.

The only feature not ported to tf2 here is the memory saving gradients logic. The upstream library from which that logic is adopted still hasn't been ported to tf2, and some of the underlying graph traversal methods have been removed from tensorflow itself in 2.x, so there would be a good bit of work getting this running.

That said, the memory saving gradients are simply a performance enhancement. For interested parties, there's a thread going in @cybertronai/gradient-checkpointing and another thread in the @tensorflow repo that discusses some decorater-based approaches to gradient checkpointing in tf2...

In any event, thanks for this great work!

@IoIxD
Copy link

IoIxD commented Sep 5, 2021

Actually I don't think anyone is even maintaining this. The last push was in February and the last issue closure was in March.

@minimaxir
Copy link
Owner

I'm maintaining it but it's less of a priority relative to my other work.

If this does support TF2 then I can investigate.

@duhaime
Copy link
Contributor Author

duhaime commented Sep 5, 2021

@minimaxir that's totally understandable! Yes, you should find this will support TF2--I was building TF2 and TF1 models yesterday.

I could create some simple tests for TF1 and TF2, but we'd want some kind of automated environment like Travis/Jenkins where the various Tensorflow versions could be tested...

@tiagovaz
Copy link

I've tested this PR over TF2.6.0 and everything works nicely so far, thanks @duhaime!

@osirisguitar
Copy link

This is awesome! generate works flawlessly.

I could only find a working version of Tensorflow 2.3.4 for my non-avx CPU, so this was a life saver!

@leetfin
Copy link
Contributor

leetfin commented Oct 11, 2021

Unfortunately this no longer seems to work - back_prop seems to be depreciated. I tried to figure out a fix but was unable to find what changes were needed. Hopefully @duhaime can figure this out? Big fan of your work!

@duhaime
Copy link
Contributor Author

duhaime commented Oct 11, 2021

@L33Tech thanks for your note. Can you please say more? Im' not sure what you mean when you say "back_prop seems to be deprecated". If you add some more details (and possibly your tf version) I'm sure we can iron this out...

@leetfin
Copy link
Contributor

leetfin commented Oct 11, 2021

@duhaime Thanks for the response - I'm running TF 2.6.0 but they seem to have updated & depreciated something sometime in the past week. (Maybe 2.7.0rc0?)
I used to get the following message, but it would proceed anyway:
image

Earlier today it started saying that back_prop was depreciated entirely and failing to produce any output at all.

I've reinstalled and rolled back things in the mean time to stay on an older version (which is why I don't have a screenshot of the new error), but hopefully this can be fixed!
If there's any more info you need, please let me know! Happy to help with this any way I can.

Edit: Screenshot of new error.
image

@duhaime
Copy link
Contributor Author

duhaime commented Oct 12, 2021

@L33Tech if you have a moment can you please try again, using the aebb53b commit? I just tested with TF 2.6 and the model converged. A plot is hatching now:

GLOUCESTER:

A small fish, that thirty-two years old, that I trust will be slain with a very fast sword.

@leetfin
Copy link
Contributor

leetfin commented Oct 12, 2021

@duhaime That seems to have fixed it, thank you!

Unfortunately my model seems to be giving way simpler outputs now, I might have to retrain it with this update.

@duhaime
Copy link
Contributor Author

duhaime commented Oct 13, 2021

@L33Tech I was thinking about this this morning, and I'm wondering if the back_prop change should be treated as a separate issue.

It seems like that deprecation arises in TF 2.6, so if we unstack that last commit we can consider this PR as an opportunity to support TF <= 2.5, then we can look at the 2.6+ integrations later?

What do you think?

@leetfin
Copy link
Contributor

leetfin commented Oct 13, 2021

@duhaime I agree, but I was running it just fine on 2.6 - then TF seems to have force pushed an update onto it over the weekend to depreciate, as the same issue happened when I rolled back to 2.5. Have you seen this behavior, and if so know a fix? Thanks!

@duhaime
Copy link
Contributor Author

duhaime commented Oct 14, 2021

@L33Tech I receive the same warning, but the model converges just fine on TF 2.5 and TF 2.6.

If your model is not converging or producing output, perhaps something is wrong with your environment? Can you try reinstalling Tensorflow? Either way, I think handling this warning should happen in another issue, as this PR is aimed to add support for TF 2.x.

@leetfin
Copy link
Contributor

leetfin commented Oct 14, 2021

@duhaime I guess it must be something wrong on my end then, sorry about that. I'd say go ahead with unstacking the last commit and adding support for up to 2.6. Thanks for the awesome work and the help!

@duhaime
Copy link
Contributor Author

duhaime commented Oct 15, 2021

@minimaxir I leave it to you!

(The commit @L33Tech and I have been discussing aebb53b should be fine to leave, as it resolves the deprecation warning about the back_prop kwarg passed to tf.while_loop.)

@minimaxir
Copy link
Owner

minimaxir commented Oct 15, 2021

Thanks everyone for your contributions! I have time this weekend to verify things work; if they do, I'll merge and push out a new release to PyPI.

@minimaxir minimaxir merged commit 87720a1 into minimaxir:master Oct 18, 2021
@minimaxir
Copy link
Owner

minimaxir commented Oct 18, 2021

Testing it seems fine; just merged it and working on a 0.8.0 release now!

Thanks everyone for your input!

@iatenothingbutriceforthreedays

this is awesome, thank you!

@minimaxir
Copy link
Owner

v0.8.1 is out, let me know if that works and feel free to file another PR if there's something else broken.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants