Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Google Colab Train Pytorch model save on Google drive, exit training midway, underlying folder is corrupted #791

Closed
karanchahal opened this issue Oct 5, 2019 · 3 comments

Comments

@karanchahal
Copy link

When I train a pytorch model on a google drive mounted colab notebook, it runs fine.

I can say weight periodically, however when I stop training midway to change some paramters, I can't access the underlying folder structure anymore.

Its says transport endpoint not connected

The error message is

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/IPython/core/ultratb.py", line 1132, in get_records
    return _fixed_getinnerframes(etb, number_of_lines_of_context, tb_offset)
  File "/usr/local/lib/python3.6/dist-packages/IPython/core/ultratb.py", line 313, in wrapped
    return f(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/IPython/core/ultratb.py", line 358, in _fixed_getinnerframes
    records = fix_frame_records_filenames(inspect.getinnerframes(etb, context))
  File "/usr/lib/python3.6/inspect.py", line 1490, in getinnerframes
    frameinfo = (tb.tb_frame,) + getframeinfo(tb, context)
  File "/usr/lib/python3.6/inspect.py", line 1448, in getframeinfo
    filename = getsourcefile(frame) or getfile(frame)
  File "/usr/lib/python3.6/inspect.py", line 696, in getsourcefile
    if getattr(getmodule(object, filename), '__loader__', None) is not None:
  File "/usr/lib/python3.6/inspect.py", line 725, in getmodule
    file = getabsfile(object, _filename)
  File "/usr/lib/python3.6/inspect.py", line 709, in getabsfile
    return os.path.normcase(os.path.abspath(_filename))
  File "/usr/lib/python3.6/posixpath.py", line 383, in abspath
    cwd = os.getcwd()
OSError: [Errno 107] Transport endpoint is not connected
@bharaniabhishek123
Copy link

I got same issue
During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/usr/local/lib/python3.6/dist-packages/IPython/core/ultratb.py", line 1132, in get_records
return _fixed_getinnerframes(etb, number_of_lines_of_context, tb_offset)
File "/usr/local/lib/python3.6/dist-packages/IPython/core/ultratb.py", line 313, in wrapped
return f(*args, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/IPython/core/ultratb.py", line 358, in _fixed_getinnerframes
records = fix_frame_records_filenames(inspect.getinnerframes(etb, context))
File "/usr/lib/python3.6/inspect.py", line 1490, in getinnerframes
frameinfo = (tb.tb_frame,) + getframeinfo(tb, context)
File "/usr/lib/python3.6/inspect.py", line 1448, in getframeinfo
filename = getsourcefile(frame) or getfile(frame)
File "/usr/lib/python3.6/inspect.py", line 696, in getsourcefile
if getattr(getmodule(object, filename), 'loader', None) is not None:
File "/usr/lib/python3.6/inspect.py", line 725, in getmodule
file = getabsfile(object, _filename)
File "/usr/lib/python3.6/inspect.py", line 709, in getabsfile
return os.path.normcase(os.path.abspath(_filename))
File "/usr/lib/python3.6/posixpath.py", line 383, in abspath
cwd = os.getcwd()
OSError: [Errno 107] Transport endpoint is not connected

@tempdeltavalue
Copy link

Try to train tensorflow model , got same issue

Screenshot 2021-06-30 100846

@craigcitro
Copy link
Contributor

Can you provide a minimal repro notebook?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants