Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

File was not opened in read | write mode with the S3Storage backend #976

Closed
atleta opened this issue Jan 18, 2021 · 8 comments · Fixed by #1321
Closed

File was not opened in read | write mode with the S3Storage backend #976

atleta opened this issue Jan 18, 2021 · 8 comments · Fixed by #1321

Comments

@atleta
Copy link

atleta commented Jan 18, 2021

I saw that this issue was reported at least twice in the past few years, but it probably was never properly described and thus got closed. It seems that once the file is open in either mode (read or write) you can't switch to the other.

First of all, it's not immediately obvious that you shouldn't be able to open it multiple times. E.g. given

class MyModel(models.Model):
    some_field = ImageField()

and m = MyModel,objects.first() if you use m.some_field for a read operation (e.g. to process the image) then it gets opened in read mode by default. But then there is no way to write back to it. I've tried it the following ways:

f=m.some_field.open('w')
f.write('some content')

This could work, though one has to be aware of the magic Django does with FileFields (without that you'd expect this to return a new file-like object, whereas open really influences the field, which kind of looks like a file). But it could work, it would just have to change mode (basically probably close the file, and reopen it, causing it to flush if the transition is w -> r). Django doesn't do it either, though. (It seeks to 0.)

However, closing the file manually doesn't work either:

m.some_field.close()
f=m.some_field.open('w')
f.write('some content')

This causes the same exception. Indeed, the field is left in a bogus, state, it's probably not aware that it was closed (m.some_field.closed returns False even after doing a m.some_field.close()). read still works, though it seems to reopen the file (at least probably reaches out to S3 as the call takes some time to return). This happens with the open('w') call (after closing the file) as well but the file is still stuck in read mode and doesn't allow writing.

(I'm using an ImageField in the example because that's what I'm getting the error with, but I'm 99% sure it's the same with a plain FileField).

The only solution, for now, seems to be refreshing the model instance from the db, but that's slow and inconvenient. (Now I see that django actually doesn't allow re-opening a closed file, which is sad, but even then, closing the file should close the file and also conflicting file mode opens should be rejected. But that's likely a django bug.)

@rabbit-aaron
Copy link

rabbit-aaron commented Mar 11, 2022

Hey buddy, I kinda figured out what's happening.

The doc does not say s3file.open is supported at all by this library.

I attempted this (not thoroughly tested), it seems to allow me to open file with read mode then write mode.

You can extend S3Boto3StorageFile then implement the open method like so:

    def open(self, mode='rb'):
        if mode != self._mode:
            self.close()
            self.__init__(self.name, mode, self._storage, self.buffer_size)

Then create a storage class that uses this file class.
I hope this issue eventually gets resolved, but looking at the date of your issue creation I'm not very optimistic :(

@rabbit-aaron
Copy link

rabbit-aaron commented Mar 11, 2022

Minimal steps to reproduce this issue:

storage = S3Boto3Storage(...)
my_file = storage.open("test.txt", "w")
my_file.write("hi")
my_file.close()
my_file.open("r")
print(my_file.read())

exception:

Traceback (most recent call last):
  File "/.../main.py", line 0, in <module>
    main()
  File "/.../main.py", line 12, in main
    print(my_file.read())
  File "/.../lib/python3.8/site-packages/storages/backends/s3boto3.py", line 153, in read
    raise AttributeError("File was not opened in read mode.")
AttributeError: File was not opened in read mode.
Django 3.1.14
django-storages 1.12.3

@rabbit-aaron
Copy link

Hey @jschneier, love this lib and I've been using it since 2018!
Is there any plan to support s3file.open? if so I could attempt to implement and submit a pull request.

@yywing
Copy link

yywing commented Jun 22, 2022

#1145

@skrat
Copy link

skrat commented Dec 6, 2022

I'm fighting the django file system all the time, it's insane, I feel like I'm walking in circles. They have problem with this use-case even when using default FileStorage. Unfortunately here the issue persists, and @rabbit-aaron workaround doesn't seem to be working.

inst.data.file.file._file # smells a bit?

Looks weird doesn't it? I don't think I'm wrong when I say that most user code expects the file/storage system to be stateless, meaning it's not caching any state (mode, buffer). So when I open the file for reading, then close it, I naturally expect that all state is lost. So next time I open it, I get a fresh file, be it for reading or writing. Django doesn't do that and it's buggy. I think django-storages can do better.

For any sane person, always use it like this:

with inst.data.storage.open(inst.data.name, 'rb') as f:
    ... # do read stuff
with inst.data.storage.open(inst.data.name, 'wb') as f:
    ... # do write stuff

@jaycle
Copy link

jaycle commented Mar 24, 2023

Was bitten by this issue today. I fixed it by just deleting the .file after exiting the with block. But ideally we'd not need to do this. I spent a little time trying to understand what's going on.

I'm not sure if it's relevant, but in S3Boto3StorageFile we have:

def close(self):
    # Partly omitted..
    if self._file is not None:
        self._file.close()
        self._file = None

When _file is closed, self.closed property is True. Once it's set to none, self.closed is False. When the Django core File is then open()ed again, we end up doing seek(0) instead of resetting the file. Can anything be done at the django-storages layer to update the self._mode on open?

EDIT:
I'll add that Django's docs explicitly state support for reopening with a different mode. To make the django-storages abstraction be less leaky, I'd be in favor of keeping those semantics.

ref: https://docs.djangoproject.com/en/4.1/ref/files/file/#django.core.files.File.open

@jschneier
Copy link
Owner

I think the correct fix here would be to override S3Boto3StorageFile.open, right? The default with the django.core.files.base.File that hits the local filesystem doesn't make any sense in a cloud context. But calling self._storage.open does. I'll accept a PR with this fix.

Not sure if we need the re-seeking? If not, we'd want to close the file then I suppose. So there is some nuance.

I think failing that, @skrat has the correct approach.

@rabbit-aaron
Copy link

rabbit-aaron commented Mar 28, 2023

My attempt: #1227

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants