-
Notifications
You must be signed in to change notification settings - Fork 63
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Issue/#513 New geoip database download URL #514
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just don't get why to use inner functions. Like they're not making code readable, and don't give us much, also they're not using local variables, they're just small functions, that can be extracted out of the class, to utils or something like that. They're just helping functions, that can be separate.
|
||
self._logger.info("Downloading new geoip database") | ||
|
||
# Download new file to a temp location | ||
temp_file_path = "/tmp/geoip.mmdb.gz" | ||
await self._download_file(config.GEO_IP_DATABASE_URL, temp_file_path) | ||
temp_file_path = "/tmp/geoip.mmdb.tar.gz" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe use https://docs.python.org/3/library/tempfile.html ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see a reason to. The purpose is just so it doesn't overwrite the existing file while downloading. There's no need to use a module just to add /tmp
to the beginning of a file path :P
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If something goes wrong, file could stay locked for a while. I just propose the tempfile.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How would it get locked? It's opened in a context manager and this function is supposed to be called once every 3 weeks...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok. NP.
server/geoip_service.py
Outdated
temp_file_path | ||
) | ||
|
||
def get_db_member(tar: tarfile.TarFile) -> str: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is there a reason for inner function? Like i mean, you don't use local variables from outer scope, etc...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it's for readability. If I put it inline it looks like this:
with tarfile.open(temp_file_path, 'r:gz') as tar:
for name in tar.getnames():
if name.endswith("GeoLite2-Country.mmdb"):
with open(self.file_path, 'wb') as f_out:
f_in = tar.extractfile(name)
Which is too much indentation IMO
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could also do name = next((n for n in tar.getnames() if n.endswith(foo)), None)
, then throw if it's None.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh yea, that's pretty clean. There's also a function for getting TarInfo
objects instead of just names which we can use to make sure that we only look at files and not directories as well.
server/geoip_service.py
Outdated
with open(self.file_path, 'wb') as f_out: | ||
f_in = tar.extractfile(name) | ||
assert f_in is not None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what's the reason behind of it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Like, you get_db_member
raising if there is no file, isn't it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because tar.extractfile
has return type Optional[IO[bytes]]
I get a type error (warning) if I use it directly since the type checker doesn't know about the relationship between tar.getnames
and tar.extractfile
.
The assert should theoretically always succeed (unless there is a bug in tarfile
I guess) but it makes the type checker happy since it basically changes the type from Optional[IO[bytes]]
to just IO[bytes]
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
=^_^= couldn't see that without analyzer. Yes, i thought about wrong tar too, but i don't know, if there is such possibility, when we already could read name.. like it's highly non possible to not to have whole file, isn't it? :-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yea, like I said. I don't think it's possible for that assert to fail.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unless someone's feeling malicious and the tar has a directory with GeoLite2-Country.mmdb
in it. IMHO it does not hurt to just throw.
I did think about whether to move the functions The other one as I explained is for readability because I didn't want to have 5 levels of nested indentation. |
How about to extract them from class and leave in this file? |
bbd863b
to
63bc758
Compare
63bc758
to
40d8c17
Compare
Updated the geoip url and added a license key parameter. The license key does not have a default and must be set by a server admin in order for the new database to be downloaded.
The database is now served via a
tar.gz
which contains a copyright and a license text as well as the database. The server now downloads the archive and compares its checksum against the md5 hash available at the same url. Once the archive has been successfully downloaded, the database file is extracted from the archive and copied over the old version of the file (if it exists).Closes #513