Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Issue/#513 New geoip database download URL #514

Merged
merged 3 commits into from
Jan 11, 2020

Conversation

Askaholic
Copy link
Collaborator

Updated the geoip url and added a license key parameter. The license key does not have a default and must be set by a server admin in order for the new database to be downloaded.

The database is now served via a tar.gz which contains a copyright and a license text as well as the database. The server now downloads the archive and compares its checksum against the md5 hash available at the same url. Once the archive has been successfully downloaded, the database file is extracted from the archive and copied over the old version of the file (if it exists).

Closes #513

Copy link
Collaborator

@norraxx norraxx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just don't get why to use inner functions. Like they're not making code readable, and don't give us much, also they're not using local variables, they're just small functions, that can be extracted out of the class, to utils or something like that. They're just helping functions, that can be separate.


self._logger.info("Downloading new geoip database")

# Download new file to a temp location
temp_file_path = "/tmp/geoip.mmdb.gz"
await self._download_file(config.GEO_IP_DATABASE_URL, temp_file_path)
temp_file_path = "/tmp/geoip.mmdb.tar.gz"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see a reason to. The purpose is just so it doesn't overwrite the existing file while downloading. There's no need to use a module just to add /tmp to the beginning of a file path :P

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If something goes wrong, file could stay locked for a while. I just propose the tempfile.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How would it get locked? It's opened in a context manager and this function is supposed to be called once every 3 weeks...

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok. NP.

temp_file_path
)

def get_db_member(tar: tarfile.TarFile) -> str:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is there a reason for inner function? Like i mean, you don't use local variables from outer scope, etc...

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it's for readability. If I put it inline it looks like this:

with tarfile.open(temp_file_path, 'r:gz') as tar:
    for name in tar.getnames():
        if name.endswith("GeoLite2-Country.mmdb"):
            with open(self.file_path, 'wb') as f_out:
                f_in = tar.extractfile(name)

Which is too much indentation IMO

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could also do name = next((n for n in tar.getnames() if n.endswith(foo)), None), then throw if it's None.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh yea, that's pretty clean. There's also a function for getting TarInfo objects instead of just names which we can use to make sure that we only look at files and not directories as well.

server/geoip_service.py Show resolved Hide resolved
with open(self.file_path, 'wb') as f_out:
f_in = tar.extractfile(name)
assert f_in is not None
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what's the reason behind of it?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Like, you get_db_member raising if there is no file, isn't it?

Copy link
Collaborator Author

@Askaholic Askaholic Jan 8, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because tar.extractfile has return type Optional[IO[bytes]] I get a type error (warning) if I use it directly since the type checker doesn't know about the relationship between tar.getnames and tar.extractfile.

The assert should theoretically always succeed (unless there is a bug in tarfile I guess) but it makes the type checker happy since it basically changes the type from Optional[IO[bytes]] to just IO[bytes].

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

=^_^= couldn't see that without analyzer. Yes, i thought about wrong tar too, but i don't know, if there is such possibility, when we already could read name.. like it's highly non possible to not to have whole file, isn't it? :-)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea, like I said. I don't think it's possible for that assert to fail.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unless someone's feeling malicious and the tar has a directory with GeoLite2-Country.mmdb in it. IMHO it does not hurt to just throw.

@Askaholic
Copy link
Collaborator Author

I did think about whether to move the functions get_db_file_with_checksum and get_checksum somewhere else, but I think it makes more sense to leave them there. They aren't being used anywhere else in the code so I feel that it would just be annoying to read if you had to go find them in a different file. They are basically like lambda functions except that python doesn't allow async lambdas, so they have to be real functions (well coroutines actually).

The other one as I explained is for readability because I didn't want to have 5 levels of nested indentation.

@norraxx
Copy link
Collaborator

norraxx commented Jan 8, 2020

How about to extract them from class and leave in this file?

@Askaholic Askaholic force-pushed the issue/#513-geoip-url branch 2 times, most recently from bbd863b to 63bc758 Compare January 10, 2020 23:54
@Askaholic Askaholic merged commit 23194da into FAForever:develop Jan 11, 2020
@Askaholic Askaholic deleted the issue/#513-geoip-url branch January 11, 2020 00:10
Brutus5000 pushed a commit that referenced this pull request Jan 27, 2020
* Changed database download to use the new license keys and file format

* Refactor tarfile extraction and error conditions

* Move error checking to outer function
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Download GeoIP database with Maxmind account token
3 participants