-
Notifications
You must be signed in to change notification settings - Fork 189
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Recording exceeded limits #212
Conversation
Providing signal and app configure to allow configuration of app witout having to modify its code. Also to provide reasonable defaults that users can override.
Attempt records
Hi, sorry for not responding sooner—I wanted to say this looks like a really good direction! I gave it a read and there are probably some nuances in implementation, like avoiding accidentally running migrations for users who don't want to opt into this behavior. It'll probably take me a couple of more weeks to do a good, thorough dive, just to set expectations. But I did see this and will follow-up and pursue it. One global request, if you're willing: this project has preferred single quotes, and small as it might seem I'd rather that be consistent (except in cases where the quoted string contains a single quote, e.g. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just added some of my thoughts and performance recommendations (though I think this is pretty out of date now):
ratelimit/migrations/0001_initial.py
Outdated
migrations.CreateModel( | ||
name='ExceededLimitRecord', | ||
fields=[ | ||
('id', models.AutoField(auto_created=True, primary_key=True, serialize=False, verbose_name='ID')), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note in Django 3.2 that the default is BigAutoField.
('id', models.AutoField(auto_created=True, primary_key=True, serialize=False, verbose_name='ID')), | |
('id', models.BigAutoField(auto_created=True, primary_key=True, serialize=False, verbose_name='ID')), |
ratelimit/migrations/0001_initial.py
Outdated
options={ | ||
'verbose_name': 'exceeded limit record', | ||
'verbose_name_plural': 'exceeded limit records', | ||
}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd use a linter if possible since there's this weird mix of single quote vs double quotes. But I'm just nit picking.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The convention is to not touch the migration files! As we have consistent code signature everywhere else.
migrations.AddField( | ||
model_name="exceededlimitrecord", | ||
name="last_blocked_at", | ||
field=models.DateTimeField(auto_now=True, verbose_name="Last Blocked At"), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't this be a part of the first migration?
ratelimit/models.py
Outdated
from django.utils.translation import gettext_lazy as _ | ||
|
||
|
||
class BaseModel(models.Model): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd rename this to:
class BaseModel(models.Model): | |
class BaseRatelimitLog(models.Model): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this would also mean you need to regenerate the migration file
limit_record = ExceededLimitRecord.objects.get( | ||
user_agent=user_agent, | ||
ip_address=client_ip_address, | ||
username=username, | ||
path_info=path_info, | ||
) | ||
limit_record.access_attempt_failures += 1 | ||
limit_record.save() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can be made more efficient with F expressions and only getting the ID so we can update the record via the ID but not have to get all the values from the record (without the only
).
limit_record = ExceededLimitRecord.objects.get( | |
user_agent=user_agent, | |
ip_address=client_ip_address, | |
username=username, | |
path_info=path_info, | |
) | |
limit_record.access_attempt_failures += 1 | |
limit_record.save() | |
limit_record = ExceededLimitRecord.objects.get( | |
user_agent=user_agent, | |
ip_address=client_ip_address, | |
username=username, | |
path_info=path_info, | |
).only("id") | |
limit_record.access_attempt_failures = F("access_attempt_failures") + 1 | |
limit_record.save() |
The "only" should be checked in a test. Test case can look like:
with self.assertNumQueries(2):
limit_record = ExceededLimitRecord.objects.get(
user_agent=user_agent,
ip_address=client_ip_address,
username=username,
path_info=path_info,
).only("id")
limit_record.access_attempt_failures = F("access_attempt_failures") + 1
limit_record.save()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good suggestion!
ratelimit/record_handlers/helpers.py
Outdated
from ratelimit.conf import settings | ||
|
||
|
||
def get_client_ip_address(request) -> str: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's already a tool that does that so that we don't need to install an external package (i.e. ipware):
django-ratelimit/django_ratelimit/core.py
Lines 29 to 59 in 651c730
def _get_ip(request): | |
ip_meta = getattr(settings, 'RATELIMIT_IP_META_KEY', None) | |
if not ip_meta: | |
ip = request.META['REMOTE_ADDR'] | |
if not ip: | |
raise ImproperlyConfigured( | |
'IP address in REMOTE_ADDR is empty. This can happen when ' | |
'using a reverse proxy and connecting to the app server with ' | |
'Unix sockets. See the documentation for ' | |
'RATELIMIT_IP_META_KEY: https://bit.ly/3iIpy2x') | |
elif callable(ip_meta): | |
ip = ip_meta(request) | |
elif isinstance(ip_meta, str) and '.' in ip_meta: | |
ip_meta_fn = import_string(ip_meta) | |
ip = ip_meta_fn(request) | |
elif ip_meta in request.META: | |
ip = request.META[ip_meta] | |
else: | |
raise ImproperlyConfigured( | |
'Could not get IP address from "%s"' % ip_meta) | |
if ':' in ip: | |
# IPv6 | |
mask = getattr(settings, 'RATELIMIT_IPV6_MASK', 64) | |
else: | |
# IPv4 | |
mask = getattr(settings, 'RATELIMIT_IPV4_MASK', 32) | |
network = ipaddress.ip_network('{}/{}'.format(ip, mask), strict=False) | |
return str(network.network_address) |
ratelimit/record_handlers/helpers.py
Outdated
def get_client_username(request) -> str: | ||
if request.user.is_authenticated: | ||
return request.user.username | ||
return "Anonymous" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not everyone has a "username" field. You might want to look into django.contrib.auth.models
where they define the Username field inside a list (specifically AbstractUser or AbstractBaseUser). I also think this should instead be based on user ID/pk instead of username to be more efficient (string lookup vs integer lookup).
If you're going to stick with username, then try this (but I'd still recommend using the integer pk AutoField):
def get_client_username(request) -> str: | |
if request.user.is_authenticated: | |
return request.user.username | |
return "Anonymous" | |
from django.contrib.auth.models import AnonymousUser | |
def get_client_username(request) -> str: | |
if request.user.is_authenticated: | |
return getattr(request.user, request.user.USERNAME_FIELD) | |
return AnonymousUser |
ratelimit/record_handlers/helpers.py
Outdated
def make_cache_key(key_components): | ||
|
||
cache_key_components = "".join(value for value in key_components.values() if value) | ||
cache_key_digest = md5(cache_key_components.encode()).hexdigest() | ||
cache_key = "drl-{}".format(cache_key_digest) | ||
|
||
return cache_key |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
isn't this in django_ratelimit.core
?
This probably needs a bit of rework with the conflicts that have cropped up from some other refactors. One thing I want to avoid is baking too much optional behavior into the "core." I wonder if we could tuck some of this into something like |
Yes definitely doable. I think all you gotta do is just move the dirs and it should be fine (ref the token_blacklist app from here: https://github.com/jazzband/django-rest-framework-simplejwt/tree/master/rest_framework_simplejwt where the rest of the dir is not affiliated with any app) |
Update ratelimit with latest changes from the fork
I have updated the PR and fixed the conflicts. Sadly it's hard for me to find some spare time to work on this more but, I will make some more adjustments. I appreciate any help 👍 |
…-update Efficient access attempts update
Hi @jsocol . I've been busy doing all these for this issue #210 .
So I've made a proxy and two handlers, one for cache and another for database backend as you said "I can definitely also see wanting to know which users had historically exceeded the limits, not just current."
You can make as many as handlers you want, if there's any!
This functionality is completely separated from the core to not produce any security flaws.
Tests for the new functionality has been added and all the tests pass.
An abstract doc has been added for the settings. I wasn't sure if there's more explaining needed on the doc as it's simple and straight forward, but if needed, let me know.
I also did some re-structuring to the tests module because the test file was getting a Mega file.