This repository has been archived by the owner on Jan 20, 2023. It is now read-only.

Implement anti-spam algorithm #23

Open
baptiste0928 opened this issue Feb 2, 2022 · 0 comments
Labels
c-event Related to event handling t-feature Introduces a new feature

Comments

@baptiste0928
Owner

baptiste0928 commented Feb 2, 2022

The anti-spam algorithm performs analysis on messages sent in the last 30 seconds.
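
As a rough illustration of the 30-second window, the following sketch keeps a per-user buffer and evicts anything older than the window. The `RecentMessages` class and its method names are hypothetical and not part of the proposal; it also assumes messages arrive in chronological order.

```python
from collections import deque
from dataclasses import dataclass, field

WINDOW_SECONDS = 30.0  # from the issue: only the last 30 seconds are analysed


@dataclass
class RecentMessages:
    """Hypothetical per-user buffer of (timestamp, content) pairs, oldest first."""

    window: float = WINDOW_SECONDS
    _items: deque = field(default_factory=deque)

    def push(self, timestamp: float, content: str) -> None:
        # Assumes timestamps are non-decreasing (messages arrive in order).
        self._items.append((timestamp, content))
        self._evict(timestamp)

    def _evict(self, now: float) -> None:
        # Drop every entry that is older than the window.
        while self._items and now - self._items[0][0] > self.window:
            self._items.popleft()

    def snapshot(self, now: float) -> list:
        """Return the messages still inside the window at time `now`."""
        self._evict(now)
        return list(self._items)
```

A deque is used here because eviction only ever happens at the oldest end, so both insertion and eviction stay cheap.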

Goals

Targets

The algorithm targets the following abusive behaviors:

  • Flooding a channel with a large number of messages in a short time.
  • Spamming role and/or user mentions, either by mentioning many users quickly or by mentioning the same user repeatedly.
  • Spamming links, files and/or media, most often to spread malware, self-promote or send abusive content (flashing GIFs, ...).
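
One possible way to turn these three behaviors into measurable signals is sketched below. The `Message` shape, its field names and the `window_signals` function are assumptions made for illustration; the issue does not define thresholds, so none are applied here.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Message:
    # Hypothetical shape; the field names are illustrative only.
    mentions: list = field(default_factory=list)  # mentioned user or role IDs
    links: int = 0        # number of links in the message
    attachments: int = 0  # number of attached files or media


def window_signals(messages: list) -> dict:
    """Summarise one user's messages from the analysis window."""
    mention_counts = Counter(m for msg in messages for m in msg.mentions)
    return {
        # Flooding: raw number of messages in the window.
        "messages": len(messages),
        # Mention spam, variant 1: many different targets.
        "distinct_mentions": len(mention_counts),
        # Mention spam, variant 2: the same target over and over.
        "max_repeated_mention": max(mention_counts.values(), default=0),
        # Link, file and media spam.
        "links_and_files": sum(m.links + m.attachments for m in messages),
    }
```

Each signal could then be compared against its own limit, which would let mention and link spam trigger earlier than plain flooding.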

Detection requirements

In order to be as efficient as possible and the least disruptive to members, the algorithm must meet the following requirements:

  • Normal conversations between users should not trigger the anti-spam, even when users send multiple messages quickly, such as a sentence split across several messages.
  • When a user behaves abusively, the bot must respond quickly (after fewer than 5 messages). This matters most for mention and link spam, which can be very annoying for members.

Threat response

In case an abusive behavior is detected, an appropriate response must be given by the bot:

  • The user must be sanctioned so that the moderation team can handle the situation. The user's most recent messages should also be deleted to remove potential threats such as links.
  • The sanction should be proportional to the severity of the abuse. Users who flood should receive a light sanction as a warning, whereas users who show deliberately abusive behavior must be muted for a long period, or even banned.
  • If the bot cannot apply a sanction, it should warn the moderation team but take no other action, such as removing messages, to avoid being rate-limited. It could try to fall back on a lighter sanction, such as muting instead of banning.
  • Moderators must be informed of the bot's actions through the logs channel. A history of deleted messages should be kept so that the detected abuse can be inspected precisely.
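
The response rules above could be sketched roughly as follows. The sanction names, their ordering, and the `can_apply` callback (standing in for a permission check) are all assumptions for illustration, not part of the proposal.

```python
from typing import Callable, Optional

# Ordered from most to least severe. The names are illustrative assumptions.
SANCTIONS = ["ban", "long_mute", "short_mute"]


def apply_with_fallback(wanted: str, can_apply: Callable[[str], bool]) -> Optional[str]:
    """Return the first sanction at or below `wanted` that the bot can apply."""
    start = SANCTIONS.index(wanted)
    for sanction in SANCTIONS[start:]:
        if can_apply(sanction):
            return sanction
    return None  # nothing could be applied


def plan_response(wanted: str, can_apply: Callable[[str], bool]) -> list:
    """List the actions the bot would take, in order."""
    applied = apply_with_fallback(wanted, can_apply)
    if applied is None:
        # No sanction possible: only warn moderators and leave messages
        # untouched, so the bot does not risk being rate-limited.
        return ["warn_moderators"]
    return [applied, "delete_recent_messages", "log_to_moderators"]
```

Returning a plan instead of performing the actions keeps the decision logic separate from the API calls, which makes it straightforward to test.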

Algorithm implementation

Message analysis

Before processing, some information is extracted from received messages.

  • The message content is transformed into an ASCII-only string and split into words according to the Unicode specification.
  • Links are extracted and categorized (invite, media, other).
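
A minimal sketch of this extraction step, using only the Python standard library, might look like the following. Several choices here are assumptions rather than part of the proposal: the regex word split is only an approximation of Unicode word segmentation, invite links are assumed to use the `discord.gg` host, media links are recognised by file extension, and URLs are removed before the word split.

```python
import re
import unicodedata
from urllib.parse import urlparse

URL_RE = re.compile(r"https?://\S+")
# Assumptions: the host set and extension list are illustrative only.
INVITE_HOSTS = {"discord.gg"}
MEDIA_EXTENSIONS = (".png", ".jpg", ".jpeg", ".gif", ".webp", ".mp4", ".webm")


def to_ascii(text: str) -> str:
    """Decompose accented characters, then drop anything that is not ASCII."""
    decomposed = unicodedata.normalize("NFKD", text)
    return decomposed.encode("ascii", "ignore").decode("ascii")


def split_words(text: str) -> list:
    # Approximation: a regex split, not full Unicode word segmentation.
    return re.findall(r"\w+", to_ascii(text))


def categorize_link(url: str) -> str:
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host in INVITE_HOSTS:
        return "invite"
    if parsed.path.lower().endswith(MEDIA_EXTENSIONS):
        return "media"
    return "other"


def analyze(content: str) -> dict:
    """Extract words and categorized links from a raw message."""
    links = [(url, categorize_link(url)) for url in URL_RE.findall(content)]
    # URLs are stripped first so they do not end up in the word list.
    words = split_words(URL_RE.sub(" ", content))
    return {"words": words, "links": links}
```

Note that the ASCII step here silently drops characters without an ASCII decomposition; a real implementation would likely need a transliteration table to handle look-alike characters.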

Performance

  • The algorithm is only run if there are more than 3 messages in the cache.
  • Expensive computations such as message analysis are done only once, when the message is received, and their results are cached.
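
Both performance rules can be illustrated with a small cache. This is a sketch only: the `AnalysisCache` class, its method names, and the use of a message ID as the key are assumptions, and eviction of old entries is left out for brevity.

```python
MIN_MESSAGES = 3  # from the issue: run only with more than 3 cached messages


class AnalysisCache:
    """Hypothetical per-user cache of analysis results."""

    def __init__(self, analyze):
        self._analyze = analyze  # expensive function, called once per message
        self._results = {}       # message ID -> cached analysis result

    def on_message(self, message_id, content):
        # The expensive analysis happens here, at reception time only.
        self._results[message_id] = self._analyze(content)

    def should_run(self):
        # Skip the detection algorithm when there is too little to look at.
        return len(self._results) > MIN_MESSAGES

    def results(self):
        # Reads never trigger a new analysis; they only return cached values.
        return list(self._results.values())
```

With this shape, the detection pass can read `results()` as often as needed without re-analysing any message.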

Possible improvements

The proposed implementation is minimal and does not resist every circumvention attempt. It will need to be improved in the future.

@baptiste0928 baptiste0928 added t-feature Introduces a new feature c-anti-spam labels Feb 2, 2022
@baptiste0928 baptiste0928 added c-event Related to event handling and removed c-anti-spam labels Mar 22, 2022