Commit 9f7ceea

Merge branch 'maintenance' into develop
Author: Sebastian Wagner
2 parents dd15c62 + 70fd758 commit 9f7ceea

7 files changed: +65 -48 lines changed

CHANGELOG.md

Lines changed: 4 additions & 0 deletions
@@ -21,14 +21,18 @@ CHANGELOG
 #### Collectors
 - `intelmq.bots.collectors.shodan.collector_stream`: Fix access to parameters, the bot wrongly used `self.parameters` (PR#2020 by Mikk Margus Möll).
 - `intelmq.bots.collectors.mail.collector_mail_attach`: Add attachment file name as `extra.file_name` also if the attachment is not compressed (PR#2021 by Alex Kaplan).
+- `intelmq.bots.collectors.http.collector_http_stream`: Fix access to parameters, the bot wrongly used `self.parameters` (by Sebastian Wagner).

 #### Parsers

 #### Experts

 #### Outputs
+- `intelmq.bots.outputs.mcafee.output_esm_ip`: Fix access to parameters, the bot wrongly used `self.parameters` (by Sebastian Wagner).
+- `intelmq.bots.outputs.misp.output_api`: Fix access to parameters, the bot wrongly used `self.parameters` (by Sebastian Wagner).

 ### Documentation
+- Various formatting fixes (by Sebastian Wagner).

 ### Packaging
 - intelmq-update-database crontab: Add missing `recordedfuture_iprisk` update call (by Sebastian Wagner).

docs/dev/feeds-wishlist.rst

Lines changed: 7 additions & 12 deletions
@@ -14,6 +14,7 @@ See :ref:`feeds documentation` for more information on this.
 This list evolved from the issue :issue:`Contribute: Feeds List (#384) <384>`.

 - Lists of feeds:
+
 - `threatfeeds.io <https://threatfeeds.io>`_
 - `TheCyberThreat <http://thecyberthreat.com/cyber-threat-intelligence-feeds/>`_
 - `sbilly: Awesome Security <https://github.com/sbilly/awesome-security#threat-intelligence>`_
@@ -27,16 +28,16 @@ This list evolved from the issue :issue:`Contribute: Feeds List (#384) <384>`.
 - List of potentially interesting data sources:

 - `Abuse.ch SSL Blacklists <https://sslbl.abuse.ch/blacklist/>`_
-- `Adblock Plus Malwaredomains <https://easylist-msie.adblockplus.org/malwaredomains_full.tpl>`_
+- `Adblock Plus <https://adblockplus.org/en/subscriptions>`_
 - `apivoid IP Reputation API <https://www.apivoid.com/api/ip-reputation/>`_
 - `Anomali Limo Free Intel Feed <https://www.anomali.com/resources/limo>`_
 - `APWG's ecrimex <https://www.ecrimex.net>`_
-- `Bad IPs <https://www.badips.com>`_
 - `Berkeley <https://security.berkeley.edu/aggressive_ips/ips>`_
 - `Binary Defense <https://www.binarydefense.com/>`_
 - `Bot Invaders Realtime tracker <http://www.marc-blanchard.com/BotInvaders/index.php>`_
 - `Botherder Targetedthreats <https://github.com/botherder/targetedthreats/>`_
 - `Botscout Last Caught <http://botscout.com/last_caught_cache.htm>`_
+- `botvrij <https://www.botvrij.eu/>`_
 - `Carbon Black Feeds <https://github.com/carbonblack/cbfeeds>`_
 - `CERT.pl Phishing Warning List <http://hole.cert.pl/domains/>`_
 - `Chaos Reigns <http://www.chaosreigns.com/spam/>`_
@@ -45,7 +46,6 @@ This list evolved from the issue :issue:`Contribute: Feeds List (#384) <384>`.
 - `Cyber Crime Tracker <http://cybercrime-tracker.net/all.php>`_
 - `drb-ra C2IntelFeeds <https://github.com/drb-ra/C2IntelFeeds>`_
 - `DNS DB API <https://api.dnsdb.info>`_
-- `Dyn DNS <http://security-research.dyndns.org/pub/>`_
 - `ESET Malware Indicators of Compromise <https://github.com/eset/malware-ioc>`_
 - `Facebook Threat Exchange <https://developers.facebook.com/docs/threat-exchange>`_
 - `FilterLists <https://filterlists.com>`_
@@ -57,20 +57,18 @@ This list evolved from the issue :issue:`Contribute: Feeds List (#384) <384>`.
 - `HP Feeds <https://github.com/rep/hpfeeds>`_
 - `IBM X-Force Exchange <https://exchange.xforce.ibmcloud.com/>`_
 - `ImproWare AntiSpam <https://antispam.imp.ch/>`_
-- `ISC SANS <https://isc.sans.edu/ipsascii.html>`_
 - `ISightPartners <http://www.isightpartners.com/>`_
 - `James Brine <https://jamesbrine.com.au/>`_
 - `Joewein <http://www.joewein.net>`_
 - Maltrail:
+
 - `Malware <https://github.com/stamparm/maltrail/tree/master/trails/static/malware>`_
 - `Suspicious <https://github.com/stamparm/maltrail/tree/master/trails/static/suspicious>`_
 - `Mass Scanners <https://github.com/stamparm/maltrail/blob/master/trails/static/mass_scanner.txt>`_ (for whitelisting)
 - `Malshare <https://malshare.com/>`_
 - `MalSilo Malware URLs <https://malsilo.gitlab.io/feeds/dumps/url_list.txt>`_
 - `Malware Config <http://malwareconfig.com>`_
 - `Malware DB (cert.pl) <https://mwdb.cert.pl/>`_
-- `MalwareDomainList <http://www.malwaredomainlist.com/zeuscsv.php>`_
-- `MalwareDomains <http://www.malwaredomainlist.com/hostslist/yesterday_urls.php>`_
 - `MalwareInt <http://malwareint.com>`_
 - `Malware Must Die <https://malwared.malwaremustdie.org/rss.php>`_
 - `Manity Spam IP addresses <http://www.dnsbl.manitu.net/download/nixspam-ip.dump.gz>`_
@@ -79,20 +77,18 @@ This list evolved from the issue :issue:`Contribute: Feeds List (#384) <384>`.
 - `mIRC Servers <http://www.mirc.com/servers.ini>`_
 - `Monzymerza <https://github.com/monzymerza/parthenon>`_
 - `Multiproxy <http://multiproxy.org/txt_all/proxy.txt>`_
-- `MVPS <http://mvps.org>`_
 - `Neo23x0 signature-base <https://github.com/Neo23x0/signature-base/tree/master/iocs>`_
-- `Null Secure <http://nullsecure.org>`_
 - `OpenBugBounty <https://www.openbugbounty.org/>`_
-- `Payload Security <http://payload-security.com>`_
+- `Phishing Army <https://phishing.army/>`_
 - `Project Honeypot (#284) <http://www.projecthoneypot.org/list_of_ips.php?rss=1>`_
 - `RST Threat Feed <https://rstcloud.net/>`_ (offers a free and a commercial feed)
+- `SANS ISC <https://isc.sans.edu/api/>`_
 - `ShadowServer Sandbox API <http://www.shadowserver.org/wiki/pmwiki.php/Services/Sandboxapi>`_
 - `Shodan search API <https://shodan.readthedocs.io/en/latest/tutorial.html#searching-shodan>`_
 - `Snort <http://labs.snort.org/feeds/ip-filter.blf>`_
 - `stopforumspam Toxic IP addresses and domains <https://www.stopforumspam.com/downloads>`_
-- `Spamhaus BGP feed (BGPf) <https://www.spamhaus.org/bgpf/>`_
+- `Spamhaus Botnet Controller List <https://www.spamhaus.org/bcl/>`_
 - `SteveBlack Hosts File <https://github.com/StevenBlack/hosts>`_
-- `TheCyberThreat <http://thecyberthreat.com/cyber-threat-intelligence-feeds/>`_
 - `The Haleys <http://charles.the-haleys.org/ssh_dico_attack_hdeny_format.php/hostsdeny.txt>`_
 - `Threat Crowd <https://www.threatcrowd.org/feeds/hashes.txt>`_
 - `Threat Grid <http://www.threatgrid.com/>`_
@@ -106,5 +102,4 @@ This list evolved from the issue :issue:`Contribute: Feeds List (#384) <384>`.
 - `Virustotal <https://www.virustotal.com/gui/home/search>`_
 - `virustream <https://github.com/ntddk/virustream>`_
 - `VoIP Blacklist <http://www.voipbl.org/update/>`_
-- `Wordpress Callback Domains <http://callbackdomains.wordpress.com>`_
 - `YourCMC <http://vmx.yourcmc.ru/BAD_HOSTS.IP4>`_

docs/user/bots.rst

Lines changed: 28 additions & 8 deletions
@@ -314,6 +314,7 @@ Generic Mail Attachment Fetcher
 * `ssl_ca_certificate`: Optional string of path to trusted CA certificate. Applies only to IMAP connections, not HTTP. If the provided certificate is not found, the IMAP connection will fail on handshake. By default, no certificate is used.

 The resulting reports contains the following special fields:
+
 * `extra.email_date`: The content of the email's `Date` header
 * `extra.email_subject`: The subject of the email
 * `extra.email_from`: The email's from address
@@ -353,6 +354,7 @@ Generic Mail Body Fetcher
 - `string`, e.g. `'plain'`

 The resulting reports contains the following special fields:
+
 * `extra.email_date`: The content of the email's `Date` header
 * `extra.email_subject`: The subject of the email
 * `extra.email_from`: The email's from address
@@ -545,6 +547,7 @@ MISP Generic
 * `misp_tag_processed`: MISP tag for processed events, optional

 Generic parameters used in this bot:
+
 * `http_verify_cert`: Verify the TLS certificate of the server, boolean (default: `true`)

 **Workflow**
@@ -1788,6 +1791,7 @@ Aggregate
 **Configuration Parameters**

 * **Cache parameters** (see in section :ref:`common-parameters`)
+
 * TTL is not used, using it would result in data loss.
 * **fields** Given fields which are used to aggregate like `classification.type, classification.identifier`
 * **threshold** If the aggregated event is lower than the given threshold after the timespan, the event will get dropped.
@@ -1989,10 +1993,16 @@ Deduplicator
 **Parameters for "fine-grained" deduplication**

 * `filter_type`: type of the filtering which can be "blacklist" or "whitelist". The filter type will be used to define how Deduplicator bot will interpret the parameter `filter_keys` in order to decide whether an event has already been seen or not, i.e., duplicated event or a completely new event.
+
 * "whitelist" configuration: only the keys listed in `filter_keys` will be considered to verify if an event is duplicated or not.
 * "blacklist" configuration: all keys except those in `filter_keys` will be considered to verify if an event is duplicated or not.
 * `filter_keys`: string with multiple keys separated by comma. Please note that `time.observation` key will not be considered even if defined, because the system always ignore that key.

+When using a whitelist field pattern and a small number of fields (keys), it becomes more important, that these fields exist in the events themselves.
+If a field does not exist, but is part of the hashing/deduplication, this field will be ignored.
+If such events should not get deduplicated, you need to filter them out before the deduplication process, e.g. using a sieve expert.
+See also `this discussion thread <https://lists.cert.at/pipermail/intelmq-users/2021-July/000370.html>`_ on the mailing-list.
+
 **Parameters Configuration Example**

 *Example 1*
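
The note added in this hunk says that a whitelisted field missing from an event is ignored during hashing. The following is a minimal sketch of that effect, not the Deduplicator's actual implementation: `dedup_key` is a hypothetical helper, and the choice of SHA-256 over sorted JSON is an assumption made only for illustration.

```python
import hashlib
import json


def dedup_key(event, filter_keys, filter_type="whitelist"):
    """Hypothetical helper: hash only the fields that take part in deduplication."""
    keys = {key.strip() for key in filter_keys.split(",")}
    if filter_type == "whitelist":
        # Only listed keys count; a listed key that is absent simply contributes nothing.
        selected = {k: v for k, v in event.items() if k in keys}
    else:
        # "blacklist": every key except the listed ones counts.
        selected = {k: v for k, v in event.items() if k not in keys}
    # time.observation never takes part, as the parameter description states.
    selected.pop("time.observation", None)
    payload = json.dumps(selected, sort_keys=True).encode()
    return hashlib.sha256(payload).hexdigest()


FILTER_KEYS = "source.ip,classification.type"

# Two events that both lack source.ip but differ elsewhere.
no_ip_a = {"classification.type": "scanner"}
no_ip_b = {"classification.type": "scanner", "source.port": 80}
# An event that does carry the whitelisted source.ip field.
with_ip = {"source.ip": "192.0.2.1", "classification.type": "scanner"}

collapsed = dedup_key(no_ip_a, FILTER_KEYS) == dedup_key(no_ip_b, FILTER_KEYS)
distinct = dedup_key(with_ip, FILTER_KEYS) != dedup_key(no_ip_a, FILTER_KEYS)
```

In this sketch the two events without `source.ip` hash identically, which is why the documentation suggests filtering such events out before deduplication if they should be kept apart.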
@@ -2049,6 +2059,7 @@ DO Portal Expert Bot
 * `description:` The DO portal retrieves the contact information from a DO portal instance: http://github.com/certat/do-portal/

 **Configuration Parameters**
+
 * `mode` - Either `replace` or `append` the new abuse contacts in case there are existing ones.
 * `portal_url` - The URL to the portal, without the API-path. The used URL is `$portal_url + '/api/1.0/ripe/contact?cidr=%s'`.
 * `portal_api_key` - The API key of the user to be used. Must have sufficient privileges.
@@ -2068,6 +2079,7 @@ Field Reducer Bot
 * `description:` The field reducer bot is capable of removing fields from events.

 **Configuration Parameters**
+
 * `type` - either `"whitelist"` or `"blacklist"`
 * `keys` - Can be a JSON-list of field names (`["raw", "source.account"]`) or a string with a comma-separated list of field names (`"raw,source.account"`).

@@ -2093,17 +2105,18 @@ The filter bot is capable of filtering specific events.
 * `lookup:` none
 * `public:` yes
 * `cache (redis db):` none
-* `description:` filter messages (drop or pass messages) FIXME
+* `description:` A simple filter for messages (drop or pass) based on a exact string comparison or regular expression

 **Configuration Parameters**

 *Parameters for filtering with key/value attributes*

-* `filter_key` - key from data format
-* `filter_value` - value for the key
-* `filter_action` - action when a message match to the criteria (possible actions: keep/drop)
-* `filter_regex` - attribute determines if the `filter_value` shall be treated as regular expression or not.
-If this attribute is not empty, the bot uses python's "search" function to evaluate the filter.
+* ``filter_key`` - key from data format
+* ``filter_value`` - value for the key
+* ``filter_action`` - action when a message match to the criteria (possible actions: keep/drop)
+* ``filter_regex`` - attribute determines if the ``filter_value`` shall be treated as regular expression or not.
+If this attribute is not empty (can be ``true``, ``yes`` or whatever), the bot uses python's ```re.search`` <https://docs.python.org/3/library/re.html#re.search>`_ function to evaluate the filter with regular expressions.
+If this attribute is empty or evaluates to false, an exact string comparison is performed. A check on string *inequality* can be achieved with the usage of *Paths* described below.

 *Parameters for time based filtering*
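
The reworded `filter_regex` description distinguishes a regular-expression search from an exact string comparison. A small sketch of that distinction follows; `matches` and `keep_event` are hypothetical helpers rather than the bot's code, and the handling of non-matching messages (they get the opposite treatment of `filter_action`) is an assumption.

```python
import re


def matches(event, filter_key, filter_value, filter_regex=None):
    """Hypothetical helper mirroring the documented key/value parameters."""
    value = event.get(filter_key)
    if value is None:
        return False
    if filter_regex:
        # Any non-empty setting switches to re.search, which matches anywhere in the value.
        return re.search(filter_value, str(value)) is not None
    # Empty or false: plain equality of the two strings.
    return str(value) == filter_value


def keep_event(event, filter_key, filter_value, filter_action, filter_regex=None):
    """Return True if the event passes the filter (assumed semantics)."""
    hit = matches(event, filter_key, filter_value, filter_regex)
    return hit if filter_action == "keep" else not hit


event = {"source.fqdn": "mail.example.com"}

regex_hit = matches(event, "source.fqdn", r"example\.com$", filter_regex="yes")
exact_miss = matches(event, "source.fqdn", "example.com")
exact_hit = matches(event, "source.fqdn", "mail.example.com")
dropped = not keep_event(event, "source.fqdn", r"example\.com$", "drop", filter_regex="yes")
```

Because `re.search` matches anywhere in the value, a pattern meant to match the whole field needs explicit anchors such as `^` and `$`.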

@@ -2175,17 +2188,19 @@ Format Field

 .. code-block:: json

-"columns": "malware.name,extra.tags"
+"columns": "malware.name,extra.tags"

-* `strip_chars` - a set of characters to remove as leading/trailing characters(default: ` ` or whitespace)
+* `strip_chars` - a set of characters to remove as leading/trailing characters(default: space)

 *Parameters for replacing chars*
+
 * `replace_column` - key from data format
 * `old_value` - the string to search for
 * `new_value` - the string to replace the old value with
 * `replace_count` - number specifying how many occurrences of the old value you want to replace(default: `1`)

 *Parameters for splitting string to list of string*
+
 * `split_column` - key from data format
 * `split_separator` - specifies the separator to use when splitting the string(default: `,`)

@@ -2725,13 +2740,15 @@ Sources:
 **Configuration Parameters**

 * `fields`: string, comma-separated list of fields e.g. `destination.ip,source.asn,source.url`. Supported fields are:
+
 * `destination.asn` & `source.asn`
 * `destination.fqdn` & `source.fqdn`
 * `destination.ip` & `source.ip`
 * `destination.url` & `source.url`
 * `policy`: string, comma-separated list of policies, e.g. `del,drop,drop`. `drop` will cause that the the entire event to be removed if the field is , `del` causes the field to be removed.

 With the example parameter values given above, this means that:
+
 * If a `destination.ip` value is part of a reserved network block, the field will be removed (policy "del").
 * If a `source.asn` value is in the range of reserved AS numbers, the event will be removed altogether (policy "drop).
 * If a `source.url` value contains a host with either an IP address part of a reserved network block, or a reserved domain name (or with a reserved TLD), the event will be dropped (policy "drop")
@@ -3150,6 +3167,7 @@ Threshold
 **Limitations**

 This bot has certain limitations and is not a true threshold filter (yet). It works like this:

+
 1. Every incoming message is hashed according to the `filter_*` parameters.
 2. The hash is looked up in the cache and the count is incremented by 1, and the TTL of the key is (re-)set to the timeout.
 3. If the new count matches the threshold exactly, the message is forwarded. Otherwise it is dropped.
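
The three steps shown in this hunk can be sketched as follows. This is an illustration under stated assumptions, not the Threshold bot's code: an in-memory dict stands in for the Redis cache, `ThresholdSketch` and its `filter_keys` list are hypothetical names, and time is passed in explicitly so the TTL reset is easy to follow.

```python
import hashlib
import json


class ThresholdSketch:
    """Hypothetical stand-in for the counting logic described in the steps above."""

    def __init__(self, threshold, timeout, filter_keys):
        self.threshold = threshold
        self.timeout = timeout
        self.filter_keys = filter_keys
        self._cache = {}  # digest -> (count, expires_at); replaces the Redis cache here

    def _digest(self, event):
        # Step 1: hash the message according to the selected fields.
        selected = {k: event[k] for k in self.filter_keys if k in event}
        return hashlib.sha256(json.dumps(selected, sort_keys=True).encode()).hexdigest()

    def should_forward(self, event, now):
        key = self._digest(event)
        count, expires_at = self._cache.get(key, (0, 0.0))
        if now >= expires_at:
            count = 0  # the key expired, so counting starts over
        # Step 2: increment the count and (re-)set the TTL.
        count += 1
        self._cache[key] = (count, now + self.timeout)
        # Step 3: forward only when the count equals the threshold exactly.
        return count == self.threshold


bot = ThresholdSketch(threshold=3, timeout=60, filter_keys=["source.ip"])
event = {"source.ip": "192.0.2.1"}
decisions = [bot.should_forward(event, now=t) for t in (0, 1, 2, 3)]
after_expiry = bot.should_forward(event, now=100)
```

Only the third message is forwarded; the fourth is dropped again because the count has moved past the threshold, which is the limitation the text describes.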
@@ -3319,6 +3337,7 @@ Events without `source.url`, `source.fqdn`, `source.ip`, or `source.asn`, are ig
 only contains the domain. uWhoisd will automatically strip the subdomain part if it is present in the request.

 Example: `https://www.theguardian.co.uk`
+
 * TLD: `co.uk` (uWhoisd uses the `Mozilla public suffix list <https://publicsuffix.org/list/>`_ as a reference)
 * Domain: `theguardian.co.uk`
 * Subdomain: `www`
@@ -3877,6 +3896,7 @@ The parameters marked with 'PostgreSQL' will be sent to libpq via psycopg2. Chec
 **PostgreSQL**

 You have two basic choices to run PostgreSQL:
+
 1. on the same machine as intelmq, then you could use Unix sockets if available on your platform
 2. on a different machine. In which case you would need to use a TCP connection and make sure you give the right connection parameters to each psql or client call.

intelmq/bots/collectors/http/collector_http_stream.py

Lines changed: 5 additions & 4 deletions
@@ -23,8 +23,9 @@

 from intelmq.lib.bot import CollectorBot
 from intelmq.lib.mixins import HttpMixin
-from intelmq.lib.utils import decode, create_request_session
-from intelmq.lib.exceptions import MissingDependencyError
+from intelmq.lib.utils import decode
+
+import requests.exceptions


 class HTTPStreamCollectorBot(CollectorBot, HttpMixin):
@@ -73,12 +74,12 @@ def process(self):
                 IncompleteRead,
                 ReadTimeoutError) as exc:
             self.__error_count += 1
-            if (self.__error_count > self.parameters.error_max_retries):
+            if (self.__error_count > self.error_max_retries):
                 self.__error_count = 0
                 raise
             else:
                 self.logger.info('Got exception %r, retrying (consecutive error count %d <= %d).',
-                                 exc, self.__error_count, self.parameters.error_max_retries)
+                                 exc, self.__error_count, self.error_max_retries)

         self.logger.info('Stream stopped.')
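
The fix in this file reads `error_max_retries` directly from the bot instead of from `self.parameters`. The retry counting around it can be sketched in isolation as below. `StreamRetrySketch` is a hypothetical class that does not derive from `CollectorBot`, and setting parameters through keyword arguments is an assumption made to keep the example self-contained.

```python
import logging


class StreamRetrySketch:
    """Hypothetical class showing a parameter held as an attribute on the object itself."""

    error_max_retries = 3  # default value, overridable per instance

    def __init__(self, **params):
        # Assumed mechanism: configured values become plain instance attributes.
        for name, value in params.items():
            setattr(self, name, value)
        self._error_count = 0
        self.logger = logging.getLogger("stream-retry-sketch")

    def handle_stream_error(self, exc):
        """Count consecutive errors; give up once the configured limit is exceeded."""
        self._error_count += 1
        if self._error_count > self.error_max_retries:
            self._error_count = 0
            raise exc
        self.logger.info('Got exception %r, retrying (consecutive error count %d <= %d).',
                         exc, self._error_count, self.error_max_retries)


bot = StreamRetrySketch(error_max_retries=2)
bot.handle_stream_error(ConnectionError("first"))
bot.handle_stream_error(ConnectionError("second"))
try:
    bot.handle_stream_error(ConnectionError("third"))
    raised = False
except ConnectionError:
    raised = True
```

With a limit of 2, the first two errors are logged and retried and the third is re-raised, after which the counter is back at zero.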

intelmq/bots/outputs/mcafee/output_esm_ip.py

Lines changed: 3 additions & 3 deletions
@@ -43,7 +43,7 @@ def init(self):

         self.esm = ESM()
         try:
-            self.esm.login(self.parameters.esm_ip, self.parameters.esm_user, self.parameters.esm_password)
+            self.esm.login(self.esm_ip, self.esm_user, self.esm_password)
         except Exception:
             raise ValueError('Could not Login to ESM.')

@@ -53,7 +53,7 @@ def init(self):
             retVal = self.esm.post('sysGetWatchlists?hidden=false&dynamic=false&writeOnly=false&indexedOnly=false',
                                    watchlist_filter)
             for WL in retVal:
-                if (WL['name'] == self.parameters.esm_watchlist):
+                if (WL['name'] == self.esm_watchlist):
                     self.watchlist_id = WL['id']
         except TypeError:
             self.logger.error('Watchlist not found. Please verify name of the watchlist.')
@@ -64,7 +64,7 @@ def process(self):
         self.logger.info('Message received.')
         try:
             self.esm.post('sysAddWatchlistValues', {'watchlist': {'value': self.watchlist_id},
-                                                    'values': '["' + event.get(self.parameters.field) + '"]'},
+                                                    'values': '["' + event.get(self.field) + '"]'},
                           raw=True)
             self.logger.info('ESM Watchlist updated')
             self.acknowledge_message()
