The Celery tasks that extract data from Elasticsearch and write it to the GreedyBear DB have become much slower over time. This seems to correlate with the growth of the GreedyBear DB: on an empty DB, all extraction processes combined took about 20 seconds; now, with about 500k IOCs in the DB, they take over 2 minutes.
One of the performance killers seems to be the _check_first_time_run method in attacks.py:
    # So we increment the time range to get the data from the last 3 days
    self.first_time_run = True
In lines 100 and 112, QuerySet evaluation is forced by testing the QuerySet in a boolean context. To my understanding, this means that all IOCs are fetched from the DB, which is very expensive. QuerySet.exists() should be used instead. I will investigate further and open a PR.
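To illustrate the difference, here is a minimal sketch using only the standard library's sqlite3 module rather than Django itself. The table, its columns, and the function names are hypothetical stand-ins, not GreedyBear code. The first function mimics what testing a QuerySet in a boolean context does (load all rows, then check truthiness); the second mimics what QuerySet.exists() does (ask the database for at most one row).

```python
import sqlite3


def build_db(num_rows: int) -> sqlite3.Connection:
    """Create an in-memory table standing in for the IOC table (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE ioc (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany(
        "INSERT INTO ioc (name) VALUES (?)",
        ((f"host-{i}",) for i in range(num_rows)),
    )
    return conn


def any_rows_by_fetching(conn: sqlite3.Connection) -> bool:
    # Roughly what `if queryset:` does: load every row, then test for truthiness.
    rows = conn.execute("SELECT id, name FROM ioc").fetchall()
    return bool(rows)


def any_rows_by_exists(conn: sqlite3.Connection) -> bool:
    # Roughly what `queryset.exists()` does: fetch at most one row.
    row = conn.execute("SELECT 1 FROM ioc LIMIT 1").fetchone()
    return row is not None


populated = build_db(1000)
empty = build_db(0)

# Both approaches give the same answer; only the amount of data transferred differs.
print(any_rows_by_fetching(populated), any_rows_by_exists(populated))
print(any_rows_by_fetching(empty), any_rows_by_exists(empty))
```

With 500k rows, the first variant has to transfer and materialize every row just to answer a yes/no question, while the cost of the second stays roughly constant as the table grows.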
* create index on name field of IOC model to speed up _add_ioc function
* use QuerySet.exists() for better performance
* hand over previously added IOC record to _get_sessions method to reduce number of DB queries
* fix returning wrong IOC object
* add more error-resistant time window calculation
* document additional_lookback argument
* minor improvements to get_time_window function
* add test cases for get_time_window function
* fix error in docstring
* remove argument from function that is already a configuration setting and adapt tests accordingly
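The effect of the index mentioned in the first item can be sketched with plain sqlite3. This is an illustration under assumptions, not the actual change: the table layout and the index name idx_ioc_name are hypothetical, and in a Django model an index on a field is typically declared with db_index=True or via Meta.indexes. The sketch compares the query plan for a lookup by name before and after the index exists.

```python
import sqlite3

# Hypothetical stand-in for the IOC table; not the real GreedyBear schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ioc (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO ioc (name) VALUES (?)",
    ((f"host-{i}",) for i in range(1000)),
)


def query_plan(connection: sqlite3.Connection) -> str:
    """Return SQLite's plan for looking up a single IOC by name."""
    rows = connection.execute(
        "EXPLAIN QUERY PLAN SELECT id FROM ioc WHERE name = ?", ("host-42",)
    ).fetchall()
    # The last column of each row holds the human-readable plan step.
    return " | ".join(row[-1] for row in rows)


plan_before = query_plan(conn)  # without an index: a full table scan
conn.execute("CREATE INDEX idx_ioc_name ON ioc (name)")
plan_after = query_plan(conn)   # with the index: a search that uses idx_ioc_name

print(plan_before)
print(plan_after)
```

The exact plan wording depends on the SQLite version, but the second plan should mention the index, which is what keeps a per-IOC lookup by name from scanning the whole table.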
Code referenced in the issue: GreedyBear/greedybear/cronjobs/attacks.py, lines 98 to 115 in 602fcf9