added s3 storage for attack event files #285
Conversation
vorband
commented
Mar 9, 2018
- added log_s3.py logging module to store attack event files on external S3 storage
- added s3storage config section in dist config
- added botocore to requirements.txt
- tested with pithos S3 (http) and AWS S3 (https)
- Note: for pithos use signature_version "s3"; for AWS S3 use "s3v4"
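For reference, the resulting s3storage section in glastopf.cfg.dist might look roughly like this (a sketch based on the diff below; the endpoint and signature_version key names are assumptions, not confirmed from the merged file):

```ini
[s3storage]
enabled = True
; endpoint of the S3-compatible service (assumed key name)
endpoint = https://s3.eu-west-1.amazonaws.com
aws_access_key_id = YOUR_aws_access_key_id
aws_secret_access_key = YOUR_aws_secret_access_key
bucket = glastopf
region = eu-west-1
; "s3" for older pithos releases, "s3v4" for AWS S3
signature_version = s3v4
```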
glastopf/glastopf.cfg.dist (Outdated)
    aws_access_key_id = YOUR_aws_access_key_id
    aws_secret_access_key = YOUR_aws_secret_access_key
    bucket = glastopf
    region = europe
Maybe use eu-west-1?
This was meant as a dummy and needs to be set according to the configuration anyway. However, I set it to eu-west-1. Changed in e906972
def insert(self, attack_event):
    if self._initial_connection_happend:
        if attack_event.file_name is not None:
Have you thought about deduplication? I think you will see some of the files very often. Maybe add a flag to the event indicating this is a known file?
Implemented in 0508e99.
Checks both locally and on the remote S3 for duplicate files to avoid uploads. Other logging modules (except for log_sql) will need to be adapted in order to log this status flag.
- add flag for known file (known_file) to AttackEvent: set if file already exists locally
- extended log_sql to new flag
- added deduplication handling in log_s3: local check (flag) and remote check (avoid upload)
Allow interpolation of environment variables from glastopf.cfg. This is useful in a Docker environment when you want settings to be controlled via docker-compose.yml. Interpolation is invoked in glastopf.cfg, for example ... [hpfeed] … chan_events = %(CHANEVENTS)s …
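The interpolation described above can be sketched with the standard library: passing the environment as ConfigParser defaults makes %(NAME)s placeholders resolve to environment variables. This is a minimal illustration, not necessarily how the commit implements it, and the glastopf.events value is only an example:

```python
import os
from configparser import ConfigParser

# Example value, e.g. set via docker-compose.yml in a real deployment
os.environ["CHANEVENTS"] = "glastopf.events"

# Environment variables become defaults, so %(CHANEVENTS)s in any
# section interpolates to the variable's value.
cfg = ConfigParser(defaults=os.environ)
cfg.read_string("""
[hpfeed]
chan_events = %(CHANEVENTS)s
""")

print(cfg.get("hpfeed", "chan_events"))  # -> glastopf.events
```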
- changed filename to use sha256 hash
- stix test to use sha256
- added flag for known rci for php cgi rce
changed s3 storage filename to use sha256 hash, added AttackEvent dict entry for file_sha256
…reliant log processing
removed sha256 as default hash for files to maintain interoperability with md5 reliant log processing
- removed previous changes in test_stix, php_cgi_rce, in order to still use md5
- added file_sha256 to AttackEvent event_dict dict
- log_s3.py now stores on s3 and uses sha256 as filename
config = os.path.join(work_dir, config)
BaseLogger.__init__(self, config)
self.files_dir = os.path.join(data_dir, 'files/')
self.enabled = False
self.enabled = self.config.getboolean("s3storage", "enabled")
and then use self.enabled.
I simply followed the way it was implemented in log_hpfeeds.py. However, I will change it according to your comments.
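The suggested getboolean approach can be sketched in isolation like this (section and option names from the suggestion above; the config contents are an example):

```python
from configparser import ConfigParser

config = ConfigParser()
config.read_string("""
[s3storage]
enabled = True
""")

# getboolean accepts True/False, yes/no, on/off, 1/0 (case-insensitive),
# so the flag no longer needs manual string comparison.
enabled = config.getboolean("s3storage", "enabled")
print(enabled)  # -> True
```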
self.files_dir = os.path.join(data_dir, 'files/')
self.enabled = False
self._initial_connection_happend = False
self.options = {'enabled': self.enabled}
Are you using this?
Nope, removed.
def _start_connection(self, endpoint, accesskey, secretkey, version, region, bucket):
    self.s3session = botocore.session.get_session()
    self.s3session.set_credentials(accesskey, secretkey)
self.accesskey
changed
def _start_connection(self, endpoint, accesskey, secretkey, version, region, bucket):
    self.s3session = botocore.session.get_session()
    self.s3session.set_credentials(accesskey, secretkey)
    self.s3client = self.s3session.create_client(
Are you sure this actually does a connection?
Actually, it does not. I will perform a connection test instead.
self.options = {'enabled': self.enabled}
self.s3client = None
self.s3session = None
gevent.spawn(self._start_connection, self.endpoint, self.accesskey, self.secretkey, self.version, self.region, self.bucket)
Maybe make this blocking so we ensure we have a connection.
I added a test whether the bucket is available; if it fails during the initial start, it will no longer try.
Will update the module now.
if self._initial_connection_happend:
    if attack_event.file_sha256 is not None:
        if attack_event.known_file:
            logger.debug('sha256:{0} / md5:{1} is a known file, it will not be uploaded.'.format(attack_event.file_sha256, attack_event.file_name))
Return here and drop the else.
changed.
searchFile = self.s3client.list_objects_v2(Bucket=self.bucket, Prefix=attack_event.file_sha256)
if (len(searchFile.get('Contents', []))) == 1 and str(searchFile.get('Contents', [])[0]['Key']) == attack_event.file_sha256:
    logger.debug('Not storing file (sha256:{0}) to s3 bucket "{1}" on {2} as it already exists in the bucket.'.format(attack_event.file_sha256, self.bucket, self.endpoint))
else:
Return in the if and drop the else.
changed.
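With the early-return suggestion applied, the remote duplicate check above can be read as a guard clause. The following is a standalone sketch of that logic (the helper name is invented here; in the PR this lives inside insert()), written so any object with a botocore-style list_objects_v2 works:

```python
def is_known_on_s3(s3client, bucket, file_sha256):
    """Check whether an object named after the file's SHA-256 hash
    already exists in the bucket, so the upload can be skipped."""
    result = s3client.list_objects_v2(Bucket=bucket, Prefix=file_sha256)
    contents = result.get('Contents', [])
    # Exactly one hit whose key equals the hash means the file is known.
    return len(contents) == 1 and str(contents[0]['Key']) == file_sha256

# Duck-typed stand-in for a botocore client, for illustration only:
class _FakeClient(object):
    def __init__(self, keys):
        self._keys = keys
    def list_objects_v2(self, Bucket, Prefix):
        hits = [k for k in self._keys if k.startswith(Prefix)]
        return {'Contents': [{'Key': k} for k in hits]} if hits else {}

client = _FakeClient(['abc123'])
print(is_known_on_s3(client, 'glastopf', 'abc123'))  # -> True
print(is_known_on_s3(client, 'glastopf', 'def456'))  # -> False
```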
self._initial_connection_happend = True
def insert(self, attack_event):
    if self._initial_connection_happend:
If the initial connection is blocking, you can drop this if/else.
changed.
Changes implemented. Hope it's OK now.
Awesome, looks good!
Great to see this! Note to anyone finding this PR in future: Pithos after 8th March 2018 does support s3v4.