-
Notifications
You must be signed in to change notification settings - Fork 367
Configuration example: save to Redis & MySQL
Assuming you have Redis and SQL already installed for this example. (Or SQL driver of your choice, see "Non MySQL" section below how to configure for other SQL dialects!)
In your Go-Guerrilla configuration, configure the "backend_config" property like so:
"backend_config" :
{
"save_process": "HeadersParser|Debugger|Hasher|Header|Compressor|Redis|Sql",
"log_received_mails" : true,
"sql_driver": "mysql",
"sql_dsn": "root:ok@tcp(127.0.0.1:3306)/gmail_mail?readTimeout=10s&writeTimeout=10s",
"mail_table":"new",
"redis_interface" : "127.0.0.1:6379",
"redis_expire_seconds" : 7200,
"save_workers_size" : 1,
"primary_mail_host":"sharklasers.com"
},
The save_process
property defines which processors will be called in sequence from left to right.
it will first parse the headers using the HeadersParser processor, and finally save the email body to redis
and other data to SQL. The other settings are pretty much self explanatory.
Note that you may add other processors to the save_process
setting if you like.
Note that the SQL & Redis processors is rely on the following processors:
- HeadersParser` for header parsing, to get such fields as the Subject, To and Reply-to headers, but doesn't really need it.
- The Hasher is used to derive the redis key for storing the email body.
- The
Compressor
doesn't actually compress anything, it just sets up a compressor that the Redis or SQL processor will use if they find that it was set. - The 'Header' will fill the envelope with delivery header fields, so that when the body is saved with the delivery header fields.
The following options are available in your config since 2019-08-19:
sql_max_open_conns - sets the maximum number of open connections to the database. The default is 0 (unlimited).
sql_max_idle_conns - sets the maximum number of connections in the idle connection pool. The default is 2.
sql_max_conn_lifetime - sets the maximum amount of time a connection may be reused.
Let's create a table (example below may be using MySQL specific features):
CREATE TABLE IF NOT EXISTS `new` (
`mail_id` BIGINT(20) unsigned NOT NULL AUTO_INCREMENT,
`message_id` varchar(256) character set latin1 NOT NULL COMMENT 'value of [Message-ID] from headers',
`date` datetime NOT NULL,
`from` varchar(256) character set latin1 NOT NULL COMMENT 'value of [From] from headers or return_path (MAIL FROM) if no header present',
`to` varchar(256) character set latin1 NOT NULL COMMENT 'value of [To] from headers or recipient (RCPT TO) if no header present',
`reply_to` varchar(256) NULL COMMENT 'value of [Reply-To] from headers if present',
`sender` varchar(256) NULL COMMENT 'value of [Sender] from headers of present',
`subject` varchar(255) NOT NULL,
`body` varchar(16) NOT NULL,
`mail` longblob NOT NULL,
`spam_score` float NOT NULL,
`hash` char(32) character set latin1 NOT NULL,
`content_type` varchar(64) character set latin1 NOT NULL,
`recipient` varchar(255) character set latin1 NOT NULL COMMENT 'set by the RCPT TO command.',
`has_attach` int(11) NOT NULL,
`ip_addr` varbinary(16) NOT NULL,
`return_path` VARCHAR(255) NOT NULL COMMENT 'set by the MAIL FROM command. Can be empty to indicate a bounce, i.e <>',
`is_tls` BIT(1) DEFAULT b'0' NOT NULL,
PRIMARY KEY (`mail_id`),
KEY `to` (`to`),
KEY `hash` (`hash`),
KEY `date` (`date`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
mail_id
is the primary auto-increment key, message_id
is a value that was parsed from the email headers.
If not present in the headers, it will use <hash>.<local-part-to>@<primary_mail_host>
- where <primary_mail_host>
comes from the config, <hash>
is the hash of the body, <local-part-to>
is the local part of the recipient address.
The above table does not store the body of the email which makes it quick
to query and join, while the body of the email is fetched from Redis
for future processing. The mail
field can contain data in case Redis is down.
Otherwise, if data is in Redis, the mail
will be blank, and
the body
field will contain the word redis
.
To get the email body from Redis, use the value from the hash
column for the redis key.
The ip_address is packed into a 16 byte binary.
For details see this stack overflow answer.
Eg, In MySQL, one would use select INET6_NTOA(ip_addr) from new;
and in PHP one would use inet_ntop
function to get the human readable format.
These columns are always the extracted email address, and do not include sender name. See individual comments embedded in the table definition SQL for details.
Columns has_attach
and spam_score
are unused. Perhaps in the future these may be populated,
or your system could populate later in the pipeline.
By there's an index on to
hash
and date
. Adding indexes may reduce insert performance, it's up to whenever you want to keep these or not, or add more.
You can customize the insert query so that it's compatible with non-MySQL databases.
Eg.
- What if backticks need to be replaced with double quotes?
- What if ? placeholders need to be replaced with $x, where x increments. (From discussion in issue #145 https://github.com/flashmob/go-guerrilla/issues/145)
Use the following settings in your config file:
sql_insert
- specifies the columns for the query
sql_values
- specifies the values, as the name implies.
This change renamed the MySQL processor to Sql
and added the following config options
"sql_driver": "mysql",
"sql_dsn": "root:ok@tcp(127.0.0.1:3306)/gmail_mail?readTimeout=10s&writeTimeout=10s",
removing the following
"mysql_db":"gmail_mail",
"mysql_host":"127.0.0.1:3306",
"mysql_pass":"ok",
"mysql_user":"root",
renamed redis_mysql_batch_timeout
to redis_sql_batch_timeout