Because I couldn’t find anything that fully automates the SpamAssassin sa-learn process for virtual email users stored in a MySQL database, I wrote my own (borrowing liberally from Jason Schaefer’s SpamAssassin training and spam cleanup script).
This script assumes that:
- You have a MySQL database with virtual users, with a user table called ‘virtual_users’ and the full email address stored in a field called ‘email’.
- Your email is stored in Maildir folders, with a hierarchy starting from /var/vmail/domain.com/username/…
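To make the second assumption concrete, here is a minimal sketch of how a full address maps onto a Maildir root. The helper name `maildir_for` is hypothetical and not part of the script. It splits at the first `@`, which mirrors what the script's SQL does with `SUBSTRING` and `LOCATE`.

```shell
#!/bin/bash
# Hypothetical helper (not part of the script): map a full email address
# to its Maildir root, assuming the /var/vmail/<domain>/<user> layout.
maildir_for() {
    local email="$1"
    local user="${email%%@*}"    # everything before the first '@'
    local domain="${email#*@}"   # everything after the first '@'
    printf '%s\n' "/var/vmail/$domain/$user"
}

maildir_for 'alice@example.com'   # prints /var/vmail/example.com/alice
```

If your layout differs (for example, user directories named after the full address), the paths in the script would need to change accordingly.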
#!/bin/bash

## Database details
USER=''
PASS=''
HOST=''
DB=''

## Where to log stuff
LOG='/var/log/sa-learn.log'

## How many days to wait before deleting spam
## Comment out to disable
CLEAN=30

echo -e "\n\nRun started $(date +%c)" >> "$LOG" 2>&1

## Spam and ham training for all virtual users
## Delete spam older than $CLEAN days
## The query splits each address at the '@' into two columns: user and domain
mysql --skip-column-names -u"$USER" -p"$PASS" -h"$HOST" -D"$DB" -e "SELECT SUBSTRING(email, 1, LOCATE('@', email) - 1) AS user, SUBSTRING(email, LOCATE('@', email) + 1) AS domain FROM virtual_users" | while read -r user domain; do

    ## Spam (both the cur and new folders of .Junk)
    echo "Spam training for $user@$domain" >> "$LOG" 2>&1
    /usr/bin/sa-learn --no-sync --spam "/var/vmail/$domain/$user/.Junk/"{cur,new} >> "$LOG" 2>&1

    ## Ham (the inbox's cur folder; a single-item {cur} would not be brace-expanded)
    echo "Ham training for $user@$domain" >> "$LOG" 2>&1
    /usr/bin/sa-learn --no-sync --ham "/var/vmail/$domain/$user/cur" >> "$LOG" 2>&1

    ## Delete (quoted so the test is false when CLEAN is commented out)
    if [ -n "$CLEAN" ]; then
        echo "Deleting spam for $user@$domain older than $CLEAN days" >> "$LOG" 2>&1
        find "/var/vmail/$domain/$user/.Junk/cur/" -type f -mtime +"$CLEAN" -exec rm {} \;
    fi

done

## Sync the SpamAssassin journal and print out stats
echo "Syncing the SpamAssassin journal" >> "$LOG" 2>&1
/usr/bin/sa-learn --sync >> "$LOG" 2>&1
echo "Statistics for this run:" >> "$LOG" 2>&1
/usr/bin/sa-learn --dump magic >> "$LOG" 2>&1

echo -e "Run finished $(date +%c)" >> "$LOG" 2>&1

exit
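Before letting the script delete anything, it may be worth previewing which files the cleanup step would match. The sketch below assumes a hypothetical helper, `list_old_spam`, that runs the same `find` test as the script but prints the matches instead of removing them.

```shell
#!/bin/bash
# Hypothetical helper (not part of the script): print the files that the
# cleanup step would delete, without deleting them.
#   $1 = a Junk folder, e.g. /var/vmail/<domain>/<user>/.Junk/cur
#   $2 = age threshold in days (the script's CLEAN value)
list_old_spam() {
    local junk_dir="$1"
    local days="$2"
    # -mtime +N matches files last modified more than N whole days ago
    find "$junk_dir" -type f -mtime +"$days" -print
}

# Example call (commented out; substitute a real domain and user):
# list_old_spam "/var/vmail/$domain/$user/.Junk/cur" 30
```

Because `-mtime +30` counts whole 24-hour periods, a message is only matched once it is at least 31 days old.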