SpamAssassin Configuration: Complete Anti-Spam Setup Guide

Introduction

Spam email remains one of the most persistent challenges in email administration. Despite improvements in email authentication and filtering, spam continues to consume bandwidth, storage, and user time while posing security risks through phishing and malware distribution. SpamAssassin provides a powerful, open-source solution for identifying and filtering spam on mail servers.

SpamAssassin uses a scoring system that evaluates many characteristics of each email message. It examines headers and body content, queries DNS blacklists, applies Bayesian classification, and runs numerous other tests to calculate a spam score. Messages at or above a configurable threshold are flagged as spam, so the mail server or delivery agent can reject, tag, or quarantine them automatically.

What sets SpamAssassin apart is its flexibility and accuracy. Unlike simple keyword-based filters, SpamAssassin combines dozens of tests weighted by their effectiveness. It learns from training data using Bayesian analysis, adapts to new spam techniques, and can be customized extensively for specific needs. When properly configured and trained, SpamAssassin achieves excellent spam detection rates while maintaining minimal false positives.

This comprehensive guide walks you through installing SpamAssassin, integrating it with Postfix, configuring rules and scoring, implementing Bayesian filtering, optimizing performance, and maintaining the system for long-term effectiveness.

Prerequisites

Before installing SpamAssassin, ensure you have:

System Requirements

  • Working mail server with Postfix configured
  • Ubuntu 20.04/22.04, Debian 10/11, CentOS 8/Rocky Linux 8, or similar
  • Minimum 1GB RAM (2GB+ recommended for Bayesian filtering)
  • At least 10GB free disk space
  • Root or sudo access

Mail Server Requirements

  • Postfix properly sending and receiving email
  • Dovecot configured for local delivery (recommended)
  • Access to mail server logs
  • Understanding of current mail flow

Knowledge Prerequisites

  • Linux command-line proficiency
  • Basic understanding of Postfix configuration
  • Familiarity with email protocols
  • Text editor skills (nano, vim)

Understanding SpamAssassin

How SpamAssassin Works

  1. Email Reception: Mail arrives at your server
  2. Content Analysis: SpamAssassin scans headers, body, attachments
  3. Test Execution: Runs its rule set (hundreds of tests in a default install)
  4. Score Calculation: Each test contributes points to spam score
  5. Threshold Comparison: Total score compared to threshold (default: 5.0)
  6. Action: The mail server or delivery agent tags, rejects, or quarantines the email based on the result

Scoring System

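SpamAssassin adds points for spam indicators and subtracts points for signs of legitimate mail. With the default threshold of 5.0, totals can be read roughly as: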

  • Score < 5.0: Probably legitimate
  • Score 5.0-10.0: Likely spam
  • Score > 10.0: Definitely spam

Tests contribute different point values:

BAYES_99        3.5    # Bayesian analysis says 99% spam
HTML_MESSAGE    0.001  # Message contains HTML
URIBL_BLACK     3.0    # URL found in blacklist
DKIM_VALID     -0.1    # Valid DKIM signature
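To see how these values combine, here is a minimal sketch that adds up the four example scores and compares the total with the default threshold of 5.0. The total_score helper is hypothetical (it is not part of SpamAssassin), and the sketch assumes a message is flagged once its total reaches the threshold.

```shell
#!/bin/sh
# total_score is a hypothetical helper: read "RULE score" pairs on stdin,
# sum the scores, and compare the total with the threshold given as $1.
total_score() {
    awk -v t="$1" '
        NF >= 2 { sum += $2 }
        END {
            verdict = (sum >= t) ? "spam" : "ham"
            printf "score=%.3f required=%.1f verdict=%s\n", sum, t, verdict
        }'
}

# The four example rules listed above
printf '%s\n' \
    "BAYES_99 3.5" \
    "HTML_MESSAGE 0.001" \
    "URIBL_BLACK 3.0" \
    "DKIM_VALID -0.1" | total_score 5.0
```

For these four hits the sketch prints score=6.401 required=5.0 verdict=spam, which shows how two strong indicators outweigh a small negative score.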

Types of Tests

Header Tests:

  • Malformed headers
  • Suspicious routing
  • Missing required headers
  • Forged sender information

Body Tests:

  • Spam keywords and phrases
  • HTML characteristics
  • Character encoding tricks
  • Obfuscation techniques

DNS Tests:

  • RBL (Realtime Blackhole List) lookups
  • URIBL (URI Blacklist) checks
  • Domain reputation

Bayesian Analysis:

  • Statistical learning from training data
  • Adapts to your specific spam patterns
  • Improves accuracy over time

Network Tests:

  • Razor collaborative filtering
  • Pyzor digest-based detection
  • DCC (Distributed Checksum Clearinghouse)

Step 1: Install SpamAssassin

Ubuntu/Debian Installation

# Update package lists
sudo apt update

# Install SpamAssassin and dependencies
sudo apt install spamassassin spamc -y

# Create SpamAssassin user and group (skipped if they already exist)
getent group spamd >/dev/null || sudo groupadd -g 5001 spamd
getent passwd spamd >/dev/null || sudo useradd -u 5001 -g spamd -s /bin/false -d /var/lib/spamassassin spamd

CentOS/Rocky Linux Installation

# Install SpamAssassin
sudo dnf install spamassassin -y

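# Create SpamAssassin user and group (skipped if they already exist)
getent group spamd >/dev/null || sudo groupadd -g 5001 spamd
getent passwd spamd >/dev/null || sudo useradd -u 5001 -g spamd -s /bin/false -d /var/lib/spamassassin spamd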

Verify Installation

# Check SpamAssassin version
spamassassin --version

# Check spamc version
spamc --version

# Verify service exists
systemctl list-unit-files | grep spamassassin

Step 2: Configure SpamAssassin

Main Configuration File

Edit the main SpamAssassin configuration:

sudo nano /etc/spamassassin/local.cf

Add these optimized settings:

# Required score to be flagged as spam (default: 5.0)
required_score 5.0

# Rewrite subject line for spam
rewrite_header Subject [SPAM]

# Add the spam report as headers instead of wrapping spam in an attachment
report_safe 0

# Use Bayesian auto-learning
use_bayes 1
bayes_auto_learn 1
bayes_auto_learn_threshold_nonspam -0.1
bayes_auto_learn_threshold_spam 12.0

# Bayesian database location
bayes_path /var/lib/spamassassin/.spamassassin/bayes

# Enable network tests
skip_rbl_checks 0
use_razor2 1
use_pyzor 1
use_dcc 1

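# DKIM checking comes from the DKIM plugin, which is normally loaded
# in /etc/spamassassin/v312.pre; there is no on/off option for it in local.cf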

# Scores for common tests
score BAYES_99 3.5
score BAYES_00 -1.9
score HTML_MESSAGE 0.001
score MIME_HTML_ONLY 0.723
score URIBL_BLOCKED 0.001

# Whitelist senders from this domain when SPF/DKIM authentication passes
def_whitelist_auth *@example.com

# Network test timeouts
razor_timeout 5
pyzor_timeout 5
dcc_timeout 5
rbl_timeout 5

# Language settings (adjust for your needs)
ok_languages en
ok_locales en

# Limit how many bytes of each body part the body and rawbody rules scan
# (the whole-message size limit is set separately with spamc -s)
body_part_scan_size 50000
rawbody_part_scan_size 50000

# Relays trusted not to forge headers (this is not a list of trusted senders)
trusted_networks 127.0.0.0/8

Configuration Explained

required_score: Threshold for spam flagging (5.0 is standard, lower = stricter)

rewrite_header: Modifies subject line of spam (helps filtering)

report_safe:

  • 0 = Don't encapsulate spam
  • 1 = Encapsulate as attachment
  • 2 = Encapsulate as plain text

use_bayes: Enable Bayesian learning (improves accuracy significantly)

skip_rbl_checks: 0 keeps DNS blacklist (RBL) checks enabled; 1 turns them off

Network tests: Razor, Pyzor, DCC for collaborative spam detection
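When several people edit local.cf, it helps to confirm which value of an option is actually in the file. The sketch below uses a hypothetical get_setting helper on a throwaway copy of a few lines from the configuration above. It assumes simple one-line "name value" options and that a later line overrides an earlier one.

```shell
#!/bin/sh
# get_setting is a hypothetical helper: print the value of a one-line
# "name value" option from a SpamAssassin-style config file.
# Assumption: if the option appears more than once, the last line wins.
get_setting() {
    awk -v n="$1" '
        $1 ~ /^#/ { next }
        $1 == n   { $1 = ""; sub(/^ /, ""); value = $0 }
        END       { if (value != "") print value }
    ' "$2"
}

# Work on a temporary file rather than the live configuration
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
# Required score to be flagged as spam
required_score 5.0
rewrite_header Subject [SPAM]
report_safe 0
EOF

get_setting required_score "$tmp"
get_setting rewrite_header "$tmp"
rm -f "$tmp"
```

On a real server you would point the helper at /etc/spamassassin/local.cf and still run spamassassin --lint afterwards, since the helper only reads text and does not validate option names.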

Step 3: Configure SpamAssassin Daemon

Edit Daemon Configuration

sudo nano /etc/default/spamassassin

Configure daemon settings:

# Enable SpamAssassin daemon
ENABLED=1

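# Options passed to the daemon
# --create-prefs = create user preference files if missing
# --max-children = maximum children (concurrent scans)
# --username = user to run as
# --helper-home-dir = home directory for helper programs (Razor, Pyzor, DCC)
# -s = log destination (syslog facility or file path)
# --max-conn-per-child = optional; respawn each child after N connections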
OPTIONS="--create-prefs --max-children 5 --username spamd --helper-home-dir /var/lib/spamassassin -s /var/log/spamassassin/spamd.log"

# PID file location
PIDFILE="/var/run/spamassassin.pid"

# Nice level (lower priority)
NICE="--nicelevel 10"

# Log location
LOGFILE="/var/log/spamassassin/spamd.log"

Adjust Children Count

For low-traffic servers:

--max-children 2

For medium traffic:

--max-children 5

For high traffic:

--max-children 10

Create Log Directory

# Create log directory
sudo mkdir -p /var/log/spamassassin

# Set ownership
sudo chown spamd:spamd /var/log/spamassassin

# Set permissions
sudo chmod 750 /var/log/spamassassin

Set SpamAssassin Home Directory Permissions

# Create home directory if needed
sudo mkdir -p /var/lib/spamassassin

# Set ownership
sudo chown -R spamd:spamd /var/lib/spamassassin

# Set permissions
sudo chmod 750 /var/lib/spamassassin

Step 4: Update SpamAssassin Rules

SpamAssassin needs updated rules for optimal performance:

# Update rules
sudo sa-update

# Compile rules for faster performance (needs re2c and a C compiler, and the Rule2XSBody plugin enabled)
sudo sa-compile

Set up automatic updates:

# Create update script
sudo nano /usr/local/bin/update-spamassassin.sh

Add:

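#!/bin/bash
# Update SpamAssassin rules

# sa-update exits 0 only when new rules were installed
# (1 means no update was available; higher values signal an error)
if sa-update; then
    # Compile the new rules and reload the daemon so it picks them up
    sa-compile
    systemctl reload spamassassin
fi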

Make executable:

sudo chmod +x /usr/local/bin/update-spamassassin.sh

Schedule daily updates:

sudo crontab -e

Add:

0 3 * * * /usr/local/bin/update-spamassassin.sh
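The update script only compiles and reloads when sa-update exits with status 0. As a sketch of that decision, the hypothetical update_action helper below maps an exit status to a follow-up step. The meanings are taken from the sa-update manual as I understand it (0 = new rules installed, 1 = nothing new, 2 or higher = a lint or download problem), so check them against your installed version.

```shell
#!/bin/sh
# update_action is a hypothetical helper that maps an sa-update exit status
# to the follow-up step a wrapper script might take.
# Assumed meanings: 0 = new rules installed, 1 = no new rules available,
# 2 or higher = lint or download problem.
update_action() {
    case "$1" in
        0) echo "compile-and-reload" ;;
        1) echo "nothing-to-do" ;;
        *) echo "investigate" ;;
    esac
}

# In a real wrapper: sa-update; action=$(update_action $?)
# Here the statuses are simulated so the sketch runs without sa-update.
for status in 0 1 2 4; do
    printf 'exit %s -> %s\n' "$status" "$(update_action "$status")"
done
```

Treating status 1 separately from real errors keeps a cron job quiet on days without new rules while still surfacing failures.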

Step 5: Integrate with Postfix

Option 1: Use Spamass-Milter (Recommended)

Install spamass-milter for milter integration:

# Ubuntu/Debian
sudo apt install spamass-milter -y

# CentOS/Rocky Linux
sudo dnf install spamass-milter -y

Configure spamass-milter:

sudo nano /etc/default/spamass-milter

Add:

# Enable spamass-milter
ENABLED=1

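# Options
OPTIONS="-u spamd -i 127.0.0.1"

# -u = default user passed to spamc
# -i = skip scanning for mail from these networks (here: localhost)
# Add -I to skip mail from SMTP-authenticated senders
# Add -r 15 to reject mail scoring 15 or more (-r -1 rejects everything SpamAssassin flags)
# Add -m only if the milter should NOT rewrite Subject and Content-Type headers
# Flags after a literal -- are passed through to spamc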

Configure Postfix to use milter:

sudo nano /etc/postfix/main.cf

Add:

# SpamAssassin milter
smtpd_milters = unix:/spamass/spamass.sock
non_smtpd_milters = unix:/spamass/spamass.sock
milter_connect_macros = i j {daemon_name} v {if_name} _
milter_default_action = accept

Option 2: Use Content Filter

Alternative method using content_filter:

sudo nano /etc/postfix/master.cf

Add these lines:

spamassassin unix -     n       n       -       -       pipe
  user=spamd argv=/usr/bin/spamc -f -e
  /usr/sbin/sendmail -oi -f ${sender} ${recipient}

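Enable the filter for mail arriving over SMTP by adding an override to the smtp service in the same file (keep the existing fields on your smtp line and add only the -o line):

smtp      inet  n       -       -       -       -       smtpd
  -o content_filter=spamassassin

Setting content_filter globally in main.cf would also catch the mail that sendmail re-injects after scanning, sending it through the filter again in a loop.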

Start Services

# Enable and start SpamAssassin
sudo systemctl enable spamassassin
sudo systemctl start spamassassin

# Enable and start spamass-milter (if using)
sudo systemctl enable spamass-milter
sudo systemctl start spamass-milter

# Reload Postfix
sudo systemctl reload postfix

Verify Integration

# Check SpamAssassin is running
sudo systemctl status spamassassin

# Check spamass-milter (if using)
sudo systemctl status spamass-milter

# Check Postfix configuration
sudo postfix check

# Test mail flow
echo "Test email" | mail -s "SpamAssassin Test" user@example.com

Check logs:

sudo tail -f /var/log/mail.log
sudo tail -f /var/log/spamassassin/spamd.log

Step 6: Configure Bayesian Filtering

Bayesian filtering is SpamAssassin's learning component.

Initialize Bayesian Database

# Create database directory
sudo mkdir -p /var/lib/spamassassin/.spamassassin

# Set ownership
sudo chown -R spamd:spamd /var/lib/spamassassin/.spamassassin

# Set permissions
sudo chmod 700 /var/lib/spamassassin/.spamassassin

Train with Ham (Legitimate Email)

# Train with ham (good email)
sudo -u spamd sa-learn --ham /path/to/ham/maildir/cur/

# Or single file
sudo -u spamd sa-learn --ham /path/to/ham/email.txt

Example for training from user mailbox:

# Train from user's inbox
sudo -u spamd sa-learn --ham /var/mail/vmail/example.com/user/Maildir/cur/

# Train multiple users
for user in /var/mail/vmail/example.com/*/Maildir/cur/; do
    sudo -u spamd sa-learn --ham "$user"
done

Train with Spam

# Train with spam
sudo -u spamd sa-learn --spam /path/to/spam/maildir/cur/

# Or single file
sudo -u spamd sa-learn --spam /path/to/spam/email.txt

Example:

# Train from spam folder
sudo -u spamd sa-learn --spam /var/mail/vmail/example.com/user/Maildir/.Junk/cur/

Check Bayesian Database

# Show statistics
sudo -u spamd sa-learn --dump magic

Output shows:

0.000          0          3          0  non-token data: bayes db version
0.000          0      15234          0  non-token data: nspam
0.000          0      30145          0  non-token data: nham

Bayesian scoring only activates once the database has learned enough mail. You need:

  • At least 200 spam messages (nspam)
  • At least 200 ham messages (nham)
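The 200-message minimum can be checked directly from the statistics. The sketch below feeds the sample output shown above into a hypothetical bayes_ready helper. On a live system you would pipe in the output of sudo -u spamd sa-learn --dump magic instead.

```shell
#!/bin/sh
# bayes_ready is a hypothetical helper: read `sa-learn --dump magic` output
# on stdin and report whether both counters reach the 200-message minimum.
bayes_ready() {
    awk -v min=200 '
        $NF == "nspam" { nspam = $3 }
        $NF == "nham"  { nham  = $3 }
        END {
            state = (nspam >= min && nham >= min) ? "active" : "inactive"
            printf "nspam=%d nham=%d bayes=%s\n", nspam, nham, state
        }'
}

# Sample lines copied from the output shown above
bayes_ready <<'EOF'
0.000          0          3          0  non-token data: bayes db version
0.000          0      15234          0  non-token data: nspam
0.000          0      30145          0  non-token data: nham
EOF
```

With the sample counters the helper prints nspam=15234 nham=30145 bayes=active.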

Automate Training

Create training script:

sudo nano /usr/local/bin/train-spamassassin.sh

Add:

#!/bin/bash
# Run sa-learn as spamd so the Bayes files keep the right owner
# (spamd needs read access to the mail directories)

# Train from all users' spam folders
for spam_folder in /var/mail/vmail/*/*/Maildir/.Junk/cur/; do
    if [ -d "$spam_folder" ]; then
        sudo -u spamd sa-learn --spam "$spam_folder"
    fi
done

# Train from all users' inbox (ham)
for ham_folder in /var/mail/vmail/*/*/Maildir/cur/; do
    if [ -d "$ham_folder" ]; then
        sudo -u spamd sa-learn --ham "$ham_folder"
    fi
done

# Show statistics
sudo -u spamd sa-learn --dump magic

Make executable:

sudo chmod +x /usr/local/bin/train-spamassassin.sh

Schedule weekly:

sudo crontab -e

Add:

0 2 * * 0 /usr/local/bin/train-spamassassin.sh

Step 7: Configure Network Tests

Razor Configuration

# Install Razor
sudo apt install razor -y

# Create Razor home
sudo -u spamd mkdir -p /var/lib/spamassassin/.razor

# Discover Razor servers
sudo -u spamd razor-admin -home=/var/lib/spamassassin/.razor -create
sudo -u spamd razor-admin -home=/var/lib/spamassassin/.razor -discover

# Register (optional but recommended)
sudo -u spamd razor-admin -home=/var/lib/spamassassin/.razor -register

Configure in local.cf:

sudo nano /etc/spamassassin/local.cf

Add:

use_razor2 1
razor_config /var/lib/spamassassin/.razor/razor-agent.conf

Pyzor Configuration

# Install Pyzor
sudo apt install pyzor -y

# Create Pyzor home
sudo -u spamd mkdir -p /var/lib/spamassassin/.pyzor

# Check connectivity to the Pyzor servers
# (older Pyzor releases used "discover" here; current ones provide "ping")
sudo -u spamd pyzor --homedir /var/lib/spamassassin/.pyzor ping

Configure in local.cf:

use_pyzor 1
pyzor_options --homedir=/var/lib/spamassassin/.pyzor

DCC Configuration

# Install DCC (optional, not in all repos)
# May need to compile from source
wget https://www.dcc-servers.net/dcc/source/dcc.tar.Z
tar xzf dcc.tar.Z
cd dcc-*
./configure
make
sudo make install

# Configure (also uncomment the DCC loadplugin line in /etc/spamassassin/v310.pre)
sudo nano /etc/spamassassin/local.cf

Add:

use_dcc 1
dcc_path /usr/local/bin/dccproc

Step 8: Configure Whitelisting and Blacklisting

Whitelist Email Addresses

sudo nano /etc/spamassassin/local.cf

Add:

# Whitelist specific addresses
whitelist_from friend@example.com
whitelist_from *@trusted-domain.com

# Whitelist senders from this domain when SPF/DKIM authentication passes
def_whitelist_auth *@example.com

# Whitelist a sender only when the mail was relayed by a matching host (reverse DNS)
whitelist_from_rcvd user@domain.com mail.domain.com

Blacklist Email Addresses

# Blacklist specific addresses
blacklist_from spammer@example.net
blacklist_from *@spam-domain.com
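The list entries are glob patterns, not regular expressions. As a rough illustration of how such patterns select senders, the hypothetical list_verdict helper below uses shell case patterns for the two domains from the examples above. It assumes only the * wildcard is used and that matching ignores letter case; SpamAssassin's own matching may differ in detail.

```shell
#!/bin/sh
# list_verdict is a hypothetical helper that mimics, very roughly, how glob
# patterns in whitelist_from / blacklist_from select sender addresses.
# Assumption: only the * wildcard is used and matching ignores case.
list_verdict() {
    addr=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
    case "$addr" in
        *@trusted-domain.com) echo "whitelisted" ;;
        *@spam-domain.com)    echo "blacklisted" ;;
        *)                    echo "no list match" ;;
    esac
}

list_verdict "Alice@Trusted-Domain.com"
list_verdict "promo@spam-domain.com"
list_verdict "bob@elsewhere.example"
```

The three calls print whitelisted, blacklisted, and no list match. Remember that From addresses are easy to forge, which is why the authenticated variants are usually the safer choice for whitelisting.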

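Exempt Specific Recipients

# Recipients who want all mail delivered, even if it looks like spam
# (messages are still scanned, but receive a large negative score)
all_spam_to postmaster@example.com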

Create Custom Rules

sudo nano /etc/spamassassin/local.cf

Add custom scoring:

# Custom header rules
header CUSTOM_RULE_1 Subject =~ /viagra|cialis/i
describe CUSTOM_RULE_1 Subject contains pharmacy spam
score CUSTOM_RULE_1 3.0

# Custom body rules
body CUSTOM_RULE_2 /click here to unsubscribe/i
describe CUSTOM_RULE_2 Body has suspicious unsubscribe
score CUSTOM_RULE_2 1.5

# Custom URI rules
uri CUSTOM_RULE_3 /bit\.ly/i
describe CUSTOM_RULE_3 Contains shortened URL
score CUSTOM_RULE_3 0.5
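Before deploying a custom rule, it is worth trying its pattern on a few sample strings. The sketch below does this with a hypothetical rule_hits helper. It uses grep -E as a stand-in for SpamAssassin's Perl regex engine, which is only equivalent for simple patterns like these, and it applies all three patterns to one string even though the real rules look at the Subject header, the body, and URIs separately.

```shell
#!/bin/sh
# rule_hits is a hypothetical helper: report which of the sample patterns
# above match a piece of text. grep -E only approximates Perl regexes.
rule_hits() {
    text="$1"
    hits=""
    printf '%s\n' "$text" | grep -Eiq 'viagra|cialis' && hits="$hits CUSTOM_RULE_1"
    printf '%s\n' "$text" | grep -Eiq 'click here to unsubscribe' && hits="$hits CUSTOM_RULE_2"
    printf '%s\n' "$text" | grep -Eiq 'bit\.ly' && hits="$hits CUSTOM_RULE_3"
    # Strip the leading space left by the first append
    echo "${hits# }"
}

rule_hits "Cheap CIALIS at http://bit.ly/abc"
rule_hits "Minutes from Monday's meeting"
```

The first call prints CUSTOM_RULE_1 CUSTOM_RULE_3 and the second prints an empty line. For the authoritative answer, run spamassassin -t on a saved message after adding the rules.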

Step 9: Test SpamAssassin

Test with GTUBE

GTUBE is a test string guaranteed to trigger SpamAssassin:

# Create test email
cat > /tmp/spam-test.txt << 'EOF'
Subject: Test spam detection

This is the GTUBE, the Generic Test for Unsolicited Bulk Email

XJS*C4JDBQADN1.NSBN3*2IDNEN*GTUBE-STANDARD-ANTI-UBE-TEST-EMAIL*C.34X
EOF

# Test with SpamAssassin
spamassassin -t < /tmp/spam-test.txt

Output should show:

X-Spam-Flag: YES
X-Spam-Level: **************************************************
X-Spam-Status: Yes, score=1000.0 required=5.0

Test Real Email

# Save an email to file
cat > /tmp/real-test.txt << 'EOF'
From: sender@example.com
To: user@example.com
Subject: Legitimate test email

This is a normal email with regular content.
No spam characteristics here.
EOF

# Test
spamassassin -t < /tmp/real-test.txt

Should show low score:

X-Spam-Status: No, score=-0.1 required=5.0

Test Through Mail Flow

# Send test email
echo "Testing SpamAssassin integration" | mail -s "SA Test" user@example.com

Check headers in received email:

X-Spam-Checker-Version: SpamAssassin 3.4.6
X-Spam-Level:
X-Spam-Status: No, score=-0.1 required=5.0 tests=DKIM_SIGNED,DKIM_VALID
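If you want to check these headers from a script rather than by eye, the verdict, score, and threshold can be pulled out of the X-Spam-Status line. The parse_spam_status helper below is a hypothetical sketch using sed. It assumes the three values sit on the first line of the header, as in the example above.

```shell
#!/bin/sh
# parse_spam_status is a hypothetical helper: pull the verdict, score and
# threshold out of an X-Spam-Status header line read from stdin.
parse_spam_status() {
    sed -n 's/^X-Spam-Status: \([A-Za-z]*\), score=\(-\{0,1\}[0-9.]*\) required=\([0-9.]*\).*/verdict=\1 score=\2 required=\3/p'
}

# Header line copied from the example above
printf '%s\n' 'X-Spam-Status: No, score=-0.1 required=5.0 tests=DKIM_SIGNED,DKIM_VALID' \
    | parse_spam_status
```

For the sample header this prints verdict=No score=-0.1 required=5.0. Lines that are not X-Spam-Status headers produce no output, so a whole message can be piped through the helper.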

Debug Mode

For troubleshooting:

# Run in debug mode
spamassassin -D -t < /tmp/spam-test.txt 2>&1 | less

Shows detailed rule matching and scoring.

Step 10: Performance Optimization

Adjust Process Limits

sudo nano /etc/default/spamassassin

For low traffic (1-10 emails/minute):

OPTIONS="--max-children 2"

For medium traffic (10-50 emails/minute):

OPTIONS="--max-children 5"

For high traffic (50+ emails/minute):

OPTIONS="--max-children 10"
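The traffic bands above can be written down as a small lookup. The suggest_children helper below is a hypothetical sketch of that rule of thumb. Because the bands overlap at their edges, it assumes a rate of exactly 10 counts as low and exactly 50 as medium.

```shell
#!/bin/sh
# suggest_children is a hypothetical helper that encodes the rule of thumb
# above: messages per minute in, --max-children value out.
# Assumption: a rate of exactly 10 counts as low and exactly 50 as medium.
suggest_children() {
    rate="$1"
    if [ "$rate" -le 10 ]; then
        echo 2
    elif [ "$rate" -le 50 ]; then
        echo 5
    else
        echo 10
    fi
}

for rate in 4 25 120; do
    printf '%s msgs/min -> --max-children %s\n' "$rate" "$(suggest_children "$rate")"
done
```

Treat the result as a starting point only: each child holds the full rule set in memory, so available RAM can be the tighter limit.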

Limit Message Size

sudo nano /etc/spamassassin/local.cf

Add:

# Scan at most the first 500,000 bytes of each body part (this is not a whole-message size limit)
body_part_scan_size 500000
rawbody_part_scan_size 500000

Disable Unnecessary Tests

If certain tests are slow or unnecessary:

sudo nano /etc/spamassassin/local.cf

Add:

# Disable DCC if too slow
use_dcc 0

# Keep RBL checks on, but probe DNS first so they are skipped when it is down
skip_rbl_checks 0
dns_available test  # Only do network tests if DNS working

Recycle Child Processes

sudo nano /etc/default/spamassassin

Add:

OPTIONS="--max-children 5 --max-conn-per-child 200"

Each child is respawned after 200 connections, which limits memory growth in long-running processes.

Monitor Performance

# Check SpamAssassin process count (the [s] keeps grep from counting itself)
ps aux | grep '[s]pamd' | wc -l

# Monitor memory usage
ps aux | grep '[s]pamd'

# Check log for slow scans (each result line records scantime=<seconds>)
grep "scantime=" /var/log/spamassassin/spamd.log | tail -20
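To turn the log into numbers, the verdict lines can be summarised with awk. The scan_stats helper below is a hypothetical sketch; the two sample log lines are invented for illustration and follow the shape spamd normally uses for its "identified spam" and "clean message" entries, so compare them with your own log before relying on it.

```shell
#!/bin/sh
# scan_stats is a hypothetical helper: summarise spamd verdict lines read
# from stdin. Assumed line shape (check it against your own log):
#   ... spamd: identified spam (15.2/5.0) for spamd:5001 in 2.1 seconds, 4096 bytes.
#   ... spamd: clean message (-0.1/5.0) for spamd:5001 in 0.9 seconds, 1024 bytes.
scan_stats() {
    awk '
        /identified spam|clean message/ {
            total++
            if ($0 ~ /identified spam/) spam++
            # The scan time is the field between "in" and "seconds,"
            for (i = 1; i < NF; i++)
                if ($i == "in" && $(i + 2) == "seconds,") secs += $(i + 1)
        }
        END {
            avg = total ? secs / total : 0
            printf "scanned=%d spam=%d avg_seconds=%.2f\n", total, spam, avg
        }'
}

# Invented sample lines, for illustration only
scan_stats <<'EOF'
Mon Jan  6 03:00:01 2025 [812] info: spamd: identified spam (15.2/5.0) for spamd:5001 in 2.1 seconds, 4096 bytes.
Mon Jan  6 03:00:05 2025 [812] info: spamd: clean message (-0.1/5.0) for spamd:5001 in 0.9 seconds, 1024 bytes.
EOF
```

On a live server you would run scan_stats < /var/log/spamassassin/spamd.log. A rising average scan time is often the first sign that a network test is timing out.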

Monitoring and Maintenance

Daily Monitoring

# Check SpamAssassin status
sudo systemctl status spamassassin

# Count spam detections (spamd logs one "identified spam" line per hit)
grep -c "identified spam" /var/log/spamassassin/spamd.log

# Check Bayesian learning
sudo -u spamd sa-learn --dump magic

Weekly Tasks

# Train Bayesian filter
sudo /usr/local/bin/train-spamassassin.sh

# Review recently flagged mail for false positives
grep "identified spam" /var/log/spamassassin/spamd.log | tail -50

# Review whitelist/blacklist
sudo nano /etc/spamassassin/local.cf

Monthly Maintenance

# Update rules
sudo sa-update
sudo sa-compile

# Sync the Bayes journal into the database
# (--rebuild is a deprecated alias for --sync and does not repair corruption)
sudo -u spamd sa-learn --sync

# Check configuration and custom rules for syntax errors
spamassassin --lint

Create Monitoring Script

sudo nano /usr/local/bin/check-spamassassin.sh

Add:

#!/bin/bash

echo "=== SpamAssassin Status ==="
echo ""

echo "Service Status:"
systemctl is-active spamassassin

echo ""
echo "Bayesian Statistics:"
sudo -u spamd sa-learn --dump magic | grep "non-token data"

echo ""
echo "Today's Spam Count:"
grep "identified spam" /var/log/spamassassin/spamd.log | grep -c "$(date '+%b %e')"

echo ""
echo "Recent Spam Scores:"
grep "identified spam" /var/log/spamassassin/spamd.log | tail -5 | grep -o '([0-9.-]*/[0-9.]*)'

echo ""
echo "Process Count:"
ps aux | grep [s]pamd | wc -l

Make executable:

sudo chmod +x /usr/local/bin/check-spamassassin.sh

Schedule daily:

sudo crontab -e

Add:

0 10 * * * /usr/local/bin/check-spamassassin.sh | mail -s "SpamAssassin Daily Report" admin@example.com

Troubleshooting Common Issues

Issue 1: SpamAssassin Not Starting

Diagnosis:

sudo systemctl status spamassassin
sudo journalctl -u spamassassin -n 50

Common causes:

  • Permission issues
  • Configuration syntax errors
  • Missing dependencies

Solutions:

# Check configuration
spamassassin --lint

# Fix permissions
sudo chown -R spamd:spamd /var/lib/spamassassin
sudo chmod 750 /var/lib/spamassassin

# Restart service
sudo systemctl restart spamassassin

Issue 2: Mail Not Being Scanned

Diagnosis:

# Check mail headers
# Look for X-Spam-Status

# Check Postfix is using milter
sudo postconf | grep milter

# Check logs
sudo grep -E "spamd|spamass-milter" /var/log/mail.log

Solutions:

  • Verify Postfix integration configured
  • Restart both services
  • Check milter socket permissions

Issue 3: Too Many False Positives

Diagnosis:

# See which rules fired on a wrongly flagged message
spamassassin -t < /path/to/false-positive.eml

Solutions:

# Increase threshold
sudo nano /etc/spamassassin/local.cf

Change:

required_score 7.0   # More lenient

Or whitelist sender:

whitelist_from trusted-sender@example.com

Issue 4: Missing Spam

Diagnosis:

# Test known spam
spamassassin -t < /path/to/spam/email.txt

Solutions:

# Lower threshold
sudo nano /etc/spamassassin/local.cf

Change:

required_score 3.5   # More strict

Train with more spam:

sudo -u spamd sa-learn --spam /path/to/spam/folder/
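Before changing required_score in either direction, it helps to estimate how many messages each candidate threshold would flag. The flagged_at helper below is a hypothetical sketch, and the scores are made up for illustration. In practice you would collect scores from your own mail, for example from X-Spam-Status headers.

```shell
#!/bin/sh
# flagged_at is a hypothetical helper: given one score per line on stdin,
# count how many messages a given required_score would flag.
flagged_at() {
    awk -v t="$1" '$1 + 0 >= t + 0 { n++ } END { print n + 0 }'
}

# Made-up scores for six messages, purely for illustration
scores="1.2 3.8 4.9 5.4 6.6 12.0"
for threshold in 3.5 5.0 7.0; do
    # $scores is left unquoted on purpose so each score lands on its own line
    count=$(printf '%s\n' $scores | flagged_at "$threshold")
    echo "required_score $threshold flags $count of 6 messages"
done
```

With these sample scores the thresholds 3.5, 5.0, and 7.0 flag 5, 3, and 1 messages respectively, which makes the trade-off between missed spam and false positives visible before you touch the live configuration.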

Issue 5: High CPU Usage

Diagnosis:

top -u spamd

Solutions:

# Reduce children
sudo nano /etc/default/spamassassin

Change:

OPTIONS="--max-children 2"

Disable slow tests:

sudo nano /etc/spamassassin/local.cf

Add:

use_dcc 0
skip_rbl_checks 1

Best Practices

1. Regular Training

  • Train weekly with new spam/ham
  • Minimum 200 messages each category
  • Use real user feedback

2. Monitor False Positives

  • Review flagged mail regularly
  • Adjust scores for problem rules
  • Whitelist legitimate senders

3. Keep Rules Updated

  • Update regularly: sa-update (the cron job from Step 4 runs it daily)
  • Subscribe to rule updates
  • Compile rules: sa-compile

4. Optimize Performance

  • Limit max children appropriately
  • Set message size limits
  • Monitor resource usage

5. Use with Other Tools

  • Combine with SPF/DKIM/DMARC
  • Use alongside greylisting
  • Implement rate limiting

6. Document Custom Rules

# Always add descriptions
describe CUSTOM_RULE Purpose of this rule

Conclusion

You now have SpamAssassin fully configured and integrated with your mail server, providing robust spam filtering with minimal false positives. Combined with proper email authentication (SPF, DKIM, DMARC), SpamAssassin creates a comprehensive anti-spam solution.

Key Accomplishments

  1. SpamAssassin Installed: Service running and integrated
  2. Postfix Integration: Mail being scanned automatically
  3. Bayesian Filtering: Learning from your specific spam patterns
  4. Network Tests: Collaborative spam detection active
  5. Optimized Performance: Configured for your traffic levels

Next Steps

  1. Train Bayesian filter: Add more spam/ham samples
  2. Monitor effectiveness: Review caught spam and false positives
  3. Fine-tune scoring: Adjust threshold and rule weights
  4. Create custom rules: Address specific spam patterns
  5. Regular maintenance: Keep rules updated and database trained

Important Reminders

  • Train regularly: Bayesian filter improves with data
  • Monitor false positives: Review and whitelist legitimate mail
  • Update rules regularly: Spam techniques evolve constantly
  • Optimize for your traffic: Adjust children count appropriately
  • Document changes: Keep track of custom rules and modifications

With SpamAssassin properly configured and maintained, detection rates of 95% or more with false positives below 1% are a realistic target, though actual results depend on your mail mix and training data. Continue training, monitoring, and optimizing for best results.