© 2018 Peter N. M. Hansteen
SMTP email is not going away any time soon. If you run a mail service, when and to whom you present the code signifying a temporary local problem is well worth your attention.
Any functioning domain MUST have at least one MX (mail exchanger) record published via the domain name system, and registrars will generally not even let you register a domain unless you have set up somewhere to receive mail for the domain.
But email worked most of the time anyway, and while you would occasionally hear about valid mail not getting delivered, it was a rarer occurrence than you might think.
Then a few years along, the Internet grew out of the pure research arena and became commercial, and spam started happening. Even in the early days of spam it seems that a significant subset of the messages, possibly even the majority, was sent with faked sender addresses in domains not connected to the actual senders.
Over time people have tried a number of approaches to the problems involved in getting rid of unwanted commercial and/or malware-carrying email. If you are interested in a deeper dive into the subject, you could jump over to my earlier piece Effective Spam and Malware Countermeasures - Network Noise Reduction Using Free Tools.
Two very different methods of reducing spam traffic were originally formulated at roughly the same time, and each method's adherents are still duking it out over which approach is the better one.
One method consists simply of implementing a strict interpretation of a requirement that was already formulated in the SMTP RFC at the time.
The other is a complicated extension of the SMTP-relevant data that is published via DNS, and full implementation would require reconfiguration of every SMTP email system in the world.
As you might have guessed, the first is what is commonly referred to as greylisting, where we point to the RFC's requirement that on encountering a temporary error, the sender MUST (RFC language does not get stronger than this) retry delivery at a later time and keep trying for a reasonable amount of time.
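The retry-after-temporary-failure logic is simple enough to sketch in a few lines of shell. This is only an illustration of the tuple-plus-passtime idea, not how any real greylisting daemon stores its state; the flat-file format, the function name and the passtime value here are all assumptions made for the example:

```shell
#!/bin/sh
# Minimal greylisting decision sketch (illustration only).
# DB holds one "ip sender rcpt firstseen" line per tuple we have seen.
DB="${DB:-/tmp/greylist.db}"
PASSTIME="${PASSTIME:-1500}"    # seconds a new tuple must wait before passing

greylist_check() {
    ip="$1" sender="$2" rcpt="$3"
    now=`date +%s`
    # Look up when we first saw this exact (ip, sender, rcpt) tuple.
    first=`awk -v k="$ip $sender $rcpt" '($1" "$2" "$3) == k { print $4 }' "$DB" 2>/dev/null`
    if [ -z "$first" ]; then
        # New tuple: record it and signal a temporary local problem.
        echo "$ip $sender $rcpt $now" >> "$DB"
        echo "451 temporary local problem - try again later"
    elif [ $((now - first)) -ge "$PASSTIME" ]; then
        # The sender retried after the passtime, as the RFC requires.
        echo "250 ok"
    else
        echo "451 temporary local problem - try again later"
    fi
}
```

A spam cannon that never retries simply never gets past the 451 response.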
Spammers generally did not retry as per the RFC specifications, and even early greylisting adopters saw a huge drop in the volume of spam that actually made it to mailboxes.
On the other hand, end users would sometimes wonder why their messages were delayed, and some mail administrators did not take well to seeing the volume of data sitting in the mail spool directories grow measurably, if not usually uncontrollably, while successive retries after waiting were in progress.
In what could almost appear as a separate, unconnected universe, other network engineers set out to fix the now glaringly obvious omission in the existing RFCs.
A way to announce valid senders was needed, and the specification that was to be known as the Sender Policy Framework (SPF for short) was offered to the world. SPF offered a way to specify which IP addresses valid mail from a domain were supposed to come from, and even included ways to specify how strictly the limitations it presented should be enforced at the receiving end.
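For illustration, an SPF record is just a TXT record with a v=spf1 prefix, a list of mechanisms, and qualifiers that state how strictly receivers should enforce it. A minimal sketch, using the reserved documentation address ranges rather than any real domain's policy:

```shell
# A literal SPF record as it might appear in a TXT lookup.
# "~all" asks receivers to softfail (accept but mark) non-matching
# senders; "-all" would ask them to reject outright.
spf='v=spf1 ip4:192.0.2.0/24 ip6:2001:db8::/32 include:_spf.example.net ~all'

# Extract the address ranges the domain claims to send from.
addrs=`echo "$spf" | tr ' ' '\n' | sed -n 's/^ip[46]://p'`
echo "$addrs"
```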
The downsides were that all mail handling would need to be upgraded with code that supported the specification, and as it turned out, traditional forwarding such as performed by common mailing list software would not easily be made compatible with SPF.
Flame wars were fought over both methods. You either remember them or should be able to imagine how they played out.
And while the flames grew less frequent and generally less fierce over time, mail volumes grew to the level where operators would run a large number of servers for outgoing mail. While such a site would honor the requirement to retry delivery, the retries would not be guaranteed to come from the same IP address as the original attempt.
It was becoming clear to greylisting practitioners that interpreting published SPF data as enumerating known good senders was the most workable way forward. Several of us had already started maintaining nospamd tables (see e.g. this slide and this), and using the output of
$ host -t txt domain.tld
(sometimes many times over because some domains use include statements), we generally made do. I even made a habit of publishing my nospamd file.
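The "many times over" part comes from records where each include: points at another domain whose TXT record must be fetched and parsed in turn. A sketch of pulling the next round of lookups out of one record (the record and domain names are made up for the example):

```shell
# One lookup's worth of SPF data; the include: targets are the
# domains you would have to query next (made-up names).
spf='v=spf1 include:_spf.example.com include:spf.protection.example.net -all'
incs=`echo "$spf" | tr ' ' '\n' | sed -n 's/^include://p'`
echo "$incs"
```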
As hinted in this slide, smtpctl (part of the OpenSMTPD system and in your OpenBSD base system) has, since OpenBSD 6.3, been able to retrieve the entire contents of the published SPF information for any domain you feed it.
Looking over my old nospamd file during the last week or so, I found enough sedimentary artifacts there, including IP addresses that had no explanation attached and lacked a reverse lookup, that I turned instead to working out which domains had been problematic and wrote a tiny script to generate a fresh nospamd on demand, based on fresh SPF lookups for those domains. The list of domains fed to the script is available here, but please do edit it to suit your local needs.
For those wary of clicking links to scripts, it reads like this:
#!/bin/sh
domains=`cat thedomains.txt`
outfile=nospamd
generatedate=`date`
operator="Peter Hansteen <peter@bsdly.net>"
locals=local-additions

echo "##############################################################################################" > $outfile
echo "# This is the `hostname` nospamd generated from domains at $generatedate. " >> $outfile
echo "# See https://bsdly.blogspot.com/2018/11/goodness-enumerated-by-robots-or.html for some" >> $outfile
echo "# background and on why you should generate your own and not use this one." >> $outfile
echo "# Any questions should be directed to $operator. " >> $outfile
echo "##############################################################################################" >> $outfile
echo >> $outfile

for dom in $domains; do
    echo "processing $dom"
    echo "# $dom starts #########" >> $outfile
    echo >> $outfile
    echo $dom | doas smtpctl spf walk >> $outfile
    echo "# $dom ends ###########" >> $outfile
    echo >> $outfile
done

echo "##############################################################################################" >> $outfile
echo "# processing done at `date`." >> $outfile
echo "##############################################################################################" >> $outfile

echo "adding local additions from $locals"
echo "# local additions below here ----" >> $outfile
cat $locals >> $outfile
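For context, the generated nospamd file is intended to be loaded as a table in pf, so that hosts listed in it are passed straight to the real MTA instead of being greylisted by spamd. A minimal pf.conf sketch along the lines of the spamd(8) manual's examples; the file path and the egress interface group are assumptions, so adjust to your setup:

```
table <nospamd> persist file "/etc/mail/nospamd"
pass in on egress proto tcp to any port smtp divert-to 127.0.0.1 port spamd
pass in on egress proto tcp from <nospamd> to any port smtp
```

Since pf evaluates rules last-match-wins, hosts in the nospamd table hit the final pass rule and skip the divert to spamd.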
A little later I'm clearly pondering what to do, including doing another detailed writeup:

    The downside of maintaining a 55+ thousand entry spamtrap list and whitelisting by SPF is seeing one of the whitelisted sites apparently trying to spam every one of your spamtraps (see https://t.co/ulWt1EloRp). Happening now. Wondering is collecting logs and forwarding worth it?
    — Peter N. M. Hansteen (@pitrh) November 9, 2018
Fortunately I had had some interaction with this operator earlier, so I knew roughly how to approach them. I wrote a couple of quick messages to their abuse contacts and made sure to include links to both my spamtrap resources and a fresh log excerpt that indicated clearly that someone or someones in their network was indeed progressing from top to bottom of the spamtraps list.

    Then again it is an indication that the collected noise is now a required part of the spammer lexicon. One might want to point sites at throwing away outgoing messages to any address on https://t.co/3uthWgKWmL (direct link to list https://t.co/mTaBpF5ucU - beware of html tags!).
    — Peter N. M. Hansteen (@pitrh) November 9, 2018
As the last tweet says, delivery attempts stopped after progressing to somewhere into the Cs. The moral might be that a list of spamtraps like the one I publish could be useful to other sites for filtering their outgoing mail: any activity involving the known-bad addresses would be a strong indication that somebody made a very unwise purchasing decision involving address lists.

    I ended up contacting their abuse@ with pointers to the logs that showed evidence of several similar campaigns over the last few days (the period I cared to look at) plus pointers to the spamtrap list and articles. About 30m after the second email to abuse@ the activity stopped.
    — Peter N. M. Hansteen (@pitrh) November 10, 2018
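A sketch of what such outgoing filtering could look like: an exact-match check of each recipient against the trap list before delivery. Everything here (the file name, the function name, the addresses) is made up for illustration; a real setup would hook a check like this into the MTA's ruleset and use the published list with any HTML markup stripped:

```shell
# Build a tiny stand-in for the published traplist (made-up entries).
cat > traplist.txt <<'EOF'
admin0@example.org
wkitty42@example.net
EOF

# Exact, full-line match: -F literal string, -x whole line, -q quiet.
check_rcpt() {
    if grep -qxF "$1" traplist.txt; then
        echo "reject $1"
    else
        echo "accept $1"
    fi
}

check_rcpt "wkitty42@example.net"   # on the list: reject
check_rcpt "friend@example.com"     # not on the list: accept
```

Any "reject" hit on outgoing mail is the tell-tale sign of a purchased address list at work.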