Build your own spam filter with PHP and DNSBLs

December 2, 2006

Have you ever gotten an email asking if you want certain parts of your body enlarged, parts that you might not even have? Was the next email you read one asking if you want to lose the inches you’ve recently gained? Did you ever notice how these emails are always from people that you are fairly certain have nothing to do with the contents of the email? Did MTeresa@Vatican.org really send that diet pill email? Have you ever gotten returned or rejected “can’t be delivered” emails from addresses you’ve never sent an email to?

I have.

SPAM. It’s HORRIBLE! My email box for Brian@TheCodeCave.com probably gets 3 to 1 spam over real email. I expected that. I put that address out everywhere and don’t protect it. It is meant to be my public address. But the FROM addresses on all that email never indicate who the email is really from. Even the company information inside the email header is faked. The spammers will grab some other name on their spam list and use it as their from address. I’ve had my name put into the from address of emails a few times. It’s an annoying problem, just ask the Nuclear Moose.

Why this can happen is a long story. It all relates back to the fact that SMTP and port 25 were never meant for submitting emails to email servers. SMTP was only meant for server-to-server communications. However, that’s for a different post. The long and short of it is that everything can be faked except for one thing: the IP address of the machine that handed the email to your mail server, because your own server records that address itself.

Because that IP address is accurate, you can use it to tell if the person that sent the email is a spammer. This post shows you a couple of ways to do that. And because this is The Code Cave, you get a fully functional PHP routine to boot.

First let me show you what I am talking about. Emails contain lots of information that you don’t normally see. Most of that information is in the header of the email. Each email client has different ways of showing email headers. You might find it by viewing the properties of the email. In Outlook, you can see it by choosing, from an open email, View->Options. It will be there under the name of “Internet Headers”.

Here is what one email header looked like that came to me back in June:

Quote:

Return-Path: <tkuhnel@alushiptechnology.com>
Delivery-Date: Mon, 26 Jun 2006 23:53:47 -0400
Received-SPF: none (mxus6: 74.139.17.40 is neither permitted nor denied by domain of alushiptechnology.com) client-ip=74.139.17.40; envelope-from=tkuhnel@alushiptechnology.com; helo=Laskowski6;
Received: from [74.139.17.40] (helo=Laskowski6)
by mx.perfora.net (node=mxus6) with ESMTP (Nemesis),
id 0MKvMg-1Fv4dv28vt-0006m9 for brian@thecodecave.com; Mon, 26 Jun 2006 23:53:47 -0400
From: "Adam Field" <tkuhnel@alushiptechnology.com>
To: <brian@thecodecave.com>
Subject: RICARDO examined BENJAMIN of a please
Date: Tue, 27 Jun 2006 03:53:44 +0480
MIME-Version: 1.0
Content-Type: multipart/related;
    type="multipart/alternative";
    boundary="----=_NextPart_000_006A_01C69962.9CE1DB50"
X-Priority: 3
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook Express 6.00.2900.2670
X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.2670
Message-ID: <0MKvMg-1Fv4dv28vt-0006m9@mx.perfora.net>
Envelope-To: brian@thecodecave.com

Almost all of the information in there about who sent this stuff is garbage. “alushiptechnology.com” had nothing to do with the email. Poor TKuhnel certainly had nothing to do with it. He or she just has an email address out there in the spam databases. Google even shows 10 or more names associated with this poor schlep.

But the secret is that “client-ip=74.139.17.40” in the Received-SPF header is accurate. That is the machine that sent the spam. And if it sent one spam, it’s quite likely that it has sent more. That’s where DNSBLs come in. DNSBL stands for DNS-based Blackhole List (you will also see it called a DNS blacklist). Usually these lists are generated by creating an email address that is never meant for real use. Then any emails that arrive at that address are, by definition, spam. These addresses are called SPAM Traps. And the places where they are located are often called Honey Pots.
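As a rough sketch of that first step, here is one way to pull the client-ip value out of a Received-SPF header in PHP. The function name extract_client_ip is made up for this illustration, and the code assumes PHP 7.1 or later and a header laid out like the one quoted above. Other mail hosts may format this header differently, or not add it at all.

```php
<?php
// Hypothetical helper (not part of TCCSpamFilter.php): returns the value of
// client-ip from a Received-SPF header, or null if none can be found.
function extract_client_ip(string $receivedSpf): ?string
{
    // Grab everything after "client-ip=" up to the next ';' or whitespace.
    if (preg_match('/client-ip=([^;\s]+)/', $receivedSpf, $m) !== 1) {
        return null;
    }
    // Only accept the value if it really is an IP address.
    return filter_var($m[1], FILTER_VALIDATE_IP) !== false ? $m[1] : null;
}

// The Received-SPF header from the example email above.
$header = 'none (mxus6: 74.139.17.40 is neither permitted nor denied by domain of '
        . 'alushiptechnology.com) client-ip=74.139.17.40; '
        . 'envelope-from=tkuhnel@alushiptechnology.com; helo=Laskowski6;';

echo extract_client_ip($header), "\n"; // prints 74.139.17.40
```

Returning null instead of an empty string makes it easy for the caller to skip any message that has no usable sender IP.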

These lists of spammer IP addresses are made available to anyone that wants to use them. Let’s take a look at how another IP address, 202.177.183.110, is identified by various DNSBLs:

DNSBL – Result (Reason)
AHBL – LISTED (127.0.0.3)
CBL – LISTED (127.0.0.2)
DNSBLNETAUOSPS – LISTED (127.0.0.2)
DNSBLNETAUT1 – LISTED (127.0.0.2)
DSBL – LISTED (127.0.0.2)
DSBLALL – LISTED (127.0.0.2)
EMAILBASURA – LISTED (127.0.0.2)
NJABLPROXIES – LISTED (127.0.0.9)
PSBL – LISTED (127.0.0.2)
SBL-XBL – LISTED (127.0.0.2)
SORBS-HTTP – LISTED (127.0.0.2)
SORBS-SOCKS – LISTED (127.0.0.3)
TQM-DYNAMIC – LISTED (127.0.0.2)
TQM-SPAMTRAP – LISTED (127.0.0.3)
UCEPROTECTL1 – LISTED (127.0.0.2)

Clearly this address was used by a spammer. But how can you take advantage of this?

Well, one of two ways. First, you can contact these DNSBL sites and download the list. Or you can ask them to check a single address at a time. That’s what my routine does. It takes an IP address (202.177.183.110), reverses the order of its four numbers, and appends the name of each spam list I want to check, like this: 110.183.177.202.bl.spamcannibal.org. Then I call PHP’s gethostbyname to look that name up. If I get back the exact same text that I sent, the lookup failed, so I know the address is not on that list. If I get back something like 127.0.0.2, I know that the address is listed in their DNSBL for reason #2 (whatever that means to them). 127.0.0.3 would indicate reason 3.
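The lookup described above can be sketched in a few lines of PHP. The two function names are invented for this example, and the rule of only counting 127.0.0.x answers as a listing is an assumption on my part. It matches the return codes shown earlier, but each DNSBL documents its own codes, so check theirs. The sketch also assumes PHP 7.1 or later and IPv4 addresses only.

```php
<?php
// Hypothetical helper: turn an IPv4 address and a DNSBL zone into the host
// name to look up, e.g. 1.2.3.4 + example.org -> 4.3.2.1.example.org
function dnsbl_query_name(string $ip, string $zone): ?string
{
    if (filter_var($ip, FILTER_VALIDATE_IP, FILTER_FLAG_IPV4) === false) {
        return null; // not a usable IPv4 address
    }
    return implode('.', array_reverse(explode('.', $ip))) . '.' . $zone;
}

// Hypothetical helper: decide what a gethostbyname() answer means.
function dnsbl_is_listed(string $queryName, string $answer): bool
{
    // gethostbyname() hands back the name unchanged when the lookup fails,
    // which for a DNSBL means "not listed".
    if ($answer === $queryName) {
        return false;
    }
    // Assumption: only answers of the form 127.0.0.x count as a listing.
    return strncmp($answer, '127.0.0.', 8) === 0;
}

$query = dnsbl_query_name('202.177.183.110', 'bl.spamcannibal.org');
echo $query, "\n"; // prints 110.183.177.202.bl.spamcannibal.org

// A real check would use: $answer = gethostbyname($query);
// Canned answers are used here so the example runs without a network.
var_dump(dnsbl_is_listed($query, '127.0.0.2')); // bool(true)
var_dump(dnsbl_is_listed($query, $query));      // bool(false)
```

Keeping the name building and the answer checking separate from the DNS call itself means both can be tried out without sending a single query to a DNSBL.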

When that happens, my routine deletes the email.

What I’m posting here is my preliminary version. I have a fuller version that is much more optimized and is much nicer to the DNSBLs. I’ll give that to anyone who makes a donation to the site and requests it. I’d like to start getting a little something out of this site and this routine adds some real value. It has eliminated 98% of my SPAM emails. I’m confident enough in it that I have it set to permanently delete the emails without my even seeing them. The version posted here only marks the emails as deleted, and if it is used too frequently, it could cause your requests to be ignored by the DNSBLs. I’ll also provide a large list of all of the name servers I have and provide some tools you can use to tune your lists.
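To give a feel for what “nicer to the DNSBLs” can mean, here is one simple idea: remember the answer for each IP address so that it is only looked up once per run. This is just a sketch of that idea and is not the optimized version mentioned above. The name make_cached_checker and the stand-in lookup function are invented for the example, which assumes PHP 7.0 or later.

```php
<?php
// Hypothetical helper: wraps any "is this IP listed?" function so that each
// IP address is only passed to it once. Repeat requests are answered from
// an in-memory array that lives for the length of the script run.
function make_cached_checker(callable $lookup): callable
{
    $cache = array();
    return function (string $ip) use (&$cache, $lookup): bool {
        if (!array_key_exists($ip, $cache)) {
            $cache[$ip] = (bool) $lookup($ip);
        }
        return $cache[$ip];
    };
}

// Stand-in for a real DNSBL check so the example runs without a network.
// It counts how many times it is called.
$calls = 0;
$fakeLookup = function (string $ip) use (&$calls): bool {
    $calls++;
    return $ip === '202.177.183.110';
};

$isListed = make_cached_checker($fakeLookup);
foreach (array('202.177.183.110', '74.139.17.40', '202.177.183.110') as $ip) {
    $isListed($ip);
}
echo $calls, "\n"; // prints 2, the repeated address was answered from the cache
```

In a real script you would pass in a function that does the DNSBL lookups. Spammers tend to send in bursts from the same machine, so even a cache this small can cut down on the number of queries.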

Here is the download link:
http://www.thecodecave.com/downloads/php/TCCSpamFilter.php.txt
[php]
<?php
// *************************************************************************
// TCCSpamFilter.php 11/27/2006
// Written by Brian Layman
//
// A PHP example written to filter out spam from a webmail account.
// Provides an example of DNSBL filtering via domain name lookups.
// See http://www.thecodecave.com/article288 for details
//
// Usage:
// Customize the password, place it on your web site, and call it.
// Alternatively, add it to your cron tab file with a line like this:
// 00,15,30,45 * * * * wget -q http://www.example.com/TCCSpamFilter.php
//
// WARNING: DO NOT USE THIS SCRIPT MORE OFTEN THAN EVERY 15 MINUTES.
// You will be blocked if you abuse DNSBL services in this fashion.
// I have written an optimized version of this script that can be run
// once a minute and will produce MUCH less traffic than this version.
// Anyone who makes a donation to the site, and requests that source
// can get it.
//
// Original Author - Brian Layman
//
// Created - 27/Nov/2006
// Last Modified - 02/Dec/2006
// Contributors: (Put your name & Initials at the top)
// Brian Layman - BL - http://www.TheCodeCave.com
//
//
// History:
// 27/Nov/2006 - BL - Created
// 02/Dec/2006 - BL - Further Cleaning. Final comments about Rot13
//
// License - If this helps you - Great! Use it; modify it; share it,
// link back to my site.
//
// Indemnity -
// Use this file at your own risk. I'm not going to deliberately hack
// your server, but others might. I may or may not have been worried
// about security when I wrote this routine. It is up to YOU to make
// certain that ANY routines that you put on your Site are safe. Just
// because you see a variable here protected by AddSlashes or
// HTMLSpecialChar does not mean that ALL variables are protected.
//
// If this file allows a hole into your site, it is not my fault. In
// fact, you should just stop right now and delete this file. For if
// it causes blue smoke to be emitted from your web server, if it
// resets your business URL to point to MyClientsSuck.com, or if it
// causes your sister to break up with her lawyer boyfriend and start
// dating a caver, it is not my fault. (Actually that last one might
// be an improvement, but it is still not my fault.) YOU are
// responsible for YOUR site. Learn how to protect it and understand
// what every line of code that you use does.
//
// Donations - If this script really helps you out, feel free to
// donate an espresso via Paypal to Brian@TheCodeCave.com or just
// leave a comment at http://www.thecodecave.com/did-that-help and
// include your country of origin.

/*********************************************************************************/
/* Support Routines */
/*********************************************************************************/

// *******************************************************************************
// IP In Mask Range - NOT CURRENTLY USED
// Allows filtering via IP masks just as normal network configurations do.
// Example: ipinmaskrange("192.168.100.0", "255.255.255.0", "192.168.100.20")
// Returns true because the example is in the network
// Example2: ipinmaskrange("192.168.100.0", "255.255.255.0", "192.168.101.20")
// Returns false because the example is outside the network
// *******************************************************************************
function ipinmaskrange($network, $mask, $ip) {
    $ip_long=ip2long($ip);
    $network_long=ip2long($network);
    $mask_long=ip2long($mask);

    return (($ip_long & $mask_long) == $network_long);
}

// *******************************************************************************
// IP In Range
// Specify a range in the form a-b and the routine returns a true if the passed
// IP address is in that range.
// Example: ipinrange("192.168.100.0-192.168.100.255", "192.168.100.20")
// Returns true because the example is in the network
// Example2: ipinrange("192.168.100.0-192.168.100.255", "192.168.101.20")
// Returns false because the example is outside the network
// *******************************************************************************
function ipinrange($range, $ip) {
    $range = explode("-", $range);
    $rangestart = ip2long($range[0]);
    $rangeend = ip2long($range[1]);
    $remote_ip = ip2long($ip);
    if (($remote_ip >= $rangestart) && ($remote_ip <= $rangeend)) {
        return true;
    }
    else {
        return false;
    }
}

// *******************************************************************************
// IMAP Get Full Header
// (Thanks JamieD - http://www.codingforums.com/archive/index.php?t-89994.html)
// Returns an array containing the original message header
// *******************************************************************************
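function imap_get_full_header( $p_stream, $p_msg_number )
{
    $header_string = imap_fetchheader ( $p_stream, $p_msg_number );
    $header_array = explode ( "\n", $header_string );
    $header_obj = array();
    $last = null;
    foreach($header_array as $line)
    {
        // Header lines normally end in \r\n, so drop the leftover \r.
        $line = rtrim($line, "\r");
        // preg_match() is used here because eregi() was removed in PHP 7.
        // A header name cannot contain spaces, so folded continuation
        // lines (which start with whitespace) fall through to the else.
        if(preg_match('/^([^\s:]+): (.*)/', $line, $arg))
        {
            $header_obj[$arg[1]] = $arg[2];
            $last = $arg[1];
        }
        else if($last !== null)
        {
            $header_obj[$last] .= "\n" . $line;
        }
    }
    return ( $header_obj );
}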

// *******************************************************************************
// Blocked IP
// Performs a DNS check against a specific IP address.
// DNS-based Blackhole Lists (DNSBLs) use this method to declare whether an
// email has been sent from an IP address that has been known to send spam.
// *******************************************************************************
function BlockedIP($Suspect_IP, $DNSvr_Address)
{
    $ReverseOrderedIP = array_reverse(explode('.', $Suspect_IP));
    $FullLookupAddress = implode('.', $ReverseOrderedIP) . '.' . $DNSvr_Address;
    // gethostbyname() returns the name unchanged when the lookup fails.
    // Any other answer means the DNSBL has an entry for this IP address.
    if ($FullLookupAddress != gethostbyname($FullLookupAddress)) {
        return true;
    } else {
        return false;
    }
}

// *******************************************************************************
// Sender IP
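// Given a mailbox and message number, this routine returns the IP address of
// the computer that sent the email. The "from" address can be faked, this IP
// address cannot. Returns an empty string if no IP address can be found.
// *******************************************************************************
function senderip($mbox, $num){
    $struct = imap_get_full_header($mbox, $num);
    if (!isset($struct['Received-SPF'])) {
        return "";
    }
    $str_in = $struct['Received-SPF'];

    $chr1='client-ip=';
    $chr2=';';

    // Find where the IP address starts (just after 'client-ip=').
    $start = strpos($str_in, $chr1);
    if ($start === false) {
        return "";
    }
    $start += strlen($chr1);

    // The IP address runs up to the next ';' (or to the end of the header).
    $end = strpos($str_in, $chr2, $start);
    if ($end === false) {
        $end = strlen($str_in);
    }
    return trim(substr($str_in, $start, $end - $start));
}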

// *******************************************************************************
// Is Black Listed
// This is the core routine. Given an IP address, it runs some checks to decide if
// the email was sent from a black listed spammer.
// Usage: $is_it_spam = isblacklisted("192.168.100.1");
// *******************************************************************************
function isblacklisted($ip){
    // If there are some people I never even want to see an email from, I would put
    // their IP address in the blacklist ranges.
    // Example: $BlackList = array("192.168.100.1-192.168.100.5","192.168.102.112-192.168.102.112");
    $BlackList = array();

    // If a blocking service I want to use declares some senders to be spammers and
    // I disagree, I declare those senders in the white list.
    $WhiteList = array("64.233.160.0-64.233.191.255", // Google mail is allowed
                       "12.196.88.128-12.196.88.159"); // A false listing due to virus infection that has been purged.

    // Check white list membership first for optimization reasons.
    $allowed = false;
    foreach($WhiteList as $range) {
        if(ipinrange($range, $ip)) {
            $allowed=true;
            break;
        }
    }

    // If this address doesn't get a free pass, check it out further.
    if (!$allowed) {
        // Iterate the black lists and check the ip address against them
        $blocked = false;
        foreach($BlackList as $range) {
            if(ipinrange($range, $ip)) {
                $blocked=true;
                break;
            }
        }

        // PHP uses "Short Circuit Evaluation" so as soon as a true is hit, the routine exits out.
        // This statement should be optimized with the local check first, and then the DNSBLs from
        // the most inclusive to the least. You want to do as few external checks as possible.
        // The full version of this script comes with several other recommended DNSBLs and my full list
        // of DNSBLs of which I am aware.
        // DNSBL zones come and go, so confirm that each zone below is still operating before relying on it.
        return ($blocked ||
            (BlockedIP($ip, 'bl.spamcop.net')) ||
            (BlockedIP($ip, 'sbl-xbl.spamhaus.org')) ||
            (BlockedIP($ip, 'dnsbl-2.uceprotect.net')) ||
            (BlockedIP($ip, 'blackholes.five-ten-sg.com')) ||
            (BlockedIP($ip, 'bl.spamcannibal.org')) ||
            (false));
    }
    else {
        return false;
    }
    // NEVER BLOCK WITH: BLARSBL, FIVETENIGNORE, FIVETENSRC, JAMMDNSBL, SPAMBAG, SPEWS (these block large IP ranges)
    // NEVER BLOCK WITH: MAPS-DUL, SORBS-DUHL (these knowingly list IPs that do not meet listing criteria).
}

// This routine iterates all of the emails on the server and checks if they are spam.
// If they are, it marks them as deleted. Because it is an IMAP server, they are still
// on the server until they are purged. If you wish to remove them, use a purge
// command such as imap_expunge().
function blockspam($MAILSERVER, $PHP_AUTH_USER, $PHP_AUTH_PW){
    $mbox=imap_open($MAILSERVER, $PHP_AUTH_USER, $PHP_AUTH_PW);
    if ($mbox === false) {
        // Could not connect or log in, so there is nothing to check.
        return;
    }

    // In this example version, this iterates ALL messages in your mailbox.
    // The full version only iterates the new messages that have come in.
    // By doing that, you can run the check as often as once a minute and still
    // generate much less traffic than running this version once an hour. If you
    // abuse a DNSBL service, they might block your IP address.
    $count = imap_num_msg($mbox);
    for($x=0; $x < $count; $x++) {
        $idx=($x+1);
        $ip=senderip($mbox, $idx);
        // Skip messages where no sender IP address could be found.
        if ($ip != "" && isblacklisted($ip)) {
            imap_delete($mbox, $idx);
        }
    }
    imap_close($mbox);
}

/*********************************************************************************/
/* Main Calls */
/*********************************************************************************/
// This initial version only works with imap servers. That means you use
// port 143.
// Uncomment this line and put in your email server, email login and password
// blockspam("{imap.example.com:143}", "you@example.com","yourpassword");

// I don't like storing the passwords in plain text. You can use Rot13 as a
// really simple way to scramble them, so you can have this file on your screen
// without a passer-by seeing your email password. Rot13 doesn't make it safe
// (no two-way encryption in source does), but it will block wandering eyes.
// Use http://rot13.thecodecave.com to get the Rot13 versions of the text.
// That above line would look something like this:
// blockspam(str_rot13("{vznc.rknzcyr.pbz:143}"), str_rot13("lbh@rknzcyr.pbz"), str_rot13("lbhecnffjbeq"));

// If you have multiple accounts, add another block spam line.

?>

[/php]


Comments

2 Responses to “Build your own spam filter with PHP and DNSBLs”

  1. Rocket on September 17th, 2009 3:31 pm
  2. Brian on September 22nd, 2009 9:13 am

    huh.. the file is there but the server can’t see it… Here try this.. http://www.thecodecave.com/downloads/php/TCCSpamFilter.txt






