Email & Spam Filtering Stats: Jan 12, 2006

by Danny Sullivan on January 13, 2006

I wrote earlier that I hoped to do some side-by-side looks at how SpamCop,
Yahoo & Gmail all handle spam fighting. Here’s the rundown for the mail received
from around 12:30pm UK time yesterday to 10:30am today. In summary, SpamCop
captured the most spam with very few false matches, though only Gmail had none
at all. SpamCop might also be helped by the training I’ve done with it over
time. The figures:

Service          Yahoo   Gmail   SpamCop
Inbox              508     396       254
Spam               142     269       408
False Match         11       0         5
Total Mail         661     665       667
% Spam Caught      21%     40%       61%
% False Match       8%      0%        1%

What do the figures on the chart mean? Here we go:

Inbox: How much mail was in my inbox when I checked each service at
the same time today. Gmail is the exception. Since I download email from Gmail
throughout the day, my inbox was empty. I’ve estimated how much would be in
there if I had NOT downloaded by looking at the total mail I received at Yahoo
and SpamCop (about 665 items) and using that as an estimate. Why’s the inbox
figure important? It shows how much mail each system forced me to deal with.
If a lot of spam gets through to your inbox, that means more work. SpamCop
would have saved me the most.
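
The Gmail inbox estimate can be reproduced with a little arithmetic. This is only a sketch: it assumes Total Mail in the table is Inbox + Spam + False Match, which the post never states outright but which fits the Yahoo and SpamCop rows. The function name is made up for illustration.

```python
# Figures copied from the table in this post.
# Assumption: Total Mail = Inbox + Spam + False Match (fits the rows, not stated in the post).
yahoo_total = 508 + 142 + 11
spamcop_total = 254 + 408 + 5


def estimate_inbox(total_estimate, spam_held, false_matches):
    """Mail that would have sat in the inbox had nothing been downloaded."""
    return total_estimate - spam_held - false_matches


# "About 665 items" is the rough total taken from the Yahoo and SpamCop counts.
gmail_total_estimate = 665
gmail_inbox = estimate_inbox(gmail_total_estimate, spam_held=269, false_matches=0)

print(yahoo_total, spamcop_total, gmail_inbox)
```

Under that assumption, the result matches the 396 shown for Gmail in the Inbox row.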

Spam: Shows how much spam each service filtered. The more spam
filtered, the better — EXCEPT if there are a lot of false matches, as explained
below. SpamCop did the best job in spam catching. Some additional notes:

  • Yahoo uses what it calls SpamGuard, and it’s an off or on affair,
    no degrees, no way to crank it down to 2 or up to 11. If you pay, you can get
    what’s promised to be better protection with SpamGuard Plus.
     
  • Gmail has spam filtering on by default. You can’t turn it off or
    adjust settings.
     
  • SpamCop has a wide range of options. You can use none of them or
    all. I currently have it set to use SpamAssassin with a limit of 5, plus
    checks against the SpamCop Blacklist, the DSBL open relays and blacklists for
    South Korea, China, Nigeria, Argentina, Brazil & the SORBS list. This degree
    of control is nice, but the options and why you might use them aren’t really
    explained.

False Match: Shows how many items were considered spam and held when
in reality, they were legit mail. Why’s the false match rate important?
The higher the false match rate, the more work is required to manually check the
held mail and make sure important messages aren’t missed. More for each service:

  • Yahoo in particular savaged newsletters I receive. Search Engine
    Guide, iMedia, MediaPost, MarketWatch, my weekly Odeon cinema listings and our
    own SearchDay newsletter all got nabbed as spam. Four emails from people
    responding to messages I’d sent were also held. There is a way to flag things
    held as "Not Spam." I believe that over time, this would help train Yahoo not
    to reject such material. I may test this in the future.
     
  • Gmail had no false matches. It also caught more actual spam than
    Yahoo, so points for that. However, it didn’t catch as much as SpamCop. Like
    Yahoo, Gmail has a "Not Spam" button that, I suspect, may whitelist things if
    used and prevent them from being caught in the future.
     
  • SpamCop grabbed the most spam with a low false match rate (second
    only to Gmail), so a pretty good compromise. Like Yahoo, it nabbed the Search
    Engine Guide newsletter. It also grabbed two notifications from my Yahoo
    Groups mailing list, a message to that list and one response to a message I’d
    sent someone else.

    One thing I love about SpamCop is that junk mail is sorted alphabetically by
    default. It makes it very easy to see all the non-English spam I’m getting,
    plus see the duplicate messages and so on. I could easily do the same at Yahoo
    (in the beta I used) by clicking to sort by subject. Gmail has no ability like
    this.

    SpamCop also has the ability to "Release & Whitelist" items similar to Yahoo &
    Google. I’ve used this over time at SpamCop, so that is one reason why it
    might have a lower false match rate than Yahoo. I currently have 210
    addresses on my whitelist. I might delete these and start fresh in the future,
    to better compare. I also have three addresses on my blacklist.

Total Mail: The total amount of mail I received. My "real" mail is
uncertain.

Gmail sent me 396 items, and some of those were definitely spam that got
through. MailWasher, which I use to prescreen further, has a statistics report
window showing what I’ve deleted. Estimating as best I can, I’d say about 100
items of spam got past Gmail. So call it 300 items of real mail, which is pretty
close to what was in my SpamCop inbox.

Why not take the SpamCop inbox figure as "real" mail given the high amount of
spam it pulled? I know from experience some spam still gets through. Using the
SpamCop figure, I’d say my real mail was about 200 items, based on a typical
day. So overall, 200-300 items, that’s my estimate of what I dealt with
yesterday.

% Spam Caught: The higher the better, assuming the false match rate
isn’t high. Percentage comes from spam caught divided by total mail received.

% False Match: The lower the better, assuming you also have a high
spam caught rate. Percentage comes from false matches divided by total spam
caught.
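
For anyone who wants to check the two percentage rows, here is a minimal sketch that applies the definitions above to the counts in the table. The helper names are my own, and rounding to the nearest whole percent is an assumption about how the table was produced. Note that "% Spam Caught" as defined here is the share of all mail held as spam, not the share of actual spam that was caught.

```python
# Counts copied from the table in this post: (spam held, false matches, total mail).
stats = {
    "Yahoo": (142, 11, 661),
    "Gmail": (269, 0, 665),
    "SpamCop": (408, 5, 667),
}


def spam_caught_pct(spam, total):
    """Spam caught divided by total mail received, as a whole percent."""
    return round(100 * spam / total)


def false_match_pct(false_matches, spam):
    """False matches divided by total spam caught, as a whole percent."""
    return round(100 * false_matches / spam) if spam else 0


for service, (spam, false_matches, total) in stats.items():
    caught = spam_caught_pct(spam, total)
    false_rate = false_match_pct(false_matches, spam)
    print(f"{service}: {caught}% spam caught, {false_rate}% false match")
```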

1 SEO-siti-web July 5, 2008 at 11:58 pm

SpamCop seems to be the best choice, but i think gmail will overcome it! :)
