C’mon Gmail — Give Me Language Filtering To Stop Spam

by Danny Sullivan on April 3, 2006

in Email

I’ve written before that I wish Gmail would allow you to build better blacklists, sort your inbox alphabetically, see more than 100 results at a time or stop non-English mail from hitting my inbox, all as ways to stop spam. On the “a picture’s worth a thousand words” front, here’s a visual illustration of why all that would be helpful:

That’s just some of the spam that’s making it through Gmail’s spam filter but being flagged by my Mailwasher tool, before I decide what to download. I probably caught 100 messages like this today, when checking on what’s built up over the weekend. I see plenty like this on any typical day, when starting my morning.

I wrote before that tagging Asian-language or Cyrillic-language messages as spam would be a huge improvement for me. I speak no languages using these character sets. Everything predominantly using these character sets is going to be spam. Heck, pretty much anything that’s not in English is likely to be spam headed my way.
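To make the "predominantly using these character sets" idea concrete, here is a minimal Python sketch. It is not how Gmail or Mailwasher works. The script list, the function names and the 50% threshold are all assumptions chosen for illustration. It counts how many letters in a message belong to the flagged scripts, using the character names in Python's standard `unicodedata` module.

```python
import unicodedata

# Script prefixes as they appear in Unicode character names. This list is an
# illustrative assumption, not an exhaustive set of Asian scripts.
FLAGGED_SCRIPTS = ("CJK", "HIRAGANA", "KATAKANA", "HANGUL", "CYRILLIC")


def flagged_fraction(text: str) -> float:
    """Return the share of alphabetic characters that belong to a flagged script."""
    letters = [ch for ch in text if ch.isalpha()]
    if not letters:
        return 0.0
    flagged = sum(
        1 for ch in letters
        if unicodedata.name(ch, "").startswith(FLAGGED_SCRIPTS)
    )
    return flagged / len(letters)


def looks_like_foreign_spam(text: str, threshold: float = 0.5) -> bool:
    """Flag a message when more than `threshold` of its letters are in flagged scripts.

    The 0.5 default is a hypothetical cutoff for "predominantly".
    """
    return flagged_fraction(text) > threshold
```

A message with one stray foreign character stays below the cutoff, so an English mail quoting a single Chinese word would not be flagged under these assumptions.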

Google’s long had the ability to determine the language of a web page. They’ve done tons of work on machine translation. Why not apply some of these smarts to Gmail? Why not have an option called “Flag anything not in my preferred language as spam?” After all, I’m sure I have a counterpart in China who doesn’t speak English but is sick of getting English-language spam.

In the meantime, I thought I’d revisit the filter approach, to see if it helps. I explained before how I used to find a common Asian-language character at SpamCop as a means of catching spam it may have missed. Time to try the same with Gmail.

Gmail allows you to create filters, up to 20 of them. You’ll see a link for this right next to the “Search the Web” button at the top of the page. Next, I needed a common character. I went for a common Chinese character, the top one listed here. I pasted that into the “Has the words” box and did a test search. Bingo: thousands of matches.

The next part is tricky. I could “Skip the Inbox,” which automatically archives the mail. But this is spam. I don’t want to save it. I could also delete it automatically, but there’s the “you never know” factor. Ideally, it would be nice to send any catches to my spam filter, so I could review what’s caught each day along with any other catches, in case of a false match. Sadly, that’s not an option.

The only other choice is to create a custom label. So I do that, one called “Asian Spam.” What’s the result? I’ll let you know. I suspect tomorrow there will be a ton caught, with the upside being a cleaner inbox and the downside being that I have to review both my spam and Asian spam folders.
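The filter described above boils down to "if the message contains this character, file it under a label instead of the inbox." Here is a rough Python sketch of that logic, run against a plain list of message bodies rather than Gmail itself. The marker character is a placeholder assumption, since the exact character used isn't named here, and `route` is a hypothetical helper, not a Gmail feature.

```python
# Hypothetical marker character; the post does not say which one was used.
MARKER = "的"
LABEL = "Asian Spam"


def route(messages: list[str], marker: str = MARKER, label: str = LABEL) -> dict[str, list[str]]:
    """Split message bodies between the inbox and a review label.

    Nothing is deleted: matches are only set aside so they can be checked
    for false positives later.
    """
    folders: dict[str, list[str]] = {"Inbox": [], label: []}
    for body in messages:
        target = label if marker in body else "Inbox"
        folders[target].append(body)
    return folders


if __name__ == "__main__":
    sample = ["Lunch on Friday?", "这是你的优惠"]
    for folder, bodies in route(sample).items():
        print(folder, len(bodies))
```

Matching on a single character is crude. It misses any message that happens not to contain it, which is the trade-off of working within a keyword filter.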


{ 2 comments }

1 Grr June 5, 2009 at 10:42 am

It would be great if we didn’t need to worry about our Spam folders at all, but Google constantly flags a handful of things as spam when they are not, so I’m forced to go through my Spam folder manually and filter it by hand. What’s the point?!? Give us better tools for blocking spam.

2 Eric June 7, 2009 at 10:27 pm

I also am absolutely ^&*%$# off with this spam stuff.
Unfortunately I know nuttin about computers.
However I have an idea. Why can’t you computer techs devise a system whereby you flag all your spam mail and the whole lot (whatever is in your spam box) is then sent to ALL senders who have sent you spam.
That would change their ideas a bit.
Must be possible eh!

