C'mon Gmail -- Give Me Language Filtering To Stop Spam

I've written before that I wish Gmail would allow you to build better blacklists, sort your inbox alphabetically, see more than 100 results at a time or keep non-English mail from hitting my inbox, all as ways to stop spam. On the "a picture's worth a thousand words" front, here's a visual illustration of why all that would be helpful:

That's just some of the spam that's making it through Gmail's spam filter but being flagged by my Mailwasher tool, before I decide what to download. I probably caught 100 messages like this today, when checking on what's built up over the weekend. I see plenty like this on any typical day, when starting my morning.

I wrote before that tagging Asian-language or Cyrillic-language messages as spam would be a huge improvement for me. I speak no languages using these character sets. Everything predominantly using these character sets is going to be spam. Heck, pretty much anything that's not in English is likely to be spam headed my way.
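To make the "predominantly using these character sets" idea concrete, here's a rough sketch of how such a check might work. It is only an illustration under my own assumptions: the function names, the list of script prefixes and the 50% threshold are all made up for the example, and it says nothing about how Gmail actually works. It relies on the fact that Unicode character names begin with the script name.

```python
import unicodedata

# Script prefixes as they appear at the start of Unicode character names.
# This list is an illustrative assumption, not a complete inventory.
FOREIGN_PREFIXES = ("CJK", "HIRAGANA", "KATAKANA", "HANGUL", "CYRILLIC")


def foreign_script_ratio(text: str) -> float:
    """Return the share of alphabetic characters in the flagged scripts."""
    letters = [ch for ch in text if ch.isalpha()]
    if not letters:
        return 0.0
    foreign = sum(
        1
        for ch in letters
        if unicodedata.name(ch, "").startswith(FOREIGN_PREFIXES)
    )
    return foreign / len(letters)


def looks_like_foreign_spam(text: str, threshold: float = 0.5) -> bool:
    """Flag a message when most of its letters are in the flagged scripts.

    The 0.5 threshold is a hypothetical default, chosen only for this sketch.
    """
    return foreign_script_ratio(text) > threshold
```

A message that merely contains a foreign name or a word or two stays under the threshold, which is the point of looking at what a message predominantly uses rather than at any single character.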

Google's long had the ability to determine the language of a web page. They've done tons of work on machine translation. Why not apply some of these smarts to Gmail? Why not have an option called "Flag anything not in my preferred language as spam"? After all, I'm sure I have a counterpart in China who doesn't speak English but is sick of getting English-language spam.

In the meantime, I thought I'd revisit the filter approach, to see if it helps. I explained before how I used to find a common Asian-language character at SpamCop as a means of catching spam it may have missed. Time to try the same with Gmail.

Gmail allows you to create filters, up to 20 of them. You'll see a link for this right next to the "Search the Web" button at the top of the page. Next, I needed a common character. I went for a common Chinese character, the top one listed here. I pasted that into the "Has the words" box and did a test search. Bingo -- thousands of matches.

The next part is tricky. I could "Skip the Inbox," which automatically archives the mail. But this is spam. I don't want to save it. I could also delete it automatically, but there's the "you never know" factor. Ideally, it would be nice to send any catches to my spam folder, so I could review what's caught each day along with any other catches, in case of a false match. Sadly, that's not an option.

The only other choice is to create a custom label. So I do that, one called "Asian Spam." What's the result? I'll let you know. I suspect tomorrow there will be a ton caught, with the upside being a cleaner inbox and the downside having to review both my spam and Asian spam folders.
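For anyone who wants to see the logic of that filter spelled out, here's a minimal sketch: match on a single common character, then apply a label rather than archiving or deleting, so everything caught can still be reviewed. This is not Gmail's API. The `Message` class and both functions are hypothetical, and the character 的 is only a stand-in, since any sufficiently common character would serve the same purpose.

```python
from dataclasses import dataclass, field

# Stand-in marker character, chosen only for this illustration.
MARKER_CHAR = "的"
LABEL = "Asian Spam"


@dataclass
class Message:
    """Hypothetical in-memory message, not a real Gmail object."""
    subject: str
    body: str
    labels: set = field(default_factory=set)


def matches_filter(msg: Message, marker: str = MARKER_CHAR) -> bool:
    """True when the marker character appears in the subject or body."""
    return marker in msg.subject or marker in msg.body


def apply_filter(inbox: list) -> list:
    """Label matching messages for later review; nothing is deleted."""
    caught = []
    for msg in inbox:
        if matches_filter(msg):
            msg.labels.add(LABEL)
            caught.append(msg)
    return caught


inbox = [
    Message("Lunch?", "See you at noon"),
    Message("你好", "这是我的广告"),
]
caught = apply_filter(inbox)
```

Because the messages are only labeled, a false match is still there to be rescued, which covers the "you never know" factor.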

By Danny Sullivan on Apr. 3, 2006
