The big squeeze: closing down the junk e-mail pipe - Internet

Computer Technology Review, Dec, 2003 by Jeff Ready

Anti-spam measures are big on the legislative horizon, ranging from strict legislation at the state level to not-so-strict federal bills like CAN-SPAM (which doesn't do much to can spam, according to states' attorneys general). The House/Senate/state wrangling is intense as lawmakers try to balance anti-spam measures against the protection of legitimate commercial interests. No matter who wins, spammers aren't going to shut down and go home. Companies will have to keep running anti-spam filters on their corporate e-mail, just as they do now. Catching junk e-mail is a technical challenge because of its volume and sophistication. (Individual junk e-mailers may be bottom feeders, but the spam companies that enable them have deep pockets and smart programmers.)

Organizations deploy anti-spam technology in a variety of locations and forms. Spam filters come in both client-based and server-based flavors: client-based filters run on end-user machines, while server-based filters can run anywhere from an outside Internet service to the firewall to the e-mail server. Spam filter approaches fall into four major categories, with many filters combining the technologies: whitelisting/blacklisting, pattern matching, signature filtering, and natural language processing.

Whitelist/Blacklist

This basic filtering level works from lists of good e-mail addresses (whitelist) or of spammer e-mail addresses or domains (blacklist). Blacklist filters reject any message originating from or routed through a blacklisted address or domain, while whitelist filters accept only messages from an address or domain on a user-approved list. Some filtering applications use one or the other, but many combine them.
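A combined check of this kind can be sketched in a few lines of Python. This is only an illustration: the function name, the return labels, and the choice to let the whitelist take precedence over the blacklist are assumptions, not details from any particular product.

```python
def classify_sender(sender, whitelist, blacklist):
    """Return 'accept', 'reject', or 'unknown' for a sender address.

    Both lists may hold full addresses or bare domains.
    """
    address = sender.strip().lower()
    domain = address.rpartition("@")[2]

    # Assumption: the whitelist wins, so a trusted contact at a
    # blacklisted domain still gets through.
    if address in whitelist or domain in whitelist:
        return "accept"
    if address in blacklist or domain in blacklist:
        return "reject"
    # Neither list knows the sender; other filters must decide.
    return "unknown"
```

A whitelist-only filter would treat "unknown" as a rejection, while a blacklist-only filter would treat it as an acceptance.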

Whitelisting

Whitelisting, or positive filtering, checks incoming e-mail against a list of approved addresses. If the sender is not on the list, the filter can delete the message, move it to a quarantine folder, or send a challenge e-mail back to the sender. If the sender personally replies to the challenge, the filter assumes there is a real person at the other end and adds the address to the approved list. This option is extremely selective about incoming e-mail, but challenge messages can seriously annoy legitimate senders. It is also susceptible to sophisticated address forging.

E-mail users should be able to add addresses to the whitelist. Most whitelist filters start by building their lists from e-mail addresses found in the user's existing mailbox and address book. Whitelists won't catch spammers who have hijacked known good addresses, but will catch spammers who haven't. They will also catch e-mail from your mother if she isn't on your whitelist, so users should check the list periodically.
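The whitelist workflow described above (seed the list from the address book, challenge unknown senders, approve those who reply) might look roughly like the following Python sketch. The class and method names are hypothetical, and actually sending the challenge e-mail is left out.

```python
class Whitelist:
    """Toy challenge-response whitelist (names are illustrative only)."""

    def __init__(self, address_book=()):
        # Seed the approved list from addresses the user already knows.
        self.approved = {a.strip().lower() for a in address_book}
        self.pending = set()  # senders who have been sent a challenge

    def check(self, sender):
        sender = sender.strip().lower()
        if sender in self.approved:
            return "deliver"
        # Unknown sender: hold the message and issue a challenge.
        self.pending.add(sender)
        return "challenge"

    def confirm(self, sender):
        """Call when a challenged sender replies; approves the address."""
        sender = sender.strip().lower()
        if sender in self.pending:
            self.pending.discard(sender)
            self.approved.add(sender)
            return True
        return False
```

Because the sketch trusts whatever address appears on the message, it shares the weakness noted above: a forged known good address is delivered without question.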

[ILLUSTRATION OMITTED]

Blacklisting

Blacklisting, or negative filtering, compares incoming addresses, subject lines and message text against a blacklist. It intercepts any offending messages and deletes them or moves them into a quarantine folder. For example, common filters include rules for blocking mail with "free" or "cash" in the subject line as well as shady words we won't mention. Filters can also block certain ISPs or specific addresses. Blacklisting used to be simpler, but filters must now cope with the ridiculous punctuation spammers use in subject lines. Blacklisting also requires a large number of filters and a lot of CPU processing time, and often returns false positives--identifying an innocent message as spam. In fact, blacklists are better at blocking known viruses than spam--e-mail administrators can use them to deny attachments with common virus extensions such as .exe, .bat and .vbs. Blacklisting needs carefully maintained lists to work, since spam programmers are flexible and creative and can turn on a dime. Companies using blacklists can keep a database in-house, though this is labor intensive. Many sign up with a third-party service provider that constantly updates its blacklists for client companies.
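A deliberately naive Python sketch of such a blacklist rule set follows. The word list and extensions are the examples given in the text; the function name and message layout are assumptions made for illustration.

```python
import re

BLOCKED_WORDS = {"free", "cash"}               # example words from the text
BLOCKED_EXTENSIONS = (".exe", ".bat", ".vbs")  # example extensions from the text

def is_blacklisted(subject, attachments=()):
    """Return True if the subject or any attachment name trips a rule."""
    # Split the subject into lowercase alphabetic words.
    words = set(re.findall(r"[a-z]+", subject.lower()))
    if words & BLOCKED_WORDS:
        return True
    # str.endswith accepts a tuple of suffixes.
    return any(name.lower().endswith(BLOCKED_EXTENSIONS) for name in attachments)
```

The sketch shows both weaknesses described above: an innocent subject such as "Feel free to call me" is blocked (a false positive), while "F.R.E.E offer" slips through because the punctuation breaks the word apart.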

Pattern Matching

Pattern matching defines a set of criteria that classify messages as spam. Characteristics include such items as all-capitals subject lines, frequent spam phrases, and suspicious header lines. Administrators and users can assign point values to individual characteristics (for example, a high value for porn and a lower one for business offers). The filter then marks any message whose total score reaches or exceeds a set threshold as spam. Some systems allow the user to train the software to recognize spam or to exempt messages from spam blocking.
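The point-scoring idea can be illustrated with a short Python sketch. The specific rules, point values, threshold, and the dictionary layout of a message are all made up for the example; a real filter would carry far more rules and tuned weights.

```python
import re

# Hypothetical rules: (description, test function, points).
RULES = [
    ("all-capitals subject",
     lambda m: m["subject"].isupper(), 3),
    ("common spam phrase",
     lambda m: re.search(r"\b(free|cash)\b", m["subject"], re.I) is not None, 2),
    ("missing Date header",  # one assumed example of a suspicious header
     lambda m: "Date" not in m["headers"], 2),
]
THRESHOLD = 5  # assumed cutoff; administrators would tune this

def spam_score(message):
    """Add up the points of every rule the message matches."""
    return sum(points for _, test, points in RULES if test(message))

def is_spam(message, threshold=THRESHOLD):
    """Mark the message as spam when its score reaches the threshold."""
    return spam_score(message) >= threshold
```

With these values, a single matching characteristic is never enough to condemn a message; it takes a combination, which is what lets scoring produce fewer false positives than a flat blacklist.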

Pattern matching filters often use whitelist/blacklist techniques as well, but depend on more sophisticated technologies like content pattern recognition and flexible content filtering. Typical approaches include:

* Identifying invalid HTML tags: Spammers try to disguise HTML-enabled spam by inserting meaningless content within specific HTML tags.

* Making case-sensitive checks: Another common spammer technique is displaying subject lines exclusively in upper case.

* Practicing intelligent word recognition: To avoid blacklists, spammers will deliberately alter the subject line by adding or removing punctuation, adding nonsense phrases, misspelling words or compressing spaces.
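The three approaches in the list above can each be sketched as a small Python check. Everything here is an assumption for illustration: the set of "known" tags is a tiny sample, the character substitutions are a handful of common ones, and the function names are invented.

```python
import re
from html.parser import HTMLParser

# A small sample of legitimate tags; a real filter would use a full list.
KNOWN_TAGS = {"html", "head", "title", "body", "p", "br", "a", "b", "i",
              "u", "font", "table", "tr", "td", "img", "div", "span"}

class TagCollector(HTMLParser):
    """Collects the names of opening tags that are not in KNOWN_TAGS."""
    def __init__(self):
        super().__init__()
        self.unknown = []

    def handle_starttag(self, tag, attrs):
        if tag not in KNOWN_TAGS:
            self.unknown.append(tag)

def unknown_tags(html):
    """Invalid-tag check: list made-up tags found in an HTML body."""
    parser = TagCollector()
    parser.feed(html)
    parser.close()
    return parser.unknown

def is_shouting(subject):
    """Case check: True if every letter in the subject is upper case."""
    return subject.isupper()

# Assumed look-alike characters spammers swap in for letters.
SUBSTITUTIONS = str.maketrans({"0": "o", "1": "l", "3": "e", "@": "a", "$": "s"})

def contains_disguised(subject, words=("free", "cash")):
    """Word recognition: undo substitutions, strip punctuation and spaces,
    then look for the target words in what remains."""
    text = subject.lower().translate(SUBSTITUTIONS)
    squeezed = re.sub(r"[^a-z]", "", text)
    return any(word in squeezed for word in words)
```

Squeezing out spaces catches "F R E E" but can also match across word boundaries in innocent text, and the sketch makes no attempt at misspellings, so in practice such checks would feed a score rather than block a message outright.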

 


Content provided in partnership with Thomson Gale