<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	>
<channel>
	<title>Comments on: Spam-A-Lot</title>
	<atom:link href="http://theboysaunders.com/2006/11/spam-a-lot/feed/" rel="self" type="application/rss+xml" />
	<link>http://theboysaunders.com/2006/11/spam-a-lot/</link>
	<description>You Little Punks Think You Own This Town...</description>
	<pubDate>Mon, 21 May 2012 01:52:57 +0000</pubDate>
	<generator>http://wordpress.org/?v=2.6.3</generator>
		<item>
		<title>By: jsp</title>
		<link>http://theboysaunders.com/2006/11/spam-a-lot/#comment-110</link>
		<dc:creator>jsp</dc:creator>
		<pubDate>Wed, 29 Nov 2006 08:54:30 +0000</pubDate>
		<guid isPermaLink="false">http://theboysaunders.com/2006/11/spam-a-lot/#comment-110</guid>
		<description>Yeah, it's a sad little arms race indeed. Making it even worse: 15 min after I posted that, I got a spam which &lt;a href="http://theboysaunders.com/jsp/spamcrop.gif" rel="nofollow"&gt;included this disclaimer&lt;/a&gt;:

&lt;blockquote&gt;The Publisher of this report was compensated by an unrelated third party twenty five thousand dollars for distribution of this report.&lt;/blockquote&gt;

If true, that means the spammer got $25k for "distribution" of this scam. Sigh...</description>
		<content:encoded><![CDATA[<p>Yeah, it&#8217;s a sad little arms race indeed. Making it even worse: 15 min after I posted that, I got a spam which <a href="http://theboysaunders.com/jsp/spamcrop.gif" rel="nofollow">included this disclaimer</a>:</p>
<blockquote><p>The Publisher of this report was compensated by an unrelated third party twenty five thousand dollars for distribution of this report.</p></blockquote>
<p>If true, that means the spammer got $25k for &#8220;distribution&#8221; of this scam. Sigh&#8230;</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Phil Saunders</title>
		<link>http://theboysaunders.com/2006/11/spam-a-lot/#comment-109</link>
		<dc:creator>Phil Saunders</dc:creator>
		<pubDate>Tue, 28 Nov 2006 17:05:04 +0000</pubDate>
		<guid isPermaLink="false">http://theboysaunders.com/2006/11/spam-a-lot/#comment-109</guid>
		<description>Thanks John. For some sad reason I actually found myself hoping against hope that the gibberish was something profound. 

It is fascinating how every weird, seemingly random, little facet of such a superficially odd email is part of a contrived and cynical attempt to fool the Interweb's spam filters.

I'd say my faith in the fundamental goodness of human nature has now been irretrievably shattered, but that happened many years ago.</description>
		<content:encoded><![CDATA[<p>Thanks John. For some sad reason I actually found myself hoping against hope that the gibberish was something profound. </p>
<p>It is fascinating how every weird, seemingly random, little facet of such a superficially odd email is part of a contrived and cynical attempt to fool the Interweb&#8217;s spam filters.</p>
<p>I&#8217;d say my faith in the fundamental goodness of human nature has now been irretrievably shattered, but that happened many years ago.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: jsp</title>
		<link>http://theboysaunders.com/2006/11/spam-a-lot/#comment-108</link>
		<dc:creator>jsp</dc:creator>
		<pubDate>Fri, 24 Nov 2006 22:22:45 +0000</pubDate>
		<guid isPermaLink="false">http://theboysaunders.com/2006/11/spam-a-lot/#comment-108</guid>
		<description>Allow me.

At present, the best tech available for spam filtering is statistical analysis (also called Bayesian filtering) based upon your own personal inbox. Rather than using generic rules/filters (DELETE IF "Viagra"), you feed programs such as &lt;a href="http://dspam.nuclearelephant.com/" rel="nofollow"&gt;DSPAM&lt;/a&gt; (which I use) a corpus of your e-mail (I trained it with roughly 2,000 "spam" and 2,500 "ham" messages) and it then assigns each incoming message a spam probability and confidence level.

This works very well: you can get around "Viagra" with "V1@gr@", but once every person has a customized filter based on their own friends, your job becomes much tougher. Hence messages with nonsensical subjects and large blocks of unrelated text in an attempt to "dilute" the message so that the filter doesn't spot the loaded words.

The image you got takes the battle further. First, because the text appears as a graphic, anti-spam software can't read it and can therefore miss dead-giveaway words ("strong buy", "3-day target"). Theoretically, the spam scanner could employ &lt;a href="http://en.wikipedia.org/wiki/Optical_character_recognition" rel="nofollow"&gt;OCR&lt;/a&gt; to "read" the text in an image, but in practice this is way too computationally intensive. Your spammer takes no chances, though, employing an anti-OCR varied background and shifting baseline to make a &lt;a href="http://en.wikipedia.org/wiki/Captcha" rel="nofollow"&gt;captcha&lt;/a&gt;-style image that would be tough to decipher even for humans.

Thus, the statistical spam filter only "sees" a message that seems to be highly original, with none of the words that set it off. And that's why these are the only kinds of spam that actually make it into my inbox.

Bonus 1: I would be shocked if the company itself was involved in the sending of these messages. The bump in value is &lt;a href="http://www.spamstocktracker.com/" rel="nofollow"&gt;incredibly short-lived&lt;/a&gt;, and soliciting a stock purchase (without a prospectus) is against the law in the U.S. The company's execs could be punished and the stock delisted as a result.

Which is why third parties love the penny (OTC) stocks: low investment, high volatility. Buy, pump, dump.

Bonus 2: I installed DSPAM in February, and my spam folder just passed 8,000 messages. You'd expect them all to be from Feb-Nov '06, but instead I have messages from 1969, 1980, 1997... through to 2037 and 2038.

Why? I expect the answer lies in how many e-mail clients handle new mail. In my case (Thunderbird), when new mail is checked the screen automatically scrolls to &lt;em&gt;show the first (oldest) new message&lt;/em&gt;. Suppose I checked my e-mail right now and got 10 new e-missives, one of which was dated 1998. Thunderbird would helpfully scroll to the bottom of my e-mail, showing a boldfaced spam... all by its attention-grabbing lonesome.</description>
		<content:encoded><![CDATA[<p>Allow me.</p>
<p>At present, the best tech available for spam filtering is statistical analysis (also called Bayesian filtering) based upon your own personal inbox. Rather than using generic rules/filters (DELETE IF &#8220;Viagra&#8221;), you feed programs such as <a href="http://dspam.nuclearelephant.com/" rel="nofollow">DSPAM</a> (which I use) a corpus of your e-mail (I trained it with roughly 2,000 &#8220;spam&#8221; and 2,500 &#8220;ham&#8221; messages) and it then assigns each incoming message a spam probability and confidence level.</p>
<p>This works very well: you can get around &#8220;Viagra&#8221; with &#8220;V1@gr@&#8221;, but once every person has a customized filter based on their own friends, your job becomes much tougher. Hence messages with nonsensical subjects and large blocks of unrelated text in an attempt to &#8220;dilute&#8221; the message so that the filter doesn&#8217;t spot the loaded words.</p>
<p>The image you got takes the battle further. First, because the text appears as a graphic, anti-spam software can&#8217;t read it and can therefore miss dead-giveaway words (&#8220;strong buy&#8221;, &#8220;3-day target&#8221;). Theoretically, the spam scanner could employ <a href="http://en.wikipedia.org/wiki/Optical_character_recognition" rel="nofollow">OCR</a> to &#8220;read&#8221; the text in an image, but in practice this is way too computationally intensive. Your spammer takes no chances, though, employing an anti-OCR varied background and shifting baseline to make a <a href="http://en.wikipedia.org/wiki/Captcha" rel="nofollow">captcha</a>-style image that would be tough to decipher even for humans.</p>
<p>Thus, the statistical spam filter only &#8220;sees&#8221; a message that seems to be highly original, with none of the words that set it off. And that&#8217;s why these are the only kinds of spam that actually make it into my inbox.</p>
<p>Bonus 1: I would be shocked if the company itself was involved in the sending of these messages. The bump in value is <a href="http://www.spamstocktracker.com/" rel="nofollow">incredibly short-lived</a>, and soliciting a stock purchase (without a prospectus) is against the law in the U.S. The company&#8217;s execs could be punished and the stock delisted as a result.</p>
<p>Which is why third parties love the penny (OTC) stocks: low investment, high volatility. Buy, pump, dump.</p>
<p>Bonus 2: I installed DSPAM in February, and my spam folder just passed 8,000 messages. You&#8217;d expect them all to be from Feb-Nov &#8217;06, but instead I have messages from 1969, 1980, 1997&#8230; through to 2037 and 2038.</p>
<p>Why? I expect the answer lies in how many e-mail clients handle new mail. In my case (Thunderbird), when new mail is checked the screen automatically scrolls to <em>show the first (oldest) new message</em>. Suppose I checked my e-mail right now and got 10 new e-missives, one of which was dated 1998. Thunderbird would helpfully scroll to the bottom of my e-mail, showing a boldfaced spam&#8230; all by its attention-grabbing lonesome.</p>
]]></content:encoded>
	</item>
</channel>
</rss>

