<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Becoming paranoid &#187; Email</title>
	<atom:link href="http://becomingparanoid.com/category/email/feed/" rel="self" type="application/rss+xml" />
	<link>http://becomingparanoid.com</link>
	<description>Tips about computer security, privacy and staying safe online</description>
	<lastBuildDate>Wed, 03 Oct 2007 13:03:29 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.8.5</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Chain letters</title>
		<link>http://becomingparanoid.com/2006/05/15/chain-letters/</link>
		<comments>http://becomingparanoid.com/2006/05/15/chain-letters/#comments</comments>
		<pubDate>Mon, 15 May 2006 10:54:38 +0000</pubDate>
		<dc:creator>madelman</dc:creator>
				<category><![CDATA[Beginner]]></category>
		<category><![CDATA[Email]]></category>
		<category><![CDATA[Security]]></category>

		<guid isPermaLink="false">http://becomingparanoid.com/2006/05/15/chain-letters/</guid>
		<description><![CDATA[With some regularity, we all receive e-mails from people we know warning us about some really dangerous virus or asking us to collaborate in a project to help a poor kid,&#8230;
These e-mails are known as hoaxes and, although they are sent with good intentions, they [...]]]></description>
			<content:encoded><![CDATA[<p>With some regularity, we all receive e-mails from people we know warning us about some really dangerous virus or asking us to collaborate in a project to help a poor kid,&hellip;</p>
<p>These e-mails are known as hoaxes and, although they are sent with good intentions, they are almost always false: a kind of urban legend spread through the Internet.</p>
<p>You can spot this kind of e-mail because it says you will suffer a big loss if you don&rsquo;t forward it, it is not signed, it promises gifts from a company or it offers some hard-to-believe information.</p>
<p>Some examples of this kind of message:</p>
<ul>
<li>The Make A Wish Foundation, has agreed to donate 7 cents evertime this message is sent on.</li>
<li>If you forward it to 20 friends, you will receive the brand new Ericsson R320 WAP-phone. </li>
<li>DO NOT RELY ON YOUR ANTI-VIRUS SOFTWARE. McAFEE NOR NORTON CAN DETECT IT BECAUSE IT DOES NOT BECOME A VIRUS UNTIL JUNE 1ST. IT WILL BE TO LATE THEN. WHATEVER YOU DO, DO NOT OPEN THE FILE!!! </li>
</ul>
<p>These e-mails have all been taken from <a href="http://www.breakthechain.org/">Break the chain</a>, a site dedicated to collecting them, so you can check whether an e-mail you receive is a hoax or not.</p>
<p>You should never forward these letters to your friends: they are very annoying, they clutter up inboxes and, many times, they can be used to harvest e-mail addresses to send spam to. If your friends send them to you, you should tell them not to do it and why it is bad, pointing them to <a href="http://www.breakthechain.org/">Break the chain</a>&nbsp;if necessary.</p>
]]></content:encoded>
			<wfw:commentRss>http://becomingparanoid.com/2006/05/15/chain-letters/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>E-mail security: detecting spam (V)</title>
		<link>http://becomingparanoid.com/2006/04/05/e-mail-security-detecting-spam-v/</link>
		<comments>http://becomingparanoid.com/2006/04/05/e-mail-security-detecting-spam-v/#comments</comments>
		<pubDate>Wed, 05 Apr 2006 22:44:46 +0000</pubDate>
		<dc:creator>madelman</dc:creator>
				<category><![CDATA[Email]]></category>
		<category><![CDATA[Medium]]></category>
		<category><![CDATA[Security]]></category>
		<category><![CDATA[Spam]]></category>

		<guid isPermaLink="false">http://becomingparanoid.com/2006/04/05/e-mail-security-detecting-spam-v/</guid>
		<description><![CDATA[We saw some techniques spammers use to try to evade Bayesian spam filters and how the use of these techniques is making spam a bit less effective and, sometimes, even easier to detect.
But spammers know this and they won&#8217;t allow their business to go down so easily. So what is the future of filter [...]]]></description>
			<content:encoded><![CDATA[<p>We saw some techniques spammers use <a href="http://becomingparanoid.com/2006/03/29/e-mail-security-detecting-spam-ii/">to try to evade Bayesian spam filters</a> and how the use of these techniques is making spam a bit less effective and, sometimes, even easier to detect.</p>
<p>But spammers know this and they won&#8217;t allow their business to go down so easily. So what is the future of filter evasion? I have been thinking about some techniques which would probably evade most current filters and perhaps it&#8217;s time to prepare against them before it&#8217;s too late.</p>
<p>The idea for this list came from a post by <a href="http://vivekjishtu.blogspot.com/2006/03/beware-of-new-form-of-spam-greetings.html">Vivek Jishtu</a> where he explains how a spammer is using Yahoo greeting cards to send his messages without being detected by filters. This service allows anyone to send a card to someone, who will be notified by e-mail and will receive a link to a site where the card can be viewed. The spammer can include arbitrary content in the card, so he can put his spam message there and, as it will not pass through any filter, it won&#8217;t be detected. So, if the user receiving the card follows the link he will see this (translated from Chinese by Google Translator):</p>
<p><center><br />
<img src="http://becomingparanoid.com/images/spamgreeting.png"><br />
</center><br />
<span id="more-51"></span><br />
With another link to the site the spammer is promoting. This is a neat trick and a difficult one to avoid. The only solution is to educate users not to follow links coming in unexpected mails or from unknown sources.</p>
<p>But there are also other methods that spammers might use now or in the future (I&#8217;m not aware that any of these is currently in use, but you never know). </p>
<p>The first technique is copied from viruses and worms, which have used it for a long time. Instead of sending the content of the spam in the main body of the message, <strong>a ZIP file can be attached containing a text file with the advertisement</strong> from the spammer. If this becomes popular, Bayesian spam filters might be unable to detect it, as the analyzed content may contain no suspicious words and look innocuous. To be able to analyze the spam, the filter would have to decompress the ZIP file and search for text files inside it. This can also be avoided with another technique coming from the virus world, the use of ZIP files protected with a password, as the <a href="http://www.f-secure.com/v-descs/bagle_j.shtml">Bagle-J</a> virus did. The user is told to open the ZIP file using a password contained in the main body, so the filter won&#8217;t be able to decompress the file but the user will.</p>
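<p>As a rough illustration of the extra work this asks of a filter, the following Python sketch uses the standard zipfile module to pull the text files out of a ZIP attachment and to notice when an entry is password-protected. The function name, the ".txt" heuristic and the size limit are assumptions made for this example; they do not come from any real filter.</p>

```python
import io
import zipfile

MAX_BYTES = 1_000_000  # assumed per-entry size limit, to avoid huge decompressions


def extract_zip_text(attachment: bytes):
    """Return (texts, has_encrypted) for a ZIP attachment given as bytes.

    texts holds the decoded content of every readable .txt entry;
    has_encrypted is True if at least one entry needs a password.
    """
    texts = []
    has_encrypted = False
    with zipfile.ZipFile(io.BytesIO(attachment)) as archive:
        for info in archive.infolist():
            # Bit 0 of the general-purpose flags marks an encrypted entry.
            if info.flag_bits & 0x1:
                has_encrypted = True
                continue
            if not info.filename.lower().endswith(".txt"):
                continue
            if info.file_size > MAX_BYTES:
                continue
            texts.append(archive.read(info).decode("utf-8", errors="replace"))
    return texts, has_encrypted
```

<p>A filter could pass the returned texts to its normal analysis; what to do with a password-protected entry is a policy decision this sketch leaves open.</p>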
<p>Another technique, similar to the use of images instead of text, is <strong>sending their advertisements in attached files in some popular file format</strong>, like PDF or Microsoft Word files. Again, the content of the main body might be totally innocuous, asking the user to open the attached file. The filter will need to understand the file format to be able to extract the text and analyze it, which will consume resources from the computer, something sometimes not feasible in servers with lots of users. </p>
<p>These two techniques can be stopped by disallowing attached files or, at least, restricting the accepted formats, as some servers already do to prevent the reception of viruses. We can also educate users not to open attached files coming from unknown sources, although I doubt this will work, as we can see from the spread of some viruses which work this way.</p>
<p>Spammers could even do another loop and send their spam inside a PDF file compressed in a ZIP file protected by a password&#8230; OK, enough, enough,&#8230;</p>
<p>I don&#8217;t know if any of these or similar techniques will be used by spammers in the near future. If they do use them it will be harder to filter the spam but, at the same time, it will mean we are winning a battle in this war against spam. We had better be prepared before it&#8217;s too late.</p>
<p>Of course, Bayesian filtering is not the only way to detect spam, although we have been concentrating on it. There are other techniques currently in use, which probably might be more effective against these new attacks and we&#8217;ll see them in another post.</p>
]]></content:encoded>
			<wfw:commentRss>http://becomingparanoid.com/2006/04/05/e-mail-security-detecting-spam-v/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>E-mail security: detecting spam (IV)</title>
		<link>http://becomingparanoid.com/2006/04/03/e-mail-security-detecting-spam-iv/</link>
		<comments>http://becomingparanoid.com/2006/04/03/e-mail-security-detecting-spam-iv/#comments</comments>
		<pubDate>Mon, 03 Apr 2006 09:15:01 +0000</pubDate>
		<dc:creator>madelman</dc:creator>
				<category><![CDATA[Beginner]]></category>
		<category><![CDATA[Email]]></category>
		<category><![CDATA[Security]]></category>
		<category><![CDATA[Spam]]></category>

		<guid isPermaLink="false">http://becomingparanoid.com/2006/04/03/e-mail-security-detecting-spam-iv/</guid>
		<description><![CDATA[Knowing how Bayesian filtering works we will try to find some programs which use it and see which is the most useful one for us. I&#8217;ll give a list and you should choose the most appropriate for you.
We can split the filtering programs depending on where they work: on the server or on the client. [...]]]></description>
			<content:encoded><![CDATA[<p>Knowing how Bayesian filtering works we will try to find some programs which use it and see which is the most useful one for us. I&#8217;ll give a list and you should choose the most appropriate for you.</p>
<p>We can split the filtering programs depending on where they work: on the server or on the client. The programs working on the server have some advantages: as they look at more mail messages (they see mail from all users in a system), it is easier and faster to train them. Furthermore, there is only one place to administer them, making the administrator&#8217;s task easier. At the same time, the users don&#8217;t need to receive the spam, so they don&#8217;t spend additional bandwidth and time. On the other hand, they are not as customizable by the user, who might prefer his own techniques to detect spam and false positives and, if the user doesn&#8217;t have access to the server, he will not be able to install them.</p>
<p>One of the best-known server-side filtering programs is <a href="http://spamassassin.apache.org/">SpamAssassin</a>, which uses different checks to test for spam, one of them being Bayesian filtering. Each of these tests adds or subtracts a score from the mail and, at the end of the run, this score determines whether the mail is spam or not. Amongst others, these tests include mail-header tests, text-content rules, white-lists and black-lists and collaborative databases, making this program one of the most accurate. It can also be used for client-side filtering, although the installation will not be as easy as with other programs.<br />
<span id="more-50"></span><br />
Another approach to server-side filtering is the one used by <a href="http://assp.sourceforge.net/">ASSP</a>, which works with any kind of mail server, as it stands as a proxy (receiving the data and passing it on) in front of the real mail server and filters the data before it is delivered. It also uses Bayesian filtering and allows setting white-lists so you can define addresses which will always be accepted. It can also scan messages for viruses, which will drop even more malicious mail.</p>
<p>The last server-side software we are going to see is <a href="http://dspam.nuclearelephant.com/">DSpam</a>. It has some characteristics which differentiate it from other Bayesian filters. In this case, the tokens are not only analyzed one by one, but also in pairs, which gives a better basis for deciding whether a mail is spam or not. It also uses Bayesian Noise Reduction and other new approaches to filtering, which promise to give a high detection rate. It includes a web-based interface to administer it, where each user can train it according to personal taste.</p>
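<p>To give an idea of what analyzing tokens in pairs can mean, here is a minimal Python sketch. It only illustrates the general idea with a hypothetical helper and simple whitespace splitting; it is not DSpam&#8217;s actual code.</p>

```python
def tokens_and_pairs(text: str):
    """Return the single tokens followed by pairs of adjacent tokens."""
    words = text.lower().split()
    pairs = [f"{first} {second}" for first, second in zip(words, words[1:])]
    return words + pairs
```

<p>For the text "Buy cheap pills" this yields the three single words plus "buy cheap" and "cheap pills", so a filter can learn that a combination is suspicious even when each word alone is not.</p>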
<p>If we don&#8217;t have access to the server or we don&#8217;t want to play with it, we can use a client-based filter, installed on our computer, which will analyze the mail once it has been downloaded (or while downloading) and will flag it as spam or legitimate mail. The advantage of this kind of approach is that it can be tightly integrated into our mail reader, so it might be easier to use.</p>
<p>If we use <a href="http://www.mozilla.com/thunderbird/">Thunderbird</a>, it already includes a filter, as we saw in <a href="http://becomingparanoid.com/2006/03/30/e-mail-security-detecting-spam-iii/">the last post about spam</a>. This is really easy to use, as we only have to click a button to tell it if we think a message is spam and, once it is trained, it will automatically move all spam to a predefined folder, or can even delete it automatically (I don&#8217;t recommend it in case of false positives).</p>
<p>If we use Outlook instead of Thunderbird, one good option is <a href="http://spambayes.sourceforge.net/">SpamBayes</a>. This software also uses some new approaches which are explained in the <a href="http://spambayes.sourceforge.net/background.html">background page</a>. One interesting characteristic of SpamBayes is that it doesn&#8217;t have only two states, spam and non-spam, but also a third one, unsure, for when it doesn&#8217;t know how to classify a message. This way, we can choose what we want to do with it: keep it, delete it or use it to train the program. Although it includes a plugin for using it with Outlook, it can also be used with other programs as a proxy, and even on other operating systems like Linux or Mac OS X.</p>
<p>To finish this list, we are going to have a look at one of the first mail filters I used. It&#8217;s called <a href="http://popfile.sourceforge.net">POPFile</a> and works as a proxy in front of the mail server. Our mail client will connect to POPFile and POPFile will connect to the mail server, analyzing the mail as it downloads. One of the things I like most about it is the ability to classify any kind of e-mail, not only spam. So, POPFile can distinguish between work-related mail, mail from our children,&#8230; or any other different classification we want to do. We only have to create the categories and assign some messages to each one to train it and it will classify the received e-mails. It also has a web-based interface to manage all of this.</p>
<p>The list of spam filters is quite long and this is only a selection of some of them. You will have to see which one fits you better and use it. Remember to always train it correctly before you do automatic actions on the mail received as you could lose some mails if you don&#8217;t do it correctly.</p>
]]></content:encoded>
			<wfw:commentRss>http://becomingparanoid.com/2006/04/03/e-mail-security-detecting-spam-iv/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>E-mail security: detecting spam (III)</title>
		<link>http://becomingparanoid.com/2006/03/30/e-mail-security-detecting-spam-iii/</link>
		<comments>http://becomingparanoid.com/2006/03/30/e-mail-security-detecting-spam-iii/#comments</comments>
		<pubDate>Thu, 30 Mar 2006 10:23:23 +0000</pubDate>
		<dc:creator>madelman</dc:creator>
				<category><![CDATA[Advanced]]></category>
		<category><![CDATA[Email]]></category>
		<category><![CDATA[Security]]></category>
		<category><![CDATA[Spam]]></category>

		<guid isPermaLink="false">http://becomingparanoid.com/2006/03/30/e-mail-security-detecting-spam-iii/</guid>
		<description><![CDATA[Before talking about other methods for detecting spam, let&#8217;s have a closer look at Bayesian filters and programs using this technique to classify mail. This will be a technical post, so it might not interest all of you. In the next posts we&#8217;ll see some software which uses these filters.
I&#8217;m not a mathematician, so I [...]]]></description>
			<content:encoded><![CDATA[<p>Before talking about other methods for detecting spam, let&rsquo;s have a closer look at Bayesian filters and programs using this technique to classify mail. This will be a technical post, so it might not interest all of you. In the next posts we&rsquo;ll see some software which uses these filters.</p>
<p>I&rsquo;m not a mathematician, so I might make a few errors when trying to explain the theory behind the filters. Please forgive me. The article in <a href="http://en.wikipedia.org/wiki/Bayesian_filtering">Wikipedia</a>&nbsp;explains it better than I can.</p>
<p>The main formula on which Bayesian filtering rests is:</p>
<p><img alt="Bayes1" src="http://becomingparanoid.com/images/bayes1.png" border="0" /></p>
<p>which says that the probability of an e-mail being spam given the words contained in it is equal to the probability of these words appearing in a spam message, multiplied by the probability of a message being spam divided by the probability of the words appearing in any message.</p>
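<p>Written out in LaTeX notation, the relation described in words here reads as follows (the names <em>spam</em> and <em>words</em> are just labels for the two events):</p>

```latex
\Pr(\text{spam} \mid \text{words}) =
  \frac{\Pr(\text{words} \mid \text{spam}) \, \Pr(\text{spam})}{\Pr(\text{words})}
```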
<p>Wow, it looks quite complicated. One of the best-known papers about this kind of filtering is <a href="http://www.paulgraham.com/spam.html">A plan for spam</a>&nbsp;by Paul Graham. We&rsquo;ll see some code from it. </p>
<p><span id="more-48"></span></p>
<p>Well, to be able to calculate this result we first need to break the message into words, called <em>tokens</em>, from which the probabilities are taken. How this splitting is done is really important, as it affects the final result. If we have the sentence <em>It&rsquo;s a shame</em> we could break it into <em>It-s-a-shame</em> or maybe <em>Its-a-shame</em> or even <em>It&rsquo;s-a-shame</em> and each of them might give different results when used.</p>
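<p>The three ways of splitting the sentence can be reproduced with a few lines of Python. The regular expressions below are just one possible choice for this example, not the rules of any particular filter.</p>

```python
import re

SENTENCE = "It's a shame"

# Three possible tokenizers; each treats the apostrophe differently.
split_on_apostrophe = re.findall(r"[A-Za-z]+", SENTENCE)
drop_apostrophe = re.findall(r"[A-Za-z]+", SENTENCE.replace("'", ""))
keep_apostrophe = re.findall(r"[A-Za-z']+", SENTENCE)
```

<p>The first variant produces four tokens and the other two produce three, so the same message ends up being looked up under different keys depending on the tokenizer.</p>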
<p>Once the message is broken into tokens, we can calculate Pr(spam|word), the probability that a message containing the word is spam, with the following code (originally written in Lisp):</p>
<p><code>(let ((g (* 2 (or (gethash word good) 0)))<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; (b (or (gethash word bad) 0)))<br />&nbsp;&nbsp; (unless (&lt; (+ g b) 5)<br />&nbsp;&nbsp;&nbsp;&nbsp; (max .01<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; (min .99 (float (/ (min 1 (/ b nbad))<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; (+ (min 1 (/ g ngood))<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; (min 1 (/ b nbad)))))))))</code> </p>
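<p>For readers who don&rsquo;t know Lisp, here is a rough Python equivalent of the snippet. The function and parameter names are chosen for this example; <em>good</em> and <em>bad</em> are assumed to be dictionaries mapping each token to its number of occurrences, and <em>ngood</em> and <em>nbad</em> the number of messages of each kind.</p>

```python
def word_probability(word, good, bad, ngood, nbad):
    """Spam probability of one token, or None if it was seen too rarely.

    good and bad map tokens to occurrence counts in legitimate and spam
    mail; ngood and nbad are the numbers of messages in each corpus.
    """
    g = 2 * good.get(word, 0)  # legitimate counts are doubled, as in the Lisp code
    b = bad.get(word, 0)
    if g + b < 5:
        return None  # not enough evidence for this token
    good_ratio = min(1.0, g / ngood)
    bad_ratio = min(1.0, b / nbad)
    # Clamp the result to the range [0.01, 0.99].
    return max(0.01, min(0.99, bad_ratio / (good_ratio + bad_ratio)))
```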
<p>When we have calculated the probabilities for all the tokens in the message, we pick the most relevant ones (those whose probability is farthest from 0.5, that is, nearest to 0 or 1). Paul decided to use the 15 most relevant ones and store them in a list called probs, applying the following formula to it:</p>
<p><code>(let ((prod (apply #'* probs)))<br />&nbsp; (/ prod (+ prod (apply #'* (mapcar #'(lambda (x)<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; (- 1 x))<br />&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; probs)))))</code> </p>
<p>If the result is greater than 0.9 we consider the e-mail to be spam and classify it as such. So, although the theory may look hard, once implemented it is far easier. Maybe the only problem with this code is that it&rsquo;s Lisp, which not so many people know.</p>
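<p>The same combination step can be sketched in Python. The helper names are made up for this example; the limit of 15 tokens and the 0.9 threshold are the values mentioned in the text.</p>

```python
from math import prod


def combined_probability(probs):
    """Combine per-token spam probabilities as in the Lisp snippet."""
    spam_product = prod(probs)
    ham_product = prod(1 - p for p in probs)
    return spam_product / (spam_product + ham_product)


def most_interesting(probs, limit=15):
    """Keep the probabilities that lie farthest from 0.5."""
    return sorted(probs, key=lambda p: abs(p - 0.5), reverse=True)[:limit]


def is_spam(probs, threshold=0.9):
    """Classify a message from the spam probabilities of its tokens."""
    return combined_probability(most_interesting(probs)) > threshold
```

<p>Because every probability is clamped between 0.01 and 0.99 in the previous step, neither product can be exactly zero, so the division is safe as long as the list is not empty.</p>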
<p>Let&rsquo;s make this even easier by looking at the source code of Mozilla Thunderbird, the famous open-source mail reader, which includes a Bayesian module to classify mail. The implementation in Thunderbird is slightly different from the original, but the concept remains the same.</p>
<p>The algorithm is implemented in the file mozilla\mailnews\extensions\bayesian-spam-filter\src\nsBayesianFilter.cpp in the function classifyMessage. It&rsquo;s implemented in C++, but we will look at it in &ldquo;pseudo-code&rdquo;. It uses the following variables:</p>
<ul>
<li>mGoodCount: number of non-spam messages classified</li>
<li>mBadCount: number of spam messages classified</li>
<li>mGoodTokens: hash table with good tokens and number of times they have appeared</li>
<li>mBadTokens: hash table with spam tokens and number of times they have appeared</li>
</ul>
<p>Take care, as the same token might appear in both hash tables with a different number of occurrences. For example, the word <em>hello</em> may be equally probable in spam and non-spam messages. When the algorithm is not yet trained, default values are assigned:</p>
<p><code>if&nbsp;(mGoodCount == 0 || mGoodTokens.count() == 0)<br />&nbsp;&nbsp;&nbsp; message is spam<br />if (mBadCount == 0 || mBadTokens.count() == 0)<br />&nbsp;&nbsp;&nbsp; message is not spam</code> </p>
<p>If the algorithm has been trained then it&rsquo;s applied with the next formula (adapted from <a href="https://bugzilla.mozilla.org/attachment.cgi?id=138425&amp;action=view">Bugzilla</a>):</p>
<p><code>for each&nbsp;token {<br />&nbsp;hamcount = number of token appearances in non-spam<br />&nbsp;spamcount = number of token appearances in spam&nbsp;<br />&nbsp;hamratio = hamcount / mGoodCount<br />&nbsp;spamratio = spamcount / mBadCount<br />&nbsp;<br />&nbsp;prob = spamratio / (hamratio + spamratio)<br />&nbsp;<br />&nbsp;n = hamcount +&nbsp; spamcount<br />&nbsp;prob = (0.225 + n * prob) / (.45 + n)<br />&nbsp;<br />&nbsp;distance = abs(prob - 0.5)<br />&nbsp;if (distance &gt;= .1) {<br />&nbsp;&nbsp;token.distance = distance<br />&nbsp;&nbsp;token.prob = prob<br />&nbsp;}<br />}</code> </p>
<p>With this code, we have the probability for each token. The tokens are saved in a list sorted by distance (how far the probability lies from the neutral value 0.5) and the first 150 elements are taken. A chi<sup>2</sup> probability distribution is calculated from them and if the result is greater than or equal to 0.9 the message will be classified as spam.</p>
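<p>The per-token part of the pseudo-code can be turned into runnable Python along these lines. This is only a sketch of the pseudo-code shown here, not Thunderbird&rsquo;s C++ source: the function and parameter names are invented, skipping tokens never seen in training is an assumption added to avoid a division by zero, and the final chi<sup>2</sup> step is left out.</p>

```python
def token_scores(tokens, good_tokens, bad_tokens, good_count, bad_count):
    """Per-token spam probability following the pseudo-code.

    Returns (distance, prob, token) tuples, most decisive first.
    Assumes the filter is trained, so good_count and bad_count are > 0.
    """
    scored = []
    for token in set(tokens):
        hamcount = good_tokens.get(token, 0)
        spamcount = bad_tokens.get(token, 0)
        n = hamcount + spamcount
        if n == 0:
            continue  # assumption: a token never seen in training is ignored
        hamratio = hamcount / good_count
        spamratio = spamcount / bad_count
        prob = spamratio / (hamratio + spamratio)
        # Pull the estimate toward 0.5 when the token has been seen few times.
        prob = (0.225 + n * prob) / (0.45 + n)
        distance = abs(prob - 0.5)
        if distance >= 0.1:
            scored.append((distance, prob, token))
    scored.sort(reverse=True)
    return scored[:150]
```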
<p>But we don&rsquo;t need to know all of this unless we want to write a filter ourselves. There are lots of filters already available which work quite well and achieve detection rates of around 99%, sometimes even better than a human.</p>
]]></content:encoded>
			<wfw:commentRss>http://becomingparanoid.com/2006/03/30/e-mail-security-detecting-spam-iii/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>E-mail security: detecting spam (II)</title>
		<link>http://becomingparanoid.com/2006/03/29/e-mail-security-detecting-spam-ii/</link>
		<comments>http://becomingparanoid.com/2006/03/29/e-mail-security-detecting-spam-ii/#comments</comments>
		<pubDate>Wed, 29 Mar 2006 09:47:04 +0000</pubDate>
		<dc:creator>madelman</dc:creator>
				<category><![CDATA[Beginner]]></category>
		<category><![CDATA[Email]]></category>
		<category><![CDATA[Security]]></category>
		<category><![CDATA[Spam]]></category>

		<guid isPermaLink="false">http://becomingparanoid.com/2006/03/29/e-mail-security-detecting-spam-ii/</guid>
		<description><![CDATA[As spam filters get more advanced, less spam reaches users&#8217; inboxes, so the spammers&#8217; business model gets hurt. Instead of accepting that people don&#8217;t really like to receive spam and would prefer less intrusive forms of advertising, they try to work around these filters in, sometimes, really clever ways. [...]]]></description>
			<content:encoded><![CDATA[<p>As spam filters get more advanced, less spam reaches users&rsquo; inboxes, so the spammers&rsquo; business model gets hurt. Instead of accepting that people don&rsquo;t really like to receive spam and would prefer less intrusive forms of advertising, they try to work around these filters in, sometimes, really clever ways. So, spam filters have to be continually modified and adapted so they don&rsquo;t fall for these new tricks.</p>
<p>As Bayesian filtering is the most commonly used technique, it is the one spammers most frequently try to escape. We said that <a href="http://becomingparanoid.com/2006/03/27/e-mail-security-detecting-spam/">Bayesian filtering works</a>&nbsp;by calculating the probability that a word comes from spam or from legitimate mail, so what spammers do is modify their messages so that they are more likely to look like legitimate mail.</p>
<p>One way to do this is to insert random but common words in the spam, so the <em>spam words</em> contribute less to the score and the message slips past the filter. Here is an example of&nbsp;a real spam message:</p>
<p align="center"><img alt="Spam1" src="http://becomingparanoid.com/images/spam1.png" border="0" /></p>
<p align="left">The real content of the spam is at the bottom, but at the beginning of the e-mail there are some lines of text which come from the novel <a href="http://en.wikipedia.org/wiki/The_Master_and_Margarita">The Master and Margarita</a>&nbsp;and try to hide the fact that this is spam.</p>
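<p align="left">A small Python experiment shows why this padding can work. It combines per-word probabilities with the formula from Paul Graham&rsquo;s <em>A plan for spam</em>; the individual probabilities are invented for the example and do not come from a real filter.</p>

```python
from math import prod


def combined_probability(probs):
    """Combine per-word spam probabilities into one score for the message."""
    spam_product = prod(probs)
    ham_product = prod(1 - p for p in probs)
    return spam_product / (spam_product + ham_product)


spam_words = [0.99, 0.95]  # assumed scores for two typical spam words
novel_words = [0.2] * 4    # assumed scores for four words taken from a novel

plain_score = combined_probability(spam_words)
padded_score = combined_probability(spam_words + novel_words)
```

<p align="left">With these made-up numbers the plain message scores well above 0.9, while the padded one drops to about 0.88, enough to fall below a 0.9 threshold.</p>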
<p><span id="more-47"></span></p>
<p align="left">Another way to try to evade the filters is by sending the content as an image. This technique&nbsp;is also used in the last example we have seen, but it&rsquo;s a really common one, as we can see in this other e-mail:</p>
<p align="center"><img alt="Spam2" src="http://becomingparanoid.com/images/spam2.png" border="0" /></p>
<p align="left">Although this may look like an HTML email, in fact all the content is inside an image, with no text to be analyzed by the filters, so it gets more difficult to identify the message as spam because we have no words to compute the probability. Sometimes this technique works against the spammer, as it&rsquo;s quite strange for a legitimate mail to contain only an image with a link, so some more advanced filters might detect this message as spam correctly.</p>
<p align="left">One last technique is the use of unknown or made-up words to confuse the filter. As Bayesian filtering works by looking at the probability of already seen words and knowing if they are more likely to occur in legitimate mail or in spam, when an unknown word is found the filter can&rsquo;t really know if it belongs to spam or not, so it can&rsquo;t classify it correctly and the spam might just evade the filter. Let&rsquo;s see an example:</p>
<p align="center"><img alt="Spam3" src="http://becomingparanoid.com/images/spam3.png" border="0" /></p>
<p align="left">We can see that instead of <em>ordering </em>the message says <em>orderinq</em> with a Q as the last letter, which looks quite similar to the G. Also, the word Viagra is not written with a letter V at the beginning, but with the backslash and forward-slash symbols like this \ /. There are more examples in these two sentences, as almost every word is modified to evade the filters.</p>
<p align="left">Sometimes, it gets so difficult for spammers to be sure their junk will reach the recipient that the messages they send make almost no sense and it is quite hard to know what they are really trying to advertise.</p>
<p align="center"><img alt="Spam4" src="http://becomingparanoid.com/images/spam4.png" border="0" /></p>
<p>Can you guess what they are trying to say?</p>
<p>If we have a Bayesian filter which checks our e-mail it is a good idea to keep it updated and trained. It&rsquo;s quite easy and shouldn&rsquo;t consume a lot of our time, unless we receive incredible amounts of e-mail. To do this we should look from time to time at the folder where spam is moved, to check if there has been any false positive (that is, a legitimate mail message which has been classified as spam). If there is any, we must tell the filter that the message is not spam so it changes the probabilities of the words included in it. Checking this folder from time to time is a good idea anyway, so we don&rsquo;t lose any important e-mail which might have been miscategorised. It&rsquo;s also important, when we receive spam which has not been filtered as such, not only to delete it but to tell the filter that the message is spam, so we keep it trained.</p>
<p>There are other methods to classify spam which are not based on Bayesian filters and we will see them in the next posts.</p>
]]></content:encoded>
			<wfw:commentRss>http://becomingparanoid.com/2006/03/29/e-mail-security-detecting-spam-ii/feed/</wfw:commentRss>
		<slash:comments>19</slash:comments>
		</item>
		<item>
		<title>E-mail security: detecting spam</title>
		<link>http://becomingparanoid.com/2006/03/27/e-mail-security-detecting-spam/</link>
		<comments>http://becomingparanoid.com/2006/03/27/e-mail-security-detecting-spam/#comments</comments>
		<pubDate>Mon, 27 Mar 2006 18:08:30 +0000</pubDate>
		<dc:creator>madelman</dc:creator>
				<category><![CDATA[Beginner]]></category>
		<category><![CDATA[Email]]></category>
		<category><![CDATA[Security]]></category>
		<category><![CDATA[Spam]]></category>

		<guid isPermaLink="false">http://becomingparanoid.com/2006/03/27/e-mail-security-detecting-spam/</guid>
		<description><![CDATA[If the volume of spam we receive is overwhelming us and we can&#8217;t keep up with classifying it, we need an automated way to separate spam from legitimate mail. One of the&#160;most famous&#160;methods was proposed by Paul Graham in a paper called A plan for spam, where he talked about some algorithms which use [...]]]></description>
			<content:encoded><![CDATA[<p>If the volume of spam we receive is overwhelming us and we can&rsquo;t keep up with classifying it, we need an automated way to separate spam from legitimate mail. One of the&nbsp;most famous&nbsp;methods was proposed by Paul Graham in a paper called <a href="http://www.paulgraham.com/spam.html">A plan for spam</a>, where he talked about some algorithms which use probability to classify each&nbsp;message.</p>
<p>The basis for this method is a prior training of the algorithm, in which we feed it spam messages and legitimate mail, telling it which is which. With this data, the algorithm breaks the messages into words and assigns each word a probability of appearing in a spam message and another of appearing in legitimate mail.</p>
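<p>A minimal Python sketch of this training step could look like the following. The example messages and the simple whitespace splitting are assumptions for illustration; real filters use more careful tokenizing and keep the counts between runs.</p>

```python
from collections import Counter


def train(messages):
    """Count how many times each word appears across the given messages."""
    counts = Counter()
    for message in messages:
        counts.update(message.lower().split())
    return counts


# Made-up training data: we tell the algorithm which messages are which.
spam_counts = train(["buy cheap pills now", "cheap pills online"])
ham_counts = train(["lunch at noon", "buy milk on the way home"])
```

<p>From these two tables a filter can estimate, for each word, how much more often it shows up in spam than in legitimate mail.</p>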
<p>When a new message is received, it&rsquo;s broken into words like the training messages and the saved probabilities of each word are analyzed with a formula called <em>Naive Bayes</em>, which returns a final probability for the mail being spam or not.</p>
<p>Most known mail classifiers use at least this method, usually combined with others, so we can see it is a really powerful way of classifying.</p>
<p>Another approach to classification is the one used by <a href="http://spamassassin.apache.org/">SpamAssassin</a>&nbsp;which has a series of rules, each of which assigns some points when it applies to the mail. The more points are assigned, the more likely the mail is to be spam, and it is classified as such when it surpasses a threshold.</p>
<p>SpamAssassin also uses the Bayesian filter, but that is not its only way to check for spam, as spam usually has distinguishable characteristics which may make it different enough from legitimate mail to be easily classifiable.</p>
<p>But spammers are adapting to the measures, modifying the mails they send so they are not detected as spam by the filters and it&rsquo;s necessary to tweak these filters and find new ways to throw spam to trash.</p>
]]></content:encoded>
			<wfw:commentRss>http://becomingparanoid.com/2006/03/27/e-mail-security-detecting-spam/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>E-mail security: spam</title>
		<link>http://becomingparanoid.com/2006/03/25/e-mail-security-spam/</link>
		<comments>http://becomingparanoid.com/2006/03/25/e-mail-security-spam/#comments</comments>
		<pubDate>Sat, 25 Mar 2006 19:14:55 +0000</pubDate>
		<dc:creator>madelman</dc:creator>
				<category><![CDATA[Beginner]]></category>
		<category><![CDATA[Email]]></category>
		<category><![CDATA[Security]]></category>
		<category><![CDATA[Spam]]></category>

		<guid isPermaLink="false">http://becomingparanoid.com/2006/03/25/e-mail-security-spam/</guid>
		<description><![CDATA[Spam is one of the most common types of undesired mail. It is sent in bulk to lots of people, trying to sell some product or service. Many times these products are not legal at all, such as some drugs, but other times legal services are offered this way.
For an e-mail to be spam it must [...]]]></description>
			<content:encoded><![CDATA[<p>Spam is one of the most common types of undesired mail. It is sent in bulk to lots of people, trying to sell some product or service. Many times these products are not legal at all, such as some drugs, but other times legal services are offered this way.</p>
<p>For an e-mail to be spam it must be sent without the consent of the recipient, that is, an e-mail with a commercial advertisement is not spam if you have asked for it. The legislation of each country is more specific as to what is spam and what is not.</p>
<p>The products most advertised in spam vary over time, but it is quite usual to receive spam about drugs like Viagra or Valium, fake college diplomas, mortgages or illegal software.</p>
<p>The problem of spam is economic. Sending spam is really cheap, so even if only a very small percentage of the recipients buy the product it&rsquo;s still profitable. For that reason you must never buy products advertised this way, so that spammers get the message that people don&rsquo;t like receiving this kind of mail and won&rsquo;t buy their products.</p>
<p>Likewise, the most expensive part of spam is not paid by the spammer. He only has to find somewhere to send the spam from and, once it has been sent, he doesn&rsquo;t have to pay anything more for it. But the message has to travel through other networks, has to be stored somewhere and has to be, finally, read or deleted. This has a cost in network bandwidth, in disk space and, more importantly, in the time the final recipient spends classifying and deleting the e-mail.</p>
<p>For&nbsp;many people, the quantity of spam received is bigger than the quantity of legitimate mail, so they need some way to classify it automatically, as it becomes almost impossible to do it by hand in a short time.</p>
]]></content:encoded>
			<wfw:commentRss>http://becomingparanoid.com/2006/03/25/e-mail-security-spam/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>E-mail security: how they send the e-mail</title>
		<link>http://becomingparanoid.com/2006/03/22/e-mail-security-how-they-send-the-e-mail/</link>
		<comments>http://becomingparanoid.com/2006/03/22/e-mail-security-how-they-send-the-e-mail/#comments</comments>
		<pubDate>Wed, 22 Mar 2006 22:18:48 +0000</pubDate>
		<dc:creator>madelman</dc:creator>
				<category><![CDATA[Beginner]]></category>
		<category><![CDATA[Email]]></category>
		<category><![CDATA[Security]]></category>

		<guid isPermaLink="false">http://becomingparanoid.com/2006/03/22/e-mail-security-how-they-send-the-e-mail/</guid>
		<description><![CDATA[Once spammers have a list of e-mail addresses, they have to send their message to these recipients. When undesired mail was not as big a problem as it is now, they could use their own infrastructure to send it, that is, their own servers or even their own e-mail account. But as more spammers used [...]]]></description>
			<content:encoded><![CDATA[<p>Once spammers have a list of e-mail addresses, they have to send their message to these recipients. When undesired mail was not as big a problem as it is now, they could use their own infrastructure to send it, that is, their own servers or even their own e-mail account. But as more spammers did this, server administrators began to implement techniques to avoid being used to send spam, as it consumed a lot of resources, so spammers had to switch to using other people&#8217;s servers.</p>
<p>This is a big annoyance for the owners of those servers, as they will probably be blacklisted and will not be able to send legitimate mail, disrupting the service for legitimate users.</p>
<p>At first, spammers used mail servers which were incorrectly configured and allowed anyone to send e-mail through them (technically, this is known as relaying mail). This technique is very cheap, because the spammer only needs to hand the message to the server once and the server then delivers it to the whole list of recipients. Fortunately, nowadays most administrators configure their servers correctly and only allow authorized users to send e-mail, so spammers needed to find another way to send their junk. If you administer an e-mail server and you have not secured it against relaying, you <a href="http://www.mail-abuse.com/an_sec3rdparty.html">should check how to disable it</a>.</p>
<p>The most commonly used technique nowadays is relaying mail through <a href="http://en.wikipedia.org/wiki/Botnet">botnets</a>. Botnets are groups of compromised computers controlled remotely by an attacker, and spammers use them to send their e-mails to the world. Unfortunately, there are a lot of botnets on the Internet and it&#8217;s quite cheap to find someone who controls one and will send the e-mails on a spammer&#8217;s behalf.</p>
<p>For this reason, it&#8217;s important to protect our computer so it doesn&#8217;t get used to spam the whole world. Also, some ISPs implement filters so that e-mail from their users can only be sent through the ISP&#8217;s own server (technically, they block outbound TCP port 25). This way, spam can&#8217;t be sent directly from a compromised computer on their network, but this is also an annoyance for more advanced users, who sometimes need to use other e-mail servers because they have accounts in other places.</p>
]]></content:encoded>
			<wfw:commentRss>http://becomingparanoid.com/2006/03/22/e-mail-security-how-they-send-the-e-mail/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>E-mail security: where they get our e-mail</title>
		<link>http://becomingparanoid.com/2006/03/21/e-mail-security-where-they-get-our-e-mail/</link>
		<comments>http://becomingparanoid.com/2006/03/21/e-mail-security-where-they-get-our-e-mail/#comments</comments>
		<pubDate>Tue, 21 Mar 2006 16:43:30 +0000</pubDate>
		<dc:creator>madelman</dc:creator>
				<category><![CDATA[Beginner]]></category>
		<category><![CDATA[Email]]></category>
		<category><![CDATA[Security]]></category>

		<guid isPermaLink="false">http://becomingparanoid.com/2006/03/21/e-mail-security-where-they-get-our-e-mail/</guid>
		<description><![CDATA[Almost everyone who has an e-mail account receives some undesired mail, be it 1 or 2&#160;a week&#160;or hundreds every day, so one has to ask how our e-mail address is collected and how to avoid it. Although we can&#8217;t know for sure all the methods used by spammers, there are some common techniques which really [...]]]></description>
			<content:encoded><![CDATA[<p>Almost everyone who has an e-mail account receives some undesired mail, be it 1 or 2&nbsp;a week&nbsp;or hundreds every day, so one has to ask how our e-mail address is collected and how to avoid it. Although we can&rsquo;t know for sure all the methods used by spammers, there are some common techniques which really work.</p>
<p>One of the most common ones is browsing the web. Spammers send their computers to <em>spider</em> the web, that is, to navigate and follow links, retrieving the text in the pages and analyzing it looking for e-mail addresses. They usually only look for addresses which match the pattern <a href="mailto:user@server.tld"><em>user@server.tld</em></a>, so if we write our address in some webpage, be it our personal website, the comments section of another site or anywhere else, it&rsquo;s easy for one of these robots to find it and for us to begin receiving undesired mail.</p>
<p>Another method is analyzing <em>chain letters</em>. These are usually full of working e-mail addresses, as they are sent to everyone in the address book and, when they are forwarded, the list of previous recipients is not deleted, so it fills with more and more addresses each time.</p>
<p>Some time ago, Usenet news was a really popular service where people could read and post messages. These messages contain a header with the e-mail address of the sender, so spammers collected messages and analyzed them to get addresses. Nowadays, Usenet is not used as much as before and the people who use it are more knowledgeable, so I suspect this method is falling into oblivion, although it might still be used by some spammers.</p>
<p>There have always been dishonest companies and some of them sell their databases to spammers, so depending on where we register we might be giving away our e-mail address to someone unknown. Depending on the country, there might be severe laws to prevent this, but it&rsquo;s not always the case.</p>
<p>Another method is getting the address used when registering a domain. When you register a domain (like <a href="http://www.example.com">www.example.com</a>) you have to provide three addresses (which may be the same) which are later made public so people can contact you about the domain. As it&rsquo;s really easy to get them, spammers only have to get a list of domains and scan them for addresses.</p>
<p>Finally, one of the most used ones is just guessing or, we might say, <em>bruteforcing</em>. That is, trying different addresses hoping they work. As it&rsquo;s really cheap to send an e-mail, they lose almost nothing by trying a really big number of addresses, even if most of them don&rsquo;t work. You can find examples of this in some of the spam you receive: when looking at the recipients you find a lot of e-mail addresses very similar to yours.</p>
<p>There are other techniques, but they are not so widely used; these are the most important ones. We can protect ourselves from some of them, but there&rsquo;s nothing we can do about the others, so we simply have to trust other people to do it for us.</p>
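<p>To see how little effort this pattern matching takes, and to check whether one of our own pages exposes an address, a few lines of Python are enough. The regular expression below is a simplified assumption that matches the common <em>user@server.tld</em> shape; it does not cover every valid e-mail address, and the function name is made up for the example.</p>

```python
import re

# Simplified pattern for the common user@server.tld shape (an assumption, not a full validator).
ADDRESS_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

def exposed_addresses(page_text):
    # Return the distinct addresses that a simple robot would find in the text.
    return sorted(set(ADDRESS_PATTERN.findall(page_text)))

page = "Contact: user@server.tld or user at server dot tld"
print(exposed_addresses(page))
```

<p>Only the address written in the standard form is found; the one spelled out with &ldquo;at&rdquo; and &ldquo;dot&rdquo; does not match this simple pattern, although a more determined robot could look for such variations too.</p>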
]]></content:encoded>
			<wfw:commentRss>http://becomingparanoid.com/2006/03/21/e-mail-security-where-they-get-our-e-mail/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>E-mail security: types of undesired mail</title>
		<link>http://becomingparanoid.com/2006/03/20/e-mail-security-types-of-undesired-mail/</link>
		<comments>http://becomingparanoid.com/2006/03/20/e-mail-security-types-of-undesired-mail/#comments</comments>
		<pubDate>Mon, 20 Mar 2006 17:29:36 +0000</pubDate>
		<dc:creator>madelman</dc:creator>
				<category><![CDATA[Beginner]]></category>
		<category><![CDATA[Email]]></category>
		<category><![CDATA[Security]]></category>

		<guid isPermaLink="false">http://becomingparanoid.com/2006/03/20/e-mail-security-types-of-undesired-mail/</guid>
		<description><![CDATA[We have already seen a brief discussion of how e-mail works, both when we send it&#160;and when we receive it, so now it&#8217;s time to look at which kinds of undesired mail we can receive.
Spam: the classic and oldest type of undesired mail. In fact, any kind of e-mail sent massively and without the consent of [...]]]></description>
			<content:encoded><![CDATA[<p>We have already seen a brief discussion of how e-mail works, both when <a href="http://becomingparanoid.com/2006/03/14/e-mail-security-how-does-e-mail-work-i/">we send it</a>&nbsp;and when <a href="http://becomingparanoid.com/2006/03/16/e-mail-security-how-does-e-mail-work-ii/">we receive it</a>, so now it&rsquo;s time to look at which kinds of undesired mail we can receive.</p>
<p><strong>Spam: </strong>the classic and oldest type of undesired mail. Strictly speaking, any kind of e-mail sent massively and without the consent of the receiver is spam, but to distinguish it from the other types we usually reserve the name for advertising which tries to sell legal or illegal products.</p>
<p><strong>Phishing: </strong>this is a technique used to collect sensitive information from users, such as passwords or bank account details. E-mail of this type disguises itself as legitimate mail but points to fake web servers where you are asked to enter the information.</p>
<p><strong>Viruses: </strong>in the old days viruses spread through floppy disks, but with the rise of e-mail their creators have changed the distribution method, and <em>worms</em> (as these kinds of viruses are known) are nowadays one of the most common types of virus.</p>
<p><strong>Chain letters: </strong>these usually come from people we know, so it&rsquo;s easier to trust them, but they almost always contain false information. We can recognize them because they&nbsp;try to spread&nbsp;some kind of rumour, such as non-deletable viruses, methods of obtaining free presents from some companies or threats of something bad happening to us if they are not forwarded to a specified number of people.</p>
<p><strong>Trojans:</strong> similar to viruses but usually not sent massively, only to an intended recipient as a method of gaining control of his computer or of the information stored in it.</p>
<p>There are some more kinds of undesired mail but these are the most important ones. We are going to have a look at each of them and discover how they work and how to avoid them.</p>
]]></content:encoded>
			<wfw:commentRss>http://becomingparanoid.com/2006/03/20/e-mail-security-types-of-undesired-mail/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
