Week 8 - Messaging Security
This week focuses on messaging security, which here means email security. The following terms are used throughout.
Spam - an unsolicited, illegitimate email message, often phishing or otherwise intended to do harm
Ham - legitimate email message
Spamtrap/honeypot - an unfiltered email address or domain set up to collect spam. The collected spam is then analyzed to study spam trends
Botnet - a collection of computers that have been hijacked to perform an action (such as spamming) that the end user does not condone
Snowshoe spam - spam spread across many IP addresses and domains, so that no single source sends enough to stand out
Phishing - a general spam message, aimed at a wide demographic, that tries to trick recipients into giving up information
Spear Phishing - a targeted phishing attack against a specific person or demographic
Realtime Blackhole List (RBL) - a list of IP addresses known to send spam
Heuristics - a spam detection technique that uses basic feature matches (strings, sender, etc.) to detect spam. For example, a message that matches the phrase "buy <product> now!" is likely spam
Bayesian filtering - tokenize a message and score each token against known spam and ham corpora to determine whether the message has spam characteristics
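The Bayesian filtering idea defined above can be sketched in a few lines. This is a minimal naive-Bayes-style illustration, not a production filter: the two tiny corpora, the function names, and the add-one smoothing are all assumptions made for the example.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())

def train(messages):
    """Count in how many messages of a corpus each token appears."""
    counts = Counter()
    for msg in messages:
        counts.update(set(tokenize(msg)))
    return counts

def spam_probability(text, spam_counts, ham_counts, n_spam, n_ham):
    """Combine per-token evidence in log space, with add-one smoothing."""
    log_spam = math.log(n_spam / (n_spam + n_ham))
    log_ham = math.log(n_ham / (n_spam + n_ham))
    for tok in set(tokenize(text)):
        log_spam += math.log((spam_counts[tok] + 1) / (n_spam + 2))
        log_ham += math.log((ham_counts[tok] + 1) / (n_ham + 2))
    # Convert the two log scores back into P(spam | tokens).
    return 1 / (1 + math.exp(log_ham - log_spam))

# Hypothetical toy corpora; a real filter trains on thousands of messages.
SPAM = ["buy cheap pills now", "cheap stock tip buy now"]
HAM = ["meeting notes attached", "lunch at noon tomorrow"]
spam_counts, ham_counts = train(SPAM), train(HAM)

p_spam = spam_probability("buy cheap pills", spam_counts, ham_counts, len(SPAM), len(HAM))
p_ham = spam_probability("meeting at noon", spam_counts, ham_counts, len(SPAM), len(HAM))
```

With these toy corpora, the first message scores close to 1 and the second close to 0. Only the tokens present in a message are considered here, which keeps the sketch short.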
Classic spam includes the Nigerian prince message, with its offer of generous wealth, that we've all seen. This is classified as 419 phishing. Another classic is the offer of cheap drugs from a Canadian pharmacy, or something along those lines. These messages wrap their keywords in HTML <span> tags to get around heuristic rules. The Rustock botnet was the primary offender for this type of spam. Artificial inflation of stock prices (a pump and dump) is another classic.
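To see why HTML tags can defeat a simple string rule, here is a small sketch. It assumes the keywords are broken up by <span> tags, and both the rule and the message body are made up for illustration. Stripping the tags before matching lets the rule fire again.

```python
import re
from html import unescape

TAG_RE = re.compile(r"<[^>]+>")
# Hypothetical heuristic rule, not taken from any real filter.
RULE = re.compile(r"cheap\s+meds", re.IGNORECASE)

def strip_tags(html_body):
    """Drop HTML tags and decode entities so split-up words rejoin."""
    return unescape(TAG_RE.sub("", html_body))

body = "Get ch<span>ea</span>p me<span>d</span>s today"

raw_hit = bool(RULE.search(body))                # rule applied to the raw HTML
clean_hit = bool(RULE.search(strip_tags(body)))  # rule applied to visible text
```

The rule misses the raw HTML but matches once the tags are removed, since the reader's mail client renders the words as if the tags were not there.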
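How can spam be defended against? There are two general defense types: reputation-driven and content-driven. Reputation-driven defenses judge the sender (for example, with an RBL), while content-driven defenses judge the message itself (for example, with heuristics or Bayesian filtering).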
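As an example of a reputation-driven check, DNS-based blocklists are conventionally queried by reversing the octets of the sending IPv4 address and appending the list's zone. The sketch below assumes a placeholder zone, dnsbl.example.org, which is not a real list; the function names are also made up for the example.

```python
import ipaddress
import socket

def dnsbl_query_name(ip, zone):
    """Build the DNS name to look up: reversed IPv4 octets plus the list zone."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on malformed input
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"

def is_listed(ip, zone):
    """Return True if the list answers with an A record for this IP."""
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
        return True
    except socket.gaierror:
        # No record found. A failed lookup (e.g. no network) also lands here.
        return False

query = dnsbl_query_name("192.0.2.99", "dnsbl.example.org")
```

Calling is_listed needs network access and a real list zone, so it is defined here but not run.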
Tools used to research spam:
dig (Domain Information Groper) - used to query DNS records, such as a domain's mail exchanger (MX) records
WHOIS - used to look up registration records for an IP address or domain, including the organization behind it
There are two fundamental methods for scoring spam: additive scoring and probability scoring. With additive scoring, each "red flag" that a message triggers (Bayesian method used) adds a point. Once the total crosses a set threshold, the message is classified as spam.
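Additive scoring can be sketched as follows. The three rules and the threshold of 2 are hypothetical and chosen only to illustrate the mechanism. Each rule that fires adds one point, and a message is treated as spam once it reaches the threshold.

```python
import re

# Hypothetical "red flag" rules: (name, pattern). Each hit is worth one point.
RULES = [
    ("buy_now", re.compile(r"\bbuy\b.*\bnow\b", re.IGNORECASE)),
    ("free_money", re.compile(r"\bfree money\b", re.IGNORECASE)),
    ("many_exclamations", re.compile(r"!{2,}")),
]
THRESHOLD = 2  # assumed cutoff; real filters tune this value

def score(message):
    """Return the point total and the names of the rules that fired."""
    hits = [name for name, pattern in RULES if pattern.search(message)]
    return len(hits), hits

def is_spam(message):
    """Classify as spam once the point total reaches the threshold."""
    points, _ = score(message)
    return points >= THRESHOLD

spam_points, spam_hits = score("Buy pills now!!! Free money inside")
ham_points, _ = score("See you at lunch")
```

Real filters usually weight rules differently instead of giving each one a single point, but the thresholding step works the same way.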
Email headers contain a lot of information. They are read from the bottom up, because each mail server adds its own Received line at the top, and they contain the sender, subject, date, origin, and information about each hop the message took until it reached the recipient. A sample email header is below.
source: lecture slides, CS373 Defense Against the Dark Arts, Oregon State University
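Reading headers from the bottom up can also be done in code. The sketch below uses Python's standard email module on a made-up message; every host name and address in it is a placeholder. Reversing the Received lines puts the hops in the order the message actually travelled.

```python
from email import message_from_string

# Hypothetical message; all hosts and addresses are placeholders.
RAW = """\
Received: from mx.recipient.example by mail.recipient.example; Tue, 02 Jan 2024 10:00:03 +0000
Received: from relay.isp.example by mx.recipient.example; Tue, 02 Jan 2024 10:00:02 +0000
Received: from sender-pc.example by relay.isp.example; Tue, 02 Jan 2024 10:00:01 +0000
From: alice@sender.example
To: bob@recipient.example
Subject: Hello
Date: Tue, 02 Jan 2024 10:00:00 +0000

Body text.
"""

msg = message_from_string(RAW)

# Each server prepends its Received line, so the header order is newest-first.
hops = list(reversed(msg.get_all("Received", [])))

for number, hop in enumerate(hops, start=1):
    print(number, hop)
```

Keep in mind that a sender can forge Received lines below the first trusted server, so the earliest hops deserve the least trust.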
Another way to identify spam is by the shape a message makes in a star plot (a radar chart with one axis per message feature). Once a shape is determined for known spam, a new message whose shape is very close to it is marked as spam. The same can be done for ham, giving two visual representations, one for spam and one for ham.
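One way to approximate "close to the same shape" numerically is to treat each star plot as a vector of feature values and compare distances. This is only a sketch: the four features, the reference shapes, and the use of Euclidean distance are assumptions, not the method from the lecture.

```python
import math

# Hypothetical star-plot axes, each scaled to the range 0..1.
FEATURES = ["links", "caps_ratio", "exclamations", "html_ratio"]

# Assumed reference shapes, e.g. averages over known spam and known ham.
SPAM_SHAPE = [0.9, 0.7, 0.8, 0.9]
HAM_SHAPE = [0.1, 0.1, 0.0, 0.2]

def distance(a, b):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def closest_shape(vector):
    """Label a message by whichever reference shape it is nearer to."""
    d_spam = distance(vector, SPAM_SHAPE)
    d_ham = distance(vector, HAM_SHAPE)
    return "spam" if d_spam < d_ham else "ham"

label_a = closest_shape([0.8, 0.6, 0.9, 0.7])
label_b = closest_shape([0.2, 0.0, 0.1, 0.1])
```

A stricter variant would mark a message as spam only when its distance to the spam shape falls below a fixed cutoff, which matches the "very close" wording more literally.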