SPAM Filtering - Losing the war!
5 answers - 6803 bytes -

In my opinion, every postmaster (except the Postgres daemon :-) MUST
prevent "false positives" as far as he can. Many mail systems are run by
incompetent system administrators, and there is a great likelihood that
those hosts will be blocked by a spam filter because of the wrong
configuration of their mailers (or of their DNS, if we are speaking
about DNS checks).
But then there are the "public" mail systems (such as free mail services
or the mail servers of hosting providers). A lot of them are configured
correctly. However, those hosts may be listed in DNSBLs (very often in
Spamhaus, IMC), because a client of the company (or a user of the free
mail service) is sending out spam. We make a mistake when we reject mail
from those systems, because their administrators quickly discover and
stop spam runs like that, and most of the mail from those systems is
"legitimate".
Instead of starting "holy war" flames, we SHOULD understand how to
decrease the volume of false positive rejects in the spam checks of mail
systems without decreasing the volume of true positive mail drops.
A lot of spammers' hosts have more sins besides being listed in SBLs or
giving "HELO friend". :-)
So here is a simple example for the Exim MTA, in which we can see how to
block spammer hosts more efficiently while making Exim a more efficient
worker:
First, we need three markers (because this simple config has three
checks). Each marker has a boolean (1 or 0) value, which holds the
result of one spam check.
warn  set acl_m0 = 0
      set acl_m1 = 0
      set acl_m2 = 0
Each sender MAY go through all of the spam tests and will carry some
labels at the finish. By examining these labels at the end, we assess
the character of the host and reject or accept the message.
We run the SBL listing test and remember the result in the variable
acl_m0:
warn  dnslists = sbl.spamhaus.org: \
                 bl.spamcop.net: \
                 relays.mail-abuse.org
      set acl_m0 = 1
      set acl_c0 = $acl_c0 Listed in DNSBL $dnslist_domain;
We also append the result to the status string variable acl_c0. We will
need it later. "warn" means that no action is performed; it is only a
passive check.
This is the second spam test in our example: a test that the PTR
(reverse) and A (forward) DNS records agree, using Exim's built-in
"reverse_host_lookup" verification:
warn  !verify = reverse_host_lookup
      set acl_m1 = 1
      set acl_c0 = $acl_c0 Reverse host lookup failed;
The result is remembered in acl_m1.
The last spam test is the check of the HELO command argument, the result
of which we put in acl_m2.
warn  !condition = ${if or {\
          {eq{$sender_helo_name}{$sender_host_name}}\
          {match{$sender_helo_name}{\\[$sender_host_address\\]}}\
        }}
      set acl_m2 = 1
      set acl_c0 = $acl_c0 HELO forged;
The HELO command argument should be an FQDN (Fully Qualified Domain
Name) or an IP literal (an IP address in brackets, like [1.2.3.4]) that
points to this host (this is only a summary; see RFC 2821 for full
details).
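
The HELO rule described above can also be written down outside Exim. What
follows is a minimal Python sketch of that rule, not the ACL itself. The
function name and the sample values are invented for illustration, and it
simplifies a little: it compares names case-insensitively, requires the IP
literal to match exactly, and treats a missing reverse name as "no match".

```python
def helo_looks_forged(helo_name: str, host_name: str, host_address: str) -> bool:
    """Return True when the HELO argument matches neither the host's
    reverse-DNS name nor its IP literal such as "[192.0.2.10]"."""
    helo = helo_name.strip().lower()
    # A host without a reverse name cannot match by name (assumption of this sketch).
    matches_name = bool(host_name) and helo == host_name.strip().lower()
    matches_literal = helo == f"[{host_address}]"
    return not (matches_name or matches_literal)


# Hypothetical sessions: (HELO argument, reverse-DNS name, client IP address).
print(helo_looks_forged("friend", "mail.example.org", "192.0.2.10"))
print(helo_looks_forged("mail.example.org", "mail.example.org", "192.0.2.10"))
print(helo_looks_forged("[192.0.2.10]", "", "192.0.2.10"))
```

Only the first session is flagged; the other two present either the
host's own name or its own IP literal.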
Now we have three results and should consider what action to take on
that SMTP session: abort it, accept the message, or give a defer.
This section describes how to handle the message. If two (or all three)
variables are true, the session is rejected; otherwise the message is
accepted.
# Reject when any two of the three markers are set.
deny  condition = ${if and{\
          {eq{$acl_m0}{1}}\
          {eq{$acl_m1}{1}}\
        }{1}{0}}
      message = $acl_c0
deny  condition = ${if and{\
          {eq{$acl_m0}{1}}\
          {eq{$acl_m2}{1}}\
        }{1}{0}}
      message = $acl_c0
deny  condition = ${if and{\
          {eq{$acl_m1}{1}}\
          {eq{$acl_m2}{1}}\
        }{1}{0}}
      message = $acl_c0
accept
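
The decision rule stated above (reject when two or all three markers are
set, accept otherwise) can be checked with a small truth table. This is
only a Python model of the rule, with an invented function name; it does
not replace the ACL.

```python
from itertools import product


def should_reject(dnsbl_listed: bool, rdns_failed: bool, helo_forged: bool) -> bool:
    """Reject only when at least two of the three passive checks fired."""
    return (
        (dnsbl_listed and rdns_failed)
        or (dnsbl_listed and helo_forged)
        or (rdns_failed and helo_forged)
    )


# Print the full truth table: a single failed check is never enough.
for flags in product([False, True], repeat=3):
    print(flags, "reject" if should_reject(*flags) else "accept")
```

The three pairwise tests joined by "or" are the same thing as counting
the set markers and comparing the count with 2.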
Note: if the mailer rejects a message, it MUST include a description of
the reason for the reject in the text sent with the 5xx status code.
This helps postmasters to resolve problems in the case of a "false
positive".
Exim gives us the means to detect many traces of spam and can handle
them very well.
This is an example of a more advanced config, which gives each session a
"spam weight" by counting up a "spam score". It is based on the
supposition that an SBL listing is more significant than a mismatch
between the DNS records, and also more important than a forged HELO
argument:
warn  set acl_m0 = 0
warn  dnslists = sbl.spamhaus.org: \
                 bl.spamcop.net: \
                 relays.mail-abuse.org
      set acl_m0 = ${eval:$acl_m0+30}
      set acl_c0 = $acl_c0 Listed in DNSBL $dnslist_domain;
warn  !verify = reverse_host_lookup
      set acl_m0 = ${eval:$acl_m0+10}
      set acl_c0 = $acl_c0 Reverse host lookup failed;
warn  !condition = ${if or {\
          {eq{$sender_helo_name}{$sender_host_name}}\
          {match{$sender_helo_name}{\\[$sender_host_address\\]}}\
        }}
      set acl_m0 = ${eval:$acl_m0+20}
      set acl_c0 = $acl_c0 HELO forged;
To make the process clearer and more interesting, I added a block that
verifies the sender address, using Exim's built-in "callout" procedure:
warn  !verify = sender/callout=1m,defer_ok
      set acl_m0 = ${eval:$acl_m0+20}
      set acl_c0 = $acl_c0 Cannot complete sender verify;
And now we reject messages whose spam score is equal to or greater than 40:
# The score is kept in acl_m0; acl_c0 holds only the status text.
deny  condition = ${if >={$acl_m0}{40}}
      message = $acl_c0
In summary, these are the situations in which the message is rejected:
1. sender is listed in SBL
   PTR and A DNS records do not coincide
   HELO argument is forged
   sender name is forged
2. sender is listed in SBL
   PTR and A DNS records do not coincide
   HELO argument is forged
3. sender is listed in SBL
   HELO argument is forged
   sender name is forged
4. sender is listed in SBL
   PTR and A DNS records do not coincide
   sender name is forged
5. sender is listed in SBL
   PTR and A DNS records do not coincide
6. sender is listed in SBL
   HELO argument is forged
7. sender is listed in SBL
   sender name is forged
8. PTR and A DNS records do not coincide
   HELO argument is forged
   sender name is forged
9. HELO argument is forged
   sender name is forged
Uff. That seems to be all the possible situations (except defers) for
this sample of tests. :-)
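
The list above can be cross-checked by brute force. The sketch below is a
Python model, not Exim config: it takes the four weights from the ACL
(30, 10, 20 and 20) and the threshold of 40, and enumerates every
combination of failed checks that reaches the threshold. The short labels
are names made up for this sketch.

```python
from itertools import combinations

# Weights as used in the ACL above; the dictionary keys are illustrative labels.
WEIGHTS = {"dnsbl": 30, "rdns": 10, "helo": 20, "sender": 20}
THRESHOLD = 40


def rejected_combinations(weights=WEIGHTS, threshold=THRESHOLD):
    """Return every set of failed checks whose total score reaches the threshold."""
    hits = []
    for size in range(len(weights), 0, -1):
        for combo in combinations(weights, size):
            if sum(weights[name] for name in combo) >= threshold:
                hits.append(combo)
    return hits


for combo in rejected_combinations():
    print(sum(WEIGHTS[name] for name in combo), " + ".join(combo))
```

It finds the same nine situations. The two pairs that stay below the
threshold are a DNS mismatch combined with either a forged HELO or a
failed sender check (30 points each), and no single test is enough on
its own.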
By using this "intelligent" behavior of the MTA we can handle mail
transactions more accurately. This is a very simple and efficient way.
But it is not the only one. We can also prevent false positive rejects
in the spam checks of mail systems by using data mining methods.
Any questions are welcome.
SPAM Filtering - Losing the war!
Losing the war?! I don't think so. Take the defence! :-)
Best regards, ded3axap.
Ramenskoye. Russia.
No.1 | 1792 bytes
Vitaly A Zakharov wrote:
*snip* (details of some well-written examples)
We would add that it can be very beneficial to defer actually 'acting on' these
strict tests (rDNS fail, HELO mismatch, RBL hit, etc.) until at least
acl_smtp_rcpt phase, where 'per-recipient' filtering is practical.
The reasons are economic.
Given that in any given 'organization-specific' domain - and arrivals are
grouped by target domain - there is, or most often *should be* - at least one
address that is *very* forgiving, and many others that are less so.
Example: Clients to whom a missed opportunity for a unit sale to a new customer
is worth several thousand US$ per each. New user registrations. Helpdesks.
So - a 'sales@<domain>.<tld>', 'info@<domain>.<tld>' or similar spam-target
initial-point-of-contact address needs the Mark 1 human eyeball to sort copious
arrivals of spam in order to find the one or two potentially valuable arrivals -
then respond and whitelist them if need be.
Best if staff can share that sort of unpleasant workload!
In acl_smtp_rcpt, we can pull the per-recipient thresholds, still reject any/all
that are NOT 'tolerant' recipients, and onpass only the survivors.
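
One way to picture the per-recipient step is the Python sketch below. It
is only a model of the idea, not Exim config: the addresses, the
threshold numbers and the function name are all hypothetical, and it
assumes a single spam score per session that is compared against each
recipient's own limit.

```python
# Hypothetical per-recipient thresholds; addresses and numbers are made up.
DEFAULT_THRESHOLD = 40
RECIPIENT_THRESHOLDS = {
    "sales@example.com": 1000,  # very forgiving: a human sorts this mailbox
    "info@example.com": 1000,
}


def accepted_recipients(spam_score: int, recipients: list[str]) -> list[str]:
    """Return the recipients whose own threshold the session score stays below."""
    survivors = []
    for rcpt in recipients:
        threshold = RECIPIENT_THRESHOLDS.get(rcpt.lower(), DEFAULT_THRESHOLD)
        if spam_score < threshold:
            survivors.append(rcpt)
    return survivors


# A session scoring 50 is refused for an ordinary mailbox but still
# reaches the forgiving sales address.
print(accepted_recipients(50, ["sales@example.com", "bob@example.com"]))
```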
Also - the 'tighter' the filters, the more attention needs to be paid to
maintaining very current exception whitelists and applying code that has a
similar 'automagical' effect. e.g. - allowing traffic from any domain your
clients have intentionally *sent to* [ ever | x-times in y-months ], and similar
lookups.
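
The "sent to x-times in y-months" lookup could be modelled roughly as
follows. This is a hedged Python sketch under stated assumptions: the
class name, its methods and the default numbers are invented for
illustration, outbound mail is assumed to be logged per recipient with a
timestamp, and a real setup would keep this in a database or lookup file
rather than in memory.

```python
from collections import defaultdict
from datetime import datetime, timedelta


class SentToWhitelist:
    """Remember the domains our own users have mailed, with timestamps."""

    def __init__(self, min_messages: int = 1, window_days: int = 90):
        self.min_messages = min_messages
        self.window = timedelta(days=window_days)
        self._sent = defaultdict(list)  # domain -> list of send times

    def record_outbound(self, recipient: str, when: datetime) -> None:
        """Log one outbound message to the recipient's domain."""
        domain = recipient.rsplit("@", 1)[-1].lower()
        self._sent[domain].append(when)

    def is_whitelisted(self, sender: str, now: datetime) -> bool:
        """True if we mailed the sender's domain often enough, recently enough."""
        domain = sender.rsplit("@", 1)[-1].lower()
        recent = [t for t in self._sent.get(domain, []) if now - t <= self.window]
        return len(recent) >= self.min_messages
```

Setting a very large window_days approximates the "ever" variant.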
We are, after all, not supposed to shoot the bystanders in this 'war'.
;-)
Bill
No.2 | 557 bytes
> By using this "intelligent" behavior of the MTA we can handle mail
> transactions more accurately. This is a very simple and efficient way.
We use a similar approach. I wouldn't call it "simple". I wouldn't call
it "\"intelligent\"". It's simply hard work. We combine asn, hostname,
rbl and helo tests, dns address verification and several other sanity
checks in the pre-data phase. Nevertheless, a lot of spam and malware
passes through. A (full-featured) spamassassin in conjunction with
clamAV is able to clean it up.
- oliver
No.3 | 1538 bytes
Egginger :
>> By using this "intelligent" behavior of the MTA we can handle mail
>> transactions more accurately. This is a very simple and efficient way.
> We use a similar approach. I wouldn't call it "simple". I wouldn't call
> it "\"intelligent\"". It's simply hard work. We combine asn, hostname,
> rbl and helo tests, dns address verification and several other sanity
> checks in the pre-data phase. Nevertheless, a lot of spam and malware
> passes through. A (full-featured) spamassassin in conjunction with
> clamAV is able to clean it up.
You just did not understand the basic points of my post. Or maybe my
English is so bad that it is hard to understand what I write. :-)
I never said "Do not use bayesian filters!".
I never said "Do not use antivirus!"
As for "intelligence":
Three or four checks are not "intelligent behavior", but ~20 tests +
greylisting + challengelisting + blacklisting + whitelisting, with the
MTA logic used to manipulate all of this, really is intelligence.
> Nevertheless, a lot of spam and malware
> passes through.
Try using a well-known construction, just above the virus checking in
the Exim configuration:
acl_check_mime:
  warn  decode = default
  drop  message = Blacklisted file extension detected.
        condition = ${if match{${lc:$mime_filename}}{\N(\.cpl|\.pif|\.bat|\.scr|\.lnk|\.com|\.hta)$\N}{1}{0}}
  accept
You would be surprised: the volume of viruses will decrease by about half.
No.4 | 1016 bytes
Sunday 29 2006 05:36, Vitaly A Zakharov wrote:
> Try using a well-known construction, just above the virus checking in
> the Exim configuration:
>
> acl_check_mime:
>   warn  decode = default
>   drop  message = Blacklisted file extension detected.
>         condition = ${if match{${lc:$mime_filename}}{\N(\.cpl|\.pif|\.bat|\.scr|\.lnk|\.com|\.hta)$\N}{1}{0}}
>   accept
>
> You would be surprised: the volume of viruses will decrease by about half.
You would be surprised, the number of users who complain because these
extensions (especially .lnk and .scr) are blocked.
In fact it was such a common problem among our (mostly non-IT) users, that we
ended up defaulting to NOT blocking executable extensions, though it can be
turned on per-domain.
I don't really like blocking simply on extension anyways - I ran into it
myself when trying to E-mail an HTML file without an extension (it was named
simply somedomain.com).
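
The false positive described here is easy to reproduce. The sketch below
applies the same list of extensions as the quoted ACL condition to a few
sample filenames, using Python's re module in place of Exim's match; the
function name and the filenames other than somedomain.com are made up
for the example.

```python
import re

# Same alternation as the ACL condition, applied to the lower-cased filename.
BLOCKED_EXTENSION = re.compile(r"(\.cpl|\.pif|\.bat|\.scr|\.lnk|\.com|\.hta)$")


def is_blocked(filename: str) -> bool:
    """True when the filename ends in one of the blacklisted extensions."""
    return BLOCKED_EXTENSION.search(filename.lower()) is not None


for name in ["photo.jpg", "Setup.SCR", "somedomain.com", "report.com.txt"]:
    print(name, is_blocked(name))
```

A file named somedomain.com is caught exactly like a real .com
executable, because the test looks only at the end of the name and never
at the content.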
Cheers,
No.5 | 2438 bytes
SeattleServer.com wrote:
> Sunday 29 2006 05:36, Vitaly A Zakharov wrote:
>> Try using a well-known construction, just above the virus checking in
>> the Exim configuration:
>>
>> acl_check_mime:
>>   warn  decode = default
>>   drop  message = Blacklisted file extension detected.
>>         condition = ${if match{${lc:$mime_filename}}{\N(\.cpl|\.pif|\.bat|\.scr|\.lnk|\.com|\.hta)$\N}{1}{0}}
>>   accept
>>
>> You would be surprised: the volume of viruses will decrease by about half.
> You would be surprised, the number of users who complain because these
> extensions (especially .lnk and .scr) are blocked.
> In fact it was such a common problem among our (mostly non-IT) users, that we
> ended up defaulting to NOT blocking executable extensions, though it can be
> turned on per-domain.
> I don't really like blocking simply on extension anyways - I ran into it
> myself when trying to E-mail an HTML file without an extension (it was named
> simply somedomain.com).
> Cheers,
We have two such rules - both with far more extensive lists, as we cover mostly
Mac and other 'non-MS' platforms. Both add 'points', and user prefs decide on
modification of 'Subject:' and quarantining.
- But the 'surprise' here is that they almost never triggered until recently.
Client branch offices that need to send photos and such are whitelisted and/or
trained to alter the file extent or encapsulate, and the villainous *were* being
stopped before they got as far as that.
That said, the recent rise in otherwise innocuous body with text-bearing graphic
attached says we need a server-global tightening up on a *combination* of
any-graphic + [stranger AND/OR rudebugger].
- Where 'stranger' is anyone we have never sent 'TO:', and 'rudebugger' is
weighted scores for failure on rDNS, HELO, dynamic-IP, RBL, header format
etc.
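
Read as a boolean rule, the combination above might look like the Python
sketch below. It is an illustration only: the function name, the
threshold of 40 and the idea of a single numeric 'rudebugger' score are
assumptions made for this sketch, not the rule actually deployed.

```python
def is_suspect_graphic_mail(has_graphic: bool, is_stranger: bool,
                            rude_score: int, rude_threshold: int = 40) -> bool:
    """Flag mail carrying a graphic when the sender is a stranger and/or
    has collected enough 'rudebugger' points (threshold is hypothetical)."""
    is_rude = rude_score >= rude_threshold
    return has_graphic and (is_stranger or is_rude)


# A known, well-behaved correspondent can still send pictures.
print(is_suspect_graphic_mail(has_graphic=True, is_stranger=False, rude_score=0))
# A stranger sending a graphic is flagged even with a clean score.
print(is_suspect_graphic_mail(has_graphic=True, is_stranger=True, rude_score=0))
```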
If we have to get into the insanity of CPU cycles needed for OCR inspection of
graphics, I'd call that a dead loss, strip the dodgy attachments, and point the
user community back to their fax machines (color, for the most part) or FedEx.
:-(
Bill