[Pymilter] exploding messages
Eric S. Johansson
esj at harvee.org
Fri Apr 23 22:38:48 EDT 2004
Stuart D. Gathman wrote:
> On Fri, 23 Apr 2004, Eric S. Johansson wrote:
>
>
>>As I described earlier, in camram, there can be either individual or
>>aggregate filters. a mail message may hit multiple individual or
>>aggregate filters. The message must be compared to each filter and will
>>get a different rating as to whether or not it is spam. At this point,
>>the message will get different additional headers and therefore create
>>multiple, almost identical, copies of the same message.
>
>
> The dspam filter creates a "TAG" that is filter (dictionary) specific and
> is used to look up a token stats record (signature) to change the status
> of a message.
>
> Rather than copy the entire message, I simply add all the tags to the
> message. If the user changes the status (marks as spam), the tags
> not belonging to them are simply ignored. This way, there
> is only one message delivered. For smart message stores (like Exchange -
> but don't buy it, it sucks in too many other ways), the message is
> only stored once as well.
>
> If you simply make it easy to recognize which headers go with a
> filter/individual (perhaps by including a filter id), you can add 'em all and
> not duplicate the entire message. If you are worried about users finding
> out about Bcc recipients due to information leakage in the extra headers,
> then only duplicate the message for Bcc recipients. I suppose you
> might also be concerned that officemates getting the same email would see each
> other's spam score ("Your spam score for that pr0n spam is 0.00??"), but
> that wouldn't bother me.
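
For reference, the tag-everything approach described above might look roughly like the sketch below. It is only an illustration using the standard email package: the header name X-Filter-Score, its "id; score=N" layout, and both function names are made up here and are not what dspam or camram actually emit.

```python
from email.message import Message

# Hypothetical header name; neither dspam nor camram is known to use it.
SCORE_HEADER = "X-Filter-Score"

def tag_scores(msg, scores):
    """Add one score header per filter id to a single copy of the message."""
    for filter_id, score in sorted(scores.items()):
        # Assigning to a Message appends a header; it does not replace
        # an existing header of the same name.
        msg[SCORE_HEADER] = "%s; score=%.2f" % (filter_id, score)
    return msg

def score_for(msg, filter_id):
    """Return the score tagged with filter_id, ignoring everyone else's tags."""
    for value in msg.get_all(SCORE_HEADER, []):
        tagged_id, _, rest = value.partition(";")
        if tagged_id.strip() == filter_id:
            return float(rest.split("=", 1)[1])
    return None
```

Each recipient's tools read only the header carrying their own filter id, so one delivered copy can serve every filter.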
Wish we had had this conversation before I wrote the code to aggregate
by filter group and replicate everything per group. :-)

The information I store in each message is the recipient (for reinjecting
the message), the score, and the spamtrap ID. So if I have a message which is
interpreted by three different sets of filter rules, I could potentially
end up with three recipient address groups, three different scores, and
up to three spamtrap IDs. I must point out that if a message has a
spamtrap ID, it is not propagated to the end user's mailbox.
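
A rough sketch of that per-group bookkeeping, for anyone following along. This is not the camram code: the names Copy, explode, and deliverable, and the idea of a rate() callable that returns a (score, spamtrap ID or None) pair, are all assumptions made for illustration.

```python
from collections import defaultdict, namedtuple

# Hypothetical record for one replicated copy of the message.
Copy = namedtuple("Copy", "group recipients score spamtrap_id")

def explode(recipients, group_of, rate):
    """Build one copy per filter group.

    group_of maps a recipient address to its filter group;
    rate(group) returns (score, spamtrap_id or None).
    """
    by_group = defaultdict(list)
    for rcpt in recipients:
        by_group[group_of[rcpt]].append(rcpt)
    copies = []
    for group in sorted(by_group):
        score, spamtrap_id = rate(group)
        copies.append(Copy(group, by_group[group], score, spamtrap_id))
    return copies

def deliverable(copies):
    """Copies carrying a spamtrap ID never reach the end user's mailbox."""
    return [c for c in copies if c.spamtrap_id is None]
```

With three filter groups this yields up to three copies, each with its own recipient list, score, and optional spamtrap ID.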
While this could work, it would seriously bollix up the internals, which
are, as I perceive them, MTA independent. For example, the headers I
add to a message are used during the reinjection process once someone
has approved a message as "good". The header information is used during
the spamtrap message list generation process as well. Having multiple
copies of the same headers is unpleasant at best and forces the user
interface to do more testing per message than I am comfortable with.

On the plus side, it's forced me to redo the filter front-end so that
it's a little more general. I think it needs a couple more passes (e.g.
postfix, exim) before it's really decent, but it's definitely better.

I will admit to being disappointed that the sendmail folks didn't see
far enough into the future to see the need for "forking" messages like
we're talking about doing. It would have been quite useful.
I was also playing with trying to see what address the traffic came in
on, and it looks like getsymval is the way to go, but
Milter: connect something?? redweb.harvee.org None at None
is what I get for the receiver, interface_name and interface_address
when I connect from localhost. I noticed something in the
documentation saying that it doesn't give back a valid answer for
localhost, and I guess this is what they mean...grumble. They could at least
give the port number, but I guess I'll have to look into macros a little
more closely and modify my mc file at the appropriate time.
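
A minimal sketch of the macro lookup, assuming the receiving-side macros are {if_name}, {if_addr}, {daemon_name}, and {daemon_port}, and that getsymval returns None for anything sendmail did not pass along (as the log line above suggests). The helper name receiving_interface is made up; it takes the lookup function as an argument so it can be tried without a running milter.

```python
# Macro names are assumptions about what sendmail exports at connect
# time; {daemon_port} in particular may need to be added to the connect
# macro list in the mc file before it shows up.
WANTED_MACROS = ("{if_name}", "{if_addr}", "{daemon_name}", "{daemon_port}")

def receiving_interface(getsymval):
    """Collect whichever receiving-side macros are actually set."""
    found = {}
    for name in WANTED_MACROS:
        value = getsymval(name)
        # Unset macros come back as None, e.g. the interface address
        # on a loopback connection.
        if value is not None:
            found[name] = value
    return found

# Inside a milter's connect callback this would be called as:
#     info = receiving_interface(self.getsymval)
```

Skipping the unset macros keeps the "None at None" case out of the logs and makes it obvious which values sendmail really supplied.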
---eric