From Christian.Wyglendowski at greenville.edu Fri Apr 2 12:27:55 2004
From: Christian.Wyglendowski at greenville.edu (Christian Wyglendowski)
Date: Fri, 2 Apr 2004 11:27:55 -0600
Subject: [Pymilter] milter switching from 'poll' to 'RUN' state
Message-ID:

Hi,

I have written a simple milter that strips attachments that usually carry
viruses (like the sample.py and bms.py milters do) and also appends a
warning to the body of emails that contain ZIP attachments.

It will run fine for hours and sometimes days at a time in the 'poll'
state (as viewed from the 'top' command) and consumes negligible CPU
time. However, it always at some point switches to the 'RUN' state and
consumes as much of the CPU resources as it can. It doesn't switch back
on its own. I have to stop sendmail, stop the milter, and restart them
both to correct the problem. Anyone else seeing this issue and/or have
any idea why it is happening?

Thanks,

Christian Wyglendowski
Network Administrator
Greenville College
618-664-7073

From stuart at bmsi.com Fri Apr 2 12:36:55 2004
From: stuart at bmsi.com (Stuart D. Gathman)
Date: Fri, 2 Apr 2004 12:36:55 -0500 (EST)
Subject: [Pymilter] milter switching from 'poll' to 'RUN' state
In-Reply-To:
Message-ID:

On Fri, 2 Apr 2004, Christian Wyglendowski wrote:

> time. However, it always at some point switches to the 'RUN' state and
> consumes as much of the CPU resources as it can. It doesn't switch back
> on its own. I have to stop sendmail, stop the milter, and restart them
> both to correct the problem. Anyone else seeing this issue and/or have
> any idea why it is happening?

Haven't seen this. However, sending a SIGINT might cause a traceback to
see where the loop is. I suspect the loop is trying to traverse a mangled
MIME structure. I have encountered many screwy things from all the worms
and viruses. I have always gotten an exception, not a loop, however.

I have to run at the moment.
If SIGINT doesn't work, the thing to work on is a way to get a
traceback for looping milters.

--
Stuart D. Gathman
Business Management Systems Inc. Phone: 703 591-0911 Fax: 703 591-6154
"Very few of our customers are going to have a pure Unix or pure Windows
environment." - Dennis Oldroyd, Microsoft Corporation

From Christian.Wyglendowski at greenville.edu Fri Apr 2 12:41:19 2004
From: Christian.Wyglendowski at greenville.edu (Christian Wyglendowski)
Date: Fri, 2 Apr 2004 11:41:19 -0600
Subject: [Pymilter] milter switching from 'poll' to 'RUN' state
Message-ID:

Thanks for the tip. I'll send a SIGINT next time it happens and see what
I get. I'll post to the list if I find anything interesting.

> -----Original Message-----
> From: Stuart D. Gathman [mailto:stuart at bmsi.com]
> Sent: Friday, April 02, 2004 11:37 AM
> To: Christian Wyglendowski
> Cc: pymilter at bmsi.com
> Subject: Re: [Pymilter] milter switching from 'poll' to 'RUN' state
>
>
> On Fri, 2 Apr 2004, Christian Wyglendowski wrote:
>
> > time. However, it always at some point switches to the 'RUN' state
> > and consumes as much of the CPU resources as it can. It doesn't
> > switch back on its own. I have to stop sendmail, stop the milter, and
> > restart them both to correct the problem. Anyone else seeing this
> > issue and/or have any idea why it is happening?
>
> Haven't seen this. However, sending a SIGINT might cause a
> traceback to see where the loop is. I suspect the loop is
> trying to traverse a mangled MIME structure. I have
> encountered many screwy things
> from all the worms and viruses. I have always gotten an
> exception, not a loop, however.
>
> I have to run at the moment. If SIGINT doesn't work, the
> thing to work on is a way to get a traceback for looping milters.
>
> --
> Stuart D. Gathman
> Business Management Systems Inc. Phone: 703 591-0911
> Fax: 703 591-6154
> "Very few of our customers are going to have a pure Unix
> or pure Windows environment."
> - Dennis Oldroyd,
> Microsoft Corporation
>
>

From esj at harvee.org Sat Apr 3 12:57:53 2004
From: esj at harvee.org (Eric S. Johansson)
Date: Sat, 03 Apr 2004 12:57:53 -0500
Subject: [Pymilter] simple milter design needed
Message-ID: <406EFB21.1030806@harvee.org>

just need to double check to see if I am barking up the wrong tree or if
there is an existing milter I can cannibalize

trying to put the camram hybrid sender-pays anti spam system
(www.camram.org) into a milter. What I need at invocation is a list of
all recipients and the sender, and the message passing through the
milter either in the form of a string or an email.message object. It would
be wonderful if I could find out what interface the message came in on
but I have a backup plan in case that's not possible.

on return, I would either pass the message back (modified) or nothing at
all (i.e. message has been spamtrapped).

are there any glaring problems with these needs? Or will I have no
problem creating a wrapper bridging between the milter data model and
the camram data model?

---eric

From stuart at bmsi.com Sat Apr 3 18:05:44 2004
From: stuart at bmsi.com (Stuart D. Gathman)
Date: Sat, 3 Apr 2004 18:05:44 -0500 (EST)
Subject: [Pymilter] simple milter design needed
In-Reply-To: <406EFB21.1030806@harvee.org>
Message-ID:

On Sat, 3 Apr 2004, Eric S. Johansson wrote:

> (www.camram.org) into a milter. What I need at invocation is a list of
> all recipients and the sender, and the message passing through the
> milter either the form of a string or a email.message object. It would
> be wonderful if I could find out what interface the message came in on
> but I have a backup plan in case that's not possible.

import Milter
import mime
import rfc822
import StringIO   # needed for the StringIO.StringIO() buffers used below

class camramMilter(Milter.Milter):

  # The connect callback tells you connecting IP. Furthermore, the connect
  # interface is available as a "macro".
  def connect(self,hostname,unused,hostaddr):
    self.receiver = self.getsymval('j')
    self.if_name = self.getsymval('if_name')
    self.if_addr = self.getsymval('if_addr')
    if hostaddr and len(hostaddr) > 0:
      ipaddr = hostaddr[0]
      self.connectip = ipaddr
    else:
      self.connectip = None
    self.log("connect from %s at %s" % (hostname,hostaddr))
    return Milter.CONTINUE

  def hello(self,hostname):
    self.hello_name = hostname
    self.log("hello from %s" % hostname)
    return Milter.CONTINUE

  # The envfrom callback tells you who the message is (purportedly) from.
  # multiple messages can be received on a single connection
  # envfrom (MAIL FROM in the SMTP protocol) marks the start
  # of each message.
  def envfrom(self,f,*str):
    self.log("mail from",f,str)
    self.fp = StringIO.StringIO()   # file to save message in
    self.bodysize = 0               # reset per message; body() adds to it
    self.mailfrom = f
    self.recipients = []
    return Milter.CONTINUE

  def envrcpt(self,to,*str):
    self.log("rcpt to",to,str)
    self.recipients.append(to)
    return Milter.CONTINUE

  def header(self,name,val):
    self.fp.write("%s: %s\n" % (name,val))   # add header to buffer
    return Milter.CONTINUE

  def eoh(self):
    self.fp.write("\n")   # blank line separates headers from body
    # possibly camram would know by this time whether to discard the
    # message
    if self.check_camram(): return Milter.DISCARD
    return Milter.CONTINUE

  def body(self,chunk):   # copy body to temp file
    if self.fp:
      self.fp.write(chunk)   # IOError causes TEMPFAIL in milter
      self.bodysize += len(chunk)
    return Milter.CONTINUE

  def _headerChange(self,msg,name,value):
    if value:   # add header
      self.addheader(name,value)
    else:       # delete all headers with name
      h = msg.getheaders(name)
      if h:
        for i in range(len(h),0,-1):
          self.chgheader(name,i-1,'')

> on return, I would either pass the message back (modified) or nothing at
> all (i.e. message has been spamtrapped).
  def eom(self):
    #msgtxt = self.fp.getvalue()   # get message as string
    #
    # get message as enhanced email.Message with bug fixes and support
    # for changing attachments and propagating header changes
    self.fp.seek(0)
    msg = mime.MimeMessage(self.fp)
    # pass header changes in top level message to sendmail
    msg.headerchange = self._headerChange
    # crunch message with camram
    if self.camram(msg):
      return Milter.DISCARD   # camram says to ignore message
    if not msg.ismodified():
      return Milter.CONTINUE
    # pass modified message to sendmail
    out = StringIO.StringIO()
    try:
      msg.dump(out)   # flatten modified message
      out.seek(0)
      msg = rfc822.Message(out)   # just need to skip headers
      msg.rewindbody()            # skip headers
      while True:
        buf = out.read(8192)
        if len(buf) == 0: break
        self.replacebody(buf)   # feed modified message to sendmail
      return Milter.CONTINUE
    except:
      return Milter.TEMPFAIL
    finally:
      out.close()

  def abort(self):
    self.log('connection aborted!')
    return Milter.CONTINUE

  def close(self):
    # do any cleanup here
    return Milter.CONTINUE

> is there any glaring problems with these needs? Or will I have no
> problem creating a wrapper bridging between the milter data model and
> the camram data model?

Hope the quick tutorial helped.

This really brings home the need to create a plugin structure for Python
milter addons. Sendmail can, of course, run several milters in series. And
that is the best approach for C milters. However, with Python it would be
better to handle lots of optional features within the same VM.

--
Stuart D. Gathman
Business Management Systems Inc. Phone: 703 591-0911 Fax: 703 591-6154
"Very few of our customers are going to have a pure Unix or pure Windows
environment." - Dennis Oldroyd, Microsoft Corporation

From stuart at bmsi.com Sat Apr 3 18:24:49 2004
From: stuart at bmsi.com (Stuart D. Gathman)
Date: Sat, 3 Apr 2004 18:24:49 -0500 (EST)
Subject: [Pymilter] simple milter design needed
In-Reply-To:
Message-ID:

On Sat, 3 Apr 2004, Stuart D.
Gathman wrote:

>     # pass modified message to sendmail
>     out = StringIO.StringIO()
>     try:
>       msg.dump(out)   # flatten modified message
...
>       return Milter.CONTINUE
>     except:
>       return Milter.TEMPFAIL
>     finally:
>       out.close()

Above based on code that normally uses a temp file to handle large emails.
Doing it all in memory could be as simple as:

    # pass modified message to sendmail
    newhdrs,newbody = msg.as_string().split('\n\n',1)
    self.replacebody(newbody)

--
Stuart D. Gathman
Business Management Systems Inc. Phone: 703 591-0911 Fax: 703 591-6154
"Very few of our customers are going to have a pure Unix or pure Windows
environment." - Dennis Oldroyd, Microsoft Corporation

From esj at harvee.org Sat Apr 3 23:05:24 2004
From: esj at harvee.org (Eric S. Johansson)
Date: Sat, 03 Apr 2004 23:05:24 -0500
Subject: [Pymilter] simple milter design needed
In-Reply-To:
References:
Message-ID: <406F8984.4030204@harvee.org>

Stuart D. Gathman wrote:

> On Sat, 3 Apr 2004, Eric S. Johansson wrote:
>
>
>>(www.camram.org) into a milter. What I need at invocation is a list of
>>all recipients and the sender, and the message passing through the
>>milter either the form of a string or a email.message object. It would
>>be wonderful if I could find out what interface the message came in on
>>but I have a backup plan in case that's not possible.
>
>
> import Milter
...

goodness. that is very generous of you.

> # The connect callback tells you connecting IP. Furthermore, the connect
> # interface is available as a "macro".

If I'm interpreting this correctly, it is presenting the address of the
host initiating the smtp transaction. I need to know something
different which is which interface the message arrives on not where it
came from. reason being that I perform asymmetric operation on the mail
stream. in one direction, I filter for spam, in another I stamp and log
outbound messages ( I guard this interface heavily :-).
it is no biggie if it is not possible, I will just run two copies of
sendmail with separate queues.

>
>   def connect(self,hostname,unused,hostaddr):
>     self.receiver = self.getsymval('j')
>     self.if_name = self.getsymval('if_name')
>     self.if_addr = self.getsymval('if_addr')
>     if hostaddr and len(hostaddr) > 0:
>       ipaddr = hostaddr[0]
...
>   # The envfrom callback tells you who the message is (purportedly) from.
>   # multiple messages can be received on a single connection
>   # envfrom (MAIL FROM in the SMTP protocol) marks the start
>   # of each message.
>   def envfrom(self,f,*str):
>     self.log("mail from",f,str)
>     self.fp = StringIO.StringIO()   # file to save message in
>     self.mailfrom = f
>     self.recipients = []

is class instance preserved between callbacks making self a safe place
to accumulate data? missed the docs on that. also occurs to me to
wonder how/if you transition between C threads and python threads. also
wonder about mem leaks and how to detect/recover. camram usually is
spawned by procmail and never lives for more than 1 process lifetime.
hmm I clearly have some work to do.

>   def eoh(self):
>     # possibly camram would know by this time whether to discard the
>     # message

in future maybe. I should have enough info at this time for stamp, fast
whitelist, slow white list and leave the heavyweight content filter to
later..

>
> Hope the quick tutorial helped.
>

yes indeed. I am really learning a lot from this example, things that
would take many hours of skull sweat. after 2 years of working out how
to make a user friendly hybrid sender-pays system, I can not thank you
enough for the help. many thanks

> This really brings home the need to create a plugin structure for Python milter
> addons. Sendmail can, of course, run several milters in series. And that is
> the best approach for C milters. However, with Python it would be better to
> handle lots of optional features within the same VM.

that is a point of some debate. many threads or many processes. I
prefer a mixture.
in some cases plugins work. camram has a filtering framework and logic
structure for interpreting the results of many filters. could it be
broken down into a series of plug in modules? yes it could and be the
better for it. it may take me some time to do so as the project would
need a patron.

but small steps first. let me get the basic filter working. first
gentoo now this. fun things to work with.

more tomorrow.

--- eric

From esj at harvee.org Sat Apr 3 23:17:43 2004
From: esj at harvee.org (Eric S. Johansson)
Date: Sat, 03 Apr 2004 23:17:43 -0500
Subject: [Pymilter] simple milter design needed
In-Reply-To:
References:
Message-ID: <406F8C67.2080709@harvee.org>

Stuart D. Gathman wrote:

> Above based on code that normally uses a temp file to handle large emails.
> Doing it all in memory could be as simple as:
>
>     # pass modified message to sendmail
>     newhdrs,newbody = msg.as_string().split('\n\n',1)
>     self.replacebody(newbody)

understood. it will be interesting to see if your email derived object
interoperates with mine. I just modify as_string, message_from_string,
and message_from_file to disable header wrapping.

you still having problems even if you use 2.5.4?
http://www.python.org/sigs/email-sig/

--- eric

From stuart at bmsi.com Sun Apr 4 16:08:26 2004
From: stuart at bmsi.com (Stuart D. Gathman)
Date: Sun, 4 Apr 2004 16:08:26 -0400 (EDT)
Subject: [Pymilter] simple milter design needed
In-Reply-To: <406F8984.4030204@harvee.org>
Message-ID:

On Sat, 3 Apr 2004, Eric S. Johansson wrote:

> > # The connect callback tells you connecting IP. Furthermore, the connect
> > # interface is available as a "macro".
>
> If I'm interpreting this correctly, it is presenting the address of the
> host initiating the smtp transaction. I need to know something
> different which is which interface the message arrives on not where it
> came from. reason being that I perform asymectric operation on the mail

The if_name macro tells you that, I believe.
The if_addr macro tells you the destination IP the message arrived on.
These are distinct from the parameters passed to the connect callback,
which tell you which host name and IP the connect came from.

The getsymval hack provides a way to get arbitrary data that wasn't
thought of when the basic API was designed (e.g. interface the message
arrives on). If necessary, you can gather any information you need in
sendmail-cf code and pass it via a named macro. Sendmail.cf defines which
macros are available to milters. You can call an arbitrary program in any
language to gather said information via a program map if the cf language is
not sufficient (or looks too much like chicken scratches).

> is class instance preserved between callbacks making self a safe place
> to accumulate data? missed the docs on that. also occurs to me to

Yes. That is the point of the OO layer provided by Milter.py. Also,
the connection object is discarded after calling close - unless you
leave a reference around somewhere. Note that the connection object
lifetime spans a connection - which can involve any number of messages.
Each call to envfrom starts a new message.

> wonder how/if you transion between C threads and python threads. also

Secret incantations gathered from careful study of the Python/C API,
and a few offerings to the Python gods on comp.lang.python. The key
API is the following:

  PyThreadState *t = PyThreadState_New(interp);
  if (t == NULL) return NULL;
  PyEval_AcquireThread(t); /* lock interp and assign to current thread */

> wonder about mem leaks and how to detect/recover. camram usually is
> spawned by procmail and never lives for more than 1 process lifetime.
> hmm I clearly have some work to do.

Python is garbage collected, and the C milter module has been very
tight after the last two leaks plugged by Alexander. And those leaks
were small. I run milter for months with 400 msgs/hr. Alexander runs
at 10 times that load.
As long as you don't keep adding to a global collection (a situation I
call "data cancer" and not a true memory leak), it is all automatic
(barring bugs in Python or a C module).

--
Stuart D. Gathman
Business Management Systems Inc. Phone: 703 591-0911 Fax: 703 591-6154
"Very few of our customers are going to have a pure Unix or pure Windows
environment." - Dennis Oldroyd, Microsoft Corporation

From stuart at bmsi.com Thu Apr 8 14:02:27 2004
From: stuart at bmsi.com (Stuart D. Gathman)
Date: Thu, 8 Apr 2004 14:02:27 -0400 (EDT)
Subject: [Pymilter] simple milter design needed
In-Reply-To:
Message-ID:

On Sun, 4 Apr 2004, Stuart D. Gathman wrote:

> > If I'm interpreting this correctly, it is presenting the address of the
> > host initiating the smtp transaction. I need to know something
> > different which is which interface the message arrives on not where it
> > came from. reason being that I perform asymectric operation on the mail
>
> The if_name macro tells you that, I believe. The if_addr macro tells
> you the destination IP the message arrived on. These are distinct from

I did some experiments with getsymval. It turns out that you need
the braces for the query:

    self.if_name = self.getsymval('{if_name}')
    self.if_addr = self.getsymval('{if_addr}')

whereas for other macros you don't:

    self.receiver = self.getsymval('j')

If_name does not return the interface name as you would expect, but
rather the host name looked up from the if_addr. So the if_addr is far
more useful. With the combination of source and destination ip
(connect_ip and if_addr), you can query the ip routing system to find the
(current) interface address(es), if that is what you want. However, these
can change over the life of a TCP connection. Your security should really
be based on the source (passed to connect) and destination (if_addr) IP of
the TCP connection.

--
Stuart D. Gathman
Business Management Systems Inc.
Phone: 703 591-0911 Fax: 703 591-6154
"Very few of our customers are going to have a pure Unix or pure Windows
environment." - Dennis Oldroyd, Microsoft Corporation

From esj at harvee.org Wed Apr 14 15:49:07 2004
From: esj at harvee.org (Eric S. Johansson)
Date: Wed, 14 Apr 2004 15:49:07 -0400
Subject: [Pymilter] simple milter design needed
In-Reply-To:
References:
Message-ID: <407D95B3.40008@harvee.org>

Stuart D. Gathman wrote:

> On Sat, 3 Apr 2004, Eric S. Johansson wrote:
>>(www.camram.org) into a milter. What I need at invocation is a list of
>>all recipients and the sender, and the message passing through the
>>milter either the form of a string or a email.message object. It would
>>be wonderful if I could find out what interface the message came in on
>>but I have a backup plan in case that's not possible.
>
>
> import Milter
> import mime
> import rfc822
>
> class camramMilter(Milter.Milter):
>
...

finally started work on this again.

[root at redweb milter-0.6.7]# python setup.py --help
/usr/lib/python2.2/distutils/dist.py:215: UserWarning: Unknown
distribution option: 'classifiers'
   warnings.warn(msg)
Global options:
   --verbose (-v)  run verbosely (default)

ever see these errors? I see them listed in google but nobody ever
seems to know how to fix them.

---eric

From esj at harvee.org Wed Apr 14 16:31:01 2004
From: esj at harvee.org (Eric S. Johansson)
Date: Wed, 14 Apr 2004 16:31:01 -0400
Subject: [Pymilter] simple milter design needed
In-Reply-To: <407D95B3.40008@harvee.org>
References: <407D95B3.40008@harvee.org>
Message-ID: <407D9F85.7090906@harvee.org>

Eric S. Johansson wrote:

> [root at redweb milter-0.6.7]# python setup.py --help
> /usr/lib/python2.2/distutils/dist.py:215: UserWarning: Unknown
> distribution option: 'classifiers'
>    warnings.warn(msg)
> Global options:
>    --verbose (-v)  run verbosely (default)
>
> ever see these errors? I see them listed in google but nobody ever
> seems to know how to fix them.

seems that brute force wins the day.
I just deleted classifiers...

---eric

From esj at harvee.org Thu Apr 15 08:33:45 2004
From: esj at harvee.org (Eric S. Johansson)
Date: Thu, 15 Apr 2004 08:33:45 -0400
Subject: [Pymilter] annoying errors and explanations
Message-ID: <407E8129.2020200@harvee.org>

started testing and the milter process starts up fine and runs fine.
But, sendmail tells me:

Starting sendmail: 451 4.0.0 /etc/mail/sendmail.cf: line 1644: Xmilter:
local socket name /tmp/camram/pythonsock unsafe: World writable directory

and it's not. Directory and sockets are owned by root.root, 660.

I figure it's probably being root phobic so I start testing with the
final userid. now it's the milter's turn to throw up a hairball.

bash-2.05b$ python camram_milter.py
Removing /tmp/camram/pythonsock
Traceback (most recent call last):
   File "camram_milter.py", line 133, in ?
     Milter.runmilter("milter",socketname,240)
   File "/usr/lib/python2.2/site-packages/Milter.py", line 198, in runmilter
     milter.main()
milter.error: cannot run main

even though the directory was 660 and owned by camram, I could not write
to it (permissions 101). what threw me was the error "cannot run main"
which is the error you give when there is a problem with threads.
Curiouser and curiouser.

in the end, sendmail is still sick about something masquerading as a
world writable directory. it's enough to make you go to postfix..

---eric

From esj at harvee.org Thu Apr 15 08:53:08 2004
From: esj at harvee.org (Eric S. Johansson)
Date: Thu, 15 Apr 2004 08:53:08 -0400
Subject: [Pymilter] annoying errors and explanations
In-Reply-To: <407E8129.2020200@harvee.org>
References: <407E8129.2020200@harvee.org>
Message-ID: <407E85B4.1000005@harvee.org>

Eric S. Johansson wrote:

> started testing and the milter process starts up fine and runs fine.
> But, sendmail tells me:
>
> Starting sendmail: 451 4.0.0 /etc/mail/sendmail.cf: line 1644: Xmilter:
> local socket name /tmp/camram/pythonsock unsafe: World writable directory
...
> > in the end, sendmail is still sick about something masquerading as a > world writable directory. it was complaining about /tmp. so, the real lesson is: don't use /tmp for transient fifos like everybody else if you are using sendmail. Not sure where I'm going to put them for real but for now /var/camram works. ---eric From stuart at bmsi.com Thu Apr 15 11:03:22 2004 From: stuart at bmsi.com (Stuart D. Gathman) Date: Thu, 15 Apr 2004 11:03:22 -0400 (EDT) Subject: [Pymilter] simple milter design needed In-Reply-To: <407D95B3.40008@harvee.org> Message-ID: On Wed, 14 Apr 2004, Eric S. Johansson wrote: > [root at redweb milter-0.6.7]# python setup.py --help > /usr/lib/python2.2/distutils/dist.py:215: UserWarning: Unknown > distribution option: 'classifiers' > warnings.warn(msg) > Global options: > --verbose (-v) run verbosely (default) > > ever see these errors? I see them listed in google but nobody ever > seems to know how to fix them. That is a new setup parm added in python2.3. Making setup.py work with older pythons will require testing python version (straightforward) and then passing keywords options to setup() as a dict (ugly). How needed is pre 2.3 compatibility? -- Stuart D. Gathman Business Management Systems Inc. Phone: 703 591-0911 Fax: 703 591-6154 "Very few of our customers are going to have a pure Unix or pure Windows environment." - Dennis Oldroyd, Microsoft Corporation From esj at harvee.org Thu Apr 15 13:17:50 2004 From: esj at harvee.org (Eric S. Johansson) Date: Thu, 15 Apr 2004 13:17:50 -0400 Subject: [Pymilter] simple milter design needed In-Reply-To: References: Message-ID: <407EC3BE.30802@harvee.org> Stuart D. Gathman wrote: > That is a new setup parm added in python2.3. Making setup.py work with > older pythons will require testing python version (straightforward) and > then passing keywords options to setup() as a dict (ugly). How needed is > pre 2.3 compatibility? it's needed. 
I'm supporting back to 2.2.1

in:
http://www.python.org/doc/current/dist/setup-script.html#SECTION000360000000000000000

I found:

"""
if you wish to include classifiers in your setup.py file and also wish
to remain backwards-compatible with Python releases prior to 2.2.3, then
you can include the following code fragment in your setup.py before the
setup() call.

# patch distutils if it can't cope with the "classifiers" or
# "download_url" keywords
if sys.version < '2.2.3':
    from distutils.dist import DistributionMetadata
    DistributionMetadata.classifiers = None
    DistributionMetadata.download_url = None
"""

does this help?

---eric

From stuart at bmsi.com Tue Apr 20 19:08:47 2004
From: stuart at bmsi.com (Stuart D. Gathman)
Date: Tue, 20 Apr 2004 19:08:47 -0400 (EDT)
Subject: [Pymilter] Release 0.6.9
Message-ID:

Mostly SPF enhancements. Still no word from Terrence Way, but I'm CCing him
for this announcement. SPF now passes the test suite except for -local and
-rcpt-to, which are not part of the RFC anyway.

I wonder if I should start releasing a forked pyspf package instead of
including my version with milter.

--
Stuart D. Gathman
Business Management Systems Inc. Phone: 703 591-0911 Fax: 703 591-6154
"Very few of our customers are going to have a pure Unix or pure Windows
environment." - Dennis Oldroyd, Microsoft Corporation

From esj at harvee.org Wed Apr 21 16:52:10 2004
From: esj at harvee.org (Eric S. Johansson)
Date: Wed, 21 Apr 2004 16:52:10 -0400
Subject: [Pymilter] something odd I've noticed about the milter supplied wrapper to e-mail module
Message-ID: <4086DEFA.5080609@harvee.org>

code of mine which had worked using the standard e-mail module is now
causing errors because automatic conversion to string type is not
happening with the milter wrapped email module. End result is a need to
sprinkle str() every time I try to assign an integer or a float to a
header.

any ideas why this is happening?
---eric

From esj at harvee.org Wed Apr 21 17:26:56 2004
From: esj at harvee.org (Eric S. Johansson)
Date: Wed, 21 Apr 2004 17:26:56 -0400
Subject: [Pymilter] something odd I've noticed about the milter supplied wrapper to e-mail module
In-Reply-To: <4086DEFA.5080609@harvee.org>
References: <4086DEFA.5080609@harvee.org>
Message-ID: <4086E720.6010704@harvee.org>

Eric S. Johansson wrote:

> any ideas why this is happening?

after a bit more research figured out why it's happening and it's not
pleasant (side effects at least anyway). Looks like I'm going to need
to write out the entire message as a string, create a python e-mail
object (import by string) and reverse the process on the way back.

ugh.. don't know what's worse. Combing some 6000 plus lines of source
files for places where I write to the e-mail headers or playing manual
deep copy games with potentially MB of messages.

might also try doing explicit type conversions in milter which might be
a not so ugly compromise. Will think about it while I'm hunting spring
frogs.

---eric

From stuart at bmsi.com Wed Apr 21 18:08:44 2004
From: stuart at bmsi.com (Stuart D. Gathman)
Date: Wed, 21 Apr 2004 18:08:44 -0400 (EDT)
Subject: [Pymilter] something odd I've noticed about the milter supplied wrapper to e-mail module
In-Reply-To: <4086DEFA.5080609@harvee.org>
Message-ID:

On Wed, 21 Apr 2004, Eric S. Johansson wrote:

> code of mine which had worked using the standard e-mail module is now
> causing errors because automatic conversion to string type is not
> happening with the milter wrapped email module. End result is a need to
> sprinkle str() every time I try to assign an integer or a float to a
> header.
>
> any ideas why this is happening?
class MimeMessage(Message):

  def __setitem__(self, name, value):
    rc = Message.__setitem__(self,name,value)
    self.modified = True
    if self.headerchange: self.headerchange(self,name,value)
    return rc

I call the parent setitem, but use the original value to pass on
(to sendmail through the headerchange hook).

If there was a simple way to retrieve the value just stored, we could
pass that on to sendmail instead. Unfortunately, it is tricky to find
the value we just stored when there are multiple headers with the same name.
(I think the one we just stored would be the last in the list, but I'm
not sure.)

The other option is to duplicate the string conversion. If wrapping with
str() is all that is required, that is probably simplest:

class MimeMessage(Message):

  def __setitem__(self, name, value):
    rc = Message.__setitem__(self,name,value)
    self.modified = True
    if self.headerchange: self.headerchange(self,name,str(value))
    return rc

--
Stuart D. Gathman
Business Management Systems Inc. Phone: 703 591-0911 Fax: 703 591-6154
"Very few of our customers are going to have a pure Unix or pure Windows
environment." - Dennis Oldroyd, Microsoft Corporation

From esj at harvee.org Wed Apr 21 21:50:40 2004
From: esj at harvee.org (Eric S. Johansson)
Date: Wed, 21 Apr 2004 21:50:40 -0400
Subject: [Pymilter] something odd I've noticed about the milter supplied wrapper to e-mail module
In-Reply-To:
References:
Message-ID: <408724F0.3070201@harvee.org>

Stuart D. Gathman wrote:

> On Wed, 21 Apr 2004, Eric S. Johansson wrote:

> If there was a simple way to retrieve the value just stored, we could
> pass that on to sendmail instead. Unfortunately, it is tricky to find
> the value we just stored when there are multiple headers with the same name.
> (I think the one we just stored would be the last in the list, but I'm
> not sure.)
the Message class __setitem__ is simply:

    self._headers.append((name, val))

which means it is just storing a tuple so I suspect it is some form of
string conversion on output that is doing the conversion I have come to
know and love. I should probably investigate this a bit more to see if
it is doing exactly what I think it's doing and why.

> The other option is to duplicate the string conversion. If wrapping with
> str() is all that is required, that is probably simplest:

probably not a bad idea. I will experiment with this tomorrow and let
you know how well it works.

---eric

From esj at harvee.org Fri Apr 23 09:38:38 2004
From: esj at harvee.org (Eric S. Johansson)
Date: Fri, 23 Apr 2004 09:38:38 -0400
Subject: [Pymilter] something odd I've noticed about the milter supplied wrapper to e-mail module
In-Reply-To:
References:
Message-ID: <40891C5E.9090805@harvee.org>

Stuart D. Gathman wrote:

> The other option is to duplicate the string conversion. If wrapping with
> str() is all that is required, that is probably simplest:
>
> class MimeMessage(Message):
>
>   def __setitem__(self, name, value):
>     rc = Message.__setitem__(self,name,value)
>     self.modified = True
>     if self.headerchange: self.headerchange(self,name,str(value))
>     return rc
>

this seems to be working OK. Although I am chasing a couple of other
problems that keep me from testing it fully.

From esj at harvee.org Fri Apr 23 09:49:47 2004
From: esj at harvee.org (Eric S. Johansson)
Date: Fri, 23 Apr 2004 09:49:47 -0400
Subject: [Pymilter] other problems being chased
Message-ID: <40891EFB.6090708@harvee.org>

haven't gone back to the original milter documentation yet. That's next
on the list of things to do this morning.

Camram is a different filter from most. it associates filtering
information on a per user or per domain basis. for example, you can
assign a given user complete filtering capabilities over a domain.
You can then exclude individuals from that domain aggregate filtering
and allow them to have their own filtering data sets (Bayesian, white
list etc.)

when you are running camram filtering at the delivery agent level, this
is not a problem. It works OK but when you start doing things at the
milter level, it gets interesting. For example, if a message is
delivered to a at x.c, b at x.c, c at x.c and b,c have individual
filtering, what happens when c marks the message as spam and holds it
back? The answer is unpleasant. the filter needs to tell the milter to
remove c from the list of recipients.

now I am sure there is some way to do this. And I will probably know
how to do it in about an hour. But right now, I'm a bit puzzled.

there is a similar separation on the outbound side. stamping reveals
recipient identities. Therefore, if you have a blind carbon copy, you
only want to aggregate stamps to those not on the blind carbon copy.
All of those who are on the blind carbon copy need individual stamps
and individual copies of messages. it's amusing.

---eric

From esj at harvee.org Fri Apr 23 10:17:47 2004
From: esj at harvee.org (Eric S. Johansson)
Date: Fri, 23 Apr 2004 10:17:47 -0400
Subject: [Pymilter] other problems being chased
In-Reply-To: <40891EFB.6090708@harvee.org>
References: <40891EFB.6090708@harvee.org>
Message-ID: <4089258B.70506@harvee.org>

Eric S. Johansson wrote:
> when you are running camram filtering at the delivery agent level, this
> is not a problem. It works OK but when you start doing things at the
> milter level, it gets interesting. For example, if a message is
> delivered to a at x.c, b at x.c, c at x.c and b,c have individual filtering, what
> happens when c marks the message as spam and holds it back? The answer
> is unpleasant. the filter needs to tell the milter to remove c from the
> list of recipients.

looks like delrcpt should do it if I update the recipients list in eom

what's quarantine for?
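
[Editorial sketch, not part of the original thread.] The a/b/c example above comes down to grouping recipients by the filter that applies to them, then working out which recipients have to be removed once each filter has given its verdict. A minimal sketch of that bookkeeping follows. The mapping from recipient to filter ID, the "domain" aggregate ID, and both function names are assumptions made for illustration; camram's real data model is not shown in this thread.

```python
from collections import defaultdict


def group_by_filter(recipients, individual_filters, aggregate_id="domain"):
    """Group recipients by the filter ID that applies to each of them.

    individual_filters is a hypothetical dict {recipient: filter_id};
    anyone not listed falls back to the shared aggregate filter.
    """
    groups = defaultdict(list)
    for rcpt in recipients:
        groups[individual_filters.get(rcpt, aggregate_id)].append(rcpt)
    return dict(groups)


def recipients_to_remove(groups, is_spam):
    """Return the recipients whose filter held the message back.

    is_spam is a callable taking a filter ID; these are the addresses
    a milter would hand to delrcpt() at end of message.
    """
    removed = []
    for filter_id, rcpts in groups.items():
        if is_spam(filter_id):
            removed.extend(rcpts)
    return removed


groups = group_by_filter(
    ["a@x.c", "b@x.c", "c@x.c"],
    {"b@x.c": "b", "c@x.c": "c"},
)
# Only c's individual filter calls the message spam.
to_delete = recipients_to_remove(groups, lambda filter_id: filter_id == "c")
```

With this split, a and b still receive the message and only c is dropped from the envelope.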

From stuart at bmsi.com Fri Apr 23 15:03:08 2004
From: stuart at bmsi.com (Stuart D. Gathman)
Date: Fri, 23 Apr 2004 15:03:08 -0400 (EDT)
Subject: [Pymilter] other problems being chased
In-Reply-To: <40891EFB.6090708@harvee.org>
Message-ID:

On Fri, 23 Apr 2004, Eric S. Johansson wrote:

> when you are running camram filtering at the delivery agent level, this
> is not a problem. It works OK but when you start doing things at the
> milter level, it gets interesting. For example, if a message is
> delivered to a at x.c, b at x.c, c at x.c and b,c have individual filtering, what
> happens when c marks the message as spam and holds it back? The answer
> is unpleasant. the filter needs to tell the milter to remove c from the
> list of recipients.
>
> now I am sure there is some way to do this. And I will probably know
> how to do it in about an hour. But right now, I'm a bit puzzled.

If you want to remove a specific recipient, call delrcpt() from eom().
bms.py also has a del_recipient() that can be called earlier. It just
accumulates a list to remove in eom(). I have suggested adding
basic tools like del_recipient() and add_header() (allows headers to
be added before eom() - just queue 'em up and add later) to Milter.py,
but have gotten strong resistance. There seems to be a demand for
a "bare bones" OO interface to libmilter. So such features would
go into an extended class derived from Milter.

In bms.py, the dspam support faces a similar problem. Originally, I
just did delrcpt() for the dspam user whose dictionary thought it was
spam. (If you end up deleting all recipients, sendmail will DISCARD it
automatically.) But then, I started discarding the message, and saving
the original recipient list in the quarantined message. The other
recipients don't see the message unless the dspam user that quarantined
it reports it as a false positive. Now, this policy is not for
everyone, but at our typical customer with 20-70 employees, this saves
a lot of spam deleting.

In fact, I went one step further: designated employees are flagged as
"screeners" - messages that look like spam to their dictionary are
quarantined (to the screener's quarantine box) even when the screener
is *not* the recipient. It gets delivered to the original recipient if
the screener marks it as a false positive. This lets 1 or 2 employees
do the bulk of spam scanning for the rest of the company. There is
usually one employee who actually enjoys this sort of work (spams are
often good for a laugh).

Furthermore, I am a screener for customers that don't want to deal with
a spam interface at all. Their mail goes through my server, and if my
dictionary thinks it is spam it gets quarantined. It gets delivered to
the original recipient if I mark it as a false positive - often after
a phone call to ask whether they really signed up for "News at Noon"
(no they didn't).

This system needs to be extended so that screeners can screen a subset
of all mail users. Our larger customers have departments whose legit
mail differs quite a bit from each other - so having a screener screen
only their own department would increase accuracy. (Because other
people's mail that is not quarantined doesn't affect their dictionary -
so it doesn't adapt to innocent mail unless they get the same kind of
mail they are screening.) However, the addition of SPF has greatly
increased accuracy already.

--
Stuart D. Gathman
Business Management Systems Inc. Phone: 703 591-0911 Fax: 703 591-6154
"Confutatis maledictis, flamis acribus addictis" - background song for
a Microsoft sponsored "Where do you want to go from here?" commercial.

From esj at harvee.org Fri Apr 23 15:08:15 2004
From: esj at harvee.org (Eric S. Johansson)
Date: Fri, 23 Apr 2004 15:08:15 -0400
Subject: [Pymilter] exploding messages
Message-ID: <4089699F.2050106@harvee.org>

(I hope folks don't mind my thinking aloud. Hopefully it will
illustrate features of pymilter one can use.)

As I described earlier, in camram, there can be either individual or
aggregate filters. a mail message may hit multiple individual or
aggregate filters. The message must be compared to each filter and will
get a different rating as to whether or not it is spam. At this point,
the message will get different additional headers and therefore create
multiple, almost identical, copies of the same message.

I think I'm comfortable with the deep copy replication and address
partitioning. The question is what's the best way to inject these
messages back into the system without passing them through the camram
filter again.

I'm beginning to think that I might want to not release a message back
through milter but always output the message through the reinjection
mechanism. so the system will look something like:

external>>-->> sendmail(p1)..|| milter || camram_pymilter || filter and replicate || reinject --> sendmail(p3)-->deliver

an external e-mail message is delivered to the input side of sendmail.
The message is passed to the milter with corresponding camram filter.
The message is filtered and replicated, and is then passed to the
reinjection system, which delivers it to (another?) sendmail which then
delivers it.

Unless I'm very much mistaken, there does not seem to be any way to
create and inject multiple copies at the milter level.

thoughts?

---eric

From stuart at bmsi.com Fri Apr 23 15:20:47 2004
From: stuart at bmsi.com (Stuart D. Gathman)
Date: Fri, 23 Apr 2004 15:20:47 -0400 (EDT)
Subject: [Pymilter] exploding messages
In-Reply-To: <4089699F.2050106@harvee.org>
Message-ID:

On Fri, 23 Apr 2004, Eric S. Johansson wrote:

> an external e-mail message is delivered to the input side of sendmail.
> The message is passed to the milter with corresponding camram filter.
> The message is filtered and replicated, and is then passed to the
> reinjection system, which delivers it to (another?) sendmail which then
> delivers it.
>
> Unless I'm very much mistaken, there does not seem to be any way to
> create and inject multiple copies at the milter level.

You are correct. You can add/del recipients and change the message,
but sendmail can deliver only one version of a message. As an
optimization, you could DISCARD and reinject only when the number of
message versions is > 1.

Milter is not called for locally generated mail (i.e. from postmaster).
This is annoying because most of my postmaster mail is spam (failed
attempts to tell spammers their spam couldn't be delivered on machines
I am a secondary MX for). However, locally generated mail is just what
you require. I believe that running sendmail as a MSA,

  fp = popen("/usr/sbin/sendmail -f %s %s" % (sender,rcpt),"w")

will do what you want.

--
Stuart D. Gathman
Business Management Systems Inc. Phone: 703 591-0911 Fax: 703 591-6154
"Confutatis maledictis, flamis acribus addictis" - background song for
a Microsoft sponsored "Where do you want to go from here?" commercial.

From esj at harvee.org Fri Apr 23 15:22:31 2004
From: esj at harvee.org (Eric S. Johansson)
Date: Fri, 23 Apr 2004 15:22:31 -0400
Subject: [Pymilter] other problems being chased
In-Reply-To:
References:
Message-ID: <40896CF7.2060506@harvee.org>

Stuart D. Gathman wrote:

> If you want to remove a specific recipient, call delrcpt() from eom().
> bms.py also has a del_recipient() that can be called earlier. It just
> accumulates a list to remove in eom(). I have suggested adding
> basic tools like del_recipient() and add_header() (allows headers to
> be added before eom() - just queue 'em up and add later) to Milter.py,
> but have gotten strong resistance. There seems to be a demand for
> a "bare bones" OO interface to libmilter. So such features would
> go into an extended class derived from Milter.

that works for me. I won't know who's to be deleted until I do my
filtering deep within eom time.
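
[Editorial sketch, not part of the original thread.] The "queue 'em up and add later" idea quoted above could look roughly like the mixin below. This is not the bms.py code. RecordingMilter is a hypothetical stand-in that only records calls so the example runs without sendmail; in a real filter the base class would be the pymilter class, which is assumed here to provide the libmilter-style addheader() and delrcpt() calls.

```python
class QueueingMixin:
    """Queue header additions and recipient deletions until eom()."""

    def reset_queues(self):
        # Call at the start of every message (for example from envfrom()).
        self._pending_headers = []
        self._pending_deletes = []

    def add_header(self, name, value):
        # Safe to call from any callback; nothing is sent to the MTA yet.
        self._pending_headers.append((name, value))

    def del_recipient(self, rcpt):
        # Remember each recipient once; the actual delete happens in eom().
        if rcpt not in self._pending_deletes:
            self._pending_deletes.append(rcpt)

    def flush_queues(self):
        # Call from eom(), where the MTA accepts modification requests.
        for name, value in self._pending_headers:
            self.addheader(name, value)
        for rcpt in self._pending_deletes:
            self.delrcpt(rcpt)
        self.reset_queues()


class RecordingMilter:
    """Hypothetical stand-in for the real milter base class."""

    def __init__(self):
        self.calls = []

    def addheader(self, name, value):
        self.calls.append(("addheader", name, value))

    def delrcpt(self, rcpt):
        self.calls.append(("delrcpt", rcpt))


class DemoMilter(QueueingMixin, RecordingMilter):
    pass


m = DemoMilter()
m.reset_queues()
m.del_recipient("c@x.c")          # decided early, e.g. in envrcpt()
m.add_header("X-Demo", "queued")  # decided early, e.g. in header()
queued_before_eom = list(m.calls)
m.flush_queues()                  # what eom() would do
```

Keeping the queue in a mixin leaves the bare-bones base class untouched, which matches the split described in the quoted message.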

> This system needs to be extended so that screeners can screen a subset
> of all mail users. Our larger customers have departments whose legit
> mail differs quite a bit from each other - so having a screener screen only
> their own department would increase accuracy. (Because other people's mail
> that is not quarantined doesn't affect their dictionary - so it doesn't
> adapt to innocent mail unless they get the same kind of mail they are
> screening.) However, the addition of SPF has greatly increased accuracy
> already.

I have that code already. I will gladly share with you the scar tissue
I've encountered in making it work. In the "lump in the line" model, it
was fairly easy to create filter specific IDs and associate them with
either individual e-mail addresses or domains (or both). Making milter
work the right way for message replication appears to be relatively
difficult.

the next trick is then using something I call delegation and merging.
The delegation assigns responsibility for spamtrap filtering to another
party. Merging takes delegation one step further and merges two or more
accounts together making responsibility for sorting etc. the
responsibility of one account.

they also have different implications when it comes to undoing the
various bindings. Merging is impossible to undo (think removing the
chocolate from chocolate milk). The best you can do is replicate the
various filtering data sources into the different user accounts.
Delegation is easier to undo because it merely undoes the spamtrap
redirection. The users still have their own double spend, white list
and Bayesian databases.

To tell you the truth, I'm not entirely happy with how the design/code
evolved but it does map to how people work so it's probably going to
take one more pass to make me more comfortable with the design.

As for SPF increasing accuracy I'm glad it is working for you but, I
have one customer that is not using SPF because too many of the mailing
lists and forwarding services were being inappropriately marked. I have
my doubts about SPF's long-term viability given that some of the bigger
spammers are already creating SPF records for their zombie owned
machines. On the other hand, I know there are people that have doubts
about what I'm doing. :-)

---eric

From stuart at bmsi.com Fri Apr 23 18:01:37 2004
From: stuart at bmsi.com (Stuart D. Gathman)
Date: Fri, 23 Apr 2004 18:01:37 -0400 (EDT)
Subject: [Pymilter] SPF effectiveness
In-Reply-To: <40896CF7.2060506@harvee.org>
Message-ID:

On Fri, 23 Apr 2004, Eric S. Johansson wrote:

> As for SPF increasing accuracy I'm glad it is working for you but, I
> have one customer that is not using SPF because too many of the mailing
> lists and forwarding services were being inappropriately marked. I have
> my doubts about SPF's long-term viability given that some of the bigger
> spammers are already creating SPF records for their zombie owned
> machines. On the other hand, I know there are people that have doubts
> about what I'm doing. :-)

It is fine for spammers to create SPF records. That just makes their
domain a highly significant token for bayesian filtering - or else
easily blacklisted with a RHSBL. Preventing forgery is even good for
the semi-legit spammers (i.e. not a scam - they actually deliver a
product) that strange people actually buy from. If I can easily block
their pitch before SMTP DATA, their broadcasting is annoying but
livable. The semi-legit spammers are suffering from the scammers - who
have much more reason to forge their mail headers.

SPF stops forging of the envelope sender - nothing else. (The Yahoo
scheme prevents forging of From: and related headers.)

While rejecting forged senders gets rid of a lot of spam now, the goal
for the future is to have *no* spam with forged headers because all
spammers will have SPF records. SMTP envelope level spam blocking will
all be by domain name blacklists (and there will still be content
filtering, since it is relatively cheap for sleazier spammers to keep
buying new domain names). Furthermore, since getting a domain name (as
opposed to forging someone else's) requires registering with a domain
registrar, it will be easier to track down the truly criminal spammers.

SPF, like all truly useful spam tools, is not a silver bullet. It
simply enforces accountability for domain holders.

--
Stuart D. Gathman
Business Management Systems Inc. Phone: 703 591-0911 Fax: 703 591-6154
"Confutatis maledictis, flamis acribus addictis" - background song for
a Microsoft sponsored "Where do you want to go from here?" commercial.

From stuart at bmsi.com Fri Apr 23 18:14:57 2004
From: stuart at bmsi.com (Stuart D. Gathman)
Date: Fri, 23 Apr 2004 18:14:57 -0400 (EDT)
Subject: [Pymilter] other problems being chased
In-Reply-To: <40896CF7.2060506@harvee.org>
Message-ID:

On Fri, 23 Apr 2004, Eric S. Johansson wrote:

> As for SPF increasing accuracy I'm glad it is working for you but, I
> have one customer that is not using SPF because too many of the mailing
> lists and forwarding services were being inappropriately marked. I have
> my doubts about SPF's long-term viability given that some of the bigger
> spammers are already creating SPF records for their zombie owned
> machines. On the other hand, I know there are people that have doubts
> about what I'm doing. :-)

If they would just use SPF to create the Received-SPF header, and feed
it with the rest of the message to their favorite bayesian or score
based filter, they wouldn't have that problem. It is inappropriate to
reject mail based solely on SPF unless you understand how to handle
forwarding services. Real mailing lists are not a problem.
Any mailing list that puts the author's [From:] email as the sender
[envfrom] is braindead - but if you unwisely decide to post to one,
just include its servers in your SPF record. Too many greeting card
services fall under this category.

--
Stuart D. Gathman
Business Management Systems Inc. Phone: 703 591-0911 Fax: 703 591-6154
"Confutatis maledictis, flamis acribus addictis" - background song for
a Microsoft sponsored "Where do you want to go from here?" commercial.

From stuart at bmsi.com Fri Apr 23 19:03:21 2004
From: stuart at bmsi.com (Stuart D. Gathman)
Date: Fri, 23 Apr 2004 19:03:21 -0400 (EDT)
Subject: [Pymilter] exploding messages
In-Reply-To: <4089699F.2050106@harvee.org>
Message-ID:

On Fri, 23 Apr 2004, Eric S. Johansson wrote:

> As I described earlier, in camram, there can be either individual or
> aggregate filters. a mail message may hit multiple individual or
> aggregate filters. The message must be compared to each filter and will
> get a different rating as to whether or not it is spam. At this point,
> the message will get different additional headers and therefore create
> multiple, almost identical, copies of the same message.

The dspam filter creates a "TAG" that is filter (dictionary) specific
and is used to look up a token stats record (signature) to change the
status of a message.

Rather than copy the entire message, I simply add all the tags to the
message. If the user changes the status (marks as spam), the tags
not belonging to them are simply ignored. This way, there
is only one message delivered. For smart message stores (like Exchange -
but don't buy it, it sucks in too many other ways), the message is
only stored once as well.

If you simply make it easy to recognize which headers go with a
filter/individual (perhaps by including a filter id), you can add 'em
all and not duplicate the entire message.
If you are worried about users finding out about Bcc recipients due to
information leakage in the extra headers, then only duplicate the
message for Bcc recipients. I suppose you might also be concerned that
officemates getting the same email would see each other's spam score
("Your spam score for that pr0n spam is 0.00??"), but that wouldn't
bother me.

--
Stuart D. Gathman
Business Management Systems Inc. Phone: 703 591-0911 Fax: 703 591-6154
"Confutatis maledictis, flamis acribus addictis" - background song for
a Microsoft sponsored "Where do you want to go from here?" commercial.

From esj at harvee.org Fri Apr 23 22:38:48 2004
From: esj at harvee.org (Eric S. Johansson)
Date: Fri, 23 Apr 2004 22:38:48 -0400
Subject: [Pymilter] exploding messages
In-Reply-To:
References:
Message-ID: <4089D338.8040205@harvee.org>

Stuart D. Gathman wrote:

> On Fri, 23 Apr 2004, Eric S. Johansson wrote:
>
>
>>As I described earlier, in camram, there can be either individual or
>>aggregate filters. a mail message may hit multiple individual or
>>aggregate filters. The message must be compared to each filter and will
>>get a different rating as to whether or not it is spam. At this point,
>>the message will get different additional headers and therefore create
>>multiple, almost identical, copies of the same message.
>
>
> The dspam filter creates a "TAG" that is filter (dictionary) specific and
> is used to lookup a token stats record (signature) to change the status
> of a message.
>
> Rather than copy the entire message, I simply add all the tags to the
> message. If the user changes the status (marks as spam), the tags
> not belonging to them are simply ignored. This way, there
> is only one message delivered. For smart message stores (like Exchange -
> but don't buy it, it sucks in too many other ways), the message is
> only stored once as well.
>
> If you simply make it easy to recognize which headers go with a
> filter/individual (perhaps by including a filter id), you can add 'em all and
> not duplicate the entire message. If you are worried about users finding
> out about Bcc recipients due to information leakage in the extra headers,
> then only duplicate the message for Bcc recipients. I suppose you
> might also be concerned that officemates getting the same email would see each
> other's spam score ("Your spam score for that pr0n spam is 0.00??"), but
> that wouldn't bother me.

wish we had had this conversation before I wrote the code to aggregate
by filter group and replicate everything per group. :-)

the information I store in each message is recipient (for reinjecting
message), score, spamtrap ID. So if I have a message which is
interpreted by three different sets of filter rules, I could
potentially end up with three recipient address groups, three different
scores and up to three spamtrap IDs. I must point out that if a message
has a spamtrap ID it is not propagated to the end-user's mailbox.

while this could work, it would seriously bollix up the internals which
are, as I perceive them, MTA independent. for example, the headers I
add to a message are used during the reinjection process once someone
has approved a message as "good". The header information is used during
the spamtrap message list generation process as well. Having multiple
copies of the same headers is unpleasant at best and forces the user
interface to do more testing per message than I am comfortable with.

on the plus side, it's forced me to redo the filter front-end so that
it's a little more general. I think it needs a couple more passes (i.e.
postfix, exim) before it's really decent but it's definitely better.

I will admit to being disappointed that the sendmail folks didn't see
far enough in the future to see the need for "forking" messages like
we're talking about doing. It would have been quite useful.
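
[Editorial sketch, not part of the original thread.] The "include a filter id in the header" suggestion quoted above can be sketched with the standard email package. The header name X-Filter-Tag, the value layout and both helper names are made up for illustration; neither dspam nor camram is known to use them. The example is written for a current Python 3, not the 2.3 email package under discussion.

```python
from email.message import Message

TAG_HEADER = "X-Filter-Tag"  # hypothetical header name, for illustration only


def add_filter_tag(msg, filter_id, score):
    """Add one tag header per filter; assignment appends, it does not replace."""
    msg[TAG_HEADER] = "%s score=%.2f" % (filter_id, score)


def tags_for(msg, filter_id):
    """Return only the tag values that belong to the given filter ID."""
    prefix = filter_id + " "
    return [value for value in msg.get_all(TAG_HEADER, [])
            if value.startswith(prefix)]


msg = Message()
msg["Subject"] = "one delivered copy, several verdicts"
add_filter_tag(msg, "alice", 0.01)
add_filter_tag(msg, "bob", 0.97)

# Each filter reads back its own tag and ignores the rest.
bobs_tags = tags_for(msg, "bob")
```

The trade-off raised in the reply still applies: every recipient of the single copy can see every tag, so this only fits where that leakage is acceptable.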

I was also playing with trying to see what address the traffic came in
on and it looks like getsymval is the way to go but

Milter: connect something?? redweb.harvee.org None at None

is what I get for the receiver, interface_name and interface_address
when I connect from local host. I noticed something in the
documentation saying that it doesn't give back a valid answer for local
host and I guess this is what they mean...grumble. They could at least
give the port number but I guess I'll have to look into macros a little
more closely and modify my mc file at the appropriate time.

---eric

From esj at harvee.org Mon Apr 26 12:07:10 2004
From: esj at harvee.org (Eric S. Johansson)
Date: Mon, 26 Apr 2004 12:07:10 -0400
Subject: [Pymilter] delrcpt errors
Message-ID: <408D33AE.4010808@harvee.org>

trying to delete recipients and haven't quite figured out what I'm
doing wrong. The line "recipient list" is the list of addresses taken
from the SMTP protocol level by 'envrcpt'. I do my munching and then I
have a list of addresses I want to delete. I go through that list and I
get a "cannot delete recipient" exception. As far as I can tell the
strings are identical and I'm trying to delete recipients inside of
eom. I'm puzzled.

Milter: recipient list = ['', '']
Milter: address
Milter: delete recipient cannot delete recipient
Milter: address
Milter: delete recipient cannot delete recipient

From esj at harvee.org Mon Apr 26 12:16:22 2004
From: esj at harvee.org (Eric S. Johansson)
Date: Mon, 26 Apr 2004 12:16:22 -0400
Subject: [Pymilter] exploding messages
In-Reply-To:
References:
Message-ID: <408D35D6.1020002@harvee.org>

Stuart D. Gathman wrote:

> The dspam filter creates a "TAG" that is filter (dictionary) specific and
> is used to lookup a token stats record (signature) to change the status
> of a message.
>
> Rather than copy the entire message, I simply add all the tags to the
> message.
> If the user changes the status (marks as spam), the tags
> not belonging to them are simply ignored. This way, there
> is only one message delivered. For smart message stores (like Exchange -
> but don't buy it, it sucks in too many other ways), the message is
> only stored once as well.

after thinking about it all weekend, I think I was a bit hasty in
rejecting your suggestions. While there are a bunch of reasons for
doing what I'm going to do in the short-term, I think the right answer
is to try an alternative based on your ideas. That is to not change the
mail message at all but deliver all that I can in the first pass and
only make copies of messages that are placed into the spamtrap or
dumpster.

I will probably use the message ID as a tag for associating all of the
scores etc. in the message logs. If I can't record information in the
mail message, the least I can do is be able to reconstruct it.

I will admit that it bothers me that I can't put per recipient headers
in the messages but I will get over it. I think I was thinking that way
because of having been operating as a local delivery agent filter for
so long.

---eric

From stuart at bmsi.com Mon Apr 26 12:52:49 2004
From: stuart at bmsi.com (Stuart D. Gathman)
Date: Mon, 26 Apr 2004 12:52:49 -0400
Subject: [Pymilter] delrcpt errors
In-Reply-To: <408D33AE.4010808@harvee.org>
References: <408D33AE.4010808@harvee.org>
Message-ID: <408D3E61.8040605@bmsi.com>

Eric S. Johansson wrote:

> trying to delete recipients and haven't quite figured out what I'm
> doing wrong. The line "recipient list" is the list of addresses taken
> from the SMTP protocol level by 'envrcpt'. I do my munching and then
> I have a list of addresses I want to delete. I go through that list
> and I get a "cannot delete recipient" exception. As far as I can tell
> the strings are identical and I'm trying to delete recipients inside
> of eom. I'm puzzled.
>
> Milter: recipient list = ['',
> '']
> Milter: address
> Milter: delete recipient cannot delete recipient
> Milter: address
> Milter: delete recipient cannot delete recipient

You have to tell sendmail you might be deleting recipients by setting
DELRCPT in the flags. Here is a snippet from the startup in bms.py:

  flags = Milter.CHGBODY + Milter.CHGHDRS + Milter.ADDHDRS
  if wiretap_dest or smart_alias or dspam_userdir:
    flags = flags + Milter.ADDRCPT
  if srs or len(discard_users) > 0 or smart_alias or dspam_userdir:
    flags = flags + Milter.DELRCPT
  Milter.set_flags(flags)

From stuart at bmsi.com Mon Apr 26 15:22:32 2004
From: stuart at bmsi.com (Stuart D. Gathman)
Date: Mon, 26 Apr 2004 15:22:32 -0400 (EDT)
Subject: [Pymilter] exploding messages
In-Reply-To: <408D35D6.1020002@harvee.org>
Message-ID:

On Mon, 26 Apr 2004, Eric S. Johansson wrote:

> I will probably use the message ID as a tag for associating all of the
> scores etc. in the message logs. If I can't record information in the
> mail message, the least I can do is being able to reconstruct it.

I would recommend not relying on someone else's message ID as a tag.
Spam, for instance, often uses the same message-id for a whole run of
spam, or has no message-id at all.

Instead, add your own X-Camram-Tag: header to associate scores, etc.
(checking whether one already exists and discarding or escaping). This
also solves the problem of coworkers seeing their officemates' spam
scores.

--
Stuart D. Gathman
Business Management Systems Inc. Phone: 703 591-0911 Fax: 703 591-6154
"Confutatis maledictis, flamis acribus addictis" - background song for
a Microsoft sponsored "Where do you want to go from here?" commercial.

From esj at harvee.org Tue Apr 27 13:21:51 2004
From: esj at harvee.org (Eric S. Johansson)
Date: Tue, 27 Apr 2004 13:21:51 -0400
Subject: [Pymilter] exploding messages
In-Reply-To:
References:
Message-ID: <408E96AF.2050704@harvee.org>

Stuart D.
Gathman wrote:

> I would recommend not relying on someone else's message ID as a tag.
> Spam, for instance, often uses the same message-id for a whole run of spam,
> or has no message-id at all.

I hadn't paid attention. Knowing what I do about spammer behavior, you
are probably quite right.

> Instead, add your own X-Camram-Tag: header to associate scores, etc.
> (checking whether one already exists and discarding or escaping). This
> also solves the problem of coworkers seeing their officemates' spam scores.
>

shouldn't be a big problem. Need to find a good source of uniqueness.
Will probably do something like sha1 on the entire body of the message.
Collisions should be fairly rare.

obviously, this will be a fun thing to test. ;-)

---eric

From stuart at bmsi.com Wed Apr 28 17:05:26 2004
From: stuart at bmsi.com (Stuart D. Gathman)
Date: Wed, 28 Apr 2004 17:05:26 -0400 (EDT)
Subject: [Pymilter] HELO optional
Message-ID:

I just discovered that using HELO in smtp is optional (at least with
sendmail-8.12.10).

2004Apr27 21:32:43 [25310] connect from [202.155.23.34] at ('202.155.23.34', 28136) EXTERNAL
2004Apr27 21:32:45 [25310] mail from ()
Traceback (most recent call last):
  File "/usr/lib/python2.3/site-packages/Milter.py", line 178, in
    milter.set_envfrom_callback(lambda ctx,*str:
  File "bms.py", line 536, in envfrom
    return self.check_spf()
  File "bms.py", line 542, in check_spf
    q = spf.query(self.connectip,'@'.join(t),self.hello_name)
AttributeError: bmsMilter instance has no attribute 'hello_name'

This message was spam, but I'm wondering if I'm justified in rejecting
any connections without HELO.

--
Stuart D. Gathman
Business Management Systems Inc. Phone: 703 591-0911 Fax: 703 591-6154
"Confutatis maledictis, flamis acribus addictis" - background song for
a Microsoft sponsored "Where do you want to go from here?" commercial.

From esj at harvee.org Wed Apr 28 17:10:20 2004
From: esj at harvee.org (Eric S.
Johansson)
Date: Wed, 28 Apr 2004 17:10:20 -0400
Subject: [Pymilter] HELO optional
In-Reply-To:
References:
Message-ID: <40901DBC.5030805@harvee.org>

Stuart D. Gathman wrote:

> This message was spam, but I'm wondering if I'm justified in rejecting
> any connections without HELO.

IMHO, I would say yes but the final arbiter should be RFC 2821

---eric

From stuart at bmsi.com Wed Apr 28 23:24:36 2004
From: stuart at bmsi.com (Stuart D. Gathman)
Date: Wed, 28 Apr 2004 23:24:36 -0400 (EDT)
Subject: [Pymilter] Another bug to quash in email package
Message-ID:

Yet another message that crashes the email package. Will need to work
around in the mime package somehow. At least I have a nice example for
unit testing.

Test script:

import sys
import mime

msg = mime.MimeMessage(sys.stdin)
sys.stdout.write(msg.as_string())

$ python2 te.py

--
Stuart D. Gathman
Business Management Systems Inc. Phone: 703 591-0911 Fax: 703 591-6154
"Confutatis maledictis, flamis acribus addictis" - background song for
a Microsoft sponsored "Where do you want to go from here?" commercial.
-------------- next part --------------
From gquvet at guam.net Wed Apr 28 21:35:45 2004
Received: from 162.255.167.64 by 68.81.212.198; Thu, 29 Apr 2004 08:30:53 +0600
Message-ID:
From: "Eliza Stapleton"
Reply-To: "Eliza Stapleton"
To: jean at victim.com
Subject: Mort.gage ra.tes are decreasing
Date: Wed, 28 Apr 2004 22:33:53 -0400
MIME-Version: 1.0
Content-Type: text/html;
	boundary="--20194380884522122"
X-Originating-IP: 66.160.67.186
Received-SPF: neutral (spidey.bmsi.com: guessing: 68.81.212.198 is neither permitted nor denied by domain of guam.net)

----20194380884522122
Content-Type: text/html;
Content-Transfer-Encoding: 7Bit

ypical spam pitch.
----20194380884522122--

From stuart at bmsi.com Wed Apr 28 23:28:11 2004
From: stuart at bmsi.com (Stuart D.
Gathman)
Date: Wed, 28 Apr 2004 23:28:11 -0400 (EDT)
Subject: [Pymilter] Another bug to quash in email package
In-Reply-To:
Message-ID:

On Wed, 28 Apr 2004, Stuart D. Gathman wrote:

> import sys
> import mime
> msg = mime.MimeMessage(sys.stdin)
> sys.stdout.write(msg.as_string())

Meant to say it fails the same way with unmodified 2.3 email package:

import sys
import email

msg = email.message_from_file(sys.stdin)
sys.stdout.write(msg.as_string())

--
Stuart D. Gathman
Business Management Systems Inc. Phone: 703 591-0911 Fax: 703 591-6154
"Confutatis maledictis, flamis acribus addictis" - background song for
a Microsoft sponsored "Where do you want to go from here?" commercial.