From jchrisos at hotmail.com  Thu Sep 29 10:49:11 2011
From: jchrisos at hotmail.com (Jim Chrisos)
Date: Thu, 29 Sep 2011 09:49:11 -0500
Subject: [Pymilter] Obtaining email attachments for external processing
Message-ID:

All,

I am new to pymilter and despite writing many python scripts, I cannot, for the life of me, figure out how to obtain email attachments.

I've added a "data" function in the sample.py script since 'data' is one of the sendmail milter callbacks and in test emails I am able to get into that function. But as far as extracting attachments for parsing with other tools of mine, I'm at a loss. I've tried examples from other sample scripts such as check_attachments and message_from_file from the mime package but I still can't figure it out. I was thinking I should be able to get the attachment(s) in relatively few lines of code which I could then hand off to my other tools, but am I mistaken? Can anyone provide any samples to help me out?

Thanks in advance for any assistance anyone can provide.

Jim

From stuart at bmsi.com  Thu Sep 29 11:27:57 2011
From: stuart at bmsi.com (Stuart D. Gathman)
Date: Thu, 29 Sep 2011 11:27:57 -0400 (EDT)
Subject: [Pymilter] Obtaining email attachments for external processing
In-Reply-To:
References:
Message-ID:

On Thu, 29 Sep 2011, Jim Chrisos wrote:

> I am new to pymilter and despite writing many python scripts, I cannot, for
> the life of me, figure out how to obtain email attachments.
>
> I've added a "data" function in the sample.py script since 'data' is one of
> the sendmail milter callbacks and in test emails I am able to get into that
> function. But as far as extracting attachments for parsing with other tools
> of mine, I'm at a loss. I've tried examples from other sample scripts such
> as check_attachments and message_from_file from the mime package but I still
> can't figure it out.
> I was thinking I should be able to get the
> attachment(s) in relatively few lines of code which I could then hand off to
> my other tools, but am I mistaken? Can anyone provide any samples to help
> me out?

You can get the attachments in relatively few lines of code. The standard python email module parses MIME attachments.

The milter package (my production milter that uses pymilter and is also on sourceforge) is probably too big, but it does extensive attachment processing. Let me summarize:

o In the header callback, you get the header fields. Append those to a file
  or stringio.

o In the body callback (not the data callback, which carries no message
  content), you get the body in chunks. Append those chunks to a file or
  stringio. (Or append to a list, and join in eom.)

o Now, join the header and body with a blank line between, and you have
  the email message, ready to pass to the python email module for extracting
  attachments, etc.

If you will be dealing with malicious mail (spam), the email module is not as robust as it could be when presented with malformed MIME attachments. The mime module provided with pymilter has patches for some of those bugs that make it robust enough for my purposes.

If the above isn't enough, I can cobble some (untested) code as a more explicit example.

--
Stuart D. Gathman
Business Management Systems Inc. Phone: 703 591-0911 Fax: 703 591-6154
"Confutatis maledictis, flammis acribus addictis" - background song for
a Microsoft sponsored "Where do you want to go from here?" commercial.

From stuart at bmsi.com  Thu Sep 29 14:20:34 2011
From: stuart at bmsi.com (Stuart D. Gathman)
Date: Thu, 29 Sep 2011 14:20:34 -0400 (EDT)
Subject: [Pymilter] Obtaining email attachments for external processing
In-Reply-To:
References:
Message-ID:

On Thu, 29 Sep 2011, Stuart D. Gathman wrote:

> If the above isn't enough, I can cobble some (untested) code as a more
> explicit example.
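
For reference, the three steps summarized earlier in this thread (collect the header fields, collect the body chunks, join them with a blank line and parse) can be tried outside a milter with nothing but the standard email module. What follows is a minimal sketch in Python 3, not code from the thread: the helper names assemble_message and extract_attachments and the sample data are hypothetical, and it assumes an attachment is any non-multipart part that carries a filename. It does not include the robustness patches of the pymilter mime module.

```python
import email


def assemble_message(headers, body_chunks):
    """Join collected (name, value) headers and body chunks into raw bytes."""
    head = "".join("%s: %s\n" % (name, value) for name, value in headers)
    # A blank line separates the header block from the body.
    return head.encode("utf-8") + b"\n" + b"".join(body_chunks)


def extract_attachments(raw):
    """Return (filename, decoded_bytes) for every part that has a filename."""
    msg = email.message_from_bytes(raw)
    found = []
    for part in msg.walk():
        if part.is_multipart():
            continue  # containers hold no payload of their own
        name = part.get_filename()
        if name:
            # decode=True undoes base64 / quoted-printable encoding
            found.append((name, part.get_payload(decode=True)))
    return found


# Hypothetical data standing in for what the milter callbacks would collect.
sample_headers = [
    ("From", "sender@example.com"),
    ("Subject", "report"),
    ("MIME-Version", "1.0"),
    ("Content-Type", 'multipart/mixed; boundary="XYZ"'),
]
sample_chunks = [
    b"--XYZ\nContent-Type: text/plain\n\nSee attached.\n",
    b"--XYZ\nContent-Type: application/octet-stream\n"
    b'Content-Disposition: attachment; filename="data.bin"\n'
    b"Content-Transfer-Encoding: base64\n\naGVsbG8=\n--XYZ--\n",
]

attachments = extract_attachments(assemble_message(sample_headers, sample_chunks))
```

Each entry in attachments is a (filename, bytes) pair that can be written to a temporary file or piped to an external tool. Mail with malformed MIME may need the more forgiving mime module instead.
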
Ok, have a few minutes, here is how to collect the body:

  import StringIO
  import Milter
  import mime

  # The methods below belong inside your Milter.Base subclass.

  @Milter.noreply
  def envfrom(self,f,*str):
    self.log("mail from",f,str)
    self.fp = StringIO.StringIO()   # start a fresh buffer for each message
    self.bodysize = 0               # used by body() below
    self.ioerr = None               # first error seen while buffering
    return Milter.CONTINUE

  @Milter.noreply
  def header(self,name,hval):
    if self.fp:
      self.fp.write("%s: %s\n" % (name,hval))   # add decoded header to buffer
    return Milter.CONTINUE

  @Milter.noreply
  def eoh(self):
    if self.fp:
      self.fp.write("\n")   # terminate headers
    return Milter.CONTINUE

  @Milter.noreply
  def body(self,chunk):     # copy body to temp file
    try:
      if self.fp:
        self.fp.write(chunk)    # IOError causes TEMPFAIL in milter
        self.bodysize += len(chunk)
    except Exception as x:
      if not self.ioerr:
        self.ioerr = x
        self.log(x)
      self.fp = None        # stop buffering after an error
    return Milter.CONTINUE

  def eom(self):
    if self.fp:
      self.fp.seek(0)
      msg = mime.message_from_file(self.fp)
      # msg is an email.message.Message
      # http://docs.python.org/release/2.6.6/library/email.message.html
      ...

From oetkenky at hotmail.com  Thu Sep 29 21:26:48 2011
From: oetkenky at hotmail.com (Kyle Oetken)
Date: Fri, 30 Sep 2011 01:26:48 +0000
Subject: [Pymilter] Obtaining email attachments for external processing
In-Reply-To:
References:
Message-ID:

Stuart,

When you have a moment can you provide some more detail about the issues that can occur when using the email mime module? Also, what fixes are provided with the pymilter mime module?

Thanks,
kyle

> Date: Thu, 29 Sep 2011 11:27:57 -0400
> From: stuart at bmsi.com
> To: jchrisos at hotmail.com
> Subject: Re: [Pymilter] Obtaining email attachments for external processing
> CC: pymilter at bmsi.com
>
> On Thu, 29 Sep 2011, Jim Chrisos wrote:
>
> > I am new to pymilter and despite writing many python scripts, I cannot, for
> > the life of me, figure out how to obtain email attachments.
> >
> > I've added a "data" function in the sample.py script since 'data' is one of
> > the sendmail milter callbacks and in test emails I am able to get into that
> > function. But as far as extracting attachments for parsing with other tools
> > of mine, I'm at a loss. I've tried examples from other sample scripts such
> > as check_attachments and message_from_file from the mime package but I still
> > can't figure it out. I was thinking I should be able to get the
> > attachment(s) in relatively few lines of code which I could then hand off to
> > my other tools, but am I mistaken? Can anyone provide any samples to help
> > me out?
>
> You can get the attachments in relatively few lines of code. The standard
> python email module parses MIME attachments.
>
> The milter package (my production milter that uses pymilter and is also
> on sourceforge) is probably too big, but it does extensive attachment
> processing. Let me summarize:
>
> o In the header callback, you get the header fields. Append those to a file
>   or stringio.
>
> o In the body callback, you get the body in chunks. Append those chunks to
>   a file or stringio. (Or append to a list, and join in eom.)
>
> o Now, join the header and body with a blank line between, and you have
>   the email message, ready to pass to the python email module for extracting
>   attachments, etc.
>
> If you will be dealing with malicious mail (spam), the email module
> is not as robust as it could be when presented with malformed MIME
> attachments. The mime module provided with pymilter has patches for
> some of those bugs that make it robust enough for my purposes.
>
> If the above isn't enough, I can cobble some (untested) code as a more
> explicit example.
> _______________________________________________
> Pymilter mailing list
> Pymilter at bmsi.com
> http://www.bmsi.com/mailman/listinfo/pymilter

From stuart at bmsi.com  Thu Sep 29 21:59:10 2011
From: stuart at bmsi.com (Stuart D Gathman)
Date: Thu, 29 Sep 2011 21:59:10 -0400
Subject: [Pymilter] Obtaining email attachments for external processing
In-Reply-To:
References:
Message-ID: <4E85226E.7080106@bmsi.com>

On 09/29/2011 09:26 PM, Kyle Oetken wrote:
>
> When you have a moment can you provide some more detail about the
> issues that can occur when using the email mime module? Also, what
> fixes are provided with the pymilter mime module?
>

mime.Message extends email.message.Message. A quick review of comments in mime.py reveals (forgive the childish name calling of a popular but amazingly insecure email client - it has caused me much grief and wasted time):

1) Handle multipart attachments that are not labelled as such in ContentType. (This emulates Outhouse behaviour so we can remove Outhouse viruses that hide in unlabelled multipart attachments.)

2) Remove (ignore) trailing garbage after quoted header parameters. (Another Outhouse exploit.)

3) add ismodified() method and track modifications by attachment. (So you know whether replacebody() is required.)

4) add a headerchange method hook so you can conveniently trigger addheader/chgheader calls in your milter. (Doxygen drops the var comment - have to figure out why.)

Misc features:

mime.checkattach() walks attachments. (Current python has Message.walk() - but mine is still included for compatibility.)

mime.defang replaces files with a warning message, both attached and within zip files (and zip within zip etc), that appear to be something Outhouse might try to execute. It also removes