From rnd at onego.ru Sat Dec 25 02:29:45 2004
From: rnd at onego.ru (Roman Suzi)
Date: Sat, 25 Dec 2004 10:29:45 +0300 (MSK)
Subject: [Pymilter] Number of threads
Message-ID:

Hi,

my milter is experiencing a number of threads problem (thread_create() fails
after 1020 threads are created, on RedHat 7.3, Python 2.3.4). It seems to
happen in milter C code, not in Python milter class.

What can I do to diagnose the problem before it is too late?
It seems that spammers know how to DoS all kinds of milter
by making empty connects or something like that to overflow
the number of threads.

I'd like to call a Python function to get the number of
already created threads and if the number is, say > 900,
stop accepting some connections.

Thank you!

Merry Christmas to all!

Sincerely yours, Roman Suzi
--
rnd at onego.ru =\= My AI powered by GNU/Linux RedHat 7.3

From stuart at bmsi.com Mon Dec 27 08:35:15 2004
From: stuart at bmsi.com (Stuart D. Gathman)
Date: Mon, 27 Dec 2004 08:35:15 -0500 (EST)
Subject: [Pymilter] Number of threads
In-Reply-To:
Message-ID:

On Sat, 25 Dec 2004, Roman Suzi wrote:

> my milter is experiencing a number of threads problem (thread_create() fails
> after 1020 threads are created, on RedHat 7.3, Python 2.3.4). It seems to
> happen in milter C code, not in Python milter class.

What version of pymilter are you running? Beginning with 0.6.7, runmilter()
should throw a milter.error("out of thread resources").

> What can I do to diagnose the problem before it is too late?
> It seems that spammers know how to DoS all kinds of milter
> by making empty connects or something like that to overflow
> the number of threads.

I had a similar problem. It turned out that certain Python threads
were taking a long time due to locking issues with a bayesian filter.

Sendmail already throttles the connect rate, and as long as the average
elapsed time of a milter thread stays under what sendmail expects, then
there is no problem.
I solved my locking problem by using a Python mutex to prevent multiple
milter threads from competing for the external file lock, and by skipping
bayesian content filtering for large messages.

> I'd like to call a Python function to get the number of
> already created threads and if the number is, say > 900,
> stop accepting some connections.

import threading

if threading.activeCount() > 900:
  coverEars()

activeCount is somewhat expensive, since it counts the enumeration of
active threads.

In your Milter derived class, you could also increment a count in connect(),
and decrement in close().

I would focus on discovering what is making the milter threads take so long.

--
Stuart D. Gathman
Business Management Systems Inc. Phone: 703 591-0911 Fax: 703 591-6154
"Confutatis maledictis, flamis acribus addictis" - background song for
a Microsoft sponsored "Where do you want to go from here?" commercial.

From rnd at onego.ru Mon Dec 27 08:53:39 2004
From: rnd at onego.ru (Roman Suzi)
Date: Mon, 27 Dec 2004 16:53:39 +0300 (MSK)
Subject: [Pymilter] Number of threads
In-Reply-To:
References:
Message-ID:

On Mon, 27 Dec 2004, Stuart D. Gathman wrote:

Thank you for your reply, Stuart!

> On Sat, 25 Dec 2004, Roman Suzi wrote:
>
> > my milter is experiencing a number of threads problem (thread_create() fails
> > after 1020 threads are created, on RedHat 7.3, Python 2.3.4). It seems to
> > happen in milter C code, not in Python milter class.
>
> What version of pymilter are you running? Beginning with 0.6.7, runmilter()
> should throw a milter.error("out of thread resources").

I am using 0.7.0, highly customized.

> > What can I do to diagnose the problem before it is too late?
> > It seems that spammers know how to DoS all kinds of milter
> > by making empty connects or something like that to overflow
> > the number of threads.
>
> I had a similar problem. It turned out that certain Python threads
> were taking a long time due to locking issues with a bayesian filter.
I do not use a bayesian filter.

> Sendmail already throttles the connect rate, and as long as the average
> elapsed time of a milter thread stays under what sendmail expects, then
> there is no problem.
>
> I solved my locking problem by using a Python mutex to prevent multiple
> milter threads from competing for the external file lock, and by skipping
> bayesian content filtering for large messages.
>
> > I'd like to call a Python function to get the number of
> > already created threads and if the number is, say > 900,
> > stop accepting some connections.
>
> import threading
>
> if threading.activeCount() > 900:
>   coverEars()
>
> activeCount is somewhat expensive, since it counts the enumeration of
> active threads.
>
> In your Milter derived class, you could also increment a count in connect(),
> and decrement in close().

Good idea, thanks.

> I would focus on discovering what is making the milter threads take so long.

Mail comes in bursts. Sometimes milter is dead in 3 minutes, because
30 * 60 * 3 = 5400 - which is more than five times 1020.
(connection throttle in sendmail = 30)

Sincerely yours, Roman A.Suzi
--
- Petrozavodsk - Karelia - Russia - mailto:rnd at onego.ru -

From stuart at bmsi.com Mon Dec 27 11:37:45 2004
From: stuart at bmsi.com (Stuart D. Gathman)
Date: Mon, 27 Dec 2004 11:37:45 -0500 (EST)
Subject: [Pymilter] Number of threads
In-Reply-To:
Message-ID:

On Mon, 27 Dec 2004, Roman Suzi wrote:

> > Sendmail already throttles the connect rate, and as long as the average
> > elapsed time of a milter thread stays under what sendmail expects, then
> > there is no problem.

> > I would focus on discovering what is making the milter threads take so long.

> Mail comes in bursts. Sometimes milter is dead in 3 minutes, because
> 30 * 60 * 3 = 5400 - which is more than five times 1020.
> (connection throttle in sendmail = 30)

Sendmail limits the number of sendmail processes receiving mail. There
can be only one milter thread per sendmail process - unless something
goes wrong.
That something is a milter timeout.

In sendmail.mc, you should have a line looking something like this:

INPUT_MAIL_FILTER(`pythonfilter', `S=local:/var/run/milter/pythonsock, F=T, T=C:5m;S:30s;R:5m;E:5m')

The R timeout is 5 minutes. That means, if a milter thread takes longer
than 5 minutes to send a response to sendmail, sendmail will give up,
return a 451 to the sending MTA, and start a new connection. This creates yet
another python thread, making it yet more likely that sendmail will give
up on a python thread. This is how python ends up having more milter
threads than sendmail processes. A similar problem can result from the
other timeouts.

One solution is to implement the progress method (define _FFR_SMFI_PROGRESS
when compiling sendmail and miltermodule.c). This call lets you keep
sendmail waiting as long as you like.

Another solution is to increase the timeout in the milter definition to
match your application.

A third solution is to fix the application so it doesn't take so long.

I would check the sendmail log for milter timeout errors - this should
let you know which timeout sendmail is hitting, and either increase it
or adjust your application.

--
Stuart D. Gathman
Business Management Systems Inc. Phone: 703 591-0911 Fax: 703 591-6154
"Confutatis maledictis, flamis acribus addictis" - background song for
a Microsoft sponsored "Where do you want to go from here?" commercial.

From rnd at onego.ru Mon Dec 27 13:35:40 2004
From: rnd at onego.ru (Roman Suzi)
Date: Mon, 27 Dec 2004 21:35:40 +0300 (MSK)
Subject: [Pymilter] Number of threads
In-Reply-To:
References:
Message-ID:

On Mon, 27 Dec 2004, Stuart D. Gathman wrote:

>On Mon, 27 Dec 2004, Roman Suzi wrote:
>
>> > Sendmail already throttles the connect rate, and as long as the average
>> > elapsed time of a milter thread stays under what sendmail expects, then
>> > there is no problem.
>
>> > I would focus on discovering what is making the milter threads take so long.
>
>In sendmail.mc, you should have a line looking something like this:
>
>INPUT_MAIL_FILTER(`pythonfilter', `S=local:/var/run/milter/pythonsock, F=T, T=C:5m;S:30s;R:5m;E:5m')

I have these timeouts right now:

T=C:5m;S:1m;R:1m;E:5m

>Another solution is to increase the timeout in the milter definition to
>match your application.
>
>A third solution is to fix the application so it doesn't take so long.
>
>I would check the sendmail log for milter timeout errors - this should
>let you know which timeout sendmail is hitting, and either increase it
>or adjust your application.

It is usually this:

Dec 27 19:44:26 mx sendmail[11422]: iBRGhQSD011422: Milter (mymilter):
timeout before data read

And yes, probably milter does too much. I need to check that. It usually takes
4 s to process a message but sometimes I see 30 s and more.

Sincerely yours, Roman Suzi
--
rnd at onego.ru =\= My AI powered by GNU/Linux RedHat 7.3

From stuart at bmsi.com Mon Dec 27 14:36:41 2004
From: stuart at bmsi.com (Stuart D. Gathman)
Date: Mon, 27 Dec 2004 14:36:41 -0500 (EST)
Subject: [Pymilter] Number of threads
In-Reply-To:
Message-ID:

On Mon, 27 Dec 2004, Roman Suzi wrote:

> I have these timeouts right now:
>
> T=C:5m;S:1m;R:1m;E:5m
>
> >Another solution is to increase the timeout in the milter definition to
> >match your application.
> >
> >A third solution is to fix the application so it doesn't take so long.
> >
> >I would check the sendmail log for milter timeout errors - this should
> >let you know which timeout sendmail is hitting, and either increase it
> >or adjust your application.
>
> It is usually this:
>
> Dec 27 19:44:26 mx sendmail[11422]: iBRGhQSD011422: Milter (mymilter):
> timeout before data read
>
> And yes, probably milter does too much. I need to check that. It usually takes
> 4 s to process a message but sometimes I see 30 s and more.

OK, sendmail is timing out on read. Your read timeout is 1 minute (R:1m).
That means that sometimes, your milter thread takes more than 1 minute
(60 secs). Whenever it takes that long, that thread will be an extra
thread - disconnected from sendmail. When it eventually gets around to
responding to sendmail, it will get an abort.

You could log the elapsed time since the last libmilter call to get an idea of
what to set the read timeout to (or to give a clue to some bug that is
hanging the milter thread).

I would immediately increase the read timeout to 5 minutes, and see if that
helps before doing more debugging.

--
Stuart D. Gathman
Business Management Systems Inc. Phone: 703 591-0911 Fax: 703 591-6154
"Confutatis maledictis, flamis acribus addictis" - background song for
a Microsoft sponsored "Where do you want to go from here?" commercial.
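
Two suggestions from this thread can be sketched in plain Python: keeping a
connection count that is incremented in connect() and decremented in close()
(instead of calling threading.activeCount()), and logging the elapsed time
since the previous callback to see how close a thread gets to the read
timeout. This is only a rough sketch, not pymilter code: it does not import
pymilter, and the class name ConnectionGauge, its methods, and the injectable
clock are hypothetical names made up for illustration. It also assumes that
close() runs exactly once for every connection that was counted.

```python
import threading
import time


class ConnectionGauge:
    """Hypothetical helper: counts open connections and times callbacks."""

    def __init__(self, limit=900, clock=time.monotonic):
        self._lock = threading.Lock()   # milter callbacks run in many threads
        self._active = 0
        self.limit = limit              # 900 is the figure used in the thread
        self._clock = clock             # injectable so it can be tested
        self.slowest = 0.0              # longest gap seen between callbacks

    def enter(self):
        """Call from connect(). Returns False once the limit is exceeded."""
        with self._lock:
            self._active += 1
            return self._active <= self.limit

    def leave(self):
        """Call from close() (assumed to run once per counted connection)."""
        with self._lock:
            self._active -= 1

    @property
    def active(self):
        with self._lock:
            return self._active

    def mark(self):
        """Return a timestamp to keep on the per-connection object."""
        return self._clock()

    def lap(self, last_mark):
        """Return (elapsed_seconds, new_mark) and remember the slowest gap."""
        now = self._clock()
        elapsed = now - last_mark
        with self._lock:
            if elapsed > self.slowest:
                self.slowest = elapsed
        return elapsed, now
```

In a Milter derived class, connect() would call enter() and refuse the
connection when it returns False, close() would call leave(), and each later
callback would call lap() with the mark saved by the previous one and log the
result. Comparing the logged gaps with the R timeout in INPUT_MAIL_FILTER
shows whether the timeout or the application needs to change.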