[Bug 339823] [NEW] qmgr process loads the system when using rate_* in custom transports

Santiago Romero sromero at gmail.com
Mon Mar 9 08:32:41 GMT 2009


Public bug reported:


Last month I had a load average issue on a Postfix mail server (it runs only the Postfix service). Suddenly the load average started to rise, and the qmgr process appeared at the top of "top", taking 20-30% of CPU.

top - 18:19:54 up 7 days,  2:03,  2 users,  load average: 4.94, 3.96, 4.02
Tasks: 144 total,   6 running, 138 sleeping,   0 stopped,   0 zombie
Cpu(s): 48.3%us, 50.7%sy,  0.0%ni,  0.0%id,  0.0%wa,  0.0%hi,  1.0%si,  0.0%st
Mem:   1035280k total,   999964k used,    35316k free,   149072k buffers
Swap:   750696k total,       88k used,   750608k free,   599308k cached

 PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
23665 postfix   20   0  5880 2628 1792 S 20.3  0.3  68:11.18 qmgr
23662 root      20   0  5392 1732 1400 R  6.0  0.2  20:49.46 master

Network traffic was low and we had the normal throughput of emails.

The queue held only 73 emails when the problem happened (just like now,
they are all deferred emails).

Doing "postfix stop" / "postfix start" solved the problem.

I reported the bug to the Postfix users mailing list, and Postfix's author
(Wietse Venema) found that it was a bug and posted a patch to the
mailing list.

 Some snippets from the list:

---------------------------------------------------------------
VICTOR DUCHOVNI:
Please wait for an updated patch, we believe we have identified the
cause and reproduced the symptoms (in that order). I have a candidate
patch, but I expect Wietse will send an updated more polished version
in the not too distant future.

The issue found applies only to "rate-limited" transports, if you are
not using such transports, you don't need the patch. The patch ensures
that work done at the completion of a delivery with a "normal" transport
is correctly split between "before suspend" and "after resume".

The original 2.5.x code is correct for "oqmgr", but not for "qmgr"
(aka "nqmgr"), which requires additional internal state adjustments
when destinations are blocked and unblocked.

---------------------------------------------------------------
WIETSE VENEMA:

To apply this patch, cd into the Postfix-2.5.* top-level source
directory and execute:

$ patch < thismessage

We were able to reproduce the scheduler looping problem, and it
does not recur with the patched version.

        Wietse


---------------------------------------------------------------
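
For anyone checking whether they are affected: a "rate-limited" transport
in the sense above is one with a non-zero destination rate delay. A minimal
sketch follows, assuming a hypothetical custom transport named "slow"; the
name and the 2s value are illustrative only and are not taken from the
affected server.

```
# /etc/postfix/master.cf
# Hypothetical custom transport "slow", a clone of the stock smtp client.
slow      unix  -       -       -       -       -       smtp

# /etc/postfix/main.cf
# Per-transport override of default_destination_rate_delay (default 0s).
# A non-zero value makes "slow" a rate-limited transport.
slow_destination_rate_delay = 2s
```

If no transport sets a *_destination_rate_delay value and
default_destination_rate_delay is left at 0s, the snippets above suggest
the patch is not needed.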


I applied the patch and the problem has not happened again, but I need the patch to be integrated into Ubuntu's postfix deb packages so that I can still benefit from future security upgrades.

The patch was posted to the mailing list on:

Date: Thu, 5 Mar 2009 17:41:51 -0500 (EST)
 

 Thanks a lot.

** Affects: postfix (Ubuntu)
     Importance: Undecided
         Status: New

-- 
qmgr process loads the system when using rate_* in custom transports
https://bugs.launchpad.net/bugs/339823
You received this bug notification because you are a member of Ubuntu
Server Team, which is subscribed to postfix in ubuntu.


