[Bug 859311] Re: Wrong cpu-time calculation for long-running multi-threaded processes
Matthew L. Dailey
matthew.l.dailey at dartmouth.edu
Thu Feb 2 18:56:04 UTC 2012
Just a "me too" on this one. I ran into this on an Ubuntu 10.04.3 system
with some MATLAB jobs that have been running for a couple of weeks.
# ps -f -C MATLAB
UID PID PPID C STIME TTY TIME CMD
1234 21267 21256 99 Jan09 pts/3 1184016086-14:43:06 /opt/matlabR2011b/bin
1234 21578 21567 97 Jan09 pts/5 22-23:06:31 /opt/matlabR2011b/bin/glnxa64
1234 21688 21666 96 Jan09 pts/6 22-17:46:41 /opt/matlabR2011b/bin/glnxa64
1234 21786 21775 89 Jan09 pts/7 21-05:12:21 /opt/matlabR2011b/bin/glnxa64
1234 21884 21873 99 Jan09 pts/9 1184016084-11:35:28 /opt/matlabR2011b/bin
# cat /proc/21267/stat
21267 (MATLAB) S 21256 21267 21256 34819 21267 4202496 48728480602 10279 1 0 85619631648 4611685975611267009 16 1 20 0 49 0 176768180 3195682816 527162 18446744073709551615 4194304 4206714 140735315194304 140735315193832 140009299867740 0 134742022 0 151526638 18446744073709551615 0 0 17 15 0 0 0 0 0
# cat /proc/21884/stat
21884 (MATLAB) S 21873 21884 21873 34825 21884 4202496 47007338124 10280 1 0 85595226373 4611685975617266477 6 1 20 0 49 0 176794854 3336249344 558780 18446744073709551615 4194304 4206714 140734232928064 140734232927592 140313580697692 0 134742022 0 151526638 18446744073709551615 0 0 17 15 0 0 0 0 0
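For anyone who wants to check the arithmetic, here is a rough Python sketch that recomputes the TIME column from the stat lines above. It assumes utime and stime are fields 14 and 15 of /proc/PID/stat (as documented in proc(5)), that the tick rate is 100 per second (an assumption; check `getconf CLK_TCK`), and that the day count gets truncated to 32 bits somewhere along the way. That last part is only a guess that happens to reproduce the numbers; I have not confirmed it against the procps source. The function names are my own.

```python
USER_HZ = 100  # assumption: clock ticks per second, see `getconf CLK_TCK`

# First 15 fields of the two stat lines above (the rest is not needed here).
STAT_21267 = ("21267 (MATLAB) S 21256 21267 21256 34819 21267 4202496 "
              "48728480602 10279 1 0 85619631648 4611685975611267009")
STAT_21884 = ("21884 (MATLAB) S 21873 21884 21873 34825 21884 4202496 "
              "47007338124 10280 1 0 85595226373 4611685975617266477")


def parse_cpu_ticks(stat_line):
    """Return (utime, stime) in clock ticks from a /proc/PID/stat line."""
    # The command name (field 2) may contain spaces, so split after the last ')'.
    rest = stat_line.rsplit(")", 1)[1].split()
    # rest[0] is field 3 (state), so field 14 is rest[11] and field 15 is rest[12].
    return int(rest[11]), int(rest[12])


def ps_style_time(utime, stime, hz=USER_HZ):
    """Format utime + stime as DAYS-HH:MM:SS, mimicking the ps TIME column."""
    total_ticks = (utime + stime) % 2**64   # unsigned 64-bit sum
    seconds = total_ticks // hz
    days, rem = divmod(seconds, 86400)
    hours, rem = divmod(rem, 3600)
    minutes, secs = divmod(rem, 60)
    days32 = days % 2**32                   # assumption: 32-bit truncation of days
    return f"{days32}-{hours:02d}:{minutes:02d}:{secs:02d}"


print(ps_style_time(*parse_cpu_ticks(STAT_21267)))  # 1184016086-14:43:06
print(ps_style_time(*parse_cpu_ticks(STAT_21884)))  # 1184016084-11:35:28
```

Both results match the ps output above, so ps appears to be faithfully reporting what the kernel exposes. The stime values sit just below 2^62 ticks, which no real process could have accumulated, so the bad numbers seem to originate in /proc/PID/stat rather than in ps itself.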
For what it's worth, htop seems to work properly and shows the CPU usage
of these processes.
Here's some system info:
# uname -a
Linux myhost 2.6.32-37-generic #81-Ubuntu SMP Fri Dec 2 20:32:42 UTC 2011 x86_64 GNU/Linux
# cat /etc/lsb-release
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=10.04
DISTRIB_CODENAME=lucid
DISTRIB_DESCRIPTION="Ubuntu 10.04.3 LTS"
Let me know if any other info would be helpful while these jobs are
still running. :-)
--
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to procps in Ubuntu.
https://bugs.launchpad.net/bugs/859311
Title:
Wrong cpu-time calculation for long-running multi-threaded processes
Status in “procps” package in Ubuntu:
Confirmed
Bug description:
There seems to be a problem with the calculation of cpu-time for long-
running multi-threaded processes. On a 2-way Xeon X5660 system, ps
reports a cpu-time of 1184018577 days for a process that has been
running for 5 days with 11 threads:
ps -f -C ustacks
UID PID PPID C STIME TTY TIME CMD
cul07b 13246 10866 99 Sep21 pts/0 1184018577-00:27:06 /home/cul07b/stacks/bin/ustacks -t fasta -f BN_pair1_mod.fasta -o ../results -i 2 -d -r -m 5 -M 2 -p 11
The corresponding /proc/13246/stat file looks like this:
13246 (ustacks) R 10866 10866 10861 34816 10866 4202496 2989827 0 0 0 85771819844 4611685996976182811 0 0 20 0 11 0 9806481 12440649728 2894242 18446744073709551615 1 1 0 0 0 0 0 0 0 18446744073709551615 0 0 17 3 0 0 0 0 0
top reports the same process with 0% cpu load although it is still running full-throttle with 11 threads:
13246 cul07b 20 0 11.6g 11g 1412 R 0 23.4 178668,22 ustacks
I saw similar issues with a process running for a couple of days with
48 threads on a 4-way AMD Opteron 6172 system.
I think the same bug has already been reported on bugs.debian.org:
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=641905
ProblemType: Bug
DistroRelease: Ubuntu 10.04
Package: procps 1:3.2.8-1ubuntu4
ProcVersionSignature: Ubuntu 2.6.32-32.62-server 2.6.32.38+drm33.16
Uname: Linux 2.6.32-32-server x86_64
Architecture: amd64
Date: Mon Sep 26 11:32:31 2011
ProcEnviron:
SHELL=/bin/bash
PATH=(custom, user)
LANG=en_AU.UTF-8
SourcePackage: procps
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/procps/+bug/859311/+subscriptions