atop

Brad Figg brad.figg at canonical.com
Wed Jul 14 16:31:38 BST 2010


On 07/13/2010 08:15 PM, Cole wrote:
> So this may be a little long winded...
>
> To quickly preface my thoughts I first want to state something pretty
> obvious.  In a multi tenant environment ( the current direction we seem to
> be headed ) I couldn't care less about some of the tools that are packaged in
> sysstat and procps.  I don't care about load avg etc for self explanatory
> reasons and presently io reporting (especially in a multi app/multi user
> scenario) is lacking.
>
> That being said I think tools like atop, systemtap, oprofile are good but
> present 2 problems.  They are still tools with competition from closed
> source companies ( BMC to name 1) that will ultimately lead to discrepancies
> in collected data and they stop short of the challenge The Linux Foundation
> has asked the community to tackle with regard to keeping the kernel relevant
> for the next 5-10 years.
>
> KSLM is focused purely on gathering statistics around the 5 basic principles
> of compute ( cpu / memory / disk (storage) / time / IO (disk and net) ) on a
> per process basis in a standard way across distros and cpu architectures
> using a consistent thing across all implementations (the kernel itself).
>
> So to summarize, could kslm be used to solve the same issue described below,
> yep!  Would it be as elegant as atop?  Part of its elegance is that it's
> distro agnostic and if used correctly, could be used to actually do
> intelligent workload management and remediation if conditions (like long
> disk waits) are met.
>
> Cole
>
>
> On Tue, Jul 13, 2010 at 2:04 PM, Clint Byrum <clint.byrum at canonical.com> wrote:
>
>>
>> On Jun 30, 2010, at 1:10 PM, Tim Gardner wrote:
>>>
>>> You are correct in that I am reluctant to drag in unmaintained crack
>>> into core kernel structures.
>>>
>>> I still find 'better task accounting' to be insufficient justification.
>>> What specifically makes for better task accounting? Why is atop better
>>> than other methods? As far as I can tell the current patches still
>>> suffer from the deficiencies mentioned by Andrew Morton in
>>> http://marc.info/?l=linux-kernel&m=120716470803492&w=2
>>>
>>> Gimme an example of a problem that atop will help solve for which no
>>> other method will suffice.
>>>
>>
>> I just recently was contacted by a friend looking for help on periodic
>> "total site freeze" issues with a web application. Atop revealed some badly
>> behaving processes where regular top did not, because processes "in disk
>> wait" might be waiting to read/write, and with hundreds of httpd's on the
>> machine in disk wait, it's painful to try and find out what's going on. It's
>> such an instant revelation of activity, I really think as systems scale up
>> these sorts of tools are really vital.
>>
>> Whether atop as it is now is the way to do this remains to be decided. I
>> recall talking with Cole Crawford at UDS about KSLM which may add similar
>> capabilities to the kernel but in a more elegant way. I've CC'd Cole to get
>> his opinion on this as well.
>>
>>
>

Is there somewhere one might find more information or an implementation of
KSLM? What is the timeframe for it hitting upstream?

Brad
-- 
Brad Figg brad.figg at canonical.com http://www.canonical.com


