brad.figg at canonical.com
Wed Jul 14 16:31:38 BST 2010
On 07/13/2010 08:15 PM, Cole wrote:
> So this may be a little long winded...
> To quickly preface my thoughts I first want to state something pretty
> obvious. In a multi-tenant environment (the current direction we seem to
> be headed) I couldn't care less about some of the tools that are packaged in
> sysstat and procps. I don't care about load avg etc. for self-explanatory
> reasons, and presently I/O reporting (especially in a multi-app/multi-user
> scenario) is lacking.
> That being said, I think tools like atop, systemtap, and oprofile are good but
> present two problems. They are still tools with competition from closed-source
> companies (BMC, to name one), which will ultimately lead to discrepancies
> in collected data, and they stop short of the challenge The Linux Foundation
> has asked the community to tackle with regard to keeping the kernel relevant
> for the next 5-10 years.
> KSLM is focused purely on gathering statistics around the 5 basic principles
> of compute (CPU / memory / disk (storage) / time / IO (disk and net)) on a
> per-process basis, in a standard way across distros and CPU architectures,
> using something consistent across all implementations (the kernel itself).
> So to summarize: could KSLM be used to solve the same issue described below?
> Yep! Would it be as elegant as atop? Part of its elegance is that it's
> distro-agnostic and, if used correctly, could be used to actually do
> intelligent workload management and remediation if conditions (like long
> disk waits) are met.
> On Tue, Jul 13, 2010 at 2:04 PM, Clint Byrum <clint.byrum at canonical.com> wrote:
>> On Jun 30, 2010, at 1:10 PM, Tim Gardner wrote:
>>> You are correct in that I am reluctant to drag unmaintained crack
>>> into core kernel structures.
>>> I still find 'better task accounting' to be insufficient justification.
>>> What specifically makes for better task accounting? Why is atop better
>>> than other methods? As far as I can tell the current patches still
>>> suffer from the deficiencies mentioned by Andrew Morton in
>>> Gimme an example of a problem that atop will help solve for which no
>>> other method will suffice.
>> I just recently was contacted by a friend looking for help on periodic
>> "total site freeze" issues with a web application. Atop revealed some badly
>> behaving processes where regular top did not, because processes "in disk
>> wait" might be waiting to read/write, and with hundreds of httpd's on the
>> machine in disk wait, it's painful to try and find out what's going on. It's
>> such an instant revelation of activity that I really think, as systems scale
>> up, these sorts of tools are vital.
>> Whether atop as it is now is the way to do this remains to be decided. I
>> recall talking with Cole Crawford at UDS about KSLM which may add similar
>> capabilities to the kernel but in a more elegant way. I've CC'd Cole to get
>> his opinion on this as well.
Is there somewhere one might find more information or an implementation of
KSLM? What is the timeframe for it hitting upstream?
Brad Figg brad.figg at canonical.com http://www.canonical.com