[rfc] improving 32bit user performance/experience...
Daniel J Blueman
daniel.blueman at gmail.com
Tue May 19 17:55:22 UTC 2009
On Tue, May 19, 2009 at 1:21 AM, John Moser <john.r.moser at gmail.com> wrote:
> On Mon, May 18, 2009 at 7:24 PM, Daniel J Blueman
> <daniel.blueman at gmail.com> wrote:
> Even if we split up Ubuntu in i486 and i686, i686 gets its most major
> gains from the CMOV instruction family-- a conditional MOV instruction
> that acts as a branch-and-mov in one (compare, then either conditional
> branch and mov or use cmov). See discussion here:
> Its most major gain is a wash. Also there's no guarantee any CPU
> supports CMOV (it's an option), and thus to guarantee 100%
> compatibility we'd have to add a kernel level illegal instruction
> handler that handles the CMOV instruction family rather than throwing
> a SIGILL at the process (yes this is doable), which is very slow.
> Mind you I'm not against abandoning anything below i686 on desktops
> eventually, but some embedded systems will need i586 and the like.
> Cost-benefit analysis here.
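To make the CMOV trade-off quoted above concrete, here is a minimal sketch of the kind of code it affects. The function name select_max is hypothetical and only for illustration; whether gcc actually picks a cmov for it depends on the gcc version and its cost model, so treat the comments as what the compiler is allowed to do rather than what it is guaranteed to do.

```c
/* Hypothetical helper, not taken from any Ubuntu package. */
int select_max(int a, int b)
{
    /*
     * A plain conditional select.
     * With -march=i686 the compiler is free to lower this to
     * cmp + cmov (no branch). With -march=i586 CMOV is outside the
     * baseline, so it has to use cmp + conditional jump + mov, or
     * some other branch-free arithmetic sequence.
     */
    return (a > b) ? a : b;
}
```

Assuming a gcc with 32-bit support installed, comparing the output of "gcc -m32 -O2 -march=i586 -S" and "gcc -m32 -O2 -march=i686 -S" on this file should show whether a cmov is emitted on a given toolchain.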
I was trying to raise a more general point about the minimum spec
across the board, including the embedded and old-server hardware.
I challenge anyone to find someone using Ubuntu 8.10/9.04 on a
processor which doesn't support the full i586 instruction set (eg
i386/i486 or something with incomplete i586 support).
All older VIA processors, AMD Geode procs and so on support the full
i586 instruction set, including the MMX instructions and registers,
which can themselves provide a good win.
Also, even if 1% of users are on i586, we can still schedule
instructions for deeper pipelines (eg with -mtune=i686) for the other
99%, generating fewer stalls on more modern processors. That matters
now that the gap between core speed and memory latency is so much
wider. This gains more for i686 than it loses for i586.
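As a rough way to see that the baseline (-march) and the scheduling target (-mtune) are independent, the sketch below reports both from gcc's predefined macros. The function names are hypothetical, and the macro names (__i586__, __tune_i686__ and friends) are the ones I believe gcc defines for 32-bit x86; they are worth confirming on your own toolchain with "gcc -dM -E".

```c
/* Reports the instruction-set baseline the compiler was asked for.
 * Hypothetical helper; relies on gcc's predefined x86 macros. */
const char *arch_baseline(void)
{
#if defined(__i686__) || defined(__pentiumpro__)
    return "i686";
#elif defined(__i586__) || defined(__pentium__)
    return "i586";
#elif defined(__i486__)
    return "i486";
#elif defined(__i386__)
    return "i386";
#else
    return "not 32-bit x86";
#endif
}

/* Reports the scheduling target, which -mtune sets separately
 * from the baseline above. Only a few targets are checked here. */
const char *tune_target(void)
{
#if defined(__tune_i686__) || defined(__tune_pentiumpro__)
    return "i686";
#elif defined(__tune_i586__) || defined(__tune_pentium__)
    return "i586";
#elif defined(__tune_i486__)
    return "i486";
#elif defined(__tune_i386__)
    return "i386";
#else
    return "other or generic";
#endif
}
```

Built with something like "gcc -m32 -march=i586 -mtune=i686", the intent is that the first function reports the i586 baseline while the second reports i686 scheduling, ie the binary still runs on i586 but is scheduled for the newer parts.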
>> (does any of this apply to x86-64, eg -mtune=core2 or k8?)
> Yes but this becomes a mess. Leave it as is. gcc is good at tuning
> to general-purpose in an instruction set.
-mtune is instruction-set invariant. gcc will tune for i386
scheduling, ie fewer pipeline stages. It's later processors that have
had to optimise for short-pipeline-scheduled code, not the converse.
IA64 would have been crippled if the instruction scheduling wasn't
right from the outset.
I think we have an opportunity here; Gentoo users tackled this problem in
their way 10 years ago.
Daniel J Blueman