I still wonder: this X.org bug has been there since the summer, and I know everyone knows the real problem* behind it. Why isn't it being fixed in the X.org mainline or in any reasonable distro?
Because this bug only affects the Pegasos? :roll:
It affects ANY system that has multiple domains and where the VGA adapter is not in domain 0. https://bugzilla.novell.com/attachment.cgi?id=98419
Here's a good example of the bus fudging in action. Note at the bottom (where the addresses are listed) that the domain 0 devices are all on bus 0. This is correct.
69539 0 lrwxrwxrwx 1 root root 0 sze 12 14:35 /sys/bus/pci/devices/0000:00:0d.0 -> ../../../devices/pci0000:00/0000:00:0d.0
69540 0 lrwxrwxrwx 1 root root 0 sze 12 14:35 /sys/bus/pci/devices/0000:00:0c.6 -> ../../../devices/pci0000:00/0000:00:0c.6
69543 0 lrwxrwxrwx 1 root root 0 sze 12 14:35 /sys/bus/pci/devices/0000:00:0c.5 -> ../../../devices/pci0000:00/0000:00:0c.5
69546 0 lrwxrwxrwx 1 root root 0 sze 12 14:35 /sys/bus/pci/devices/0000:00:0c.4 -> ../../../devices/pci0000:00/0000:00:0c.4
69549 0 lrwxrwxrwx 1 root root 0 sze 12 14:35 /sys/bus/pci/devices/0000:00:0c.3 -> ../../../devices/pci0000:00/0000:00:0c.3
Note that the domain 1 devices are all on bus 1. This is due to the bus fudging enabled by the Linux kernel variable 'pci_assign_all_buses' which purposefully increments bus numbers globally. This is an old hack which enabled /proc filesystem entries to be on multiple domains but still be accessed through ancient PCI accessors which did not properly support domains; by looking at the bus number (which is unique globally with that flag) it could derive the domain the bus was sitting in.
69530 0 lrwxrwxrwx 1 root root 0 sze 12 14:35 /sys/bus/pci/devices/0001:01:08.1 -> ../../../devices/pci0001:01/0001:01:08.1
69533 0 lrwxrwxrwx 1 root root 0 sze 12 14:35 /sys/bus/pci/devices/0001:01:08.0 -> ../../../devices/pci0001:01/0001:01:08.0
69536 0 lrwxrwxrwx 1 root root 0 sze 12 14:35 /sys/bus/pci/devices/0001:01:00.0 -> ../../../devices/pci0001:01/0001:01:00.0
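To make the pattern in the listing easier to see, here is a minimal sketch that splits the sysfs device names into their parts and reports the lowest bus number seen in each domain. The function names are made up for illustration, and the device names are copied from the listing above. A lowest bus number above 0 in a non-zero domain is only a hint that buses are numbered globally, not proof.

```python
import re

# Matches sysfs PCI device names of the form DDDD:BB:SS.F (all hex).
PCI_ADDR = re.compile(r"^([0-9a-f]{4}):([0-9a-f]{2}):([0-9a-f]{2})\.([0-7])$")


def parse_pci_address(name):
    """Split a sysfs PCI device name into (domain, bus, slot, function)."""
    match = PCI_ADDR.match(name.lower())
    if match is None:
        raise ValueError("not a PCI address: %r" % name)
    domain, bus, slot, func = (int(part, 16) for part in match.groups())
    return domain, bus, slot, func


def lowest_bus_per_domain(names):
    """Return {domain: lowest bus number seen in that domain}."""
    lowest = {}
    for name in names:
        domain, bus, _slot, _func = parse_pci_address(name)
        lowest[domain] = min(bus, lowest.get(domain, bus))
    return lowest


# Device names taken from the sysfs listing in this post.
names = [
    "0000:00:0d.0", "0000:00:0c.6", "0000:00:0c.5",
    "0001:01:08.1", "0001:01:08.0", "0001:01:00.0",
]
print(lowest_bus_per_domain(names))
```

On a live system the names could be read with os.listdir("/sys/bus/pci/devices") instead of being hard-coded.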
The problem here is that when X.org uses sysfs to scan the buses, it then passes these values VERBATIM through domain-aware PCI accessors, which do not use the fudging table. Therefore, a device which is truly on "0001:00:08.0" (Pegasos AGP slot graphics adapter) is passed through as pci_config_read_something(1, 1, 8, 0, *value) - and no adapter exists at that address.
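The mismatch can be shown with a toy model. This is only a sketch: the dictionary stands in for the hardware as a domain-aware accessor would see it, config_read_vendor is a hypothetical accessor (not a real X.org or kernel function), and the ID value is a placeholder.

```python
# Toy model of the hardware as a domain-aware accessor sees it, keyed by
# (domain, bus, slot, function). Bus numbers here are local to each domain.
# The value 0x1234 is a placeholder, not a real vendor ID.
HARDWARE = {
    (0, 0, 0x0C, 3): 0x1234,
    (1, 0, 0x08, 0): 0x1234,  # AGP graphics adapter, truly at 0001:00:08.0
}


def config_read_vendor(domain, bus, slot, func):
    """Hypothetical domain-aware accessor; returns None if nothing answers."""
    return HARDWARE.get((domain, bus, slot, func))


# With global bus numbering, sysfs names the adapter 0001:01:08.0.
sysfs_view = (1, 1, 8, 0)
# The address the domain-aware accessor actually needs.
true_view = (1, 0, 8, 0)

found_verbatim = config_read_vendor(*sysfs_view)  # nothing at this address
found_true = config_read_vendor(*true_view)       # the adapter answers here
```

Passing the sysfs numbers through unchanged looks up a slot that is empty in this model, which is the failure described above.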
There could be many fixes. You could turn off pci_assign_all_buses in the PCI code, which would return all bus numbers to normal. Sysfs would then report the 'correct' verbatim values to pass through to domain-aware PCI accessors.
A hack might be to stop using domain-aware PCI accessors, and use the Linux bus fudging non-domain-aware functionality instead (this is what the /proc interface does, as the /proc interface does not list PCI domain numbering).
A second hack might be to detect that buses are numbered globally (sysfs/sysctl variable?) and restore the per-domain bus numbers before passing in values.
Another hack might be to fudge the domain-aware PCI accessors so that they also access the fudging table, but this might break a thousand things.
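As a rough sketch of the detect-and-restore hack mentioned above: the function below renumbers buses so that each domain starts again at bus 0. It rests on two assumptions that may not hold on real systems: that the lowest bus number seen in a domain is that domain's root bus, and that the fudging is a constant offset per domain. The function name is made up for illustration.

```python
def restore_local_buses(addresses):
    """Renumber buses so that each domain's lowest bus becomes bus 0.

    addresses: iterable of (domain, bus, slot, function) tuples as parsed
    from sysfs. Assumes the fudge is a constant per-domain offset.
    """
    addresses = list(addresses)
    base = {}
    for domain, bus, _slot, _func in addresses:
        base[domain] = min(bus, base.get(domain, bus))
    return [(d, b - base[d], s, f) for d, b, s, f in addresses]


# Addresses as sysfs reports them with global bus numbering.
fudged = [(0, 0, 0x0D, 0), (1, 1, 0x08, 0), (1, 1, 0x00, 0)]
print(restore_local_buses(fudged))
```

If the firmware legitimately numbers a root bus above 0, this would renumber it wrongly, which is part of why this is a hack and not a fix.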
All in all, Linux and X.org are both doing this wrong, and both of them need to be fixed in tandem. Unfortunately, it seems the likelihood of this bug being fixed correctly, so that PCI domains are supported in the clean, simple manner they could be, is hampered by a development process of "throw things at the mainline and hope it doesn't break" rather than "how should this be done and what components do we need to overhaul to do it".
I don't think the work is truly that difficult to spec and design :(
I wrote a tutorial about Ubuntu Feisty installation on Pegasos; I'll send it to Geoffrey for the next Pegasos Book.
I would much prefer, in real life, that we had:
* proper Yaboot support (I don't see why this is so difficult, to be honest; there are patches to make it work, even if our firmware 'does the wrong thing')
* this X.org bug fixed, or the pci_assign_all_buses stuff removed from the Linux kernels on PowerPC, or something done to alleviate the trouble
I'd like to propose a test; can someone install Ubuntu and recompile their kernel so that pci_assign_all_buses (it's in powerpc/platforms/chrp/pci.c or something similar) is turned off and see if X.org works? I don't have a Pegasos here to truly test this and my Efika has no PCI devices.
Matt Sealey, Genesi USA Inc.
Product Development Analyst