In fact, all my assumptions are based on the fact of an INTERNAL LCD controller unit, NOT a PCI/AGP based one. I'm not talking about PCs but SoCs. The point is precisely that the LCD controller uses the SAME memory and the same bus the main core uses. That means the LCD controller is "stealing" clock cycles for each pixel drawn on screen.
It does not steal clock cycles; it is granted them by the MPX bus. Even with a relatively high-resolution screen it could take at most a quarter of the bandwidth of the entire MPX bus, and only if it were requesting 64-bit pixels at the maximum rate possible; no supported display resolution actually requires that kind of bandwidth. In fact it barely needs an eighth of the MPX bus bandwidth to do it, which, I think you will find, is fairly normal utilization for any single unit on the chip.
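To put those fractions in perspective, here is a back-of-the-envelope sketch. The ~5GB/s burst figure is the one I cite later in this post, and the 1080p60 pixel rate is a deliberately pessimistic stand-in for "the maximum rate possible" - both are illustrative assumptions, not datasheet values:

```python
# Rough fractions of MPX bus bandwidth the DIU could consume.
# All figures are illustrative ballpark numbers, not datasheet values.

bus_burst_bw = 5e9             # ~5 GB/s when bursting (figure cited in this post)
pixel_rate = 1920 * 1080 * 60  # ~124.4 Mpixels/s, a pessimistic 1080p60 rate

worst_case = pixel_rate * 8    # 64-bit pixels: ~995 MB/s
typical = pixel_rate * 4       # 32-bit pixels:  ~497 MB/s

print(worst_case / bus_burst_bw)  # ~0.20 - on the order of a quarter of the bus
print(typical / bus_burst_bw)     # ~0.10 - on the order of an eighth of the bus
```

Even the absurd worst case sits around a fifth to a quarter of the burst bandwidth, and a realistic pixel format is down near a tenth.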
The OCeaN bus the rest of the chip sits on is designed for high levels of transaction concurrency, has a non-blocking crossbar, and provides significant per-port bandwidth to each connected unit.
The MPX coherency module has plenty of buffer space (eight cache lines each for reads and writes), can read data from the CPU's L2 cache back into the buffers, and is very low-latency.
It is actually very hard to imagine that you would be flooding the system bus with MPEG4 data. 1920x1080@60Hz with 4-byte RGB pixels requires 497MB/s of write bandwidth to memory; but remember the MPEG is only 30 frames per second, so it's actually half that. Add perhaps 10Mbit/s of bandwidth for reading and maintaining the compressed stream at that resolution, plus decoding overhead, disk drive and network activity, and audio, and you still have plenty left over on a bus capable of some 5GB/s when bursting.
The DIU requires less than 64MB/s of actual bandwidth - less than 46MB/s in 24-bit RGB mode - to display that data. How is it "stealing" so much bandwidth as to make this an impossible challenge? You would have to be running something SO incredibly bandwidth-intensive, and the chip would have to have been designed by a moron, with far less bandwidth than each unit alone requires, let alone more than one of them working in tandem.
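Raw framebuffer bandwidth is just width x height x bytes-per-pixel x frame rate; a minimal sketch reproducing the 1080p figures above (the helper name is mine, not anyone's API):

```python
def fb_bandwidth(width, height, bytes_per_pixel, fps):
    """Raw framebuffer bandwidth in bytes per second."""
    return width * height * bytes_per_pixel * fps

# Decoder writing 4-byte RGB frames at 1080p:
print(fb_bandwidth(1920, 1080, 4, 60) / 1e6)  # ~497.7 MB/s at 60 fps
print(fb_bandwidth(1920, 1080, 4, 30) / 1e6)  # ~248.8 MB/s at 30 fps MPEG
```

The same formula, at the far smaller resolutions the DIU actually drives, is what puts its display refresh down in the tens of megabytes per second.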
you should see a difference, based on the extra clock cycles available to the core, not used by the LCD controller.
I don't think you would. Unless you are already saturating the bus (practically impossible given the architecture of the bus), you would never notice it.
There would be a problem if you were doing extremely heavy DMA operations and ran out of bus bandwidth; we see that on the Efika when many BestComm tasks run at once and CPU load is extremely high - the bus does get contended, and prioritisation is key to performance there. We are going to see the same problem with the MPC5121E and the DIU, where performance may actually suffer for using internal graphics; but only at extremely high resolutions. Of course, the MPC5121E cannot support those resolutions :)
A 1GHz Pentium M can decode 720p H.264 video in the Apple TV, with modest hardware assistance from an nVidia GeForce chip. It can also downscale 1080p video to 720p. It has an external memory bus and a northbridge, not a low-latency internal bus, and the graphics have to cross the PCI Express bus to the graphics chip - with all the latency that involves. Yet the Apple TV manages it very, very easily.
I am fairly sure, given the performance of the Pentium M in the laptop I have here, that a slightly faster chip could decode 1080p video; because with power management on (running at 600-1000MHz), I can decode MPEG2 and MPEG4 1080p video (and scale it down to 720p, my screen isn't that big) without skips, on Windows, with an *open source codec*.
The MPEG4 demonstration here is running *without* the overhead of an OS like Linux. It plays MPEG4 files, and it's a specially optimized codec for commercial use. It is ABSOLUTELY possible. The DIU does not make a dent in the bandwidth required and the chip itself is up to the task. With some clever management of the system resources, it could even be possible inside an OS like Linux.
DivX and XviD are MPEG4-ASP (Advanced Simple Profile) implementations, while H.264 is MPEG4 AVC (Advanced Video Coding), and they are computationally equivalent DEPENDING ON THE CHOSEN PROFILE.
Except they are not. Not at all.
Can I pass you a CD with Windows NT 4.0 for PowerPC? It included NetMeeting, with the world-famous MP43 codec that got hacked and made DivX (and the rest) possible.
I already have that CD, and let me just ask: do you think Microsoft have been painstakingly maintaining their PowerPC codebase since 1996, adding AltiVec support, simultaneous multithreading, 64-bit awareness, DirectShow support, in case they ever made a high-def games console?
The answer is: no. Most of the codec support for the XBox360 has been written for the XBox360, for best performance on the XBox360 - presumably from scratch, and not from a Video for Windows codec from 1996. The more likely path they took is to take the C reference implementation of their WMV codec, recompile it for POWER, and then add in the optimisations where needed.
They still have some way to go. It is not in any way some kind of "proof" that you need a 3-core, 6-way multithreading, 3.2GHz 64-bit chip with improved AltiVec to decode MPEG4 video. The G4 is a significantly different - actually far more efficient - design, and there is plenty of scope in the chip to do the demo.
If you still think it's fake, then feel free to bash your head against the wall some more with your inaccurate assumptions. I can assure you it is not fake, it is possible, and we'll just have to go as far as proving it.
Matt Sealey, Genesi USA Inc.
Product Development Analyst