(Moving to COFF)
On Mon, Mar 06, 2023 at 03:24:29PM -0800, Larry McVoy wrote:
But even that seems suspect, I would think they could
put some logic
in there that just doesn't feed power to the GPU if you aren't using
it but maybe that's harder than I think.
If it's not about power then I don't get it, there are tons of transistors
waiting to be used, they could easily plunk down a bunch of GPUs on the
same die so why not? Maybe the dev timelines are completely different
(I suspect not, I'm just grabbing at straws).
Other potential reasons:
1) Moving functionality off-CPU also allows for those devices to have
their own specialized video memory that might be faster (SDRAM) or
dual-ported (VRAM) without having to add that complexity to the more
general system DRAM and/or the CPU's Northbridge.
2) In some cases, having an off-chip co-processor may not need any
access to the system memory at well. An example of this is the "bump
in the wire" in-line crypto engines (ICE) which is located between the
Southbridge and the eMMC/UFS flash storage device. If you are using a
Android device, it's likely to have an ICE. The big advantage is that
it avoids needing to have a bounce buffer on the write path, where the
file system encryption layer has to copy-and-encrypt data from the
page cache to a bounce buffer, and then the encrypted block will then
get DMA'ed to the storage device.
3) From an architectural perspective, not all use cases need various
co-processors, whether it is to doing cryptography, or running some
kind of machine-learning module, or image manipulation to simulate
bokeh, or create HDR images, etc. While RISC-V does have the concept
of instructure set extensions, which can be developed without getting
permission from the "owners" of the core CPU ISA (e.g., ARM, Intel,
etc.), it's a lot more convenient for someone who doesn't need to bend
the knee to ARM, inc. (or their new corporate overloads) or Intel, to
simply put that extension outside the core ISA.
(More recently, there is an interesting lawsuit about whether it's
"allowed" to put a 3rd party co-processor on the same SOC without
paying $$$$$ to the corporate overload, which may make this point moot
--- although it might cause people to simply switch to another ISA
that doesn't have this kind of lawsuit-happy rent-seeking....)
In any case, if you don't need to play Quake with 240 frames per
second, then there's no point putting the GPU in the core CPU
architecture, and it may turn out that the kind of co-processor which
is optimized for running ML models is different, and it is often
easier to make changes to the programming model for a GPU, compared to
making changes to a CPU's ISA.
- Ted