I think that posting the slides for a talk, without any supplemental information (a document, a video), isn't a very good way of communicating the talk's content after the fact. Good slides shouldn't carry the words the speaker says, and what the speaker says shouldn't appear verbatim on the slides. In my own slides, I'll often try to have just an image that I then talk to. Therefore, since I'm posting the slides, here is an outline of the arguments I made in my [2017 i3d keynote](https://pharr.org/matt/mobilevr.pdf).

Context (slides 3-7)
=========================================================

- For a while, starting with programmable GPUs, we got more and more graphics available. It was glorious. Sure, you might have to plug multiple power cables into your GPU, but who cares?

- Larrabee was the peak of that mindset: "sure, it's not *quite* as power efficient as a regular GPU, but that's cool, we'll just use more power." (And Kayvon made a funny comment after I posted a picture of an air filtration system at Intel's Israel office.)

- Then everyone decided they wanted phones and laptops with batteries that lasted all day, and the power party was over. It was all about saving power, and there was little motivation to put more graphics on mobile devices, since the power it used wasn't worth it.

- Now, with the arrival of mobile VR devices (Daydream, GearVR), that's changing, and graphics on mobile matters a lot. So we've got that going for us.

VR beyond games (slides 9-14)
=============================================

- Another reason to be excited about mobile VR is the possibility of reaching more or less the whole world.

- If VR ends up being just another gaming peripheral, that'd be a loss.

- Apps like Google Earth, VR journalism, and Arts and Culture can be incredibly compelling.
For Earth and Arts and Culture, all of the underlying content has been available for years, but it's so much more compelling in VR than in a web browser on a regular computer screen.

Main arguments in the following
====================================

1. However, if VR doesn't sell hundreds of millions of devices a year, it will fail; it won't be possible to manufacture all of the specialized components needed at an acceptable cost.

2. The only way it will get to that point is if VR on mobile-class (think: battery-powered) devices is successful. Desktop doesn't provide sufficient economies of scale.

3. VR on mobile is much harder than on desktop. You've got the same number of pixels to light up at the same high framerates, but a CPU and GPU that are 1/10 to 1/20 as powerful as desktop CPUs/GPUs.

4. A few years of Moore's law isn't going to save us; Moore's law is slowing down, and in any case it's about transistors, not about power consumption. Power consumption (and heat dissipation) also places significant constraints on how much better mobile GPUs can get.

5. There is some hope (don't leave yet): exotic HW architectures that didn't make sense in the past may make more sense in this new world, and there's more reason and opportunity for programmers to specialize deeply.

Economies of scale (slides 15-19)
=====================================

- VR got to this point by leveraging economies of scale from mobile phones---screens, gyros, and other sensors are made in mass quantities and are fairly inexpensive. (Yes, there's stuff like Lighthouse that doesn't fit into that bucket.)

- 1.4B smartphones were sold in 2016 (450M of them high-end). That's a lot of displays, and that volume makes those displays relatively cheap.
- This is also part of why current-generation consoles are basically PCs (in HW architecture, at least); it doesn't make sense for Sony to develop a custom CPU or GPU for a console; it would cost a lot more and wouldn't be better enough (if better at all) to merit the cost.

- VR today is roughly 1/1000 the smartphone market. Of course, it's early days, but even if you assume every desktop PC user with a $200 (mid-range) GPU buys an HMD, that's still just ~25M a year--20x less than high-end smartphones.

Example: Displays (slides 20-23)
=====================================

- Consider the sort of displays we want for VR: bajillions of pixels, light field displays; it's going to be awesome.

- No mobile phone needs the pixel densities that VR does. Arguably phones have plenty of pixels already, though there still seems to be a PPI race--just marketing at this point?

- In any case, phones will top out before VR: we put these devices right in front of our faces, with a lens in front of their screens.

- The big problem is that manufacturing displays is expensive: a new plant costs billions of dollars. A $9B display plant that only makes displays for 10M PC HMDs a year works out to $900 per HMD; that doesn't count materials, the people who work in the plant, or the R&D ahead of time. Such a plant would never get off the drawing board; it doesn't make business sense for the display manufacturers.

The current state of mobile GPUs (slides 24-26)
===================================================

- It's improved a lot in the past few years; a modern mobile GPU like the Adreno 530 is a pretty impressive thing.

- However, it's still 10-15x off of a high-end desktop GPU.

- Mobile GPU architectures themselves are good now--the performance ratio is generally proportional to the power consumption ratio, which suggests that the issue isn't that their architectures are worse than desktop GPU architectures.
- Unfortunately, this means that there aren't any easy performance jumps to be had in the near future.

- On the plus side, if you divide out the (peak) numbers and consider e.g. the Daydream Pixel XL, which is 2560x1440 @ 60Hz, you've got >2k (peak!) FLOPS per pixel, which isn't terrible...

Moore's Law isn't going to save us (slides 27-29)
========================================

- It's clearly slowing, and has been for a few years now.

- In any case, it's a statement about the size of transistors (and thus, how many you can fit in a given amount of chip area). The problem is that we can't power all the transistors anyway---limited power from batteries, thermal and heat-dissipation issues.

- Transistor power efficiency has improved about 10x in 9 years, which is awesome, but it's been a tough haul, and it's unlikely to be that good in the next 9 years. (And even then, 10x in 9 years isn't transformative.)

Bridging the gap (we've got some options) (slides 30-43)
==========================================================

(aka, "through adversity comes opportunity")

- One thing that may help follows from a favorite statistic: the optic nerve only carries 8.75Mb/s to the brain. (That's a small b, so *bits*.) If you do the math, a mobile GPU can do 57k FLOPS for each such *bit*, which seems like plenty to compute a good scene representation.

- Now, of course there's some computing going on in the retina (the start of a CNN...); we're not wiring up the GPU to the optic nerve, so we probably always need to compute more than that many bits of output from the GPU.

- Insane thought (slide 32): researchers have had success fooling image-recognition neural networks with crafted input that looks nothing like the actual thing. If we understand more about early processing in the retina, can we figure out the signal we want to send along the optic nerve and work backward to figure out how to stimulate the retina accordingly?
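The "do the math" figures quoted above (peak FLOPS per pixel, FLOPS per optic-nerve bit) and in the earlier sections (the display-plant amortization, the power-efficiency trend) can be reproduced with a few lines of arithmetic. The ~500 GFLOPS peak for the Adreno 530 is my assumption--chosen because it's consistent with the per-pixel and per-bit numbers on the slides; the other inputs are taken from the text.

```python
# Back-of-envelope checks for figures quoted in the talk.
ADRENO_530_PEAK_FLOPS = 500e9  # ~0.5 TFLOPS peak (assumed, see note above)

# Daydream on a Pixel XL: 2560x1440 at 60 Hz.
pixels_per_second = 2560 * 1440 * 60
flops_per_pixel = ADRENO_530_PEAK_FLOPS / pixels_per_second
print(f"peak FLOPS per pixel: {flops_per_pixel:.0f}")  # ~2261, i.e. >2k

# Optic nerve bandwidth: 8.75 Mb/s (bits, not bytes).
optic_nerve_bits_per_second = 8.75e6
flops_per_bit = ADRENO_530_PEAK_FLOPS / optic_nerve_bits_per_second
print(f"peak FLOPS per optic-nerve bit: {flops_per_bit:.0f}")  # ~57k

# A $9B display plant amortized over 10M HMD displays a year
# (from the Displays section): $900 per HMD, before materials,
# labor, and R&D.
cost_per_hmd = 9e9 / 10e6
print(f"amortized plant cost per HMD: ${cost_per_hmd:.0f}")  # $900

# "10x transistor power efficiency in 9 years" as an annual rate.
annual_gain = 10 ** (1 / 9)
print(f"annual efficiency gain: {(annual_gain - 1) * 100:.0f}%")  # ~29%/yr
```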
- It's interesting to consider consoles: programmers have low-level access to the hardware and can make use of it---they can take the time to go deep on the architecture, write carefully tuned assembly, and deeply understand the performance characteristics of the system.

- One of my favorite examples is DICE doing pixel shading on the PS3 SPUs. Seriously, WTF. If there's one thing a GPU is built for, it's pixel shading. But they saw that they could not only reduce the load on the GPU, but also do less redundant work (light culling) by doing it on the SPUs.

- It's always amazing how much better games look at the end of a console generation than at the start---and it's all running on exactly the same HW. All of this comes from programmer ingenuity and focus.

- This level of specialization hasn't made sense in the past for mobile devices, though if slowing Moore's law and power limits mean that the architectures change more slowly, it may in the future.

- I don't believe that Vulkan will be sufficient to enable this. It's still an abstraction, and it still has the least-common-denominator issue of trying to abstract over many GPU architectures.

- Another problem is that we don't have the same depth of third-party analysis of mobile CPUs and GPUs. (When I was at Intel, I used Agner Fog's stuff all the time to understand the CPUs--it was better than what we had internally!)

- Unusual architectures may make more sense. There have been a number of architectures over the years that packed a lot of compute on a chip but didn't go anywhere--they were just too hard to program to be worth the trouble.

- If there's a world where that's the only way to get to a certain level of performance, then people will be willing to go through the pain.

- Many of these architectures have fairly limited off-chip bandwidth (which has been painful). However, off-chip bandwidth is really expensive in its power requirements, so that's not a completely bad thing.
- There's a lot of amazing stuff on Shadertoy that is generated almost entirely by compute, with very little off-chip memory access. This rainforest doesn't access any textures, except to do temporal anti-aliasing. It would probably run well on a specialized compute-heavy chip.

- However, how would you author and art-direct this content productively? iq@ doesn't scale...

- Going to the other extreme, programmability consumes a lot of power. Fixed-function hardware can be ~10x-100x more power-efficient than programmable hardware. In a power-limited environment, that means you can do 10x-100x more.

- The question is: what should the fixed-function hardware be? The HW design loop is iterative and backward-looking (necessarily conservative): assuming that yesterday's workload will be tomorrow's workload is usually a safer bet than trying to guess what tomorrow's workload will be. However, this takes a long time.

- How can researchers get involved in fixed-function HW for graphics and VR? It's tough, since you really want an entire architecture design (CPU, GPU, everything) to compare against, and also so that you can figure out how your new thing integrates with those pieces. It's not clear how to crack this nut.

- One hopeful possibility comes from the well-regarded Chisel system from Berkeley ("LLVM for hardware"). Multiple RISC-V CPUs designed with it have been taped out, all by very small teams.