VR-Zone sits down with MediaTek executives to talk about the state of HSA and upcoming developments.
While the Heterogeneous System Architecture (HSA) Foundation and AMD are often mentioned in the same breath, it’s important to remember that one of the most important power brokers on the foundation’s board is neither an AMD person nor even someone from the desktop world. The man who chairs the working group charged with creating the HSA’s Programmer’s Reference Manual, Chien-Ping Lu, comes from an up-and-coming mobile company, MediaTek, where he serves as a Senior Director.
Consider for a minute the curious disparity between mobile and desktop as far as the implementation of heterogeneous computing and HSA is concerned. AMD’s much-awaited Kaveri APU was released earlier this year, but it didn’t live up to its full potential because HSA-enabled apps are non-existent. This contrasts with the very real momentum we see on the mobile side from MediaTek.
Lu, who holds a PhD in Computer Science from Yale and spent nearly a decade at Nvidia as an architecture manager, offers a different take on the state of HSA than what you might hear from AMD. Coming from MediaTek, he offers a mobile-centric perspective on HSA: things like heterogeneous multiprocessing, in which MediaTek has had some definite wins with its CorePilot, come first in his vocabulary, not hUMA.
VR-Zone recently sat down with Lu and MediaTek’s CMO Johan Lodenius at the company’s headquarters in Hsinchu, Taiwan, to get a non-AMD take on heterogeneous computing and where HSA is going.
VR-Zone: How did MediaTek get started in heterogeneous computing?
Chien-Ping Lu: MediaTek was one of the first companies to try and explore what kind of additional computing resources we can leverage on the chip. We maxed out the CPU computing power on the chip, so we looked around and found a ‘tiny processor’ called the GPU. At that time, and this was about three years ago, there was no proper GPGPU API to use it. There was only OpenGL and we still tried to use it.
At that time doing [general purpose compute] on a GPU wasn’t all that much faster. So that’s why we called it “GPU-assisted” computing. We used the GPU graphics shader, and, of course, there were a lot of inefficiencies there, but even though it was a hack we still made it work.
With the MT8135 we got a powerful GPU from Imagination on-board. This is the first time we could use OpenCL, and this is the first time we found out some applications can run faster on a GPU than on a CPU. This is the second era, which we called the “GPU-accelerated” era.
Right now we get the performance we need, but we need to make it even higher. But we also need to make sure the device is easier to program. Because programming the GPU is still a “hack” — it’s very tedious. Mainstream programmers don’t want to touch it; Java programmers don’t want to touch it. So you want to make sure that the GPU can be programmed and can be accessible to even Java programmers — and that’s the third era.
We looked around and found that, since Nvidia won’t open up CUDA, the only alternative would be AMD. That’s why we got in touch and started to explore options with them. They quickly identified us as an important partner, and told us they wanted us to be a co-founder [of the alliance]. That’s how it got started: because we wanted to enter the “third era” of heterogeneous computing.
VRZ: How did the first meeting with AMD happen? Did you pitch to them?
CPL: It’s almost like a couple falling in love with each other at first sight [laughter]. I forget who pitched to whom, but I think when we first met we [pitched the idea] to each other.
VRZ: What’s the HSA ‘state of the union’?
CPL: We’ve been making big progress. Last year the major focus was reaching the milestone of delivering the specs of version 0.95. That’s how we got industry attention, and that’s why we won two awards: the Linley Group analyst choice award for Best Processor Technology and Best Electronic Design from Penton Electronics Group.
This year we’re going to reach two major milestones. We’re going to deliver version 1.0 provisional for the system architecture spec (which means the real specifications will be released shortly thereafter), and version 1.0 for the programmer’s reference manual following that. These will be major things for the foundation. All the participating member companies are contributing very actively to this.
VRZ: The major absences from HSA are Intel and Nvidia. There are obvious implications about being in an alliance with AMD, but are they interested?
Johan Lodenius: Obviously we have no clue, but with Intel I think there are two schools of thought here. If you have the “SoC view” on things you know that hardware is not everything. If you have the “processor view”, which Intel does, you probably don’t see it the same way.
CPL: HSA and CUDA are competing with each other. I come from Nvidia, and an old Nvidia colleague has said that Nvidia has had some internal discussion that was inconclusive on whether they should adopt HSA or not. But they have CUDA, and CUDA tries to tackle the same problem but with a different solution.
VRZ: Why has MediaTek been the first to implement HSA? What have you done that’s so special?
JL: Well it’s really about CorePilot [MediaTek’s Heterogeneous Multi-Processing (HMP) control software for big.LITTLE]. What you hear from competitors sometimes is that it’s only a hardware play, that it’s about the processor. Of course the processor is important, but the “secret sauce” is really in the software controlling it. It’s not in the hardware. It’s in the layer that senses what’s going on and makes the best use of the hardware. That’s going to be even more critical in the next phase of heterogeneous computing when we start using DSP and GPU bandwidth.
CPL: The evolution of heterogeneous computing is going to be in two phases. Eventually the heterogeneous devices, other than the CPU, are going to take over because they are more power efficient. Before the GPU or DSP or other device can take over, you need to lower the power of the CPU side. That’s why we have HMP. You need to make sure you leave “enough room” for the heterogeneous devices so they have the power budget to run.
JL: We were the first because we started looking at the systems aspect of the SoC before other companies and we realized it’s not only about hardware, as we said, but it’s also about power, thermal and understanding how you distribute tasks to different processors in real time. There are three different control mechanisms that come into play at once in CorePilot: power, thermal and task scheduling.
VRZ: How has Google been receiving HSA?
CPL: Google is particularly neutral. Once something gets endorsed by Google it becomes huge. So that’s why Google is neutral to a lot of technology.
VRZ: Thanks for your time.