Understanding FPGA Through GPU Comparison

An FPGA is a bunch of transistors that you can wire up to create any circuit you want. It is like a nano-scale breadboard. Using an FPGA is like prototyping a chip, except that you only need to buy this one chip to build many different designs; in exchange, you pay a price in efficiency.

Taken literally, this statement is not correct, because you never actually rewire an FPGA: it is really a 2D grid of lookup tables connected by a routing network, along with some arithmetic units and memory. FPGAs can simulate any circuit, but they are only imitating it, in the same way that a software circuit simulator simulates circuits. The weakness of this answer is that it oversimplifies how people actually use FPGAs. The next two definitions describe FPGAs better.
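To make the lookup-table view concrete, here is a minimal Python sketch. It is an illustration under my own assumptions, not a model of any real device: the names `LUT`, `configure`, `full_adder`, and `ripple_add` are hypothetical, and the "routing network" is reduced to ordinary function calls. The point is only that configuring the cells means filling in tables, and that the adder is then imitated by table lookups rather than built from gates.

```python
from itertools import product


class LUT:
    """A k-input lookup table: its whole 'configuration' is 2**k stored bits."""

    def __init__(self, k, truth_table):
        assert len(truth_table) == 2 ** k
        self.k = k
        self.table = list(truth_table)

    def __call__(self, *inputs):
        assert len(inputs) == self.k
        # The input bits form an address into the table (first input = MSB).
        index = 0
        for bit in inputs:
            index = (index << 1) | (bit & 1)
        return self.table[index]


def configure(k, func):
    """Fill a LUT from a Boolean function, enumerating inputs in address order."""
    return LUT(k, [func(*bits) & 1 for bits in product((0, 1), repeat=k)])


# Two 3-input LUTs are enough to imitate a full adder.
sum_lut = configure(3, lambda a, b, cin: a ^ b ^ cin)
carry_lut = configure(3, lambda a, b, cin: (a & b) | (cin & (a ^ b)))


def full_adder(a, b, cin):
    return sum_lut(a, b, cin), carry_lut(a, b, cin)


def ripple_add(x, y, width=4):
    """Stand-in for routing: feed each bit's carry into the next bit's LUTs."""
    carry, result = 0, 0
    for i in range(width):
        s, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        result |= s << i
    return result, carry
```

Calling `ripple_add(5, 6)` adds two 4-bit numbers using nothing but table lookups; changing the stored bits in `sum_lut` and `carry_lut` would turn the same "hardware" into a different circuit.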

The first of these is circuit simulation, the classic mainstream use case for FPGAs and the reason they exist in the first place. The point of an FPGA is to take a hardware design, written as HDL code, and get the same behavior you would get from an ASIC while using cheap, off-the-shelf hardware. Of course, you are unlikely to use exactly the same Verilog code on the FPGA and on the real chip, but at least the two sit at the same level of abstraction.

The second is computational acceleration, a use case distinct from ASIC prototyping. Unlike circuit simulation, it is an emerging use case for FPGAs, and it is what lies behind Microsoft’s recent success in accelerating search and deep neural networks. Importantly, the computational use case does not depend on any relationship between FPGAs and real ASICs: the Verilog code written for FPGA-based acceleration need not bear any resemblance to the Verilog code used for prototyping.

These two use cases differ enormously in programming, compilers, and abstractions. I am more interested in the latter, which I call “computational FPGA programming”. My argument is that today’s programming methods for computational FPGAs are borrowed from the traditional circuit simulation programming model, and that this is a mistake. If you want to develop ASIC prototypes, Verilog and VHDL are the right choices. But if the goal is computation, we can and should rethink the entire stack.
Let’s get straight to the point. An FPGA is a very special kind of hardware for efficiently executing a special kind of software, one that takes the form of a circuit description. An FPGA configuration is itself a kind of low-level software: a program written for an ISA.
GPUs make a useful comparison here.
Before deep learning and blockchain became prevalent, there was a time when GPUs were used to process graphics. In the early 2000s, people realized that GPUs could also serve as accelerators for compute-intensive tasks that involve no graphics data at all: GPU designers had built a more general machine, and 3D rendering was just one of its applications.
Definition of FPGA and Comparison with GPU
Computational FPGAs are following the same trajectory. The idea is to make heavy use of this hardware, not for circuit simulation as such, but to exploit computational patterns that suit circuit-like execution. In that sense, GPUs and FPGAs can be viewed by analogy.
For GPUs to develop into today’s data-parallel accelerators, people had to redefine what a GPU takes as input. We used to think of a GPU as accepting a peculiar, intensely domain-specific description of a visual effect. We unlocked its true potential by recognizing that GPUs execute programs. This realization allowed the purpose of GPUs to evolve from a single application domain to an entire computational domain.
I believe computational FPGAs are undergoing a similar transformation. There is not yet a concise description of the fundamental computational pattern that FPGAs excel at, but it has to do with potentially irregular parallelism, data reuse, and mostly static data flow.
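As one hedged illustration of what “data reuse” and “mostly static data flow” can mean, consider a small FIR filter laid out as a fixed pipeline. The choice of kernel is my assumption for the sake of example; the text above does not name one. The function names `fir_stream` and `fir_reference` are hypothetical. In the sketch, each loop iteration stands for one clock cycle: the wiring never changes, every sample is reused by each tap as it moves down the delay line, and all multiplications within a cycle would run in parallel in hardware.

```python
def fir_stream(samples, coeffs):
    """Cycle-by-cycle model of a fixed FIR pipeline (one iteration = one clock)."""
    delay_line = [0] * len(coeffs)  # registers holding the most recent samples
    outputs = []
    for x in samples:
        # Shift the new sample in; older samples move one register down.
        delay_line = [x] + delay_line[:-1]
        # In hardware these multiplies are separate units firing simultaneously.
        products = [c * d for c, d in zip(coeffs, delay_line)]
        outputs.append(sum(products))
    return outputs


def fir_reference(samples, coeffs):
    """Direct software-style definition, used only to check the pipeline model."""
    return [
        sum(c * samples[n - k] for k, c in enumerate(coeffs) if n - k >= 0)
        for n in range(len(samples))
    ]
```

For example, `fir_stream([1, 2, 3, 4], [1, 1, 1])` produces a running sum over a three-sample window, and it agrees with `fir_reference` on the same inputs. The pipeline version never re-reads an input from memory, which is the kind of reuse a spatial layout makes cheap.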
Like GPUs, FPGAs need a hardware abstraction that embodies this computational pattern. The problem with Verilog for computational FPGAs is that it is a poor low-level hardware abstraction and also a poor high-level programming abstraction. Arguing by contradiction, let’s imagine what it would look like if RTL (register-transfer level) filled each of these roles.
Even RTL experts probably do not believe that Verilog is a productive way to do mainstream FPGA development. It will not push programmable logic into the mainstream. To experienced hardware hackers, RTL design may seem friendly and familiar, but the productivity gap between it and software languages is enormous.
In fact, for today’s computational FPGAs, Verilog is effectively the ISA. The major FPGA vendors’ toolchains take Verilog as input, while compilers for higher-level languages emit Verilog as output. Vendors generally keep their bitstream formats secret, so Verilog is as low in the abstraction hierarchy as you can go.
The problem with treating Verilog as an ISA is that it is too far from the hardware. The abstraction gap between RTL and FPGA hardware is huge: traditionally, it includes at least synthesis, technology mapping, and place-and-route, each of which is a complex, slow process. As a result, the edit/compile/run cycle for RTL programming on FPGAs takes hours or days. Worse, it is an unpredictable process: the deep toolchain stack can obscure how a change in the RTL will affect the design’s performance and energy characteristics.
A good ISA should directly expose the unvarnished reality of the underlying hardware. Like an assembly language, it does not need to be convenient to program in. But also like an assembly language, it needs to compile very quickly and produce predictable results. To build higher-level abstractions and compilers, you need a low-level target that holds no surprises, and RTL is not such a target.
If computational FPGAs are accelerators for a particular class of algorithmic patterns, then today’s FPGAs are not the ideal way to reach that goal. Under these rules of the game, a new kind of hardware that beats FPGAs could bring an entirely new abstraction hierarchy with it. The new software stack should discard both the FPGA’s legacy in circuit simulation and the RTL abstraction.