Overcoming Embedded Challenges with Superscalar Mixed-Signal Processors

Electronic products are becoming ever more closely woven into our lives, helping us work more efficiently and productively. Advances in process technology, processors, and embedded systems are strengthening our connection to the digital world. For embedded designers, however, this progress brings numerous challenges when designing new products. Clock speeds and memory access times can no longer be increased significantly, while performance demands continue to rise. Power budgets are flat or even shrinking, while application functionality keeps growing. In addition, there is a growing demand for mixed-signal processing in a wide range of embedded applications. Addressing these challenges efficiently requires a new type of embedded processor, one that delivers very high performance, balances it against power requirements, and supports the mix of RISC and signal processing functionality that is becoming essential for many embedded applications.

Embedded Design Challenges

The era of expecting the next process node to allow you to double the clock speed with half the power consumption is likely gone for good. Most embedded designs have clock speeds that are already quite high, ranging from 1 GHz to 2 GHz, as shown in Figure 1. Although speeds continue to improve slightly, our ability to achieve higher performance solely by increasing clock frequencies has become very limited due to power and process constraints. This presents challenges for embedded designers as the performance demands of applications continue to grow.

Figure 1: Historical Growth of Processor Performance (Source: researchgate.net)

This challenge is exacerbated by the widening performance gap between logic and memory (Figure 2). As we move down the process curve, logic speed (red line in Figure 2) improves far faster than memory access time (blue line), which remains relatively flat. For example, at the 28 nm process node, logic can run at clock speeds of over 3 GHz, but memory access is limited to just 1.4 GHz under optimal conditions.

Figure 2: Performance Gap in Embedded Memory (Source: semiwiki.com)

Memory access times limit the maximum processor clock speed, because a processor that must complete each memory access within a single clock cycle cannot run faster than its memory.

Clock speeds in embedded designs can also be adjusted to manage power consumption. Especially in battery-powered applications, power budgets are fixed or can only increase slightly, while the demands for performance, functionality, and features continue to rise. Even in applications where power does not seem to be an issue, power budgets are still constrained. For example, in automobiles, the power generated by the alternator is substantial, but the power of each module must also be limited to control overall power consumption, as the number of electronic components in cars is increasing rapidly. The design challenges of power consumption in embedded applications are not new, but as designs become increasingly complex, managing them becomes more difficult.
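The link between clock speed and power can be illustrated with the standard first-order approximation for dynamic (switching) power in CMOS logic, P = a * C * V^2 * f. The sketch below is only an illustration: the function name and all parameter values are hypothetical and do not come from this article, and leakage power is ignored.

```c
#include <assert.h>

/* First-order estimate of dynamic (switching) power in CMOS logic:
 *   P = activity * C * V^2 * f
 * Leakage is ignored. The function name and any values passed in are
 * illustrative assumptions, not figures from the article. */
double dynamic_power_watts(double activity, double capacitance_farads,
                           double voltage_volts, double freq_hz)
{
    assert(activity >= 0.0 && activity <= 1.0);
    return activity * capacitance_farads
         * voltage_volts * voltage_volts * freq_hz;
}
```

In this model, halving the clock frequency halves dynamic power, and lowering the supply voltage at the same time reduces it further because voltage enters as a square.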

Advanced processors are therefore needed to address these embedded challenges, but the processors themselves are also being asked to do more. In the past, if a design needed signal processing, a DSP coprocessor would be added. Now, to improve processing efficiency, the coprocessor’s functions are being integrated into the RISC processor. This merging of functionality reduces the number of processors in the design and saves power, but it also puts pressure on performance, because the RISC processor must now handle multiple tasks.

Solving These Challenges

These challenges are daunting, but the capabilities offered by new embedded processors will help designers overcome them. While clock speeds in embedded designs are no longer rising significantly, the latest embedded processors can execute more instructions per clock cycle, thus improving performance. The ability to issue and execute multiple instructions in parallel, i.e., superscalar capability, enhances processor performance without the need to increase frequency. Another approach is to use multicore processors in symmetric or asymmetric configurations. These methods allow more work to be done in parallel, improving performance and throughput.
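As a rough illustration of why more instructions per cycle or more cores help when frequency is fixed, throughput can be modeled as instructions per cycle times clock frequency times core count. This is an idealized sketch that assumes perfect scaling, which real workloads rarely achieve; the function name and parameters are hypothetical.

```c
#include <assert.h>

/* Idealized instruction throughput in millions of instructions per second.
 * Assumes every core sustains the same average IPC and that work scales
 * perfectly across cores (a simplifying assumption). */
double throughput_mips(double ipc, double clock_mhz, unsigned cores)
{
    assert(ipc > 0.0 && clock_mhz > 0.0 && cores >= 1);
    return ipc * clock_mhz * (double)cores;
}
```

For example, at a fixed 1000 MHz clock, raising the average IPC of a single core from 1.0 to 1.5 raises the modeled throughput from 1000 to 1500 MIPS without any change in frequency.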

However, increasing the amount of work done per clock cycle does not solve the problem of memory access limitations. The widening gap between memory access speeds and logic speeds has the greatest impact on processors whose pipeline allows only a single stage for memory access. At the 28 nm process, memory access speeds limit the maximum clock speed of such processors to 1 GHz or lower in optimal conditions. Processors with single-cycle memory access have little scope to overcome this clock speed limitation. Newer high-performance embedded processors spread memory access over two or more cycles, allowing memory accesses to overlap. With dual-cycle memory access, a processor can run at twice the speed of its memory, achieving higher maximum clock speeds across all process nodes, including newer advanced nodes.
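The effect of multi-cycle memory access on the achievable clock can be sketched as a simple bound: the core clock is limited by the logic speed and by the memory speed multiplied by the number of cycles allotted to each memory access. This is a simplified model rather than a description of any specific pipeline, and the function and parameter names are hypothetical.

```c
#include <assert.h>

/* Upper bound on core clock (MHz) in a simplified model:
 *   clock <= logic limit
 *   clock <= memory limit * cycles allotted to each memory access
 * Function and parameter names are illustrative assumptions. */
unsigned max_core_clock_mhz(unsigned logic_limit_mhz,
                            unsigned memory_limit_mhz,
                            unsigned access_cycles)
{
    assert(access_cycles >= 1);
    unsigned memory_bound = memory_limit_mhz * access_cycles;
    return memory_bound < logic_limit_mhz ? memory_bound : logic_limit_mhz;
}
```

Using the 28 nm figures quoted in this article (logic at about 3 GHz, memory at 1 GHz), `max_core_clock_mhz(3000, 1000, 1)` returns 1000, while `max_core_clock_mhz(3000, 1000, 2)` returns 2000: dual-cycle access doubles the memory-imposed ceiling.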

Unfortunately, increasing processor performance, whether by executing more instructions per clock cycle, using multicore processors, or running processors at higher speeds by taking advantage of multi-cycle memory access, significantly increases power consumption, which is a major issue for designs with limited power budgets. Designers of embedded processors can no longer simply pile on transistors when they need to improve performance and throughput. Any performance improvement must be weighed against the accompanying increase in power consumption. Embedded processors are therefore now judged by their energy efficiency rather than purely by performance or power. Measured as performance per watt (DMIPS/W, CoreMark/W, etc.), energy efficiency must be treated as a key design metric for any new embedded processor. A processor with well-balanced energy efficiency lets embedded application designers take full advantage of its performance enhancements while limiting the increase in power consumption.
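To make the metric concrete, the sketch below compares two candidate core configurations by DMIPS per watt. The struct, the function names, and the sample numbers used in the note that follows are hypothetical and chosen only for illustration; they are not measurements of any real processor.

```c
#include <assert.h>
#include <stddef.h>

/* A candidate core configuration. All fields are illustrative. */
typedef struct {
    const char *name;
    double dmips;        /* benchmark performance */
    double power_watts;  /* power drawn while running the benchmark */
} core_config;

/* Energy efficiency as performance per watt. */
double dmips_per_watt(const core_config *c)
{
    assert(c != NULL && c->power_watts > 0.0);
    return c->dmips / c->power_watts;
}

/* Returns whichever configuration delivers more DMIPS per watt
 * (the first one on a tie). */
const core_config *more_efficient(const core_config *a, const core_config *b)
{
    return dmips_per_watt(a) >= dmips_per_watt(b) ? a : b;
}
```

With a hypothetical "fast" core at 4000 DMIPS and 0.10 W (40,000 DMIPS/W) and a "lean" core at 3000 DMIPS and 0.06 W (50,000 DMIPS/W), `more_efficient` selects the lean core even though its raw performance is lower, which is the kind of trade-off this metric is meant to expose.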

Of course, improving energy efficiency is not the only way to control power consumption. New embedded processors also give designers finer control over how the processor uses power. The ability to create power islands and dynamically control power consumption within the processor helps designers meet their system-on-chip (SoC) power goals. Additionally, significant advances in instruction sets and compiler optimizations can improve the density of embedded code. Reducing the size of embedded code by 10% or more reduces memory requirements, which in many cases saves more power than can be saved in the processor itself.

New Superscalar ARC HS4x Series

The widely deployed DesignWare® ARC® HS3x series of high-performance processors has been on the market since 2013, and design challenges have continued to grow since then. To help designers tackle these emerging challenges, Synopsys has launched the new ARC HS4x/D series. The series has five members (HS44, HS45D, HS46, HS47D, and HS48) and features a dual-issue pipeline optimized for embedded applications (Figure 3). Compared to the HS3x series, the mixed-signal HS4x/D series improves RISC performance by 25% and doubles signal processing performance, with only a 15% increase in power and area. The new series is fully compatible with the HS3x series and provides dual-cycle memory access, allowing the core to run at clock frequencies of up to 2.2 GHz on the 28 nm process. The HS45D and HS47D processors support 150 DSP instructions and deliver a very high level of combined RISC and DSP performance. To make the new HS4x cores easy to use, both RISC and DSP functionality can be programmed efficiently in C/C++ with Synopsys’s ARC MetaWare compiler, which automatically leverages the processor’s dual-issue capabilities to maximize performance.
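Taken together, these figures imply a gain in energy efficiency. As a back-of-the-envelope check, assuming the performance gains and the 15% power increase are all measured under the same conditions (an assumption; the article does not state this), the ratio of performance per watt for the HS4x/D series relative to the HS3x series works out to:

```latex
\frac{(\text{performance}/\text{power})_{\text{HS4x}}}
     {(\text{performance}/\text{power})_{\text{HS3x}}}
= \frac{1.25}{1.15} \approx 1.09 \quad \text{(RISC)},
\qquad
\frac{2.00}{1.15} \approx 1.74 \quad \text{(signal processing)}
```

Under that assumption, this corresponds to roughly 9% more RISC performance and about 74% more signal processing performance per unit of power.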

Figure 3: New ARC HS4x Embedded Processor Series

Conclusion

The times are changing, continuously bringing more exciting functionalities to the electronic world around us. Advances in technology will seamlessly connect us to the digital world, improving efficiency, increasing productivity, and enhancing connections with others. These advancements also pose challenges for embedded designers, who need to adopt new methods to meet the growing demands for performance and functionality while balancing these new requirements with the ever-present power consumption limitations. Successfully addressing these challenges and realizing this new class of electronic products will require advancements in embedded processors. For example, Synopsys’s newly launched HS4x/D series can provide the necessary performance and functionality while also considering energy efficiency, ensuring it won’t exhaust your power budget.

About Synopsys

Synopsys, Inc. (NASDAQ: SNPS) is dedicated to innovation that changes the world. From silicon to software, it works in close collaboration with technology companies around the globe to develop the electronic products and software applications that people rely on. Synopsys is the world’s leading electronic design automation (EDA) supplier and the world’s leading semiconductor interface IP supplier, as well as a global leader in software quality and security solutions. It is ranked the 15th largest software company in the world and is an S&P 500 company. Headquartered in Silicon Valley, USA, Synopsys was founded in 1986 and currently has over 11,400 employees across more than 100 offices worldwide. Revenue for fiscal year 2017 is expected to exceed $2.7 billion, and the company holds more than 2,600 granted patents. As a core technology provider and driver in industries such as semiconductors, artificial intelligence, automotive electronics, and software security, Synopsys’s technology profoundly influences five current areas of emerging technology innovation: smart cars, the Internet of Things, artificial intelligence, cloud computing, and information security.

Since entering the Chinese market in 1995, Synopsys has established offices in nine major cities: Beijing, Shanghai, Shenzhen, Xiamen, Wuhan, Xi’an, Nanjing, Hong Kong, and Macau, with over 1,000 employees. It has built a complete technical research and support service system and, guided by the principle of “accelerating innovation, driving the industry, and enabling customer success,” works hand in hand with the industry and its partners for mutual development, serving as a trusted partner and strong supporter of the rapid development of the Chinese semiconductor industry.
