This Tool Can Automatically Generate DSA Processors Like NPU and DSP in Minutes
With Moore’s Law stagnating and system companies beginning to design their own chips, EDA and IP vendors seem to sense that their spring has arrived, because their customer base is expanding. Yet a massive obstacle remains in front of them: chip design is still complex and its barrier to entry is still high, so not every system company can build chips.
For this reason, at the Synopsys Developer Conferences we attended over the past two years, the company repeatedly stressed the need to lower the barrier to chip design so that more people can take part. This is also the only way to grow the pie, and it offers some useful ideas for society’s digital transformation. Against the backdrop of geopolitical factors and the regionalization of the semiconductor industry, more and more market participants have a chance to share this pie. However, “lowering the barrier to chip design” is not easy. At ChipEasy’s recent FARMStudio launch event, though, we seemed to see a glimmer of hope and more market opportunities.
Automatic Generation of DSA Processors
According to ChipEasy’s official introduction, FARMStudio “is a dedicated processor generation tool described in C language and based on the RISC-V basic instruction set.” From our perspective, FARMStudio is an EDA tool capable of delivering DSA (Domain Specific Architecture) soft cores fully automatically. The DSA processors here include at least DSPs, NPUs, and DPUs, as well as dedicated processors for fields such as CV, audio, industrial control, and communications.
Xu Yong, co-founder of ChipEasy, stated at the launch that FARMStudio is a “disruptive product.” Judging by the description at least, this disruptiveness did run through the entire product introduction.
In simple terms, under the “FARM design methodology,” once the application-layer software and algorithms have been analyzed, FARMStudio needs only three inputs: the basic core, “super instructions,” and the preset templates provided by ChipEasy. With a “click of a button,” the DSA hardware and software are generated automatically.
On the input side, the basic core means RISC-V “plus some configuration options.” The most complex parts, such as the microarchitecture and pipeline, need not concern users, who only have to check a few boxes in the software. Super instructions cover SIMD/VLIW support, “accelerating certain algorithms.” Preset templates are built into FARMStudio and include templates for DSP, NPU, CV, and so on. Xu Yong said these are used to “accelerate a class of algorithms” rather than “accelerating a single algorithm” as an ASIC does: “customers can choose to use our design templates, which condense a large amount of technology.” This part looks more like IP provided by ChipEasy.
As for what FARMStudio generates automatically, the hardware output includes RTL, synthesis scripts, test suites, and FPGA development and verification environments. The software output includes a toolchain, compiler, instruction set simulator (ISS), performance simulator (Profiler), OS, math libraries, debuggers, application software packages, and more. “All are automatically produced.”
Moreover, the entire generation process is “minute-level,” “with smaller ones taking less than a minute; larger ones won’t exceed two or three minutes.” In our view, this part of the introduction alone is indeed quite disruptive. ChipEasy accordingly claims it is the “world’s first domain-specific processor generation tool.”
To demonstrate its usability, Zhang Weihang, Vice President of Software at ChipEasy, gave a live demonstration in which he generated three RISC-V-based processor cores with FARMStudio and then ran software simulation and debugging on them. In addition to the enterprise version, FARMStudio also comes in a personal version priced at a startling 299 yuan/year, as if to tell everyone, “Come and try it out.” “Before launching FARMStudio, all our employees used this tool,” Xu Yong said. “When I say everyone, I mean our marketing and administration too.” Isn’t this “allowing more people to participate in chip design” put into practice?
Xu Ming, Marketing Director of ChipEasy, said while introducing examples of DSP processors built with FARMStudio: “We are very willing to provide training on how to use FARMStudio to build your own processors. Don’t think that building a DSP processor is unattainable. With this tool, efficiency will be greatly improved.”
Similar products exist on the market, such as those from Cadence, Synopsys, and Codasip, but Xu Yong told us in an interview after the event: “There may be similar products on the market, but none do it like us. For example, competitors use their own cores, not RISC-V. RISC-V ensures autonomy and control.” Xu Ming added that FARMStudio’s software and hardware design is done uniformly in C, while competitors’ tools generally “require a high learning cost.” Building on the RISC-V basic instruction set and on C are indeed FARMStudio’s two technological breakthroughs.
The breakthrough technologies listed on the FARMStudio slides are, in Xu Yong’s view, only part of the picture: “I just casually listed eight; our innovations go far beyond these eight.” This sounds a little like boasting, but judging from the introduction, FARMStudio would indeed require many technological breakthroughs, provided its usability has really been validated.
The eight breakthrough technologies casually listed by Xu Yong are:
(1) The open-source RISC-V basic instruction set, which serves as the instruction set for the control part of the DSA processor;
(2) C as the design language, a point made repeatedly at the launch. This is the foundation for hardware-software co-design optimization in the FARM methodology, described by Zhang Weihang as “the most important function of FARMStudio—using C language as design input”; Xu Yong stated that FARMStudio is the world’s first tool to unify hardware and software design languages;
(3) Automatic generation of DSA processors and their accompanying toolchains, at the minute level mentioned earlier;
(4) High-performance compilers supporting automatic VLIW and automatic pipeline scheduling;
(5) Multi-level verification: “verification has many levels,” including the “instruction set level, which can be done on x86 platforms.” Xu Yong said this technology is also unique in the world;
(6) A cycle-accurate simulator with simulation speeds reaching the MHz range, said to be the fastest in the industry at present;
(7) Cloud services for the FPGA verification phase: “For users, no need to debug boards, no need for any interfaces, just purchase services in the cloud, plug and play, very convenient”;
(8) An embedded OS: “Our OS is all automatically generated and configured.”
We believe each of these points merits in-depth study. The phrase “automatic generation” alone implies a massive amount of work for ChipEasy, because it essentially shifts upstream much of the work originally done by chip design companies. Cloud FPGA verification by itself should raise a large number of engineering issues, and unifying hardware and software design languages on the basis of C could be a significant topic in its own right.
With these technologies in place, FARMStudio essentially represents an order-of-magnitude cost reduction for chip design companies. Xu Yong cited RTL as an example: “If everyone does it themselves, depending on the project scale, it could be as few as dozens of people, or over a hundred for larger ones. Without a few months, you won’t see any results.” Likewise for the software compiler: “Without a team of at least a dozen people, it won’t run in less than a year.” Under conventional practice, any component that FARMStudio generates automatically would require significant investment and deep, long-accumulated experience from a design company.
Cost Reduction in Chip Design
At launch, ChipEasy already seems to have many interested clients. Xu Yong mentioned that existing clients cover fields including security, automotive, communications, consumer electronics, and industrial control. In his keynote, Xu Ming cited examples of using FARMStudio to create a DSP and an NPU, which reportedly originated from actual projects; more on these below.
For downstream chip design clients, the factors of greatest concern are the usability, reliability, performance, and efficiency of the EDA tools themselves. Quantified, these come down to how much benefit an EDA tool brings, that is, how much cost it saves in time, manpower and material resources, and risk control, and ultimately the return on investment.
On FARMStudio’s costs, the most direct aspect is its pricing. ChipEasy seems to be the only domestic EDA company that spends time at a launch event discussing its business model and pricing, although this may also have to do with the special ecological niche FARMStudio occupies.
The image above lists the functions and prices of the FARMStudio enterprise version. Functionally, it provides software toolchains, simulators, an SDK, RTL downloads, and expert technical support. Its sales model is defined as a “1+1 model”: the typical licensing fee before chip mass production is 1 million yuan/year, and after mass production a royalty of 1% of the SoC chip ASP is charged. However, as mentioned earlier, customers who choose ChipEasy’s preset templates pay corresponding additional fees, which presumably reflects ChipEasy’s key knowledge reserves and corporate value.
This pricing model seems to reflect the nature of FARMStudio to some extent: it is an EDA tool that can also automatically generate customized IP (or, put differently, generate customized IP for clients on the basis of ChipEasy’s IP). Xu Yong specifically mentioned that with traditional solutions, the same process requires design, verification, synthesis, and other tools, with total licensing fees in the millions of yuan, and that the 1% royalty is “lower than the fees charged by industry IP companies.”
Also worth mentioning is the royalty’s “multi-project reuse.” IP is generally charged per project, so this naturally adds to FARMStudio’s cost-effectiveness. In addition, there is “no limit on the number of cores,” so multiple cores do not increase costs, and “no repeated charges for different types of cores” means that even if different types of cores are used on one SoC, only a single 1% of ASP is charged, as for one type of core.
In Xu Yong’s view, this pricing model is also extremely disruptive, allowing more companies to create chips at a lower cost. Besides the enterprise version, there is the personal version mentioned earlier, priced at 299 yuan/year on condition that it is not used for commercial purposes. It differs functionally from the enterprise version in offering no RTL downloads and no expert technical support. Xu Yong stated that the personal version is intended to further lower the barrier to using EDA and to increase the number of active users.
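The enterprise pricing described above reduces to simple arithmetic. The sketch below is only an illustration of that arithmetic, not ChipEasy's price calculator: it takes the two figures quoted at the launch (1 million yuan/year before mass production, 1% of SoC ASP afterwards) and assumes the license is paid only during the pre-production years. The function names, the whole-yuan ASP, and the example inputs are hypothetical, and template fees are left out because the article gives no figures for them.

```c
#include <assert.h>

/* Figures quoted at the launch event. */
#define LICENSE_FEE_YUAN_PER_YEAR 1000000LL /* enterprise license before mass production */
#define ROYALTY_PERCENT 1LL                 /* share of SoC ASP after mass production */

/* License fees over the pre-production years (assumed: no license fee afterwards). */
long long license_cost_yuan(long long dev_years)
{
    assert(dev_years >= 0);
    return dev_years * LICENSE_FEE_YUAN_PER_YEAR;
}

/* Royalty on shipped chips. The core count is deliberately not a parameter:
 * per the article, the royalty does not grow with the number or type of cores. */
long long royalty_cost_yuan(long long units_shipped, long long soc_asp_yuan)
{
    assert(units_shipped >= 0 && soc_asp_yuan >= 0);
    return units_shipped * soc_asp_yuan * ROYALTY_PERCENT / 100;
}

/* Total tool cost under the assumptions above (preset-template fees excluded). */
long long total_cost_yuan(long long dev_years, long long units_shipped,
                          long long soc_asp_yuan)
{
    return license_cost_yuan(dev_years)
         + royalty_cost_yuan(units_shipped, soc_asp_yuan);
}
```

For a hypothetical project with two years of development and one million chips shipped at an ASP of 50 yuan, `total_cost_yuan(2, 1000000, 50)` gives 2,500,000 yuan: 2,000,000 in license fees plus 500,000 in royalties.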
Fees are the most visible form of cost. Another part of the cost lies in how well time, manpower, and risk are controlled when clients use the product. With some understanding of the FARM design methodology, it becomes relatively easy to see how these factors are controlled.
Take decision risk, that is, giving chip design project decisions room for quick trial and error. “If using traditional tools, the decision process first naturally involves analyzing application layer software and algorithms; it also includes market research,” Xu Yong stated. “Market personnel say this project has great prospects, R&D personnel say it can be done, then you need to decide whether to undertake this DSA processor project. This poses a significant risk for decision-makers.”
Once the project begins, a great deal of follow-up work remains, with many unknowns: architecture design, performance optimization, and “every product iteration, software is relatively quick, but hardware iterations take months; after both hardware and software are completed, application layer function verification takes another few months.” Moreover, the market itself keeps changing during this cycle, so the initial decision carries even greater risk. In recent years we have seen numerous chip design projects fail or miss their delivery dates, and this seems increasingly common.
The FARM process, by contrast, is “designed in parallel based on FARMStudio.” Xu Yong said: “When things are almost done, such as PPA being well done, decision-makers have a reference, and then they can decide whether to undertake this project.” Xu Ming likened this to buying a house: do you pay before the house is built, or after it is fully constructed? The former carries the risk of ending up with a half-finished project. FARMStudio allows decisions to be made “after knowing what the project looks like.”
This risk control, or quick trial and error, rests on FARMStudio’s speed, which is how the first part of this article leads into the second, on cost control. “From our typical cases, we conclude that we only need to hire a team equivalent to 1/10 of the previous size (that is, an order-of-magnitude advantage) to achieve the same performance, results, and even faster speed as traditional methods.”
“Decision-makers receive four pieces of data, functionality, verification, performance, and cost, and then decide whether to create such a DSA chip,” Xu Yong stated. “This is a very low-risk decision-making method.”
As for time and manpower, the FARMStudio workflow lets the original chip design team be streamlined, with “software engineers doing hardware work” and “architects doing processors,” which keeps headcount down. FARMStudio’s automatic generation is inherently aimed at cutting time costs drastically, and hardware-software co-development in a single development environment achieves a high degree of parallelism, where “each iteration takes only minutes,” saving further time.
The whole process is what ChipEasy envisions as the ideal setup. Ultimately, the characteristics of FARMStudio can be summarized as: first, speed; second, low cost; third, a low barrier to use.
Examples of Generating DSP and NPU
In fact, the most impressive part of the launch event was the two FARM-based applications that Xu Ming presented, a DSP and an NPU. Take the number of load/store units at the DSP microarchitecture level. “If a company believes that the original DSP computing power is sufficient but wants to add a load/store unit, how would a traditional DSP IP company respond?” Xu Ming said. “The most likely answer is, ‘We don’t have that specification; you might consider buying a more powerful IP that has two load/store units.’”
“But for FARM, this is not an issue, because adding a load/store unit just requires checking the corresponding configuration options.” Zhang Weihang also mentioned this during his FARMStudio demonstration. This is the input described earlier: RISC-V plus configuration options. It reflects usability and design flexibility, or rather, precise matching of system architecture needs.
The “F” in FARM stands for Flexible: the essential modules of a vector processor, such as load/store units, memory resources, registers, VLIW support, and computing units, can be defined and implemented relatively easily. “For example, does memory require TCM (Tightly Coupled Memory) or cache, is there one or two load/store units, all of these options are available in the tool; or the width of vector registers, how many entries, and whether MAC (Multiply-Accumulate) is fixed-point or floating-point can all be described using C language.” This is what makes flexible, differentiated chip designs possible.
This may also reflect cost-effectiveness to some extent. Xu Ming mentioned in an interview that FARMStudio allows differentiation from the specification definition stage onward, even breaking with the current practice of using a single core microarchitecture for a whole generation of products, where one design covers all application scenarios.
Xu Yong stated that differentiation comes at a cost. The traditional case is “buying an IP, only customizing a little bit, and the price may increase by 5-10 times. Moreover, these companies are often reluctant to take on such tasks,” because traditional flows are written in Verilog and customization is extremely costly, whereas highly automated tools inherently have the characteristics of customized IP.
Due to space constraints, this article will not go through the customized DSP and NPU examples in full, but a few points are particularly worth mentioning. They all reflect the value of flexibility, performance, and efficiency.
In the customized DSP example, Xu Ming mentioned FARM’s support for non-standard data types. Operations on non-standard types such as 10-bit or 12-bit data are conventionally carried out at 16-bit width; a customized DSP can support them natively, significantly reducing memory consumption and improving computing performance (more data elements are processed at a given SIMD width). ChipEasy provided data indicating an overall performance improvement of 1.6x-2.5x in this regard.
There is also support for custom fractional scaling: a traditional DSP does not support it and can only match the required format through shift or saturation instructions, whereas FARM can implement it relatively easily.
Another DSP highlight is “custom instruction optimization.” General DSP IPs seem to commonly support instruction set extensions as well, but Xu Ming told us that in traditional DSP IP, “extended instructions are additional resources,” meaning that custom instructions cannot reuse the resources of preset instructions and incur additional area cost.
Although Xu Ming did not explain how FARM achieves resource reuse at the RTL level, he gave an example: chip design clients commonly need to perform FFT or matrix multiplication, which traditional DSP IP implements with multiplication or MAC instructions. “But with our tools, one instruction can perform matrix multiplication or FFT butterfly operations; you can even choose the scale of the matrix for balancing performance and cost.”
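To make the fractional-scaling point concrete, here is a minimal sketch of the conventional software path mentioned above, in which a fixed-point format is matched with explicit shift and saturation steps. It is not FARMStudio code, and it does not show how FARM implements scaling in hardware. The function names and the choice of 16-bit storage are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Clamp a 32-bit intermediate result into the 16-bit storage range. */
int16_t saturate16(int32_t x)
{
    if (x > INT16_MAX) return INT16_MAX;
    if (x < INT16_MIN) return INT16_MIN;
    return (int16_t)x;
}

/* Multiply two fixed-point values that each carry `frac_bits` fractional bits
 * and return the product in the same format.
 * Note: right-shifting a negative signed value is implementation-defined in C;
 * this sketch assumes the usual arithmetic shift. */
int16_t fixed_mul_sat(int16_t a, int16_t b, int frac_bits)
{
    assert(frac_bits >= 1 && frac_bits <= 15);
    int32_t product = (int32_t)a * (int32_t)b;   /* 2 * frac_bits fractional bits */
    product += (int32_t)1 << (frac_bits - 1);    /* round half up before rescaling */
    return saturate16(product >> frac_bits);     /* rescale, then saturate */
}
```

With 15 fractional bits, `fixed_mul_sat(16384, 16384, 15)` multiplies 0.5 by 0.5 and returns 8192, which is 0.25 in the same format. Each multiply costs an extra add, a shift, and a clamp here; these are the steps the article describes a traditional DSP handling with separate shift or saturation instructions.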
The accompanying image compares FFT performance with “a widely used DSP IP in the industry.” “We are based on butterfly operations and radix-2, while the competitor is ‘based on traditional multiplication/MAC instructions but uses mixed radix.’ Theoretically, mixed-radix performance is higher than radix-2, but comes at a cost of larger code size; we achieved fewer cycles, smaller code size, and better performance,” Xu Ming stated.
At the same time, custom instructions can also improve resource utilization, keeping more clock cycles busy than a traditional DSP’s standard instructions while using fewer resources. On this point, “after clients update and optimize instructions, the compiler will also update automatically without manual intervention; the compiler will automatically recognize optimization scenarios.”
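For readers unfamiliar with the operation being compared, the following is a minimal sketch of a single radix-2 FFT butterfly in plain C. It is the textbook operation, not ChipEasy's instruction or the competitor's implementation; the `cplx` type and the function name are hypothetical. Written this way, one butterfly costs four real multiplies and six real additions or subtractions, which is the work a multiply/MAC-based DSP issues as several instructions and which the article says a custom instruction can perform in one.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    float re;
    float im;
} cplx;

/* In-place radix-2 butterfly: (a, b) -> (a + w*b, a - w*b),
 * where w is the twiddle factor for this stage. */
void butterfly_radix2(cplx *a, cplx *b, cplx w)
{
    assert(a != NULL && b != NULL);

    /* Complex product t = w * b: 4 real multiplies, 2 real adds. */
    float t_re = w.re * b->re - w.im * b->im;
    float t_im = w.re * b->im + w.im * b->re;

    /* Sum and difference: 4 more real adds. */
    cplx top    = { a->re + t_re, a->im + t_im };
    cplx bottom = { a->re - t_re, a->im - t_im };

    *a = top;
    *b = bottom;
}
```

A radix-2 FFT of N points applies this butterfly (N/2)·log2(N) times, so the cost of one butterfly has a direct effect on the total cycle count.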
In the customized NPU example, the accompanying image compares FARM’s customized NPU with generic NPU IP. Fundamentally, it makes the same point as the DSP examples: flexibility and differentiation can deliver higher efficiency.
Xu Ming also cited two examples here. One is a heterogeneous NPU + DSP architecture in which the two exchange data through a shared buffer without needing to “go through the external bus.” The other is the so-called “local mode AI processor architecture paradigm,” which mainly customizes the on-chip memory scheme so that the design needs no external DDR, saving system cost. We will not go into further detail here.
(Pictured: Xu Yong, co-founder of ChipEasy)
Focusing on the Product Itself
One detail we noted at this launch event was that ChipEasy mainly discussed the FARMStudio product itself and hardly mentioned the company, its team, or its founding story and “sentiments,” which is relatively rare at the launch events of domestic chip startups.
It is also rare for a domestic EDA/IP company to be willing to share technology with the media, although what we saw did not touch on how FARMStudio itself is implemented. We have said before that only when the integrated circuit industry and market gradually mature will companies focus on their products when addressing the media, rather than telling stories or talking about sentiments. ChipEasy reflects this well.
We do not know whether FARMStudio will truly succeed, but this is already a good start. When we interviewed Xu Yong, we specifically asked about the inevitable oligopolization of the EDA market and how ChipEasy positions itself in that scenario. Xu Yong said, “We do not want to preset the future; focusing on the present is what matters most to us.”