Have you heard of the ENIAC computer? Weighing 27 tons and occupying 1,800 square feet, ENIAC launched the computer age in 1946. The machine was staggering in scale: with 6,000 manual switches, 17,468 vacuum tubes, and a 200-kilowatt appetite for power, it changed the game as the world's first programmable, general-purpose electronic digital computer.
The sensational news headlines surrounding ENIAC at the time would feel eerily familiar to anyone following the current developments in the AI field.
In April 1946, Popular Science proclaimed: “With lightning-fast computers tackling problems that have plagued humanity for years, today’s equations could become tomorrow’s rockets.”
The Philadelphia Evening Bulletin reported: “The University of Pennsylvania’s 30-ton electronic brain thinks faster than Einstein.”
Fast-forward 75 years to today: the Cortex-M4 chip controlling your smart refrigerator is 10,000 times faster than ENIAC, draws just 90 microamps per megahertz, and fits in a few square inches. This is because, as computing technology matures, devices are optimized to become more effective and efficient in specific, limited, and economical applications.
This is also the direction of AI development.
Specialization of Technology
Like ENIAC, AI is currently generating tremendous excitement and optimism (and a bit of anxiety), especially with the rapid development of GenAI over the past year. However, if we want to understand its long-term trajectory, we can learn a lot from the history of computing hardware. In fact, this is the path most technologies follow: things start big, powerful, and centralized, and once they work, they begin to specialize, localize, and move toward efficient operation at the edge.
From large telephone switchboards to smartphones, from large power plants to home solar panels, from broadcast television to streaming services, we introduce large and expensive things, then begin a long refining process. AI is no exception. In fact, the large language models (LLMs) that enable AI have become so enormous that they are at risk of becoming unmanageable. The solution will be the specialization, decentralization, and democratization of AI technology into specific use cases—this is what we call “edge AI.”
LLMs: Huge Promises (and Imminent Challenges)
Large language models like GPT (Generative Pre-trained Transformer) have made the AI era possible. These giant models are trained on vast amounts of data and possess unprecedented abilities to understand, generate, and interact with human language, blurring the lines between machine and human thought.
LLMs are still evolving and pushing the limits of what is possible, which is remarkable. But it is not a blank check. The sheer volume of training data and the computational power needed to process it make these systems extremely expensive to operate, and their appetite for data, compute, and energy is growing so fast that it will soon outstrip our resources to sustain it.
At our current pace, LLMs will soon encounter a series of inherent limitations:
- The availability of high-quality data for training.
- The environmental impact of powering such large models.
- The ever-growing financial cost.
- The difficulty of securing such large systems.
Given the astonishing speed of AI adoption and expansion, this turning point is not far off. What took mainframes 75 years may become a matter of months for AI, as limitations trigger the demand for a shift to more efficient, decentralized, and accessible subsets of AI: niche edge AI models.
The rise of edge AI is already underway, as we see AI deployed in smaller, more specialized models, particularly in the Internet of Things. This AI approach shifts processing tasks from centralized data centers to the “edge” of the network, closer to where data is actually generated and used. It includes some terms you may have heard of:
Small Language Models: These are AI models that understand and generate human-like text but are much smaller in size. Think of them as "mini-brains." Their smaller size makes them faster and cheaper to run, especially on less powerful devices such as the chips in your smartphone or in field equipment. They know their own domains well but may not be as broadly knowledgeable or creative as their larger LLM siblings. These models benefit from the latest advances in highly parallel GPUs, which also support more mature neural networks and general machine learning (ML).
Edge AI: This term means AI that operates where the action occurs, such as on your phone, in street cameras, or inside your car, rather than on large computers in faraway data centers. The "edge" refers to the outer edge of the network, close to where data is created and used. This enables faster processing, since data doesn't have to travel far, and keeps your data more private, since it doesn't always need to be sent over the internet.
Mixture of Experts: In AI, a "mixture of experts" is like a team of specialists, each member focusing on a specific task. It is a system made up of many smaller AI units (experts), each covering a different type of work or knowledge area. When faced with a task, the system decides which expert, or combination of experts, is best suited to handle it. In this way, AI can process varied tasks very efficiently, because it always uses the best tool for the job.
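The routing idea above can be sketched in a few lines of plain Python. This is a toy illustration, not a production implementation: the class names (`Expert`, `MixtureOfExperts`), the random linear "experts," and the top-2 gating are all assumptions chosen to show the mechanism, namely that a gating network scores the experts and only the top few ever run.

```python
import math
import random

random.seed(0)  # deterministic toy weights

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

class Expert:
    """A tiny 'expert': a single linear layer with fixed random weights."""
    def __init__(self, dim):
        self.w = [random.uniform(-1, 1) for _ in range(dim)]

    def __call__(self, x):
        return sum(wi * xi for wi, xi in zip(self.w, x))

class MixtureOfExperts:
    """Route each input to the top-k experts chosen by a gating network,
    then combine their outputs weighted by the renormalized gate scores."""
    def __init__(self, dim, n_experts, top_k=2):
        self.experts = [Expert(dim) for _ in range(n_experts)]
        self.gate_w = [[random.uniform(-1, 1) for _ in range(dim)]
                       for _ in range(n_experts)]
        self.top_k = top_k

    def __call__(self, x):
        # Gate: score every expert for this particular input.
        scores = softmax([sum(w * xi for w, xi in zip(row, x))
                          for row in self.gate_w])
        # Keep only the top-k experts; the rest are never evaluated,
        # which is where the efficiency win comes from.
        top = sorted(range(len(scores)), key=lambda i: scores[i],
                     reverse=True)[:self.top_k]
        total = sum(scores[i] for i in top)
        return sum((scores[i] / total) * self.experts[i](x) for i in top)

moe = MixtureOfExperts(dim=4, n_experts=8, top_k=2)
out = moe([0.5, -1.0, 2.0, 0.1])
print(out)  # a single combined scalar output
```

In real systems (such as sparse MoE layers in large transformers) the experts are full neural sub-networks and the gate is trained jointly with them, but the control flow is the same: score, pick a few, combine.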
Together, these technologies make AI more versatile and efficient, easier to train, run, and deploy, and able to work in many different places and ways, from the tiny computers in our homes and pockets to specialized tasks requiring expert knowledge. The smart refrigerator mentioned earlier is one example, and traffic light arrays are another; autonomous vehicles, diabetes management, smart grids, facial recognition: the list is as endless as human creativity.
Risks and Rewards of Edge AI
Like any technology, edge AI brings inherent risks and rewards. Let’s quickly go through this list.
Increased Innovation: By eliminating development bottlenecks, edge AI opens the door to a surge of creative, niche applications and micro-applications—anyone willing and able to create applications can take advantage of this.
Reduced Resources and Increased Capacity: Edge AI reduces latency and has lower processing demands, significantly lowering costs and consumption.
Enhanced Privacy and Security: Local data processing means sensitive information does not need to be transmitted over the internet, reducing the risk of data breaches.
Customizable and Independent: Edge AI allows models trained on local, specific data, providing more accurate and relevant solutions that can operate independently and reliably.
Quality Control: A proliferation of models raises the demand for rigorous quality and validation processes, and quality control itself may become the new bottleneck.
Security and Governance: More devices running AI applications expand the attack surface for security vulnerabilities, and a flood of new creators may overwhelm governance processes, opening up a "wild west" environment.
Limited Scope and Scalability: Edge AI models are designed for specific tasks, which may limit their ability to scale or generalize across different scenarios.
Need for Supervision: Leaders need to supervise all development activities, helping creators stay within safe bounds of creativity. This includes controlling the potential for redundancy, as solutions may proliferate and replicate in a vacuum. The solution here will be software that can help track, develop, and guide ideas from concept to development stages.
From this list, it is clear that we have a tremendous opportunity to rethink how we develop and manage AI applications. However, despite the cost savings and innovation dividends, many CIOs and compliance officers will still want assurance that new edge AI technologies are compliant, controlled, and validated. When edge AI puts a powerful tool into ordinary hands, that accessibility may prove a double-edged sword.
We are on the edge of a new era of AI development, and the shift to edge AI may represent a paradigm shift that echoes the huge leap from those bulky old mainframes to today’s personal computers. This transition promises to make AI more accessible, efficient, and tailored to specific needs, driving innovation in ways we have yet to fully imagine.
In this future, the potential of AI is limitless, constrained only by our imagination and our commitment to responsibly guiding its development.
Copyright Notice: This article is compiled by D1Net; reprints must credit D1Net as the source at the beginning of the article, otherwise D1Net reserves the right to pursue legal responsibility.
(Source: D1Net)