Introduction: In China’s “Artificial Intelligence Standardization White Paper (2018)”, a definition of artificial intelligence is given: “Artificial intelligence is the theory, methods, technologies, and application systems that utilize digital computers or machines controlled by digital computers to simulate, extend, and enhance human intelligence, perceive the environment, acquire knowledge, and use knowledge to achieve optimal results.”
The core idea of artificial intelligence lies in constructing intelligent artificial systems. Artificial intelligence is a form of knowledge engineering in which machines mimic humans to complete a series of actions. Based on whether it can achieve understanding, thinking, reasoning, problem-solving, and other advanced behaviors, artificial intelligence can be divided into weak artificial intelligence and strong artificial intelligence.
In the future, the main characteristics of artificial intelligence applications will reflect several core technological features.
01 Robotic Process Automation (RPA)
RPA (Robotic Process Automation) is a technology that automatically executes process tasks according to rules, using techniques that simulate human operations on a computer interface, thereby replacing or assisting humans in completing the related computer operations.
Unlike the “robot” with a mechanical body that people usually imagine, RPA is essentially software that completes tasks according to specific instructions. It is installed on personal computers or large servers and automates office operations by simulating keyboard, mouse, and other manual operations.
▲Figure 1-1 RPA is the trend of innovation and development in future offices
RPA is also vividly referred to as digital labor because it comprehensively utilizes technologies like big data, artificial intelligence, and cloud computing, manipulating elements in the graphical user interface (GUI) to simulate and enhance the interaction between humans and computers, thereby assisting in tasks that previously only humans could perform or supplementing human labor in high-intensity work.
Since 2015, artificial intelligence technology and RPA have both advanced significantly, complementing each other. Naturally, the combination of RPA and AI has brought about a distinct trend in the development of intelligent applications, which we call intelligent RPA technology, or IPA (Intelligent Process Automation) technology, as shown in Figure 1-2.
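The rule-driven nature of RPA can be illustrated with a minimal sketch. It assumes a hypothetical in-memory `FakeForm` class standing in for an application window; a real RPA tool would drive actual keyboard and mouse events, which this example does not attempt. The point is only that the bot replays the same fixed steps for every record.

```python
from dataclasses import dataclass, field


@dataclass
class FakeForm:
    """Hypothetical stand-in for an application window; not a real GUI."""
    fields: dict = field(default_factory=dict)
    submitted: list = field(default_factory=list)

    def type_into(self, name, text):
        # Simulates typing text into a named input box.
        self.fields[name] = text

    def click_submit(self):
        # Simulates clicking "Submit": store the entry and clear the form.
        self.submitted.append(dict(self.fields))
        self.fields.clear()


def run_bot(form, records):
    """Replay the same fixed steps for every record, as a clerk would."""
    for record in records:
        for name, value in record.items():
            form.type_into(name, str(value))
        form.click_submit()
    return len(form.submitted)


invoices = [{"id": "A-1", "amount": 120.5}, {"id": "A-2", "amount": 80}]
form = FakeForm()
processed = run_bot(form, invoices)
```

Because every step is a fixed rule, the bot has no way to cope with a form whose layout changes, which is the limitation that motivates combining RPA with AI.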
▲Figure 1-2 Composition of Intelligent RPA: RPA + AI = IPA
In other words, RPA is the foundation that needs to be integrated with other technological means to realize IPA and its advantages.
Business expectations for the functionality of process automation will continue to rise. Applying AI technologies such as machine learning to RPA and integrating artificial intelligence functions into product suites, so as to provide more types of automation, has become the mainstream trend in the future development of RPA.
02 Optical Character Recognition (OCR)
OCR technology refers to the technique of using electronic devices (such as scanners or digital cameras) to convert text in paper documents into black-and-white bitmap image files, and then using recognition software to convert the text in the images into text format for further editing and processing by word processing software. In simple terms, it is the technology of scanning textual materials and then analyzing the image files to obtain text and layout information.
OCR technology can generally be divided into five stages as shown in Figure 3-1.
▲Figure 3-1 Five Stages of OCR Technology
Next, we will explain the OCR recognition process in detail.
1. Image Processing
This stage corrects imaging problems in the input image. Common image preprocessing steps include geometric transformations (perspective, distortion, rotation, etc.), distortion correction, deblurring, image enhancement and light correction, and binarization.
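Of the preprocessing steps listed above, binarization is the easiest to show in a few lines. The sketch below uses Otsu's method, one common way of choosing a threshold; the source does not say which method it has in mind, so this choice is an assumption. The image is a plain list of rows with gray levels from 0 to 255, and the function names are illustrative.

```python
def otsu_threshold(pixels):
    """Return the gray level (0-255) that best separates dark from light."""
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_bg, sum_bg = 0, 0
    for t in range(256):
        w_bg += hist[t]                 # pixels with level <= t
        if w_bg == 0:
            continue
        w_fg = total - w_bg             # pixels with level > t
        if w_fg == 0:
            break
        sum_bg += t * hist[t]
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        # Between-class variance (up to a constant factor).
        between = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if between > best_var:
            best_var, best_t = between, t
    return best_t


def binarize(image, threshold):
    """Map each pixel to 1 (lighter than threshold) or 0 (otherwise)."""
    return [[1 if p > threshold else 0 for p in row] for row in image]


image = [
    [20, 25, 200, 210],
    [22, 30, 205, 220],
]
flat = [p for row in image for p in row]
t = otsu_threshold(flat)
binary = binarize(image, t)
```

In this toy image the dark and light pixels are far apart, so any threshold between them yields the same black-and-white result.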
2. Text Detection
Detecting the location, range, and layout of the text, usually including layout analysis and text line detection. The main problem solved by text detection is where there is text and how large the text range is.
The processing algorithms used for text detection generally include: Faster-RCNN, Mask-RCNN, FPN, PANet, Unet, IoUNet, YOLO, SSD.
3. Text Recognition
Building on text detection, this stage recognizes the content of the text, converting the text shown in the image into character data that computers can process. The main problem solved by text recognition is what each character is.
Common processing algorithms for text recognition include: CRNN, Attention OCR, RNNLM, BERT.
4. Text Extraction
Extracting the necessary fields or elements from the text recognition results.
Common processing algorithms for text extraction include: CRF, HMM, HAN, DPCNN, BiLSTM+CRF, BERT+CRF, Regex.
5. Output
Outputting the final text recognition result or text extraction result.
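The text extraction stage (stage 4) can be sketched with the simplest method in the list above, regular expressions. The recognized text, the field names, and the patterns below are all made up for illustration; real documents usually need more tolerant patterns or one of the learned models mentioned above.

```python
import re

# Hypothetical output of the text recognition stage.
ocr_text = """INVOICE No: INV-2020-0817
Date: 2020-08-17
Total Amount: 1,250.00 USD"""

# One illustrative pattern per field we want to pull out.
PATTERNS = {
    "invoice_no": re.compile(r"INVOICE No:\s*([A-Z]+-\d{4}-\d{4})"),
    "date": re.compile(r"Date:\s*(\d{4}-\d{2}-\d{2})"),
    "total": re.compile(r"Total Amount:\s*([\d,]+\.\d{2})"),
}


def extract_fields(text):
    """Return a dict of field name -> matched string (None if not found)."""
    result = {}
    for name, pattern in PATTERNS.items():
        match = pattern.search(text)
        result[name] = match.group(1) if match else None
    return result


fields = extract_fields(ocr_text)
```

Returning `None` for a missing field, rather than raising an error, lets a downstream step decide whether to route the document to a human for review.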
03 Machine Learning/Big Data Analysis
Machine learning/big data analysis is a method used to design complex models and algorithms to achieve predictive functionality, meaning that computers have the ability to learn rather than relying on pre-written code. It can autonomously identify patterns in structured data based on observations of existing structured data and output predictions for future results.
Machine learning recognizes patterns in structured data (for example, daily performance data) through “supervised” and “unsupervised” learning. Supervised algorithms learn from a structured dataset of inputs and outputs and then make predictions for new inputs. Unsupervised algorithms observe structured data and provide insights based on the patterns they identify.
Machine learning and advanced analytics could change the game for insurance companies, for example by improving compliance, reducing cost structures, and gaining competitive advantage from new insights. Advanced analytics has been widely applied in leading human resources departments, mainly for determining and assessing the core qualities of leaders and managers in order to better predict behaviors, plan career development paths, and assign the next leadership positions.
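The difference between the two learning styles can be made concrete with a small sketch. The supervised part fits a straight line to input/output pairs by least squares and predicts a new output; the unsupervised part splits unlabeled values into two groups with a one-dimensional two-means procedure. Both techniques, the function names, and the numbers are illustrative choices, not methods named in the source. The grouping function assumes the data contain at least two distinct values.

```python
def fit_line(xs, ys):
    """Supervised: learn slope and intercept from (input, output) pairs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def two_means(values, iterations=10):
    """Unsupervised: split unlabeled values into a low and a high group."""
    lo, hi = min(values), max(values)      # initial group centers
    for _ in range(iterations):
        low_group = [v for v in values if abs(v - lo) <= abs(v - hi)]
        high_group = [v for v in values if abs(v - lo) > abs(v - hi)]
        lo = sum(low_group) / len(low_group)
        hi = sum(high_group) / len(high_group)
    return low_group, high_group


# Supervised: hours worked (input) and units produced (known output).
slope, intercept = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
prediction = slope * 5 + intercept

# Unsupervised: daily figures with no labels attached.
low, high = two_means([10, 11, 12, 48, 50, 52])
```

The supervised model needs the known outputs to learn from, while the grouping step discovers the two clusters from the values alone.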
04 Natural Language Generation (NLG)
With NLG, computers gain the ability to express themselves and write like humans, following certain rules to convert the information observed in data into high-quality natural language text. Examples include automatically identifying topics, numbers, place names, personal names, and addresses in meeting emails to generate itinerary memos, or identifying the key content of contract terms and generating a summary list.
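The idea of "following certain rules to convert data into text" can be sketched with a template-based generator, the simplest form of NLG. The function name, the 1% threshold, and the wording of the template are assumptions made for this example; production systems typically use far richer grammars or learned models.

```python
def summarize_sales(region, current, previous):
    """Turn two numbers into one sentence using fixed wording rules."""
    change = (current - previous) / previous * 100
    if change > 1:
        trend = f"rose {change:.1f}%"
    elif change < -1:
        trend = f"fell {abs(change):.1f}%"
    else:
        trend = "was roughly flat"
    return (
        f"Sales in {region} {trend} compared with the previous period, "
        f"reaching {current:,} units."
    )


sentence = summarize_sales("East", 1200, 1000)
```

The rules decide which words to use (rose, fell, roughly flat), and the template fixes the sentence structure, so the output is predictable but limited to the patterns the author anticipated.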
05 Smart Workflow
Smart workflow is a software tool used for process management, integrating work performed by both humans and machines, allowing users to start and track the status of end-to-end processes in real-time, facilitating the switch between different groups, including switches between robots and human users, while also providing statistics on bottleneck stages.
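The three capabilities described above (tracking status, handing work between robots and humans, and reporting bottlenecks) can be sketched in a small class. The class name, step names, and timings are hypothetical, and a real workflow engine would add persistence, permissions, and error handling.

```python
from collections import defaultdict


class Workflow:
    """Hypothetical end-to-end process shared by robots and humans."""

    def __init__(self, steps):
        self.steps = steps                    # ordered (step name, owner)
        self.position = {}                    # case id -> current step index
        self.time_spent = defaultdict(float)  # step name -> total minutes

    def start(self, case_id):
        self.position[case_id] = 0

    def status(self, case_id):
        """Return (step name, owner) of the current step, or ("done", None)."""
        index = self.position[case_id]
        if index >= len(self.steps):
            return ("done", None)
        return self.steps[index]

    def complete_step(self, case_id, minutes):
        """Record time for the current step and hand over to the next owner."""
        name, _owner = self.steps[self.position[case_id]]
        self.time_spent[name] += minutes
        self.position[case_id] += 1

    def bottleneck(self):
        """Return the step that has consumed the most time so far."""
        return max(self.time_spent, key=self.time_spent.get)


flow = Workflow([
    ("extract data", "robot"),
    ("approve", "human"),
    ("post entry", "robot"),
])
flow.start("case-1")
flow.complete_step("case-1", 2)
owner_now = flow.status("case-1")[1]   # the case now waits for a human
flow.complete_step("case-1", 45)
flow.complete_step("case-1", 1)
```

Here the human approval step accumulates the most time, so `flow.bottleneck()` points at it as the stage to improve.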
With the continuous advancement of society and technology, various fields are developing rapidly towards automation and intelligence. Research on workflow-related technologies is becoming increasingly important and is widely applied in fields such as manufacturing, software development, banking, finance, and biomedicine.
Workflows not only automate the processing of related activities and tasks, reducing the errors that human-computer interaction can introduce, but also refine each processing step, maximize production efficiency, and can be applied to dynamic, variable, and flexible scenarios.
In recent years, against the background of big data and artificial intelligence, the business processes in workflows have become increasingly complex, as have the environments and data they face. Business processes are frequently remapped as a result of demand analysis, and process patterns are changed and improved as a result of maintenance upgrades.
In this dynamic and complex environment, how to quickly identify tasks and then efficiently and purposefully address workflow issues has become a key issue in current workflow task research.
RPA software robots encounter many similar situations at work. The complexity and variability of workflows make RPA operation processes complex and variable as well, and robots that cannot adapt suffer greatly reduced operational efficiency.
Therefore, it is necessary to achieve dynamic adjustments of task settings in RPA and automatic changes and upgrades of RPA business processes through smart workflow technology, realizing adaptive operation modes under the guidance of smart workflows.
There are many methods to realize smart workflows. For example, workflow scheduling based on genetic algorithms (the genetic algorithm was proposed by Professor J.H. Holland of the United States) and the heuristic algorithm based on particle swarm optimization proposed by Pandey S et al. can be used for intelligent scheduling of different resources. In addition, there are many intelligent algorithms inspired by nature and bionics, such as the shuffled frog leaping algorithm, the cuckoo search algorithm, the bat algorithm, and the artificial bee colony algorithm.
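To show what scheduling with a genetic algorithm looks like in practice, here is a toy sketch. It assigns tasks to workers so that the busiest worker finishes as early as possible (the makespan). The encoding, the population size, the mutation rate, and the task durations are all assumptions chosen for brevity; this is not the specific algorithm from any of the works cited above. It assumes at least two tasks.

```python
import random


def makespan(assignment, durations, n_workers):
    """Finish time of the busiest worker for a task -> worker assignment."""
    loads = [0] * n_workers
    for task, worker in enumerate(assignment):
        loads[worker] += durations[task]
    return max(loads)


def genetic_schedule(durations, n_workers, pop_size=30, generations=60, seed=0):
    rng = random.Random(seed)          # fixed seed for repeatable runs
    n = len(durations)
    population = [
        [rng.randrange(n_workers) for _ in range(n)] for _ in range(pop_size)
    ]
    for _ in range(generations):
        # Selection: keep the better half unchanged (elitism).
        population.sort(key=lambda a: makespan(a, durations, n_workers))
        survivors = population[: pop_size // 2]
        children = []
        while len(survivors) + len(children) < pop_size:
            mother, father = rng.sample(survivors, 2)
            cut = rng.randrange(1, n)              # one-point crossover
            child = mother[:cut] + father[cut:]
            if rng.random() < 0.2:                 # occasional mutation
                child[rng.randrange(n)] = rng.randrange(n_workers)
            children.append(child)
        population = survivors + children
    return min(population, key=lambda a: makespan(a, durations, n_workers))


durations = [4, 7, 2, 9, 5, 3, 8, 6]   # minutes per task (made-up data)
best = genetic_schedule(durations, n_workers=3)
best_time = makespan(best, durations, 3)
```

Each candidate schedule is a list whose i-th entry names the worker for task i. Since the 44 minutes of work are split over three workers, no schedule can finish in under 15 minutes, which gives a simple yardstick for judging the result.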
The currently more common method is to implement a workflow processing mode based on intelligent planning, which no longer treats different activities as independent events that do not affect each other but purposefully considers the common effects of multiple events.
This mode fully considers the similarities between workflows and intelligent planning, inferring the intrinsic logical relationships between different workflow tasks through intelligent planning and fully mining potential relationships from other channels and external information.
This approach gradually remedies the problems of traditional workflows. New intelligent planning methods mine potential information from surface actions and filter out noisy data, which makes automatic correction of processes possible. Finally, based on the conclusions drawn, the previous RPA operation processes are purposefully modified to achieve adaptive operation modes and processes.
06 Cognitive Agents
A cognitive agent is a technology that combines machine learning and natural language generation, with the addition of emotional detection capabilities to make judgments and analyses, allowing it to perform tasks, communicate, learn from data sets, and even make decisions based on emotional detection results. In other words, machines can produce “emotional resonance and mental resonance” like humans, truly becoming a fully virtual labor force (or agent).
In the customer service field, a car insurance company in the UK used cognitive agent technology to improve its customer conversion rate by 22%, reduce verification error rates by 40%, and achieve an overall return on investment of 330%.
Of course, consulting firms such as Deloitte and EY also openly state that, based on the current basic capabilities of many enterprises’ process management and systems, there is still a lot of foundational work to be done. The core technologies required to build intelligent process automation (such as cognitive agents) are still in their infancy.
Intelligence comprises three aspects: computational intelligence, perceptual intelligence, and cognitive intelligence.
- In computational intelligence, computers have long surpassed humans in speed and efficiency.
- In perceptual intelligence, with the development of technologies like OCR and NLP, considerable results can now be achieved.
- In cognitive intelligence, however, although natural language processing can achieve better results than humans in certain specific fields, many aspects, especially knowledge understanding, reasoning, and judgment, still need to be gradually accumulated and improved.
According to whether machines can produce self-awareness and the applicable scope of robots, artificial intelligence is divided into weak artificial intelligence and strong artificial intelligence. Machines in weak artificial intelligence do not have self-awareness and do not possess true reasoning or independent problem-solving capabilities; they are usually only suitable for solving specific problems under certain conditions. Current artificial intelligence research mainly focuses on the field of weak artificial intelligence.
In terms of strong artificial intelligence, machines have a certain degree of self-awareness and can expand their functions through learning. They can obtain functions that they currently do not possess or knowledge that they do not understand through self-learning.
Currently, comprehensive strong artificial intelligence still faces challenges in technical capabilities, social ethics, and other aspects, but in certain specific scenarios of certain fields, artificial intelligence software with cognitive intelligence and learning abilities can not only optimize operational processes, respond quickly, cover more different situations but also avoid technical and application risks to the greatest extent, making it a very valuable research direction.
Cognitive intelligence has many definitions, among which Professor Xiao Yanghua from Fudan University once mentioned that making machines possess cognitive intelligence means enabling machines to think like humans, and this thinking ability is specifically reflected in the following aspects.
- First, machines have the ability to understand data, understand language, and thus understand the real world.
- Second, machines have the ability to explain data, explain processes, and thus explain phenomena.
- Third, machines possess a series of cognitive abilities unique to humans, such as reasoning, planning, and so on, meaning that cognitive intelligence needs to solve a series of complex tasks such as reasoning, planning, association, and creation.
An agent refers to a computational entity that resides in a certain environment, continuously and autonomously plays a role, and has characteristics of residency, reactivity, sociality, and proactivity. According to the theory of the famous artificial intelligence scholar, Professor Hayes-Roth from Stanford University, “agents can continuously execute three functions: perceive dynamic conditions in the environment, execute actions to influence the environment, and perform reasoning to interpret perceived information, solve problems, and decide actions.”
From the previous definitions, we can see that cognitive agents can perceive dynamic conditions in the environment and then execute corresponding actions to influence the existing environment, while also being able to reason to interpret perceived information, solve relevant problems, and decide subsequent actions.
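The three functions in this definition (perceive, reason, act) map naturally onto a simple loop. The sketch below assumes a hypothetical agent that watches a work queue and processes items while the queue is longer than a target size; the environment is just a dictionary, and the class and key names are made up for illustration.

```python
class QueueAgent:
    """Hypothetical agent that keeps a work queue at or below a target size."""

    def __init__(self, target):
        self.target = target

    def perceive(self, environment):
        # Perceive a dynamic condition in the environment.
        return environment["queue_length"]

    def reason(self, queue_length):
        # Interpret the observation and decide on an action.
        if queue_length > self.target:
            return "process"
        return "wait"

    def act(self, action, environment):
        # Execute the action, which changes the environment.
        if action == "process":
            environment["queue_length"] -= 1

    def step(self, environment):
        observation = self.perceive(environment)
        action = self.reason(observation)
        self.act(action, environment)
        return action


env = {"queue_length": 3}
agent = QueueAgent(target=1)
history = [agent.step(env) for _ in range(4)]
```

The agent processes two items and then waits, because its own actions have changed what it perceives on later steps.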
By combining cognitive agents with RPA, we can obtain a robot with cognitive intelligence that can dynamically perceive what needs to be done next based on the changes in the application systems and other environments, while executing corresponding actions to influence the corresponding environmental information, achieving intelligent data entry, intelligent monitoring, intelligent document processing, and assisted decision-making.
Meanwhile, cognitive agents can also learn relevant experiences and knowledge through RPA technology while processing business, gradually mastering the ability to identify key points.
The research on cognitive agents includes various methods. In recent years, with the continuous development of distributed artificial intelligence, information science, and network science, distributed collaborative decision-making in dynamic environments has become an important research approach for cognitive agents. This approach has been widely applied in typical decentralized multi-agent systems represented by multi-UAV systems and multi-robot systems.
At the same time, limited by their own design, agents can often observe only part of the information in their environments and systems. The limited interactions between agents, together with external constraints, also mean that obtaining global information comes at a high cost.
Moreover, decentralized multi-agent systems in applications exhibit self-organizing structures and corresponding complex network characteristics similar to social networks, meaning that individual agents in the network can usually only connect/interact with a small portion of agents in their local network, making traditional centralized collaborative models no longer applicable.
Additionally, just as limited information exchange between people in social networks can greatly enhance individual decision-making efficiency, researchers continue to explore whether similar methods can be applied to decentralized multi-agent systems.
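The idea that agents with only local connections can still reach a shared decision can be illustrated with distributed averaging, a standard textbook technique that the source does not name. In this sketch, six hypothetical agents sit on a ring; each one knows only its own estimate and those of its two neighbors, and repeatedly replaces its estimate with the local average.

```python
def consensus_round(values):
    """Each agent averages its own value with its two ring neighbors."""
    n = len(values)
    return [
        (values[(i - 1) % n] + values[i] + values[(i + 1) % n]) / 3
        for i in range(n)
    ]


# Initial local estimates held by six agents (made-up numbers).
estimates = [10.0, 20.0, 30.0, 40.0, 50.0, 60.0]
global_mean = sum(estimates) / len(estimates)

for _ in range(200):
    estimates = consensus_round(estimates)
```

No agent ever sees the whole network, yet every estimate approaches the global mean of 35, because each round preserves the total while smoothing out local differences.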
About the Author: Da Guan Data, a leading enterprise in the field of intelligent RPA in China, has independently developed a complete set of “RPA + AI” systems, holding core intellectual property rights. Da Guan’s intelligent RPA products are industry-leading products that do not rely on Microsoft’s underlying development framework and do not use third-party open-source frameworks.
This article is excerpted from “Intelligent RPA Practical Guide”, published with the authorization of the publisher.