Source | Renesas Embedded Encyclopedia
What is Artificial Intelligence?
The rapid development of human technology continuously brings exciting new technologies. Currently, Artificial Intelligence (AI) is undoubtedly one of the hottest technologies.
Artificial intelligence is a scientific discipline that studies and develops theories, methods, technologies, and application systems for simulating, extending, and augmenting human intelligence. It is the field of building computers and machines that can reason, learn, and act: tasks that would otherwise require human intelligence, or that involve data at scales beyond human analytical capability.
Alongside artificial intelligence, terms such as machine learning, deep learning, data mining, and pattern recognition appear frequently, along with many articles and discussions about how they relate. Broadly, artificial intelligence is a wide research field formed at the intersection of many disciplines; machine learning is a branch of artificial intelligence that supplies many of its algorithms; deep learning is an important branch of machine learning based on artificial neural networks; and data mining and pattern recognition focus on applying these algorithms. They complement one another and draw on knowledge from other fields.
The goals of artificial intelligence span eight areas: reasoning, knowledge representation, automated planning, machine learning, natural language understanding, computer vision, robotics, and strong artificial intelligence.
• Knowledge representation and reasoning include propositional calculus and resolution, and predicate calculus and resolution, which can be used to derive formulas and prove theorems mechanically.
• Automated planning covers the planning, actions, and learning of robots, including state-space search, adversarial search, and plan generation.
• Machine learning is a research area developed from a sub-goal of AI, aimed at helping machines and software to learn from experience to solve encountered problems.
• Natural language processing is another research area developed from a sub-goal of AI, aimed at facilitating communication between machines and humans.
• Computer vision is a field that grew out of the goals of AI, aimed at enabling machines to detect and recognize the objects they see.
• Robotics is also derived from the goals of AI, aimed at giving a machine a physical form to perform actual actions.
• Strong artificial intelligence is one of the main goals of AI research, also referred to as artificial general intelligence (AGI): the capability to perform general intelligent behavior. Strong AI typically associates artificial intelligence with human characteristics such as consciousness, perception, knowledge, and self-awareness.
The three elements of artificial intelligence: data, algorithms, and models.
• Data is the foundation: patterns are discovered from the data itself.
• Algorithms are the key: they determine how a model is fitted to the data and turned into an executable program.
• Models are the core: a model explains the data and captures the patterns within it.
Machine Learning in Artificial Intelligence
Machine learning is an important branch of artificial intelligence that uses computational methods and learning from experience to improve a system's performance. Its paradigms include supervised learning, unsupervised learning, and reinforcement learning. Common algorithm families include:
• regression algorithms (least squares, LR, etc.);
• instance-based algorithms (KNN, LVQ, etc.);
• regularization methods (LASSO, etc.);
• decision tree algorithms (CART, C4.5, RF, etc.);
• Bayesian methods (naive Bayes, BBN, etc.);
• kernel-based algorithms (SVM, LDA, etc.);
• clustering algorithms (K-Means, DBSCAN, EM, etc.);
• association rule learning (Apriori, FP-Growth);
• genetic algorithms;
• artificial neural networks (PNN, BP, etc.);
• deep learning (RBM, DBN, CNN, DNN, LSTM, GAN, etc.);
• dimensionality reduction methods (PCA, PLS, etc.);
• ensemble methods (Boosting, Bagging, AdaBoost, RF, GBDT, etc.).
Deep Learning in Artificial Intelligence
Deep learning is an important branch of machine learning based on artificial neural networks. It is called "deep" because the network consists of an input layer, an output layer, and multiple hidden layers. Each layer contains units that transform the input data into information the next layer can use for a given prediction task; thanks to this structure, machines can learn through their own data processing. Consider a two-layer network in which a is the value of a unit, w is a connection weight, and g is the activation function (commonly the sigmoid, which is convenient to differentiate). In matrix form, the forward pass is a⁽²⁾ = g(a⁽¹⁾ w⁽¹⁾) and z = g(a⁽²⁾ w⁽²⁾). If y is the true value of a training sample and z the prediction, define the loss as loss = (z − y)². The goal is to choose all weights w so that the total loss over the training data is minimized, which turns training into an optimization problem, commonly solved with gradient descent. In practice, the backpropagation algorithm computes the gradients layer by layer from the output back to the input and solves for each weight matrix.
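The two-layer example above can be sketched in code. This is a minimal illustration, not a practical network: one unit per layer, sigmoid activations, squared loss, and plain gradient descent on a single made-up training pair (input 1.0, target 0.8); the starting weights and learning rate are arbitrary.

```python
import math

def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))

# Forward pass from the text: a2 = g(a1 * w1), z = g(a2 * w2),
# loss = (z - y)^2. One unit per layer keeps every quantity a scalar.
w1, w2 = 0.3, -0.2          # arbitrary starting weights
lr = 0.5                    # learning rate
a1, y = 1.0, 0.8            # single made-up training pair

for _ in range(2000):
    a2 = sigmoid(a1 * w1)   # hidden-layer activation
    z = sigmoid(a2 * w2)    # network output
    # Backpropagation: chain rule from the loss back to each weight.
    dz = 2 * (z - y) * z * (1 - z)
    dw2 = dz * a2
    dw1 = dz * w2 * a2 * (1 - a2) * a1
    w2 -= lr * dw2
    w1 -= lr * dw1

prediction = sigmoid(sigmoid(a1 * w1) * w2)  # close to the target 0.8
```

Real networks replace the scalars with weight matrices and use frameworks that differentiate automatically, but the structure of each step, forward pass, loss, chain-rule gradients, weight update, is the same.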
Traditional Machine Learning in Artificial Intelligence
• Linear Regression
o Linear regression fits a straight line to data points to find the best fit. It models the relationship between the independent variable (the x value) and a numerical outcome (the y value) with the line equation y = kx + b, where y is the dependent variable, x is the independent variable, and the slope k and intercept b are estimated from the given dataset.
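A minimal sketch of estimating k and b by ordinary least squares, using the closed-form formulas for slope and intercept; the four data points are made up so that the exact answer is k = 2, b = 1.

```python
# Made-up data generated exactly by y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n
# Closed-form least-squares estimates of slope k and intercept b.
k = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) \
    / sum((x - mean_x) ** 2 for x in xs)
b = mean_y - k * mean_x  # k = 2.0, b = 1.0 for this data
```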
• Logistic Regression
o Logistic regression is similar to linear regression, but it is used for binary classification (the output takes only one of two possible values). It passes the linear combination of the inputs through a nonlinear S-shaped (logistic) function to produce the final prediction.
This logistic function maps the intermediate result to the outcome variable y, with values between 0 and 1, interpreted as the probability that y occurs. This property of the S-shaped curve makes logistic regression well suited to classification problems.
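The mapping can be sketched with a toy 1-D logistic regression trained by gradient descent on the log-loss. The data and learning rate are made up: points below 2.5 are labelled 0 and points above are labelled 1.

```python
import math

def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))

# Made-up 1-D data: x below 2.5 is class 0, above is class 1.
data = [(0.0, 0), (1.0, 0), (2.0, 0), (3.0, 1), (4.0, 1), (5.0, 1)]
w, b, lr = 0.0, 0.0, 0.1

for _ in range(5000):
    for x, y in data:
        p = sigmoid(w * x + b)   # predicted probability that y = 1
        # Gradient of the log-loss with respect to w and b.
        w -= lr * (p - y) * x
        b -= lr * (p - y)

predictions = [1 if sigmoid(w * x + b) >= 0.5 else 0 for x, _ in data]
```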
• Decision Tree
o Decision trees are used for both regression and classification problems. A trained model predicts the value of the target variable by learning decision rules represented as a tree. The tree consists of nodes with corresponding attributes: at each node a question is asked of the data based on the available features, the left and right branches represent the possible answers, and the final nodes (leaf nodes) correspond to predicted values. The importance of each feature is determined top-down, with higher nodes corresponding to more important attributes. For example, when asking who in a crowd prefers using credit cards, if a person is married and over 30 years old, they are more likely to have a credit card (100% preference in the example). The tree itself is generated from training data.
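The core of tree building, choosing which question to ask at a node, can be sketched as a search for the split with the lowest weighted Gini impurity. The ages and labels below are made up to echo the credit-card example; a full tree applies this search recursively on each branch.

```python
# Gini impurity of a set of 0/1 labels.
def gini(labels):
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)

# Try every candidate threshold on one feature; keep the split whose
# weighted impurity over the two branches is lowest.
def best_split(xs, ys):
    best_t, best_score = None, float("inf")
    for t in sorted(set(xs)):
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(ys)
        if score < best_score:
            best_t, best_score = t, score
    return best_t

# Made-up data echoing the credit-card example: label 1 = has a card.
ages = [22, 25, 28, 35, 40, 45]
has_card = [0, 0, 0, 1, 1, 1]
threshold = best_split(ages, has_card)  # perfect split at age <= 28
```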
• Naïve Bayes
o Naïve Bayes is based on Bayes' theorem: it computes the probability of each class conditioned on the observed feature values x, under the "naive" assumption that the features are independent given the class. The algorithm is used for classification problems, for example to yield a binary yes/no result.
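A minimal sketch of the Bayes computation on a made-up dataset of (weather, label) pairs: the posterior for each class is the prior times the likelihood of the observed feature, normalized over the classes. With a single feature, the naive independence assumption is trivially satisfied.

```python
from collections import Counter

# Made-up dataset of (weather, play) observations with one feature.
data = [("sunny", "yes"), ("sunny", "yes"), ("rainy", "no"),
        ("rainy", "no"), ("sunny", "no"), ("rainy", "yes")]

def posterior(feature):
    # Bayes' theorem: P(class | x) is proportional to
    # P(class) * P(x | class), both estimated by counting.
    class_counts = Counter(label for _, label in data)
    scores = {}
    for label, n in class_counts.items():
        prior = n / len(data)
        likelihood = sum(1 for f, l in data if l == label and f == feature) / n
        scores[label] = prior * likelihood
    total = sum(scores.values())
    return {label: s / total for label, s in scores.items()}
```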
• Support Vector Machine (SVM)
o A support vector machine is a machine learning algorithm for binary classification. It finds a separating hyperplane in the sample space that divides the samples of the two classes while maximizing the minimum distance from each class to the hyperplane (the margin). Given black and white point samples, for instance, the SVM seeks the line that separates the black points from the white ones while keeping the nearest points of each class as far from that line as possible.
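A minimal sketch of the margin objective: sub-gradient descent on the hinge loss with an L2 penalty (a Pegasos-style update), on four made-up 2-D points with labels in {−1, +1}. This is not a full SVM solver, just enough code to show how points inside the margin push the hyperplane while the penalty keeps the margin wide.

```python
# Four made-up 2-D points, labels in {-1, +1}.
points = [((1.0, 1.0), -1), ((2.0, 1.5), -1),
          ((4.0, 4.0), +1), ((5.0, 4.5), +1)]
w = [0.0, 0.0]
b = 0.0
lr, lam = 0.01, 0.01  # learning rate and regularisation strength

for _ in range(5000):
    for (x1, x2), y in points:
        margin = y * (w[0] * x1 + w[1] * x2 + b)
        if margin < 1:
            # Point inside the margin: the hinge loss pushes the hyperplane.
            w[0] += lr * (y * x1 - lam * w[0])
            w[1] += lr * (y * x2 - lam * w[1])
            b += lr * y
        else:
            # Correct with margin to spare: only the L2 penalty shrinks w,
            # which is what keeps the margin as wide as possible.
            w[0] -= lr * lam * w[0]
            w[1] -= lr * lam * w[1]

signs = [1 if w[0] * x1 + w[1] * x2 + b > 0 else -1
         for (x1, x2), _ in points]
```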
• K-Nearest Neighbors (KNN)
o KNN is an instance-based ("lazy") learning algorithm that defers all computation until classification time. It predicts an unknown data point from its k nearest neighbours. The choice of k strongly affects prediction accuracy, whether for classification or regression, and weighting neighbours by distance, so that closer neighbours count more than distant ones, is often useful. Drawbacks of KNN are its sensitivity to the local structure of the data and its high computational cost; the data should be normalized so that every feature lies in the same range.
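A minimal KNN sketch: classify a query point by majority vote among its k nearest labelled neighbours under Euclidean distance. The five training points are made up; the distance-weighted voting and normalization mentioned above are omitted for brevity.

```python
import math
from collections import Counter

# Made-up 2-D training points with two class labels.
train = [((1.0, 1.0), "a"), ((1.5, 2.0), "a"),
         ((5.0, 5.0), "b"), ((6.0, 5.5), "b"), ((5.5, 6.0), "b")]

def knn_predict(query, k=3):
    # Sort neighbours by Euclidean distance and vote among the k nearest.
    neighbours = sorted((math.dist(query, x), label) for x, label in train)
    votes = Counter(label for _, label in neighbours[:k])
    return votes.most_common(1)[0][0]
```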
• K-Means
o K-Means is an unsupervised learning algorithm that solves clustering problems. It partitions n points (each an observation or instance) into k clusters so that each point belongs to the cluster with the nearest mean (centroid). The assignment and update steps repeat until the centroids stop changing.
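The assign-then-update loop can be sketched directly. The six 2-D points below are made up to form two obvious clusters, and the initial centroids are simply the first point of each intended cluster, a naive choice (real implementations use smarter initialization such as k-means++).

```python
import math

# Two obvious clusters of made-up 2-D points; centroids start at the
# first point of each cluster (a naive initialisation).
points = [(1.0, 1.0), (1.5, 2.0), (1.2, 0.8),
          (8.0, 8.0), (8.5, 7.5), (7.8, 8.2)]
centroids = [points[0], points[3]]

for _ in range(100):
    # Assignment step: each point joins the cluster of its nearest centroid.
    clusters = [[], []]
    for p in points:
        nearest = min(range(2), key=lambda i: math.dist(p, centroids[i]))
        clusters[nearest].append(p)
    # Update step: each centroid moves to the mean of its cluster.
    new_centroids = [
        (sum(x for x, _ in c) / len(c), sum(y for _, y in c) / len(c))
        for c in clusters
    ]
    if new_centroids == centroids:  # converged: centroids stopped moving
        break
    centroids = new_centroids
```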
• Random Forest
o Random Forest is a very popular ensemble machine learning algorithm. Its basic idea is that the combined opinion of many is more accurate than the opinion of any individual. A random forest is an ensemble of decision trees (see Decision Tree): to classify a new object, each tree casts a vote, the votes are combined, and the majority decides the final prediction.
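The voting step can be sketched with three tiny hand-written "trees": plain threshold functions with made-up cutoffs standing in for trained decision trees. A real random forest also trains each tree on a bootstrap sample of the data with random feature subsets, which this sketch omits.

```python
from collections import Counter

# Three hand-written "trees": threshold rules with made-up cutoffs,
# standing in for trained decision trees.
tree_a = lambda x: 1 if x > 30 else 0
tree_b = lambda x: 1 if x > 28 else 0
tree_c = lambda x: 1 if x > 35 else 0

def forest_predict(x):
    # Each tree votes; the majority class wins.
    votes = Counter(tree(x) for tree in (tree_a, tree_b, tree_c))
    return votes.most_common(1)[0][0]
```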
• Dimensionality Reduction
o Dimensionality reduction is the process of reducing the number of random variables under consideration to obtain a set of "uncorrelated" principal variables. Its methods divide into feature selection and feature extraction.
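A minimal sketch of feature extraction via PCA using NumPy: project 2-D points that lie near a line onto the top eigenvector of their covariance matrix, reducing two correlated variables to one principal variable. The synthetic data is generated along the line y = 2x.

```python
import numpy as np

# Synthetic 2-D data generated along the line y = 2x with small noise.
rng = np.random.default_rng(0)
t = rng.normal(size=100)
X = np.column_stack([t, 2 * t + 0.05 * rng.normal(size=100)])

Xc = X - X.mean(axis=0)                  # centre the data
cov = Xc.T @ Xc / (len(X) - 1)           # 2x2 covariance matrix
eigvals, eigvecs = np.linalg.eigh(cov)   # eigenvalues in ascending order
top = eigvecs[:, -1]                     # direction of greatest variance
reduced = Xc @ top                       # 100 points, now 1-D
```

The recovered direction `top` points along (1, 2)/√5 up to sign, the line the data was generated on.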