Author: Chocolate Brain
https://juejin.cn/post/7235844016640933943
Babel
When I was very young, someone told me that code should be written artistically. I thought to myself: so highbrow, how pretentious. But over time, as proposal after proposal was adopted, the way we write JS has steadily evolved and the syntactic sugar has multiplied. What used to take three or four lines can now be done in one, and looking over the whole thing at a glance, you have to admit there is something to it. Below is a small demo:
const demo = () => [1,2,3].map(e => e + 1)
After Babel transformation, this line of code actually looks like this:
var demo = function demo() {
return [1, 2, 3].map(function (e) {
return e + 1;
});
};
This is how Babel achieves backward compatibility, so that the code can run in as many environments as possible. The main functions of Babel are therefore:
- Code transformation
- Adding features that are missing from the target environment via a polyfill (@babel/polyfill)
The @babel/polyfill module includes core-js and a custom regenerator runtime, which together can emulate a complete ES2015+ environment. This means you can use new built-ins like Promise and WeakMap, static methods like Array.from or Object.assign, instance methods like Array.prototype.includes, and generator functions (provided that you use the @babel/plugin-transform-regenerator plugin). To add these features, the polyfill is installed into the global scope and onto built-in prototypes such as String (which pollutes the global environment). As of Babel 7.4.0, @babel/polyfill is deprecated in favor of importing core-js and regenerator-runtime directly in the source code; in practical applications, @babel/preset-env can be used as a substitute.
In real projects, Babel can be configured in several ways (the source code analysis below covers the details). Here, we take babel.config.js as an example:
module.exports = {
presets: [...],
plugins: [...],
}
Plugins are the rules Babel follows when performing transformations, while presets are collections of plugins. You can learn more in the source code analysis. When several transformation plugins or presets handle the same node of the program (Program), they execute in the following order:
- Plugins run before presets.
- Plugins run in the order listed (first to last).
- Presets run in reverse order (last to first); the specific reason is explained in the source code analysis.
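The ordering rules above can be sketched with a small helper. The function name executionOrder and the entry names are hypothetical and only model the rule; this is not how Babel itself stores or schedules plugins.

```javascript
// Hypothetical helper: lists entries in the order the rules above describe.
function executionOrder({ plugins = [], presets = [] }) {
  // Plugins come first, in the order written; presets follow, last to first.
  return [...plugins, ...[...presets].reverse()];
}

const order = executionOrder({
  plugins: ["plugin-a", "plugin-b"],
  presets: ["preset-x", "preset-y"],
});
console.log(order); // [ 'plugin-a', 'plugin-b', 'preset-y', 'preset-x' ]
```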
The compilation process of Babel can be divided into three stages:
- Parsing: parse the code string into an abstract syntax tree.
- Transformation: apply transformation operations to the abstract syntax tree.
- Code Generation: generate a code string from the transformed abstract syntax tree.
In this process, the most important thing is the AST, as follows.
AST
What is AST
A classic interview question: What is the principle behind Babel? Answer: parse, transform, generate. In simple terms, Babel converts code into a specific data structure according to certain rules, performs create, read, update, and delete operations on that structure, and then converts it back into code. Since this is the frontend, where clearly layered trees are everywhere, that structure is a tree as well: the AST. Going deeper leads to the profound subject of compiler principles.
As a frontend developer, it is hard to avoid the AST. Webpack, ESLint, and lower-level optimizations are all closely tied to it.
An AST (abstract syntax tree) is an abstract representation of the syntactic structure of source code. It represents that structure as a tree, where each node stands for a construct in the source code.
Generating an AST mainly consists of two steps:
Lexical Analysis (lexical analyzer)
The lexer splits the code: it scans the source according to predefined rules and groups characters into lexical units (tokens), producing a token list. The demo code converted to tokens is as follows:
[
  { "type": "Keyword", "value": "const" },
  { "type": "Identifier", "value": "demo" },
  { "type": "Punctuator", "value": "=" },
  { "type": "Punctuator", "value": "(" },
  { "type": "Punctuator", "value": ")" },
  { "type": "Punctuator", "value": "=>" },
  { "type": "Punctuator", "value": "[" },
  { "type": "Numeric", "value": "1" },
  { "type": "Punctuator", "value": "," },
  { "type": "Numeric", "value": "2" },
  { "type": "Punctuator", "value": "," },
  { "type": "Numeric", "value": "3" },
  { "type": "Punctuator", "value": "]" },
  { "type": "Punctuator", "value": "." },
  { "type": "Identifier", "value": "map" },
  { "type": "Punctuator", "value": "(" },
  { "type": "Identifier", "value": "e" },
  { "type": "Punctuator", "value": "=>" },
  { "type": "Identifier", "value": "e" },
  { "type": "Punctuator", "value": "+" },
  { "type": "Numeric", "value": "1" },
  { "type": "Punctuator", "value": ")" }
]
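To make the lexical step more tangible, here is a minimal sketch of a regex-based tokenizer that yields a token list of this shape for the demo line. It only covers the token kinds that appear in the demo; the names KEYWORDS, TOKEN, and tokenize are illustrative and are not how any real parser is implemented.

```javascript
// Sketch only: covers just the token kinds that appear in the demo line.
const KEYWORDS = new Set(["const", "let", "var"]);
// Alternatives, in order: punctuator ("=>" is tried before "="), identifier, number.
const TOKEN = /\s*(?:(=>|[=()[\],.+])|([A-Za-z_$][\w$]*)|(\d+))/y;

function tokenize(source) {
  const text = source.trimEnd();
  const tokens = [];
  let pos = 0;
  while (pos < text.length) {
    // The sticky flag makes the match start exactly at `pos`.
    TOKEN.lastIndex = pos;
    const match = TOKEN.exec(text);
    if (!match) throw new SyntaxError(`Unexpected character at index ${pos}`);
    const [, punctuator, word, number] = match;
    if (punctuator !== undefined) {
      tokens.push({ type: "Punctuator", value: punctuator });
    } else if (word !== undefined) {
      const type = KEYWORDS.has(word) ? "Keyword" : "Identifier";
      tokens.push({ type, value: word });
    } else {
      tokens.push({ type: "Numeric", value: number });
    }
    pos = TOKEN.lastIndex;
  }
  return tokens;
}

const demoTokens = tokenize("const demo = () => [1,2,3].map(e => e + 1)");
console.log(demoTokens.length); // 22
```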
Syntax Analysis (syntax analyzer)
Once the token list is available, the parser relates the tokens to one another according to the grammar rules to form an AST. The AST generated from the demo code is shown below (too long to show in full):
Online AST conversion site: https://astexplorer.net/
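Since the full tree is too long to show, here is a hand-trimmed sketch of its shape, written as a plain object. It follows the node type names Babel's parser uses, but positional fields such as start, end, and loc are omitted, and the variable name demoAst is only for illustration.

```javascript
// Hand-trimmed sketch of the Babel AST for the demo line (positions omitted).
const demoAst = {
  type: "File",
  program: {
    type: "Program",
    body: [{
      type: "VariableDeclaration",
      kind: "const",
      declarations: [{
        type: "VariableDeclarator",
        id: { type: "Identifier", name: "demo" },
        // () => [1,2,3].map(...)
        init: {
          type: "ArrowFunctionExpression",
          params: [],
          body: {
            type: "CallExpression",
            callee: {
              type: "MemberExpression",
              object: {
                type: "ArrayExpression",
                elements: [1, 2, 3].map(value => ({ type: "NumericLiteral", value })),
              },
              property: { type: "Identifier", name: "map" },
            },
            // e => e + 1
            arguments: [{
              type: "ArrowFunctionExpression",
              params: [{ type: "Identifier", name: "e" }],
              body: {
                type: "BinaryExpression",
                operator: "+",
                left: { type: "Identifier", name: "e" },
                right: { type: "NumericLiteral", value: 1 },
              },
            }],
          },
        },
      }],
    }],
  },
};

console.log(demoAst.program.body[0].declarations[0].init.body.callee.property.name); // map
```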
AST with Babel
Ok, now we want Babel to perform this whole series of operations: generate the AST, transform it, and convert it back to JS. Taking the demo as an example, we do this by calling the API that Babel provides:
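const babel = require('@babel/core');
const t = babel.types;

// Input source; assumed here to be the demo line from above
const beforeFile = 'const demo = () => [1,2,3].map(e => e + 1)';

const transformLetToVar = babel.transformSync(beforeFile, {
  plugins: [{
    visitor: {
      // Transform [const, let] to var
      VariableDeclaration(path) {
        if (path.node.kind === 'let' || path.node.kind === 'const') {
          path.node.kind = 'var';
        }
      },
      // Transform () => {} to a function; note that the body may lack {}
      ArrowFunctionExpression(path) {
        let body = path.node.body;
        // Wrap an expression body in a block with an explicit return
        if (!t.isBlockStatement(body)) {
          body = t.blockStatement([t.returnStatement(body)]);
        }
        path.replaceWith(
          t.functionExpression(null, path.node.params, body, false, false)
        );
      },
      // Handle arrow callbacks passed to .map()
      CallExpression(path) {
        const { callee } = path.node;
        // Guard: only member calls such as arr.map(...) have a property
        if (
          t.isMemberExpression(callee) &&
          t.isIdentifier(callee.property, { name: 'map' })
        ) {
          const callback = path.node.arguments[0];
          if (t.isArrowFunctionExpression(callback)) {
            let body = callback.body;
            if (!t.isBlockStatement(body)) {
              body = t.blockStatement([t.returnStatement(body)]);
            }
            path.node.arguments[0] = t.functionExpression(
              null,
              callback.params,
              body,
              false,
              false
            );
          }
        }
      }
    }
  }]
});

// The transformed source is available on the result's code property
console.log(transformLetToVar.code);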
The code uses the transformSync method from @babel/core, which combines @babel/parser, @babel/traverse, and @babel/generator. Using those three packages directly, the code looks like this:
const parser = require('@babel/parser');
const traverse = require('@babel/traverse').default;
const generator = require('@babel/generator').default;
const t = require('@babel/types');

// Input source; assumed here to be the demo line from above
const code = 'const demo = () => [1,2,3].map(e => e + 1)';

// Step 1: Parse the code to AST
const ast = parser.parse(code);

// Step 2: Traverse and modify the AST
traverse(ast, {
  VariableDeclaration(path) {
    if (path.node.kind === 'let' || path.node.kind === 'const') {
      path.node.kind = 'var';
    }
  },
  ArrowFunctionExpression(path) {
    let body = path.node.body;
    if (!t.isBlockStatement(body)) {
      body = t.blockStatement([t.returnStatement(body)]);
    }
    path.replaceWith(
      t.functionExpression(null, path.node.params, body, false, false)
    );
  },
  CallExpression(path) {
    const { callee } = path.node;
    // Guard: only member calls such as arr.map(...) have a property
    if (
      t.isMemberExpression(callee) &&
      t.isIdentifier(callee.property, { name: 'map' })
    ) {
      const callback = path.node.arguments[0];
      if (t.isArrowFunctionExpression(callback)) {
        let body = callback.body;
        if (!t.isBlockStatement(body)) {
          body = t.blockStatement([t.returnStatement(body)]);
        }
        path.node.arguments[0] = t.functionExpression(
          null,
          callback.params,
          body,
          false,
          false
        );
      }
    }
  }
});

// Step 3: Generate new code from the modified AST
const output = generator(ast, {}, code);
console.log(output.code);
@babel/core Source Code Analysis
Above, when converting with Babel, we used babel.transformSync, which comes from @babel/core. Let's start there for a simple source code analysis. First, we go to Babel/core/index.js, which mainly contains basic imports and exports. The part relevant to babel.transformSync is as follows:
export {
transform,
transformSync,
transformAsync,
type FileResult,
} from "./transform";
Next, we go directly to Babel/core/transform:
type Transform = {
(code: string, callback: FileResultCallback): void;
(
code: string,
opts: InputOptions | undefined | null,
callback: FileResultCallback,
): void;
(code: string, opts?: InputOptions | null): FileResult | null;
};
const transformRunner = gensync(function* transform(
code: string,
opts?: InputOptions,
): Handler<FileResult | null> {
const config: ResolvedConfig | null = yield* loadConfig(opts);
if (config === null) return null;
return yield* run(config, code);
});
export const transform: Transform = function transform(
code,
optsOrCallback?: InputOptions | null | undefined | FileResultCallback,
maybeCallback?: FileResultCallback,
) {
let opts: InputOptions | undefined | null;
let callback: FileResultCallback | undefined;
if (typeof optsOrCallback === "function") {
callback = optsOrCallback;
opts = undefined;
} else {
opts = optsOrCallback;
callback = maybeCallback;
}
if (callback === undefined) {
if (process.env.BABEL_8_BREAKING) {
throw new Error(
"Starting from Babel 8.0.0, the 'transform' function expects a callback. If you need to call it synchronously, please use 'transformSync'.",
);
} else {
// console.warn(
// "Starting from Babel 8.0.0, the 'transform' function will expect a callback. If you need to call it synchronously, please use 'transformSync'.",
// );
return beginHiddenCallStack(transformRunner.sync)(code, opts);
}
}
beginHiddenCallStack(transformRunner.errback)(code, opts, callback);
};
export function transformSync(
...args: Parameters<typeof transformRunner.sync>
) {
return beginHiddenCallStack(transformRunner.sync)(...args);
}
export function transformAsync(
...args: Parameters<typeof transformRunner.async>
) {
return beginHiddenCallStack(transformRunner.async)(...args);
}
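The three exports above differ only in which method of transformRunner they pick: sync, async, or errback. The idea of writing the logic once as a generator and then driving it in different modes can be sketched as follows. This is a simplified illustration under my own assumptions, not gensync's real implementation; makeRunner and the placeholder steps inside the generator are hypothetical.

```javascript
// Hypothetical mini-runner illustrating the idea; this is not gensync's real code.
function makeRunner(generatorFn) {
  function sync(...args) {
    const iterator = generatorFn(...args);
    let step = iterator.next();
    // Feed each yielded value straight back in; nothing is awaited.
    while (!step.done) step = iterator.next(step.value);
    return step.value;
  }

  async function asyncRun(...args) {
    const iterator = generatorFn(...args);
    let step = iterator.next();
    // Same loop, but every yielded value may be a promise.
    while (!step.done) step = iterator.next(await step.value);
    return step.value;
  }

  function errback(...args) {
    // The last argument is a Node-style callback: (error, result) => void
    const callback = args.pop();
    asyncRun(...args).then(
      result => callback(null, result),
      error => callback(error)
    );
  }

  return { sync, async: asyncRun, errback };
}

// Stand-in for the loadConfig/run pair; both steps are placeholders.
const runner = makeRunner(function* transform(code, opts) {
  const config = yield (opts === undefined ? null : { ...opts });
  if (config === null) return null;
  return { code: code.trim(), options: config };
});

console.log(runner.sync(" let a = 1; ", {})); // { code: 'let a = 1;', options: {} }
console.log(runner.sync("let a = 1;")); // null
```

The generator body is written once, and the caller decides whether yielded values are used immediately or awaited, which is why transform, transformSync, and transformAsync can share a single implementation.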
We can see that all three call transformRunner, which takes two parameters, code and opts:
const transformRunner = gensync(function* transform(
code: string,
opts?: InputOptions,
): Handler<FileResult | null> {
const config: ResolvedConfig | null = yield* loadConfig(opts);
if (config === null) return null;
return yield* run(config, code);
});
Here, InputOptions is:
export type InputOptions = ValidatedOptions;
export type ValidatedOptions = {
cwd?: string;
filename?: string;
filenameRelative?: string;
babelrc?: boolean;
babelrcRoots?: BabelrcSearch;
configFile?: ConfigFileSearch;
root?: string;
rootMode?: RootMode;
code?: boolean;
ast?: boolean;
cloneInputAst?: boolean;
inputSourceMap?: RootInputSourceMapOption;
envName?: string;
caller?: CallerMetadata;
extends?: string;
env?: EnvSet<ValidatedOptions>;
ignore?: IgnoreList;
only?: IgnoreList;
overrides?: OverridesList;
// Generally verify if a given config object should be applied to the given file.
test?: ConfigApplicableTest;
include?: ConfigApplicableTest;
exclude?: ConfigApplicableTest;
presets?: PluginList;
plugins?: PluginList;
passPerPreset?: boolean;
assumptions?: {
[name: string]: boolean;
};
// browserslists-related options
targets?: TargetsListOrObject;
browserslistConfigFile?: ConfigFileSearch;
browserslistEnv?: string;
// Options for @babel/generator
retainLines?: boolean;
comments?: boolean;
shouldPrintComment?: Function;
compact?: CompactOption;
minified?: boolean;
auxiliaryCommentBefore?: string;
auxiliaryCommentAfter?: string;
// Parser
sourceType?: SourceTypeOption;
wrapPluginVisitorMethod?: Function;
highlightCode?: boolean;
// Sourcemap generation options.
sourceMaps?: SourceMapsOption;
sourceMap?: SourceMapsOption;
sourceFileName?: string;
sourceRoot?: string;
// Deprecate top level parserOpts
parserOpts?: ParserOptions;
// Deprecate top level generatorOpts
generatorOpts?: GeneratorOptions;
};
These options include properties such as plugins and presets. plugins is the property we set on the options object passed as the second argument to babel.transformSync, and it is the basis of code transformation in Babel. Each plugin is a small JavaScript program that tells Babel how to perform a specific code transformation, while presets are predefined collections of plugins. JavaScript has so many new features that specifying every required plugin by hand would make the configuration very cumbersome, so Babel provides presets that pull in a whole set of plugins with one line of code.
Returning to transformRunner, its body has two steps: calling loadConfig and calling run. First, let's look at loadConfig, which is actually loadFullConfig in babel-core/src/config/full.ts, the function that handles loading the configuration, presets, and plugins. The source code is as follows, and we will analyze it step by step:
export default gensync(function* loadFullConfig(
inputOpts: unknown,
): Handler<ResolvedConfig | null> {
const result = yield* loadPrivatePartialConfig(inputOpts);
if (!result) {
return null;
}
const { options, context, fileHandling } = result;
if (fileHandling === "ignored") {
return null;
}
const optionDefaults = {};
const { plugins, presets } = options;
if (!plugins || !presets) {
throw new Error("Assertion failure - plugins and presets exist");
}
// Create presetContext object, adding options.targets to the original context
const presetContext: Context.FullPreset = {
...context,
targets: options.targets,
};
// ...
})
First, it calls loadPrivatePartialConfig to build the configuration chain. This function receives opts, validates and transforms them, and finally returns a processed configuration object. Along the way, it also converts some values in the input options into absolute paths and creates some new objects and properties:
export default function* loadPrivatePartialConfig(
inputOpts: unknown,
): Handler<PrivPartialConfig | null> {
if (
inputOpts != null &&
(typeof inputOpts !== "object" || Array.isArray(inputOpts))
) {
throw new Error("Babel options must be an object, null, or undefined");
}
const args = inputOpts ? validate("arguments", inputOpts) : {};
const {
envName = getEnv(),
cwd = ".",
root: rootDir = ".",
rootMode = "root",
caller,
cloneInputAst = true,
} = args;
// Convert cwd and rootDir to absolute paths
const absoluteCwd = path.resolve(cwd);
const absoluteRootDir = resolveRootMode(
path.resolve(absoluteCwd, rootDir),
rootMode,
);
// If filename is a string, convert to absolute path
const filename =
typeof args.filename === "string"
? path.resolve(cwd, args.filename)
: undefined;
// resolveShowConfigPath method is used to resolve file paths, returning that path if it exists and points to a file
const showConfigPath = yield* resolveShowConfigPath(absoluteCwd);
// Reassemble the converted data into an object named context
const context: ConfigContext = {
filename,
cwd: absoluteCwd,
root: absoluteRootDir,
envName,
caller,
showConfig: showConfigPath === filename,
};
// Call buildRootChain, the source code analysis of buildRootChain is below
const configChain = yield* buildRootChain(args, context);
if (!configChain) return null;
const merged: ValidatedOptions = {
assumptions: {},
};
// Iterate through configChain.options and merge into merged
configChain.options.forEach(opts => {
mergeOptions(merged as any, opts);
});
// Define a new options object
const options: NormalizedOptions = {
...merged,
targets: resolveTargets(merged, absoluteRootDir),
// Tack the passes onto the object itself so that, if this object is
// passed back to Babel a second time, it will be in the right structure
// to not change behavior.
cloneInputAst,
babelrc: false,
configFile: false,
browserslistConfigFile: false,
passPerPreset: false,
envName: context.envName,
cwd: context.cwd,
root: context.root,
rootMode: "root",
filename:
typeof context.filename === "string" ? context.filename : undefined,
plugins: configChain.plugins.map(descriptor =>
createItemFromDescriptor(descriptor),
),
presets: configChain.presets.map(descriptor =>
createItemFromDescriptor(descriptor),
),
};
return {
options,
context,
fileHandling: configChain.fileHandling,
ignore: configChain.ignore,
babelrc: configChain.babelrc,
config: configChain.config,
files: configChain.files,
};
}
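The path handling in the function above can be isolated in a small sketch. It mirrors the defaults shown in the source (cwd and root both default to "."), but buildContext is a hypothetical helper that leaves out rootMode, envName, and everything else.

```javascript
const path = require("path");

// Hypothetical reduction of the path handling in loadPrivatePartialConfig.
function buildContext({ cwd = ".", root = ".", filename } = {}) {
  // A relative cwd is resolved against the process working directory.
  const absoluteCwd = path.resolve(cwd);
  return {
    cwd: absoluteCwd,
    // root and filename are resolved relative to the absolute cwd.
    root: path.resolve(absoluteCwd, root),
    filename:
      typeof filename === "string"
        ? path.resolve(absoluteCwd, filename)
        : undefined,
  };
}

const context = buildContext({ cwd: "/project", filename: "src/index.js" });
console.log(context.filename); // /project/src/index.js on POSIX systems
```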
Overall, loadFullConfig does three things: it processes the configuration, the presets, and the plugins. The configuration chain itself is built by buildRootChain, whose source we analyze below:
export function* buildRootChain(
opts: ValidatedOptions,
context: ConfigContext,
): Handler<RootConfigChain | null> {
let configReport, babelRcReport;
const programmaticLogger = new ConfigPrinter();
// Generate programmatic options (options passed directly in code), which are used when going through @babel/cli or babel.transform
const programmaticChain = yield* loadProgrammaticChain(
{
options: opts,
dirname: context.cwd,
},
context,
undefined,
programmaticLogger,
);
if (!programmaticChain) return null;
const programmaticReport = yield* programmaticLogger.output();
let configFile;
// If a configuration file is specified, call loadConfig to load it; if not, call findRootConfig to load the root directory configuration
if (typeof opts.configFile === "string") {
configFile = yield* loadConfig(
opts.configFile,
context.cwd,
context.envName,
context.caller,
);
} else if (opts.configFile !== false) {
configFile = yield* findRootConfig(
context.root,
context.envName,
context.caller,
);
}
// ...
}
The findRootConfig method looks for every name in ROOT_CONFIG_FILENAMES in the root directory (by default, the current execution directory) and loads the configuration file it finds; if more than one of them exists, it throws an error.
export const ROOT_CONFIG_FILENAMES = [
"babel.config.js",
"babel.config.cjs",
"babel.config.mjs",
"babel.config.json",
"babel.config.cts",
];
export function findRootConfig(
dirname: string,
envName: string,
caller: CallerMetadata | undefined,
): Handler<ConfigFile | null> {
return loadOneConfig(ROOT_CONFIG_FILENAMES, dirname, envName, caller);
}
function* loadOneConfig(
names: string[],
dirname: string,
envName: string,
caller: CallerMetadata | undefined,
previousConfig: ConfigFile | null = null,
): Handler<ConfigFile | null> {
const configs = yield* gensync.all(
names.map(filename =>
readConfig(path.join(dirname, filename), envName, caller),
),
);
const config = configs.reduce((previousConfig: ConfigFile | null, config) => {
if (config && previousConfig) {
throw new ConfigError(
`Multiple configuration files found. Please remove one:\n` +
` - ${path.basename(previousConfig.filepath)}\n` +
` - ${config.filepath}\n` +
`from ${dirname}`,
);
}
return config || previousConfig;
}, previousConfig);
if (config) {
debug("Found configuration %o from %o.", config.filepath, dirname);
}
return config;
}
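The "at most one root config" rule enforced by loadOneConfig can be sketched without touching the file system. Here pickSingleConfig is a hypothetical stand-in, and the exists callback replaces the real readConfig call; unlike the real code, the error message lists all matches at once.

```javascript
// Hypothetical stand-in for loadOneConfig: `exists` replaces the file-system read.
function pickSingleConfig(names, exists) {
  const found = names.filter(exists);
  if (found.length > 1) {
    throw new Error(
      "Multiple configuration files found. Please remove one:\n - " +
        found.join("\n - ")
    );
  }
  // Exactly one match is returned; no match yields null.
  return found[0] ?? null;
}

const candidateNames = [
  "babel.config.js",
  "babel.config.cjs",
  "babel.config.mjs",
  "babel.config.json",
  "babel.config.cts",
];
const onDisk = new Set(["babel.config.json"]);
console.log(pickSingleConfig(candidateNames, name => onDisk.has(name))); // babel.config.json
```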
Ok, we have finally finished analyzing loadConfig. Next comes the run method, which receives three parameters: config (the value returned by the code analyzed above), code (the code string), and ast (an optional AST):
export function* run(
config: ResolvedConfig,
code: string,
ast?: t.File | t.Program | null,
): Handler<FileResult> {
const file = yield* normalizeFile(
config.passes,
normalizeOptions(config),
code,
ast,
);
// ...
}
The normalizeFile code is as follows. It takes config.passes (an array of passes, each of which is an array of plugins), the normalized options, the source code, and an optional ast as parameters:
export default function* normalizeFile(
pluginPasses: PluginPasses,
options: { [key: string]: any },
code: string,
ast?: t.File | t.Program | null,
): Handler<File> {
code = `${code || ""}`;
if (ast) {
if (ast.type === "Program") {
ast = file(ast, [], []);
} else if (ast.type !== "File") {
throw new Error("AST root must be a Program or File node");
}
if (options.cloneInputAst) {
ast = cloneDeep(ast);
}
} else {
ast = yield* parser(pluginPasses, options, code);
}
let inputMap = null;
if (options.inputSourceMap !== false) {
if (typeof options.inputSourceMap === "object") {
inputMap = convertSourceMap.fromObject(options.inputSourceMap);
}
if (!inputMap) {
const lastComment = extractComments(INLINE_SOURCEMAP_REGEX, ast);
if (lastComment) {
try {
inputMap = convertSourceMap.fromComment(lastComment);
} catch (err) {
debug("discarding unknown inline input sourcemap", err);
}
}
}
if (!inputMap) {
const lastComment = extractComments(EXTERNAL_SOURCEMAP_REGEX, ast);
if (typeof options.filename === "string" && lastComment) {
try {
const match: [string, string] = EXTERNAL_SOURCEMAP_REGEX.exec(
lastComment,
) as any;
const inputMapContent = fs.readFileSync(
path.resolve(path.dirname(options.filename), match[1]),
"utf8",
);
inputMap = convertSourceMap.fromJSON(inputMapContent);
} catch (err) {
debug("discarding unknown file input sourcemap", err);
}
} else if (lastComment) {
debug("discarding un-loadable file input sourcemap");
}
}
}
return new File(options, {
code,
ast: ast as t.File,
inputMap,
});
}
Here, parser ultimately points to babel-core/src/parser/index.ts:
export default function* parser(
pluginPasses: PluginPasses,
{ parserOpts, highlightCode = true, filename = "unknown" }: any,
code: string,
): Handler<ParseResult> {
try {
const results = [];
for (const plugins of pluginPasses) {
for (const plugin of plugins) {
const { parserOverride } = plugin;
if (parserOverride) {
const ast = parserOverride(code, parserOpts, parse);
if (ast !== undefined) results.push(ast);
}
}
}
if (results.length === 0) {
return parse(code, parserOpts);
} else if (results.length === 1) {
yield* [];
if (typeof results[0].then === "function") {
throw new Error(
`You appear to be using an async parser plugin, ` +
`which your current version of Babel does not support. ` +
`If you're using a published plugin, you may need to upgrade ` +
`your @babel/core version.`,
);
}
return results[0];
}
throw new Error("More than one plugin attempted to override parsing.");
} catch (err) {
if (err.code === "BABEL_PARSER_SOURCETYPE_MODULE_REQUIRED") {
err.message +=
"\nConsider renaming the file to '.mjs', or setting sourceType:module " +
"or sourceType:unambiguous in your Babel config for this file.";
// err.code will be changed to BABEL_PARSE_ERROR later.
}
const { loc, missingPlugin } = err;
if (loc) {
const codeFrame = codeFrameColumns(
code,
{
start: {
line: loc.line,
column: loc.column + 1,
},
},
{
highlightCode,
},
);
if (missingPlugin) {
err.message =
`${filename}: ` +
generateMissingPluginMessage(missingPlugin[0], loc, codeFrame);
} else {
err.message = `${filename}: ${err.message}\n\n` + codeFrame;
}
err.code = "BABEL_PARSE_ERROR";
}
throw err;
}
}
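The core rule in the parser function above (at most one plugin may override the default, otherwise fall back to it) can be sketched in isolation. runWithOverride, the toy plugins, and defaultParse are all hypothetical; only the control flow mirrors the source.

```javascript
// Hypothetical helper: `hookName` would be "parserOverride" in the code above.
function runWithOverride(pluginPasses, hookName, fallback, ...args) {
  const results = [];
  for (const plugins of pluginPasses) {
    for (const plugin of plugins) {
      const hook = plugin[hookName];
      if (hook) {
        // The default implementation is passed along so a hook can delegate to it.
        const result = hook(...args, fallback);
        if (result !== undefined) results.push(result);
      }
    }
  }
  if (results.length === 0) return fallback(...args);
  if (results.length === 1) return results[0];
  throw new Error(`More than one plugin attempted to override ${hookName}.`);
}

// Toy default parser and a toy plugin that overrides it.
const defaultParse = code => ({ type: "File", source: code, parsedBy: "default" });
const customPlugin = {
  parserOverride: code => ({ type: "File", source: code, parsedBy: "custom" }),
};

const passes = [[{ name: "plain-plugin" }, customPlugin]];
console.log(runWithOverride(passes, "parserOverride", defaultParse, "let a").parsedBy); // custom
```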
Returning to the run method: after obtaining the AST, it calls transformFile to perform the transformation:
function* transformFile(file: File, pluginPasses: PluginPasses): Handler<void> {
for (const pluginPairs of pluginPasses) {
const passPairs: [Plugin, PluginPass][] = [];
const passes = [];
const visitors = [];
for (const plugin of pluginPairs.concat([loadBlockHoistPlugin()])) {
const pass = new PluginPass(file, plugin.key, plugin.options);
passPairs.push([plugin, pass]);
passes.push(pass);
visitors.push(plugin.visitor);
}
for (const [plugin, pass] of passPairs) {
const fn = plugin.pre;
if (fn) {
const result = fn.call(pass, file);
yield* [];
if (isThenable(result)) {
throw new Error(
`You appear to be using an plugin with an async .pre, ` +
`which your current version of Babel does not support. ` +
`If you're using a published plugin, you may need to upgrade ` +
`your @babel/core version.`,
);
}
}
}
const visitor = traverse.visitors.merge(
visitors,
passes,
file.opts.wrapPluginVisitorMethod,
);
traverse(file.ast, visitor, file.scope);
for (const [plugin, pass] of passPairs) {
const fn = plugin.post;
if (fn) {
const result = fn.call(pass, file);
yield* [];
if (isThenable(result)) {
throw new Error(
`You appear to be using an plugin with an async .post, ` +
`which your current version of Babel does not support. ` +
`If you're using a published plugin, you may need to upgrade ` +
`your @babel/core version.`,
);
}
}
}
}
}
In transformFile, each plugin's pre, visitor, and post are invoked in that order:
- pre: called before traversal. It is typically used to set up initial state that must be kept on the plugin state object throughout the traversal. As the code above shows, it is invoked as fn.call(pass, file), so inside the hook this is the PluginPass instance, which holds information about the plugin's execution context, and the argument is the File.
- visitor: this object defines the methods called during traversal. Each key is the type of node to visit, and each value is either a visitor function or an object containing enter and exit methods.
- post: called after the traversal is complete, typically to clean up or to collect and use results computed during the traversal. It is invoked the same way as pre.
Next, we return to the run method:
{
// ...
let outputCode, outputMap;
try {
if (opts.code !== false) {
({ outputCode, outputMap } = generateCode(config.passes, file));
}
} catch (e) {
e.message = `${opts.filename ?? "unknown file"}: ${e.message}`;
if (!e.code) {
e.code = "BABEL_GENERATE_ERROR";
}
throw e;
}
return {
metadata: file.metadata,
options: opts,
ast: opts.ast === true ? file.ast : null,
code: outputCode === undefined ? null : outputCode,
map: outputMap === undefined ? null : outputMap,
sourceType: file.ast.program.sourceType,
externalDependencies: flattenToSet(config.externalDependencies),
};
}
Finally, generateCode is called to convert the AST back into code. Its source is as follows, and its structure is similar to that of parser:
export default function generateCode(
pluginPasses: PluginPasses,
file: File,
): {
outputCode: string;
outputMap: SourceMap | null;
} {
const { opts, ast, code, inputMap } = file;
const { generatorOpts } = opts;
generatorOpts.inputSourceMap = inputMap?.toObject();
const results = [];
for (const plugins of pluginPasses) {
for (const plugin of plugins) {
const { generatorOverride } = plugin;
if (generatorOverride) {
const result = generatorOverride(ast, generatorOpts, code, generate);
if (result !== undefined) results.push(result);
}
}
}
let result;
if (results.length === 0) {
result = generate(ast, generatorOpts, code);
} else if (results.length === 1) {
result = results[0];
if (typeof result.then === "function") {
throw new Error(
`You appear to be using an async codegen plugin, ` +
`which your current version of Babel does not support. ` +
`If you're using a published plugin, you may need to upgrade ` +
`your @babel/core version.`,
);
}
} else {
throw new Error("More than one plugin attempted to override codegen.");
}
let { code: outputCode, decodedMap: outputMap = result.map } = result;
if (result.__mergedMap) {
outputMap = { ...result.map };
} else {
if (outputMap) {
if (inputMap) {
outputMap = mergeSourceMap(
inputMap.toObject(),
outputMap,
generatorOpts.sourceFileName,
);
} else {
outputMap = result.map;
}
}
}
if (opts.sourceMaps === "inline" || opts.sourceMaps === "both") {
outputCode += "\n" + convertSourceMap.fromObject(outputMap).toComment();
}
if (opts.sourceMaps === "inline") {
outputMap = null;
}
return { outputCode, outputMap };
}
At this point, the analysis of the run method is complete, and with it the source code analysis of @babel/core that started from babel.transformSync!
Simple JavaScript Compiler (Like Babel)
Next, we will build a simple compiler for demo purposes, following the same parse, transform, generate process:
Node Types (constants.js)
const TokenTypes = {
Keyword: "Keyword",
Identifier: "Identifier",
Punctuator: "Punctuator",
String: "String",
Numeric: "Numeric",
Paren: 'Paren',
Arrow: 'Arrow'
}
const AST_Types = {
Literal: "Literal",
Identifier: "Identifier",
AssignmentExpression: "AssignmentExpression",
VariableDeclarator: "VariableDeclarator",
VariableDeclaration: "VariableDeclaration",
Program: "Program",
NumericLiteral: "NumericLiteral",
ArrowFunctionExpression: 'ArrowFunctionExpression',
FunctionExpression: 'FunctionExpression'
}
module.exports = {
TokenTypes,
AST_Types
}
Lexical Analysis (tokenizer.js)
const { TokenTypes } = require("./constants")

// Match keywords (anchored, so identifiers such as "letter" are not treated as keywords)
const KEYWORD = /^let$/
// Match "=", ";"
const PUNCTUATOR = /[=;]/
// Match whitespace
const WHITESPACE = /\s/
// Match letters
const LETTERS = /[A-Za-z]/
// Match digits
const NUMERIC = /[0-9]/
// Match parentheses
const PAREN = /[()]/

function tokenizer(input) {
  const tokens = []
  let current = 0
  // Traverse the string
  while (current < input.length) {
    let char = input[current]

    // Handle keywords and variable names
    if (LETTERS.test(char)) {
      let value = ''
      // Collect consecutive letters. The bounds check matters: past the end of
      // the input, char is undefined, which would be tested as "undefined" and match.
      while (current < input.length && LETTERS.test(char)) {
        value += char
        char = input[++current]
      }
      // Check if the collected string is a keyword
      tokens.push({
        type: KEYWORD.test(value) ? TokenTypes.Keyword : TokenTypes.Identifier,
        value
      })
      continue
    }

    // Check if it's a parenthesis
    if (PAREN.test(char)) {
      tokens.push({ type: TokenTypes.Paren, value: char })
      current++
      continue
    }

    // Check if it's an arrow symbol (must come before the "=" punctuator check)
    if (char === '=' && input[current + 1] === '>') {
      tokens.push({ type: TokenTypes.Arrow, value: '=>' })
      current += 2 // Skip the two characters
      continue
    }

    // Check if it's a number
    if (NUMERIC.test(char)) {
      let value = ''
      while (current < input.length && NUMERIC.test(char)) {
        value += char
        char = input[++current]
      }
      tokens.push({ type: TokenTypes.Numeric, value })
      continue
    }

    // Check if it's a symbol: "=", ";"
    if (PUNCTUATOR.test(char)) {
      tokens.push({ type: TokenTypes.Punctuator, value: char })
      current++
      continue
    }

    // Skip whitespace
    if (WHITESPACE.test(char)) {
      current++
      continue
    }

    // Handle strings
    if (char === '"') {
      let value = ''
      // Skip the leading quote
      char = input[++current]
      // Traverse until the next quote is encountered
      while (current < input.length && char !== '"') {
        value += char
        char = input[++current]
      }
      if (char !== '"') {
        throw new SyntaxError('Unterminated string')
      }
      // Skip the trailing quote
      current++
      tokens.push({ type: TokenTypes.String, value: '"' + value + '"' })
      continue
    }

    // If no rule matches, throw an error
    throw new TypeError('Unknown character: ' + char)
  }
  return tokens
}

module.exports = tokenizer
Syntax Analysis (parser.js)
const {TokenTypes, AST_Types} = require("./constants");
function parser(tokens) {
let current = 0;
function walk() {
let token = tokens[current];
if (token.type === TokenTypes.Numeric) {
current++;
return {
type: AST_Types.NumericLiteral,
value: token.value,
};
}
if (token.type === TokenTypes.String) {
current++;
return {
type: AST_Types.Literal,
value: token.value,
};
}
if (token.type === TokenTypes.Identifier) {
current++;
return {
type: AST_Types.Identifier,
name: token.value,
};
}
if (token.type === TokenTypes.Keyword && token.value === 'let') {
token = tokens[++current];
let node = {
type: AST_Types.VariableDeclaration,
kind: 'let',
declarations: [],
};
// Guard against running past the last token (e.g. input without a trailing ";")
while (token && token.type === TokenTypes.Identifier) {
node.declarations.push({
type: AST_Types.VariableDeclarator,
id: {
type: AST_Types.Identifier,
name: token.value,
},
init: null,
});
token = tokens[++current];
if (token && token.type === TokenTypes.Punctuator && token.value === '=') {
token = tokens[++current];
if (token && token.type === TokenTypes.Paren) {
token = tokens[++current];
if (token && token.type === TokenTypes.Paren) {
token = tokens[++current];
if (token && token.type === TokenTypes.Arrow) {
token = tokens[++current];
let arrowFunction = {
type: AST_Types.ArrowFunctionExpression,
params: [],
body: walk(),
};
node.declarations[node.declarations.length - 1].init = arrowFunction;
}
}
} else {
node.declarations[node.declarations.length - 1].init = walk();
}
}
token = tokens[current];
if (token && token.type === TokenTypes.Punctuator && token.value === ';') {
current++;
break;
}
}
return node;
}
throw new TypeError(token.type);
}
let ast = {
type: AST_Types.Program,
body: [],
};
while (current < tokens.length) {
ast.body.push(walk());
}
return ast;
}
module.exports = parser;
Traverser (traverser.js)
const constants = require("./constants")
const { AST_Types } = constants

function traverser(ast, visitor) {
  // Visit every node in an array by calling traverseNode on each element
  function traverseArray(array, parent) {
    array.forEach(function (child) {
      traverseNode(child, parent)
    })
  }

  function traverseNode(node, parent) {
    // A declarator without an initializer (e.g. `let a;`) has init === null
    if (!node) return

    // If the visitor defines a method for this node type, call it
    const method = visitor[node.type]
    if (method) {
      method(node, parent)
    }

    // Descend into the children, which differ by node type
    switch (node.type) {
      case AST_Types.Program:
        traverseArray(node.body, node)
        break
      case AST_Types.VariableDeclaration:
        traverseArray(node.declarations, node)
        break
      case AST_Types.VariableDeclarator:
        traverseNode(node.id, node)
        traverseNode(node.init, node)
        break
      case AST_Types.ArrowFunctionExpression:
        traverseArray(node.params, node)
        traverseNode(node.body, node)
        break
      // The following node types have no children to visit
      case AST_Types.AssignmentExpression:
      case AST_Types.Identifier:
      case AST_Types.Literal:
      case AST_Types.NumericLiteral:
        break
      default:
        throw new TypeError(node.type)
    }
  }

  // Start the traversal at the root, which has no parent, so pass null
  traverseNode(ast, null)
}

module.exports = traverser
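To see the visitor pattern at work, here is a minimal self-contained sketch. The `traverse` function is a simplified stand-in for the traverser above, not the article's exact code, and the AST is written by hand with node type strings assumed to match the values in AST_Types.

```javascript
// Simplified stand-in for the traverser; handles only the node types used below.
function traverse(node, visitor, parent = null) {
  if (!node) return; // e.g. a declarator without an initializer
  const method = visitor[node.type];
  if (method) method(node, parent);
  switch (node.type) {
    case 'Program':
      node.body.forEach((child) => traverse(child, visitor, node));
      break;
    case 'VariableDeclaration':
      node.declarations.forEach((child) => traverse(child, visitor, node));
      break;
    case 'VariableDeclarator':
      traverse(node.id, visitor, node);
      traverse(node.init, visitor, node);
      break;
    default:
      break; // leaf nodes such as Identifier and NumericLiteral
  }
}

// Hand-written AST for: let a = 1, b;
const ast = {
  type: 'Program',
  body: [{
    type: 'VariableDeclaration',
    kind: 'let',
    declarations: [
      {
        type: 'VariableDeclarator',
        id: { type: 'Identifier', name: 'a' },
        init: { type: 'NumericLiteral', value: '1' },
      },
      {
        type: 'VariableDeclarator',
        id: { type: 'Identifier', name: 'b' },
        init: null,
      },
    ],
  }],
};

// The visitor only declares the node types it cares about.
const visited = [];
traverse(ast, {
  Identifier(node, parent) {
    visited.push(parent.type + ' -> ' + node.name);
  },
  NumericLiteral(node) {
    visited.push('literal ' + node.value);
  },
});

console.log(visited);
// visited holds: 'VariableDeclarator -> a', 'literal 1', 'VariableDeclarator -> b'
```

Each visitor method receives the node and its parent, which is what lets a transformer replace a child in place through the parent.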
Transformer (transformer.js)
const traverser = require("./traverser")
const constants = require("./constants")
const { AST_Types } = constants

function transformer(ast) {
  const newAst = {
    type: AST_Types.Program,
    body: [],
    sourceType: "script"
  };

  // Give the old root a reference to the new body so that visitors can
  // push transformed nodes into it through parent._context
  ast._context = newAst.body

  // Pass the AST and the visitor into the traverser
  traverser(ast, {
    // Convert let to var
    VariableDeclaration: function (node, parent) {
      const variableDeclaration = {
        type: AST_Types.VariableDeclaration,
        // The declarations array is shared by reference with the old AST, so
        // the arrow function replacement below also shows up in the new AST
        declarations: node.declarations,
        kind: "var"
      };
      parent._context.push(variableDeclaration)
    },

    // Convert an arrow function into an ordinary function expression
    ArrowFunctionExpression: function (node, parent) {
      const functionExpression = {
        type: AST_Types.FunctionExpression,
        params: node.params, // Keep the parameter list unchanged
        body: node.body, // Keep the function body unchanged
      }
      if (parent.type === AST_Types.VariableDeclarator) {
        parent.init = functionExpression;
      }
    },
  });

  return newAst
}

module.exports = transformer
Code Generator (codeGenerator.js)
const constants = require("./constants")
const { AST_Types } = constants

function codeGenerator(node) {
  // Handle each node type separately
  switch (node.type) {
    // For a Program node, generate each node in its body and join them with newlines
    case AST_Types.Program:
      return node.body.map(codeGenerator).join('\n')
    // Join multiple declarators explicitly instead of relying on array-to-string coercion
    case AST_Types.VariableDeclaration:
      return node.kind + ' ' + node.declarations.map(codeGenerator).join(', ')
    // A declarator without an initializer (e.g. `let a;`) emits only its name
    case AST_Types.VariableDeclarator:
      return node.init
        ? codeGenerator(node.id) + ' = ' + codeGenerator(node.init)
        : codeGenerator(node.id)
    case AST_Types.Identifier:
      return node.name
    // Literals emit only their own text; the closing "; }" belongs to the function
    case AST_Types.Literal:
      return '"' + node.value + '"'
    case AST_Types.NumericLiteral:
      return String(node.value)
    // The arrow function body is a single expression, so wrap it in a return statement
    case AST_Types.FunctionExpression:
      return (
        'function(' + node.params.map(codeGenerator).join(', ') + ') { return ' +
        codeGenerator(node.body) + '; }'
      )
    default:
      throw new TypeError(node.type)
  }
}

module.exports = codeGenerator
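A code generator is easiest to check against a hand-built AST, independently of the tokenizer and parser. The sketch below is a self-contained stand-in, not the article's exact file: the function name `generate` is hypothetical, and the node type strings are assumed to match the values in AST_Types.

```javascript
// Self-contained stand-in for a code generator over the node types used in this article.
function generate(node) {
  switch (node.type) {
    case 'Program':
      return node.body.map(generate).join('\n');
    case 'VariableDeclaration':
      return node.kind + ' ' + node.declarations.map(generate).join(', ');
    case 'VariableDeclarator':
      return node.init
        ? generate(node.id) + ' = ' + generate(node.init)
        : generate(node.id);
    case 'Identifier':
      return node.name;
    case 'NumericLiteral':
      return String(node.value);
    case 'Literal':
      return JSON.stringify(node.value); // adds quotes and escapes special characters
    case 'FunctionExpression':
      return 'function(' + node.params.map(generate).join(', ') +
        ') { return ' + generate(node.body) + '; }';
    default:
      throw new TypeError('Unknown node type: ' + node.type);
  }
}

// Small helper to keep the hand-written AST readable.
const declare = (name, init) => ({
  type: 'VariableDeclarator',
  id: { type: 'Identifier', name },
  init,
});

const ast = {
  type: 'Program',
  body: [
    {
      type: 'VariableDeclaration',
      kind: 'var',
      declarations: [
        declare('a', {
          type: 'FunctionExpression',
          params: [],
          body: { type: 'NumericLiteral', value: '1' },
        }),
      ],
    },
    {
      type: 'VariableDeclaration',
      kind: 'var',
      declarations: [
        declare('s', { type: 'Literal', value: 'hi' }),
        declare('n', { type: 'NumericLiteral', value: '2' }),
      ],
    },
  ],
};

const output = generate(ast);
console.log(output);
// var a = function() { return 1; }
// var s = "hi", n = 2
```

The second declaration shows why literals should emit only their own text: a plain `var n = 2` must not pick up the function's closing brace.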
index.js
const tokenizer = require('./tokenizer')
const parser = require('./parser')
const transformer = require("./transformer")
const codeGenerator = require("./codeGenerator")
const demo = 'let a = () => 1;'
const tokens = tokenizer(demo)
const AST = parser(tokens)
const newAST = transformer(AST)
const newCode = codeGenerator(newAST)
console.log(newCode)
console.dir(newAST, {depth: null})
The final transformation result is as follows:
var a = function() { return 1; }
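A quick way to sanity-check generated code is to evaluate it. The sketch below uses the standard `Function` constructor to run the output string shown above; the variable names are illustrative only.

```javascript
// The output string produced by the pipeline for: let a = () => 1;
const generated = 'var a = function() { return 1; }';

// new Function compiles the string as a function body. Appending
// "return a();" calls the generated function and hands back its result.
const result = new Function(generated + ';\nreturn a();')();

console.log(result); // 1
```

If the generator had emitted malformed code, `new Function` would throw a SyntaxError instead of returning a value.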
The newly generated AST is as follows:
{
  type: 'Program',
  body: [
    {
      type: 'VariableDeclaration',
      declarations: [
        {
          type: 'VariableDeclarator',
          id: { type: 'Identifier', name: 'a' },
          init: {
            type: 'FunctionExpression',
            params: [],
            body: { type: 'NumericLiteral', value: '1' }
          }
        }
      ],
      kind: 'var'
    }
  ],
  sourceType: 'script'
}