AI Benchmarking on MacBook Air and ThinkPad

In our earlier hands-on article about the Dell Inspiron 14 Plus, we promised to run an AI Benchmark later. Some readers also complained that a segment of Python code in that article didn’t clarify much, and that the MacBook Air we brought in for comparison is not even in the same class of notebook… We need to fill this gap.
Recently, we received a Lenovo ThinkPad X13 Gen 3 (hereinafter referred to as ThinkPad X13) – this series is often referred to as the “small X1 Carbon.” Although it is commonly called an “office laptop” or “business laptop,” its form factor and positioning can be clearly defined as a thin and light laptop. We plan to conduct an AI Benchmark on this lightweight laptop. Through this article, we can also get a rough idea of how this year’s PC thin and light laptops and low-power CPUs perform, as we have previously experienced only “all-rounder laptops” with standard power processors.
Running AI tests on a thin and light or office laptop without a dedicated GPU is admittedly somewhat unreliable: if you really wanted to do AI development, you clearly wouldn’t use such a configuration for heavy tasks. However, laptop processors of the past two years often have 12 cores, and running nothing but Cinebench rendering on them is a waste… This is not just for fun; such tests should also give everyone an idea of how CPUs of different scales built on the same core microarchitecture compare.
To have a rough understanding of the test results, we still brought in the MacBook Air (M1 7-core GPU, 16GB RAM, 256GB SSD) as the main comparison object. In terms of pricing, this ThinkPad X13 Gen 3 (Core i5-1240P, 16GB RAM, 512GB SSD) is roughly comparable to the most basic MacBook Air, making it valuable for comparison from a purchasing perspective.
In fact, the M1 MacBook Air with 16GB RAM currently on sale still costs over 9000 yuan. Of course, we chose it mainly because we do not have the M2 version of the MacBook Air… Then again, although we are not comparing against the M2, the i5-1240P is certainly not the strongest chip in Intel’s product line either.
In addition to the MacBook Air, this article will also include test results from the 2021 MateBook X Pro (previous generation Core i7-1165G7, 16GB RAM) and the 2022 ROG Zephyrus M16 (Core i9-12900H, 24GB RAM) for reference in some tests.
The MacBook Air (M1) and ThinkPad X13 Gen 3 have very similar form factors and pricing.

A Laptop with Peculiar Performance Release

Before discussing performance, let’s first take a look at the performance release level of the ThinkPad X13 laptop at the system level to see if it is representative of the thin and light laptop category in terms of performance.
The main configuration information for the ThinkPad X13 Gen 3 is as follows:
[Figure: main configuration of the ThinkPad X13 Gen 3]
At the system level, the ThinkPad X13 has quite a few highlights. For example, the SSD is definitely a high-end product: the well-known Samsung 980 PRO. In terms of wireless connectivity, this laptop accepts a nano-SIM card for 4G communication, a significant convenience for business travelers.
As a business laptop, most of its configuration is quite generous, and although the magnesium-aluminum alloy body is not as thin as the X1 Carbon’s, it is already quite good. There are some drawbacks, mainly the 300-nit 1080p screen, which looks relatively outdated in a world full of 1440p panels, and the fingerprint reader, which is somewhat difficult to use…
From the teardown, we can see that the memory chips come from SK Hynix; the SSD comes from Samsung – from several laptops we have disassembled before, it can be seen that the heat dissipation configurations added by OEM manufacturers to contemporary SSDs are becoming increasingly luxurious, with thermal interface materials on both sides.
The 4G communication module is the EM05-CE from Shanghai Quectel, which is quite well-known; the internal chip source is unknown.
The WiFi module is the Intel AX211.
The system-level experience is still not the focus of this article. The core part is the Intel Core i5-1240P processor – the suffix P here is a new series code added by Intel for the 12th generation Core this year. Intel officially sets the TDP (thermal design power) for the P series at 28W, which belongs to the low-power processor series.
Compared to the lower power U series (15W/9W TDP), the P series and H series (45W TDP) use the same die. In other words, the P series is a downscaled version of the H series; while the U series is purer in terms of “low power” and “ultra-low power” attributes. Therefore, Intel describes the P series as “high-performance thin and light laptops” in the context of the 12th generation Core processors. The processor used in our ThinkPad X13, the Core i5-1240P, mainly differs from the i7 in core frequency and cache size.
Under the heat pipe should be the Core i5-1240P processor.
Although we shouldn’t have too high expectations for the performance release of thin and light laptops, the traditional side heat dissipation fins of ThinkPad are still present, which should provide better performance release than regular thin and light laptops.
However, the actual situation is more complicated (all tests below are conducted with power plugged in and using the “Best Performance” setting in Windows power options; testing environment temperature: 24℃):
We ran 4 rounds of Cinebench R23 while HWiNFO recorded the CPU package temperature (red line in the chart) and power consumption (blue line). The first two rounds are relatively normal: during the first round, the CPU package power peaks at 55W and then gradually declines to around 35W; in the second round it peaks at around 42W and then stabilizes at around 33W. The CPU package temperature generally fluctuates around 97℃. This already exceeds the performance release of many of last year’s thin and light laptops.
In the third round, the machine fell back to typical thin and light laptop behavior: in the first half of the round it could maintain about 35W, but in the second half it suddenly dropped to around 27W and then slid down to 24W. At this point, the CPU package temperature dropped to around 85℃. In the fourth round, aside from an initial peak of 41W, the power subsequently dropped to 18W, and the CPU package temperature was only 70℃.
18W is not uncommon for thin and light laptops. However, the ThinkPad X13 starts high, and based on its cooling design, disregarding other system-level limitations, such sustained performance release can be considered conservative. This is not friendly for some long-duration high-load applications, such as gaming and AI applications. However, the ThinkPad X13 is not originally designed for such tasks, and office scenarios generally do not require sustained high-load performance output for long periods.
But this indeed affects many tests we will conduct next; it also indicates that the ThinkPad X13 is indeed not suitable for AI – although that’s stating the obvious.
The results of half an hour of Cinebench R23 loop testing are generally consistent with the above. In the first round, the ThinkPad X13 achieved a score of 10002, but by the 15th round it retained only about 75% of that performance. This degree of performance loss is similar to some ultra-thin laptops, but here it is not caused by temperature limits: as noted above, the package temperature had already fallen to around 70℃ by the time the power dropped to 18W.
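To make the 75% figure concrete, the arithmetic can be sketched as follows. This is a minimal illustration, not the tool we used: the function names are made up for this example, and the round-15 score is derived from the stated 75% rather than read from the test log.

```python
def retained_fraction(first_score: float, later_score: float) -> float:
    """Fraction of the first-round score still delivered in a later round."""
    if first_score <= 0:
        raise ValueError("first_score must be positive")
    return later_score / first_score


def score_at_retention(first_score: float, fraction: float) -> float:
    """Score implied by keeping `fraction` of the first-round result."""
    return first_score * fraction


# 10002 is the first-round score quoted in the article; 0.75 is the
# approximate retention quoted for round 15 (not an exact logged value).
first_round = 10002
implied_round_15 = score_at_retention(first_round, 0.75)

print(f"Implied round-15 score: about {implied_round_15:.0f} points")
print(f"Performance lost: about {(1 - 0.75) * 100:.0f}%")
```

In other words, a machine that opens at roughly 10000 points ends the half-hour loop at roughly 7500 points.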
In fact, the burst performance of the ThinkPad X13 is still quite strong. During the PCMark 10 modern office test, we observed several instances of nearly 60W of instantaneous power (mainly during the web browsing and photo editing tests; note that the bundled power adapter is only 65W). This kind of burst performance is precisely what many office applications, web browsing, and similar tasks require, and it indicates that Lenovo has a clear understanding of the positioning of this laptop.
We believe that such a strategy is quite reasonable for P-series thin and light laptops. This year, many P-series thin and light laptops have shown aggressive performance releases, such as the Lenovo Yoga Slim 7i Pro, which also uses the Core i5-1240P and is said to have its PL1 set at 50W, causing it to outperform many i7s in multi-core performance. This is akin to doing the work of the H series, and its efficiency at this power consumption level is not particularly good.
Using AIDA64 for a 30-minute CPU stress test (FPU stress test), the CPU package temperature and power consumption trends appear to be quite “winding.” The behavior of the ThinkPad X13 seems to lean towards full power performance output for a while, then resting for a while before fully outputting again at some point. The cooling system during this period does not seem to be under much pressure.
Throughout the process, there are several relatively stable power output points: 35W, 25W, 18W; naturally, the temperature fluctuates with the power changes.
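For readers who want to pick such plateaus out of their own power log, the following is a minimal sketch of one possible approach. It is not how we produced the chart: `find_plateaus`, its thresholds, and the trace are all hypothetical, and the trace is synthetic data shaped to resemble the levels described above, not measurements from the ThinkPad X13.

```python
from statistics import mean


def find_plateaus(samples, tolerance=1.5, min_len=5):
    """Return (mean_power_w, sample_count) for each run of consecutive
    samples that stays within `tolerance` watts of the run's first sample
    and lasts at least `min_len` samples."""
    plateaus = []
    run = []
    for value in samples:
        # A sample too far from the start of the current run ends that run.
        if run and abs(value - run[0]) > tolerance:
            if len(run) >= min_len:
                plateaus.append((round(mean(run), 1), len(run)))
            run = []
        run.append(value)
    # Flush the final run.
    if len(run) >= min_len:
        plateaus.append((round(mean(run), 1), len(run)))
    return plateaus


# Synthetic package-power trace in watts (illustrative only, NOT measured).
trace = [35.0] * 6 + [30.0] + [25.0] * 6 + [21.0] + [18.0] * 6

print(find_plateaus(trace))
```

Short transitional samples between levels are discarded because they never reach `min_len`; on a real log exported from a monitoring tool, both thresholds would need tuning to the sampling interval.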
Additionally, there is another piece of evidence from a system-level perspective that the ThinkPad X13 may not be very suitable for gaming:
The chart shows the result of a dual stress test on the CPU and GPU using AIDA64 + FurMark, with the changes in CPU package power (blue line) and temperature (red line), as well as CPU core power (green line) and GPU core power (yellow line).
The CPU core power remained relatively stable throughout, staying between 13W and 19W, while the power of the GPU core (integrated graphics) fluctuated significantly. Because of the overall power limit of the processor package, the GPU core was initially allocated 20W, but about 4 minutes into the test it suddenly dropped to around 2W and then worked within the 1-3W range for a while, which, jokingly, is even more “energy-efficient” than a mobile phone GPU…
This will affect some 3D games that have CPU performance requirements, such as “Genshin Impact.” However, Lenovo also has strategies for this at the system level. Lenovo Vantage (similar to a computer manager) has a Game Boost mode, which we guess can avoid this situation in gaming scenarios. Subsequent gaming tests can confirm this.

Running AI Benchmark on a Thin and Light Laptop

Next, let’s use this laptop, which is not suitable for AI, to run the AI Benchmark for entertainment purposes. The significance of this test may not be that great for another reason: for both the ThinkPad X13 and the MacBook Air, we did not enable GPU or NPU acceleration, and simply ran it using the CPU – the AI Benchmark itself seems to not yet support acceleration from Apple’s GPU, Intel GPU, or other units.
Additionally, the AI Benchmark we used comes from ETH Zurich (the Swiss Federal Institute of Technology in Zurich), and its testing for PCs is still quite incomplete; the project is currently in the alpha phase (AI Benchmark v.0.1.2). Although the project introduction states that running the AI Benchmark with Python takes only a few lines of commands, we encountered various issues during the initial environment setup, such as version compatibility problems between components and differences between the built-in Python and the separately installed Python on macOS. Fortunately, all tests were completed smoothly. However, given the early stage of this project and our insufficient optimization of some settings, the following test results are for reference only.
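The “few lines” in question look roughly like the sketch below. It assumes the `ai-benchmark` package and a compatible TensorFlow are installed via pip; it is not necessarily the exact script we ran. The helper names `force_cpu_only` and `run_ai_benchmark_cpu` are made up for this illustration, and hiding CUDA devices through `CUDA_VISIBLE_DEVICES` is a general TensorFlow technique for forcing CPU execution, not something the benchmark requires.

```python
import os


def force_cpu_only(env=None):
    """Hide CUDA devices so TensorFlow falls back to the CPU.

    Must be called before TensorFlow is imported to take effect.
    """
    env = os.environ if env is None else env
    env["CUDA_VISIBLE_DEVICES"] = "-1"
    return env


def run_ai_benchmark_cpu():
    """Run the full AI Benchmark (inference + training) on the CPU."""
    force_cpu_only()
    # Imported lazily: requires `pip install ai-benchmark` plus TensorFlow.
    from ai_benchmark import AIBenchmark

    benchmark = AIBenchmark(use_CPU=True)
    return benchmark.run()


# Usage (takes a long time on a laptop CPU):
#   results = run_ai_benchmark_cpu()
```

On machines without an Nvidia GPU, such as the ThinkPad X13 and the MacBook Air, the environment variable has no effect, but keeping it makes the CPU-only intent explicit when the same script is reused on a machine like the Zephyrus M16.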
For comparison, we also have a previously tested ROG Zephyrus M16 – the processor is the ceiling of laptops, the Core i9-12900H (except for the HX series), and it is equipped with a dedicated GPU: the laptop version of GeForce RTX 3070 Ti. So on one hand, we ran the AI Benchmark with the i9-12900H; on the other hand, we also ran the AI Benchmark with GPU using CUDA and cuDNN for reference results.
All tests are based on the TensorFlow library (TensorFlow 2.9), and the test content is relatively lightweight. In fact, as of TensorFlow 2.4 there has already been initial progress in adapting to Apple’s ML Compute for GPU acceleration. However, our tests did not make use of it. If there is another opportunity in the future, we will delve deeper into GPU acceleration.
Because there are many test items, we only list the test results for some sub-items here. Each test model is divided into inference and training scores.
From the results, the performance of the Core i9-12900H is basically in line with expectations, clearly leading among the CPUs; after all, it has advantages in both core count and core frequency. The Nvidia GeForce RTX 3070 Ti Laptop, as a GPU, shows an enormous advantage in workloads dominated by matrix multiplication, running them several times faster.
The performance of the Core i5-1240P does not seem ideal here; the Apple M1 has some impressive scores in certain sub-items (is there macOS-specific optimization?). As mentioned earlier, the sustained performance output at the system level of the ThinkPad X13 is unstable, and the CPU often experiences performance fluctuations, which should have a significant negative impact on such longer tests. But this is not the most important thing.
In these types of tests, the impact of software and various middleware is extremely significant. Therefore, we used WSL (the Windows Subsystem for Linux) on Windows 11 to run Ubuntu, and then ran the AI Benchmark again inside the Ubuntu subsystem.
During this round of testing, it was quite surprising that the test program specifically reported that “this TensorFlow binary is optimized for Intel’s oneAPI Deep Neural Network library, so it will fully utilize AVX2, AVX_VNNI, and FMA instructions” (AVX-512 is absent from that list because, as we know, the 12th generation Core has the AVX-512 instruction set disabled); we don’t know why no such message appeared when running it in Windows PowerShell (a TensorFlow version issue?).
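Inside WSL or native Linux, the instruction sets named in that message can be cross-checked against what the kernel reports in `/proc/cpuinfo`. The following is a minimal sketch; the helper names are made up for this illustration, and the sample text is an illustrative fragment, not output captured from the ThinkPad X13.

```python
from pathlib import Path

# Flag names as they appear in the "flags" line of /proc/cpuinfo on x86 Linux.
WANTED = ("avx2", "avx_vnni", "fma", "avx512f")


def parse_cpu_flags(cpuinfo_text):
    """Return the set of CPU flags from the first 'flags' line, if any."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            return set(value.split())
    return set()


def report(flags, wanted=WANTED):
    """Map each instruction-set name of interest to True/False."""
    return {name: name in flags for name in wanted}


# Illustrative fragment only (hypothetical, heavily shortened).
sample = "processor\t: 0\nflags\t\t: fpu sse2 fma avx2 avx_vnni\n"
print(report(parse_cpu_flags(sample)))

# On a real Linux or WSL system, inspect the actual CPU.
cpuinfo = Path("/proc/cpuinfo")
if cpuinfo.exists():
    print(report(parse_cpu_flags(cpuinfo.read_text())))
```

A `False` for `avx512f` alongside `True` for the other three would match the instruction list in the TensorFlow message.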
Perhaps thanks to Ubuntu (and oneDNN?), the Core i5-1240P in the ThinkPad X13 performed significantly better in this round than it did under Windows 11. Many sub-items improved by more than 100%, that is, their scores more than doubled. Furthermore, in most sub-items it also showed a significant advantage over the M1 CPU. Based on our previous scores, there should theoretically be still more room for improvement if we switched to native Linux. The final results summary is shown in the figure below:
[Figure: AI Benchmark results summary]
Again, it is emphasized that the first three are all CPU-based tests, and there are differences in systems and platforms (note that the i9-12900H and 3070Ti Laptop were run on Windows). In the future, we will continue to improve the testing process and include more processors or acceleration units in the tests. This time, the tests were prepared quite hastily, and the degree to which the testing program can reflect actual performance is relatively basic; it can be considered as a reflection of some dimension of CPU absolute performance.

Theoretical Performance Tests of CPU and GPU

Let’s return to the home turf of thin and light laptops and to the proper roles of the ThinkPad X13: office work and entertainment. Before the real-load tests, we still need to look at the theoretical performance of the processor. First, let’s look at Cinebench R23 and Geekbench 5:
[Figures: Cinebench R23 and Geekbench 5 results]
The test results are basically in line with expectations: the base and turbo frequencies of the Core i5-1240P are not particularly high compared to higher-end Core processors, with a base frequency of 1.7GHz and a turbo frequency of 4.4GHz, which theoretically reduces single-core performance – after all, the high-end processors we have encountered before often reach 5GHz frequency. However, compared to the M1, the i5-1240P still has an advantage in the Cinebench R23 test. As for multi-threaded performance, the advantage of significantly increased core count in this generation of Core processors can also be demonstrated at this time.
[Figure: Blender Benchmark 3.2.1 results]
Moreover, the new version of the Blender Benchmark (3.2.1) has completely removed support for Iris Xe acceleration; coupled with the earlier mention that Blender’s support for M1 GPU acceleration is still very preliminary, we will treat this test as a comparison of CPU theoretical performance, and the results are shown above.
GPU performance is also a key point on which we have previously criticized Apple chips: because their software ecosystem lags behind, Apple GPUs are inefficient in real applications and in general-purpose GPU compute acceleration.
However, on one hand, the Core i5-1240P does not get the full 96EU Iris Xe GPU, and on the other hand, the M1 GPU is clearly more generously provisioned in silicon (as can be seen from the die area occupied by the GPU, although the M1 tested here is the 7-core GPU version). In theoretical graphics performance tests like 3DMark, the M1 will definitely have an advantage.
In this test, we specifically listed the test results of the Huawei MateBook X Pro 2021. This is a more pure thin and light laptop that we tested last year. The MateBook X Pro 2021 uses the 11th generation Core i7-1165G7 – this processor has 96EU Iris Xe integrated graphics.
Judging purely by the number of GPU execution units, the ThinkPad X13 has 16 fewer EUs than the MateBook X Pro 2021, so it should perform worse. However, that is not what actually happens. This result fully illustrates how much impact system-level design can have on chip performance. It is not just that the MateBook X Pro 2021’s processor sustains only 15W of stable power, but also that the higher bandwidth of the LPDDR5 memory in the ThinkPad X13 helps its integrated graphics.

Gaming, Office, and Productivity Performance

So specifically in gaming, the test scores are also roughly similar: 96EU may not outperform 80EU. But the difference is that, as we have mentioned several times, the theoretical performance of the M1 GPU is higher, but due to its still immature development ecosystem, its performance is difficult to fully realize in more realistic application scenarios. Therefore, the MacBook Air often does not perform as well as the ThinkPad X13 with 80EU Iris Xe integrated graphics in many games.
This is also greatly related to more “intermediaries” beyond the chip manufacturers, including graphics standard APIs, game engine developers, game developers, and others. Therefore, the quality of the software ecosystem fundamentally affects whether the die size and cost of the chip are wasted. This also applies to the AI discussion mentioned earlier.
Of course, this test is also somewhat for amusement, as neither the MacBook Air nor the ThinkPad X13 is designed for playing AAA games. However, the ThinkPad X13’s performance in this test exceeded our expectations, generally ranking above average among similarly configured and positioned laptops.
It is worth mentioning that in the online gaming test, CS:GO on Steam seems to have received “divine optimization” on macOS, allowing the M1 to run at over 100 fps. Unfortunately, the actual gaming experience is not good, with many stutters and bugs; macOS is currently not a choice for gamers.
In terms of multimedia creation, we upgraded the Adobe suite to the 2022 version. In Photoshop tests, which are largely constrained by CPU single-core performance, Intel processors still have an advantage. In tests like After Effects, which has begun to support M1 natively, the M1 scores have improved significantly: based on our tests, just switching software versions seems to have brought an 80% improvement in this area, which fully illustrates the importance of the ecosystem.
As for video editing tools like Premiere Pro, this is the home ground of Apple chips. We also observed the sub-results of the Pr test – the Pr test examines the smoothness of video playback during video editing, video encoding output speed, and performance in adding effects. Intel scores well in scenarios that require multi-core performance, while the main weaknesses are in the GPU.
Most mainstream office testing tools, including UL Procyon and PCMark, are not supported on macOS, so here we compared against the results of the MateBook X Pro 2021 instead. It can be seen that this year’s low-power Core i5 shows significant improvements over last year’s low-power Core i7 in Word and Excel, although, as mentioned earlier, the performance release of these two laptops also differs considerably.
The UL Procyon test items are quite complex, and the Word and Excel scripts they execute are basically heavy loads; so, half-jokingly, the ThinkPad X13 is indeed suitable for writing Word documents and creating Excel spreadsheets.
When we talk about “real load,” we cannot miss the PCMark 10 modern office test. This test perfectly matches the ThinkPad X13 as an office laptop; the test items include application launch speed, video conferencing, web browsing, spreadsheet, document writing, photo editing, video editing, etc. This is the proper role of the ThinkPad X13, not running AI or gaming…
PCMark 10 has evolved to the point where it no longer reflects CPU performance alone. For example, the web browsing, video editing, and rendering and visualization items make heavy use of GPU acceleration. Comparing with the MateBook X Pro 2021 in these three items also shows that 96EU is not necessarily better than 80EU.
Currently, one of the few tests that can cross-platform compare daily real loads is CrossMark. The testing direction of CrossMark includes productivity, creation, and responsiveness. In the testing process, one can see that the sub-items include photo editing, photo sorting, video editing, web browsing, etc.
Thus, CrossMark should essentially be the cross-platform version of PCMark 10. Unfortunately, CrossMark provides relatively few testing details. Just looking at the results of CrossMark testing, the ThinkPad X13 still outperformed the MacBook Air in these common daily tasks for thin and light laptops.
Aside from the more conventional tests in the latter part, this article has spent some length running the AI Benchmark on two thin and light laptops, although the practical value of doing so is not high. However, as CPU manufacturers have intensified their technical competition over the past two years, processor core counts have increased: from the dual-core laptops of 4-5 years ago to the current 12-core laptops, the tasks they can handle and the efficiency with which they handle them have changed dramatically, which is also our basis for running AI on thin and light laptops.
Moreover, traditional chip manufacturers are beginning to invest heavily in GPUs; Intel, for example, has started rolling out its discrete GPUs. Although this article did not attempt much GPU acceleration for AI loads, and CUDA still holds an absolute advantage, competition in this area will only become more intense. We hope that the next time we run AI Benchmark tests, acceleration on the processors’ integrated graphics can be enabled to join the competition.
Author: Huang Yefeng, Senior Industry Analyst