Summary and Insights on Lua Connector Development for TDengine

1.

Why TDengine

First of all, TDengine suits my taste. I have followed the time-series database field for a long time, but TDengine was the first product that met my expectations. To my mind, an ideal product is written in a strongly typed, compiled language that carries its own runtime, so that once it is built and packaged it depends little on its environment and is convenient to use; its codebase is as concise as possible, so that it is easy to understand; and it runs fast while consuming relatively few resources. A product that is like a truck weighing 5 tons in order to carry 10 tons is clearly not ideal. Admittedly, this explanation is a case of shooting the arrow first and drawing the target afterwards: the deciding factor is that TDengine is written in C, the language for which my Emacs environment is most completely configured.
Each Dnode in TDengine is responsible for both storage and computation, which looks somewhat unconventional given the current trend towards “separation of computing and storage”. However, practice is the test of truth, and I have not seen any obvious problems so far; we will continue testing it in a K8s CSI setup.

2.

Lua Connector

I have developed a Lua connector for TDengine, mainly targeting two user groups: OpenResty (Nginx+Lua) and Skynet. These are also two open-source projects that I particularly admire. In my tests on a personal laptop (SSD, 8 GB RAM) with TDengine 2.0.18, the Lua connector wrote records with only 3 columns (timestamp, integer, tag) from a single thread at an average of 10,000 records per second, as shown in the figure below. I have submitted the benchmark code, so everyone can test the items they care about in their own environment.
[Figure: write benchmark results]
The Lua connectors currently available in the community target Lua 5.1 and Lua 5.3; see https://github.com/taosdata/TDengine/tree/develop/tests/examples/lua. The Lua 5.1 version exists mainly because OpenResty uses LuaJIT, which is based on Lua 5.1. Skynet has followed the Lua community and upgraded to Lua 5.4, but I have not encountered any compatibility issues there, so I only verified compatibility with Lua 5.4 locally and did not submit a separate version.
From Lua's point of view, Lua 5.1 and Lua 5.2 and above belong to two different worlds. This is mainly due to an important improvement in Lua 5.2: “yieldable pcall and metamethods”. With it, a coroutine can yield while a pcall, a metamethod, or a C function is still on the call stack, which is a necessary feature for asynchronous operations. This is why users in the OpenResty community, which builds on the Lua 5.1 API, so often ask how to solve the “attempt to yield across C-call boundary” error.
I had to face this issue too when implementing the connector. There are other API differences as well, so I wrote two versions of the code instead of controlling one codebase with compilation switches. Unlike OpenResty, with its heavy reliance on LuaJIT, I agree with Yunfeng's view that following the mainstream community's version upgrades brings greater overall benefit than customizing on one specific version.
OpenResty users can access the TDengine database through the Lua connector directly while processing an HTTP request, just as they would use MySQL, without handing the request over to a back-end server; the overall architecture is very simple. However, this Lua version only supports synchronous access to the database. For performance reasons, I experimented with a connection pool to avoid repeatedly establishing and releasing database connections. Unfortunately, Lua is non-preemptive: a piece of code that has not finished executing does not release the CPU, so there is no opportunity to handle other requests in the meantime, and the observed WaterMark has remained at 1. How to solve this problem is still an open question. If you want to squeeze out as much performance as possible on the current foundation, my suggestion is to request a connection from the pool as late as possible and return it as early as possible.
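As a rough illustration of the pool behaviour described above, here is a minimal sketch of a fixed-size pool that tracks a high-water mark. It is not the connector's actual pool: `conn_pool`, `POOL_SIZE`, `pool_init`, `pool_acquire` and `pool_release` are hypothetical names, the handles are opaque pointers standing in for real database connections, and the code is single-threaded with no locking.

```c
#include <assert.h>
#include <stddef.h>

#define POOL_SIZE 4                /* hypothetical capacity */

typedef struct {
    void *conn[POOL_SIZE];         /* opaque connection handles */
    int   in_use[POOL_SIZE];       /* 1 while the handle is borrowed */
    int   active;                  /* handles currently borrowed */
    int   high_water;              /* peak value of active so far */
} conn_pool;

/* Fill the pool with POOL_SIZE handles that were opened elsewhere. */
void pool_init(conn_pool *p, void *handles[POOL_SIZE])
{
    assert(p != NULL && handles != NULL);
    for (int i = 0; i < POOL_SIZE; i++) {
        p->conn[i] = handles[i];
        p->in_use[i] = 0;
    }
    p->active = 0;
    p->high_water = 0;
}

/* Borrow the first free handle; returns NULL when the pool is exhausted. */
void *pool_acquire(conn_pool *p)
{
    for (int i = 0; i < POOL_SIZE; i++) {
        if (!p->in_use[i]) {
            p->in_use[i] = 1;
            p->active++;
            if (p->active > p->high_water)
                p->high_water = p->active;
            return p->conn[i];
        }
    }
    return NULL;
}

/* Return a borrowed handle; 0 on success, -1 if it was not borrowed. */
int pool_release(conn_pool *p, void *c)
{
    for (int i = 0; i < POOL_SIZE; i++) {
        if (p->in_use[i] && p->conn[i] == c) {
            p->in_use[i] = 0;
            p->active--;
            return 0;
        }
    }
    return -1;
}
```

When every request acquires, queries and releases before the next request gets to run, `high_water` never rises above 1, which matches the WaterMark observation. Keeping the span between `pool_acquire` and `pool_release` as short as possible is the "borrow late, return early" advice in code form.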
Skynet users can also use the Lua connector in a MySQL-like manner, but I recommend following Skynet's advice: maintain the connections in a service such as simpledb and let that service handle the actual requests. Since Skynet itself implements a relatively complete Actor model, I am not sure whether the current solution introduces any bottlenecks. If synchronous access does become a bottleneck, you could try asynchronous calls. However, TDengine's asynchronous design requires that the result of the previous access be returned before the next access is executed, so it is unclear where asynchronous calls would move the bottleneck.
Although it is very unlikely, the connector cannot rule out database connection failures. Fortunately, TDengine provides a heartbeat mechanism to detect whether a connection is still valid. The connector does not yet fully implement this, so if the connection to the database fails, the application has to rebuild it.
In production, we found that string data returned from a TDengine Dnode to the client side has no terminating character attached. For performance reasons, the client also does not allocate new memory and copy the data just to append a terminator before returning it to the connector. As a result, with the network types “4G” and “WIFI” stored in the database, a query could return a result like “4GFI”. This is a well-hidden convention, and designers of connectors for other languages should be aware of it to avoid potential problems in advance.
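A minimal sketch of one way to guard against this in C, assuming the connector knows the byte length of each string field (for example from result-set metadata; how that length is obtained is an assumption here). `copy_field` is a hypothetical helper, not part of TDengine or the Lua connector.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy a string field that is NOT NUL-terminated into dst and terminate it.
 * field_len is the byte length reported for this field (assumed to be known
 * to the caller). At most dst_size - 1 bytes are copied.
 * Returns the number of bytes copied, excluding the terminator.
 */
size_t copy_field(char *dst, size_t dst_size,
                  const char *field, size_t field_len)
{
    assert(dst != NULL && field != NULL && dst_size > 0);
    size_t n = field_len < dst_size - 1 ? field_len : dst_size - 1;
    memcpy(dst, field, n);   /* copy exactly n bytes, never scan for '\0' */
    dst[n] = '\0';
    return n;
}
```

In a Lua C binding the same idea needs no intermediate copy: `lua_pushlstring(L, field, field_len)` takes an explicit length, whereas `lua_pushstring` scans for a terminator and can pick up stale bytes that follow the field.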

3.

Custom Functions

Custom functions (UDFs) can reduce the complexity of applications or provide functionality that the built-in query functions cannot. In my judgment, Lua is the most suitable language for this job, because Lua was designed from the start to be integrated with C, and the two are a perfect match.
TDengine has officially implemented a framework for user-defined functions: a basic model built on Lua 5.1, with Lua 5.1 integrated into the product. Because the Lua community is fragmented across versions, the bundled Lua development libraries caused some minor trouble in my development work. Users certainly cannot use two Lua versions at the same time, so ultimately TDengine needs to integrate both Lua 5.1 and a higher version (the upcoming version is Lua 5.4.4) and rely on macro switches to select one Lua version at compile time.
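A macro switch of this kind could look roughly like the sketch below. It is only an illustration, not TDengine's build logic: the `udf_` names are hypothetical, and the fallback definition of `LUA_VERSION_NUM` is a stand-in so the fragment compiles without the Lua headers. In a real build `<lua.h>` defines `LUA_VERSION_NUM` (501 for Lua 5.1, 504 for Lua 5.4). The `lua_objlen` to `lua_rawlen` rename is one of the actual API differences between Lua 5.1 and 5.2+.

```c
#include <assert.h>   /* static_assert (C11) */

/* Stand-in for #include <lua.h>; drop this when building against Lua. */
#ifndef LUA_VERSION_NUM
#define LUA_VERSION_NUM 504
#endif

static_assert(LUA_VERSION_NUM >= 501, "Lua 5.1 or newer is required");

#if LUA_VERSION_NUM == 501
/* Lua 5.1 and LuaJIT: raw length is lua_objlen. */
#define udf_rawlen(L, idx) lua_objlen((L), (idx))
#define UDF_LUA_LEGACY_API 1
#else
/* Lua 5.2 and later: lua_objlen was replaced by lua_rawlen. */
#define udf_rawlen(L, idx) lua_rawlen((L), (idx))
#define UDF_LUA_LEGACY_API 0
#endif

/* Report which Lua version the binary was compiled against. */
int udf_lua_version_num(void)
{
    return LUA_VERSION_NUM;
}

/* 1 when the Lua 5.1 flavour of the C API was selected, else 0. */
int udf_lua_uses_legacy_api(void)
{
    return UDF_LUA_LEGACY_API;
}
```

Every API difference needs its own wrapper like `udf_rawlen`, so the number of such macros grows with each Lua release that changes the C API.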
In terms of implementation, interfaces have to be designed against the C API of both versions. Every Lua version upgrade brings functional changes, so I am rather pessimistic about whether a common set of interfaces can be abstracted that shields Lua users from the differences.

4.

Outlook

This is a summary of my experience with TDengine. The connector is now deployed in our production environment and has come through two large-scale production events without difficulty. Next, I will keep working on the issues mentioned above as they come up in our project, and supporting UDFs written in Lua will be my next focus, which will further reduce the complexity of applications.
