MogDB Releases Update to Address Ustore Table Bloat Issue in Long Transactions for openGauss Database

MogDB Releases Update to Address Ustore Table Bloat Issue in Long Transactions for openGauss DatabaseIntroduction

Recently, Yunhe Enmo released the MogDB version 5.0.11. This update brings a series of important improvements and fixes, particularly optimizing Ustore feature enhancements, support for SQL statement analysis in stored procedures, and compatibility improvements. The most noteworthy aspect is the resolution of the Ustore table bloat issue in long transaction scenarios.

In the database field, the performance and stability of storage engines have always been a focal point. Among them, the table bloat issue can severely impact database performance and storage efficiency, especially in long transaction scenarios.

The openGauss database initially used the Astore storage engine, which employs an append-update method, storing both the previous image and the current value in the same page. In simple terms, when executing an update statement, the system does not directly modify the original data row but marks the original data row as expired and then inserts a row of updated data (append write). Although this update method has its advantages, it also has a significant drawback: since Astore does not separate the previous image data through undo storage, it is prone to table bloat after performing bulk update or delete operations on the table.

To better understand the table bloat issue with Astore, we conducted an experiment using the openGauss version 7.0.0-RC1. First, we check the database version.

openGauss=# select version();                                                                          version                                                                          ----------------------------------------------------------------------------------------------------------------------------------------------------------- (openGauss 7.0.0-RC1 build cff7b04d) compiled at 2025-03-28 11:44:23 commit 0 last mr   on x86_64-unknown-linux-gnu, compiled by g++ (GCC) 10.3.0, 64-bit(1 row)

Assuming there is a table named test01, its structure comes from Oracle 11g DBA_OBJECTS, and it has repeatedly inserted 44,537,344 rows of data, occupying 6115 MB of disk space.

openGauss=# \\d+                                       List of relations Schema |  Name  | Type  |   Owner   |  Size   |             Storage              | Description --------+--------+-------+-----------+---------+----------------------------------+------------- public | test01 | table | opengauss | 6115 MB | {orientation=row,compression=no} |  public | test02 | table | opengauss | 12 MB   | {orientation=row,compression=no} | (2 rows)

Next, we create a table t_astore using the Astore storage engine and insert 1 million rows of data.

openGauss=# create table t_astore with(storage_type=astore) as select * from test01 where 1=0;INSERT 0 0openGauss=# insert into t_astore select * from test01 where rownum<=1000000;INSERT 0 1000000openGauss=# \\d+                                                  List of relations Schema |   Name   | Type  |   Owner   |  Size   |                       Storage                        | Description --------+----------+-------+-----------+---------+------------------------------------------------------+------------- public | t_astore | table | opengauss | 137 MB  | {orientation=row,storage_type=astore,compression=no} |  public | test01   | table | opengauss | 6115 MB | {orientation=row,compression=no}                     |  public | test02   | table | opengauss | 12 MB   | {orientation=row,compression=no}                     | (3 rows)

We can see that t_astore occupies 137 MB of disk space.Then, we perform 10 bulk update/delete operations on t_astore.

openGauss=# declareopenGauss-# v_date date;openGauss-# beginopenGauss$#   for i in 1..10 loopopenGauss$#     select sysdate into v_date;openGauss$#     update t_astore set owner='B' where object_id>2; openGauss$#     commit;openGauss$#     delete from t_astore where object_id>2;openGauss$#     commit;openGauss$#     insert into t_astore select * from test01 where object_id>2 and rownum<=999988;openGauss$#     commit;openGauss$#   end loop;openGauss$# end;openGauss$# /ANONYMOUS BLOCK EXECUTEopenGauss=# \\d+                                                  List of relations Schema |   Name   | Type  |   Owner   |  Size   |                       Storage                        | Description --------+----------+-------+-----------+---------+------------------------------------------------------+------------- public | t_astore | table | opengauss | 2844 MB | {orientation=row,storage_type=astore,compression=no} |  public | test01   | table | opengauss | 6115 MB | {orientation=row,compression=no}                     |  public | test02   | table | opengauss | 12 MB   | {orientation=row,compression=no}                     | (3 rows)

Upon checking the table information again, the result shows that the disk space occupied by t_astore has ballooned from 137 MB to 2844 MB, an increase of nearly 20 times, which illustrates the table bloat issue with Astore.

Later, openGauss introduced the Ustore storage engine, which uses in-place updates and stores historical version data in undo space, alleviating the table bloat issue to some extent. We also verified this through an experiment: creating a table t_ustore using the Ustore storage engine and inserting 1 million rows of data.

openGauss=# create table t_ustore with(storage_type=ustore) as select * from test01 where 1=0;INSERT 0 0openGauss=# insert into t_ustore select * from test01 where rownum<=1000000;INSERT 0 1000000openGauss=# \\d+                                                  List of relations Schema |   Name   | Type  |   Owner   |  Size   |                       Storage                        | Description --------+----------+-------+-----------+---------+------------------------------------------------------+------------- public | t_astore | table | opengauss | 2844 MB | {orientation=row,storage_type=astore,compression=no} |  public | t_ustore | table | opengauss | 115 MB  | {orientation=row,storage_type=ustore,compression=no} |  public | test01   | table | opengauss | 6115 MB | {orientation=row,compression=no}                     |  public | test02   | table | opengauss | 12 MB   | {orientation=row,compression=no}                     | (4 rows)

Checking the table information shows that t_ustore occupies 115 MB of disk space.Next, we perform 10 bulk update/delete operations on t_ustore.

openGauss=# declareopenGauss-# v_date date;openGauss-# beginopenGauss$#   for i in 1..10 loopopenGauss$#     select sysdate into v_date;openGauss$#     update t_ustore set owner='B' where object_id>2; openGauss$#     commit;openGauss$#     delete from t_ustore where object_id>2;openGauss$#     commit;openGauss$#     insert into t_ustore select * from test01 where object_id>2 and rownum<=999988;openGauss$#     commit;openGauss$#   end loop;openGauss$# end;openGauss$# /ANONYMOUS BLOCK EXECUTEopenGauss=# \\d+                                                  List of relations Schema |   Name   | Type  |   Owner   |  Size   |                       Storage                        | Description --------+----------+-------+-----------+---------+------------------------------------------------------+------------- public | t_astore | table | opengauss | 2844 MB | {orientation=row,storage_type=astore,compression=no} |  public | t_ustore | table | opengauss | 232 MB  | {orientation=row,storage_type=ustore,compression=no} |  public | test01   | table | opengauss | 6115 MB | {orientation=row,compression=no}                     |  public | test02   | table | opengauss | 12 MB   | {orientation=row,compression=no}                     | (4 rows)

After completing the operations, we check again, and the disk space occupied by t_ustore has increased from 115 MB to 232 MB. Compared to Astore’s 20 times increase, Ustore only increased by 1 time, indicating that Ustore significantly alleviates the table bloat issue.

However, DBAs familiar with Oracle know that the disk usage of tablespaces in such scenarios will hardly change. Moreover, the previous tests were not rigorous as they did not consider long transaction scenarios. So how does Ustore perform in long transaction scenarios?

We continue the experiment:We create a table t_idletransaction, which can use either Astore or Ustore as the storage engine, and insert 1000 rows of data; at the same time, we create a table t_ustore2 using Ustore and insert 1 million rows of data.

openGauss=# create table t_idletransaction with(storage_type=astore) as select * from test01 where rownum<=1000;INSERT 0 1000openGauss=# create table t_ustore2 with(storage_type=ustore) as select * from test01 where 1=0;INSERT 0 0openGauss=# insert into t_ustore2 select * from test01 where rownum<=1000000;INSERT 0 1000000openGauss=# \\d+                                                       List of relations Schema |       Name        | Type  |   Owner   |  Size   |                       Storage                        | Description --------+-------------------+-------+-----------+---------+------------------------------------------------------+------------- public | t_astore          | table | opengauss | 2844 MB | {orientation=row,storage_type=astore,compression=no} |  public | t_idletransaction | table | opengauss | 176 kB  | {orientation=row,storage_type=astore,compression=no} |  public | t_ustore          | table | opengauss | 232 MB  | {orientation=row,storage_type=ustore,compression=no} |  public | t_ustore2         | table | opengauss | 115 MB  | {orientation=row,storage_type=ustore,compression=no} |  public | test01            | table | opengauss | 6115 MB | {orientation=row,compression=no}                     |  public | test02            | table | opengauss | 12 MB   | {orientation=row,compression=no}                     | (6 rows)

At this point, t_ustore2 occupies 115 MB of disk space.Next, we start a transaction in session1 without committing it to simulate a long transaction.

openGauss=# begin;BEGINopenGauss=# update t_idletransaction set object_id=0;UPDATE 1000

Then, in session2, we perform 10 bulk update/delete operations on t_ustore2.

openGauss=# declareopenGauss-# v_date date;openGauss-# beginopenGauss$#   for i in 1..10 loopopenGauss$#     select sysdate into v_date;openGauss$#     update t_ustore2 set owner='B' where object_id>2; openGauss$#     commit;openGauss$#     delete from t_ustore2 where object_id>2;openGauss$#     commit;openGauss$#     insert into t_ustore2 select * from test01 where object_id>2 and rownum<=999988;openGauss$#     commit;openGauss$#   end loop;openGauss$# end;openGauss$# /ANONYMOUS BLOCK EXECUTEopenGauss=# \\d+                                                       List of relations Schema |       Name        | Type  |   Owner   |  Size   |                       Storage                        | Description --------+-------------------+-------+-----------+---------+------------------------------------------------------+------------- public | t_astore          | table | opengauss | 2844 MB | {orientation=row,storage_type=astore,compression=no} |  public | t_idletransaction | table | opengauss | 312 kB  | {orientation=row,storage_type=astore,compression=no} |  public | t_ustore          | table | opengauss | 232 MB  | {orientation=row,storage_type=ustore,compression=no} |  public | t_ustore2         | table | opengauss | 1267 MB | {orientation=row,storage_type=ustore,compression=no} |  public | test01            | table | opengauss | 6115 MB | {orientation=row,compression=no}                     |  public | test02            | table | opengauss | 12 MB   | {orientation=row,compression=no}                     | (6 rows)

The results are surprising; the disk space occupied by t_ustore2 has ballooned from 115 MB to 1267 MB, an increase of 10 times. Compared to the 1 time increase without long transactions, the table bloat issue with Ustore has significantly worsened in the presence of long transactions, making Ustore unable to meet commercial requirements in such scenarios. Moreover, if we test earlier versions of openGauss, the table bloat under long transactions would be even more severe.

So,the key point is here! The recently updated MogDB 5.0.11 has brought a turnaround, significantly addressing the Ustore table bloat issue in long transactions.

First, let’s check the version information of MogDB.

MogDB=# select version();                                                                        version                                                                        ------------------------------------------------------------------------------------------------------------------------------------------------------- (MogDB 5.0.11 build 01ca8799) compiled at 2025-03-28 12:46:00 commit 0 last mr 1804  on x86_64-unknown-linux-gnu, compiled by g++ (GCC) 7.3.0, 64-bit(1 row)

Then we conduct the same experiment, creating the t_idletransaction and t_ustore2 tables and inserting data.

MogDB=# \\d+                                     List of relations Schema |  Name  | Type  | Owner |  Size   |             Storage              | Description --------+--------+-------+-------+---------+----------------------------------+------------- public | test01 | table | omm   | 6115 MB | {orientation=row,compression=no} |  public | test02 | table | omm   | 12 MB   | {orientation=row,compression=no} | (2 rows)MogDB=# create table t_idletransaction with(storage_type=astore) as select * from test01 where rownum<=1000;INSERT 0 1000MogDB=# create table t_ustore2 with(storage_type=ustore) as select * from test01 where 1=0;INSERT 0 0MogDB=# insert into t_ustore2 select * from test01 where rownum<=1000000;INSERT 0 1000000
MogDB=# \\d+                                                     List of relations Schema |       Name        | Type  | Owner |  Size   |                       Storage                        | Description --------+-------------------+-------+-------+---------+------------------------------------------------------+------------- public | t_idletransaction | table | omm   | 176 kB  | {orientation=row,storage_type=astore,compression=no} |  public | t_ustore2         | table | omm   | 115 MB  | {orientation=row,storage_type=ustore,compression=no} |  public | test01            | table | omm   | 6115 MB | {orientation=row,compression=no}                     |  public | test02            | table | omm   | 12 MB   | {orientation=row,compression=no}                     | (4 rows)

Checking the table information shows that t_ustore2 occupies 115 MB of disk space. Next, we start a long transaction in session1:

MogDB=# begin;BEGINMogDB=# update t_idletransaction set object_id=0;UPDATE 1000

In session2, we perform 10 bulk update/delete operations on t_ustore2.

MogDB=# declareMogDB-# v_date date;MogDB-# beginMogDB$#   for i in 1..10 loopMogDB$#     select sysdate into v_date;MogDB$#     update t_ustore2 set owner='B' where object_id>2; MogDB$#     commit;MogDB$#     delete from t_ustore2 where object_id>2;MogDB$#     commit;MogDB$#     insert into t_ustore2 select * from test01 where object_id>2 and rownum<=999988;MogDB$#     commit;MogDB$#   end loop;MogDB$# end;MogDB$# /ANONYMOUS BLOCK EXECUTEMogDB=# \\d+                                                     List of relations Schema |       Name        | Type  | Owner |  Size   |                       Storage                        | Description --------+-------------------+-------+-------+---------+------------------------------------------------------+------------- public | t_idletransaction | table | omm   | 312 kB  | {orientation=row,storage_type=astore,compression=no} |  public | t_ustore2         | table | omm   | 371 MB  | {orientation=row,storage_type=ustore,compression=no} |  public | test01            | table | omm   | 6115 MB | {orientation=row,compression=no}                     |  public | test02            | table | omm   | 12 MB   | {orientation=row,compression=no}                     | (4 rows)

We can see that, in the presence of long transactions, the disk space occupied by t_ustore2 has increased from 115 MB to 371 MB. Compared to the Astore’s 20 times increase and Ustore’s 10 times increase in openGauss, MogDB’s Ustore only increased by 2 times.

This indicates that MogDB 5.0.11 has effectively alleviated the Ustore table bloat issue under long transactions, making it commercially viable, providing a strong guarantee for the stable operation and efficient use of databases. This improvement is undoubtedly good news for database users and developers, promising a better experience in practical applications.

MogDB Releases Update to Address Ustore Table Bloat Issue in Long Transactions for openGauss Database

Ustore Feature Enhancements, Improving Data Processing Capabilities

The MogDB 5.0.11 version update, in addition to the space management optimizations verified through experiments mentioned earlier, has also made various feature enhancements to the Ustore storage engine.

In terms of query performance, Ustore now supports vectorized queries, which are significantly effective in large data analysis query scenarios, and can dynamically decide whether to use vectorized queries based on the cost evaluation results of the execution plan.

In terms of table type support, Ustore now allows the creation of global temporary tables, which are visible to all sessions, but each session can only operate on its own committed data.

Additionally, the logical replication feature, which started supporting DDL operations from MogDB version 5.0.8, reduces the manual maintenance of tables by users during logical replication, avoiding exceptions in the logical replication synchronization process due to changes in table structure. Now, logical replication also supports DDL operations on Ustore tables, ensuring that the table structures of the target and source databases are consistent, preventing logical replication interruptions or data inconsistencies caused by DDL operations on the source database.

Stored Procedure Analysis Function Upgrade, Accurately Locating Executed Statements

MogDB 5.0.11 has made significant upgrades in stored procedures, adding SQL statement analysis functionality. Users can now directly view the execution plans of SQL statements and the SQL statements currently being executed within stored procedures in the pg_stat_activity table.

During the execution of stored procedures, users can easily locate the currently executing statements by querying the pg_stat_activity system table. After the stored procedure ends, users can also query historical slow SQL using statement_history, with relevant execution plan information automatically stored in that table. It is important to note that the criteria for judging slow SQL are controlled by the log_min_duration_statement parameter.

Furthermore, the statement_history table is specifically used to record SQL statements and related information during stored procedures, but this table can only be queried within the postgres database, and there is no data in other databases. Additionally, this system table is controlled by the track_stmt_stat_level parameter, with a default value of “OFF,L0”. The first part controls Full SQL, and the second part controls Slow SQL. Users can flexibly set it according to their needs to choose to record different types of SQL.

Significant Compatibility Improvements, Meeting Diverse Usage Needs

The new version has made comprehensive improvements in compatibility.

In character handling, by setting the GUC parameter enable_compatible_illegal_chars, MogDB 5.0.11 supports the insertion of illegal characters and C-style string terminators ‘\0’. In trigger syntax, by enabling the GUC parameter enable_ora_trigger_style, it not only supports the Oracle-style anonymous block creation but also supports referencing and other syntax. Additionally, this version has added support for the ROWID pseudo-column, which is generated by merging “tableoid+ctid” and guarantees uniqueness within transactions, allowing normal queries through the original ROWID even if the row data is updated.

In terms of syntax support, MogDB 5.0.11 supports the ALTER TABLE xxx DROP PRIMARY KEY CASCADE DROP INDEX syntax, which can automatically locate and delete primary key constraints, greatly improving operational efficiency. Moreover, the new DBMS_STATS.GATHER_TABLE_STATS function interface can indirectly optimize query performance by updating table statistics and can automatically collect statistics during off-peak business hours through scheduled tasks, ensuring stable daily query performance.

In terms of command usage, when users connect to the database through gsql or tools and need to exit the session after completing operations, in addition to the original \q and \quit commands, they can now also use the newly added exit and quit commands. Furthermore, by setting the GUC parameter a_format_date_timestamp = true, current_timestamp function, support for hours, minutes, and seconds has also been enhanced. Finally, the system will also record the creation and modification times of synonyms and triggers, with this information stored in the pg_object system table, making it convenient for users to query change history.

Official Statement, Emphasizing Commitment to Continuous Optimization

Zhang Chengwei, head of database software development at Yunhe Enmo, stated: “The release of MogDB 5.0.11 is an important step in our ongoing commitment to providing users with higher quality database products. By continuously optimizing performance, enhancing functionality, and improving stability, we hope to help enterprise users better cope with the data management challenges during the digital transformation process. Whether in core transaction systems or complex computing scenarios, MogDB can provide enterprises with an efficient, secure, and reliable database environment.”

Although 5.0.11 is just a patch version of MogDB 5.0.0, the enhancements in Ustore features, upgrades in stored procedures, and improvements in compatibility demonstrate Yunhe Enmo’s determination to continuously optimize and advance in the database field. We believe these new features can better leverage the advantages of MogDB, bringing a better user experience and improving the operational efficiency of their business systems, helping enterprises achieve more efficient and stable development in the process of accelerated digital transformation.

MogDB Releases Update to Address Ustore Table Bloat Issue in Long Transactions for openGauss Database

Data-driven, achieving the future, Yunhe Enmo, living up to expectations!

Yunhe Enmo, founded in 2011, is a leading provider of “intelligent data technology” in the industry. The company is committed to its mission of “data-driven, achieving the future,” bringing innovative data technology products and solutions to enterprises and organizations worldwide, helping customers build a secure, efficient, agile, and economical data environment, continuously enhancing their competitive advantage in data insights and decision-making, and achieving data-driven business innovation and upgrade development.

Since its establishment, Yunhe Enmo has focused on the field of data technology, innovatively developing a series of software products covering databases, database storage, database management, and data intelligence, according to the ever-changing market demands. These products have been widely applied in large, medium, and high-growth customers as well as industry cloud scenarios, proving our technological and commercial competitiveness and showcasing the company’s advantages in end-to-end data technology solutions.

MogDB Releases Update to Address Ustore Table Bloat Issue in Long Transactions for openGauss Database

Click on “Read the original text” at the end of the article to view the detailed release notes for MogDB 5.0.11

Leave a Comment