Rewriting Solaris Command in Python Reduced Code by 90% and Improved Performance by 17 Times

Author: Darren Moffat

Translator: Xia Ye

Editor: Tian Xiaoxu

I recently fixed a memory allocation bug in the /usr/bin/listusers command that caused problems once the command was converted to 64-bit. After fixing it, I decided to investigate whether this ancient C code could be improved by reimplementing it in Python.

This C code is about 800 lines long and has not been modified since 1988. At the time this code was written, the number of users was quite small, and it is likely that user information was stored in the local /etc/passwd file or on a small NIS server.

After some research, I found that the algorithm behind listusers is essentially a series of simple set operations. Run without arguments, listusers simply outputs a sorted list of the users in the name service, while the -l and -g options filter that list by user and by group.

I rewrote listusers in Python 3, and the code is now roughly one tenth the length of the original. This is because Python has set operations built in, whereas the C version implemented them with linked lists.
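To make the set-based approach concrete, here is a minimal sketch, not the actual Solaris implementation. The function name `list_users`, its parameters, and the sample data are all hypothetical. It assumes that the -l and -g filters are combined as a union and that each user is listed once.

```python
def list_users(all_users, group_members, logins=(), groups=()):
    """Return a sorted list of login names, optionally filtered.

    all_users:      iterable of every login name known to the system
    group_members:  dict mapping group name -> iterable of member logins
    logins, groups: values of hypothetical -l and -g style filters
    """
    known = set(all_users)
    if not logins and not groups:
        # No filters: every known user, sorted.
        return sorted(known)
    selected = set(logins)
    for group in groups:
        # Set union replaces the linked-list merging a C version would need.
        selected |= set(group_members.get(group, ()))
    # Intersection drops names that are not real users.
    return sorted(selected & known)


users = ["alice", "bob", "carol", "dave"]
members = {"staff": ["alice", "bob"], "ops": ["bob", "carol"]}
print(list_users(users, members))
print(list_users(users, members, logins=["dave"], groups=["ops"]))
```

Because a set holds each element only once, a user who belongs to several selected groups appears a single time without any extra bookkeeping.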

But shouldn't Python be slower? It turns out that it is not. In fact, in my tests against a database of over 100,000 users, it was 17 times faster. Moreover, the Python version does not load the entire name service contents into memory when it knows that the -l and -g options are filtering the results.
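One way to avoid loading the whole user list when filters are given is to look up only the requested names. The sketch below is an illustration under that assumption, not the author's code. It uses the standard `pwd.getpwnam` and `grp.getgrnam` calls; the function name `filtered_logins` and the injectable `get_user` and `get_group` parameters are hypothetical. Note that `gr_mem` lists only a group's explicitly listed members, so users who have the group as their primary group are not covered here.

```python
import grp
import pwd


def filtered_logins(logins=(), groups=(),
                    get_user=pwd.getpwnam, get_group=grp.getgrnam):
    """Resolve only the requested logins and groups, never the full list."""
    selected = set()
    for name in logins:
        try:
            selected.add(get_user(name).pw_name)
        except KeyError:
            pass  # unknown login: skip it
    for name in groups:
        try:
            group = get_group(name)
        except KeyError:
            continue  # unknown group: skip it
        selected.update(group.gr_mem)
    return sorted(selected)
```

Each lookup is a single targeted query, so the cost grows with the number of names requested rather than with the size of the directory.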

After switching to Python, I found that a long-standing bug became very easy to fix: listusers could not correctly expand nested groups. The concept of nested groups did not exist when the original C code was written, but LDAP makes them possible.
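Expanding nested groups can be sketched as a graph walk that collects members into a set. This is only an illustration: the `expand_group` name, the `members_of` mapping, and the rule that a member matching a group name is treated as a nested group are assumptions made for the example, not details of how Solaris or LDAP represents nesting.

```python
def expand_group(name, members_of):
    """Return the set of user logins in `name`, following nested groups.

    members_of maps a group name to its direct members. A member that is
    itself a key of members_of is treated as a nested group (an assumption
    made for this sketch).
    """
    users = set()
    seen = set()
    pending = [name]
    while pending:
        group = pending.pop()
        if group in seen:
            continue  # already expanded; this also guards against cycles
        seen.add(group)
        for member in members_of.get(group, ()):
            if member in members_of:
                pending.append(member)  # nested group: expand it later
            else:
                users.add(member)       # plain user login
    return users


nested = {
    "eng": ["alice", "ops"],
    "ops": ["bob", "oncall"],
    "oncall": ["carol", "eng"],  # deliberately cycles back to "eng"
}
print(sorted(expand_group("eng", nested)))  # ['alice', 'bob', 'carol']
```

The `seen` set ensures each group is expanded at most once, so the walk terminates even when groups contain each other.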

I also found that this 100-line Python 3 version will be very easy to maintain going forward, although I hope listusers will not need any further updates, since the original code has already stood the test of decades!

Original link:

https://blogs.oracle.com/solaris/reimplementing-a-solaris-in-python-gained-17x-performance-improvement-from-c
