Why Is Our Company Still Using Python for Development?

Author: Wah Da Xi Wah

https://www.zhihu.com/question/278798145/answer/3416549119

In recent years, I have often seen large companies that once relied heavily on Python migrate to other tech stacks. But what about small companies and small teams?

I have always wanted to understand how companies that still stick with Python, and support a business of some scale with it, actually develop on the Python stack: what difficulties they run into, what lessons they have learned, and what practices have worked well.

By chance, I came across an answer to the Zhihu question “Why do software companies rarely use Python for web development?”, and I would like to share it with everyone.

Reply:

I have been using Python for over 10 years. The project I have maintained the longest is an e-commerce platform with an annual transaction volume of several hundred million. Concurrency is modest: usually in the dozens, over 100 during holidays, and I have never seen it exceed 200 at peak. The database holds about 50 million orders in total, with several tens of thousands added daily. The project has been running for seven or eight years, still on Python 2.7 + Django 1.8, and there are no plans to upgrade.

We currently run 1 server with 4 cores and 8GB and 3 servers with 8 cores and 16GB on Alibaba Cloud. The database and Redis are also on Alibaba Cloud, for an annual cost of just under 50,000. We use Qiniu for CDN, which costs several tens of thousands a year. Three programmers, including myself, maintain it, and new features are added almost every week. After several years of adjustments, I estimate that less than 70% of the code is still in effective use; the rest is no longer needed because the business changed.

In 2021, another system was developed using Python 3.8 + Django 3. There is usually not much volume, but it spikes during holidays, with the highest record so far being 350 orders in one minute and 150,000 orders in a day, with a transaction volume of about 15 million on that day. The usual configuration is two servers with 8 cores and 16GB, and during holidays, we expand to 6 servers, and the database and other components are temporarily upgraded as well. There are four programmers, including myself, maintaining it.

There are also a few small projects that haven’t really taken off. Each has about two people responsible for it, and some people split their time across two projects.

Currently, the entire backend tech stack of the company is Python + Django + Gunicorn (with a small project using Tornado). The company has accumulated some basic frameworks based on Django, and the company is not large, with about 14 to 15 programmers who are basically familiar with this framework. Newcomers generally start with strengthening Python basics -> learning Django -> learning the company’s framework -> entering project development.

The company has fairly strict requirements on naming, style, and so on, and code reviews pay close attention to them. Once people are familiar with the conventions, collaboration is smooth, so the drawbacks of Python as a dynamic language have not shown up in any significant way.

In the early days, due to lack of experience, the system would sometimes crash as soon as concurrency rose slightly. Later we upgraded the database (which was initially self-hosted) and added some Redis caching, which significantly reduced such incidents.

Some queries written with Django’s ORM are overly complex or unoptimized, which leads to slow queries. Our current approach is to review the slow query log regularly, track down the code responsible, and optimize it. The database itself also needs to be upgraded as the business grows. This is actually the same regardless of the programming language used.

I have encountered most programming languages, but I feel that only Python allows me to express my ideas to the computer as easily as my mother tongue.

From my own experience, once programmers become familiar with Python, they only need to understand the business and turn requirements into code, without spending much time on technical details. Python has a rich library ecosystem, and most problems we encounter already have ready-made solutions. Django’s ORM is also excellent, allowing programmers to work with the database easily without worrying about table structure changes or complex queries.

There are also drawbacks. For example, Python can be heavyweight: as projects grow larger and more complex, startup and loading times get longer and memory usage increases.

Django’s ORM brings convenience but also invites inefficient code. I have seen people construct overly complex queries that join too many tables and take a long time to run, fetch all fields regardless of whether they are needed, or run large queries inside for loops.

However, I believe these drawbacks are not fatal. Adding cloud server resources costs very little compared to labor and development efficiency, and most projects fail before they ever reach the stage where optimization is needed. Coding standards and usage habits can be improved through training, and overall code quality can rise gradually.
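To make the "query inside a for loop" and "fetch all fields" problems concrete, here is a small sketch that uses the standard-library `sqlite3` module instead of Django so that it runs on its own. The `customer` and `orders` tables and both function names are hypothetical. In Django, the usual tools for the same fix are `select_related()` for the join and `only()` or `values_list()` for limiting columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, notes TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount INTEGER);
    INSERT INTO customer VALUES (1, 'alice', 'long text'), (2, 'bob', 'long text');
    INSERT INTO orders VALUES (1, 1, 30), (2, 2, 45), (3, 1, 25);
""")


def names_per_order_slow():
    """Anti-pattern: one extra query per order, all columns fetched each time."""
    orders = conn.execute("SELECT * FROM orders ORDER BY id").fetchall()
    queries = 1
    result = []
    for order_id, customer_id, _amount in orders:
        row = conn.execute(
            "SELECT * FROM customer WHERE id = ?", (customer_id,)
        ).fetchone()
        queries += 1
        result.append((order_id, row[1]))
    return result, queries


def names_per_order_fast():
    """One query with a single join, selecting only the two needed columns."""
    rows = conn.execute(
        "SELECT o.id, c.name FROM orders AS o "
        "JOIN customer AS c ON c.id = o.customer_id ORDER BY o.id"
    ).fetchall()
    return rows, 1


slow_rows, slow_queries = names_per_order_slow()
fast_rows, fast_queries = names_per_order_fast()
print(slow_queries, fast_queries)  # 4 1
```

Both functions return the same rows; the difference is that the query count of the slow version grows with the number of orders, which is what shows up in the slow query log once tables get large.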

In addition to web applications, we also use Python on some hardware devices (mostly single-board computers with Linux, like Raspberry Pi, 7688, etc.), which allows development on a computer and direct deployment to the device without needing to hire embedded engineers. After encapsulating the hardware calls, any backend developer in the company can develop it.

We also use Python in areas such as image processing and recognition, web scraping, automated testing, and CI/CD.

For small teams, Python’s low entry barrier and high development efficiency are worth more than a performance loss that is hard to even notice, provided, of course, that standards are established, quality is emphasized, and attention and optimization are ongoing.

I didn’t expect this answer to attract so much attention, so I’ll add a few more points.

The system mentioned above that handles 350 orders per minute mainly aggregates orders from several takeaway platforms into the system, allowing merchants to use a delivery platform to call riders for delivery. The entire process involves synchronizing takeaway orders and delivery orders along with some management functions.

Order notifications from takeaway platforms (new orders, order status changes, etc.) are sent to our system via HTTP requests. Initially, we used a synchronous method, meaning that upon receiving a request, we called the takeaway platform’s order query interface (and several other associated interfaces) to retrieve order detail data, create orders, and then respond. Due to the large number of network requests, it took considerable time, and as soon as the concurrency increased slightly, we couldn’t handle it. We tried deploying multiple machines and processes, but it didn’t have much effect.

I remember that initially, processing 30 orders per minute was the limit, and any more would lead to noticeable slow responses, while the takeaway platform required us to respond within a specified time, so this synchronous processing approach couldn’t last long before hitting significant bottlenecks. We attempted multi-threaded task queues, but the results were unsatisfactory and there was a risk of task loss.

Later, we used Celery. After receiving notifications, we placed the messages in the Celery queue and returned immediately, allowing Celery worker processes to handle them gradually, avoiding overload during peak times. Since placing messages in the Celery queue is an extremely fast operation, the system can respond to notifications from the takeaway platform instantly.
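The "enqueue and return immediately" pattern can be sketched with the standard library alone. Note that this is a stand-in, not Celery: `queue.Queue` plays the role of the broker, a thread plays the role of a worker process, and `handle_notification` and `fetch_order_detail` are hypothetical names. In Celery, the handler would instead call `.delay(payload)` on a function decorated with `@app.task`.

```python
import queue
import threading

notifications = queue.Queue()  # stand-in for the broker-backed Celery queue
processed = []


def handle_notification(payload):
    """HTTP handler side: enqueue the message and acknowledge at once."""
    notifications.put(payload)
    return {"status": "ok"}


def fetch_order_detail(payload):
    """Hypothetical placeholder for the slow calls to the platform's API."""
    return {"order_id": payload["order_id"], "detail": "..."}


def worker():
    """Worker side: drain the queue at its own pace."""
    while True:
        payload = notifications.get()
        if payload is None:  # sentinel value: shut down
            notifications.task_done()
            break
        processed.append(fetch_order_detail(payload))
        notifications.task_done()


thread = threading.Thread(target=worker, daemon=True)
thread.start()
for i in range(5):
    handle_notification({"order_id": i})
notifications.put(None)
notifications.join()  # wait until everything queued so far has been handled
print(len(processed))  # 5
```

An in-memory queue like this one loses its contents if the process dies, which is the task-loss risk mentioned earlier; a queue backed by an external broker does not depend on the web process staying alive.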

Based on the backlog of messages, we adjust the number of Celery worker processes accordingly and can allocate different queues based on message priority, ensuring that new order notifications are processed promptly so that merchants are informed of new orders that need attention as early as possible.
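As a rough illustration of these two ideas, the sketch below routes notification types to named queues and derives a worker count from the backlog. Everything here is an assumption made for the example: the queue names, the type names, the throughput figure of 30 messages per worker per minute, and the clamping limits are not from the original system.

```python
import math

# Hypothetical mapping from notification type to queue name.
QUEUE_BY_TYPE = {
    "new_order": "high",
    "status_change": "default",
    "settlement": "low",
}


def route(notification_type):
    """Pick the queue for a notification; unknown types go to 'default'."""
    return QUEUE_BY_TYPE.get(notification_type, "default")


def desired_workers(backlog, per_worker_per_minute, target_minutes,
                    min_workers=2, max_workers=32):
    """Workers needed to clear the backlog in time, clamped to a range."""
    needed = math.ceil(backlog / (per_worker_per_minute * target_minutes))
    return max(min_workers, min(max_workers, needed))


print(route("new_order"))  # high
print(desired_workers(backlog=350, per_worker_per_minute=30, target_minutes=1))  # 12
```

In Celery itself, the queue can be chosen per call with `apply_async(queue=...)`, a worker can be bound to specific queues with the `-Q` option, and the worker's `--autoscale` option adjusts the pool size between a maximum and a minimum.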

Initially, we used Redis as the Celery message broker, but later switched to RabbitMQ for easier monitoring. After several years of iteration, we are fairly confident about holiday peaks: we add cloud resources as needed and let the Celery worker processes scale automatically. Barring truly extreme situations, we expect to cope.

In addition, about seven or eight years ago, we used Python 2 + Django 1.8 to build a data reporting system for the government. Each year it was opened for a week so that enterprises could fill in data. About four to five thousand enterprises took part, each filling in seven or eight forms. We did not measure concurrency at the time, but a conservative estimate is in the dozens.

Initially, we ran it using Django’s built-in runserver mode (again, out of inexperience), and it stalled easily. Later, after running several processes with Gunicorn, we no longer encountered language-level stalling issues. When it was slow, it was mostly due to high database load or MongoDB doing data aggregation.
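For reference, moving from runserver to several Gunicorn processes mostly comes down to a small config file. The values below are a sketch, not the configuration the team used: the bind address is assumed, and the worker count follows the "2 x cores + 1" rule of thumb from the Gunicorn documentation, which may need to be lowered on a small machine that also runs databases.

```python
# gunicorn.conf.py (Gunicorn config files are plain Python)
import multiprocessing

bind = "127.0.0.1:8000"  # assumed address, typically behind a reverse proxy
workers = multiprocessing.cpu_count() * 2 + 1  # rule-of-thumb starting point
timeout = 30  # seconds before an unresponsive worker is killed and restarted
```

It would be started with something like `gunicorn -c gunicorn.conf.py myproject.wsgi:application`, where `myproject` is a placeholder for the Django project name.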

The server configuration was not high, only 2C8G, running Python Web, MySQL, MongoDB, and a bunch of other application processes. This system ran for three years, and in the fourth year, due to changes at the government level, another company was brought in to redevelop it, with fewer functions and poorer usability than ours, and I have no idea what language they used.

On this project, the Python + MongoDB approach provided us with great flexibility, as the data filled in each year is different, and the statistical indicators vary. The entire system supports custom report forms, data validation, data import and export, and custom statistics. I feel it would be very difficult to achieve such results with other languages, or the cost to achieve the same results would be much higher.

Of course, this system requires very little maintenance, basically just ensuring accessibility after the initial development. At that time, I led a junior programmer in the development, where I was responsible for the core architecture and most of the code implementation, while he handled simpler logic, UI, table definitions, etc. He may not have easily understood the code I wrote. The maintainability of complex system code largely depends on standards, documentation, and training, rather than type constraints at the language level.

We also developed an internal office system for a travel agency, mainly targeting Southeast Asian travel agencies, supporting multiple languages and currencies, covering almost all daily operations of a travel agency, including planning, group formation, group buying, hotel transportation, shopping, guides, customers, accounting, revenue, finance, reports, charts, etc.

This was also done using Python 2 + Django 1.8. We deployed a separate web process and a separate database for each travel agency (each has its own database name, but a single MySQL instance runs on each machine). Each web process consumes about 170MB of memory when running, and we used 2C8G machines, each serving about 40 clients. Typically each client has around 10 daily users, with larger travel agencies having 20 to 30 employees working at the same time. The concurrency for most clients is estimated not to exceed 10.
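The figures above can be checked with a little arithmetic. The sketch only uses numbers from the text (170MB per process, 40 clients per machine, 8GB of RAM) and assumes 1GB = 1024MB.

```python
process_mb = 170          # memory per web process
clients_per_machine = 40  # one web process per client
machine_mb = 8 * 1024     # 8GB machine

web_total_mb = process_mb * clients_per_machine
share = web_total_mb / machine_mb

print(web_total_mb)    # 6800
print(f"{share:.0%}")  # 83%
```

So the web processes alone take roughly 83% of memory, leaving about 1.4GB for MySQL and the operating system, which explains why about 40 clients per machine is the practical ceiling.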

At the beginning of each month, when each agency is doing accounting and exporting data, they occasionally report stalling issues. From my observation, most of these are performance issues at the database level. Our solution is to tell clients to wait a while before exporting (or if they have a lot of data to export, we suggest they do it at night). As long as a few agencies stagger their data exports, there are no issues.

To save costs, we also built the database on a cloud server, nearly pushing the cloud server to its limits. During the day, the server generally runs at over 80% CPU usage and over 90% memory usage, and CPU usage spikes during data exports.

Before 2020, we had over 100 clients. During the three years of the pandemic, our tourism business basically stopped, and those were probably the quietest years those servers have seen. Last year and this year some clients have come back, but it does not compare to before: their income has dropped sharply and their willingness to pay has fallen significantly.

I started using Python around 2012, having previously used Java and C# more. I used Java for Android and Web development, and C# for Windows desktop programs and Windows Phone development.

Back then, I found the XML configuration of Java’s SSH (Struts + Spring + Hibernate) stack very cumbersome; it felt like I was mostly writing useless code. I don’t know if Spring is more convenient now. Apart from the framework hassle, I feel Java itself is relatively verbose.

C# feels better than Java, but it lags behind in cross-platform capabilities, so now I choose C# for desktop applications, but not for other situations, especially since most desktop applications are now web-based.

Python, on the other hand, is simple enough to let people focus on the business, which is why the company chose it as the primary language (although at that time I was less familiar with Python than with Java and C#). The whole team has been built around Python as well. I would add that with the development of AI, Python will only become more popular.

In previous years, it was relatively difficult to recruit Python developers; most came from other languages and gradually adapted. In recent years, more people arrive with a foundation in Python (thanks to public account advertisements?), but the majority still lack depth and need time to become familiar with it and strengthen their skills.

Generally, strong developers become proficient after about half a year on a project, while slower ones may need more than a year. The key factor is still interest; some people simply have a passion for programming, work on personal projects after hours, and progress quickly.

One team’s experience cannot be fully replicated by another. I am one of the co-founders of the company and I set the technical direction, so turnover is not an issue on that front.

I am still very interested in using Python to solve most of the technical problems we encounter, and in questions such as how to establish standards and lead people so that code quality stays under control and the team keeps improving.

In summary, over the years, I have accumulated some technical and management experience through ups and downs, and to be honest, I am less confident in switching to other languages.
