The Biggest Pitfall in Security Monitoring Networking: Is It Really the Switch?


Since security systems moved from analog to IP, networking has become increasingly important, and increasingly complex, in security applications. In DOUDOU's years of experience in the security networking field, many technical personnel in the industry have gone down the same wrong paths. Whether at security manufacturers, integrators, or end users, misunderstandings abound about how to select switches and about what actually causes video lag.

Many of the so-called selection guides and documents circulating in the market are riddled with pitfalls, such as a recent article titled “How Many Cameras Can a Switch Support?” So today I will summarize the most common misconceptions.


  • Myth 1: Blindly calculating the number of cameras based on switch capacity

This calculation simply divides the switch’s capacity by the camera’s bitrate to determine the number of cameras it can support.

According to this theory, on a 24-port unmanaged full-Gigabit switch, each 1000 Mbps port should comfortably carry up to 250 cameras at a 4 Mbps bitrate (1000 ÷ 4 = 250), suggesting that the entire switch could support thousands of cameras.
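To make the flawed arithmetic explicit, here is the Myth-1 calculation as a short Python sketch (the 4 Mbps bitrate and 24-port Gigabit figures are just the example numbers above):

```python
# Myth-1 arithmetic: divide the port rate by the camera bitrate.
PORT_RATE_MBPS = 1000     # one Gigabit port
CAMERA_BITRATE_MBPS = 4   # 4 Mbps main stream per camera
PORTS = 24

cameras_per_port = PORT_RATE_MBPS // CAMERA_BITRATE_MBPS  # 250
cameras_per_switch = cameras_per_port * PORTS             # 6000

print(f"Per port:   {cameras_per_port} cameras")
print(f"Per switch: {cameras_per_switch} cameras")
```

The numbers look plausible on paper; the rest of this article explains why they fall apart in practice.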


Even if we assume actual performance is only 60-70% of the theoretical value, each port could still carry around 150 cameras, meaning the entire switch could still support well over 1,000.

But is this the actual situation?

Following this logic, there would be no difference between a Gigabit unmanaged switch and a managed switch in the number of devices supported. And when we try to analyze the network causes of video lag using this theory, the analysis collapses entirely.

In the end, we find that the bandwidth design at every node in the network is fine, there are no traffic bottlenecks, and the switch appears to be operating normally, yet the video still lags and pixelates. How do we explain that?

  • Myth 2: The actual performance of switches is generally only 60-70% of the theoretical value?

Many people, even pre-sales personnel from switch manufacturers, will tell you when drafting security proposals that the actual forwarding performance of a switch is only 60-70% of the theoretical value, so you need to leave some margin when calculating the number of devices.

Having worked in the data communication field for 7 years, including time at equipment manufacturers and chip companies, I have never seen a chip from any manufacturer that fails to meet its theoretical performance (switching capacity).

A 24-port Gigabit switch chip must have a switching capacity of at least 48 Gbps (24 ports × 1 Gbps × 2 for full duplex = 48 Gbps). I don't think any chip design company would make such a basic mistake, nor would any reputable switch manufacturer market a switch that cannot forward at line rate (excluding chassis switches with deliberate oversubscription ratios).
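The same arithmetic in code, including the wire-speed packet-forwarding rate that datasheets usually quote next to switching capacity (the 84-byte on-wire size of a minimum Ethernet frame is standard framing, not vendor-specific):

```python
# Switching capacity and wire-speed forwarding rate of a 24-port
# Gigabit switch, computed from standard Ethernet framing.
PORTS = 24
PORT_RATE_BPS = 1_000_000_000  # 1 Gbps per port

# Full duplex: each port can send and receive at line rate at once.
switching_capacity_gbps = PORTS * (PORT_RATE_BPS / 1e9) * 2
print(f"Switching capacity: {switching_capacity_gbps:.0f} Gbps")  # 48 Gbps

# Wire-speed forwarding: a minimum 64-byte frame occupies 84 bytes on
# the wire (8-byte preamble + 12-byte inter-frame gap included).
FRAME_ON_WIRE_BITS = (64 + 8 + 12) * 8
pps_per_port = PORT_RATE_BPS / FRAME_ON_WIRE_BITS
print(f"Per port:   {pps_per_port:,.0f} pps")          # ~1,488,095
print(f"Per switch: {PORTS * pps_per_port:,.0f} pps")  # ~35.7 million
```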


If you really have encountered a switch whose switching capacity reaches only 60-70% of the theoretical value, congratulations: you have bought a defective product. No reputable manufacturer could ship such a unit, because it would require a defect in research, design, or production, and then the product going straight to market without professional testing. The same applies to forwarding rates.

  • Myth 3: Selecting switches based on experience

Currently, when various network equipment manufacturers engage in security network projects, in addition to selecting based on port specifications and switching capacity, one of the most important methods is to select based on past project experience.

However, we often encounter situations where the same switch performs differently across projects, even when those projects have similar network scales, camera counts, bitrates, and networking schemes.

Project A is fine, Project B is also fine, but Project C experiences lag. WHY?

You immediately contact the manufacturer to replace the switch, and once replaced, it works fine. It seems like bad luck. But after a while, the lag returns. WHY?


Endlessly replacing devices, rebooting them, adjusting the network structure, and so on. After repeated efforts, it may work, or it may still lag at random, leaving everyone exhausted and without a conclusion. Even the leading network brands cannot give an accurate reason.

So what is really going on? To answer that, we need to go back to first principles.

First, let’s briefly analyze the basic principles of video stream transmission:

Video streams consist of I-frames and P-frames, where I-frames are large. During network transmission, losing even one packet of an I-frame means the frame cannot be displayed correctly. Additionally, because video is real-time, it is generally carried over UDP, so lost packets are not retransmitted. Therefore, as long as there is packet loss in the network, lag will occur.
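A rough sketch of why this matters, assuming an illustrative 60 KB I-frame split into 1400-byte UDP payloads (both figures are assumptions for the example, not measurements):

```python
import math

# Assumed figures for illustration only.
I_FRAME_BYTES = 60_000     # one I-frame
UDP_PAYLOAD_BYTES = 1_400  # payload per packet under a 1500-byte MTU

packets_per_iframe = math.ceil(I_FRAME_BYTES / UDP_PAYLOAD_BYTES)  # 43

for loss_rate in (0.001, 0.01, 0.05):
    # No retransmission: the frame survives only if every packet does.
    p_intact = (1 - loss_rate) ** packets_per_iframe
    print(f"loss {loss_rate:.1%}: {p_intact:.1%} of I-frames intact")
# Even 1% packet loss corrupts roughly a third of all I-frames,
# which the viewer sees as lag and pixelation.
```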

Next, let’s briefly introduce the switching principles of switches:

When a 100 Mbps port sends 1 Mbit of data to another 100 Mbps port, it transmits at 100 Mbps for 1/100 of a second. If, during that same 1/100 of a second, another 100 Mbps port also sends 1 Mbit to the same destination port, congestion will occur, even though the combined flow from both ports is only 2 Mbit, nowhere near the 100 Mbps bandwidth bottleneck.

Similarly, a 1000M port can only accept data from one 1000M port at the same time, but it can accept data from ten 100M ports simultaneously. However, if it exceeds ten, congestion will occur.

Therefore, traffic (bandwidth) and rate are two different concepts and must not be confused. No matter how large the transmitted data flow is, the transmission rate remains 100 Mbps or 1000 Mbps; only the time it takes to transmit different amounts of data varies. When the rates are equal, if two or more ports transmit to the same port simultaneously, congestion will occur. If the buffer can absorb the congested data, there is no packet loss; if it cannot, packets are dropped.

From these two simple analyses, we can see that the more video streams a switch carries, the greater the chance of instantaneous concurrency, and thus the higher the probability of congestion. This is why the aggregation and core layers are more prone to congestion, especially the core layer, where hundreds or even thousands of streams pass through the core switch.
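To put numbers on that, here is a minimal microburst model. The burst size, buffer size, and stream counts are illustrative assumptions; the point is only that loss appears abruptly once simultaneous arrivals exceed what the buffer can absorb:

```python
# N ingress ports burst toward one egress port at the same instant.
# While the bursts arrive, the egress can drain roughly one stream's
# worth of data; the rest must queue in the buffer or be dropped.
BUFFER_BYTES = 512 * 1024  # assumed 512 KB available to this port
BURST_BYTES = 60_000       # assumed burst per stream (one I-frame)

def dropped_bytes(streams: int) -> int:
    """Bytes lost when `streams` bursts arrive simultaneously."""
    queued = (streams - 1) * BURST_BYTES  # all but one stream must queue
    return max(0, queued - BUFFER_BYTES)

for n in (2, 5, 9, 10, 20):
    print(f"{n:>2} simultaneous bursts -> {dropped_bytes(n):,} bytes dropped")
# Average utilization can be tiny, yet once enough streams collide in
# the same instant, the buffer overflows and packets are lost.
```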

It is important to emphasize again that in security networks, most lag and packet loss are caused by this type of congestion, not by forwarding performance. These are two completely different concepts.

Note: Many clients confuse latency with lag. Latency is the time difference between when the front-end network camera captures an image and when it appears on the user's monitor. The captured image goes through compression, network transmission, decoding, and display output; although each step is brief, together they produce a perceptible delay in the displayed image, which is what we call image latency. As long as latency stays under about 1 second, it is hard to perceive and does not affect most applications, except in specific industrial fields where video analysis requires millisecond-level processing. Latency does not cause image loss or packet loss. Lag, on the other hand, means lost images, and it is caused by packet loss.

Besides congestion, another cause of packet loss is the quality of the cabling work: aging cables, oxidized connectors, poorly terminated connectors, and so on. These produce FCS error frames and therefore packet loss. Strictly speaking, this is unrelated to the switch, so I won't elaborate further.

So how should we network security monitoring systems to minimize lag and packet loss? DOUDOU has two suggestions.

1. Select switch specifications based on the camera’s bitrate and quantity, and design a good networking scheme.

DOUDOU believes that as IP networking becomes more prevalent in security, practitioners' technical capabilities will gradually improve, and network failures caused by specification selection and networking schemes will become less frequent. A bandwidth bottleneck caused at this stage really is too basic a mistake. For a network with a given number of cameras at a given bitrate, how many switches with what port specifications (port count and port rate) to choose at the access layer, how many at the aggregation layer, and what to choose for the core: this kind of basic knowledge is covered extensively online, so I won't waste words on it here.

At the same time, to absorb traffic bursts, it is recommended that port bandwidth utilization not exceed 70% in selection and design, and ideally stay below 60%. Note: this is not because actual performance is only 60-70% of theoretical; it is to leave headroom for bursts. Forwarding performance is guaranteed first; then we design to avoid congestion.
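Under those ceilings, uplink sizing reduces to simple division. A sketch with placeholder camera counts and bitrates:

```python
import math

def uplinks_needed(cameras: int, bitrate_mbps: float,
                   uplink_mbps: float, max_util: float = 0.6) -> int:
    """Uplinks of a given rate needed to keep utilization <= max_util."""
    total_mbps = cameras * bitrate_mbps
    return math.ceil(total_mbps / (uplink_mbps * max_util))

# Example: 200 cameras at 4 Mbps feeding an aggregation switch.
print(uplinks_needed(200, 4, 1_000))   # 2 x 1G uplinks (800 Mbps / 600 usable)
print(uplinks_needed(200, 4, 10_000))  # 1 x 10G uplink is comfortable
```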

2. Choose managed switches with larger buffers whenever possible.

Buffers reduce packet loss caused by congestion. In theory, with a large enough buffer, packet loss would be zero and video would never lag for network reasons. A client once asked DOUDOU how to calculate the buffer size a switch would need for a given number of cameras at a given bitrate. It can indeed be calculated in theory, but in practice you will find that no switch on Earth has a buffer that large.
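To see why, push the calculation to the worst case: every stream's I-frame landing on one core port in the same instant. The figures below are illustrative:

```python
# Worst-case buffer estimate: every camera's I-frame arrives at the
# same egress port at once. Illustrative figures only.
CAMERAS = 1000
I_FRAME_BYTES = 60_000  # assumed I-frame size per stream

# One stream's worth can drain while the rest arrives; the remainder
# must all sit in the buffer to avoid loss.
worst_case_bytes = (CAMERAS - 1) * I_FRAME_BYTES
print(f"Buffer needed on one port: {worst_case_bytes / 1e6:.0f} MB")  # ~60 MB
# Even a high-end switch's tens of MB of buffer is shared across all
# ports, so provisioning for the absolute worst case is impractical.
```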

Congestion is probabilistic; it is impossible for every port to congest simultaneously, and buffer memory is expensive, so chip companies do not design for the worst case. As a rule, the higher-end the switch, the richer its feature set and the larger its buffer. This is why managed and layer-3 switches show a lower probability of packet loss and lag. For example, a 24-port Gigabit unmanaged switch may have only a few hundred KB of buffer, while a layer-3 switch may have tens of MB.

Therefore, when the budget allows, choose managed switches with larger buffers, as this reflects a consistent pattern in chip design. A small piece of trivia: the switching capacity of a 24-port Gigabit unmanaged chip and a 24-port Gigabit layer-3 chip is the same; the differences lie in table capacities, buffer size, and feature set. Equipment manufacturers developing switches can only choose a chip with a larger buffer; they cannot change the buffer size, as it is a hardware characteristic of the chip.

However, no matter how switches are selected and networks are designed, no manufacturer today dares to guarantee that its products and solutions will never lag in any security project, including well-known brands like Huawei and H3C. Camera bitrates are dynamic, the possibility of congestion always exists, and no switch buffer can fully absorb every congestion event.

