Analysis of long-running Dante process
Posted by Inferno Nettverk A/S, Norway on Thu Apr 23 14:42:08 MEST 2015
Making software run properly requires testing and we continuously run tests internally for software we develop in order to uncover as many bugs and problems as possible before the software is released. Most problems are usually detected by our internal testing, but testing rarely manages to reproduce all the diverse configurations and environments that the software might be used in when in production.
Fortunately, now and then there are Dante users who allow us to test and monitor our software on their machines when we are getting close to a new release and most of our internal tests are passing. Apart from allowing us to see how Dante behaves in a specific production environment, this testing has sometimes allowed us to find and fix problems that were not found in our internal tests, and it often provides an opportunity to improve our internal test systems based on the newly discovered problems.
In one case, during preparation for the release of Dante 1.4.0, the machine the customer was going to deploy the new Dante version on was running Dante 1.3.2, and had been running the same server instance uninterrupted for almost one year. When the main Dante server process was about to be terminated so that it could be upgraded to Dante 1.4.0, it had been running for 326 days, 18 hours and 34 minutes.
This was a good opportunity to observe the behavior of a long-running Dante server, so before terminating the 1.3.2 processes, we were allowed to spend some days collecting data on the customer's machine.
Data collection and analysis
We have a set of scripts that we use to collect various system data, based on tools such as ifconfig, netstat, ps, etc. This makes it possible to determine the load on the machine, and in some cases, additional information about the number of active Dante sessions.
The figure above shows the incoming and outgoing transfer rates on the machine, with almost all of the traffic passing through the Dante server since it was a server dedicated to running Dante. For the measured period, the incoming and outgoing rates overlap, indicating, as expected, that most of the traffic passes through the machine rather than originating or terminating there. The measured rates vary between 56 Mbps and 198 Mbps, with the average being 108 Mbps.
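As a rough illustration of how rates like these can be derived, the sketch below converts two readings of an interface's cumulative byte counters into megabits per second. It is not the script used for this analysis: the `Sample` type, the `rates_mbps` function and the sample values are hypothetical, and it assumes the counters only increase between the two readings.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """One reading of an interface's byte counters (hypothetical format)."""
    timestamp: float  # seconds since some fixed starting point
    rx_bytes: int     # cumulative bytes received
    tx_bytes: int     # cumulative bytes sent


def rates_mbps(prev: Sample, cur: Sample) -> tuple[float, float]:
    """Return (incoming, outgoing) rates in megabits per second."""
    elapsed = cur.timestamp - prev.timestamp
    if elapsed <= 0:
        raise ValueError("samples must be in increasing time order")
    # bytes -> bits (x 8), per second (/ elapsed), bits -> megabits (/ 10^6)
    rx = (cur.rx_bytes - prev.rx_bytes) * 8 / elapsed / 1_000_000
    tx = (cur.tx_bytes - prev.tx_bytes) * 8 / elapsed / 1_000_000
    return rx, tx


# Illustrative values only, not measurements from the customer's machine.
before = Sample(timestamp=0.0, rx_bytes=0, tx_bytes=0)
after = Sample(timestamp=10.0, rx_bytes=135_000_000, tx_bytes=135_000_000)
print(rates_mbps(before, after))
```

Nearly equal incoming and outgoing values, as in this made-up example, are what one would expect from a machine that mostly forwards traffic.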
A comparison of the packet rates for the different IP protocols is shown in the above figure. Around 25,000 packets are received and sent each second, with a slightly higher number being received. The traffic is largely split between TCP and UDP, with the number of UDP packets being slightly larger. A relatively small amount of ICMP traffic can also be seen from the lines at the very bottom of the diagram.
An overview of the processes running on the machine can be seen in the figure above. Between around 300 and 500 processes run on the machine, with around half of them being Dante processes. Most of the variability in the number of processes comes from changes in the number of Dante processes, as Dante adapts to changes in the client load on the machine, increasing and decreasing the process count based on the number of active clients.
The CPU usage on the machine is shown in the above plot, and it is fairly low. Dante uses a total of between around 8% and 20% of the CPU for all its processes. The total CPU usage for the non-Dante processes is also fairly low and stable, being between 4% and 10%.
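The process and CPU breakdowns above can be approximated from `ps` output. The following is a minimal sketch, not our actual collection script: it assumes the input looks like the output of `ps -eo pcpu,comm` and that the Dante server processes appear under the command name `sockd`. The `summarize_ps` function and the sample text are hypothetical.

```python
def summarize_ps(ps_output: str, name: str = "sockd") -> dict:
    """Split process count and %CPU between `name` and everything else.

    Expects text like the output of `ps -eo pcpu,comm`: a header line,
    then one "<%cpu> <command>" line per process.
    """
    summary = {"dante_procs": 0, "other_procs": 0,
               "dante_cpu": 0.0, "other_cpu": 0.0}
    for line in ps_output.strip().splitlines()[1:]:  # skip the header line
        fields = line.split(None, 1)
        if len(fields) != 2:
            continue  # ignore lines that do not have both columns
        cpu, command = float(fields[0]), fields[1].strip()
        group = "dante" if command == name else "other"
        summary[f"{group}_procs"] += 1
        summary[f"{group}_cpu"] += cpu
    return summary


# Made-up sample text, not data from the customer's machine.
sample = """\
%CPU COMMAND
 0.5 sockd
 1.5 sockd
 0.0 sshd
 2.0 cron
"""
print(summarize_ps(sample))
```

Note that on some systems the `%CPU` column reported by `ps` is averaged over the lifetime of each process, so repeated sampling of CPU time counters may give a more accurate picture of current load.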
The above figure shows the memory usage on the machine. Most of the memory is in use, but there is very little swap usage.
From the data on this machine one can observe that Dante, at least for the type of client load handled by this machine, is able to run stably for a long time while handling a fairly large amount of traffic. One can also see that Dante adjusts the number of processes dynamically as the need changes.
If the main Dante server process, or any of the other long running Dante processes, had suffered from even modest resource leaks, such defects in Dante should have been easily visible after handling this amount of traffic on a daily basis for almost one year. One could also have expected Dante to have crashed a long time ago if such problems were in general present in Dante.
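One simple way to look for the gradual growth that a resource leak would cause is to fit a straight line to periodic samples of a process's resource usage and inspect the slope. The sketch below does this with an ordinary least-squares fit. It is an illustration only, not part of our analysis: the `trend_per_day` function and both data series are hypothetical.

```python
def trend_per_day(samples: list[tuple[float, float]]) -> float:
    """Least-squares slope of (day, usage) samples, in usage units per day."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_x = sum(x for x, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in samples)
    if sxx == 0:
        raise ValueError("samples must cover more than one point in time")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in samples)
    return sxy / sxx


# Made-up series of (day, memory use in MB) for two imaginary processes.
steady = [(0, 50.0), (1, 51.0), (2, 49.0), (3, 50.0)]
leaking = [(0, 50.0), (1, 52.0), (2, 54.0), (3, 56.0)]
print(trend_per_day(steady), trend_per_day(leaking))
```

A slope that stays close to zero over a long period is consistent with the absence of a leak, while even a small positive slope adds up to a large amount over a run of several hundred days.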
Part of the reason for the long lifetime of the processes might be that Dante has been designed to avoid dynamic memory allocation as much as possible, as dynamic allocation can often lead to gradual resource leaks and performance degradation when not implemented fully correctly. This design choice can often make the code less complicated as well, though the trade-off is that Dante may use somewhat more memory than strictly necessary at any given time. The benefits can, however, likely be seen in the long lifetime of this Dante server instance, which was able to handle a large number of clients and a large amount of traffic month after month.
In sum, after running for almost one year there is nothing remarkable about the data from this machine. The machine is under load and it is able to handle the load with a good margin for spikes in client sessions and traffic. Both the machine and Dante appear to be running well, and there is no indication that it would not have continued to run for another year or two if we had not terminated it in order to test a new Dante version.
We consider this type of testing and analysis essential for the development of well-working software.
If you are in a situation where you can help us with this type of testing, feel free to contact us at the (non-public) email address email@example.com. While being able to access and test software remotely is very helpful, having access to performance data from a production environment is also very useful. We have a set of shell scripts that we use for data collection (based on tools such as ps, ifconfig, etc.). If you think you might be able to collect this type of system information from one of your systems running Dante, please feel free to contact us on the address above, and we will send you a copy of the scripts we use for this.
In addition to helping us produce releases that have fewer bugs, the benefits to you include that you are much less likely to experience any unexpected problems or performance issues when you upgrade to the latest Dante version, as we will already have tested and analyzed the performance of Dante on your machine. As a bonus, you also get a free performance analysis of your Dante server.
Subject: When will you update Debian repository...
Date: May 2016
Hello, I'm using Debian Jessie and I was wondering; do you abandon debian? https://packages.debian.org/search?keywords=dante-server I emailed them several times last year, and they still doesn't update it. I'm using 1.1(installed when Debian Wheezy era), and I think it's time to upgrade Debian repository. (And 1.1 has horrible bug that stop connection if the user upload big file using POST method to webserver)
Name: Inferno Nettverk A/S
Subject: Debian repository
Date: July 2016
The Dante packages included with distributions such as Debian are not maintained by us. Unfortunately, this means that sometimes only old versions of Dante are available. Version 1.1 of Dante is very old and we would recommend that you manually compile a newer version, if possible.