Scaling Performance Tests in the Cloud with JMeter

I have very heterogeneous performance test use cases, ranging from simple performance regression tests executed from a Jenkins node to occasional large-ish stress tests that run at over 100K requests per second across more than 100 load generators. At higher loads, many problems arise: feeding data to the load generators, retrieving results, getting a real-time view, analyzing huge data sets and so on.

JMeter is a great tool, but it has its limitations. In order to scale, I had to work around a few of them and build a test framework that helps me execute tests at scale on Amazon's EC2.

Having a central data feeder was a problem. Using JMeter's master node for this was not viable, and a single shared data source can easily become a bottleneck, so having a way of distributing the data was important. I thought about using a feeder model similar to Twitter's Iago, or a clustered, load-balanced resource, but settled for something simpler. Since most tests only use a limited data set and loop around it, I decided to simply bzip2 the data files and upload them to each load generator before the test starts. This avoids making an extra request for data during execution and re-requesting the same data on every loop. One problem with this approach is that I don't have centralized control over the data set, since every load generator uses the same input. I mitigate that by managing the data locally on each load generator, with a hash function or by introducing random values. I also considered distributing different files to different load generators based on a hash function, but so far there has been no need.
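As a rough illustration, the pre-test push boils down to compressing the data once and copying it to every host. This sketch uses plain bzip2 and scp through ProcessBuilder; the host names, key file and data file are made up for the example:

import java.util.Arrays;
import java.util.List;

// Sketch only: compress the shared data set once and push it to every load
// generator before the test starts. Hosts, key file and paths are placeholders.
public class DataPusher {
    public static void main(String[] args) throws Exception {
        List<String> generators = Arrays.asList("lg-01.example.com", "lg-02.example.com");

        // Compress the shared data set once (-k keeps the original, -f overwrites).
        new ProcessBuilder("bzip2", "-kf", "testdata.csv")
                .inheritIO().start().waitFor();

        // Copy the same compressed file to each load generator.
        for (String host : generators) {
            new ProcessBuilder("scp", "-i", "perf-key.pem",
                    "testdata.csv.bz2", "ec2-user@" + host + ":/opt/jmeter/data/")
                    .inheritIO().start().waitFor();
        }
    }
}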

Retrieving results was tricky too. Again, using JMeter’s master node was impossible because of the amount of traffic. I tried having a pooler fetching raw ( only timestamp, label, success and response time ) results in real-time, but that affected the results. Downloading all results at the end of the test worked by checking the status of the test ( running or not ) every minute and downloading after completion, but I settled with having a custom sampler in a tearDown thread group, compressing and uploading results to Amazon’s S3. This could definitely be a plugin too. It works reasonably well, but I loose the real-time view and have to manually add a file writer and sampler to tests.
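A stripped-down sketch of such a tearDown sampler might look like this, assuming the AWS SDK for Java is on JMeter's classpath; the bucket name, key prefix and results path are hypothetical parameters:

import java.io.*;
import java.net.InetAddress;
import java.util.zip.GZIPOutputStream;
import org.apache.jmeter.protocol.java.sampler.AbstractJavaSamplerClient;
import org.apache.jmeter.protocol.java.sampler.JavaSamplerContext;
import org.apache.jmeter.samplers.SampleResult;
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3ClientBuilder;

// Sketch of a Java sampler meant to run in a tearDown thread group:
// gzip the local JTL file and upload it to S3.
public class UploadResultsSampler extends AbstractJavaSamplerClient {

    @Override
    public SampleResult runTest(JavaSamplerContext ctx) {
        SampleResult result = new SampleResult();
        result.sampleStart();
        try {
            String jtl = ctx.getParameter("resultsFile", "/opt/jmeter/results.jtl");
            String bucket = ctx.getParameter("bucket", "perf-results");
            File gz = gzip(new File(jtl));

            // Key includes the host name so generators don't overwrite each other.
            AmazonS3 s3 = AmazonS3ClientBuilder.defaultClient();
            String key = "runs/" + InetAddress.getLocalHost().getHostName() + "/" + gz.getName();
            s3.putObject(bucket, key, gz);

            result.setSuccessful(true);
        } catch (Exception e) {
            result.setSuccessful(false);
            result.setResponseMessage(e.toString());
        }
        result.sampleEnd();
        return result;
    }

    private File gzip(File src) throws IOException {
        File out = new File(src.getAbsolutePath() + ".gz");
        try (InputStream in = new FileInputStream(src);
             OutputStream os = new GZIPOutputStream(new FileOutputStream(out))) {
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) > 0) {
                os.write(buf, 0, n);
            }
        }
        return out;
    }
}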

For the real-time view, I started with the same approach as jmeter-ec2, polling aggregated data (average response time, rps, etc.) from each load generator and printing it, but that proved useless with a large number of load generators. For now, on Java samplers, I'm using Netflix's Servo to publish metrics in real time (averaged over a minute) to our monitoring system. I'm considering writing a listener plugin that could use the same approach to publish data from any sampler. From the monitoring system I can then analyze and plot real-time data with minor delays. Another option I'm considering is the same approach, but using StatsD and Graphite.
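For the StatsD variant, the publishing side is trivial, since the protocol is just a short text line over UDP. A minimal sketch, with made-up host, port and metric names; a listener plugin would call publish() for each sample:

import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.nio.charset.StandardCharsets;

// Sketch only: fire a StatsD timing metric over UDP for each sample.
public class StatsdPublisher {
    private final DatagramSocket socket;
    private final InetAddress host;
    private final int port;

    public StatsdPublisher(String host, int port) throws Exception {
        this.socket = new DatagramSocket();
        this.host = InetAddress.getByName(host);
        this.port = port;
    }

    // StatsD timing format: <metric>:<millis>|ms
    public void publish(String label, long elapsedMs) throws Exception {
        byte[] payload = ("jmeter." + label + ":" + elapsedMs + "|ms")
                .getBytes(StandardCharsets.UTF_8);
        socket.send(new DatagramPacket(payload, payload.length, host, port));
    }

    public static void main(String[] args) throws Exception {
        StatsdPublisher statsd = new StatsdPublisher("statsd.example.com", 8125);
        statsd.publish("login", 235); // e.g. a 235 ms "login" sample
    }
}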

Analyzing huge result sets was, I believe, the biggest challenge. For that, I've developed a web-based analysis tool. It doesn't store raw results, but mostly time-based, aggregated statistical data from both JMeter and the monitoring systems, allowing some data manipulation for analysis and automatic comparison of result sets. Aggregating and analyzing tests with over 1B samples is still a problem, even after constant tuning. Loading all data points into memory to sort them and calculate percentiles is practically impossible, simply because the amount of memory needed is impractical, even with small objects. For now, on large tests, I settled on aggregating data while loading results (second/minute data points) and accepting the statistical problems, like averages of averages. Another option would be to analyze results from each load generator independently and aggregate at the end. In the future, I'm considering keeping results on a Hadoop cluster and using MapReduce to get the aggregated statistical data back.
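The streaming aggregation itself is straightforward; here is a rough sketch that folds a CSV JTL into per-second buckets of count/sum/min/max, assuming a column layout of timestamp, elapsed, label, success:

import java.io.BufferedReader;
import java.io.FileReader;
import java.util.Map;
import java.util.TreeMap;

// Sketch only: aggregate raw samples into per-second buckets instead of
// keeping every data point in memory.
public class SecondAggregator {

    static class Bucket {
        long count, sum, max;
        long min = Long.MAX_VALUE;
        void add(long elapsed) {
            count++;
            sum += elapsed;
            min = Math.min(min, elapsed);
            max = Math.max(max, elapsed);
        }
        double avg() { return count == 0 ? 0 : (double) sum / count; }
    }

    public static void main(String[] args) throws Exception {
        Map<Long, Bucket> buckets = new TreeMap<Long, Bucket>();
        BufferedReader in = new BufferedReader(new FileReader(args[0]));
        String line;
        while ((line = in.readLine()) != null) {
            if (line.isEmpty() || !Character.isDigit(line.charAt(0))) continue; // skip header
            String[] f = line.split(",");
            long second = Long.parseLong(f[0]) / 1000;   // bucket key: timestamp in seconds
            long elapsed = Long.parseLong(f[1]);
            Bucket b = buckets.get(second);
            if (b == null) {
                b = new Bucket();
                buckets.put(second, b);
            }
            b.add(elapsed);
        }
        in.close();
        for (Map.Entry<Long, Bucket> e : buckets.entrySet()) {
            Bucket b = e.getValue();
            System.out.printf("%d rps=%d avg=%.1f min=%d max=%d%n",
                    e.getKey(), b.count, b.avg(), b.min, b.max);
        }
    }
}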

The framework also helps me automate most of the test process, like creating a new load generator cluster on EC2, copying test artifacts to load generators, executing and monitoring the test while it’s running, collecting results and logs, triggering analysis, tearing down the cluster and cleaning up after the test completes.
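The cluster creation step, for instance, is little more than a few AWS SDK for Java calls; a sketch with made-up AMI id, instance type, key name and instance count:

import com.amazonaws.services.ec2.AmazonEC2;
import com.amazonaws.services.ec2.AmazonEC2ClientBuilder;
import com.amazonaws.services.ec2.model.Instance;
import com.amazonaws.services.ec2.model.RunInstancesRequest;
import com.amazonaws.services.ec2.model.RunInstancesResult;

// Sketch only: launch a fixed-size load generator cluster on EC2.
public class ClusterLauncher {
    public static void main(String[] args) {
        AmazonEC2 ec2 = AmazonEC2ClientBuilder.defaultClient();

        RunInstancesRequest request = new RunInstancesRequest()
                .withImageId("ami-12345678")   // pre-baked JMeter AMI (placeholder)
                .withInstanceType("c3.xlarge")
                .withKeyName("perf-key")
                .withMinCount(10)
                .withMaxCount(10);

        RunInstancesResult result = ec2.runInstances(request);
        for (Instance i : result.getReservation().getInstances()) {
            System.out.println("Launched " + i.getInstanceId());
        }
    }
}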

Most of this was written in Java or Groovy and I hope to open-source the analysis tool in the future.

Stop_time is 0 LoadRunner Analysis Error

This was not the first time I had gotten this error message when trying to analyze results in LoadRunner Analysis. It is pretty strange, since from the Controller the test seemed to run fine. Usually I would just discard the test results and re-run the test, to avoid troubleshooting yet another problem caused by a LoadRunner bug that probably messed up the results, but this time I decided to take some time to check what had happened to them.

When I tried to analyze the raw results in Analysis, the import process failed and the error log had the following message:

Analysis Error log: <1/6/2012 1:23:52 PM>
Error 75012: in file 25983.lrr the Stop_time is 0

This seemed really strange, but the first thing I tried was to open the .lrr file with a text editor. I searched for the Stop_time parameter and it was missing; Start_time was there, apparently with the correct test start time. So let's add Stop_time to the file and see what happens.

Both Start_time and Stop_time are Epoch, or UNIX time, which is essentially seconds since 01-01-1970. Just to be sure, check your Start_time with an online Epoch converter. If the time is correct, do the math to figure out how many seconds your test ran: 1 hour equals 3600 seconds, 2 hours 7200, and so on.
Add that number to Start_time, then add a new parameter called Stop_time with the result. E.g.:

Start_time=1325809089 // Fri, 06 Jan 2012 00:18:09 GMT
Stop_time=1325816289 // Fri, 06 Jan 2012 02:18:09 GMT
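If you want to double-check the values, a couple of lines of Java (8+) will do; the two-hour duration here is just the example above:

import java.time.Instant;

// Quick sanity check of the epoch values (hypothetical 2-hour test run).
public class StopTimeCheck {
    public static void main(String[] args) {
        long startTime = 1325809089L;          // Start_time from the .lrr file
        long stopTime = startTime + 2 * 3600;  // test duration in seconds
        System.out.println(Instant.ofEpochSecond(startTime)); // 2012-01-06T00:18:09Z
        System.out.println(Instant.ofEpochSecond(stopTime));  // 2012-01-06T02:18:09Z
    }
}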

I saved the file, opened the results again, and it worked!

Hope this helps save you some time and a few headaches!

Cheers,
Martin