Writing another
load test tool

That was a stupid idea, wasn't it?

René Schwietzke, Xceptance GmbH

About René Schwietzke

  • Master of Computer Science (Dipl.-Inf.)
  • Programmer since 1989
  • Java since Java 1.0, 1996
  • QA and Testing since 1998
  • Performance Tester since 1999
  • Co-Founder of
  •    @ReneSchwietzke
  • @reneschwietzke@foojay.social

About

  • Founded 2004
  • Headquarters in Jena, Germany; Subsidiary in Cambridge, MA, USA
  • Specialized in Software Testing and Quality Assurance
  • Performance testing since 2006
  • Over 100 performance test projects every year
  • World-wide customer base including APAC and South America
  • Performance Test Tool, Java-based, APL 2.0

License

I Care and Share


This presentation is licensed under
Creative Commons Attribution-ShareAlike 4.0 International License.

A little bit of History

History Before the History

SilkPerformer

2000 to 2004

  • Work at an Ecommerce Vendor
  • Segue SilkPerformer
  • Microsoft Windows*
  • Scripting language Visual Basic-like
  • Needed an SQL Server
  • No real open data
  • Odd protocol for agents
  • No good default reports
  • 500,000 Euro, 50k(?) users
  • Plus support at 15%-20% annually
  • A lot of hardware, 600 blade machines

Early Gigs

The First Load Tests

2006

Back to Load Testing with a New Company

  • First large customer
  • Required extensive product load testing
  • Later also project load testing
  • Hosting fully on Linux
  • Cloud was not yet a thing
  • They had no money as a startup
  • We had no money as a startup
  • LoadRunner and SilkPerformer not an option
  • No project license concept

Why not JMeter?

Wasn't There Something Open Source?

  • JMeter was already a thing
  • Version 2.X or something
  • Not yet an Apache TLP project
  • Only request level recording
  • No scripting language
  • Cumbersome UI
  • Scaling was difficult
  • No debugging
  • No ready-to-use reports

Write your own
when nothing fits

A First Version

A first version

The How and What of the First Version

  • Needed a debuggable scripting language: Java
  • Something familiar for reuse and IDE support: JUnit
  • Must fit version control systems (SVN was a thing)
  • Didn't want to fiddle with dynamic forms: HtmlUnit
  • Wanted to query the DOM and not regex it: HtmlUnit
  • HttpClient in the JDK was horrible: Apache HttpClient
  • Charts: JFreeChart
  • Strange naming in most tools (transaction/request)

Drawbacks

Not Everything Was Working Yet

  • No HTML reports
  • No good scaling
  • Multi-machine parallel runs for scale

[java] ===================================== Total =======================================
[java] Timer List Size: 16
[java] RegisterUser: 356 (1924msec) within 00:14:58,096
[java] SimpleSearch: 45920 (174msec) within 00:15:45,088
[java] Storefront: 11283 (36msec) within 00:15:43,762
[java] Checkout.OrderSummary: 2642 (891msec) within 00:15:22,356
[java] ViewCart: 2642 (587msec) within 00:15:23,727
[java] ProductDetails: 25625 (59msec) within 00:15:34,437
[java] Checkout.Unregistered.ShippingMethod: 2642 (218msec) within 00:15:22,400
[java] AddToCart: 19378 (585msec) within 00:15:35,220
[java] Checkout: 2642 (211msec) within 00:15:23,579
[java] Checkout.Unregistered.Addresses: 2642 (303msec) within 00:15:22,644
[java] Checkout.Unregistered: 2642 (117msec) within 00:15:22,617
[java] Checkout.Unregistered.Payment: 2642 (682msec) within 00:15:22,545
[java] MyAccountPage.LoggedOff: 356 (46msec) within 00:14:57,560
[java] MyAccountPage.RegisterWithUs: 356 (43msec) within 00:14:56,522
[java] SelectVariationProductDetails: 35306 (226msec) within 00:15:34,566
[java] SimpleBrowsing: 860800 (67msec) within 00:15:33,816

[java] Total Requests: 1017874
        

First Charts

We Can Do Stupid Names Too!

A Cool Tool Needs an Abbreviation

YART - Yet Another Regression Test Tool

Charts and Data

Things We Wanted To Do Right

Charts from Others

It's a Trap, Luke!

Chart Example

Trust Me, They Are Lying!

Test time: 1 h - Total: 112,695
Mean: 501 ms - P50: 250 ms - P95: 1,950 ms - P99: 5,170 ms - P99.9: 7,830 ms - Max: 16,689 ms

The Mean Friend

The Mean is Not Your Friend

Test Time: 3 h 30 m - Total: 114,386
Mean: 504 ms - P95: 550 ms - P99: 2,660 ms - P99.9: 4,280 ms - Max: 6,169 ms
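The two result summaries above both show a mean around 500 ms while describing very different user experiences. A minimal sketch with made-up numbers shows how a couple of outliers barely move the mean while the high percentiles tell the real story:

```java
import java.util.Arrays;

public class MeanVsPercentile {
    // Percentile via the nearest-rank method on a sorted copy
    static long percentile(long[] samples, double p) {
        long[] sorted = samples.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p / 100.0 * sorted.length);
        return sorted[Math.max(rank - 1, 0)];
    }

    static double mean(long[] samples) {
        return Arrays.stream(samples).average().orElse(0);
    }

    public static void main(String[] args) {
        // 98 fast responses and 2 extreme outliers (hypothetical data)
        long[] times = new long[100];
        Arrays.fill(times, 0, 98, 250);   // 250 ms for most users
        times[98] = 8_000;                // two users suffer badly
        times[99] = 16_000;

        System.out.printf("Mean: %.0f ms%n", mean(times));   // 485 ms, looks fine
        System.out.println("P50: " + percentile(times, 50)); // 250 ms
        System.out.println("P99: " + percentile(times, 99)); // 8000 ms, the real story
    }
}
```

The mean of 485 ms hides the fact that one in fifty users waited many seconds, which is exactly why the reports lead with percentiles.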

Test Automation

One Stone, Two Birds, and a Costly Mistake

One Script, Two Use Cases

The Right Idea, But...

  • Why not use the same scripts for test automation and load testing?
  • Build a nice UI to create scripts
  • Execute on the fly, don't transform to Java code

Why it failed

Failed for Two and a Half Men Reasons

  • Browsers evolve too quickly
  • JavaScript in HtmlUnit differs from browser JS
  • No UI rendering, different outcome
  • Real browser load testing is expensive
  • XUL was deprecated by Mozilla
  • 7 years of work down the drain
  • 2nd attempt with Eclipse/SWT failed
  • Eclipse programming model is too complicated
  • Gave devs too much room
  • Lost in UI ideas and API issues
  • Total cost about 500,000 EUR

What came out of it

One Thing Survived

  • Load testing with real browsers
  • Overcomes the JavaScript mess
  • Scripting similar to test automation with WebDriver
  • Pay with hardware
  • Chrome eats at least 500 MB and 2 cores
  • Byproduct: Web Vitals metrics

Data, Lt. Cmdr.

Load Testing is About a Ton of Data

How Much Data

A Standard Load Test Result of a Large US Customer

Business Perspective

Runtime 3 hours
User Scenarios 17
Visits 5,266,130
Page Interactions 55,462,101
Total Requests 122,185,828
Orders 677,606
Errors 70,491
Datacenters 7
Load Generators 50 / 800 Cores / 1.6 TB RAM

Tool Perspective

Test Cases 17
Transactions 5,266,130
Actions 55,925,554
Requests 122,185,828
Events 124,519
Custom 5,232,721
Agent 53,409
Data Lines 189,751,960

How many data points?

How Many Data Points Are Captured?

For Transactions 47,395,170
For Actions 279,627,770
For Requests 2,810,274,044
For Custom Data 622,595
For Event Data 26,163,605
For Agent Data 1,228,407
Total 3,165,311,591
Uncompressed Data 48.72 GB
Compressed Data 4.10 GB
Lines per Second 17,569
Datapoints per Second 293,084

Open Data

Open Data for Custom Analytics, Modification, and Reporting


R,ProductDetailsPage.1,1666819841884,2759,false,1345,40942,200,
    https://acme.org/p/soap-126303030.html?cgid=foaming-hand-soap,text/html,0,0,2749,10,2749,2759,,GET,,,0,,
R,ProductDetailsPage.2,1666819844769,993,false,1858,429,200,
    https://acme.org/en_US/__Analytics-Start?url=https...,image/gif,0,0,992,0,992,992,,GET,,,0,,
R,ProductDetailsPage.3,1666819845762,940,false,1305,1259,200,
    https://acme.org/authiframe,text/html,0,0,940,0,940,940,,GET,,,0,,
R,ProductDetailsPage.4,1666819846703,1008,false,1350,2050,200,
    https://acme.org/en_US/Cart-MiniCartContent,text/html,0,0,1008,0,1008,1008,,GET,,,0,,
A,ProductDetailsPage,1666819841883,7968,false
T,TAddToCart,1666819778626,72846,false,,,,
  • CSV holds all measured data
  • XML for intermediate data
  • XSLT for transformation into HTML
  • CSS for styling
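Because the results are plain CSV, custom analytics is a few lines of code away. A minimal sketch of a request-record parser; the field positions (type, name, start time, runtime, failed flag) are inferred from the sample lines above, so treat the layout as an assumption:

```java
public class ResultLineParser {
    // One parsed request record; field order inferred from the CSV sample:
    // type, name, startTime, runtime, failed, ...
    record RequestRecord(String name, long startTime, long runtime, boolean failed) {}

    static RequestRecord parseRequest(String line) {
        String[] f = line.split(",");
        if (!"R".equals(f[0])) {
            throw new IllegalArgumentException("Not a request line: " + line);
        }
        return new RequestRecord(f[1], Long.parseLong(f[2]),
                                 Long.parseLong(f[3]), Boolean.parseBoolean(f[4]));
    }

    public static void main(String[] args) {
        var r = parseRequest("R,ProductDetailsPage.3,1666819845762,940,false,1305,1259,200");
        System.out.println(r.name() + " took " + r.runtime() + " ms");
    }
}
```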

Lifesaver Features

Things That Stand Out
and Are All Based on Learnings

Isolation and Cleanup

What Many Get Wrong in the First Place

  • Fully isolated clients with no sharing by default
  • "You are sharing a session!" - Nope, never ever ;)
  • You have to break it intentionally when needed
  • "You are using the same session!" - No, we don't!
  • We are not even sharing the same connection to make it as real as possible.
  • Learning: Make it right in the first place and prevent stupidity.
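The isolation idea can be sketched in plain Java: every virtual-user thread gets its own client instance, so cookies, sessions, and connections are never shared unless you break isolation on purpose. The JDK's `HttpClient` here is just a stand-in, not what the tool actually uses:

```java
import java.net.http.HttpClient;

public class IsolatedClients {
    // One client per virtual-user thread; each carries its own connection
    // pool and cookie state, so nothing is shared by default.
    private static final ThreadLocal<HttpClient> CLIENT =
        ThreadLocal.withInitial(HttpClient::newHttpClient);

    static HttpClient clientForCurrentUser() {
        return CLIENT.get();
    }

    public static void main(String[] args) throws InterruptedException {
        HttpClient[] seen = new HttpClient[2];
        Thread a = new Thread(() -> seen[0] = clientForCurrentUser());
        Thread b = new Thread(() -> seen[1] = clientForCurrentUser());
        a.start(); b.start(); a.join(); b.join();
        // Two virtual users, two distinct clients
        System.out.println("Distinct clients: " + (seen[0] != seen[1]));
    }
}
```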

Archiving and Sharing

Surprising Use Cases

  • All data is open
  • Results can be zipped up (CSV)
  • Reports can be zipped up (HTML)
  • Hosting everywhere
  • Recreate a report at any time and anywhere
  • Surprising use case: BaFin (the German financial regulator)
  • Working for a German neobroker
  • Needs to archive things for good
  • Simple with open formats and open source

Misc

A list of small but important things

  • Flight mode: Close the laptop in BOS and capture the result in FRA
  • Emergency brake: Stops an agent if it goes nuts
  • Partial: Download data and get a report at any time
  • Partial: Get data despite dead agents
  • Anywhere: Download from another machine
  • Fence it: Setup filters to avoid wandering around
  • Custom: Log custom events, timers, and data
  • Java: Use any Java feature you like; just don't mess with threading and keep overhead in mind

Result Browser

Don't Give Me Excuses, Give Me Details!

  • For debugging or on failure
  • Communicates all details
  • Keeps history to see how one got to the failure point
  • Able to restrict its volume when the same problem repeats
  • Can be triggered "manually" too

Pseudo Randomness

When Randomness is Predictable

  • Need: Tests are too static, randomize them
  • Challenge: Hard to reproduce failures, because you don't know the path
  • Solution: Pseudo-random generators, keep the seed and you can replay it later
  • Limitation: The environment should behave (nearly) the same to make it work.
  • Important: Ensure that the seed is random to avoid the same numbers again
  • Learning: We got burned badly by incorrect seeds
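The replay idea boils down to: draw a fresh seed per run, log it, and route every random decision through a generator initialized with that seed. A minimal sketch:

```java
import java.util.Random;

public class ReplayableRandom {
    public static void main(String[] args) {
        // Draw a fresh seed per test run and LOG it; a hard-coded or reused
        // seed silently gives every run (or every user) the same "random" path.
        long seed = new Random().nextLong();
        System.out.println("Seed for replay: " + seed);

        // All decisions in the scenario go through this seeded generator...
        Random scenario = new Random(seed);
        int pickedCategory = scenario.nextInt(10);

        // ...so replaying with the logged seed reproduces the exact same path.
        Random replay = new Random(seed);
        System.out.println("Replay matches: " + (pickedCategory == replay.nextInt(10)));
    }
}
```

`java.util.Random` guarantees an identical value sequence for an identical seed, which is what makes the replay deterministic; the environment still has to behave roughly the same, as noted above.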

Profiles and Arrival Rate

Concurrent Users

  • User rate
  • Arrival rate
  • Flexible over runtime
  • Learned that from a customer
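An arrival-rate profile and a concurrent-user profile are linked by Little's law: concurrent users ≈ arrival rate × average visit duration. A quick sketch of that back-of-the-envelope math; the numbers are made up:

```java
public class LittlesLaw {
    // Little's law: L = lambda * W
    static double concurrentUsers(double arrivalsPerHour, double visitDurationSeconds) {
        double arrivalsPerSecond = arrivalsPerHour / 3600.0;
        return arrivalsPerSecond * visitDurationSeconds;
    }

    public static void main(String[] args) {
        // Hypothetical target: 90,000 visits/h, average visit lasts 4 minutes
        System.out.printf("Concurrent users needed: %.0f%n",
                          concurrentUsers(90_000, 240)); // 6000
    }
}
```

This is why specifying the arrival rate is often the better customer conversation: it maps directly to the business target (visits per hour), while the user count falls out of the math.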

Test Data

Avoid Hard Coding Data

  • Take the data from the pages
  • Categories, SKUs, URLs, forms...
  • Customer can update data freely
  • Data can go offline during testing
  • Form changes get noticed
  • Exploratory load testing!
  • Headless changes the game

public class ViewCart extends PageAction<ViewCart>
{
    @Override
    protected void doExecute() throws Exception
    {
        // Get mini cart link.
        final HtmlElement cartLink =
            GeneralPages.instance.miniCart.
            getViewCartLink().asserted().single();

        // Click it.
        loadPageByClick(cartLink);
    }

    @Override
    protected void postValidate() throws Exception
    {
        // this was a page load, so validate
        // what is important
        Validator.validatePageSource();

        // basic checks for the cart
        CartPage.instance.validate();
    }
}

Exclusive Test Data

What Also Hit Us Often

  • "You don't test right!"
  • Don't use the same login concurrently
  • We don't!
  • Split data intelligently
  • Provide it to each agent exclusively
  • Check out and check in data

// If there's only 1 party we don't have to compute anything
if (numberOfParties == 1)
{
   return data;
}
else
{
   // Contains all data for the current party
   final List<T> partition = new ArrayList<T>();

   // Compute the block numbers
   final int blockSize = numberOfParties;
   final int completeBlocks = (dataSize / blockSize);

   // Check for remainder
   final int mod = dataSize % blockSize;
   final int blocks = completeBlocks + (mod > 0 ? 1 : 0);

   // To get the partition data, fetch the current party's piece from each block
   for (int blockIndex = 0; blockIndex < blocks; blockIndex++)
   {
       final int dataIndex = (blockIndex * blockSize) + currentPartyIndex;
       if (dataIndex < dataSize)
       {
           partition.add(data.get(dataIndex));
       }
       else
       {
           break;
       }
   }

   return partition;
}
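Wrapped in a method, the partitioning above hands each agent an exclusive, non-overlapping slice of the data; the class and method names here are mine, and the loop is a stride formulation that yields the same result as the block arithmetic above:

```java
import java.util.ArrayList;
import java.util.List;

public class DataPartitioner {
    // Round-robin partitioning: party i gets every numberOfParties-th item,
    // starting at offset i - equivalent to the block computation above.
    static <T> List<T> partition(List<T> data, int numberOfParties, int currentPartyIndex) {
        if (numberOfParties == 1) {
            return data;
        }
        final List<T> partition = new ArrayList<>();
        for (int i = currentPartyIndex; i < data.size(); i += numberOfParties) {
            partition.add(data.get(i));
        }
        return partition;
    }

    public static void main(String[] args) {
        List<String> logins = List.of("u0", "u1", "u2", "u3", "u4", "u5", "u6");
        // Three agents, each with an exclusive slice of the logins
        System.out.println(partition(logins, 3, 0)); // [u0, u3, u6]
        System.out.println(partition(logins, 3, 1)); // [u1, u4]
        System.out.println(partition(logins, 3, 2)); // [u2, u5]
    }
}
```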

Merging and Splitting

Testing is Dynamic

Merging, Splitting, Filtering

Infinite Possibilities and Regex - Apply Retroactively, When Needed

Massage the Data Later

  • By time
  • By datacenter or agent
  • By test case
  • By response code
  • By URL (parts)
  • By response time
  • By content type
  • By name
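Because the data is plain CSV, such retroactive filters are just regex over lines. A minimal sketch filtering request records by URL; the column positions follow the CSV sample shown earlier (status at index 7, URL at index 8), which is an assumption of this sketch:

```java
import java.util.List;
import java.util.regex.Pattern;

public class ResultFilter {
    // Keep only request lines ("R,...") whose URL matches the pattern.
    // Naive comma splitting; URLs containing commas would need a real CSV parser.
    static List<String> byUrl(List<String> lines, String urlRegex) {
        Pattern p = Pattern.compile(urlRegex);
        return lines.stream()
                    .filter(l -> l.startsWith("R,"))
                    .filter(l -> p.matcher(l.split(",")[8]).find())
                    .toList();
    }

    public static void main(String[] args) {
        List<String> lines = List.of(
            "R,Page.1,1666819841884,2759,false,1345,40942,200,https://acme.org/p/soap.html",
            "R,Page.2,1666819844769,993,false,1858,429,200,https://acme.org/cart");
        System.out.println(byUrl(lines, "/cart").size()); // 1
    }
}
```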

Extra Magic

  • Merge load tests
  • Manually split tests
  • Search data
  • Obfuscate
  • Feed it into other tools

Every feature was built because we needed it at some point. Sometimes we were ahead of the curve.

Best Finds

Success due to Functionality

CDNs made the world turn differently

Errors became a thing

  • When adding the CDN to our large customer, errors became a thing
  • 500, 502, 503, 52X out of nowhere for normal load
  • Before that, we often went completely error free
  • Call with CDN provider: 1% errors are quite normal
  • Nothing in life is free, CDN caching and protection costs
  • Features: Full error logging, connection retry, DNS handling, assertions

One Famous CDN Vendor Setup

Performance Was Turning South Quickly

  • "You don't test right."
  • "No, you don't test right"
  • "You use the same session."
  • "More IPs!"
  • "More locations."
  • "Give us requests and custom headers."
  • "You must do EDNS."
  • "You must not cache the DNS entry."
  • Gave them all the data and setup details
  • Result: It was a prefetch setting
  • It killed the origin because of logic that loaded more than needed
  • Learned that many load tests in the wild seem to be garbage, so one is prejudged and not trusted.
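The DNS demands from that list map to concrete JVM knobs. The JVM caches successful lookups by default (the exact TTL is implementation-dependent), so a load generator that should re-resolve like a fleet of fresh clients needs that cache turned off. A sketch using the standard `networkaddress.cache.ttl` security properties:

```java
import java.security.Security;

public class DnsCacheConfig {
    public static void main(String[] args) {
        // TTL in seconds for successful lookups; 0 disables caching entirely,
        // forcing a fresh resolution every time. Must be set before the
        // JVM performs its first lookup.
        Security.setProperty("networkaddress.cache.ttl", "0");
        // Failed lookups have a separate TTL.
        Security.setProperty("networkaddress.cache.negative.ttl", "0");

        System.out.println(Security.getProperty("networkaddress.cache.ttl"));
    }
}
```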

Second CDN Vendor's Kernel Issue

Occasionally Requests Got Lost

  • Request was sent but was never answered
  • "Can't be! You test wrong. Prove it."
  • "Give us details."
  • "We see it in the log, must be origin."
  • Origin never got the request
  • Vendor discovered a kernel issue when handling I/O against the disk cache
  • Feature: Insert custom UUIDs into the request and user agent, because mostly only user agent names get logged
  • Feature: Logging all details by default, able to log even non-failures fully, run tests for up to 7 days

Cost, License, Benefits

Some Business Things

Open Source

Try to Extend the Reach

  • No sales or marketing department
  • Reach was kind of limited
  • Decided to go open source for reach
  • Picked APL because of the stack
Copyright [yyyy] [name of copyright owner]

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
        
  • Has not changed anything
  • No external contributions
  • People dropping support requests and screenshots of exceptions
  • Still, gives us a good feeling and helps in project sales
  • Open source still requires marketing!
  • It is not freemium software!

Cost and Benefits

Can We Get the Check, Please?!

  • Started in 2007 to evolve it into a product
  • 18 years of development
  • Sometimes three devs, sometimes none
  • More than 2 Million Euros, for sure
  • Have not had to pay for external tooling
  • Able to run unlimited tests at any size concurrently
  • Quick turnaround in projects
  • Debugging is easy, features can be done quickly

Any Future?

So Much Competition,
So Much More to Do

What is next?

Features, Features, Features

  • JDK 21 (done) and virtual threads
  • HTTP/3 support
  • HTTP/2 as default
  • OpenTelemetry support (in progress)
  • Realtime errors (done) and metrics (in progress)
  • Scorecards (done)
  • Auto-Rating
  • JMeter replay (done)
  • Even faster report generation
  • Maybe live data querying
  • DNS override
  • Request categories

Built it into a SaaS Offer

We can do things we couldn't have done without it.

We never broke even but it was worth the money.

We are damn proud of the tool we have built.

Resources

Just pointers to more information