Writing another
load test tool

A stupid idea, wasn't it?

René Schwietzke, Xceptance GmbH

About René Schwietzke

  • Master of Computer Science (Dipl.-Inf.)
  • Programmer since 1989
  • Java since Java 1.0, 1996
  • QA and Testing since 1998
  • Performance Tester since 1999
  • Co-Founder of
  •    @ReneSchwietzke
  • @reneschwietzke@foojay.social

About

  • Founded 2004
  • Headquarters in Jena, Germany; Subsidiary in Cambridge, MA, USA
  • Specialized in Software Testing and Quality Assurance
  • Performance testing since 2004
  • Over 150 performance test projects every year
  • World-wide customer base including APAC and South America
  • Performance Test Tool, Java-based, APL 2.0

License

I care and share


This presentation is licensed under
Creative Commons Attribution-ShareAlike 4.0 International License.

A little bit of History

History Before the History

SilkPerformer

2000 to 2004

  • Work before Xceptance
  • Segue SilkPerformer
  • Microsoft Windows*
  • Scripting language Visual Basic-like
  • Needed a DB (SQL Server)
  • No real open data
  • Odd protocol for agents
  • No good default reports
  • 500,000 Euro, 20k(?) users
  • Plus support at 15%-20% annually
  • Needed hardware, 600 blade machines

Early Xceptance Gig

The First Load Test

2006

Back to load testing

  • Our first large customer
  • Required extensive product load testing
  • Later also project load testing
  • Hosting fully on Linux
  • Cloud was not yet a thing
  • They had no money as a startup
  • We had no money as a startup either
  • LoadRunner and SilkPerformer were no option
  • Also no project license concept

Why not JMeter?

Wasn't there something open source?

  • JMeter was already a thing
  • Version 2.X or something
  • Not yet an Apache project
  • Only request level recording
  • No scripting language
  • Cumbersome UI
  • Scaling was difficult
  • No debugging
  • No ready-to-use reports

Our own tool

Why we rolled a first version

A first version

The how and what of the first version

  • Needed a debuggable scripting language: Java
  • Didn't want to fiddle with dynamic forms: HtmlUnit
  • Wanted to query the DOM and not regexp it: HtmlUnit
  • HttpClient in the JDK was horrible: Apache HttpClient
  • Needed some charts: JFreeChart
  • Something familiar for better reuse and IDE support: JUnit
  • Must fit version control systems (SVN was a thing)
  • We hated the naming in most tools (transaction)
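The design above boils down to "a load test case is just a Java test method, executed many times by many threads, with runtimes recorded". A minimal, purely illustrative sketch of that idea (not the actual tool's API; names here are made up):

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch: a "test case" is a plain Java method,
// run repeatedly by N worker threads, with runtimes aggregated.
public class MiniLoadTest
{
    static final AtomicInteger count = new AtomicInteger();
    static final AtomicLong totalRuntime = new AtomicLong();

    // the "scripted" test case - in the real tool this would drive
    // HtmlUnit and run under a JUnit-style lifecycle
    static void testScenario() throws InterruptedException
    {
        Thread.sleep(5); // stand-in for a real page load
    }

    public static void main(String[] args) throws Exception
    {
        Thread[] users = new Thread[4];
        for (int i = 0; i < users.length; i++)
        {
            users[i] = new Thread(() -> {
                for (int it = 0; it < 10; it++)
                {
                    long start = System.currentTimeMillis();
                    try { testScenario(); } catch (InterruptedException ignored) {}
                    totalRuntime.addAndGet(System.currentTimeMillis() - start);
                    count.incrementAndGet();
                }
            });
            users[i].start();
        }
        for (Thread t : users) t.join();

        System.out.println(count.get() + " runs, mean " +
            (totalRuntime.get() / count.get()) + " ms");
    }
}
```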

Drawbacks

Not everything was working yet

  • No HTML reports
  • No scaling
  • Used multi-machine parallel runs for scale

[java] ===================================== Total =======================================
[java] Timer List Size: 16
[java] RegisterUser: 356 (1924msec) within 00:14:58,096
[java] SimpleSearch: 45920 (174msec) within 00:15:45,088
[java] Storefront: 11283 (36msec) within 00:15:43,762
[java] Checkout.OrderSummary: 2642 (891msec) within 00:15:22,356
[java] ViewCart: 2642 (587msec) within 00:15:23,727
[java] ProductDetails: 25625 (59msec) within 00:15:34,437
[java] Checkout.Unregistered.ShippingMethod: 2642 (218msec) within 00:15:22,400
[java] AddToCart: 19378 (585msec) within 00:15:35,220
[java] Checkout: 2642 (211msec) within 00:15:23,579
[java] Checkout.Unregistered.Addresses: 2642 (303msec) within 00:15:22,644
[java] Checkout.Unregistered: 2642 (117msec) within 00:15:22,617
[java] Checkout.Unregistered.Payment: 2642 (682msec) within 00:15:22,545
[java] MyAccountPage.LoggedOff: 356 (46msec) within 00:14:57,560
[java] MyAccountPage.RegisterWithUs: 356 (43msec) within 00:14:56,522
[java] SelectVariationProductDetails: 35306 (226msec) within 00:15:34,566
[java] SimpleBrowsing: 860800 (67msec) within 00:15:33,816

[java] Total Requests: 1017874
        

First Charts

We can do stupid names too!

No abbreviation, no cool tool

YART - Yet Another Regression Test Tool

Charts and Data

A Thing We Wanted To Do Right

Charts from Others

It's a trap, Luke!

Chart Example

Trust me, they are lying!

Test time: 1 h - Total: 112,695
Mean: 501 ms - P50: 250 ms - P95: 1,950 ms - P99: 5,170 ms - P99.9: 7,830 ms - Max: 16,689 ms

The Mean Friend

The mean is not your friend

Test Time: 3 h 30 m - Total: 114,386
Mean: 504 ms - P95: 550 ms - P99: 2,660 ms - P99.9: 4,280 ms - Max: 6,169 ms
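The numbers above show why: two runs can share almost the same mean while their tails differ wildly. A tiny illustration with made-up response times:

```java
import java.util.Arrays;

// Two made-up sets of response times with the same mean (750 ms)
// but very different tails - the mean hides the outliers.
public class MeanVsPercentile
{
    // nearest-rank percentile over a sorted array
    static long percentile(long[] sorted, double p)
    {
        int idx = (int) Math.ceil(p / 100.0 * sorted.length) - 1;
        return sorted[Math.max(0, idx)];
    }

    public static void main(String[] args)
    {
        // nine fast requests plus one 5 s outlier -> mean 750 ms
        long[] spiky = { 100, 200, 250, 250, 300, 300, 300, 300, 500, 5000 };
        // ten evenly mediocre requests -> mean 750 ms as well
        long[] flat  = { 700, 710, 720, 730, 740, 760, 770, 780, 790, 800 };

        for (long[] data : new long[][] { spiky, flat })
        {
            Arrays.sort(data);
            long mean = Arrays.stream(data).sum() / data.length;
            System.out.println("mean=" + mean +
                " p95=" + percentile(data, 95) +
                " max=" + data[data.length - 1]);
        }
    }
}
```

Same mean, but one run has a P95 of 5,000 ms and the other of 800 ms.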

Test Automation

One Stone, Two Birds, A Costly Mistake

One Script, Two Use Cases

The right idea, but the wrong web

  • Why not use the same scripts for test automation and load testing?
  • Build a nice UI to create scripts
  • Execute on the fly, don't transform to Java code

Why it failed

Failed for 2 and a Half Reasons

  • Browsers evolve too quickly
  • JavaScript in HtmlUnit differs from browser JS
  • No UI rendering, different outcome
  • Real browser load testing is expensive
  • XUL was deprecated by Mozilla
  • 7 years of work down the drain
  • Second attempt using Eclipse/SWT failed
  • Gave devs too much room
  • Lost in UI ideas and API issues
  • Eclipse programming model is too complicated
  • Cost about 500,000 EUR

What came out of it

One good thing remained

  • Load testing with real browsers
  • Overcomes the JavaScript state mess
  • Scripting similar to test automation with WebDriver
  • You gotta pay with hardware cost aka scale
  • A single Chrome instance easily eats 500 MB and 2 cores
  • Byproduct: You get Web Vitals metrics!

Data, Lt. Cmdr.

Load Testing is About a Ton of Data

How Much Data

A standard load test result of a large US customer

Business Perspective

Runtime 3 hours
User Scenarios 17
Visits 5,266,130
Page Interactions 55,462,101
Total Requests 122,185,828
Orders 677,606
Errors 70,491
Datacenters 7
Load Generators 50 / 800 Cores / 1.6 TB RAM

Tool Perspective

 
Test Cases 17
Transactions 5,266,130
Actions 55,925,554
Requests 122,185,828
Events 124,519
Custom 5,232,721
Agent 53,409
Data Lines 189,751,960

How many data points?

How many points of data are captured?

For Transactions 47,395,170
For Actions 279,627,770
For Requests 2,810,274,044
For Custom Data 622,595
For Event Data 26,163,605
For Agent Data 1,228,407
Total 3,165,311,591
Uncompressed Data 48.72 GB
Compressed Data 4.10 GB
Lines per Second 17,569
Datapoints per Second 293,084
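The per-second figures are plain arithmetic over the 3-hour runtime:

```java
// Deriving the per-second rates from the totals above:
// 3 hours of test time, data lines and data points as reported.
public class Throughput
{
    public static void main(String[] args)
    {
        long seconds    = 3L * 60 * 60;        // 10,800 s runtime
        long lines      = 189_751_960L;        // data lines written
        long datapoints = 3_165_311_591L;      // individual values captured

        System.out.println("lines/s:      " + lines / seconds);      // 17,569
        System.out.println("datapoints/s: " + datapoints / seconds); // 293,084
    }
}
```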

Open Data

Open data for custom analytics, modification, and reporting

  • CSV holds all measured data
  • XML for intermediate data
  • XSLT for transformation into HTML
  • CSS for styling

R,ProductDetailsPage.1,1666819841884,2759,false,1345,40942,200,
    https://acme.org/p/soap-126303030.html?cgid=foaming-hand-soap,text/html,0,0,2749,10,2749,2759,,GET,,,0,,
R,ProductDetailsPage.2,1666819844769,993,false,1858,429,200,
    https://acme.org/en_US/__Analytics-Start?url=https...,image/gif,0,0,992,0,992,992,,GET,,,0,,
R,ProductDetailsPage.3,1666819845762,940,false,1305,1259,200,
    https://acme.org/authiframe,text/html,0,0,940,0,940,940,,GET,,,0,,
R,ProductDetailsPage.4,1666819846703,1008,false,1350,2050,200,
    https://acme.org/en_US/Cart-MiniCartContent,text/html,0,0,1008,0,1008,1008,,GET,,,0,,
A,ProductDetailsPage,1666819841883,7968,false
T,TAddToCart,1666819778626,72846,false,,,,
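Because it is just CSV, a few lines of code are enough to consume it. The sketch below reads only the leading fields that are evident from the sample above (record type, timer name, start timestamp, runtime in ms, failed flag); the remaining request-specific columns are tool internals and left alone here:

```java
// Minimal reader for the leading fields of an open-data CSV line:
// type, name, start timestamp (epoch ms), runtime (ms), failed flag.
// Request lines carry many more columns, which this sketch ignores.
public class DataLine
{
    final String type;
    final String name;
    final long start;
    final long runtime;
    final boolean failed;

    DataLine(String csv)
    {
        String[] f = csv.split(",", -1);
        type    = f[0];
        name    = f[1];
        start   = Long.parseLong(f[2]);
        runtime = Long.parseLong(f[3]);
        failed  = Boolean.parseBoolean(f[4]);
    }

    public static void main(String[] args)
    {
        DataLine a = new DataLine("A,ProductDetailsPage,1666819841883,7968,false");
        System.out.println(a.name + " took " + a.runtime + " ms, failed=" + a.failed);
    }
}
```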

Lifesaver Features

Things That Make us Stand Out
But Which Are All Based on Learnings

Isolation and Cleanup

What many get wrong in the first place

  • Fully isolated clients with no sharing by default
  • "You are sharing a session!" - Nope, never ever ;)
  • You have to break it intentionally when needed
  • "You are using the same session!" - No, we don't!
  • We are not even sharing the same connection to make it as real as possible.
  • Learning: Make it right in the first place and prevent stupidity.
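The simplest way to get that guarantee is to give every simulated user its own freshly created state and never hand references across threads. A hypothetical sketch of the idea (the `UserState` class is made up for illustration):

```java
import java.util.HashMap;
import java.util.Map;

// Each virtual user gets its own, freshly created state - nothing
// (session, cookies, connections) is shared across threads by default.
public class IsolatedUsers
{
    // hypothetical per-user state; a real tool would hold the cookie
    // store, connection pool, and random generator here
    static class UserState
    {
        final Map<String, String> cookies = new HashMap<>();
    }

    // one state per thread, created lazily, never handed out elsewhere
    static final ThreadLocal<UserState> STATE =
        ThreadLocal.withInitial(UserState::new);

    public static void main(String[] args) throws Exception
    {
        Runnable scenario = () -> {
            UserState s = STATE.get();
            s.cookies.put("session", Thread.currentThread().getName());
        };

        Thread u1 = new Thread(scenario, "user-1");
        Thread u2 = new Thread(scenario, "user-2");
        u1.start(); u2.start();
        u1.join(); u2.join();

        // the main thread's state was never touched by the two users
        System.out.println("main sees " + STATE.get().cookies.size() + " cookies");
    }
}
```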

Archiving and Sharing

Surprising use cases

  • All data is open
  • Results can be zipped up
  • Reports can be zipped up
  • Hosting everywhere possible
  • Recreate a report at any time anywhere
  • Surprising use case: BaFin
  • Working for a German neobroker
  • Need to archive things for good
  • Simple with open, tool independent files

Misc

A list of small but important things

  • Flight mode: Close the laptop in BOS and capture the result in FRA
  • Emergency brake: Stop an agent if it goes nuts
  • Partial: Download data and get a report at any time
  • Partial: Get data despite dead agents
  • Anywhere: You can download from another machine
  • Fence it: Setup filters to avoid wandering around
  • Custom: Log custom events, timers, and data
  • Java: Use any Java feature you like; just don't mess with threading and keep overhead in mind

Result Browser

Don't give me excuses, give me details!

  • For debugging or on failure
  • Communicates all details
  • Keeps history to see how one got to the failure point
  • Able to restrict itself in volume when the same problem repeats too often
  • Can be triggered "manually" too

Pseudo Randomness

When randomness is predictable

  • Need: Tests are too static, randomize
  • Challenge: Hard to reproduce failures, because you don't know the path
  • Solution: Pseudo-random generators, keep the seed and you can replay it later
  • Limitation: The environment should behave (nearly) the same to make it work.
  • Important: Ensure that the seed is random to avoid the same numbers again.
  • Learning: We got burned badly by incorrect seeds.
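In practice that means: draw every random decision from a generator whose seed is logged with the run, so the same path can be replayed later. A minimal sketch:

```java
import java.util.Arrays;
import java.util.Random;

// Reproducible randomness: log the seed with the test run, feed the
// same seed back in later to replay the exact same decision path.
public class ReplayableRandom
{
    static int[] randomPath(long seed, int steps)
    {
        Random rng = new Random(seed);
        int[] path = new int[steps];
        for (int i = 0; i < steps; i++)
        {
            path[i] = rng.nextInt(10); // e.g. which category to browse next
        }
        return path;
    }

    public static void main(String[] args)
    {
        // the seed itself must be random per run (the lesson above),
        // but once logged it makes the run repeatable
        long seed = new Random().nextLong();
        System.out.println("seed=" + seed);

        int[] first  = randomPath(seed, 5);
        int[] replay = randomPath(seed, 5);
        System.out.println(Arrays.equals(first, replay)); // true
    }
}
```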

Profiles and Arrival Rate

Concurrent users?

  • User rate
  • Arrival rate
  • Flexible over runtime
  • Learned that from a customer
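The difference matters: with a fixed user count, slow responses throttle the load, while with an arrival rate new visits keep starting no matter how the system responds. A sketch of exponentially distributed inter-arrival times (the usual Poisson-arrivals model; this model choice is an assumption for illustration, not necessarily what any particular tool implements):

```java
import java.util.Random;

// Poisson arrivals: for a target rate of "arrivalsPerHour" visits,
// draw exponentially distributed gaps between visit start times.
public class ArrivalRate
{
    static double nextGapMillis(Random rng, double arrivalsPerHour)
    {
        double perMilli = arrivalsPerHour / 3_600_000.0;
        // inverse transform sampling of the exponential distribution
        return -Math.log(1.0 - rng.nextDouble()) / perMilli;
    }

    public static void main(String[] args)
    {
        Random rng = new Random(42);
        double arrivalsPerHour = 3600; // one visit per second on average

        double sum = 0;
        int n = 10_000;
        for (int i = 0; i < n; i++)
        {
            sum += nextGapMillis(rng, arrivalsPerHour);
        }
        // the mean gap approaches 1000 ms for this rate
        System.out.println("mean gap: " + (sum / n) + " ms");
    }
}
```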

Test Data

Avoid Hard Coding Data

  • Take the data from the pages
  • Categories, SKUs, URLs, forms...
  • Customer can update data freely
  • Data can go offline during testing
  • Form changes get noticed
  • Exploratory load testing!
  • Doesn't apply to all testing

public class ViewCart extends PageAction<ViewCart>
{
    @Override
    protected void doExecute() throws Exception
    {
        // Get mini cart link.
        final HtmlElement cartLink =
            GeneralPages.instance.miniCart.
            getViewCartLink().asserted().single();

        // Click it.
        loadPageByClick(cartLink);
    }

    @Override
    protected void postValidate() throws Exception
    {
        // this was a page load, so validate
        // what is important
        Validator.validatePageSource();

        // basic checks for the cart
        CartPage.instance.validate();
    }
}

Merging and Splitting

Testing is Dynamic

Merging, Splitting, Filtering

Infinite possibilities and regexp of course

  • Massage the data later
  • By time
  • By datacenter or agent
  • By test case
  • By response code
  • By URL (parts)
  • By response time
  • By content type
  • By name
  • Out of tool magic
  • Merge load tests
  • Manually split tests
  • Search data
  • Obfuscate

Every feature was built because we needed it at some point; sometimes we were even ahead of the curve.
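Since everything is plain CSV, most of these operations reduce to streaming lines through a filter. A minimal sketch for filtering "by name" with a regexp (the timer names are made up):

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Filtering open CSV data lines by timer name with a regexp -
// the same idea covers filtering by URL, response code, or time.
public class FilterByName
{
    // keep only lines whose timer name (2nd CSV column) matches the regexp
    static List<String> filterByName(List<String> lines, String nameRegex)
    {
        Pattern p = Pattern.compile(nameRegex);
        return lines.stream()
            .filter(l -> p.matcher(l.split(",", -1)[1]).matches())
            .collect(Collectors.toList());
    }

    public static void main(String[] args)
    {
        List<String> lines = List.of(
            "A,ProductDetailsPage,1666819841883,7968,false",
            "A,ViewCart,1666819850000,587,false",
            "A,Checkout.Payment,1666819860000,682,false");

        filterByName(lines, "^Checkout.*").forEach(System.out::println);
    }
}
```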

Java

What? Where is the Java relationship?

Java? Java!

Where is Java?

  • The entire stack is Java
  • We profit from the ecosystem
  • Learned a lot about performance
  • Learned a lot about scalability
  • Learned a lot about memory and GC
  • We went from JDK 1.3 to 17
  • We support ARM and x86
  • Learned a lot about cloud machines
  • Looking forward to Leyden
  • Looking forward to Valhalla

Ecosystem

What we use

Apache-Commons DNSJava Freemarker Hessian WebDriver Selenium JFreeChart Java-HLL HtmlUnit HttpClient OkHttp Jetty Log4J SLF4J Xalan Xerces WebP-ImageIO XStream DSIUtils JSON JUnit Progressbar and more

Wait...

There is more! JVM rocks!

Java Lessons Learned

Things That Were Messy

1.3 to 17

<= JDK 8

  • Tuned the hell out of CMS
  • Ran many smaller VMs per hardware box
  • Compensate for GC pauses and measurement impact

JDK 11

  • Let the VM grab 70% of memory
  • One VM only, G1 (enough reserved space)
  • Compensates for GC pauses and measurement impact
  • Module concept made migration harder

JDK 17

  • Playing with ZGC and Shenandoah
  • Socket impl changes impacted us
  • External libs and deprecations
  • Hunted a C2 compiler issue, never got it resolved, JDK 21 took care of it

Mac ARMs

  • Suddenly realized how much native code some dependencies bring

Best Finds

Success due to Functionality

CDNs made the world turn differently

Errors became a thing

  • When our large customer added a CDN, errors became a thing
  • 500, 502, 503 out of nowhere for normal load
  • Before that, we often went completely error free
  • Call with CDN provider: 1% errors are quite normal
  • Nothing in life is free, CDN caching and protection costs
  • Feature: Full error detail logging

Akamai Customer Setup

Performance was turning south quickly

  • "You don't test right."
  • "No, you don't test right"
  • "You use the same session."
  • "More IPs!"
  • "More locations."
  • "Give us requests and custom headers."
  • "You must do EDNS."
  • "You must not cache the DNS entry."
  • We gave them all of that
  • It was a prefetch setting
  • It killed the origin because of the logic to load more than needed
  • Learned that a lot of load testing out there seems to be garbage, so one is not trusted by default

Cloudflare Kernel Issue

Occasionally requests got lost

  • Request was sent but was never answered
  • "Can't be! You test wrong. Prove it."
  • "Give us details."
  • "We see it in the log, must be origin."
  • Origin never got the request
  • Cloudflare discovered a kernel issue when handling I/O against the disk cache
  • Feature: Insert customer UUIDs into requests and user agents, only agent names are always logged
  • Feature: Logging all details by default, able to log even non-failures fully, run tests for up to 7 days
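Tagging traffic so the other side can verify it is easy with any HTTP client. A sketch using the JDK's own `java.net.http` client (the header layout here is an assumption for illustration, not the tool's actual format):

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.util.UUID;

// Tag every request with a per-run UUID in the User-Agent so the
// CDN or customer can pick the load test traffic out of their logs.
public class TaggedRequest
{
    static HttpRequest tagged(String url, String runId)
    {
        return HttpRequest.newBuilder()
            .uri(URI.create(url))
            .header("User-Agent", "Mozilla/5.0 (LoadTest; run=" + runId + ")")
            .GET()
            .build();
    }

    public static void main(String[] args)
    {
        String runId = UUID.randomUUID().toString();
        HttpRequest request = tagged("https://acme.org/", runId);
        System.out.println(request.headers().firstValue("User-Agent").orElse(""));
    }
}
```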

Cost, License, Benefits

Some business things

Open Source

Try to increase the reach

  • No sales or marketing department
  • Reach was kind of limited
  • Decided to go open source for reach
  • Picked APL because of the stack
Copyright [yyyy] [name of copyright owner]

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
        
  • Going open source did not change much
  • No external contributions
  • Rather people dropping support requests and screenshots of exceptions
  • But it gave us a good feel and helped in some project sales
  • Still requires marketing!

Cost and Benefits

The final bill

  • Started in 2007 to evolve it into a product
  • 17 years of development
  • Sometimes three devs, sometimes none
  • More than 1.25 Million Euros, for sure
  • No costs for external tools though
  • Able to run unlimited tests at any size concurrently
  • Quick turnaround in projects
  • Debugging help and features done in no time

Any Future?

So Much Competition, so Much More to Do

What is next?

Features, features, features

  • JDK 21 and virtual threads
  • HTTP/3 support
  • HTTP/2 as default
  • OpenTelemetry support
  • Realtime errors and metrics
  • Auto-Rating/Scorecards
  • JMeter replay
  • Even faster report generation
  • Maybe live data querying
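Virtual threads are attractive here because "one thread per virtual user" becomes cheap. A sketch of the idea, assuming a JDK 21 runtime (this is illustrative, not the tool's actual execution model):

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

// Virtual threads (JDK 21): thousands of concurrent virtual users
// no longer need thousands of OS threads.
public class VirtualUsers
{
    static final AtomicInteger visits = new AtomicInteger();

    public static void main(String[] args)
    {
        try (ExecutorService users = Executors.newVirtualThreadPerTaskExecutor())
        {
            for (int i = 0; i < 10_000; i++)
            {
                users.submit(() -> {
                    // stand-in for think time plus a page load
                    try { Thread.sleep(10); } catch (InterruptedException ignored) {}
                    visits.incrementAndGet();
                });
            }
        } // close() waits for all submitted tasks to finish

        System.out.println(visits.get() + " visits simulated");
    }
}
```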

Built it into a SaaS Offer

We could do things we couldn't have done otherwise.

We never broke even.

It was worth the money.

We are damn proud of the tool we have built.

Now, there are too many tools on the market, hence doing the same in 2024 may not make sense anymore. Or does it?

The JVM is still the right choice when you want to do more than just firing requests.

Resources

Just pointers to more information