Oliver Oberkandler

Gatling – A Powerful Framework for Load Testing

  • How does Gatling work technically – and why is it so efficient at simulating thousands of users?
  • What are the differences between open and closed load models, and when is each one appropriate?
  • How can realistic load tests be created with the Gatling DSL, including feeders, checks and assertions?
  • How can Gatling be scaled in the cloud – for example, using AWS EC2 or distributed test setups?
  • Which metrics, reports and best practices help to interpret performance results correctly?

In Brief

This article explains what Gatling is, how it works under the hood, and how to design realistic load tests with it (including code examples). It also shows how to scale Gatling scripts in the cloud (e.g., on AWS EC2) and which metrics and strategies matter when analyzing test results. The text combines insights from a bachelor’s thesis and hands-on project experience and provides checklists, pitfalls, and concrete configuration examples.

Introduction

Load testing is not a luxury — it is essential to uncover performance risks early and to define capacity limits. Gatling is an open-source, code-centric load testing framework specifically designed for modern web and API architectures. It combines an asynchronous, event-driven engine with a clear DSL for writing test scripts. 

In this article, I will share practical technical details, design decisions, and concrete examples so that you can build reliable, reproducible, and meaningful load tests yourself. 

Gatling supports simulations in multiple languages. In addition to Scala, there are DSLs for Java, JavaScript/TypeScript, and Kotlin. At its core, however, the framework is powered by the Scala-based Gatling Core, which leverages Akka for scalable concurrency and Netty for efficient network I/O. This enables Gatling to simulate thousands of virtual users without quickly exhausting load generator resources. 

What is Gatling? Architecture & Core Concepts

Gatling is a load-testing framework built on a high-performance, asynchronous architecture. Key characteristics include: 

  • Scalable, non-blocking engine: Gatling uses Akka (actors) and Netty to implement an event-driven model. Each virtual user is essentially a lightweight message, not a dedicated thread. This allows thousands of concurrent users to be simulated with relatively little CPU and memory, making Gatling highly efficient — ideal for testing large systems. 
  • Code-centric DSL: Tests are written as code. Gatling provides expressive DSLs (Scala, Java, JavaScript, Kotlin) to precisely describe user flows, requests, and checks. Advantage: test scripts are version-controlled and maintainable. Caveat: you need programming knowledge to adapt scripts.
  • Simulations & Scenarios: A Gatling test consists of a Simulation (in Scala, a class extending Simulation), which combines one or more scenarios with injection profiles. Key building blocks: Scenario (user journey, e.g., login → action → logout), Injection Profile (how many users start and when), Protocol Configuration (e.g., base URL, headers), and Checks/Assertions (validations and pass/fail criteria).
  • Protocol support: By default, Gatling is designed for HTTP/HTTPS but also supports other protocols (e.g., JMS) and can be extended via plugins. It is therefore best suited for web and API testing but not strictly limited to those.
  • Reporting: After each run, Gatling generates detailed HTML reports including latency distributions (p50, p95, p99, etc.), error rates, throughput, and more. (Note: The open-source edition produces static HTML reports, while Gatling Enterprise provides interactive dashboards.) 

Important: Gatling is code-centric. This is a major advantage for maintainability, but it requires programming knowledge (and typically a build tool such as Maven or Gradle). Although there is also a recorder/GUI, for serious tests you gain full flexibility by writing scripts yourself. 

Quick Start: Minimal Example (Scala DSL)

A short example demonstrates the central concepts (imports, HTTP protocol, feeder, scenario, injection, assertions). In Scala, a basic scenario looks like this: 

import io.gatling.core.Predef._
import io.gatling.http.Predef._
import scala.concurrent.duration._

class BasicSimulation extends Simulation {
  // Protocol configuration: base URL, headers, etc. 
  val httpProtocol = http 
    .baseUrl("https://example.com") 
    .acceptHeader("application/json") 
 
  // Feeder: Read credentials from CSV file (shared pool, circular) 
  val feeder = csv("users.csv").circular 
 
  // Scenario: A typical user journey 
  val scn = scenario("UserJourney") 
    .feed(feeder)                      // Provide data from feeder 
    .exec( 
      http("HomePage")                 // Request to homepage 
        .get("/") 
        .check(status.is(200))         // Check: status 200 
    ) 
    .pause(1.second, 5.seconds)        // Think/interaction time 
    .exec( 
      http("Login")                    // Login request with form data 
        .post("/login") 
        .formParam("username", "${username}") 
        .formParam("password", "${password}") 
        .check(status.is(200)) 
    ) 
 
  // Simulation setup: injection profile and assertions 
  setUp( 
    // Open model: ramp up from 10 → 100 new users/second over 5 minutes 
    scn.inject(rampUsersPerSec(10).to(100).during(5.minutes)) 
  ).protocols(httpProtocol) 
   .assertions( 
     global.responseTime.max.lt(2000),    // Max latency < 2000ms 
     global.successfulRequests.percent.gt(95) // ≥95% successful requests 
   ) 
}

Explanation: 

  • Feeder: The CSV file users.csv provides login data. .circular means that once the end of the file is reached, it loops back to the beginning. Other strategies include .random or .queue. 

  • Checks: In the example, we only verify the HTTP status. In real tests, you would also validate response contents (e.g., JSON fields) so that Gatling immediately detects error states; extracted values can also be reused in later requests (see the sketch after this list). 

  • pause(1.second, 5.seconds): Simulates user “thinking time.” Alternatively, pace can be used inside a loop to enforce a fixed minimum time per iteration and thus a more constant request rate. 

  • Injection: rampUsersPerSec(10).to(100) is an open-model injection step (see the section on injection models). It starts at 10 new users per second and ramps up evenly to 100 per second over 5 minutes. 

  • Assertions: With global.* assertions, you can enforce thresholds. If the test does not meet them, it is marked as “failed” (important for automated pipelines). 
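To make the checks point more concrete, here is a small sketch of a scenario fragment that validates a JSON response and reuses an extracted value in a follow-up request. It assumes the same imports as the quick-start example; the endpoint /api/orders and the fields id and item are made up for illustration:

// Sketch: validate a JSON body and carry an extracted value through the session
val orderFlow = exec(
  http("CreateOrder")
    .post("/api/orders")
    .body(StringBody("""{"item":"book","quantity":1}""")).asJson
    .check(status.is(201))                      // expect "created"
    .check(jsonPath("$.id").saveAs("orderId"))  // store the new order id in the session
)
.exec(
  http("GetOrder")
    .get("/api/orders/${orderId}")              // reuse the extracted id
    .check(status.is(200))
    .check(jsonPath("$.item").is("book"))       // validate content, not just the status code
)

Such a fragment can be chained into a scenario exactly like the exec blocks in the quick-start example above.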

Tip: Gatling simulations can also be written in Java or JavaScript — the concepts remain the same. For example, the Java/JS DSLs provide methods like scenario(), injectOpen(), and similar. 

Injection Models: Open vs. Closed — with Examples

The injection profile determines how virtual users are introduced into the simulation. Gatling distinguishes between two load models: 

  • Open Model (controlling the arrival rate): Here you define the rate of incoming users (e.g., 50 new users per second). It simulates real public traffic where users arrive independently and randomly (e.g., visitors to a news or e-commerce site). Gatling offers profiles like constantUsersPerSec, rampUsersPerSec, stressPeakUsers, etc. Examples:
// 50 new users per second over 10 minutes 
scn.inject(constantUsersPerSec(50).during(10.minutes)) 
 
// Increase from 10 to 100 users/s in 5 minutes 
scn.inject(rampUsersPerSec(10).to(100).during(5.minutes))

Advantage: Mirrors real-world traffic by defining an arrival rate. 
Disadvantage: You do not directly control concurrency; the number of concurrent users emerges dynamically as roughly arrival rate × average session duration (Little's law). 

  • Closed Model (controlling concurrent users): Here you directly control the number of concurrent users. New users are only admitted once others finish. Examples: 
// Always exactly 50 simultaneous users 
scn.inject(constantConcurrentUsers(50).during(15.minutes)) 
 
// Linear ramp-up from 10 to 200 simultaneous users in 10 minutes 
scn.inject(rampConcurrentUsers(10).to(200).during(10.minutes))

This model is often used when system load is constrained by a fixed user base (e.g., a ticket booking system with limited session slots). 

Advantage: You control concurrency precisely. 
Disadvantage: The load may be less realistic — if the system slows down, no new users will be admitted (since all are waiting). 

Important: Choose the model that best matches your production system. The open model is suitable for public-facing systems with variable traffic; the closed model is suitable for controlled environments with a fixed user base. In Gatling, a single injection profile must use either open or closed steps, never both (in the Java and JavaScript DSLs this is made explicit by injectOpen(...) and injectClosed(...)). 

Note: In the Scala DSL, inject(...) accepts both open and closed injection steps, and the steps you pass in determine the model; the Scala examples above therefore simply use inject(...). The Java and JavaScript DSLs use the explicit methods injectOpen(...) and injectClosed(...) instead. 
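As a small illustration of this constraint: a single setUp(...) may contain several scenarios, each with its own profile, and those profiles may use different models as long as no single profile mixes open and closed steps. The scenario names publicTraffic and backOffice below are made up; httpProtocol is the protocol from the quick-start example:

// Sketch: one open and one closed profile side by side in the same setUp
setUp(
  publicTraffic.inject(rampUsersPerSec(5).to(50).during(10.minutes)),  // open model
  backOffice.inject(constantConcurrentUsers(20).during(10.minutes))    // closed model
).protocols(httpProtocol)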

Scenario Design: Best Practices

A meaningful test separates test problems from system problems. Recommendations to improve simulation quality: 

  • Modularity & reusability: Build complex user journeys from small, reusable components (e.g., functions or traits in Scala). This allows you to easily combine or reuse scenario parts across tests (see the combined sketch after this list). 

  • Test data management: Use feeders or custom data sources (CSV, JSON, JDBC, Redis, etc.). Avoid data conflicts: if your test creates users or orders, they must be unique or isolated. Choose the feeder strategy deliberately: .queue() stops the run once the data is exhausted, while .circular() loops back to the beginning of the file. Also plan data cleanup (e.g., deleting test data after the run). 

  • Checks instead of “blind” tests: Critically validate system responses. Use .check(status.is(200)), JSONPath queries, header checks, or regex to ensure requests are processed successfully. A test without checks produces numbers but says nothing about functionality. 

  • Pacing vs. Pause: pause(...) simulates user think or wait time. Use pace or controlled timing to generate realistic load patterns — so a user doesn’t unrealistically click again immediately after a response. 

  • Parameterization: Parameterize test behavior (target URLs, rates, durations) via environment variables or system properties (e.g., -Dusers=100). This way you don’t need to recompile simulations for every change (the sketch after this list shows one way to do this). 

  • Session management: Keep sessions clean. Extract CSRF tokens, session cookies, or JWTs automatically via checks and reuse them in subsequent requests. Separate sessions by scenario if necessary to avoid users sharing the same session. 

  • Stateful vs. Stateless: Many API endpoints are stateless (same request → same response). If your scenario is stateful (e.g., shopping cart, order creation), ensure a clear data lifecycle: dynamically create test data (e.g., new users), perform the action, and clean up afterwards. A consistent environment prevents false errors caused by bad test data instead of performance issues. 
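Picking up the modularity and parameterization points above, here is a minimal sketch of how scenario parts can be factored into a reusable object and how users, duration, and base URL can be read from system properties. All names and default values are illustrative:

import io.gatling.core.Predef._
import io.gatling.http.Predef._
import scala.concurrent.duration._

// Reusable building block that can be shared between simulations
object Browse {
  val homepage = exec(
    http("HomePage").get("/").check(status.is(200))
  ).pause(1.second, 3.seconds)
}

class ParameterizedSimulation extends Simulation {
  // Override at runtime, e.g. mvn gatling:test -Dusers=200 -Dduration=600
  val users: Int       = sys.props.get("users").map(_.toInt).getOrElse(100)
  val durationSec: Int = sys.props.get("duration").map(_.toInt).getOrElse(300)
  val baseUrl: String  = sys.props.getOrElse("baseUrl", "https://example.com")

  val httpProtocol = http.baseUrl(baseUrl).acceptHeader("application/json")

  val scn = scenario("BrowseJourney").exec(Browse.homepage)

  setUp(
    scn.inject(rampUsersPerSec(1).to(users).during(durationSec.seconds))
  ).protocols(httpProtocol)
}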

Scaling & Operating in the Cloud (AWS EC2 – Practical Guide)

In real stress testing, you often don’t run on a single machine but distribute load across multiple generators. Example for AWS: 

Architecture Approach 

  • Load Generator Pool: Create multiple EC2 instances as Gatling load generators. Each instance runs a Gatling simulation.
  • Control & Coordination Node: A control node (e.g., another EC2 instance or CI server) coordinates and starts tests (via SSH, Ansible, or CI scripts) on the generators and collects results centrally (e.g., in S3).
  • Network: Ensure generators are close to the System Under Test (SUT), ideally in the same region or availability zone. This way you measure real latency and avoid Internet-induced noise. 

Instance Sizing & Tuning 

  • Instance type: Use compute-optimized instances (e.g., AWS C5/C6) with high network throughput. Short, fast HTTP requests stress CPU/network; large response payloads stress bandwidth.
  • Mind network limits: Cloud VMs have bandwidth and connection caps. Check for rate limits or network bottlenecks. For very high load, consider multiple availability zones or network-optimized instances.
  • System tuning: On load generators, increase open file limits (ulimit -n) and tune TCP settings (e.g., ephemeral ports). Use a modern JVM version and set heap sizes (-Xms/-Xmx) appropriately. Gatling itself is lightweight, but many virtual users can consume JVM resources. 

Orchestration & Artifact Management 

  • Infrastructure as Code: Automate provisioning (e.g., with CloudFormation or Terraform). Start Gatling via user data (on boot) or CI scripts.
  • Centralize logs & reports: Collect all simulation.log files and Gatling HTML reports from the instances. Upload them to central storage (e.g., S3 bucket). This enables later analysis.
  • Security: Open only the necessary ports (usually just egress to the Internet) and isolate test environments. Respect API rate limits and firewall rules of the SUT so your test doesn’t accidentally break them. 

Distributed Tests: Options 

  • Open-source approach: You can run multiple Gatling instances in parallel (synchronously or asynchronously) and merge results manually. Tools like InfluxDB/Grafana (via Gatling plugin) help aggregate metrics. Alternatively, consolidate all simulation.log files and use Gatling’s official tooling (CLI or APIs) to generate aggregated reports.
  • Gatling Enterprise (formerly FrontLine): The enterprise edition provides integrated distributed execution, centralized control, and real-time dashboards. It simplifies large-scale testing significantly but requires a license. 

Monitoring, Metrics & Report Analysis

Which metrics are relevant and how should you interpret them? Comprehensive monitoring during test runs is essential. Key points: 

  • Latencies: Pay special attention to medians and upper percentiles (p50, p90, p95, p99). Outliers (p99) often reveal real problems that averages tend to hide (the assertion sketch after this list shows how such thresholds can be enforced automatically).
  • Throughput (requests/s): Measure how many requests per second the SUT (system under test) processes. Increase load gradually to identify the “knee point,” where latencies suddenly rise. This often marks the true capacity limit.
  • Error rate: Analyze the percentage of failed requests. Even a low error rate (e.g., timeouts) can be alarming.
  • Connection metrics: Gatling measures DNS resolution time, connection setup, TLS handshake, time-to-first-byte, etc. These help identify bottlenecks under the hood.
  • SUT resource utilization: Monitor CPU, RAM, network, and database pools on the target system. High CPU usage or saturated connection pools often correlate with rising response times. 
  • Client bottlenecks: Also measure the load generator utilization (CPU, network). This ensures you are testing the SUT and not overloading the generators themselves. 
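The thresholds discussed above can be wired directly into a simulation as assertions, so a run fails automatically when they are violated. A sketch with illustrative SLA values, reusing scn and httpProtocol from the quick-start example; percentile3 and percentile4 correspond to the 95th and 99th percentile by default (configurable in gatling.conf):

// Sketch: percentile-, error- and throughput-based assertions (all limits are illustrative)
setUp(scn.inject(constantUsersPerSec(50).during(10.minutes)))
  .protocols(httpProtocol)
  .assertions(
    global.responseTime.percentile3.lt(800),   // p95 below 800 ms
    global.responseTime.percentile4.lt(2000),  // p99 below 2000 ms
    global.failedRequests.percent.lt(1),       // error rate below 1 %
    global.requestsPerSec.gte(45)              // sustained throughput floor
  )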

Interpretation & Pitfalls: 

  • Average vs. percentiles: A low average can be misleading if the distribution is wide. Example: If p99 is much higher than the median, there are sporadic delays.
  • Throughput-latency correlation: Ramp up test volume gradually. At each throughput step, observe when latencies suddenly spike (“knee point”). This point is often the practical upper limit for the system.
  • Long-term behavior: Short tests may miss memory leaks or saturation effects. For critical scenarios, run long tests (e.g., 30–60 minutes) to detect throttling, temporary cache warm-ups, or resource exhaustion.
  • Correlation with system metrics: Compare latency spikes with system indicators (e.g., garbage collection pauses, CPU saturation). This helps pinpoint root causes. 

Gatling Reports

The default Gatling HTML report provides a solid first analysis (e.g., latency graphs, error rates, success ratios). For deeper insights, integrate it with external monitoring (e.g., Prometheus or InfluxDB/Grafana): track real-time CPU, memory, and database metrics alongside the test to identify internal bottlenecks. 

CI/CD, Automation and Artifact Management

Load tests can be automated, but they require clear strategies. Here’s an example using GitHub Actions:

name: Gatling-Load-Test 
on: [push] 
jobs: 
  gatling: 
    runs-on: ubuntu-latest 
    steps: 
      - uses: actions/checkout@v3 
      - name: Set up JDK 
        uses: actions/setup-java@v4 
        with: 
          distribution: temurin 
          java-version: '17' 
      - name: Build and run Gatling 
        run: | 
          mvn -DskipTests package 
          mvn -Dgatling.simulationClass=com.example.BasicSimulation gatling:test 
      - name: Upload reports 
        if: always() 
        run: | 
          tar -czf gatling-report.tar.gz target/gatling 
          # Upload gatling-report.tar.gz as a CI artifact or to S3

Strategy: Run quick smoke tests (small scenarios) on every commit/merge to immediately detect major performance issues. Execute larger load tests in separate pipeline jobs (e.g., as nightly builds or manual triggers). 

Pass/Fail Criteria: Use Gatling assertions to automatically determine whether a test has passed. For example, an assertion like global.successfulRequests.percent.gt(99) ensures that fewer than 1% of requests fail. If SLAs are violated, the CI job will fail. 

Important: Store test reports and logs as artifacts. This allows you to analyze failed runs without rerunning the tests. 

Common Pitfalls & Troubleshooting 

Even experienced testers encounter traps. Typical issues:

  • Load generator as bottleneck: CPU or network limits of the generators may appear as SUT performance degradation. Monitor and scale the loaders, or reduce the number of virtual users if needed.
  • DNS/connection pooling: Every Gatling setup has defaults for DNS caching and HTTP connection pools. Ensure that DNS resolution or too few (or too many) connections are not skewing your measurements (see the protocol sketch after this list).
  • Data/test conflicts: When testing stateful endpoints (e.g., creating orders), use unique IDs or clean data states. Otherwise, requests may fail due to “resource already exists.”
  • Hidden errors (no checks): Gatling only applies a basic HTTP status check by default. A request that returns 200 but carries an error payload (e.g., a failed business validation in the body) is counted as “successful.” Always add content checks to catch such silent failures.
  • Cold starts/warm-ups: On the first run, caches (CPU cache, database cache, just-in-time compilation) are still cold. Include a short warm-up phase so your actual test run isn’t skewed by startup transients. 
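Several of these pitfalls are controlled at the HTTP protocol level. A sketch of settings that frequently influence measurements; whether each one should be enabled depends on what you want to simulate:

// Sketch: protocol-level settings that often affect load-test results
val httpProtocol = http
  .baseUrl("https://example.com")
  .shareConnections        // share one connection pool across virtual users instead of one pool per user
  .disableCaching          // ignore caching headers so repeated requests really hit the SUT
  .disableFollowRedirect   // do not follow redirects automatically
  .asyncNameResolution()   // non-blocking DNS resolution instead of the JVM's cached resolver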

If you encounter strange results, it is often worth reviewing the logs (simulation.log contains all request/response details), monitoring the system, or reducing the load to analyze the behavior step by step. 

Checklist Before a Production-Grade Load Test

  • Goal definition: What should the test answer? (Max load, SLA compliance, scaling strategy, etc.)
  • Realistic scenarios: Are user paths and pauses modeled realistically?
  • Test data: Do you have enough unique data? Are feeders configured correctly?
  • Monitoring set up: System metrics (CPU, RAM, DB), network, as well as load generator metrics. 
  • Network verified: Bandwidth, firewalls, API rate limits in the SUT environment.
  • Warm-up phase defined: A short pre-phase to heat up caches.
  • Assertions/SLA checks: Are they defined and meaningful?
  • Reports & logs: Automated collection of reports, e.g., S3 upload or CI artifacts. 

This checklist helps avoid common oversights and ensures the test runs smoothly and delivers meaningful data. 

Conclusion

Gatling is a powerful tool for developer-driven load testing: thanks to its asynchronous architecture, it can generate significant load with relatively few resources, and its clear DSL ensures maintainable test scripts. Particularly in cloud-native environments, API-first architectures, and microservices, Gatling proves highly effective. 

However, note: 

  • If you need a mature GUI or out-of-the-box support for many different protocols, other tools or the commercial Gatling version may be more suitable. Gatling Enterprise (formerly FrontLine) provides, for example, a central web interface with real-time dashboards.
  • For distributed tests with fine-grained control, the open-source version requires custom scripts and metric stores (e.g., InfluxDB/Grafana). Gatling Enterprise significantly automates and simplifies this.
  • Overall, Gatling is an excellent choice for many use cases — from automated CI load tests to manual stress tests. 


Oliver Oberkandler

Oliver Oberkandler is an Associate Consultant at Woodmark Consulting GmbH. As part of his bachelor’s thesis, he focused intensively on performance testing and cloud architectures. His main areas of expertise are software engineering, cloud-native development, and performance optimization. 
