Ron and Ella Wiki Page

Retry with Backoff in Modern Java Systems

Retry with backoff is a core resilience pattern: instead of hammering a failing dependency with constant retries, you retry a limited number of times and increase the delay between attempts, often with randomness (jitter), to give the system time to recover. In modern Java microservices, this is as fundamental as timeouts and circuit breakers, and you should treat it as part of your basic “failure budget” design rather than an afterthought.


Why Simple Retries Are Not Enough

If you just “try again” immediately on failure, you run into two systemic issues:

  • You amplify load on an already unhealthy dependency, potentially turning a small blip into an outage (classic retry storm or thundering herd).
  • You synchronize client behavior: thousands of callers that fail at the same time also retry at the same time, causing periodic waves of load.

Backoff addresses these issues by spreading retries out over time and giving downstream systems breathing room, while still masking short transient failures from end users.


The Core Concept of Backoff

At its heart, retry with backoff is just a loop with three key decisions:

  • Should I retry this failure? (Is it transient and safe to repeat?)
  • How many times will I retry at most?
  • How long will I wait before the next attempt?

Retryable vs non-retryable failures

You normally only retry failures that are likely transient or environmental:

  • HTTP: 429, 503, 504, and connection timeouts are typical candidates.
  • TCP / OS: ECONNRESET, ETIMEDOUT, ECONNREFUSED, etc., often indicate temporary network issues.

You usually do not retry:

  • Client bugs: 400, 401, 403, validation errors, malformed requests.
  • Irreversible business errors, like “insufficient funds”.

The rationale is simple: retrying non-transient errors only adds load and latency, with no chance of success.


Backoff Strategies (Fixed, Exponential, Jitter)

Several backoff strategies are used in practice; the choice affects both user latency and system stability.

Fixed backoff

You wait the same delay before each retry (for example, 1 second between attempts).

  • Pros: Simple to reason about.
  • Cons: Poor at protecting an overwhelmed dependency; many clients still align on the same intervals.

Exponential backoff (with optional cap)

You grow delays multiplicatively:

  • Example: base 200 ms, factor 2 → 200 ms, 400 ms, 800 ms, 1600 ms, … up to some cap (for example 30 s).

This reduces pressure quickly as failures persist, but may produce very long waits unless you cap the maximum delay.

Exponential backoff with jitter

Large-scale systems (AWS and others) recommend adding randomness to each delay, typically “full jitter” where you wait a random time between 0 and the current exponential delay.

  • This breaks synchronization between many clients and avoids retry waves.
  • Conceptually: delay_n = random(0, min(cap, baseDelay × factor^n)).

From a system-design perspective, exponential backoff with jitter is the default you should reach for in distributed environments.
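The three strategies can be compared side by side with a small, self-contained sketch (the method names and parameter values are illustrative, not from any library):

```java
import java.util.concurrent.ThreadLocalRandom;

public class BackoffStrategies {

    // Fixed: the same delay before every retry.
    static long fixedDelayMillis(long delayMs) {
        return delayMs;
    }

    // Exponential with cap: base * factor^attempt, bounded by capMs.
    static long exponentialDelayMillis(long baseMs, double factor, int attempt, long capMs) {
        double raw = baseMs * Math.pow(factor, attempt);
        return Math.min((long) raw, capMs);
    }

    // Full jitter: a random value in [0, current exponential delay].
    static long fullJitterDelayMillis(long baseMs, double factor, int attempt, long capMs) {
        long upper = exponentialDelayMillis(baseMs, factor, attempt, capMs);
        return ThreadLocalRandom.current().nextLong(upper + 1);
    }

    public static void main(String[] args) {
        // Exponential schedule for base 200 ms, factor 2: 200, 400, 800, 1600, 3200 ms.
        for (int attempt = 0; attempt < 5; attempt++) {
            System.out.printf("attempt %d: fixed=%d exp=%d jitter=%d%n",
                    attempt,
                    fixedDelayMillis(1_000),
                    exponentialDelayMillis(200, 2.0, attempt, 30_000),
                    fullJitterDelayMillis(200, 2.0, attempt, 30_000));
        }
    }
}
```

Running this a few times makes the jitter column's point visible: two clients with the same policy will still sleep for different amounts of time.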


Design Parameters You Must Choose

When you design a retry-with-backoff policy, decide explicitly:

  • Max attempts: How many retries are acceptable before surfacing failure? This is a user-experience vs resilience trade-off.
  • Total time budget: How long are you willing to block this call in the worst case? This should be consistent with your higher-level SLAs and timeouts.
  • Base delay: The initial wait, often 50–200 ms for low-latency calls or higher for heavily loaded services.
  • Multiplier: The growth factor, often between 1.5 and 3; higher factors reduce load faster but increase tail latency.
  • Maximum delay (cap): To prevent absurd waits; typical caps are in the 5–60 s range depending on context.
  • Jitter mode: Full jitter is usually preferred; “no jitter” is only acceptable when you have few clients.

You should also define per-operation policies: a read-heavy, idempotent query can tolerate more retries than a rare, expensive write.
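Because these parameters interact, it is worth checking the worst case explicitly: the longest a call can block is every per-attempt timeout plus the sum of the capped delays. A small sketch (the method name is my own) to sanity-check a policy against an SLA:

```java
public class RetryBudget {

    /**
     * Worst-case total sleep time for a policy: the sum of the capped
     * exponential delays before attempts 2..maxAttempts. This is the
     * no-jitter upper bound; full jitter can only make it shorter.
     */
    static long worstCaseSleepMillis(int maxAttempts, long baseMs, double factor, long capMs) {
        long total = 0;
        for (int attempt = 1; attempt < maxAttempts; attempt++) {
            double raw = baseMs * Math.pow(factor, attempt - 1);
            total += Math.min((long) raw, capMs);
        }
        return total;
    }

    public static void main(String[] args) {
        // 5 attempts, 200 ms base, factor 2, 5 s cap:
        // sleeps of 200 + 400 + 800 + 1600 = 3000 ms, on top of
        // up to 5 per-attempt timeouts.
        System.out.println(worstCaseSleepMillis(5, 200, 2.0, 5_000) + " ms");
    }
}
```

If that worst case exceeds your caller's own timeout, the retries are wasted work: reduce attempts, the base delay, or the multiplier.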


Java Example: Simple HTTP Client with Exponential Backoff and Jitter

Below is an example using Java’s built-in HttpClient, written in the compact single-file style supported by recent JDKs (no enclosing class, instance main method). It implements:

  • Exponential backoff with full jitter
  • A simple notion of retryable HTTP status codes
  • A hard cap on attempts and delay

Code

import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;
import java.util.Random;

private static final HttpClient CLIENT = HttpClient.newBuilder()
        .connectTimeout(Duration.ofSeconds(5))
        .build();

private static final Random RANDOM = new Random();

// Policy parameters
private static final int MAX_ATTEMPTS    = 5;
private static final long BASE_DELAY_MS  = 200;   // initial delay
private static final double MULTIPLIER   = 2.0;   // exponential factor
private static final long MAX_DELAY_MS   = 5_000; // cap per attempt

void main() {
    String url = "https://httpbin.org/status/503"; // change to /status/200 to see success

    try {
        String body = getWithRetry(url);
        System.out.println("Final response body: " + body);
    } catch (Exception e) {
        System.err.println("Request failed after retries: " + e.getMessage());
    }
}

public static String getWithRetry(String url) throws Exception {
    int attempt = 0;

    while (true) {
        attempt++;

        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create(url))
                .GET()
                .timeout(Duration.ofSeconds(3))
                .build();

        try {
            HttpResponse<String> response =
                    CLIENT.send(request, HttpResponse.BodyHandlers.ofString());

            int status = response.statusCode();

            if (!isRetryableStatus(status)) {
                // Either success or a non-transient error: stop retrying
                if (status >= 200 && status < 300) {
                    return response.body();
                }
                throw new RuntimeException("Non-retryable status: " + status);
            }

            if (attempt >= MAX_ATTEMPTS) {
                throw new RuntimeException(
                        "Exhausted retries, last status: " + status
                );
            }

            long delay = computeBackoffDelayMillis(attempt);
            System.out.printf("Attempt %d failed with %d, retrying in %d ms%n",
                    attempt, status, delay);
            Thread.sleep(delay);

        } catch (IOException ex) {
            // Network / IO failures are likely transient; the RuntimeExceptions
            // thrown above for non-retryable statuses deliberately propagate.
            if (attempt >= MAX_ATTEMPTS) {
                throw new RuntimeException("Exhausted retries", ex);
            }

            long delay = computeBackoffDelayMillis(attempt);
            System.out.printf("Attempt %d threw %s, retrying in %d ms%n",
                    attempt, ex.getClass().getSimpleName(), delay);
            Thread.sleep(delay);
        }
    }
}

private static boolean isRetryableStatus(int status) {
    // Treat typical transient codes as retryable
    return status == 429 || status == 503 || status == 504;
}

private static long computeBackoffDelayMillis(int attempt) {
    // attempt is 1-based, but we want exponent starting at 0
    int exponent = Math.max(0, attempt - 1);
    double rawDelay = BASE_DELAY_MS * Math.pow(MULTIPLIER, exponent);
    long capped = Math.min((long) rawDelay, MAX_DELAY_MS);

    // Full jitter: random between 0 and capped
    return (long) (RANDOM.nextDouble() * capped);
}

Why this is structured this way

  • isRetryableStatus centralizes policy so you can evolve it without touching the control flow.
  • computeBackoffDelayMillis hides the math and encodes base, multiplier, and cap in one place, making it trivial to test in isolation.
  • The loop is explicit: this makes your retry behavior visible in logs and debuggable, which is important in production troubleshooting.

How to validate the example

  1. Run it as-is; https://httpbin.org/status/503 will keep returning 503.
    • You should see multiple attempts logged with growing (but jittered) delays, then a failure after the max attempt.
  2. Change the URL to https://httpbin.org/status/200.
    • The call should succeed on the first attempt with no retries.
  3. Change to https://httpbin.org/status/429.
    • Observe multiple retries; tweak MAX_ATTEMPTS, BASE_DELAY_MS, and MULTIPLIER and see how behavior changes.

Using Libraries: Resilience4j and Friends

In real systems you rarely hand-roll this everywhere; you typically standardize via a library.

A popular option is Resilience4j, where you:

  • Configure an IntervalFunction for exponential backoff (and optionally jitter).
  • Define RetryConfig with maxAttempts, intervalFunction, and error predicates.
  • Decorate functions or suppliers so retry behavior is applied consistently across the codebase.
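A sketch of what that standardization looks like (assuming Resilience4j 2.x on the classpath; the service call here is a placeholder):

```java
import io.github.resilience4j.core.IntervalFunction;
import io.github.resilience4j.retry.Retry;
import io.github.resilience4j.retry.RetryConfig;

import java.util.function.Supplier;

public class Resilience4jRetrySketch {

    static String callRemoteService() {
        return "ok"; // placeholder for the real remote call
    }

    public static void main(String[] args) {
        // Exponential backoff with randomization (jitter):
        // 200 ms initial interval, factor 2, each interval randomized by +/-50%.
        IntervalFunction backoff =
                IntervalFunction.ofExponentialRandomBackoff(200, 2.0, 0.5);

        RetryConfig config = RetryConfig.custom()
                .maxAttempts(5)
                .intervalFunction(backoff)
                .retryExceptions(RuntimeException.class) // which failures to retry
                .build();

        Retry retry = Retry.of("remoteService", config);

        // Decorate any Supplier so the same policy applies everywhere.
        Supplier<String> decorated =
                Retry.decorateSupplier(retry, Resilience4jRetrySketch::callRemoteService);

        System.out.println(decorated.get());
    }
}
```

The policy lives in one RetryConfig object, so changing the backoff profile for a dependency is a one-line change rather than a hunt through hand-rolled loops.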

Putting It in System Design Context

Retry with backoff must coexist with other resilience mechanisms:

  • Timeouts: Every retried call still needs a per-call timeout; otherwise retries just tie up threads.
  • Circuit breakers: When a dependency is consistently failing, stop sending it traffic for a while instead of continuously retrying.
  • Bulkheads / limits: Cap concurrency so a single broken dependency cannot consume all your resources.

Conceptually, you should design a retry contract per dependency: which operations are idempotent, what latency budget you have, and what backoff profile is acceptable for your users and upstream callers.


A Brief Parameter Guide for Production

As a rule of thumb for synchronous HTTP calls in a microservice:

  • Base delay: 50–200 ms for low-latency services, up to 500 ms for heavy operations.
  • Multiplier: 2 is a safe starting point; 1.5 if you care more about latency, 3 if you are aggressively protecting a fragile dependency.
  • Max delay: 1–5 s for interactive paths, 10–60 s for background jobs.
  • Max attempts: 3–5 attempts (including the initial one) is typical for user-facing calls, more for asynchronous jobs.

Always measure: instrument how many retries happen, which status codes cause them, and their impact on latency and error rates.

Building Resilient Java Services with the Bulkhead Pattern

The bulkhead pattern in Java isolates resources (threads, connections, queues) per dependency or feature so that one overloaded part of the system cannot bring down the whole application. Conceptually, it is named after ship bulkheads: watertight compartments that prevent a single hull breach from sinking the entire ship.

Why the bulkhead pattern matters

In a modern service, you often call multiple downstream systems: payment, inventory, recommendations, analytics, and so on. If all of those calls share the same common resources (for example, the same thread pool), one slow or failing dependency can exhaust those resources and starve everything else.

The intent of the bulkhead pattern is:

  • To prevent cascading failures when one dependency is slow or failing.
  • To protect critical flows (e.g. checkout, login) from non‑critical ones (e.g. recommendations).
  • To create predictable failure modes: instead of everything timing out, some calls are rejected or delayed while others keep working.

A typical “bad” scenario without bulkheads:

  • All outgoing HTTP calls use a single pool of 200 threads.
  • A third‑party recommendation API becomes very slow.
  • Those calls tie up many of the 200 threads, waiting on slow I/O.
  • Under load, all 200 threads end up blocked on the slow service.
  • Now even your payment and inventory calls cannot acquire a thread, so the entire service degrades or fails.

With bulkheads, you deliberately split resources so this cannot happen.

Core design ideas in Java

In Java, the most straightforward way to implement bulkheads is to partition concurrency using:

  • Separate ExecutorServices (thread‑pool bulkhead).
  • Per‑dependency Semaphores (semaphore bulkhead).
  • Separate connection pools per downstream service (database or HTTP clients).

All of these approaches express the same idea: each dependency gets its own “budget” of concurrent work. If it misbehaves, it can at worst exhaust its own budget, not the whole application’s.

Thread‑pool bulkhead

You create dedicated thread pools per dependency or per feature:

  • paymentExecutor only handles calls to the payment service.
  • inventoryExecutor only handles inventory calls.
  • recommendationsExecutor handles non‑critical recommendation calls.

If recommendations become slow, they can only occupy the threads from recommendationsExecutor. Payment and inventory retain their own capacity and remain responsive.

Semaphore bulkhead

Instead of separate pools, you can keep a shared thread pool and bound per‑dependency concurrency with a Semaphore:

  • Each dependency has Semaphore paymentLimiter, Semaphore inventoryLimiter, etc.
  • Before calling the dependency, you try to acquire a permit.
  • If no permit is available, you reject early (fail fast) or queue.
  • This prevents unbounded concurrent calls to any one dependency.

Semaphores work well when you already have a thread pool and you want a light‑weight concurrency limit per call site, without fragmenting your pool into many smaller pools.
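A minimal sketch of a semaphore bulkhead (the class name and permit counts are illustrative): callers take a permit up front and are rejected immediately when the dependency's budget is exhausted:

```java
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

public class SemaphoreBulkhead {

    private final Semaphore permits;

    public SemaphoreBulkhead(int maxConcurrent) {
        this.permits = new Semaphore(maxConcurrent);
    }

    /** Runs the call if a permit is free; otherwise fails fast. */
    public <T> T execute(Supplier<T> call) {
        if (!permits.tryAcquire()) {
            throw new IllegalStateException("Bulkhead full: rejecting call");
        }
        try {
            return call.get();
        } finally {
            permits.release(); // always return the permit, even on failure
        }
    }

    public static void main(String[] args) {
        // e.g. at most 2 concurrent calls to the payment dependency
        SemaphoreBulkhead paymentLimiter = new SemaphoreBulkhead(2);
        System.out.println(paymentLimiter.execute(() -> "payment-ok"));
    }
}
```

The fail-fast branch is the important part: a rejected call returns in microseconds instead of queueing behind a slow dependency, which keeps the caller's own latency bounded.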

Java example (thread pool)

Below is an example using Java and CompletableFuture. It demonstrates how to isolate three fictitious dependencies: payment, inventory, and recommendations.

import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class BulkheadExample {

    // Separate executors = separate bulkheads
    private final ExecutorService paymentExecutor =
            Executors.newFixedThreadPool(16);  // payment API

    private final ExecutorService inventoryExecutor =
            Executors.newFixedThreadPool(8);   // inventory API

    private final ExecutorService recommendationsExecutor =
            Executors.newFixedThreadPool(4);   // non‑critical

    public CompletableFuture<String> callPayment(String request) {
        return CompletableFuture.supplyAsync(() -> {
            sleep(500); // simulate remote call latency
            return "payment-ok for " + request;
        }, paymentExecutor);
    }

    public CompletableFuture<String> callInventory(String request) {
        return CompletableFuture.supplyAsync(() -> {
            sleep(100); // inventory is usually fast
            return "inventory-ok for " + request;
        }, inventoryExecutor);
    }

    public CompletableFuture<String> callRecommendations(String userId) {
        return CompletableFuture.supplyAsync(() -> {
            sleep(1000); // imagine this sometimes gets very slow
            return "reco-ok for " + userId;
        }, recommendationsExecutor);
    }

    private static void sleep(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException ie) {
            Thread.currentThread().interrupt();
        }
    }

    public void shutdown() {
        paymentExecutor.shutdown();
        inventoryExecutor.shutdown();
        recommendationsExecutor.shutdown();
    }

    public static void main(String[] args) {
        var service = new BulkheadExample();

        // Step 1: Saturate the recommendations bulkhead.
        for (int i = 0; i < 50; i++) {
            service.callRecommendations("user-" + i);
        }

        // Step 2: Invoke critical calls and measure latency.
        long start = System.currentTimeMillis();

        var payment = service.callPayment("order-123");
        var inventory = service.callInventory("sku-999");

        payment.thenAccept(p -> System.out.println("Payment: " + p));
        inventory.thenAccept(i -> System.out.println("Inventory: " + i));

        CompletableFuture.allOf(payment, inventory).join();
        long elapsed = System.currentTimeMillis() - start;

        System.out.println("Critical calls finished in ~" + elapsed + " ms");

        service.shutdown();
    }
}

Why it is written this way

  • Dedicated executors express isolation explicitly. When you read the code, you can see the boundaries: payment vs inventory vs recommendations.
  • CompletableFuture lets you compose async calls in a modern, non‑blocking style instead of manually creating and joining threads.
  • The pool sizes reflect relative importance:
    • Payment has more threads (16) because it is critical and may have higher throughput.
    • Inventory has fewer threads (8) but is still important.
    • Recommendations has the smallest pool (4) because it is non‑critical and can be sacrificed under load.

In a real system, you would base these numbers on load tests and SLOs, but the principle holds: allocate more capacity to critical flows, and less to non‑critical ones.

How to validate that the bulkhead works

To treat this as a proper engineering exercise, you should validate that the isolation actually behaves as intended.

From the example:

  1. You deliberately flood the recommendations executor by submitting many requests with high latency (sleep(1000)).
  2. Immediately after, you call payment and inventory once each.
  3. You measure how long the payment and inventory calls take.

What you should observe:

  • Payment: ... and Inventory: ... log lines appear after roughly their simulated latencies (hundreds of milliseconds, not several seconds).
  • The final "Critical calls finished in ~X ms" shows a number close to the slower of the two simulated latencies (about 500 ms, since the two calls run in parallel on separate pools), not dominated by the slow 1‑second recommendation calls.

If you were to “break” the bulkhead intentionally (e.g. by using a single shared executor for everything), then under load the critical calls would complete much later or even time out, because they would be competing for the same threads as the slow recommendations. That contrast is exactly what proves the value of the bulkhead.

In a more advanced setup, you would:

  • Run a load test that increases traffic only to recommendations.
  • Monitor latency and error rates for payment and inventory.
  • Expect recommendations to degrade first, while payment and inventory remain within SLO until their own capacity is genuinely exhausted.

When to reach for bulkheads

You especially want bulkheads when:

  • You have multiple remote dependencies with different reliability profiles.
  • Some features are clearly more important than others.
  • You run in a multi‑tenant or multi‑feature service where one tenant/feature might behave badly.

On the other hand, bulkheads add configuration and operational overhead:

  • Too many tiny thread pools fragment your resources and make tuning harder.
  • Mis‑sized bulkheads can waste resources (too large) or throttle throughput (too small).

A good practice is to start with a small number of coarse‑grained bulkheads (e.g. “critical vs non‑critical calls”), validate behaviour under failure, and then refine as you learn where contention really happens.

Circuit Breakers for Java Services

A circuit breaker is a protective layer between your service and an unreliable dependency, designed to fail fast and prevent cascading failures in distributed systems.

Why Circuit Breakers Exist

In a microservice architecture, one slow or failing dependency can quickly exhaust threads, connection pools, and CPU of its callers, leading to a chain reaction of outages. The circuit breaker pattern monitors calls to these dependencies and, when failure or latency crosses a threshold, temporarily blocks further calls to give the system time to recover.

The rationale is simple: it is better to return a fast, controlled error or degraded response than to hang on timeouts and drag the entire system down.

Core States and Behaviour

Most implementations define three key states.

  • Closed
    • All calls pass through to the downstream service.
    • The breaker tracks metrics such as error rate, timeouts, and latency over a sliding window.
    • When failures or slow calls exceed configured thresholds, the breaker trips to Open.
  • Open
    • Calls are rejected immediately or routed to a fallback without touching the downstream.
    • This protects the unhealthy service and the caller’s resources from overload.
    • The breaker stays open for a configured cool‑down period.
  • Half‑open
    • After the cool‑down, a limited number of trial calls are allowed through.
    • If trial calls succeed, the breaker returns to Closed; if they fail, it flips back to Open and waits again.

The design rationale is to adapt dynamically: be optimistic while things are healthy, aggressively protect resources when they are not, and probe carefully for recovery.
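To make these transitions concrete, here is a deliberately minimal, hand-rolled state machine (a teaching sketch, not production code and not how any particular library implements it; the clock is injected so the cool-down is testable):

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

public class MiniCircuitBreaker {

    enum State { CLOSED, OPEN, HALF_OPEN }

    private final int failureThreshold;   // consecutive failures before tripping
    private final long openMillis;        // cool-down period while OPEN
    private final LongSupplier clock;     // injected clock, for testability

    private State state = State.CLOSED;
    private int consecutiveFailures = 0;
    private long openedAt = 0;

    MiniCircuitBreaker(int failureThreshold, long openMillis, LongSupplier clock) {
        this.failureThreshold = failureThreshold;
        this.openMillis = openMillis;
        this.clock = clock;
    }

    State state() {
        // Lazily move OPEN -> HALF_OPEN once the cool-down has elapsed.
        if (state == State.OPEN && clock.getAsLong() - openedAt >= openMillis) {
            state = State.HALF_OPEN;
        }
        return state;
    }

    <T> T call(Supplier<T> action) {
        if (state() == State.OPEN) {
            throw new IllegalStateException("Circuit open: failing fast");
        }
        try {
            T result = action.get();
            // Success (including a HALF_OPEN trial call) resets the breaker.
            consecutiveFailures = 0;
            state = State.CLOSED;
            return result;
        } catch (RuntimeException e) {
            consecutiveFailures++;
            // A failed trial call, or too many consecutive failures, opens it.
            if (state == State.HALF_OPEN || consecutiveFailures >= failureThreshold) {
                state = State.OPEN;
                openedAt = clock.getAsLong();
            }
            throw e;
        }
    }

    public static void main(String[] args) {
        long[] now = {0};
        MiniCircuitBreaker cb = new MiniCircuitBreaker(3, 3_000, () -> now[0]);
        System.out.println("initial state: " + cb.state());
    }
}
```

Real implementations add sliding windows, slow-call detection, and limited trial-call counts, but the three-state skeleton above is the core of all of them.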

When You Should Use a Circuit Breaker

Circuit breakers are most valuable when remote failures are frequent, long‑lasting, or expensive.

  • Protection and stability
    • Prevents retry storms and timeouts from overwhelming a struggling dependency.
    • Limits the blast radius of a failing service so other services remain responsive.
  • Better user experience
    • Fails fast with clear errors or fallbacks instead of long hangs.
    • Enables graceful degradation such as cached reads, default values, or “read‑only” modes.
  • High‑availability systems
    • Essential where you must keep the system partially available even when individual services are down.

You usually combine a circuit breaker with timeouts, retries (with backoff and jitter), and bulkheads for a robust resilience layer.

Java Example With Resilience4j

Below is a complete, runnable Java example using Resilience4j’s circuit breaker in a simple main program.

import io.github.resilience4j.circuitbreaker.CircuitBreaker;
import io.github.resilience4j.circuitbreaker.CircuitBreakerConfig;
import io.github.resilience4j.circuitbreaker.CircuitBreakerRegistry;

import java.time.Duration;
import java.util.concurrent.TimeoutException;
import java.util.function.Supplier;

static String callRemoteService() throws Exception {
    double p = Math.random();
    if (p < 0.4) {
        // Simulate a timeout-style failure
        throw new TimeoutException("Remote service timed out");
    } else if (p < 0.7) {
        // Simulate normal, fast success
        return "FAST OK";
    } else {
        // Simulate slow success
        Thread.sleep(1500);
        return "SLOW OK";
    }
}

void main() {
    var config = CircuitBreakerConfig.custom()
            .failureRateThreshold(50.0f)                      // trip if >= 50% failures
            .slowCallRateThreshold(50.0f)                     // trip if >= 50% slow calls
            .slowCallDurationThreshold(Duration.ofSeconds(1)) // >1s is “slow”
            .waitDurationInOpenState(Duration.ofSeconds(3))   // open for 3s
            .permittedNumberOfCallsInHalfOpenState(3)         // 3 trial calls
            .minimumNumberOfCalls(5)                          // need data first
            .slidingWindowSize(10)                            // last 10 calls
            .recordExceptions(RuntimeException.class)         // the supplier below wraps TimeoutException in RuntimeException
            .build();

    CircuitBreakerRegistry registry = CircuitBreakerRegistry.of(config);
    CircuitBreaker breaker = registry.circuitBreaker("remoteService");

    Supplier<String> guardedCall = CircuitBreaker.decorateSupplier(
            breaker,
            () -> {
                try {
                    System.out.println("  executing remote call...");
                    return callRemoteService();
                } catch (Exception e) {
                    throw new RuntimeException(e);
                }
            }
    );

    for (int i = 1; i <= 25; i++) {
        var state = breaker.getState();
        System.out.println("Attempt " + i + " | state=" + state);

        try {
            String result = guardedCall.get();
            System.out.println("  -> SUCCESS: " + result);
        } catch (Exception e) {
            System.out.println("  -> FAILURE: " + e.getClass().getSimpleName()
                    + " | " + e.getMessage());
        }

        try {
            Thread.sleep(500);
        } catch (InterruptedException ignored) {
            Thread.currentThread().interrupt();
        }
    }
}

How to Validate This Example

  • Observe Closed → Open → Half‑open transitions
    • Run the program; you should see some attempts in CLOSED with mixed successes and failures.
    • Once enough calls fail or are slow, the state switches to OPEN and subsequent attempts fail fast without printing “executing remote call…”.
    • After roughly 3 seconds, the state changes to HALF_OPEN, a few trial calls run, and then the breaker returns to CLOSED or back to OPEN depending on their outcomes.
  • Confirm protection behavior
    • The absence of “executing remote call…” logs during OPEN demonstrates that the breaker is blocking calls and thus protecting both caller and callee.

The rationale for this configuration is to keep the example small yet realistic: using a sliding window and explicit thresholds makes the breaker’s decisions explainable in production terms.

Circuit Breaker vs Retry vs Bulkhead

These patterns solve related but distinct concerns and are often composed together.

Pattern          Concern addressed                            Typical placement
Circuit breaker  Persistent failures, high error/slow rate    Around remote calls, per dependency
Retry            Transient, short‑lived faults                Inside Closed breaker, with backoff
Bulkhead         Isolation of resource usage across calls     At thread‑pool or connection‑pool level

The key design idea is: bulkhead limits blast radius, circuit breaker limits how long you keep talking to something broken, and retry gives a flaky but recoverable dependency a second chance.

WireMock Response Templates with Handlebars and Response Builders

WireMock gives you two powerful levers for dynamic responses: declarative Handlebars templates embedded in stubs, and imperative Java logic via the response builder APIs. They complement each other rather than compete.


Why response templating exists

In real systems, responses rarely stay static: IDs echo from the URL, correlation IDs flow through headers, timestamps change, payloads depend on the request body, and so on. WireMock’s response templating solves this by letting you:

  • Use Handlebars to keep mocks close to the contract, with simple expressions and helpers.
  • Drop to Java builders and transformers when behaviour becomes algorithmic or needs integration with other libraries.

The design goal is: keep the simple cases in configuration, and only push complex behaviour into code.


Handlebars in WireMock: the declarative layer

Handlebars lets you embed {{expressions}} directly in response headers, bodies, and even proxy URLs. At runtime, WireMock feeds a rich request model into the template engine so the template can “see” all relevant request data.

The request model

Some of the most useful fields in the request model:

  • request.url, request.path, request.pathSegments.[n] and named path variables (e.g. request.path.customerId).
  • request.query.foo or request.query.foo.[n] for multi-valued params.
  • request.headers.X-Request-Id, request.cookies.session, request.method, request.baseUrl.
  • request.body, request.bodyAsBase64, and multipart request.parts.* for more advanced cases.

Rationale: this model gives you enough context to “shape” the response purely from the incoming HTTP request, without Java code.

Enabling and scoping templating

In programmatic (local) mode, WireMock 3+ enables templating support out of the box, but actually applying it depends on how your instance is configured:

  • You can run with local templating, where templating is applied only to stubs that specify the response-template transformer.
  • Or you can flip to global templating, where every stub is rendered through Handlebars unless explicitly disabled.

This is intentional: it prevents accidental template evaluation on basic static stubs, while still allowing an “all templates” mode when you know your mapping set depends heavily on it.

Basic Handlebars example

Below is a minimal, self-contained Java example using a modern WireMock 3.x style API, focusing only on the stub and response (no build configuration). It echoes data from path, query, and headers via Handlebars.

import com.github.tomakehurst.wiremock.WireMockServer;
import com.github.tomakehurst.wiremock.core.WireMockConfiguration;

import static com.github.tomakehurst.wiremock.client.WireMock.*;

public class HandlebarsExample {

    public static void main(String[] args) {
        WireMockServer wm = new WireMockServer(
            WireMockConfiguration.options()
                // Local templating enabled by default in Java mode;
                // if you’d changed it globally, you can toggle here.
                .templatingEnabled(true)
        );
        wm.start();

        wm.stubFor(
            get(urlPathMatching("/customers/(.*)"))
                .willReturn(
                    aResponse()
                        .withStatus(200)
                        .withHeader("Content-Type", "application/json")
                        // Handlebars: use path segment, header, and query param
                        .withBody("""
                            {
                              "id": "{{request.pathSegments.[1]}}",
                              "correlationId": "{{request.headers.X-Correlation-Id}}",
                              "greeting": "Hello, {{request.query.name}}"
                            }
                            """)
                        // Ensure the response goes through the template engine:
                        .withTransformers("response-template")
                )
        );

        // Simple validation (manual): curl the endpoint:
        // curl -H "X-Correlation-Id: abc-123" "http://localhost:8080/customers/42?name=Alice"
        // Response should contain id=42, correlationId=abc-123, greeting "Hello, Alice".
    }
}

Why this shape:

  • We keep the logic in the template, not in Java branching, so QA or other devs can read and modify it like a contract.
  • The same body can be copied verbatim into a JSON mapping file if you later move to standalone/Cloud, which makes your tests more portable.

Handlebars helpers: beyond simple interpolation

Handlebars in WireMock is extended with a large set of helpers for dates, random values, JSON/XML processing, math, array manipulation, and more. This lets templates stay expressive without turning into full programming languages.

Some particularly pragmatic helpers:

  • Date/time: now with format, timezone, and offsets; parseDate, truncateDate for consistent timestamps.
  • Random: randomValue, randomInt, pickRandom for synthetic but realistic data.
  • JSON: jsonPath, parseJson, toJson, formatJson, jsonArrayAdd, jsonMerge, jsonRemove, jsonSort for shaping JSON payloads.
  • XPath / XML: xPath, soapXPath, formatXml.
  • Utility: math, range, contains, matches, numberFormat, base64, urlEncode, formData, regexExtract, size, and more.

Rationale: you can keep test data generation in the template layer, avoiding brittle fixture code and making mocks self-describing.
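As an illustration, a stub body might combine several of these helpers (this assumes templating is applied to the stub and a running WireMockServer wm; the endpoint and field names are invented):

```java
import com.github.tomakehurst.wiremock.WireMockServer;

import static com.github.tomakehurst.wiremock.client.WireMock.*;

public class HelperStubExample {

    static void register(WireMockServer wm) {
        wm.stubFor(
            post(urlPathEqualTo("/orders"))
                .willReturn(
                    aResponse()
                        .withStatus(201)
                        .withHeader("Content-Type", "application/json")
                        .withBody("""
                            {
                              "orderId": "{{randomValue type='UUID'}}",
                              "customer": "{{jsonPath request.body '$.customer'}}",
                              "createdAt": "{{now format='yyyy-MM-dd HH:mm:ss'}}",
                              "queuePosition": {{randomInt lower=1 upper=100}}
                            }
                            """)
                        .withTransformers("response-template")
                )
        );
    }
}
```

Each response gets a fresh UUID and timestamp while echoing the customer from the request body, with no Java branching and no fixture files.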


Response builders: the imperative layer

While Handlebars owns the declarative side, the Java Response builder APIs give you full programmatic control. Conceptually, you work with:

  • ResponseDefinitionBuilder (via aResponse() in stubs) at the configuration level.
  • Response and its Builder when writing extensions like ResponseTransformerV2.

A typical stub already uses the builder implicitly:

wm.stubFor(
    get(urlEqualTo("/hello"))
        .willReturn(
            aResponse()
                .withStatus(200)
                .withHeader("Content-Type", "text/plain")
                .withBody("Hello from WireMock")
        )
);

Here the builder is used in its simplest form: static status, headers, and body. The real power shows up when you combine it with extensions.

Custom transformer with Response.Builder

A common pattern is: start with the stub’s definition, then refine the actual Response in a transformer. This is where Response.Builder (or the like(response).but() style) becomes useful.

Conceptual example using a ResponseTransformerV2, which takes the already-resolved Response and lets you mutate it:

import com.github.tomakehurst.wiremock.extension.ResponseTransformerV2;
import com.github.tomakehurst.wiremock.http.Request;
import com.github.tomakehurst.wiremock.http.Response;
import com.github.tomakehurst.wiremock.stubbing.ServeEvent;

public class UppercaseNameTransformer implements ResponseTransformerV2 {

    @Override
    public String getName() {
        return "uppercase-name-transformer";
    }

    @Override
    public Response transform(Response response, ServeEvent serveEvent) {
        String originalBody = response.getBodyAsString();
        Request request = serveEvent.getRequest();
        String name = request.queryParameter("name").isPresent()
                ? request.queryParameter("name").firstValue()
                : null;

        String effectiveName = (name != null && !name.isBlank())
                ? name.toUpperCase()
                : "UNKNOWN";

        String transformedBody = originalBody.replace("{{NAME}}", effectiveName);

        return Response.Builder
                .like(response)
                .but()
                .body(transformedBody)
                .build();
    }
}

Key rationale:

  • Immutability and reuse: like(response).but() copies status, headers, etc., so you only specify what changes.
  • Composability: multiple transformers can run without trampling each other; each focuses on a narrow concern.
  • Testability: behaviour is in Java, which you can unit test thoroughly when it gets complex.

A stub can then provide the template “skeleton” that this transformer fills:

wm.stubFor(
    get(urlPathEqualTo("/greet"))
        .willReturn(
            aResponse()
                .withStatus(200)
                .withHeader("Content-Type", "text/plain")
                .withBody("Hello, {{NAME}}!") // Placeholder, not Handlebars here
                .withTransformers("uppercase-name-transformer")
        )
);

Validation strategy:

  • Hit /greet?name=alice and assert you get Hello, ALICE!.
  • Hit /greet with no name and assert you get Hello, UNKNOWN!.

This demonstrates the division of responsibilities: the stub defines the shape, the transformer plus builder define the behaviour.


Example: combining Handlebars and Response builders

To tie everything together, here is a minimal Java program that:

  • Starts WireMock.
  • Defines one stub using pure Handlebars.
  • Defines another stub that uses a custom transformer implemented with Response.Builder.

Again, no build tooling, only the code that matters for behaviour.

import com.github.tomakehurst.wiremock.WireMockServer;
import com.github.tomakehurst.wiremock.core.WireMockConfiguration;
import com.github.tomakehurst.wiremock.extension.ResponseTransformerV2;
import com.github.tomakehurst.wiremock.http.Request;
import com.github.tomakehurst.wiremock.http.Response;
import com.github.tomakehurst.wiremock.stubbing.ServeEvent;

import static com.github.tomakehurst.wiremock.client.WireMock.*;

public class WireMockTemplatingDemo {

    public static void main(String[] args) {
        WireMockServer wm = new WireMockServer(
                WireMockConfiguration.options()
                        .port(8080)
                        .templatingEnabled(true) // ensure Handlebars engine is available
                        .extensions(new UppercaseNameTransformer())
        );
        wm.start();

        configureFor("localhost", 8080);

        // 1) Pure Handlebars-based response.
        wm.stubFor(
                get(urlPathMatching("/customers/(.*)"))
                        .willReturn(
                                aResponse()
                                        .withStatus(200)
                                        .withHeader("Content-Type", "application/json")
                                        .withBody("""
                            {
                              "id": "{{request.pathSegments.[1]}}",
                              "name": "{{request.query.name}}",
                              "requestId": "{{request.headers.X-Request-Id}}"
                            }
                            """)
                                        .withTransformers("response-template")
                        )
        );

        // 2) Response builder + custom transformer.
        wm.stubFor(
                get(urlPathEqualTo("/welcome"))
                        .willReturn(
                                aResponse()
                                        .withStatus(200)
                                        .withHeader("Content-Type", "text/plain")
                                        // Placeholder token; the transformer will replace it.
                                        .withBody("Welcome, {{NAME}}!")
                                        .withTransformers("uppercase-name-transformer")
                        )
        );

        System.out.println("WireMock started on http://localhost:8080");

        // Manual validation:
        // 1) Handlebars endpoint:
        // curl -H "X-Request-Id: req-123" "http://localhost:8080/customers/42?name=Alice"
        //    -> JSON with id=42, name=Alice, requestId=req-123
        //
        // 2) Transformer + Response.Builder endpoint:
        // curl "http://localhost:8080/welcome?name=alice"
        //    -> "Welcome, ALICE!"
        // curl "http://localhost:8080/welcome"
        //    -> "Welcome, UNKNOWN!"
    }

    // Custom transformer using Response.Builder
    public static class UppercaseNameTransformer implements ResponseTransformerV2 {

        @Override
        public String getName() {
            return "uppercase-name-transformer";
        }

        @Override
        public Response transform(Response response, ServeEvent serveEvent) {
            String body = response.getBodyAsString();
            Request request = serveEvent.getRequest();
            String name = request.queryParameter("name").isPresent()
                    ? request.queryParameter("name").firstValue()
                    : null;

            String effectiveName = (name != null && !name.isBlank())
                    ? name.toUpperCase()
                    : "UNKNOWN";

            String newBody = body.replace("{{NAME}}", effectiveName);

            return Response.Builder
                    .like(response)
                    .but()
                    .body(newBody)
                    .build();
        }
    }
}

Why this example is structured this way:

  • It demonstrates that Handlebars and Response builders are orthogonal tools: you can use either alone, or both together.
  • The Handlebars stub stays close to an API contract and is easy to lift into JSON mappings later.
  • The transformer shows how to use Response.Builder for behavioural logic while preserving the stub’s declarative shape.

When to prefer which approach

In practice, a good rule of thumb is:

  • Start with Handlebars templates whenever the response can be expressed as “request data + light helper usage”. This keeps mocks transparent and editable by a wide audience.
  • Move logic into Response builders and transformers when you need non-trivial algorithms, external data sources, or complex JSON manipulation that would make templates hard to read.

A lot of teams settle on a hybrid: templates for the bulk of the response, plus a few narrow custom transformers for cross-cutting concerns (IDs, timestamps, test data injection). That balance usually gives you both readability and power.

WireMock Matchers in Practice: From Strings to JSON Schemas

WireMock’s matching model is built around a small set of pattern types that you reuse everywhere: URLs, headers, query parameters, and bodies all use variations of the same abstractions. In this article we’ll build an intuition for those abstractions and then see them in a Java example.


The Core Abstractions: ContentPattern and StringValuePattern

At the heart of WireMock matching is the idea “given some content, decide how well it matches this expectation.” That idea is embodied in ContentPattern and its concrete subclasses.

  • ContentPattern is the conceptual base: “match content of type T and produce a MatchResult.”
  • StringValuePattern is the concrete type for matching strings and underpins almost every simple matcher you use in the DSL.

You rarely construct StringValuePattern directly; you use the DSL factory methods that create it for you. These include:

  • equalTo("foo") – exact match.
  • containing("foo") – substring.
  • matching("[A-Z]+") – regular expression.
  • equalToJson("{...}"), equalToXml("<root/>") – structural equality for JSON/XML.

Rationale: by funnelling everything through StringValuePattern, WireMock can reuse the same matching semantics across URLs, headers, query params, and body fragments, and can assign a distance score that lets it pick the best matching stub.

Mini example (headers and query):

stubFor(get(urlPathEqualTo("/search"))
    .withQueryParam("q", equalTo("WireMock"))
    .withHeader("Accept", containing("json"))
    .willReturn(okJson("{ \"result\": \"ok\" }"))
);

Both equalTo("WireMock") and containing("json") are StringValuePattern instances internally.


MultiValuePattern: Matching Lists of Values

Headers and query parameters can have multiple values; sometimes you care about the set of values as a whole, not just any one of them. That’s what MultiValuePattern is for.

Conceptually, MultiValuePattern answers: “does this list of strings satisfy my condition?” Typical usage for query parameters:

stubFor(get(urlPathEqualTo("/items"))
    .withQueryParam("tag", havingExactly("featured", "sale"))
    .willReturn(ok())
);

Here:

  • havingExactly("featured", "sale") expresses that the parameter tag must have exactly those two values and no others.
  • MultiValuePattern is the pattern over that list, analogous to how StringValuePattern is a pattern over a single string.

Rationale: this is important for APIs where order and multiplicity of query or header values matter (e.g. “all of these scopes must be present”). Without a dedicated multi-value abstraction you would be forced to encode such logic in brittle string hacks.


UrlPathPattern and URL-Level Matching

WireMock lets you match URLs at several levels in the Java DSL: exact string, regex, or path-only vs path+query. UrlPathPattern is the type behind the path-only matchers (urlPathEqualTo and urlPathMatching) when you use the Java client.

Key URL matchers in Java:

  • urlEqualTo("/path?query=...") – exact match on full path (with query).
  • urlMatching("regex") – regex on full path (with query).
  • urlPathEqualTo("/path") – exact match on path only.
  • urlPathMatching("regex") – regex on path only.

Example:

stubFor(get(urlPathMatching("/users/([A-Za-z0-9_-]+)/repos"))
    .willReturn(aResponse()
        .withStatus(200)));

In modern WireMock 3 you’ll also see path templates like urlPathTemplate("/contacts/{contactId}"), which are easier to read and make the path parameters explicit:

stubFor(get(urlPathTemplate("/contacts/{contactId}"))
    .willReturn(aResponse()
        .withStatus(200)));

Rationale: splitting “path” from “query” and using distinct matchers lets you:

  • Use exact matches (urlEqualTo, urlPathEqualTo) when you want deterministic behaviour (faster, simpler).
  • Only fall back to regex (urlMatching, urlPathMatching) when you truly need flexible matching, such as versioned paths or dynamic IDs.

MatchesJsonPathPattern: Querying Inside JSON Bodies

When the request body is JSON and you care about conditions inside it rather than full equality, WireMock uses MatchesJsonPathPattern. You access it via matchingJsonPath(...) in the DSL.

Two main flavours:

  • matchingJsonPath("$.message") – body must match the JSONPath (i.e. the expression finds at least one node).
  • matchingJsonPath("$.message", equalTo("Hello")) – evaluates JSONPath, converts the result to a string, then matches it with a StringValuePattern.

Example:

stubFor(post(urlEqualTo("/api/message"))
    .withRequestBody(matchingJsonPath("$.message", equalTo("Hello World!")))
    .willReturn(aResponse().withStatus(200))
);

Important detail: all WireMock matchers operate on strings, so the JSONPath result is stringified before applying the StringValuePattern. This even allows selecting a sub-document and then matching it with equalToJson(...).

Rationale: JSONPath is ideal when you need to assert that some field exists or meets a condition (e.g. price > 10) without tightly coupling your tests to the entire JSON shape. It makes tests robust to benign changes in unrelated fields.


JsonUnit Placeholders in equalToJson

When you do want structural equality for JSON, equalToJson is your tool, and it is powered by JsonUnit. JsonUnit supports placeholders, which WireMock exposes to let you relax parts of the JSON equality check.

You embed placeholders directly into the expected JSON:

stubFor(post(urlEqualTo("/orders"))
    .withRequestBody(equalToJson("""
        {
          "id": "${json-unit.any-string}",
          "amount": 123.45,
          "status": "NEW",
          "metadata": "${json-unit.ignore}"
        }
        """))
    .willReturn(ok())
);

Common placeholders:

  • ${json-unit.ignore} – ignore value and type, just require key presence.
  • ${json-unit.any-string} – value can be any string.
  • ${json-unit.any-number} – any number.
  • ${json-unit.any-boolean} – any boolean.
  • ${json-unit.regex}[A-Z]+ – value must match this regex.

You can also change delimiters if ${ and } clash with your payload conventions.

Rationale: equalToJson is excellent for enforcing payload shape and fixed fields, but brittle when some fields are inherently variable (IDs, timestamps, correlation IDs). JsonUnit placeholders let you keep strictness where it matters and loosen it where variability is expected.


JSON Schema Matching

For more formal validation you can use JSON Schema with matchers like matchingJsonSchema. This checks that the JSON body (or a path variable) conforms to an explicit schema: types, required fields, min/max, etc.

Typical body matcher:

stubFor(post(urlEqualTo("/things"))
    .withRequestBody(
        matchingJsonSchema("""
            {
              "$schema": "http://json-schema.org/draft-07/schema#",
              "type": "object",
              "required": ["id", "name"],
              "properties": {
                "id":   { "type": "string", "minLength": 2 },
                "name": { "type": "string" },
                "price": { "type": "number", "minimum": 0 }
              },
              "additionalProperties": false
            }
            """)
    )
    .willReturn(created())
);

You can also apply matchingJsonSchema to path parameters when you use path templates, e.g. constraining userId to a specific shape.

Rationale:

  • JsonPath answers “does there exist an element meeting this condition?”
  • equalToJson + JsonUnit placeholders answers “is this JSON equal to a template with some flexible parts?”
  • JSON Schema answers “does this JSON globally satisfy a formal contract?”.

JSON Schema is especially useful when your WireMock stub is acting as a consumer-driven contract test for another service.


Custom Matcher: Escaping the Built-in Model

When WireMock’s built-in matchers are not enough, you can implement Custom Matchers. The main extension is RequestMatcherExtension, which gives you full control over how a request is matched.

The minimal shape:

import com.github.tomakehurst.wiremock.extension.Parameters;
import com.github.tomakehurst.wiremock.http.Request;
import com.github.tomakehurst.wiremock.matching.MatchResult;
import com.github.tomakehurst.wiremock.matching.RequestMatcherExtension;

public class BodyLengthMatcher extends RequestMatcherExtension {

    @Override
    public String getName() {
        return "body-too-long";
    }

    @Override
    public MatchResult match(Request request, Parameters parameters) {
        int maxLength = parameters.getInt("maxLength");
        boolean tooLong = request.getBody().length > maxLength;
        return MatchResult.of(tooLong);
    }
}

You then register the extension with the server (e.g. via extensions(...) on the configuration) and reference it by name in your stub:

stubFor(requestMatching("body-too-long", Parameters.one("maxLength", 2048))
    .willReturn(aResponse().withStatus(422))
);

There is also a lower‑level ValueMatcher<T> interface that you can use in verification or in some advanced APIs to match arbitrary values, without registering a global extension.

Rationale: custom matchers are your “escape hatch” for domain‑specific logic which would otherwise be contorted into regexes or JSONPath. Examples include:

  • Checking that a JWT in a header is valid and contains certain claims.
  • Ensuring consistency between header and body fields.

Putting It All Together

Below is a Java 17 style example which combines several of the discussed matcher types. You can drop it into a plain Maven or Gradle project that has the WireMock dependency on the classpath.

Example: Product API Stub with Multiple Matchers

import com.github.tomakehurst.wiremock.WireMockServer;

import static com.github.tomakehurst.wiremock.client.WireMock.*;
import static com.github.tomakehurst.wiremock.core.WireMockConfiguration.wireMockConfig;

public class ProductApiStub {

    public static void main(String[] args) {
        WireMockServer server = new WireMockServer(
            wireMockConfig().port(8080)
        );
        server.start();

        configureFor("localhost", 8080);
        registerStubs(server);

        Runtime.getRuntime().addShutdownHook(new Thread(server::stop));
        System.out.println("WireMock running at http://localhost:8080");
    }

    private static void registerStubs(WireMockServer server) {
        server.stubFor(post(urlPathTemplate("/shops/{shopId}/products"))
            // UrlPathTemplate + JSON Schema for path parameter
            .withPathParam("shopId", matchingJsonSchema("""
                {
                  "type": "string",
                  "minLength": 3,
                  "maxLength": 10
                }
                """))

            // MultiValuePattern on query param
            .withQueryParam("tag", havingExactly("featured", "sale"))

            // StringValuePattern: header must match regex
            .withHeader("X-Request-Id", matching("req-[0-9a-f]{8}"))

            // JsonUnit placeholders in equalToJson
            .withRequestBody(equalToJson("""
                {
                  "id": "${json-unit.any-string}",
                  "name": "Socks",
                  "price": 9.99,
                  "metadata": "${json-unit.ignore}"
                }
                """))

            // MatchesJsonPathPattern + StringValuePattern
            .withRequestBody(matchingJsonPath("$[?(@.price > 0)]"))

            .willReturn(okJson("""
                {
                  "status": "ACCEPTED",
                  "message": "Product created"
                }
                """)));
    }
}

What this stub requires:

  • Path: /shops/{shopId}/products where shopId is a string 3–10 chars long (validated via JSON Schema).
  • Query: ?tag=featured&tag=sale and only those values (MultiValuePattern).
  • Header: X-Request-Id must match req-[0-9a-f]{8} (regex StringValuePattern).
  • Body: JSON with fixed fields name and price, but any string for id and anything for metadata, thanks to JsonUnit placeholders.
  • Body: JSONPath asserts that $.price is strictly greater than 0.

How to Validate the Example

Using any HTTP client (curl, HTTPie, Postman, Java HTTP client), send:

curl -i \
  -X POST "http://localhost:8080/shops/abc/products?tag=featured&tag=sale" \
  -H "Content-Type: application/json" \
  -H "X-Request-Id: req-deadbeef" \
  -d '{
        "id": "any-random-id",
        "name": "Socks",
        "price": 9.99,
        "metadata": { "color": "blue" }
      }'

You should receive a 200 OK with JSON body:

{
  "status": "ACCEPTED",
  "message": "Product created"
}

If you break one constraint at a time (e.g. a negative price, a missing tag value, a malformed X-Request-Id, a too-short shopId), the request will no longer match the stub: WireMock responds with 404 (or hits some other fallback stub) and reports the closest non-matching stub as a “near miss”, which is precisely how you verify that each matcher behaves as expected.

The Age of Slop Code – And How Senior Engineers Keep Systems Sane

Slop code is becoming a defining challenge of modern software engineering: code that looks clean, runs, and even passes tests, yet is shallow, fragile, and corrosive to long‑term quality.

From “AI Slop” to Slop Code

The term “AI slop” emerged to describe low‑quality AI‑generated content that appears competent but is actually superficial, cheap to produce, and easy to flood the world with. Researchers characterize this slop by three prototypical properties: superficial competence, asymmetric effort, and mass producibility. When this pattern moved into software, engineers started talking about “AI slop code” or simply “slop code” for similar low‑quality output in codebases.

At the same time, “vibe coding” entered the lexicon: relying on LLMs to generate entire chunks of functionality from natural‑language prompts, reviewing results only lightly and steering with follow‑up prompts rather than deep understanding. When this practice spills over into rushed shipping, missing refactors, and weak testing, you get “vibe slopping”: chaotic, unrefactored, AI‑heavy changes that harden into technical debt.

What Slop Code Looks Like in Practice

Slop code is not obviously broken. That is precisely why it is dangerous. It often has these traits:

  • Superficially correct behavior: it compiles, runs, and passes basic or happy‑path tests.
  • Overly complex implementations: verbose solutions, unnecessary abstractions, and duplicated logic rather than refactoring.
  • Architectural blindness: code that “solves” the prompt but ignores existing patterns, invariants, or system boundaries.
  • Weak error handling and edge‑case coverage: success paths are implemented, but failure modes are hand‑waved or inconsistent.
  • Inconsistent conventions: style, naming, and dependency usage drift across files or services.
  • Low comprehension: the submitting developer struggles to explain trade‑offs, invariants, or why this approach fits the system.

Reports from teams using AI‑assisted development describe AI slop as code that “looks decent at first glance” but hides overcomplication, neglected edge cases, and performance or integration issues that only surface later. Senior engineers increasingly describe their role as auditing AI‑generated code and guarding architecture and security rather than writing most of the initial implementation themselves.

A Simple Example Pattern

Consider an AI‑generated “quick” integration:

  • It introduces a new HTTP client wrapper instead of reusing the existing one.
  • It hard‑codes timeouts and retry logic instead of using shared configuration.
  • It parses responses with ad‑hoc JSON access rather than central DTOs and validation.

Everything appears to work in a demo and passes a couple of unit tests, but it quietly duplicates concerns, violates resilience patterns, and becomes a fragile outlier under load — classic slop behavior.

Why Slop Code Is Systemically Dangerous

The slop layer is insidious because it is made of code that “works” and “looks fine.” It doesn’t crash obviously; instead, it undermines systems over time.

Key risks include:

  • Accelerated technical debt: AI tools optimize for local code generation, not global architecture, so they create bloat, duplication, and shallow abstractions at scale.
  • False sense of velocity: teams see rapid feature delivery and green test suites while hidden complexity and fragility quietly accumulate.
  • Integration fragility: code that works in isolation clashes with production data shapes, error behaviors, and cross‑service contracts.
  • Erosion of engineering skill: juniors rely on AI for non‑trivial tasks, skipping the deep debugging and maintenance work that forms real expertise.

Some industry analyses describe this as an “AI slop layer”: code that compiles, passes tests, and looks clean, yet is “system‑blind” and architecturally shallow. The result is a sugar‑rush phase of AI‑driven development now, followed by a slowdown later as teams pay down accumulated slop.

How Slop Relates to Vibe Coding and Vibe Slopping

The modern ecosystem has started to differentiate related behaviors:

  • AI slop – core idea: low‑quality AI content that seems competent but is shallow. Typical failure mode: volume over rigor; hard‑to‑spot defects.
  • Vibe coding – core idea: using LLMs as the primary way to generate code from English. Typical failure mode: accepting working code without fully understanding it.
  • Vibe slopping – core idea: the chaotic aftermath of vibe coding under delivery pressure. Typical failure mode: bloated, duct‑taped, unrefactored code and technical debt.
  • Slop code – core idea: the resulting messy or shallow code in the repo. Typical failure mode: long‑term maintainability and reliability problems.

Crucially, using AI does not automatically produce slop. If an engineer reviews, tests, and truly understands AI‑written code, that is closer to using an LLM as a typing assistant than to vibe coding. Slop arises when teams accept AI output at face value, optimize for throughput, and skip the engineering disciplines that make software robust.

Guardrails: How Technical Leads Can Contain Slop

For someone in a technical‑lead role, the real question is: how do we get the productivity benefits of AI without drowning in slop?

Industry guidance and experience from teams operating heavily with AI suggest a few practical guardrails.

  • Raise the bar for acceptance, not generation
    Treat AI code as if it were written by a very fast junior: useful, but never trusted without review. Require that the author can explain key invariants, trade‑offs, and failure modes in their own words.
  • Design and architecture first
    Make system boundaries, contracts, and invariants explicit before generating code. The more precise the specification and context, the less room there is for the model to generate clever but misaligned solutions.
  • Enforce consistency with existing patterns
    Review code for alignment with established architecture, libraries, and conventions, not just for local correctness. Build simple checklists: shared clients, shared error envelopes, shared DTOs, and standard logging and metrics patterns.
  • Strengthen tests around behavior, not implementation
    Focus tests on business rules, edge cases, and contracts between modules and services. This constrains slop by making shallow or misaligned behavior visible quickly.
  • Be deliberate with AI usage
    Use AI where it shines: boilerplate, glue code, and refactors, rather than core domain logic or delicate concurrency and performance‑critical code. When applying AI to critical paths, budget time for deep human review and stress testing.
  • Train for slop recognition
    Teach your team to spot red flags: over‑verbose code, unnecessary abstractions, unexplained dependencies, and “magic” logic. Encourage code reviews that ask, “How does this fit the system?” as much as “Does this pass tests?”

A recurring theme in expert commentary is that future high‑value skills include auditing AI‑generated code, debugging AI‑assisted systems, and securing and scaling AI‑written software. In that world, leads act less as primary implementers and more as stewards of architecture, quality, and learning.

A Simple Example: Turning Slop into Solid Code (Conceptual)

To keep this language‑agnostic, imagine a service that needs to fetch user preferences from another microservice and fall back gracefully on failure.

A slop‑code version often looks like this conceptually:

  • Creates a new HTTP client with hard‑coded URL and timeouts.
  • Calls the remote service directly in multiple places.
  • Swallows or logs errors without clear fallback behavior.
  • Has only a basic success‑path test, no network‑failure tests.

A cleaned‑up version, written with architectural intent, would instead:

  • Reuse the shared HTTP client and central configuration for timeouts and retries.
  • Encapsulate the call behind a single interface, e.g., UserPreferencesProvider.
  • Define explicit behavior on failure (default preferences, cached values, or clear error propagation).
  • Add tests for timeouts, 4xx/5xx responses, and deserialization failures, plus contract tests for the external API.

Slop is not about who typed the code; it is about whether the team did the engineering work around it.

WireMock Java Stubbing: From Configuration to StubMapping

In this article we will walk through the main Java concepts behind WireMock: how you configure the server, choose a port, describe requests and responses, and how everything ends up as a StubMapping. The goal is that you not only know how to use the API, but also why it is structured this way, the way an experienced engineer would reason about test doubles.


Configuring WireMockServer with WireMockConfiguration

WireMockConfiguration is the object that describes how your WireMock HTTP server should run. You rarely construct it directly; instead you use a static factory called options(), which returns a configuration builder.

At a high level:

  • WireMockConfiguration controls ports, HTTPS, file locations, extensions, and more.
  • The fluent style (via options()) encourages explicit, readable configuration instead of magic defaults scattered through the codebase.
  • Because it is a separate object from WireMockServer, you can reuse or tweak configuration for different test scenarios.

Example shape (without imports for now):

WireMockServer wireMockServer = new WireMockServer(
    options()
        .dynamicPort()
        .dynamicHttpsPort()
);

You pass the built configuration into the WireMockServer constructor, which then uses it to bind sockets, set up handlers, and so on. Conceptually, think of WireMockConfiguration as the blueprint; WireMockServer is the running building.


Dynamic Port: Why and How

In test environments, hard‑coding ports (e.g. 8080, 9090) is a common source of flakiness. If two tests (or two services) try to use the same port, one will fail with “address already in use.”

WireMock addresses this with dynamicPort():

  • dynamicPort() tells WireMock to pick any free TCP port available on the machine.
  • After the server starts, you ask the server which port it actually bound to, via wireMockServer.port().
  • This pattern is ideal for parallel test runs and CI environments, where port availability is unpredictable.

Example pattern:

WireMockServer wireMockServer = new WireMockServer(
    options().dynamicPort()
);

wireMockServer.start();

int port = wireMockServer.port(); // the chosen port at runtime
String baseUrl = "http://localhost:" + port;

You then configure your HTTP client (or the service under test) to call baseUrl, not a hard‑coded port. The rationale is to shift from “global fixed port” to “locally discovered port,” which removes an entire class of brittle test failures.
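The underlying mechanism is ordinary socket behaviour: binding a ServerSocket to port 0 asks the operating system for any free ephemeral port, which is essentially what dynamicPort() relies on. A minimal plain-Java sketch (no WireMock involved):

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.ServerSocket;

public class DynamicPortDemo {

    // Binding to port 0 delegates port selection to the OS; the socket is
    // closed on exit from the try block, and the chosen port is returned.
    public static int findFreePort() {
        try (ServerSocket socket = new ServerSocket(0)) {
            return socket.getLocalPort();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Unlike this sketch, WireMock keeps the socket it binds for the server's lifetime, so there is no close-then-rebind race; the sketch only shows where dynamic ports come from.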


Creating a Stub: The Big Picture

When we say “create a stub” in WireMock, we mean:

Define a mapping from a request description to a response description, and register it with the server so that runtime HTTP calls are intercepted according to that mapping.

This mapping is built in three conceptual layers:

  • A request pattern (what should be matched).
  • A response definition (what should be returned).
  • A stub mapping that joins these two together and gives it identity and lifecycle inside WireMock.

In Java, the fluent DSL exposes this as:

wireMockServer.stubFor(
    get(urlEqualTo("/api/message"))
        .willReturn(
            aResponse()
                .withStatus(200)
                .withHeader("Content-Type", "text/plain")
                .withBody("hello-wiremock")
        )
);

This single statement hides several objects: a MappingBuilder, a RequestPattern, a ResponseDefinition, and eventually a StubMapping. The design encourages a declarative style: you describe what should happen, not how to dispatch it.


MappingBuilder: Fluent Construction of a Stub

MappingBuilder is the central builder used by the Java DSL. Calls like get(urlEqualTo("/foo")) or post(urlPathMatching("/orders/.*")) return a MappingBuilder instance.

It is responsible for:

  • Capturing the HTTP method (GET, POST, etc.).
  • Associating a URL matcher (exact equality, regex, path, etc.).
  • Enriching with conditions on headers, query parameters, cookies, and body content.
  • Attaching a response definition via willReturn.

You rarely instantiate MappingBuilder yourself. Instead you use static helpers from the DSL:

get(urlEqualTo("/api/message"))
post(urlPathEqualTo("/orders"))
put(urlMatching("/v1/users/[0-9]+"))

Each of these returns a MappingBuilder, and you chain further methods to refine the match. The rationale is to keep your test code highly readable while still configuring quite a lot of matching logic.
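The mechanics behind such a DSL are ordinary Java: each call records one fact and returns the builder itself so that calls can chain. The toy StubBuilder below is a hypothetical stand-in, not WireMock's real MappingBuilder, but it shows the pattern:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class StubBuilderDemo {

    // A toy stand-in for a fluent mapping builder: each method records
    // one piece of the match and returns `this` so calls can chain.
    static class StubBuilder {
        private String method;
        private String url;
        private final Map<String, String> headers = new LinkedHashMap<>();

        StubBuilder method(String method) { this.method = method; return this; }
        StubBuilder url(String url)       { this.url = url; return this; }
        StubBuilder header(String name, String value) {
            headers.put(name, value);
            return this;
        }

        String describe() {
            return method + " " + url + " headers=" + headers;
        }
    }

    public static void main(String[] args) {
        String description = new StubBuilder()
                .method("GET")
                .url("/api/message")
                .header("X-Tenant", "test")
                .describe();
        System.out.println(description);
    }
}
```

Returning `this` from every step is what lets the real DSL read as a single declarative sentence rather than a sequence of setter calls.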


RequestPattern: Describing the Request Shape

Under the hood, MappingBuilder gradually accumulates a RequestPattern (or more precisely, builds a RequestPatternBuilder). A RequestPattern is an object representation of “what an incoming HTTP request must look like for this stub to apply.”

A RequestPattern may include:

  • HTTP method (e.g. GET).
  • URL matcher: urlEqualTo, urlPathEqualTo, regex matchers, etc.
  • Optional header conditions: withHeader("X-Env", equalTo("test")).
  • Optional query param or cookie matchers.
  • Optional body matchers: raw equality, regex, JSONPath, XPath, and so on.

Example via DSL:

post(urlPathEqualTo("/orders"))
    .withHeader("X-Tenant", equalTo("test"))
    .withQueryParam("source", equalTo("mobile"))
    .withRequestBody(matchingJsonPath("$.items[0].id"));

Each of these DSL calls contributes to the underlying RequestPattern. The motivation for this design is to let you express complex request matching without writing imperative “if header equals X and URL contains Y” code; WireMock handles that logic internally.
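Conceptually, each with* call contributes one condition, and the stub applies only when every condition holds. A hypothetical sketch of that evaluation model, using java.util.function.Predicate over a simple request map (this is a conceptual model, not WireMock's internal code):

```java
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

public class RequestMatchDemo {

    // Each condition becomes a predicate; the stub matches only if all hold,
    // mirroring how a RequestPattern combines its individual matchers.
    static boolean matches(Map<String, String> request,
                           List<Predicate<Map<String, String>>> conditions) {
        return conditions.stream().allMatch(c -> c.test(request));
    }

    public static void main(String[] args) {
        List<Predicate<Map<String, String>>> conditions = List.of(
                r -> "POST".equals(r.get("method")),
                r -> "/orders".equals(r.get("path")),
                r -> "test".equals(r.get("X-Tenant"))
        );

        Map<String, String> request =
                Map.of("method", "POST", "path", "/orders", "X-Tenant", "test");

        System.out.println(matches(request, conditions));
    }
}
```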


ResponseDefinition and aResponse: Describing the Response

If RequestPattern says “what we expect to receive,” then ResponseDefinition says “what we will send back.” It captures all aspects of the stubbed response:

  • Status code and optional status message.
  • Headers (e.g., content type, custom headers).
  • Body content (string, JSON, binary, templated content).
  • Optional behaviour like artificial delays or faults.

The idiomatic way to construct a ResponseDefinition in Java is via the aResponse() factory, which returns a ResponseDefinitionBuilder:

aResponse()
    .withStatus(201)
    .withHeader("Content-Type", "application/json")
    .withBody("{\"id\":123}");

Using a builder for responses has several benefits:

  • It separates pure data (status, headers, body) from the network I/O, so you can reason about responses as values.
  • It encourages small, focused stubs rather than ad‑hoc code that manipulates sockets or streams.
  • It allows extensions and transformers to hook into a well‑defined structure.

Once built, this response definition is attached to a mapping via willReturn.


willReturn: Connecting Request and Response

The willReturn method lives on MappingBuilder and takes a ResponseDefinitionBuilder (typically produced by aResponse()).

Conceptually:

  • Before willReturn, you are only describing the request side.
  • After willReturn, you have a complete “if request matches X, then respond with Y” mapping.
  • The resulting MappingBuilder can be passed to stubFor, which finally registers it with the server.

Example:

get(urlEqualTo("/api/message"))
    .willReturn(
        aResponse()
            .withStatus(200)
            .withBody("hello-wiremock")
    );

The wording is deliberate: the DSL reads like "GET /api/message willReturn this response," which makes tests self-documenting and easy to skim.


StubMapping: The Persisted Stub Definition

Once you call stubFor(mappingBuilder), WireMock converts the builder into a concrete StubMapping instance. This is the in‑memory (and optionally JSON‑on‑disk) representation of your stub.

A StubMapping includes:

  • The RequestPattern (what to match).
  • The ResponseDefinition (what to send).
  • Metadata: UUID, name, priority, scenario state, and other advanced properties.

StubMapping is what WireMock uses at runtime to:

  • Evaluate incoming requests against all known stubs.
  • Decide which stub wins (based on priority rules).
  • Produce the actual HTTP response that the client receives.

From an architectural perspective, StubMapping lets WireMock treat stubs as data. That is why you can:

  • Export stubs as JSON.
  • Import them via admin endpoints.
  • Manipulate them dynamically without recompiling or restarting your tests.
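As data, the stub registered earlier has a straightforward JSON form when exported. The shape below follows WireMock's stub mapping format; a real exported mapping additionally carries a generated UUID and other metadata:

```json
{
  "request": {
    "method": "GET",
    "url": "/api/message"
  },
  "response": {
    "status": 200,
    "headers": {
      "Content-Type": "text/plain"
    },
    "body": "hello-wiremock"
  }
}
```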

WireMock Class: The Fluent DSL Entry Point

The WireMock class is the static gateway to the Java DSL. It provides methods used throughout examples:

  • Request builders: get(), post(), put(), delete(), any().
  • URL matchers: urlEqualTo(), urlPathEqualTo(), regex variants.
  • Response builders: aResponse(), plus convenience methods like ok(), badRequest(), etc.
  • Utility methods to bind the static DSL to a specific server (configureFor(host, port)).

In tests you typically import its static methods:

import static com.github.tomakehurst.wiremock.client.WireMock.*;

This is what enables code such as:

get(urlEqualTo("/api/message"))
    .willReturn(aResponse().withStatus(200));

instead of more verbose, object‑oriented calls. The goal is to minimize ceremony and make test intent immediately obvious.


A Simple Example

Let’s now put all these pieces together in a small JUnit 5 test using:

  • Java 11+ HttpClient.
  • WireMockServer with dynamicPort().
  • A single stub built with the core DSL concepts we have discussed.

This example intentionally avoids any build or dependency configuration, focusing only on the Java code.

import com.github.tomakehurst.wiremock.WireMockServer;
import org.junit.jupiter.api.AfterEach;
import org.junit.jupiter.api.BeforeEach;
import org.junit.jupiter.api.Test;

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

import static com.github.tomakehurst.wiremock.client.WireMock.*;
import static com.github.tomakehurst.wiremock.core.WireMockConfiguration.options;
import static org.junit.jupiter.api.Assertions.assertEquals;

class WireMockExampleTest {

    private WireMockServer wireMockServer;

    @BeforeEach
    void startServer() {
        // Configure WireMock with a dynamic port to avoid clashes.
        wireMockServer = new WireMockServer(
                options().dynamicPort()
        );
        wireMockServer.start();

        // Bind the static DSL to this server instance.
        configureFor("localhost", wireMockServer.port());
    }

    @AfterEach
    void stopServer() {
        wireMockServer.stop();
    }

    @Test
    void shouldReturnStubbedMessage() throws Exception {
        // Create a stub (MappingBuilder -> RequestPattern + ResponseDefinition)
        wireMockServer.stubFor(
                get(urlEqualTo("/api/message"))
                        .willReturn(
                                aResponse()
                                        .withStatus(200)
                                        .withHeader("Content-Type", "text/plain")
                                        .withBody("hello-wiremock")
                        )
        );

        // Build an HTTP client and request using the dynamic port.
        // (HttpClient only implements AutoCloseable from Java 21 onward,
        // so we avoid try-with-resources to stay compatible with Java 11+.)
        HttpClient client = HttpClient.newHttpClient();
        String baseUrl = "http://localhost:" + wireMockServer.port();

        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create(baseUrl + "/api/message"))
                .GET()
                .build();

        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());

        // Validate that the stub mapping was applied correctly.
        assertEquals(200, response.statusCode());
        assertEquals("hello-wiremock", response.body());
    }
}

How to validate this example

To validate the example:

  • Ensure you have WireMock and JUnit 5 in your project dependencies (via Maven, Gradle, or your build tool of choice).
  • Run the test class.
  • The test passes if:
    • The WireMockServer starts on a dynamic port without conflicts.
    • The request to /api/message is matched by the RequestPattern defined in the MappingBuilder.
    • The ResponseDefinition created with aResponse() and attached via willReturn produces the expected status and body.

Why Do We Need Modern Software and Tools?

Modern software and tools are no longer “nice to have”; they are the infrastructure that lets individuals and organizations work faster, more accurately, and more securely in a digital economy.

The role of modern tools in today’s world

We now build, run, and maintain most services through software, from banking and healthcare to logistics and entertainment. Modern tools encapsulate current best practices, regulations, and technologies, allowing us to keep up with rapidly changing requirements and expectations.

Efficiency and productivity at scale

Modern tools automate repetitive work such as deployments, testing, reporting, and coordination, which dramatically reduces manual effort and context switching. This automation scales: one team can now manage systems that would previously have required many more people, simply because the tools handle orchestration and routine checks.

Accuracy, reliability, and reduced risk

Contemporary platforms embed validation, type checking, automated tests, and monitoring capabilities that reduce the likelihood of human error. As a result, systems become more reliable, analytics more trustworthy, and business decisions less exposed to mistakes arising from inconsistent or incorrect data.

Collaboration in a distributed world

Work has become inherently distributed across locations and time zones, and modern software is designed to support this reality. Shared repositories, real‑time document and code collaboration, integrated chat, and task tracking make it feasible for cross‑functional teams to coordinate effectively without being physically co‑located.

Security, compliance, and maintainability

Security threats evolve constantly, and older tools tend not to receive timely patches or support for new standards. Modern platforms incorporate stronger authentication, encryption, audit trails, and compliance features, helping organizations protect data and meet regulatory obligations while keeping maintenance overhead manageable.

Innovation and competitive advantage

New capabilities—AI-assisted development, advanced analytics, low‑code platforms, cloud‑native services—are exposed primarily through modern tools and ecosystems. Organizations that adopt them can experiment faster, ship features more quickly, and create better user experiences, while those tied to outdated tooling tend to move slowly and lose competitive ground.

In short, we use modern software and tools because they are the practical way to achieve speed, quality, security, and innovation in a world where all of these are moving targets.

Cloud Native Applications and the Twelve‑Factor Methodology

Cloud native and the twelve‑factor methodology describe two tightly related but distinct layers of modern software: cloud native is primarily about the environment and platform you deploy to, while twelve‑factor is about how you design and implement the application so it thrives in that environment.

What “cloud native” actually means

Cloud‑native applications are designed to run on dynamic, elastic infrastructure such as public clouds, private clouds, or hybrid environments. They assume that:

  • Infrastructure is ephemeral: instances can disappear and be recreated at any time.
  • Scale is horizontal: you handle more load by adding instances, not vertically scaling a single machine.
  • Configuration, networking, and persistence are provided by the platform and external services, not by local machine setup.

Typically, cloud‑native systems use:

  • Containers (OCI images) as the primary packaging and deployment unit.
  • Orchestration (e.g., Kubernetes) to schedule, scale, heal, and roll out workloads.
  • Declarative configuration and infrastructure‑as‑code to describe desired state.
  • Observability (logs, metrics, traces) and automation (CI/CD, auto‑scaling, auto‑healing) as first‑class concerns.

From an architect’s perspective, “cloud native” is the combination of these platform capabilities with an application design that can exploit them. Twelve‑factor is one of the earliest and still influential descriptions of that design.

The twelve‑factor app in a nutshell

The twelve‑factor methodology was introduced to codify best practices for building Software‑as‑a‑Service applications that are:

  • Portable across environments.
  • Easy to scale horizontally.
  • Amenable to continuous deployment.
  • Robust under frequent change.

The original factors (Codebase, Dependencies, Config, Backing services, Build/Release/Run, Processes, Port binding, Concurrency, Disposability, Dev/prod parity, Logs, Admin processes) constrain how you structure and operate the app. The key idea is that by following these constraints, you produce an application that is:

  • Stateless in its compute tier.
  • Strict about configuration boundaries.
  • Explicit about dependencies.
  • Friendly to automation and orchestration.

Notice how those properties line up almost one‑for‑one with cloud‑native expectations.

How twelve‑factor underpins cloud‑native properties

Let’s connect specific twelve‑factor principles to core cloud‑native characteristics.

Portability and containerization

Several factors directly support packaging and running your app in containers:

  • Dependencies: All dependencies are declared explicitly and isolated from the base system. This maps naturally to container images, where your application and its runtime are packaged together.
  • Config: Configuration is stored in the environment, not baked into the image. That means the same image can be promoted across environments (dev → test → prod) simply by changing environment variables, ConfigMaps, or Secrets.
  • Backing services: Backing services (databases, queues, caches, etc.) are treated as attached resources, accessed via configuration. This decouples code from specific infrastructure instances, making it easy to bind to managed cloud services.

Result: your artifact (image) becomes environment‑agnostic, which is a prerequisite for true cloud‑native deployments across multiple clusters, regions, or even cloud providers.
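In Java, the Config factor usually reduces to reading settings from the environment with a sensible fallback, so the same compiled artifact behaves correctly in every environment. A minimal sketch (variable and default names are illustrative):

```java
public class EnvConfigDemo {

    // Resolve a setting from the environment, falling back to a default.
    // The same image can then be promoted across dev, test, and prod
    // purely by changing its environment variables.
    static String configOrDefault(String name, String defaultValue) {
        String value = System.getenv(name);
        return (value == null || value.isBlank()) ? defaultValue : value;
    }

    public static void main(String[] args) {
        String dbUrl = configOrDefault("DATABASE_URL",
                "jdbc:postgresql://localhost:5432/dev");
        System.out.println("Using database: " + dbUrl);
    }
}
```

On Kubernetes, the environment variables would typically be injected from ConfigMaps or Secrets, keeping the image itself environment-agnostic.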

Statelessness and horizontal scalability

Cloud‑native platforms shine when workloads are stateless and scale horizontally. Several factors enforce that:

  • Processes: The app executes as one or more stateless processes; any persistent state is stored in external services.
  • Concurrency: Scaling is achieved by running multiple instances of the process rather than threading tricks inside a single instance.
  • Disposability: Processes are fast to start and stop, enabling rapid scaling, rolling updates, and failure recovery.

On an orchestrator like Kubernetes, these characteristics translate directly into:

  • Replica counts controlling concurrency.
  • Pod restarts and rescheduling being safe and routine.
  • Auto‑scaling policies that can add or remove instances in response to load.

If your app violates these factors (e.g., uses local disk for state, maintains sticky in‑memory sessions, or takes minutes to start), it fights the cloud‑native platform rather than benefiting from it.

Reliability, operability, and automation

Cloud‑native systems rely heavily on automation and observability. Twelve‑factor anticipates this:

  • Dev/prod parity: Minimizing the gap between development, staging, and production environments reduces surprises and supports continuous delivery.
  • Logs: Treating logs as an event stream, written to stdout/stderr, fits perfectly with container logging and centralized log aggregation. The platform can capture, ship, and index logs without the application managing log files.
  • Admin processes: One‑off tasks (migrations, batch jobs) run as separate processes (or jobs), using the same codebase and configuration as long‑running services. This aligns with Kubernetes Jobs/CronJobs or serverless functions.

Together, these make it far easier to build reliable CI/CD pipelines, perform safe rollouts/rollbacks, and operate the system with minimal manual intervention—hallmarks of cloud‑native operations.

How to use twelve‑factor as a cloud‑native checklist

You can treat twelve‑factor as a practical assessment framework for the cloud‑readiness of an application, regardless of language or stack.

For each factor, ask: “If I deployed this on a modern orchestrator, would this factor hold, or would it cause friction?” For example:

  • Config: Can I deploy the same container image to dev, QA, and prod, changing only environment settings? If not, there is a cloud‑native anti‑pattern.
  • Processes & Disposability: Can I safely kill any instance at any time without data loss and with quick recovery? If not, the app is not truly cloud‑native‑friendly.
  • Logs: If I run multiple instances, can I still understand system behavior from aggregated logs, or is there stateful, instance‑local logging?

You will usually discover that bringing a legacy application “into Kubernetes” without addressing these factors leads to brittle deployments: liveness probes fail under load, rollouts are risky, and scaling is unpredictable.

Conversely, if an app cleanly passes a twelve‑factor review, it tends to behave very well in a cloud‑native environment with minimal additional work.

How to position twelve‑factor today

Twelve‑factor is not the whole story in 2026, but it remains an excellent baseline:

  • It does not cover all modern concerns (e.g., multi‑tenant isolation, advanced security, service mesh, zero‑trust networking, event‑driven patterns).
  • It is, however, an excellent “minimum bar” for application behavior in a cloud‑native context.

I recommend treating it as:

  • A design standard for service teams: code reviews and design docs should reference the factors explicitly where relevant.
  • A readiness checklist before migrating a service to a Kubernetes cluster or similar platform.
  • A teaching tool for new engineers to understand why “just dockerizing the app” is not enough.


Scaffolding a Modern VS Code Extension with Yeoman

In this article we focus purely on scaffolding: generating the initial VS Code extension project using the Yeoman generator, with TypeScript and esbuild, ready for you to start coding.


Prerequisites

Before you scaffold the project, ensure you have:

  • Node.js 18+ installed (check with node -v).
  • Git installed (check with git --version).

These are required because the generator uses Node, and the template can optionally initialise a Git repository for you.


Generating the extension with Yeoman

VS Code’s official generator is distributed as a Yeoman generator. You don’t need to install anything globally; you can invoke it directly via npx:

# One-time scaffold (no global install needed)
npx --package yo --package generator-code -- yo code

This command:

  • Downloads yo (Yeoman) and generator-code on demand.
  • Runs the VS Code extension generator.
  • Prompts you with a series of questions about the extension you want to create.

Recommended answers to the generator prompts

When the interactive prompts appear, choose:

? What type of extension do you want to create? → New Extension (TypeScript)
? What's the name of your extension?            → my-ai-extension
? What's the identifier?                        → my-ai-extension
? Initialize a git repository?                  → Yes
? Which bundler to use?                         → esbuild
? Which package manager?                        → npm

Why these choices matter:

  • New Extension (TypeScript) – gives you a typed development experience and a standard project layout.
  • Name / Identifier – the identifier becomes the technical ID used in the marketplace and in settings; pick something stable and lowercase.
  • Initialize a git repository – sets up Git so you can immediately start version-controlling your work.
  • esbuild – a modern, fast bundler that creates a single bundled extension.js for VS Code.
  • npm – a widely used default package manager; you can adapt to pnpm/yarn later if needed.

After you answer the prompts, Yeoman will generate the project in a new folder named after your extension (e.g. my-ai-extension).


Understanding the generated structure

Open the new folder in VS Code. The generator gives you a standard layout, including:

  • src/extension.ts
    This is the entry point of your extension. It exports activate and (optionally) deactivate. All your activation logic, command registration, and other behaviour start here.
  • package.json
    This acts as the extension manifest. It contains:

    • Metadata (name, version, publisher).
    • "main" field pointing to the compiled bundle (e.g. ./dist/extension.js).
    • "activationEvents" describing when your extension loads.
    • "contributes" describing commands, configuration, views, etc., that your extension adds to VS Code.

From an architectural perspective, package.json is the single most important file: it tells VS Code what your extension is and how and when it integrates into the editor.

You’ll also see other generated files such as:

  • tsconfig.json – TypeScript compiler configuration.
  • Build scripts in package.json – used to compile and bundle the extension with esbuild.
  • .vscode/launch.json – debug configuration for running the extension in a development host.

At this stage, you don’t need to modify any of these to get a working scaffold.


Running the scaffolded extension

Once the generator finishes:

  1. Install dependencies:

    cd my-ai-extension
    npm install
  2. Open the folder in VS Code (if you aren’t already).

  3. Press F5.

    VS Code will:

    • Run the build task defined by the generator.
    • Launch a new Extension Development Host window.
    • Load your extension into that window.

In the Extension Development Host:

  • Open the Command Palette.
  • Run the sample command that the generator added (typically named something like “Hello World”).

If the command runs and shows the sample notification, you have a fully working scaffolded extension. From here, you can start replacing the generated sample logic in src/extension.ts and adjusting package.json to declare your own contributions.
