
How I reduced Velocity Limbo Handlers Footprint by ~50%

Published: Feb 23, 2026
Read time: 8 minutes
Topics
Optimization, System Design, Deep Dive

For the last year, I've been working on a little side project of mine called Velocity Limbo Handler. It's a system for larger Velocity proxy networks that handles server failures by rerouting every player on the failed server to a limbo server, then managing player reconnection through a fair per-server queue.

You can check out the plugin for yourself here: Velocity Limbo Handler.

The goal behind Velocity Limbo Handler was always to keep it as lightweight as possible, so it could support thousands of concurrent players. And it did a great job at that, a fantastic job even. It was never slow, and it never hogged extreme resources. But as the plugin grew with more and more features, I stopped prioritizing maximum performance, which had been the goal all along. So I decided to dedicate release v1.8.1 to performance, and it ended up reducing memory usage by ~50% and CPU usage by ~35%. In this blog post I'll do a deep dive into the measures I took, and how I did it without sacrificing quality.


The Problem

Before optimizing, the plugin had a few implementation details that were harmless at small scale, but inefficient under large player loads:

  • Repeated reflection lookups for maintenance checks: the maintenance provider was resolved via reflection inside frequently executed paths, so the same lookup work was performed over and over during queue processing.

  • Player state held with strong object references: some internal maps stored Player objects directly. After disconnects or server switches, these references could live longer than necessary, increasing retained heap and risking stale entries.

  • Redundant state stored across multiple collections: parts of the queue and player metadata were tracked in overlapping structures, which increased allocation churn and made cleanup harder.

  • Synchronous work inside hot paths: queue iteration, permission checks, and state updates happened in code paths triggered by joins, reconnect attempts, and maintenance toggles, amplifying the cost during spikes.

Individually, none of these issues were dramatic. Combined, they created unnecessary CPU work and memory retention that scaled poorly with larger queues.


Identifying the problems

I didn’t know these issues upfront, so I started by profiling the plugin using Spark (/sparkv profiler start) to see where time and memory were actually spent under load.
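For reference, the rough workflow looked something like this, run from the proxy console (on Velocity, Spark registers under the sparkv alias; the exact commands may vary slightly by version):

```
/sparkv profiler start
# ...apply load: simulated joins, disconnects, reconnect attempts...
/sparkv profiler stop
/sparkv heapsummary
```

The profiler output shows where CPU time goes, while the heap summary gives a rough picture of what's being retained in memory.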

During profiling, two patterns stood out:

  • Reconnect handling consumed a disproportionate amount of CPU time
  • Memory usage increased noticeably as player counts grew

The profiler traces consistently pointed to queue processing and maintenance checks as hot paths. Heap sampling also suggested that player-related objects were being retained longer than expected after disconnects.

Eliminating Reflections in Hot Paths

When I dug into the reconnect CPU spike, profiler traces kept pointing back to maintenance checks running during queue processing. That led me straight into two innocent-looking helpers:

Utility.isServerInMaintenance(...)
Utility.playerMaintenanceWhitelisted(...)

At first glance these look harmless. But dig a little deeper, and you quickly realise just how much of a mess they are.

To talk to the Maintenance plugin, I had decided to use reflection. That meant every time a maintenance check ran, the plugin would:

  1. grab the API instance
  2. look up the method dynamically
  3. invoke it via reflection
  4. sometimes repeat this several times just to figure out which overload existed

If you don't know what reflection is, you can read about it here, or in this short Stack Overflow post: What is reflection and why is it useful?

This approach is of course very costly, since it performs a lot of lookups. It's not a problem at smaller scales, but it can quickly hog resources when scaled up.

Here's an example of how I would fetch methods:

// Resolved via reflection on every single call -- no caching anywhere
boolean globalMaintenance = (boolean) maintenanceAPI.getClass()
    .getMethod("isMaintenance")
    .invoke(maintenanceAPI);

This doesn't look that bad, until you consider how often it runs: the lookup happened every single time the function was called.

Then we dig a little deeper, and it gets much worse. Take the function isServerInMaintenance, which the reconnection loop uses to... well, check if the server is in maintenance.

The function would work like this:

  1. First call isMaintenance globally, to check whether the entire network was under maintenance
  2. If global maintenance wasn't enabled, check for server-specific maintenance. This internally had two checks:
    • First, check maintenance with the server name
    • If that method didn't exist, check again with the server object
  3. Finally, if none of the checks above returned true, iterate over every method in the Maintenance API and try to match a compatible signature.

Keep in mind that every time we tried to invoke a method, we would first do a reflection lookup for it. Then consider that this ran on every reconnection attempt, which may happen hundreds, if not thousands, of times per second. And then there's the question of what happens when the server is not in maintenance... Well... We would scan all methods of the Maintenance API, for no reason. This is a laughably bad system.
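To make the flow above concrete, here's a simplified, self-contained sketch of what the old reflective chain looked like. The real code lives in the linked utility file; FakeMaintenanceAPI is a stand-in I made up so the example runs without the Maintenance plugin on the classpath, and the method-scanning fallback is elided.

```java
import java.lang.reflect.Method;

public class OldReflectionSketch {

    // Hypothetical stand-in for the Maintenance plugin's API object
    static class FakeMaintenanceAPI {
        public boolean isMaintenance() { return false; }          // global flag
        public boolean isMaintenance(String serverName) {         // per-server flag
            return "lobby".equals(serverName);
        }
    }

    // Every call performs fresh reflective lookups -- this is the costly part
    static boolean isServerInMaintenance(Object api, String serverName) {
        try {
            // 1. global maintenance check, resolved reflectively on every call
            Method global = api.getClass().getMethod("isMaintenance");
            if ((boolean) global.invoke(api)) return true;

            // 2. server-specific check by name, another lookup on every call
            Method byName = api.getClass().getMethod("isMaintenance", String.class);
            return (boolean) byName.invoke(api, serverName);
            // 3. (elided) the old code would now scan every API method for a match
        } catch (ReflectiveOperationException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        Object api = new FakeMaintenanceAPI();
        System.out.println(isServerInMaintenance(api, "lobby"));    // true
        System.out.println(isServerInMaintenance(api, "survival")); // false
    }
}
```

Even this trimmed-down version does two reflective lookups per call; the real thing did more.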

I had basically built a tiny runtime method-discovery engine... inside a system that was supposed to be as performant as possible. Can't say I did very well at that.

If you want to take a proper look at the old utility file, you can find it here.

So how did I fix this?

If you're a little technical, you might have been screaming at me in disappointment, asking why I hadn't thought of caching... And for that I don't really have an answer. I could have designed this to be performant from the start, but instead I chose to ship to production as fast as possible.

Now how did I go about the caching system? To start I added a small helper that resolves a method once and stores it in a ConcurrentHashMap, keyed by the API class. Subsequent calls reuse the cached Method instead of performing another reflective lookup.

private static Method getCachedMethod(Map<Class<?>, Method> cache, Class<?> targetClass, String methodName, Class<?>... paramTypes) {
    // Resolve the Method once per API class; later calls hit the cache
    // instead of doing another reflective lookup
    return cache.computeIfAbsent(targetClass, key -> {
        try {
            return key.getMethod(methodName, paramTypes);
        } catch (NoSuchMethodException ignored) {
            return null;
        }
    });
}

An example of how to use it:

Method stringMaintenanceMethod = getCachedMethod(IS_MAINTENANCE_STRING_CACHE, apiClass, "isMaintenance", String.class);
if (stringMaintenanceMethod != null) {
    try {
        return (boolean) stringMaintenanceMethod.invoke(maintenanceAPI, serverName);
    } catch (Exception ignored) {
        // Ignore and continue
    }
}

I then replaced every other reflection call with this caching system instead.
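Put together, the rewritten check looks roughly like the sketch below. The cache maps and FakeMaintenanceAPI are illustrative stand-ins (the real code talks to the actual Maintenance API); the point is that each Method is resolved once per API class and then reused.

```java
import java.lang.reflect.Method;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class CachedMaintenanceCheck {

    // One cache per method signature, keyed by the API class
    private static final Map<Class<?>, Method> IS_MAINTENANCE_CACHE = new ConcurrentHashMap<>();
    private static final Map<Class<?>, Method> IS_MAINTENANCE_STRING_CACHE = new ConcurrentHashMap<>();

    // Hypothetical stand-in for the real Maintenance API object
    static class FakeMaintenanceAPI {
        public boolean isMaintenance() { return false; }
        public boolean isMaintenance(String server) { return "lobby".equals(server); }
    }

    private static Method getCachedMethod(Map<Class<?>, Method> cache, Class<?> targetClass,
                                          String methodName, Class<?>... paramTypes) {
        return cache.computeIfAbsent(targetClass, key -> {
            try {
                return key.getMethod(methodName, paramTypes);
            } catch (NoSuchMethodException ignored) {
                return null;
            }
        });
    }

    static boolean isServerInMaintenance(Object api, String serverName) {
        Class<?> apiClass = api.getClass();
        try {
            // global flag first; the Method is resolved once, then reused
            Method global = getCachedMethod(IS_MAINTENANCE_CACHE, apiClass, "isMaintenance");
            if (global != null && (boolean) global.invoke(api)) return true;

            // then the per-server overload, also served from the cache
            Method byName = getCachedMethod(IS_MAINTENANCE_STRING_CACHE, apiClass,
                    "isMaintenance", String.class);
            return byName != null && (boolean) byName.invoke(api, serverName);
        } catch (ReflectiveOperationException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        Object api = new FakeMaintenanceAPI();
        System.out.println(isServerInMaintenance(api, "lobby")); // true
    }
}
```

One caveat worth knowing: ConcurrentHashMap.computeIfAbsent does not record a mapping when the function returns null, so a missing overload would be re-looked-up on every call. Caching a sentinel Method (or wrapping in Optional) would avoid that; for overloads that do exist, the cache works as intended.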

The Results

In isolation, this change reduced reconnect CPU cost by ~35% in a synthetic test with 10,000 simulated players on an M2 machine.

Cleaning Up Player State and Memory Retention

The other problem Velocity Limbo Handler had was high memory usage. Looked at blindly, you wouldn't think of it as an issue. But compared to how little data the plugin actually needs to store, it was quite high.

I used to store the players in queue like this:

private final Map<Player, String> playerData;
private final Map<String, Queue<Player>> reconnectQueues = new ConcurrentHashMap<>();

At first glance, this doesn't really look that bad. But if we look at when and how the stored players are used, you quickly realise we're storing too much. The problem isn't just size: a Player object references connection state, permissions, and proxy internals. Keeping it in maps means the JVM has to retain that entire object graph, even after the player disconnects.

A typical check involving reconnectQueues or playerData looks something like this:

return playerData.containsKey(player);
 
or
 
String serverName = this.playerData.get(player);

If you think about this for a moment, you come to the realisation: "Why do we need to store the whole Player object, when all we're doing is binding it to other data?" That's exactly why it's overkill in our case. It unnecessarily increases memory usage, since we're storing far too much data per player. At small scales this isn't really an issue. But scale it up to 1,000 players... 10,000... and it quickly becomes a lot of data held in memory, seemingly for no reason.

Another problem I found was that we weren't cleaning up properly after a player disconnects. There is a method, pruneInactivePlayers(), which, as the name implies, cleans up data for players who are no longer active. But it only cleaned the queue and playerData maps. It didn't touch the other maps: connectingPlayers and playerConnectionIssues.

Over time, this caused stale entries to accumulate, slowly increasing retained heap on long-running servers.

The Fix

Now, these issues weren't anywhere near as big as the reflection issue. But a lot of small issues can quickly add up. Fixing the memory usage was, of course, also a lot simpler.

First, I fixed player data being tied to the Player object by simply keying it on the player's UUID instead. It's just as unique, but far less data to retain.

So all of the player-tied data was simply changed to:

private final Map<UUID, String> playerData;
private final Map<UUID, Boolean> connectingPlayers;
private final Map<String, Queue<UUID>> reconnectQueues = new ConcurrentHashMap<>();
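With the maps keyed by UUID, the call sites change only slightly: callers pass player.getUniqueId() instead of the Player itself. A minimal sketch of the idea (class and method names here are illustrative, not the plugin's actual API):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

// A UUID is just 128 bits, so these maps no longer pin the whole
// Player object graph (connection state, permissions, proxy internals)
public class UuidKeyedState {

    private final Map<UUID, String> playerData = new HashMap<>();

    // Before: playerData.containsKey(player) on a Map<Player, String>
    // After: the caller passes player.getUniqueId()
    boolean isQueued(UUID playerId) {
        return playerData.containsKey(playerId);
    }

    void queue(UUID playerId, String serverName) {
        playerData.put(playerId, serverName);
    }

    String targetServer(UUID playerId) {
        return playerData.get(playerId);
    }

    public static void main(String[] args) {
        UuidKeyedState state = new UuidKeyedState();
        UUID id = UUID.randomUUID();
        state.queue(id, "survival");
        System.out.println(state.isQueued(id));     // true
        System.out.println(state.targetServer(id)); // survival
    }
}
```

A nice side effect is that UUID has stable equals/hashCode semantics, so map behavior no longer depends on how the proxy implements Player equality.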

The other issue with pruneInactivePlayers() was fixed by applying the same cleanup logic we already used for reconnectQueues to the other maps as well. The old version looked like this:

for (Queue<Player> queue : reconnectQueues.values()) {
    queue.removeIf(p -> !p.isActive());
}

Then of course it was changed to also work with the new UUID system:

for (Queue<UUID> queue : reconnectQueues.values()) {
    queue.removeIf(playerId -> VelocityLimboHandler.getProxyServer()
        .getPlayer(playerId)
        .map(player -> !player.isActive())
        .orElse(true));
}
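Extending that same pass to the maps the old code missed looks roughly like the sketch below. It's a self-contained illustration, not the plugin's actual code: the isActive predicate stands in for the real proxy lookup, and the value types of connectingPlayers and playerConnectionIssues are assumptions.

```java
import java.util.Map;
import java.util.Queue;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.function.Predicate;

public class PruneSketch {

    final Map<String, Queue<UUID>> reconnectQueues = new ConcurrentHashMap<>();
    final Map<UUID, Boolean> connectingPlayers = new ConcurrentHashMap<>();
    final Map<UUID, Integer> playerConnectionIssues = new ConcurrentHashMap<>();

    void pruneInactivePlayers(Predicate<UUID> isActive) {
        // the queues, as before
        for (Queue<UUID> queue : reconnectQueues.values()) {
            queue.removeIf(id -> !isActive.test(id));
        }
        // the maps the old code forgot -- same predicate, applied to the keys
        connectingPlayers.keySet().removeIf(id -> !isActive.test(id));
        playerConnectionIssues.keySet().removeIf(id -> !isActive.test(id));
    }

    public static void main(String[] args) {
        PruneSketch s = new PruneSketch();
        UUID online = UUID.randomUUID();
        UUID offline = UUID.randomUUID();
        s.reconnectQueues.put("lobby", new ConcurrentLinkedQueue<>());
        s.reconnectQueues.get("lobby").add(offline);
        s.connectingPlayers.put(offline, true);
        s.playerConnectionIssues.put(online, 2);

        // only "online" survives the prune
        s.pruneInactivePlayers(id -> id.equals(online));

        System.out.println(s.connectingPlayers.isEmpty());   // true
        System.out.println(s.playerConnectionIssues.size()); // 1
    }
}
```

Running all three cleanups in one pass keeps every player-related structure in sync, which is what makes the leak go away on long-running servers.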

Results

Now, these memory fixes actually had quite the impact. At first I didn't think of them as a big deal, but running the profiler again showed quite the contrary: memory usage was down by 50% at 10,000 simulated players.

Final Results

Across these changes:

  • Reconnect CPU cost dropped by ~35% overall
  • Memory usage dropped by ~50% under 10,000 simulated players
  • Reflection was removed from hot paths
  • Player lifecycle cleanup became predictable and leak-free

In practice, this means the plugin handles large queues far more efficiently, with lower baseline resource usage and fewer spikes during reconnect storms, all without changing external behavior or configuration.

Lessons Learned

Most of these problems weren't caused by complex algorithms. They came from small design decisions made early:

  • flexible integrations using reflection instead of stable call paths
  • storing heavyweight objects where lightweight identifiers would do
  • cleanup logic that worked in most cases, but not all

None of these mistakes hurt at smaller scale, but when scaling up to thousands of players, they compound very quickly.

So the lesson I've learned from this is that performance issues rarely come from one big mistake. Usually they come from many tiny bad decisions, which combine and cause issues.

Aksel Glyholt, Software Engineer
