Solving Node.js Memory Leaks in High-Traffic Rate Limiters: A Key Software Engineering KPI

Illustration of a Node.js memory leak, showing a rising memory graph and a warning sign.

The Challenge: Unbounded Memory Growth in High-Volume Node.js Services

Running high-volume Node.js microservices presents unique challenges, especially when it comes to memory management. A recent discussion on GitHub highlighted a common, yet insidious, problem: a memory leak in a long-running Node.js 20 service processing around 5,000 events per second. The service, built with Fastify 4 and deployed on AWS ECS with a 2GB memory limit, saw its heap memory climb from an initial ~180MB to over 2GB within 6-8 hours, leading to eventual crashes.

The author, liya-daisuki, noted that this issue was not reproducible in staging environments due to lower traffic, making diagnosis difficult. Despite auditing event listeners, ensuring DB connection releases, and even attempting manual garbage collection with global.gc(), the heap continued its relentless growth. Heap snapshot diffs ultimately pointed to a Map within a rate-limiter middleware accumulating entries that were never properly evicted.

An initial attempt to mitigate this involved a TTL-based cleanup interval, but it only slowed the leak, failing to stop it. The suspected culprit was high key cardinality (unique client IPs), causing the Map to grow faster than the cleanup could process:

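// rateLimiter is the middleware's Map of client key -> last-seen timestamp (ms);
// TTL is the entry lifetime in ms. Both are defined elsewhere in the service.
setInterval(() => {
  const now = Date.now();
  for (const [key, ts] of rateLimiter) {
    // Drop entries that have not been refreshed within the TTL.
    if (now - ts > TTL) rateLimiter.delete(key);
  }
}, 60_000); // sweep once a minute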
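One way to reason about the suspected cause is to replay the sweep on simulated time. The sketch below is not from the discussion: the function name, its parameters, and the traffic model (a fixed number of never-before-seen keys per second) are all assumptions made for illustration.

```javascript
// Hypothetical simulation: each loop iteration is one second of traffic that
// introduces `newKeysPerSecond` client keys that have never been seen before.
function simulatePeakMapSize({ newKeysPerSecond, ttlMs, sweepEveryMs, seconds }) {
  const rateLimiter = new Map();
  let peak = 0;
  let nextId = 0;
  for (let s = 1; s <= seconds; s++) {
    const now = s * 1000;
    for (let i = 0; i < newKeysPerSecond; i++) {
      rateLimiter.set(`client-${nextId++}`, now);
    }
    peak = Math.max(peak, rateLimiter.size);
    // Same sweep logic as the interval above, driven by simulated time.
    if (now % sweepEveryMs === 0) {
      for (const [key, ts] of rateLimiter) {
        if (now - ts > ttlMs) rateLimiter.delete(key);
      }
    }
  }
  return peak;
}
```

In this model the peak size scales with the arrival rate of new keys multiplied by roughly the TTL plus the sweep interval. Nothing in the configuration caps it, so the ceiling is set by the traffic rather than by the service.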
Layered caching strategy with local LRU cache and Redis for distributed rate limiting.

The Solution: LRU Caching and Layered Redis for Robust Rate Limiting

Fellow community member zha0090 provided a comprehensive solution based on similar past experiences, emphasizing that a raw Map with a cleanup interval is often insufficient for high-scale traffic. The fix involved a two-pronged approach:

1. Ditch Raw Map for an LRU Cache with a Hard Cap

The first crucial step was to replace the standard JavaScript Map with an LRU (Least Recently Used) cache. Unlike a plain Map, which keeps growing for as long as new keys arrive, an LRU cache with a hard size cap automatically evicts the least recently used entries once the limit is reached. This keeps memory usage bounded and predictable, a vital software engineering KPI for system stability.

zha0090 recommended the lru-cache library, which provides O(1) eviction and lazy cleanup on access, making it highly efficient:

import { LRUCache } from 'lru-cache';

const rateLimiter = new LRUCache({
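  max: 100_000, // hard cap; least recently used entries are evicted automatically
  ttl: 60_000, // entries become stale 60 seconds after they are set
  ttlAutopurge: false, // no per-entry timers; stale entries are dropped lazily on access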
});
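To make the effect of the hard cap concrete, here is a minimal sketch of the idea built on Map insertion order. It is illustrative only: TinyLRU is a hypothetical class, it has no TTL support, and it is not how lru-cache is implemented internally.

```javascript
// Illustrative only: a tiny capped LRU that relies on Map keeping insertion order.
class TinyLRU {
  constructor(max) {
    this.max = max;
    this.map = new Map();
  }

  get(key) {
    if (!this.map.has(key)) return undefined;
    const value = this.map.get(key);
    // Re-insert so the key becomes the most recently used.
    this.map.delete(key);
    this.map.set(key, value);
    return value;
  }

  set(key, value) {
    if (this.map.has(key)) this.map.delete(key);
    this.map.set(key, value);
    if (this.map.size > this.max) {
      // The first key in iteration order is the least recently used.
      const oldest = this.map.keys().next().value;
      this.map.delete(oldest);
    }
  }

  get size() {
    return this.map.size;
  }
}
```

However many distinct keys arrive, size never exceeds max, which is the property the raw Map lacked.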

2. Layered Caching with Redis for Distributed Environments

The second part of the solution addressed the challenge of distributed microservices, particularly relevant for deployments on platforms like AWS ECS. In such environments, each container instance holds its own in-process state, meaning a single client could bypass rate limits by hitting different instances. To solve this, zha0090 moved the authoritative counter to Redis, utilizing a sorted set for a sliding window approach, batched into a single pipeline for low latency.
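The discussion does not include the Redis code, so the following is only a sketch of what a sorted-set sliding window batched into one pipeline could look like. It assumes an ioredis-style client, where pipeline() returns a chainable object and exec() resolves to an array of [error, result] pairs. The function name, the key prefix, and the option names are hypothetical.

```javascript
// Sketch under assumptions: `redis` is an ioredis-style client passed in by the caller.
async function slidingWindowAllow(redis, clientKey, { limit, windowMs, now = Date.now() }) {
  const key = `ratelimit:${clientKey}`; // hypothetical key naming
  // Members must be unique, so add a random suffix to the timestamp.
  const member = `${now}:${Math.random().toString(36).slice(2)}`;

  const replies = await redis
    .pipeline()
    .zremrangebyscore(key, 0, now - windowMs) // drop hits that fell out of the window
    .zadd(key, now, member) // record this hit, scored by its timestamp
    .zcard(key) // count the hits remaining in the window
    .pexpire(key, windowMs) // let idle keys expire on their own
    .exec();

  const [countErr, count] = replies[2]; // reply of the ZCARD command
  if (countErr) throw countErr;
  // Rejected requests are still recorded, so a client that keeps hammering stays blocked.
  return count <= limit;
}
```

A plain pipeline saves round trips but is not atomic. If interleaving with other clients matters, the same chain can be started with multi() in ioredis, or moved into a Lua script.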

However, hitting Redis on every request at 5k/sec can introduce significant overhead. The optimization was to layer the caching: check the local LRU cache first, and consult Redis only when the client is not already marked as blocked locally. Blocked IPs are cached locally for a short duration (e.g., ~10 seconds), so repeat requests from them never reach Redis. zha0090 reported that this cut Redis calls by 60-80% in practice, because repeat offenders often dominate real-world traffic patterns.
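The layering can be sketched as a small wrapper around the authoritative check. This is an assumption-laden illustration, not zha0090's code: createLayeredLimiter, checkRedis, blockMs, and maxBlocked are hypothetical names, and a size-capped Map stands in for the local LRU cache to keep the example dependency-free.

```javascript
// Sketch: `checkRedis(clientKey)` is a caller-supplied async function that
// resolves to true (allowed) or false (blocked) using the authoritative store.
function createLayeredLimiter(checkRedis, { blockMs = 10_000, maxBlocked = 100_000 } = {}) {
  const blockedUntil = new Map(); // client key -> local block expiry (ms)

  return async function isAllowed(clientKey, now = Date.now()) {
    const expiry = blockedUntil.get(clientKey);
    if (expiry !== undefined) {
      if (now < expiry) return false; // still blocked locally: skip Redis entirely
      blockedUntil.delete(clientKey); // local block expired: ask Redis again
    }

    const allowed = await checkRedis(clientKey);
    if (!allowed) {
      // Keep the local block list bounded by dropping its oldest entry.
      if (blockedUntil.size >= maxBlocked) {
        blockedUntil.delete(blockedUntil.keys().next().value);
      }
      blockedUntil.set(clientKey, now + blockMs);
    }
    return allowed;
  };
}
```

Only negative decisions are cached locally, so an instance can be at most blockMs late in unblocking a client but never grants more requests than Redis would.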

Key Takeaways for Sustainable Software Project Measurement

This discussion offers invaluable lessons for developers managing high-throughput Node.js services:

  • Beware of Unbounded Data Structures: Simple Map or array-based caches with periodic cleanup intervals are often insufficient for high-cardinality data under heavy load.
  • Embrace LRU Caches: Libraries like lru-cache provide efficient, memory-bounded, in-process caching solutions ideal for rate limiting and similar use cases.
  • Layer Your Caching: For distributed microservices, combine local in-process caches (like LRU) with external, authoritative stores (like Redis) to achieve both performance and consistency.
  • Monitor Memory Usage: Observing heap growth and stability is a critical software engineering KPI that directly reflects the health and efficiency of your service. Proactive monitoring and profiling are key to identifying and resolving memory leaks before they impact production.
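As a starting point for the monitoring takeaway, heap usage can be sampled in-process with process.memoryUsage(). The helper names, the default thresholds, and the half-window comparison below are assumptions chosen for illustration; a production setup would more likely export the samples to an existing metrics system.

```javascript
// Hypothetical helper: average of the newer half of the samples minus the
// average of the older half. Averaging smooths out the sawtooth caused by GC.
function heapGrowth(samples) {
  const half = Math.floor(samples.length / 2);
  if (half === 0) return 0;
  const avg = (xs) => xs.reduce((sum, x) => sum + x, 0) / xs.length;
  return avg(samples.slice(-half)) - avg(samples.slice(0, half));
}

// Samples heapUsed on an interval and calls onAlert when the window shows
// sustained growth. Returns a function that stops the watcher.
function startHeapWatch({
  intervalMs = 60_000,
  windowSize = 30,
  minGrowthBytes = 100 * 1024 * 1024,
  onAlert = console.warn,
} = {}) {
  const samples = [];
  const timer = setInterval(() => {
    samples.push(process.memoryUsage().heapUsed);
    if (samples.length > windowSize) samples.shift();
    if (samples.length === windowSize && heapGrowth(samples) >= minGrowthBytes) {
      onAlert(`heapUsed grew by about ${Math.round(heapGrowth(samples))} bytes across the sample window`);
    }
  }, intervalMs);
  timer.unref(); // do not keep the process alive just for monitoring
  return () => clearInterval(timer);
}
```

An alert from a watcher like this is a prompt to take heap snapshots and diff them, which is how the Map in the rate limiter was found in the first place.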
