Scaling from the Right Foundation: Introducing HPA-Aware Optimization in Akamas Insights

March 24, 2026

Modern Kubernetes environments rely heavily on the Horizontal Pod Autoscaler (HPA) and tools like KEDA to manage dynamic workloads. Autoscaling is a powerful mechanism for handling peak traffic. However, autoscaling alone does not guarantee efficiency. In practice, scaling from a poorly configured resource baseline simply amplifies underlying inefficiencies across your infrastructure.

Today, we are thrilled to announce native support for optimizing HPA-managed workloads in Akamas Insights, bringing full-stack, autonomous optimization to your most dynamic and business-critical services.

The Hidden Tax of Over-Provisioning and Manual Toil

Platform Engineers and SREs are intimately familiar with the “just to be safe” tax. To prevent CPU throttling or agonizingly slow scale-ups during sudden traffic spikes, engineering teams routinely over-provision pod replicas, CPU requests, and memory limits. While this brute-force approach might temporarily protect uptime, it erodes cluster density and drives up cloud costs.

Furthermore, this leads to the Scaling Paradox, where the very mechanism intended to provide stability introduces new vectors of failure. Teams often struggle to find the right balance between vertical and horizontal scaling, for example whether to deploy many small pods or fewer large ones. The problem is compounded because the HPA and the Vertical Pod Autoscaler (VPA) conflict when both act on the same CPU or memory metrics, and because the VPA has no awareness of application runtimes. The result is often misaligned resource requests that trigger instability rather than resolving it.

Trying to solve this by manually tuning complex runtimes alongside HPA introduces massive operational toil. Application runtimes like the JVM or Node.js were not inherently designed for volatile, resource-constrained container environments. When CPU and memory requests are misaligned with real workload behavior, frequent scaling leads to instability. Teams experience pod flapping, severe cold-start performance penalties during JIT compilation, and degraded reliability exactly when the application needs to perform at its peak.

The Akamas Way: Orchestrating a High-Performance Foundation

Akamas Insights now brings native optimization directly to workloads governed by HPA. Rather than modifying your carefully crafted scaling thresholds or HPA policies, the platform optimizes the foundational configuration from which the scaling occurs.

By right-sizing CPU and memory requests, limits, and runtime configurations, Akamas ensures that your HPA scales from a highly efficient and stable starting point. This guarantees a safe and predictable rollout where your HPA continues to operate exactly as intended, but with drastically improved resource utilization.
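To see why the baseline matters, recall that for a CPU utilization target the HPA expresses usage as a percentage of each pod's CPU request, then scales by the ratio of current to target utilization. The sketch below is a minimal illustration of that documented Kubernetes rule, not of Akamas internals; the function names and the example figures are hypothetical.

```python
import math


def cpu_utilization_percent(avg_usage_millicores: float,
                            request_millicores: float) -> float:
    """Utilization as the HPA sees it: usage relative to the CPU *request*."""
    return 100.0 * avg_usage_millicores / request_millicores


def desired_replicas(current_replicas: int,
                     current_utilization: float,
                     target_utilization: float,
                     tolerance: float = 0.1) -> int:
    """Kubernetes HPA rule: ceil(currentReplicas * current / target).

    No change is made while the ratio stays within the tolerance band
    (0.1 is the Kubernetes default).
    """
    ratio = current_utilization / target_utilization
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    return math.ceil(current_replicas * ratio)


if __name__ == "__main__":
    # Hypothetical workload: 4 replicas, 400m average CPU usage per pod,
    # 70% utilization target. Only the CPU request differs.
    for request in (1000, 500):
        util = cpu_utilization_percent(400, request)
        print(request, util, desired_replicas(4, util, 70))
```

With a 1000m request the same load reads as 40% utilization and the rule yields 3 replicas; with a 500m request it reads as 80% and the rule yields 5. The request is the denominator of the scaling signal, so a request that does not match real usage shifts the point at which the HPA reacts even when the thresholds themselves are untouched.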

Deep Dive: How HPA-Aware Optimization Works

Our latest release goes beyond simple resource tracking to synchronize your vertical configuration with your horizontal scaling strategy.

Uncovering Hidden Inefficiencies in Scaling Behavior

Many HPA-related performance issues are invisible to standard monitoring. Akamas proactively identifies friction points that are notoriously hard to detect manually, such as:

  • CPU throttling during application initialization (such as JVM JIT compilation) that causes new replicas to lag, forcing the HPA to spin up even more unnecessary pods.
  • Scaling thresholds set too high to trigger before performance degrades, or max replica limits set too low to safely absorb peak traffic.
  • Container resource limits that are at odds with the application runtime’s own memory management or garbage collection settings.
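As a rough illustration of what such checks involve when done by hand, the sketch below covers one simple heuristic per item. It is not how Akamas performs its analysis. It assumes throttling counters in the key/value format of the Linux cgroup `cpu.stat` file (`nr_periods`, `nr_throttled`); the function names, the per-pod throughput figure, and the 256 MiB non-heap allowance are hypothetical.

```python
import math


def throttled_ratio(cpu_stat_text: str) -> float:
    """Fraction of CPU quota periods in which the container was throttled."""
    stats = {}
    for line in cpu_stat_text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            stats[parts[0]] = int(parts[1])
    periods = stats.get("nr_periods", 0)
    return stats.get("nr_throttled", 0) / periods if periods else 0.0


def replicas_needed_at_peak(peak_rps: float, per_pod_rps: float) -> int:
    """Replicas required to absorb peak load at an assumed per-pod capacity."""
    return math.ceil(peak_rps / per_pod_rps)


def heap_exceeds_limit(max_heap_mib: int,
                       memory_limit_mib: int,
                       non_heap_overhead_mib: int = 256) -> bool:
    """True when max heap plus an assumed non-heap allowance overruns the limit."""
    return max_heap_mib + non_heap_overhead_mib > memory_limit_mib


if __name__ == "__main__":
    sample = "nr_periods 200\nnr_throttled 50\nthrottled_usec 12345\n"
    print(throttled_ratio(sample))              # share of throttled periods
    print(replicas_needed_at_peak(950, 100))    # compare against maxReplicas
    print(heap_exceeds_limit(2048, 2048))       # heap sized equal to the limit
```

Each check needs data from a different layer (kernel counters, load tests, runtime flags), which is part of why these issues are hard to spot with standard monitoring alone.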

Precision Recommendations for Stable Scaling

Once these risks are identified, Akamas provides a unified recommendation to stabilize your workloads. Instead of guessing at pod sizes or HPA configurations, Akamas determines the precise mix of CPU/memory requests, runtime parameters, and scaling thresholds required to meet your specific goals.

This full-stack optimization approach delivers:

  • Maximized Efficiency: Akamas tunes min/max replicas and scaling thresholds to ensure you aren’t running idle capacity, significantly increasing cluster density and reducing cloud spend.
  • Guaranteed Performance: By optimizing the “unit of scale” (the pod) alongside the scaling policy, Akamas ensures that every new replica is stable and ready to handle its load immediately. This eliminates the performance dips that occur when unoptimized runtimes struggle under sudden stress.

The Impact of a Right-Sized Foundation

By addressing the entire stack as a single, interconnected system, organizations can transform their Kubernetes efficiency. Akamas treats pod sizing, JVM/Node.js tuning, and HPA configuration as a unified challenge rather than isolated tasks. Early data indicates that optimizing the HPA baseline allows platform teams to achieve up to a 35% reduction in compute waste and significantly higher cluster density. More importantly, this is achieved while eliminating the manual toil of tuning runtimes and HPA thresholds separately. This approach ensures 99.9% stable scaling and prevents the performance degradation and start-up incidents that occur when unoptimized runtimes crash under the stress of rapid scaling.

Instead of spending hours manually tuning JVM parameters, sizing pods, and adjusting HPA thresholds in isolation, Akamas enables your team to optimize across the entire stack. This unified, automated framework ensures that your infrastructure is not just scaling, but scaling on a foundation built for peak performance and reliability.

Ready to eliminate the “just to be safe” tax and stabilize your autoscaling? Start your free trial of Akamas today and see how HPA-aware optimization can transform your cluster efficiency.
