Unpacking GitHub Actions Delays: When Self-Hosted Runners Go Idle But Workflows Stay Queued

In the fast-paced world of software development, continuous integration and continuous delivery (CI/CD) pipelines are the lifeblood of modern engineering teams. They are the engines that drive rapid iteration, ensure code quality, and ultimately, accelerate time to market. When these pipelines stall, even for a few minutes, the ripple effect can significantly impact team productivity, delivery schedules, and overall development efficiency. A recent GitHub Community discussion, initiated by user shurikovyy, brings to light a particularly frustrating intermittent issue: self-hosted GitHub Actions runners appearing Online/Idle, yet workflow runs remain stubbornly Queued and unassigned.

This isn't just a minor glitch; it's a critical bottleneck that can halt deployments, delay feature releases, and erode confidence in automated processes. For dev team members, product managers, and CTOs alike, understanding the root cause and potential mitigations for such issues is paramount to maintaining high-performing teams and predictable delivery.

The Mystery of the Stalled Workflow: An Unseen Bottleneck

The core of the problem, first noticed around February 2026, involves GitHub Actions workflows failing to start despite an available self-hosted runner. The workflow run stays Queued, with no runner assigned (indicated by runner_id=0 in job details), even though the runner is reported as online and busy=false via the GitHub API. These queued runs do not resolve themselves; they only get picked up after a manual intervention, such as triggering a re-run, using workflow_dispatch, or pushing another commit. This manual intervention, while effective, introduces an unacceptable delay and undermines the very purpose of automation, directly impacting development efficiency.
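The stuck state the report describes is fully visible in the API: a queued job with no runner assigned while an eligible runner sits online and idle. As a sketch only, the Python function below flags that combination after a grace period. The payload shapes and the five-minute threshold are assumptions modeled on the fields cited in the discussion (a job's runner_id, a runner's status and busy flag), not an official schema:

```python
from datetime import datetime, timedelta, timezone

STALL_THRESHOLD = timedelta(minutes=5)  # assumed cutoff; tune to taste

def is_stalled(job: dict, runners: list[dict], now: datetime) -> bool:
    """Return True if a queued job has had no runner assigned past the
    threshold while at least one matching runner is online and idle.

    Assumed shapes, mirroring the fields the report cites:
      job    = {"status": "queued", "runner_id": 0,
                "queued_at": "<ISO 8601>", "labels": [<str>, ...]}
      runner = {"status": "online", "busy": False,
                "labels": [{"name": <str>}, ...]}
    """
    # Only a queued job with no runner assigned (runner_id == 0) qualifies.
    if job["status"] != "queued" or job.get("runner_id", 0) != 0:
        return False
    # Give the scheduler a grace period before calling it stalled.
    queued_at = datetime.fromisoformat(job["queued_at"])
    if now - queued_at < STALL_THRESHOLD:
        return False
    # Stalled only if some online, non-busy runner carries every label
    # the job requests (e.g. [self-hosted, Linux, X64]).
    wanted = set(job.get("labels", []))
    for runner in runners:
        idle = runner["status"] == "online" and not runner["busy"]
        names = {label["name"] for label in runner.get("labels", [])}
        if idle and wanted <= names:
            return True
    return False
```

Fed the numbers from the case study below (queued at 11:45:00Z, still unassigned eight minutes later, one idle matching runner), this would report a stall.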

A Case Study in Frustration

Shurikovyy provided a detailed example from February 10, 2026, involving a private repository IcoverLLC/AirFlow and a workflow deploy-airflow-dags.yml. A push to the develop branch created Run 21863545638 at 11:45:00Z. For approximately 7-8 minutes, this run remained queued while the designated airflow runner (version 2.331.0, running as a systemd service with labels [self-hosted, Linux, X64]) was online and idle. The runner logs confirmed it did not receive any job request until 11:52:52Z, after a manual workflow_dispatch was initiated to unblock the process. This clearly demonstrates a disconnect between GitHub's scheduling system and the runner's availability.

This isn't an isolated incident; it's an intermittent problem that has been observed for weeks. Such unpredictability makes it challenging for teams to rely on their CI/CD pipelines, forcing them to implement manual checks or workarounds that drain valuable engineering time.

Developer troubleshooting a queued GitHub Actions workflow with diagnostic tools

Deep Dive into Diagnostics: Ruling Out Local Issues

What makes this issue particularly insidious is that all local indicators point to a healthy runner environment. Shurikovyy's team undertook a commendable diagnostic effort, leveraging multiple data sources to systematically rule out common culprits:

  • GitHub API View: Confirmed the run as queued, job as queued with runner_id = 0, and the runner as online and busy=false via gh api.
  • Runner Internal Logs: Showed no activity related to the queued job until it finally acknowledged a request at 11:52:52Z.
  • Host Networking Snapshot: Confirmed stable DNS resolution, successful curl connections to the Actions broker, and no increases in NIC errors or drops.
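That ruling-out process is, in effect, a small decision tree. The sketch below expresses it as one, under the assumption that each data source has already been reduced to a boolean (for example, by parsing gh api output and the runner's logs); the verdict strings are illustrative, not GitHub terminology:

```python
def diagnose(run_queued: bool, runner_idle: bool,
             runner_saw_request: bool, network_ok: bool) -> str:
    """Collapse the three diagnostic views into a coarse verdict.

    run_queued         -- API shows the run queued with runner_id == 0
    runner_idle        -- API shows an eligible runner online, busy == False
    runner_saw_request -- runner's own logs show a job request arrived
    network_ok         -- DNS, broker connectivity, and NIC counters clean
    """
    if not run_queued:
        return "no problem: run was assigned"
    if not network_ok:
        return "local: investigate host networking"
    if not runner_idle:
        return "capacity: no eligible idle runner"
    if runner_saw_request:
        return "local: runner received the job but did not start it"
    # Queued run + idle runner + clean network + no request delivered
    # is the combination that points at the service side.
    return "platform: suspect GitHub-side dispatch/broker delay"
```

Plugging in the reported observations (run queued, runner idle, no job request in the runner logs, networking clean) lands on the platform-side verdict, matching the conclusion drawn below.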

The conclusion from these meticulous diagnostics is clear: while the run was queued, GitHub’s API showed an eligible runner online and idle, and the runner host had working network connectivity to the Actions broker. However, the job remained unassigned for several minutes, and the runner did not receive any job request until it was manually unblocked. This strongly suggests a delay or issue within GitHub's job dispatch, broker messaging, or scheduling mechanisms, rather than a local runner or network outage.

Diagram illustrating a communication breakdown between GitHub Actions broker and self-hosted runner

The GitHub Actions Broker/Scheduler Hypothesis

Given the evidence, the problem appears to reside squarely within the GitHub Actions service itself. Specifically, the questions raised by the community member point to potential architectural weak points:

  • Are there known issues where long-poll sessions might become stale, preventing job dispatch?
  • Could there be an intermittent scheduling backlog that delays assignment even when resources are available?

For engineering leaders, understanding these internal mechanisms is crucial. When a platform's core scheduling logic exhibits such intermittent failures, it directly impacts the predictability of CI/CD, making it harder to reason about development metrics and project timelines. This kind of unpredictability can lead to a loss of trust in automated systems, forcing teams to allocate resources to manual monitoring and intervention, a direct hit to development efficiency.

Impact on Delivery & Productivity: More Than Just a Delay

While 7-8 minutes might seem like a minor delay in isolation, its cumulative impact on development efficiency can be substantial. For teams practicing continuous deployment, even short, unpredictable stalls can:

  • Delay Deployments: Critical fixes or features can be held up, impacting users and business objectives.
  • Waste Developer Time: Engineers are forced to context-switch, manually check pipeline statuses, and re-run workflows.
  • Erode Trust: Unreliable pipelines lead to a lack of confidence in automation.
  • Complicate Metrics: Development metrics such as lead time and deployment frequency become skewed and harder to trust.

For delivery managers and CTOs, such issues translate directly into missed deadlines, increased operational overhead, and a tangible drag on overall team output. It underscores the importance of robust, transparent tooling that provides clear insights into its operational state, especially when it comes to self-hosted components.

What's Needed from GitHub: Towards a Resolution

The community discussion highlights a clear need for GitHub's intervention and deeper insights. The user's request for "additional diagnostics" from GitHub's side is critical. As senior tech leaders, we understand that complex distributed systems like GitHub Actions have many moving parts. What appears as an "idle" runner from the API might be experiencing a transient communication issue with the broker, or the broker itself might be under load or experiencing a bug in its scheduling algorithm.

Transparency and collaboration are key. GitHub's product teams need to investigate backend logs and internal metrics to pinpoint why job dispatch is not happening promptly. This isn't just about fixing a bug; it's about ensuring the reliability of a foundational platform that underpins the development efficiency of countless organizations.

Until a definitive solution is provided, teams experiencing this issue are left with manual interventions, which is far from ideal. The ability to trust that an available runner will immediately pick up a queued job is fundamental to maintaining a smooth, efficient CI/CD pipeline and fostering a productive developer experience.
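In the meantime, some teams automate the manual nudge itself. The sketch below is one possible watchdog policy, not an official workaround: it assumes a fetch_queued_runs callable (for example, a wrapper around gh api or a GitHub API client) that yields run dicts with hypothetical "id" and "queued_at" fields, and it only decides which runs to re-trigger, leaving the actual re-run or workflow_dispatch to the caller:

```python
from datetime import datetime, timedelta, timezone
from typing import Callable, Iterable

NUDGE_AFTER = timedelta(minutes=5)  # assumed patience before re-triggering

def find_runs_to_nudge(
    fetch_queued_runs: Callable[[], Iterable[dict]],
    now: datetime,
) -> list[int]:
    """Return IDs of runs that have sat queued longer than NUDGE_AFTER.

    Each run dict is assumed to carry {"id": int, "queued_at": "<ISO 8601>"}.
    The caller would then re-trigger those runs (a re-run or a
    workflow_dispatch) -- the same manual unblocking step the report
    describes, just performed by a scheduled job instead of a human.
    """
    stale = []
    for run in fetch_queued_runs():
        queued_at = datetime.fromisoformat(run["queued_at"])
        if now - queued_at >= NUDGE_AFTER:
            stale.append(run["id"])
    return stale
```

Injecting the fetcher keeps the policy testable offline and makes it easy to swap in a real API call later; a cron job or a scheduled workflow on a different runner pool could invoke it every few minutes. It treats the symptom, not the cause, but it removes the human from the loop until GitHub addresses the dispatch delay.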

Conclusion: Prioritizing Predictable CI/CD for Peak Performance

The GitHub Actions "queued but idle" runner dilemma is a stark reminder that even the most advanced platforms can have their intermittent challenges. For organizations striving for peak development efficiency and seamless delivery, the reliability of their CI/CD tooling is non-negotiable. This detailed community report serves as a valuable case study in proactive diagnostics and highlights the critical need for platform providers to offer clear explanations and robust solutions for such core operational issues.

As we push the boundaries of automation, ensuring that our tools are not just powerful but also consistently reliable becomes the ultimate measure of their value. Engineering leaders must continue to advocate for and invest in systems that provide predictable performance, allowing their teams to focus on innovation rather than troubleshooting stalled pipelines.
