Streamlining PR Workflows: Building a Non-Blocking Merge Queue for Improved Software Project Metrics
Teams that handle a high volume of pull requests (PRs) need merges to keep flowing in order to maintain velocity and healthy software project metrics. A common bottleneck arises when a single failing PR blocks the entire merge lane and stalls every PR queued behind it. This challenge was recently highlighted in a GitHub Community discussion, where developers explored building custom, non-blocking merge queues, particularly within Azure environments.
The Core Problem: Blocked Merge Lanes
The original poster sought a way to automatically detect and eject failed PRs (whether the cause is a CI failure, an unmet required check, a policy violation, or a merge conflict) without impeding other valid PRs. The goal was to maintain an ordered, non-blocking flow, complete with retry logic for transient issues and priority handling.
Native Solution First: GitHub's Merge Queue
Before diving into custom implementations, the community's first piece of advice was to evaluate GitHub's native Merge Queue. According to GitHub's documentation, it is available for public repositories owned by organizations and for private repositories owned by organizations on GitHub Enterprise Cloud. This built-in feature is designed for exactly these non-blocking requirements: it creates temporary branches to validate queued changes and automatically removes failing PRs from the queue, potentially saving significant engineering effort. If your organization has access, it is often the most straightforward option.
Building a Custom Non-Blocking Merge Queue: A "LinearB Free Alternative" Approach
When GitHub's native solution isn't an option, or if highly specific custom logic is required, the community provided robust blueprints for a custom merge queue. This approach essentially creates a "LinearB free alternative" for merge management, offering flexibility and control.
High-Level Architecture
The consensus outlines a three-piece design, often leveraging Azure services or GitHub Actions:
- Queue of PRs: A backing store (e.g., Azure Table Storage, Redis, Azure Service Bus, or even GitHub Issues) to hold PR metadata, ordered by priority or labels, including retry counts and validation status.
- Event Triggers: GitHub webhooks listening for key events like pull_request (opened, synchronize), check_suite, status, and pull_request_review.
- Sequential Processor (Worker): An Azure Function (especially Azure Durable Functions, for their retry and orchestration capabilities) or a GitHub Actions workflow that processes one PR at a time per target branch.
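As a rough illustration of the first piece, the sketch below models the queue as one priority heap per target branch. It is an in-memory stand-in only: the names QueueEntry and MergeQueue, the numeric priority scheme, and the sample PR numbers and SHAs are assumptions made for this example, and a real deployment would persist the same fields in a store such as Azure Table Storage or Redis.

```python
import heapq
import itertools
from dataclasses import dataclass, field


@dataclass(order=True)
class QueueEntry:
    # Lower priority value is processed first; seq keeps FIFO order within a priority.
    priority: int
    seq: int
    pr_number: int = field(compare=False)
    head_sha: str = field(compare=False)
    retries: int = field(default=0, compare=False)


class MergeQueue:
    """In-memory stand-in for the backing store, one heap per target branch."""

    def __init__(self):
        self._heaps = {}
        self._counter = itertools.count()

    def enqueue(self, branch, pr_number, head_sha, priority=10, retries=0):
        entry = QueueEntry(priority, next(self._counter), pr_number, head_sha, retries)
        heapq.heappush(self._heaps.setdefault(branch, []), entry)
        return entry

    def pop_next(self, branch):
        # Returns None when the branch has no queued PRs.
        heap = self._heaps.get(branch)
        return heapq.heappop(heap) if heap else None


queue = MergeQueue()
queue.enqueue("main", 101, "aaa111")
queue.enqueue("main", 102, "bbb222", priority=1)  # e.g. a hypothetical "hotfix" label
queue.enqueue("main", 103, "ccc333")
order = [queue.pop_next("main").pr_number for _ in range(3)]
print(order)  # [102, 101, 103]
```

Keeping a separate heap per branch mirrors the "one PR at a time, per target branch" rule: a worker for one branch never waits on entries that belong to another.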
Key Logic and Principles
- Always Revalidate Before Merge: To prevent race conditions and stale statuses, the worker must re-verify all required checks, mergeability, policy validations, and approvals immediately before attempting a merge.
- Decision Logic:
  - Success: If all checks pass, merge the PR via the GitHub API.
  - Permanent Failure: If there is a policy violation or merge conflict, eject the PR from the queue immediately.
  - Transient Failure: For flaky CI checks, re-queue the PR with increasing backoff delays (e.g., retry after 5, 15, then 30 minutes) up to a maximum number of attempts, then eject it.
- Non-Blocking Principle: The critical insight is that if a PR fails (permanently or transiently), the processor should immediately move to the next valid PR in the queue, ensuring the lane keeps flowing.
- State Tracking: Maintain comprehensive logging and idempotency for auditability and reliability.
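The principles above can be sketched as a single pass over the queue. This is a minimal illustration, not the implementation from the discussion: process_queue, Outcome, and the revalidate and merge callables are hypothetical names, the delays come from the 5, 15, 30 minute example in the text, and treating an exhausted retry budget as an ejection is an assumption.

```python
from enum import Enum


class Outcome(Enum):
    READY = "ready"
    PERMANENT_FAILURE = "permanent"
    TRANSIENT_FAILURE = "transient"


# Delays in minutes, taken from the example in the text; one retry per entry.
RETRY_DELAYS_MIN = [5, 15, 30]


def process_queue(pr_numbers, revalidate, merge, retries=None):
    """Walk the queue once; a failing PR never stops the PRs behind it.

    revalidate(pr) -> Outcome and merge(pr) are caller-supplied callables
    (hypothetical here) that would wrap the GitHub API calls.
    Returns (merged, ejected, requeued), where requeued maps PR -> delay in minutes.
    """
    retries = {} if retries is None else retries
    merged, ejected, requeued = [], [], {}
    for pr in pr_numbers:
        outcome = revalidate(pr)  # always re-check immediately before merging
        if outcome is Outcome.READY:
            merge(pr)
            merged.append(pr)
        elif outcome is Outcome.TRANSIENT_FAILURE:
            attempt = retries.get(pr, 0)
            if attempt < len(RETRY_DELAYS_MIN):
                retries[pr] = attempt + 1
                requeued[pr] = RETRY_DELAYS_MIN[attempt]
            else:
                ejected.append(pr)  # retry budget exhausted (assumed policy)
        else:
            ejected.append(pr)  # policy violation or merge conflict
        # Every branch falls through to the next PR: the lane keeps flowing.
    return merged, ejected, requeued


# Usage with stubbed validation results instead of real API calls.
fake_results = {
    1: Outcome.PERMANENT_FAILURE,
    2: Outcome.TRANSIENT_FAILURE,
    3: Outcome.READY,
}
merge_calls = []
result = process_queue([1, 2, 3], fake_results.get, merge_calls.append)
print(result)  # ([3], [1], {2: 5})
```

PR 3 merges even though both PRs ahead of it failed, which is the non-blocking behavior the design is after. A production worker would also record each decision for auditability and guard the merge call against being executed twice.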
Useful GitHub APIs
Implementing this requires interaction with the GitHub REST API:
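- GET /repos/{owner}/{repo}/commits/{sha}/status: To check the combined commit status. Checks reported through the Checks API (such as GitHub Actions jobs) are listed separately via GET /repos/{owner}/{repo}/commits/{ref}/check-runs.
- GET /repos/{owner}/{repo}/pulls/{pull_number}: To check mergeability via the mergeable and mergeable_state fields. (GET /repos/{owner}/{repo}/pulls/{pull_number}/merge only reports whether the PR has already been merged.)
- PUT /repos/{owner}/{repo}/pulls/{pull_number}/merge: To perform the merge.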
// Example: Checking combined status (simplified)
GET /repos/octocat/hello-world/commits/6dcb09b5b57875f334f61aebed695e2e4193db5e/status
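To show how a worker might act on these responses, the sketch below classifies a PR from two already-parsed JSON payloads: the combined status (its state field is success, pending, or failure) and the pull request object returned by GET /repos/{owner}/{repo}/pulls/{pull_number}, which carries a mergeable field that is null while GitHub is still computing it. The function name classify, the four action labels, and the choice to treat a failure state as retryable are assumptions for illustration; the payloads are trimmed to the fields used.

```python
def classify(combined_status, pull):
    """Map the two API payloads to 'merge', 'wait', 'retry', or 'eject'."""
    mergeable = pull.get("mergeable")
    if mergeable is None:
        return "wait"  # GitHub has not finished computing mergeability yet
    if mergeable is False:
        return "eject"  # merge conflict: a permanent failure
    state = combined_status.get("state")
    if state == "success":
        return "merge"
    if state == "pending":
        return "wait"
    return "retry"  # 'failure' is treated as possibly flaky (an assumption)


# Trimmed sample payloads containing only the fields the function reads.
green = {"state": "success"}
red = {"state": "failure"}
clean_pr = {"number": 1347, "mergeable": True, "mergeable_state": "clean"}
conflicted_pr = {"number": 1348, "mergeable": False, "mergeable_state": "dirty"}
unknown_pr = {"number": 1349, "mergeable": None, "mergeable_state": "unknown"}

print(classify(green, clean_pr))       # merge
print(classify(green, conflicted_pr))  # eject
print(classify(red, clean_pr))         # retry
print(classify(green, unknown_pr))     # wait
```

Keeping the classification free of network calls makes it easy to unit test; the HTTP requests themselves can live in a thin wrapper that fetches both payloads immediately before the merge attempt.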
Conclusion
Whether through GitHub's native features or a custom-built solution, a non-blocking merge queue can improve developer productivity and give teams clearer software project metrics. By automating the handling of problematic PRs, teams get a smoother, more predictable integration process with less friction and faster delivery.