Mastering GitHub PR Review Metrics: From Search API Pitfalls to Accurate GraphQL Insights for Development Quality

Frustrated developer viewing inconsistent data, representing GitHub Search API limitations for metrics.

Unpacking GitHub PR Review Metrics: Why Your Search API Queries Might Be Misleading

Accurately tracking development quality metrics, such as the share of Pull Requests (PRs) that were reviewed before merge, is crucial for any engineering team. However, developers often encounter discrepancies when they try to gather these figures directly from GitHub's Search API. A recent discussion in the GitHub Community highlights a common pitfall and offers a robust solution.

The Problem with GitHub's Search API for PR Review Status

A developer, amitschang, sought to analyze PR review rates using the GitHub Search API. Their approach involved two queries:

  • is:pr is:merged review:approved org:{org} (for merged PRs with an approval)
  • is:pr is:merged -review:approved org:{org} (for merged PRs without an approval)

The expectation was that combining these would yield the total merged PRs, and the first query would accurately identify all approved ones. However, spot checks revealed that PRs with approvals were sometimes appearing in the 'unapproved' list, leading to incorrect development quality metrics.
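The spot check described above can be sketched in a few lines of Python. This is a minimal illustration, not the original poster's script: the function names and the placeholder organization "example-org" are assumptions, and the PR numbers would in practice come from the Search API results and from manually verified PRs.

```python
def build_review_queries(org: str) -> dict[str, str]:
    """Build the two search strings from the discussion for a given org."""
    return {
        "approved": f"is:pr is:merged review:approved org:{org}",
        "unapproved": f"is:pr is:merged -review:approved org:{org}",
    }


def misclassified(unapproved_numbers: set[int], known_approved: set[int]) -> list[int]:
    """Return PR numbers that search listed as unapproved but that are
    known (for example, from a manual check) to carry an approval."""
    return sorted(unapproved_numbers & known_approved)


# "example-org" is a placeholder, not a real organization.
queries = build_review_queries("example-org")

# Hypothetical data: PRs returned by the "unapproved" query, and PRs
# confirmed by hand to have an approving review.
suspect = misclassified({101, 102, 103}, {102, 103, 250})
```

Any number returned by misclassified is a PR where the search index and the PR's actual review history disagree.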

As community experts abbosaliboev and Gecko51 explained, the core issue lies in the nature of the Search API. It is "eventually consistent," meaning index updates lag behind the underlying data. More critically, the review:approved filter reflects the *indexed review state*, which can be stale or inconsistent, especially for older PRs. It does not necessarily mean "had an approval at merge time" or "had an approval at any point." For example, if a PR was approved, new commits were then pushed (dismissing the approval, as happens when the repository is configured to dismiss stale reviews), and the PR was then merged, the index may not reflect the state you expect.

The Solution: Leveraging the GitHub GraphQL API for Accuracy

For precise, up-to-date review statistics, the recommendation is to switch from the Search API to the GraphQL API. As the discussion put it, GraphQL queries the stored state of each PR directly instead of going through a search index, which makes it much more reliable for metrics.

Initially, abbosaliboev suggested using reviewDecision:

graphql
{
  organization(login: "your-org") {
    repositories(first: 50) {
      nodes {
        pullRequests(states: MERGED, last: 100) {
          nodes {
            reviewDecision # Returns APPROVED, CHANGES_REQUESTED, REVIEW_REQUIRED, or null
          }
        }
      }
    }
  }
}
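To show how the result of a query shaped like this one might be summarized, here is a small Python sketch. It assumes the response's data object has already been parsed into a dict; the function name, the "NONE" label for null decisions, and the sample payload are illustrative assumptions rather than anything from the discussion.

```python
from collections import Counter


def tally_review_decisions(data: dict) -> Counter:
    """Count merged PRs by reviewDecision across all returned repositories."""
    counts: Counter = Counter()
    for repo in data["organization"]["repositories"]["nodes"]:
        for pr in repo["pullRequests"]["nodes"]:
            # reviewDecision is null when no decision applies; label it "NONE".
            counts[pr["reviewDecision"] or "NONE"] += 1
    return counts


# Hypothetical payload mirroring the query's shape.
sample = {
    "organization": {
        "repositories": {
            "nodes": [
                {"pullRequests": {"nodes": [
                    {"reviewDecision": "APPROVED"},
                    {"reviewDecision": None},
                ]}},
                {"pullRequests": {"nodes": [
                    {"reviewDecision": "APPROVED"},
                    {"reviewDecision": "REVIEW_REQUIRED"},
                ]}},
            ]
        }
    }
}

decision_counts = tally_review_decisions(sample)
```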

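However, Gecko51 clarified that for the specific goal of determining whether a PR *ever* had an approval (regardless of its timing relative to the merge), the reviews(states: APPROVED) field is more appropriate. Rather than reporting a single summary decision, it filters the PR's review history for approving reviews, which gives a consistent record. One caveat worth checking against your own data: a review that has been dismissed is reported with the state DISMISSED, so an approval that was later dismissed may not match the APPROVED filter.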

Here's the refined GraphQL query to check for approvals:

graphql
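query ($cursor: String) {
  repository(owner: "your-org", name: "your-repo") {
    # Pass null as $cursor for the first page, then the previous page's endCursor.
    pullRequests(states: MERGED, first: 100, after: $cursor) {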
      pageInfo {
        hasNextPage
        endCursor
      }
      nodes {
        number
        mergedAt
        reviews(states: APPROVED, first: 1) {
          totalCount
        }
      }
    }
  }
}

If reviews.totalCount > 0, the PR has at least one approving review on record. This approach provides a much more accurate picture for your development quality metrics than the search filter does.
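The pagination and the totalCount check can be combined as in the following Python sketch. It makes no network calls: fetch_page is a hypothetical callable that would run the query above with a given cursor and return the pullRequests object from the response, and the fake pages exist only to make the example self-contained.

```python
from typing import Callable, Iterable, Iterator, Optional


def iter_merged_prs(fetch_page: Callable[[Optional[str]], dict]) -> Iterator[dict]:
    """Yield PR nodes page by page, following pageInfo.endCursor."""
    cursor: Optional[str] = None
    while True:
        page = fetch_page(cursor)
        yield from page["nodes"]
        if not page["pageInfo"]["hasNextPage"]:
            break
        cursor = page["pageInfo"]["endCursor"]


def approval_rate(prs: Iterable[dict]) -> float:
    """Fraction of PRs whose reviews.totalCount is greater than zero."""
    prs = list(prs)
    if not prs:
        return 0.0
    approved = sum(1 for pr in prs if pr["reviews"]["totalCount"] > 0)
    return approved / len(prs)


def make_pr(number: int, approvals: int) -> dict:
    """Build a fake PR node with the fields the query selects."""
    return {"number": number, "mergedAt": None, "reviews": {"totalCount": approvals}}


# Two fake pages keyed by the cursor that requests them (assumed data).
FAKE_PAGES = {
    None: {"pageInfo": {"hasNextPage": True, "endCursor": "c1"},
           "nodes": [make_pr(1, 1), make_pr(2, 0)]},
    "c1": {"pageInfo": {"hasNextPage": False, "endCursor": None},
           "nodes": [make_pr(3, 2), make_pr(4, 1)]},
}

rate = approval_rate(iter_merged_prs(lambda cursor: FAKE_PAGES[cursor]))
```

Injecting fetch_page keeps the pagination logic separate from whichever HTTP client and authentication scheme you use, and lets the loop be tested without touching the API.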

Practical Considerations for Org-Level Metrics

When gathering organization-wide statistics, remember that you'll need to paginate through repositories first, and then paginate through PRs within each repository. Be mindful of the GraphQL API's node limits and its point-based rate limit, especially for organizations with many repositories and high PR volumes; requesting the rateLimit field alongside your query shows what each call costs and how much budget remains.
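As a rough sketch of both concerns, the Python below walks repositories and then PRs, and decides when to pause based on a rateLimit object (with cost, remaining, and resetAt fields) requested alongside the query. The helper names, the reserve threshold of 100 points, and the injected callables are assumptions for illustration only.

```python
from datetime import datetime, timezone
from typing import Callable, Iterable, Iterator, Tuple


def iter_org_prs(
    iter_repos: Callable[[], Iterable[str]],
    iter_prs: Callable[[str], Iterable[dict]],
) -> Iterator[Tuple[str, dict]]:
    """Outer loop over repositories, inner loop over each repository's PRs."""
    for repo in iter_repos():
        for pr in iter_prs(repo):
            yield repo, pr


def seconds_to_pause(rate_limit: dict, now: datetime, reserve: int = 100) -> float:
    """Return how long to wait before the next call, or 0.0 if budget remains.

    Assumes the next call costs about as much as the last one.
    """
    if rate_limit["remaining"] - rate_limit["cost"] >= reserve:
        return 0.0
    # resetAt is an ISO 8601 timestamp; normalize a trailing "Z" for fromisoformat.
    reset_at = datetime.fromisoformat(rate_limit["resetAt"].replace("Z", "+00:00"))
    return max(0.0, (reset_at - now).total_seconds())


# Hypothetical values as they might appear in a response's rateLimit object.
now = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
pause = seconds_to_pause(
    {"cost": 10, "remaining": 50, "resetAt": "2024-01-01T12:30:00Z"}, now
)
```

In a real collector, the two callables passed to iter_org_prs would each wrap a cursor-based pagination loop, and the pause check would run between page requests.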

By transitioning to the GraphQL API and using the correct fields, developers can overcome the inconsistencies of the Search API and gain truly reliable insights into their PR review processes and overall code quality.

Developer confidently analyzing accurate data visualizations, representing the power of GitHub GraphQL API for metrics.
