
Decoding GH008: When git push --mirror Succeeds Despite Missing LFS Objects

The Curious Case of Missing LFS Objects and Successful Pushes

Imagine successfully pushing a Git repository to GitHub, only to discover later that crucial large files (managed by Git LFS) are missing. This perplexing situation, recently highlighted in a GitHub Community discussion, reveals a nuanced interaction between Git, Git LFS, and GitHub's server-side validation. It's a scenario that can create blind spots in your software development tracking and lead to broken repositories.

Aakash4792 described a source repository with missing LFS objects. A mirror clone (git clone --mirror https://github.com/ORG/SOURCE.git ProbRepo) worked, but fetching the LFS objects failed with "404 Object does not exist on the server" errors. The confusing part was the inconsistent behavior of git push --mirror. On one occasion, pushing this broken mirror to a new GitHub repository failed with the expected GH008 error, indicating unknown LFS objects. On a later attempt, the exact same git push --mirror command, this time to a brand-new empty repository, succeeded. Cloning that newly pushed repository and running git lfs fetch --all confirmed that the LFS objects were still missing:

error: failed to fetch some objects from 'https://github.com/ORG/EmptyRepo9.git/info/lfs'

This inconsistency isn't just a minor annoyance; it’s a critical issue for dev teams, product managers, and CTOs who rely on the integrity of their version control systems for reliable delivery and accurate software project metrics. Understanding why this happens is key to preventing data loss and ensuring robust development workflows.

Diagram showing the decoupled nature of Git push and Git LFS object upload, with a pre-receive hook validating LFS payloads, illustrating potential points of failure.

Unpacking Git LFS: Pointers, Payloads, and the Push Process

The core of this mystery lies in how Git and Git LFS operate. When you use Git LFS, the actual large files are not stored directly in your Git repository. Instead, Git stores small "pointer files" that reference the large objects. These pointers contain metadata like the object's SHA-256 hash and size. During a standard git push (or git push --mirror), Git primarily transfers the Git history and these pointer files.
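To make the pointer idea concrete, here is a minimal Python sketch that recognizes a Git LFS pointer file and extracts its hash and size. It assumes the standard v1 pointer layout (a version line, an "oid sha256:..." line, and a size line); the function name parse_lfs_pointer and the example OID are illustrative, not part of any Git LFS tooling.

```python
from typing import Optional

LFS_SPEC_V1 = "https://git-lfs.github.com/spec/v1"
OID_PREFIX = "sha256:"


def parse_lfs_pointer(text: str) -> Optional[dict]:
    """Return {'oid': ..., 'size': ...} if text looks like an LFS v1 pointer, else None."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(" ")
        if not sep:
            return None  # every pointer line has the form "key value"
        fields[key] = value
    if fields.get("version") != LFS_SPEC_V1:
        return None
    oid = fields.get("oid", "")
    if not oid.startswith(OID_PREFIX) or len(oid) != len(OID_PREFIX) + 64:
        return None
    size = fields.get("size", "")
    if not size.isdigit():
        return None
    return {"oid": oid[len(OID_PREFIX):], "size": int(size)}


# Made-up pointer content for illustration only.
example_pointer = (
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:" + "ab" * 32 + "\n"
    "size 12345\n"
)
pointer_info = parse_lfs_pointer(example_pointer)
```

This small text blob is all that Git itself stores and transfers; the 12345-byte payload it describes lives in LFS storage, or, in the scenario above, nowhere at all.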

As sraixl pointed out in the discussion, the upload of LFS objects is a separate, decoupled process. Git sees the LFS pointer files as small text files and successfully transfers them along with the rest of the repository's history (commits, branches, tags). The actual large files are uploaded via a dedicated LFS API, normally triggered by the pre-push hook that Git LFS installs. If LFS isn't properly initialized locally (for example, the hook was never installed), or if something else prevents that step from running, the LFS upload phase might be skipped without the main Git push failing.
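The dedicated LFS API can also be queried directly to ask whether the server holds a given object. The sketch below builds a "download" request for the Git LFS batch endpoint and reads the per-object errors out of a response. It only constructs and interprets data; it does not send anything, and a real request would also need credentials. The helper names and the sample response are illustrative assumptions, with the response shaped after the batch API format (each object carries either "actions" or an "error").

```python
import json

LFS_MEDIA_TYPE = "application/vnd.git-lfs+json"


def build_batch_request(repo_url, objects):
    """Return (url, headers, body) for a 'download' batch request.

    repo_url is the repository's .git URL; objects is a list of
    {'oid': ..., 'size': ...} dicts, e.g. taken from pointer files.
    """
    url = repo_url.rstrip("/") + "/info/lfs/objects/batch"
    headers = {"Accept": LFS_MEDIA_TYPE, "Content-Type": LFS_MEDIA_TYPE}
    body = json.dumps({
        "operation": "download",
        "transfers": ["basic"],
        "objects": [{"oid": o["oid"], "size": o["size"]} for o in objects],
    })
    return url, headers, body


def missing_oids(batch_response):
    """List the OIDs for which the server returned an error entry (such as a 404)."""
    return [obj["oid"] for obj in batch_response.get("objects", []) if "error" in obj]


# Illustrative response, not captured from a real server.
sample_response = {
    "objects": [
        {"oid": "a" * 64, "size": 10,
         "actions": {"download": {"href": "https://example.invalid/a"}}},
        {"oid": "b" * 64, "size": 20,
         "error": {"code": 404, "message": "Object does not exist"}},
    ]
}

batch_url, batch_headers, batch_body = build_batch_request(
    "https://github.com/ORG/EmptyRepo9.git", [{"oid": "a" * 64, "size": 10}]
)
```

Because this check goes through the LFS endpoint rather than the Git push path, it reports missing payloads whether or not the push itself was accepted.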

The Enigmatic GH008 Error

GitHub employs a pre-receive hook to validate the integrity of LFS objects. As grsantos56 explained, this hook is designed to check if the LFS payloads for the pointers you are pushing have actually been uploaded to their servers. If these large files are missing, the hook should block the Git push with the GH008 error, preventing a broken repository from being created.

This is why Aakash4792 initially saw the expected failure:

remote: error: GH008: Your push referenced at least 10 unknown Git LFS objects:

This error is GitHub's way of saying, "Hey, you're trying to push references to large files that we don't actually have in our LFS storage." It's a crucial safeguard for data integrity.
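In an automated mirroring job it can help to recognize this rejection explicitly rather than treating it as a generic push failure. The following sketch looks for the GH008 line quoted above in captured push output and extracts the reported object count. The function name gh008_count is hypothetical, and the pattern is based only on the single line shown in the discussion, so it may need adjusting if GitHub words the message differently.

```python
import re
from typing import Optional

# Pattern derived from the one GH008 line quoted in the discussion.
GH008_RE = re.compile(
    r"GH008: Your push referenced at least (\d+) unknown Git LFS objects"
)


def gh008_count(push_output: str) -> Optional[int]:
    """Return the object count from a GH008 rejection, or None if there is none."""
    match = GH008_RE.search(push_output)
    return int(match.group(1)) if match else None


reported = gh008_count(
    "remote: error: GH008: Your push referenced at least 10 unknown Git LFS objects:"
)
```

Note that the absence of GH008 is not proof of a healthy repository, which is exactly the problem this article describes; a None result only means the server did not complain.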

When the Rules Bend: Inconsistent Push Behavior

The perplexing part is when this safeguard seemingly fails. Aakash's subsequent successful push to a brand-new empty repository, despite the same missing LFS objects, points to what grsantos56 described as a transient infrastructure inconsistency on GitHub's side. According to that explanation, GitHub's pre-receive hooks might occasionally:

  • Time out: Under heavy server load, the validation hook might exceed its execution time limit.
  • Be temporarily bypassed: To prevent bottlenecks, GitHub might temporarily disable or relax certain pre-receive hook validations.
  • Behave differently for mirror pushes: Massive --mirror pushes, especially to empty repositories, might trigger different server-side logic or resource allocation.

The Netlify thread Aakash found, where GH008 issues magically resolved without intervention, further supports the idea that these are often transient, server-side issues rather than a fundamental change in Git or LFS behavior. When the hook fails to run or is disabled, the standard Git push (of the pointers) succeeds, leaving you with a repository that appears complete but is fundamentally broken due to missing LFS objects. This can severely skew your software project metrics, as successful pushes might mask underlying data integrity problems.

A GitHub monitoring dashboard displaying software project metrics, including a warning for missing Git LFS objects, emphasizing the need for robust software development tracking.

Implications for Productivity and Delivery

For dev teams, product/project managers, delivery managers, and CTOs, this scenario presents significant challenges:

  • Broken Builds and Environments: Developers cloning such a repository will encounter missing files, leading to failed builds, incomplete local environments, and wasted time debugging.
  • Loss of Confidence: If the version control system itself can't guarantee data integrity, trust in the entire development pipeline erodes. This impacts team morale and productivity.
  • Inaccurate Software Development Tracking: A "successful" push recorded in your software development tracking tools might be misleading. Without the LFS objects, the codebase is incomplete, affecting project progress and delivery timelines.
  • Risk to Delivery: Product and delivery managers rely on stable repositories for releases. Missing LFS objects can halt deployments and introduce critical bugs.
  • Technical Leadership Concerns: CTOs need reliable tooling infrastructure. Inconsistent behavior like this raises questions about the robustness of the platform and the integrity of critical assets.

Safeguarding Your Repositories: Best Practices and Mitigation

While the root cause might be a GitHub-side inconsistency, there are proactive steps you can take to mitigate the risk and ensure the integrity of your repositories:

Proactive LFS Management

  • Ensure Proper LFS Initialization: Always confirm Git LFS is correctly installed and initialized in your local environment and repository.
  • Explicitly Push LFS Objects: If you suspect issues or are performing critical operations like mirroring, explicitly run git lfs push --all origin to ensure all LFS objects are uploaded.
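  • Regular Integrity Checks: Implement automated checks (e.g., in CI/CD pipelines) to verify the existence of LFS objects referenced by your repository. git lfs fsck checks the objects in your local LFS store against their pointers; to confirm that the remote actually holds them, run git lfs fetch --all in a fresh clone.
  • Retain Local LFS Objects: Keep LFS objects available locally for longer periods, for example by being conservative about running git lfs prune, so that a local copy can serve as a backup source.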

Understanding Mirror Pushes

Remember that git push --mirror copies all references (branches, tags, etc.) and the Git history. If your source repository is already missing LFS objects, mirroring it will simply copy the references to those missing objects, perpetuating the problem. It mirrors the state, including its brokenness.
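One way to avoid propagating a broken state is to make the mirror job stop as soon as the source cannot supply its LFS objects. The sketch below lists the commands for such a job and runs them in order, aborting on the first failure. It rests on a few assumptions: that git lfs fetch --all exits with a non-zero status when objects are missing, that uploading LFS objects before the mirror push is the preferred order (so the payloads exist when the pointers arrive), and that the destination URL can be passed directly to both push commands. The function names and URLs are placeholders.

```python
import subprocess


def mirror_steps(source_url, dest_url, workdir):
    """Return (argv, cwd) pairs for an LFS-aware mirror, in execution order."""
    return [
        (["git", "clone", "--mirror", source_url, workdir], None),
        # Pull every LFS object first; a failure here means the source is incomplete.
        (["git", "lfs", "fetch", "--all"], workdir),
        # Upload the payloads before the pointers that reference them.
        (["git", "lfs", "push", "--all", dest_url], workdir),
        (["git", "push", "--mirror", dest_url], workdir),
    ]


def run_mirror(source_url, dest_url, workdir):
    """Run the steps in order; check=True raises CalledProcessError on the first failure."""
    for argv, cwd in mirror_steps(source_url, dest_url, workdir):
        subprocess.run(argv, cwd=cwd, check=True)


# Placeholder URLs; run_mirror is intentionally not called here.
planned = mirror_steps(
    "https://github.com/ORG/SOURCE.git",
    "https://github.com/ORG/DEST.git",
    "ProbRepo",
)
```

With this ordering, a source that is already missing objects fails at the fetch step, before anything is written to the destination.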

Monitoring and Visibility

For robust software development tracking, consider enhancing your monitoring:

  • Post-Push LFS Verification: After a significant push, especially a mirror push, perform a quick git lfs fetch --all and check for errors.
  • Custom GitHub Monitoring Dashboard: Develop or integrate a GitHub monitoring dashboard that includes metrics on LFS object health and push success rates, flagging any inconsistencies.
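The post-push verification step can be scripted so that its result feeds whatever alerting you already have. The sketch below decides whether a git lfs fetch --all run failed, using the exit status plus the two error strings quoted earlier in this article. The function names are hypothetical, and matching on message text is a fallback assumption, since the exact wording may vary between Git LFS versions.

```python
import subprocess

# Error fragments quoted in the discussion; treated here as failure indicators.
FAILURE_MARKERS = (
    "Object does not exist on the server",
    "failed to fetch some objects",
)


def lfs_fetch_failed(returncode, output):
    """True if the exit status or the captured output indicates missing LFS objects."""
    return returncode != 0 or any(marker in output for marker in FAILURE_MARKERS)


def verify_clone(repo_dir):
    """Run 'git lfs fetch --all' in an existing clone and report whether it is healthy."""
    result = subprocess.run(
        ["git", "lfs", "fetch", "--all"],
        cwd=repo_dir, capture_output=True, text=True,
    )
    return not lfs_fetch_failed(result.returncode, result.stdout + result.stderr)


broken = lfs_fetch_failed(
    0,
    "error: failed to fetch some objects from "
    "'https://github.com/ORG/EmptyRepo9.git/info/lfs'",
)
```

Running verify_clone against a fresh clone of the destination, rather than the mirror you pushed from, avoids being misled by objects that exist only in your local LFS store.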

Recovery Strategies

If you find yourself with a broken destination repository:

  • Locate Original Files: The primary solution is to find a local copy of the repository that still has the original LFS objects and then run git lfs push --all origin from that working copy, substituting the remote that points at the broken destination for origin.
  • History Rewriting (Last Resort): If the original LFS objects are truly lost, you might need to rewrite your Git history (e.g., using git filter-repo) to remove the broken LFS pointers entirely. This should be done with extreme caution and clear communication to the team.
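When several developers may still have old clones, a quick way to see which copy is worth recovering from is to check its local LFS store for the missing OIDs. The sketch below assumes the usual on-disk layout, where an object is stored under lfs/objects/ inside the Git directory, in subdirectories named after the first two and next two characters of its OID. The helper names are illustrative; the list of OIDs could come from the GH008 message or from a command such as git lfs ls-files --all --long.

```python
import tempfile
from pathlib import Path


def local_object_path(git_dir, oid):
    """Expected location of an LFS object inside a repository's Git directory."""
    return Path(git_dir) / "lfs" / "objects" / oid[:2] / oid[2:4] / oid


def split_by_availability(git_dir, oids):
    """Partition OIDs into (present, missing) based on the local LFS store."""
    present, missing = [], []
    for oid in oids:
        target = present if local_object_path(git_dir, oid).is_file() else missing
        target.append(oid)
    return present, missing


# Self-contained demo using a temporary directory as a stand-in for .git.
with tempfile.TemporaryDirectory() as tmp:
    have, lost = "a" * 64, "b" * 64
    stored = local_object_path(tmp, have)
    stored.parent.mkdir(parents=True)
    stored.write_bytes(b"payload")
    demo_present, demo_missing = split_by_availability(tmp, [have, lost])
```

A clone in which every needed OID lands in the "present" list is a good candidate for the git lfs push --all recovery; if none exists, history rewriting becomes the remaining option.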

Conclusion: The Unseen Layers of Version Control

The case of the inconsistent git push --mirror behavior with Git LFS highlights the complex interplay between Git's core mechanics, LFS's decoupled storage, and GitHub's server-side validation. While it appears to be a transient infrastructure quirk on GitHub's part, it underscores the critical importance of understanding your tooling at a deeper level.

For engineering leaders, product managers, and dev teams, this isn't just a technical curiosity; it's a reminder that robust software development tracking and reliable delivery depend on the integrity of every component in your toolchain. By adopting proactive LFS management, understanding push mechanics, and implementing vigilant monitoring, you can safeguard your repositories and ensure that a "successful" push truly means your code and its assets are where they should be.
