Streamlining GitHub EMU Migration: Essential Pre-Migration Prep for Software Engineering Management

Migrating to GitHub Enterprise Managed Users (EMU) is a significant undertaking that promises enhanced security and streamlined identity management. However, as highlighted in a recent GitHub Community discussion by jmassardo, the success of this transition hinges on meticulous pre-migration preparation. This crucial second part of a six-part series emphasizes that skipping the "unglamorous but critical work" of auditing, cleaning, and readying your identity infrastructure will inevitably lead to painful surprises down the line.

Developer preparing for migration, organizing digital assets

Laying the Foundation: Identity Provider Readiness

EMU requires a compatible identity provider (IdP). GitHub offers "paved-path" integrations with Microsoft Entra ID (formerly Azure AD), Okta, and PingFederate, covering SAML SSO and SCIM provisioning. A critical caveat for software engineering management: GitHub explicitly does not support combining Okta and Entra ID for SSO and SCIM, and that configuration will result in GitHub SCIM API errors. Organizations using a non-partner IdP must ensure it adheres to GitHub's integration guidelines, providing SAML 2.0 for authentication and SCIM 2.0 for user lifecycle management.

Secure data migration process with cleanup and filtering

Inventory and Assessment: Understanding Your Digital Landscape

Before moving anything, you need a complete picture of your current state. The gh-repo-stats extension for GitHub CLI can generate an inventory of repositories, including owners, activity timestamps, and pull request/issue counts, which gives you the baseline numbers for scoping the migration. For deeper insight into repository health and potential migration blockers, git-sizer identifies large files or excessive history that could increase migration time. Files over 100MB, for instance, often require Git LFS or history rewriting.

# Install gh-repo-stats
gh extension install mona-actions/gh-repo-stats

# Generate inventory (flag names can differ between extension versions; confirm with `gh repo-stats --help`)
gh repo-stats --org your-org-name --output inventory.csv

# Analyze repository size with git-sizer
git clone --mirror https://github.com/org/repo.git
cd repo.git
git-sizer --no-progress -j | jq ".max_blob_size"
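
To turn the 100MB guideline into a concrete list of paths, a repository's full history can be scanned for oversized blobs. The sketch below is an illustration rather than part of the original discussion: the function name `list_large_blobs` and the 100 MiB default threshold are assumptions, and it relies only on standard `git rev-list`, `git cat-file`, and `awk`.

```shell
# List blobs larger than a byte threshold anywhere in a repository's history.
# Usage: list_large_blobs <repo-dir> [threshold-bytes]
# (hypothetical helper; the default threshold of 100 MiB is an assumption)
list_large_blobs() {
  local repo_dir="$1"
  local threshold="${2:-104857600}"

  # rev-list emits "<sha> <path>"; cat-file resolves type and size per object
  # and passes the path through as %(rest).
  git -C "$repo_dir" rev-list --objects --all |
    git -C "$repo_dir" cat-file \
      --batch-check='%(objecttype) %(objectsize) %(rest)' |
    awk -v limit="$threshold" '
      $1 == "blob" && $2 + 0 > limit + 0 {
        size = $2
        sub(/^[^ ]+ [^ ]+ /, "")   # drop type and size, keep the path (spaces included)
        print size "\t" $0
      }' |
    sort -rn
}
```

Run it against the mirror clone from the previous step, for example `list_large_blobs repo.git`. Each output line is a size in bytes, a tab, and the path, largest first; anything listed is a candidate for Git LFS or a history rewrite.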

Crucial User Communication

A robust user communication plan is often overlooked, yet it is paramount. Users will experience significant changes, including new usernames, altered authentication flows, and the loss of the ability to contribute to repositories outside the enterprise, public ones included. Early and frequent communication, comprehensive documentation, training sessions, and dedicated support channels are essential to manage expectations and ensure a smooth transition.

Pre-Migration Cleanup: Don't Migrate Your Mess

This phase is your opportunity to shed technical debt. "Every piece of technical debt, every abandoned repository, every stale PR you migrate is technical debt you’re paying to move."

Key Cleanup Tasks:

  • Archive Unused Repositories: Identify and archive repositories with no activity for over a year. While still migratable, archiving signals their historical status.
  • Close Stale Pull Requests & Issues: PRs inactive for 90+ days and issues untouched for six months are prime candidates for closure, preventing clutter in the new environment.
  • Delete Stale Branches: Merge or delete feature branches that haven't seen activity in months.
  • Audit and Remove Unused Integrations: Review OAuth apps, GitHub Apps, and webhooks. Verify EMU compatibility for remaining integrations, as orphaned webhooks pose security risks.
  • Clean Up Teams and Access: Align your GitHub team structure with your IdP groups, as EMU manages team membership via the IdP. Consolidate duplicate teams and remove inactive ones.
  • Remove Secrets and Sensitive Data: This is critical. Scan for committed secrets using tools like TruffleHog or gitleaks, rotate all secrets, and document every repository and organization secret. Remember, secrets do NOT migrate automatically and must be recreated in the new EMU environment. This is an ideal time to implement robust secrets management solutions.

# Find repositories with no activity in the last year
# --paginate walks every page; gh requires the cursor variable to be named $endCursor
gh api graphql --paginate -f org=YOUR_ORG -f query='
  query($org: String!, $endCursor: String) {
    organization(login: $org) {
      repositories(first: 100, after: $endCursor) {
        pageInfo { hasNextPage endCursor }
        nodes { name pushedAt isArchived defaultBranchRef { target { ... on Commit { committedDate } } } }
      }
    }
  }' | jq '.data.organization.repositories.nodes[]
    | select(.isArchived == false)
    | select(.pushedAt == null or (.pushedAt | fromdateiso8601) < (now - 31536000))'

# Example: List PRs with no updates in 90+ days (this only lists them; closing in bulk needs a script around `gh pr close`)
gh pr list --repo OWNER/REPO --state open --limit 1000 --json number,title,updatedAt --jq '.[] | select((.updatedAt | fromdateiso8601) < (now - (90 * 24 * 60 * 60)))'
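
Listing stale PRs is only half of the job. A minimal sketch of the closing step, assuming PR numbers arrive one per line on stdin: the function name `close_stale_prs`, the `DRY_RUN` variable, and the comment text are hypothetical, while `gh pr close` with `--repo` and `--comment` is the standard GitHub CLI command. It defaults to a dry run so nothing is closed by accident.

```shell
# Close each PR number read from stdin.
# Prints the commands instead of running them unless DRY_RUN=0 is set.
# Usage: <PR numbers, one per line> | close_stale_prs OWNER/REPO
close_stale_prs() {
  local repo="$1"
  local message="Closing as stale ahead of the EMU migration. Reopen if still needed."
  local number

  while read -r number; do
    [ -n "$number" ] || continue          # skip blank lines
    if [ "${DRY_RUN:-1}" = "0" ]; then
      # </dev/null keeps gh from consuming the list of PR numbers on stdin
      gh pr close "$number" --repo "$repo" --comment "$message" </dev/null
    else
      printf 'would run: gh pr close %s --repo %s\n' "$number" "$repo"
    fi
  done
}
```

To use it, pipe in only the numbers, for instance by ending the `--jq` filter with `| .number`, review the dry-run output, and then rerun with `DRY_RUN=0`.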

# Scan for committed secrets using gitleaks
gitleaks detect --source . --verbose
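
Stale branches can be found the same way, straight from a local clone. The following is a sketch under stated assumptions: the function name `list_stale_branches` and the 90-day default are illustrative, and the `refs/heads` default suits a `--mirror` clone (pass `refs/remotes/origin` for a regular clone). It uses only `git for-each-ref` and `awk`.

```shell
# Print branches whose tip commit is older than N days.
# Usage: list_stale_branches <repo-dir> [days] [ref-prefix]
# (hypothetical helper; 90 days and refs/heads are assumed defaults)
list_stale_branches() {
  local repo_dir="$1"
  local days="${2:-90}"
  local prefix="${3:-refs/heads}"
  local now
  now=$(date +%s)
  local cutoff=$(( now - days * 86400 ))

  # Emit "<committer epoch> <short ref name>" for each ref, then keep the old ones.
  git -C "$repo_dir" for-each-ref \
    --format='%(committerdate:unix) %(refname:short)' "$prefix" |
    awk -v cutoff="$cutoff" '$1 + 0 < cutoff + 0 { print $2 }'
}
```

The output is a plain list of branch names to review; confirm each one is merged or abandoned before deleting it.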

The Goal: Migrate Clean

The ultimate objective of this pre-migration phase is simple: migrate only what you need, and migrate it clean. Investing time in this meticulous preparation will save countless hours and prevent significant headaches during and after your GitHub EMU transition, reflecting strong software engineering management practices.
