How to Get GitHub to Recognize Your New Programming Language: A Guide for Dev Teams
Creating a new programming language is an impressive feat, and naturally, developers want their creations to be properly recognized by platforms like GitHub. A recent discussion in the GitHub Community highlighted a common challenge: how to get GitHub to display a brand-new language in the repository's 'Languages' section. This is more than just a cosmetic detail; it's crucial for a clear development overview, accurate engineering performance metrics, and effective tooling decisions within your team.
The GitHub Language Conundrum: Why .gitattributes Isn't Enough
User flaviokalleu, the maintainer of 'Flang' (a bilingual declarative programming language with .fg extension), sought to have GitHub recognize Flang in their repository. They had correctly added a .gitattributes file with *.fg text linguist-language=Flang, expecting GitHub to pick it up. However, the 'Languages' section remained unchanged.
The core of the problem lies in how GitHub determines language statistics. GitHub uses an open-source library called GitHub Linguist. As community member Gecko51 clarified, the linguist-language attribute in .gitattributes is designed for remapping files to a language that Linguist already knows. It cannot define a completely new language from scratch.
*.fg linguist-language=Python
For example, if you wanted your .fg files to be counted as Python, the above line would work. But for a truly new language like Flang, Linguist has no existing definition to map to, rendering the attribute ineffective for full recognition. This means that while .gitattributes is a powerful tool for fine-tuning how existing languages are categorized, it's not the gateway for entirely novel ones.
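The remapping rule can be pictured with a toy model. This is a minimal sketch and not Linguist's actual code (Linguist itself is written in Ruby). The registry contents and the function name resolve_language are hypothetical, chosen only to show why an override pointing at an unregistered name has no effect.

```python
from typing import Optional

# Hypothetical, heavily simplified stand-in for Linguist's language registry.
KNOWN_LANGUAGES = {"Python": [".py"], "Ruby": [".rb"]}


def resolve_language(filename: str, override: Optional[str] = None) -> Optional[str]:
    """Return the language a file would be counted as in this toy model."""
    # An override only takes effect when it names a registered language.
    if override is not None and override in KNOWN_LANGUAGES:
        return override
    # Otherwise fall back to a plain extension lookup.
    for language, extensions in KNOWN_LANGUAGES.items():
        if any(filename.endswith(ext) for ext in extensions):
            return language
    # Nothing matched, so the file is not attributed to any language.
    return None


print(resolve_language("main.fg", override="Python"))  # Python
print(resolve_language("main.fg", override="Flang"))   # None
```

In this model, the only way to make the second call return "Flang" is to add Flang to the registry, which corresponds to contributing the language to Linguist.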
The Definitive Path: Contributing to GitHub Linguist
To get a new language recognized by GitHub, contribute it directly to the github-linguist/linguist repository. Once the change is merged and deployed, GitHub's language detection knows about your language, which brings proper syntax highlighting, accurate statistics, and a legitimate presence for it on the platform.
A Step-by-Step Guide to Linguist Contribution
Based on the insights from the community discussion and the Linguist contributing guide, here's the rough checklist for getting a new language accepted:
- A TextMate/VS Code Grammar: You'll need a .tmLanguage or .plist file that defines syntax highlighting for your language's files (e.g., .fg files). This file will reside in the grammars/ folder of the Linguist repository.
- Sample Files: Provide a sufficient number of sample files in a dedicated folder (e.g., samples/Flang/). These should be small but non-trivial programs that demonstrate real usage and various syntax features of your language.
- An Entry in languages.yml: This YAML file is Linguist's central registry. You'll add an entry defining your language's name, associated extensions (like .fg), type (e.g., programming, markup), preferred color for display, and other metadata.
- Proof of Usage: Linguist generally expects new languages to demonstrate some real-world adoption. This means having a reasonable number of public repositories already using your language. For newer languages, this can sometimes be a hurdle, so having a few repos with actual code helps your case significantly.
The contributing guide in the Linguist repo walks through the exact process in detail, which is essential reading before submitting a Pull Request.
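Before opening a pull request, it can help to sanity-check a draft of the languages.yml entry. The sketch below mirrors such an entry as a Python dict and checks only the fields named in the checklist above. The helper check_entry, the color value, and the specific rules are illustrative assumptions; the authoritative list of required fields is whatever the Linguist contributing guide and its test suite enforce.

```python
import re
from typing import Any, Dict, List

# Type values assumed for this sketch; confirm against languages.yml itself.
ALLOWED_TYPES = {"programming", "markup", "data", "prose"}
HEX_COLOR = re.compile(r"^#[0-9A-Fa-f]{6}$")

# Draft entry for the hypothetical Flang submission; the color is a placeholder.
draft_entry: Dict[str, Any] = {
    "name": "Flang",
    "type": "programming",
    "color": "#3A7BD5",
    "extensions": [".fg"],
}


def check_entry(entry: Dict[str, Any]) -> List[str]:
    """Return a list of human-readable problems found in a draft entry."""
    problems: List[str] = []
    for field in ("name", "type", "color", "extensions"):
        if field not in entry:
            problems.append(f"missing field: {field}")
    if "type" in entry and entry["type"] not in ALLOWED_TYPES:
        problems.append(f"unexpected type: {entry['type']}")
    color = entry.get("color")
    if isinstance(color, str) and not HEX_COLOR.match(color):
        problems.append(f"color is not a #RRGGBB value: {color}")
    for ext in entry.get("extensions", []):
        if not str(ext).startswith("."):
            problems.append(f"extension should start with a dot: {ext}")
    return problems


print(check_entry(draft_entry))  # []
```

A check like this catches only formatting slips. It says nothing about whether the maintainers will accept the language, which depends on the grammar, the samples, and the usage evidence.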
Navigating Potential Hurdles
As Gecko51 wisely pointed out, be aware of potential naming conflicts. For instance, "Flang" is also the name of an LLVM-based Fortran compiler. Reviewers might raise this, so be prepared to discuss the naming or consider alternatives if a strong conflict exists. The review process for new languages can take time, as the maintainers ensure quality and consistency across the vast array of languages Linguist supports.
Beyond Recognition: Why Accurate Language Stats Matter for Leadership
For dev team members, product/project managers, delivery managers, and CTOs, accurate language recognition isn't just about aesthetics; it's about actionable insights into your codebase and team's efforts. The 'Languages' section on GitHub provides a quick development overview that contributes directly to understanding engineering performance and informing strategic decisions.
- Resource Allocation: Knowing the dominant languages helps managers allocate resources, plan training, and hire for specific skill sets.
- Tech Stack Analysis: It offers an immediate snapshot of the project's technological foundation, crucial for architects and tech leads.
- Onboarding: New team members can quickly grasp the primary languages used in a repository, accelerating their ramp-up time.
- Promoting Internal Languages: If your team develops internal DSLs or domain-specific languages, proper GitHub recognition validates their use and helps track their adoption and impact.
- Software Developer Metrics: While not a direct metric of productivity, the language breakdown contributes to a holistic view of project health and complexity, influencing how other software developer metrics are interpreted.
Without proper recognition, your project's true composition remains hidden, potentially leading to misinformed decisions about tooling, talent, and strategic direction.
Temporary Workarounds: A Stopgap, Not a Solution
While contributing to Linguist is the long-term solution, some temporary workarounds were discussed:
- Mapping to an Existing Language: As Hamdan-Saddique-ai suggested, you could temporarily map your .fg files to an existing language that has a similar syntax (e.g., *.fg linguist-language=Python). This would at least get your files counted, but under the wrong language name, and with highlighting borrowed from that language rather than your own.
- Renaming Files: A less ideal option is to temporarily rename your files to a supported extension. This is highly disruptive and should only be considered if language stats are critically important for a very short period.
It's vital to understand that these are mere stopgaps. They obscure the true engineering performance and provide a distorted development overview. They don't offer proper syntax highlighting or the accurate representation your language deserves.
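If a team does adopt the mapping workaround for a while, the override line can be added in a repeatable way. The following is a minimal sketch; the helper name ensure_override is hypothetical, and it assumes the .gitattributes file sits at the repository root.

```python
from pathlib import Path


def ensure_override(repo_root: Path, pattern: str, language: str) -> bool:
    """Append a linguist-language override unless it is already present.

    Returns True when a line was written and False when nothing changed.
    """
    attrs = repo_root / ".gitattributes"
    line = f"{pattern} linguist-language={language}"
    existing = attrs.read_text(encoding="utf-8").splitlines() if attrs.exists() else []
    # Skip the write if an identical line is already there.
    if line in (entry.strip() for entry in existing):
        return False
    existing.append(line)
    attrs.write_text("\n".join(existing) + "\n", encoding="utf-8")
    return True


# Example call (writes to the current directory, so it is left commented out):
# ensure_override(Path("."), "*.fg", "Python")
```

Because the helper is idempotent, it can run from a setup script without piling up duplicate lines. The line should be removed by hand once the language is recognized upstream.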
Empowering Language Creators, Enhancing Development Overview
The journey to getting a new language recognized by GitHub is a testament to the power of open-source contribution. It's a process that not only benefits individual language creators like flaviokalleu but also enriches the entire developer ecosystem. For technical leaders and dev teams, advocating for and supporting these contributions ensures that your project's digital footprint on GitHub accurately reflects its technical reality, providing the clear development overview and precise software developer metrics needed to drive success.
If you're building a new language, embrace the challenge of contributing to GitHub Linguist. Your efforts will pave the way for better tooling, clearer project insights, and a more inclusive platform for all developers.
