Microsoft is turning one of its most widely used developer tools into a new source of AI training data. GitHub announced today that interactions with GitHub Copilot will now be collected and used to “train and improve our AI models.” Collection is on by default and applies to both free and paid individual plans.

What Gets Collected

The scope of collection is broad. Any input or output data generated through Copilot is in scope, including code snippets, inline comments and documentation, file names, repository structure, and other contextual information passed through the tool. This applies whether you are using code completion in Visual Studio Code, asking Copilot questions directly on the GitHub website, or working with any other Copilot-integrated feature.

If you have never used GitHub Copilot, nothing changes. But if you have — even casually — your interactions may now be feeding into Microsoft’s model improvement pipeline.

The policy covers Copilot Free, Copilot Pro, and Copilot Pro+ users. Notably, Copilot Business and Copilot Enterprise accounts are excluded, likely reflecting the stricter data handling obligations those tiers carry for organizational customers.

The Rationale

GitHub was candid about its reasoning. The original Copilot models were built from publicly available data and hand-crafted code samples — a process that drew criticism from parts of the developer community over questions of consent and intellectual property. The company says it has since seen meaningful model improvements by incorporating interaction data from Microsoft employees, and is now extending that approach to its broader user base.

In the announcement on the GitHub Blog, the company framed participation as a collective benefit: “By participating, you’ll help our models better understand development workflows, deliver more accurate and secure code pattern suggestions, and improve their ability to help you catch potential bugs before they reach production.”

How to Opt Out

Opting out is straightforward. From your GitHub account, navigate to Settings → Copilot → Privacy, and locate the “Allow GitHub to use my data for AI model training” toggle. Set it to Disabled. If you maintain multiple GitHub accounts, the setting must be changed individually on each one — there is no global toggle across accounts.

The opt-out is available immediately. GitHub has not said whether it will prompt users or surface the setting during normal Copilot usage. Because collection is on by default, anyone who wants to opt out will need to find the setting themselves.
