Programmer and lawyer Matthew Butterick has sued Microsoft, GitHub, and OpenAI, alleging that GitHub’s Copilot violates the terms of open-source licenses and infringes the rights of programmers.
GitHub Copilot, released in June 2022, is an AI-based programming assistant that uses OpenAI Codex to suggest source code and whole functions in real time inside Visual Studio.
The tool was trained with machine learning using billions of lines of code from public repositories and can transform natural language into code snippets across dozens of programming languages.
Clipping authors out
While Copilot can speed up the process of writing code and ease software development, its use of public open-source code has caused experts to worry that it violates license attribution requirements and other licensing terms.
Open-source licenses, such as the GPL, Apache, and MIT licenses, require attribution of the author's name and preservation of specific copyright notices.
Copilot, however, strips this information out: even when suggested snippets are longer than 150 characters and taken verbatim from the training set, no attribution is given.
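The attribution problem described above can be sketched with a minimal, hypothetical example. The author name, license text, and function below are invented for illustration; the point is that a verbatim suggestion can reproduce licensed code while dropping the notice the license requires to travel with it.

```python
# Hypothetical MIT-licensed source file. The MIT license requires this notice
# to accompany "all copies or substantial portions of the Software".
MIT_NOTICE = "Copyright (c) 2022 Jane Doe"  # hypothetical author

original_file = f"""\
# {MIT_NOTICE}
# Permission is hereby granted, free of charge, to any person obtaining
# a copy of this software ...
def clamp(value, low, high):
    return max(low, min(value, high))
"""

# A Copilot-style suggestion that reproduces the code verbatim but with the
# license notice stripped away.
suggested_snippet = """\
def clamp(value, low, high):
    return max(low, min(value, high))
"""

print(MIT_NOTICE in original_file)      # the source carries attribution
print(MIT_NOTICE in suggested_snippet)  # the suggestion does not
```

The check is trivially simple on purpose: whether the notice string is present is exactly the condition the license attaches to redistribution of substantial portions of the code.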
Some programmers have gone as far as to call this "open-source laundering," and the legal implications of the approach became apparent soon after the tool's launch.
“It appears Microsoft is profiting from others’ work by disregarding the conditions of the underlying open-source licenses and other legal requirements,” comments the Joseph Saveri Law Firm, which is representing Butterick in the litigation.
To make matters worse, people have reported cases of Copilot leaking secrets, such as API keys, that were published in public repositories by mistake and thus included in the training set.
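How such leaks happen can be illustrated with a minimal, hypothetical sketch: the config line, key value, and regex below are all invented for illustration, but the pattern mirrors how secret scanners flag credentials accidentally committed to public repositories, which can then end up in a model's training data.

```python
import re

# Hypothetical config file committed to a public repository by mistake.
# If a repository like this lands in the training set, the literal key
# can later resurface in generated suggestions.
committed_config = 'API_KEY = "sk_live_4f9a8b7c6d5e4f3a2b1c"  # hypothetical key'

# A simple pattern of the kind secret scanners use to detect leaked keys
# (assumed prefix "sk_live_" followed by 20 hex characters).
key_pattern = re.compile(r"sk_live_[0-9a-f]{20}")

match = key_pattern.search(committed_config)
print(match.group(0))  # the leaked key is trivially recoverable
```

The takeaway is that a key in a public repository is effectively published: once indexed or ingested for training, revoking and rotating the credential is the only real remedy.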
Apart from the license violations, Butterick also alleges that the development feature violates the following:
– GitHub’s terms of service and privacy policies,
– Section 1202 of the DMCA, which forbids the removal of copyright-management information,
– the California Consumer Privacy Act,
– and other laws giving rise to related legal claims.