Facebook takes your personal data and exploits it for profit in shady ways. GitHub has now done the same, but with your source code. GitHub is a much better company than Facebook is, and they have an opportunity to prove it now. So, prove it.
@nitinkhanna there are many things that they could do:
It's a complicated problem, but they'll need to attack it head-on.
@cleverdevil You have a link where I can read about this?
@pimoore so, Nora Tindall has some great tweets about the issue. But, the long and short of it is that GitHub released something called GitHub Copilot which uses AI / machine learning to predictively pair program along with you.
The issue is that they trained their machine learning model using tons of code on GitHub without anyone's consent. There are a myriad of issues with this, not the least of which is copyright, but software licensing also comes into play.
There are several very popular open source licenses, such as the GPL, which explicitly prohibit "derivative works" that are created from GPL-licensed code, unless those works are also released under the GPL. GitHub is immediately in violation of the licenses of hundreds (thousands?) of GPL-licensed projects.
Worse, if you're a user of Copilot, there is a decent chance that when you're writing some code, Copilot predictively spits out some code that is very, very close to, if not lifted verbatim from, a GPL-licensed project. Guess what? Now you are in violation of the GPL unless you open source your work under the same license.
It's a bit of a nightmare, and an absolute self-own from GitHub.
@cleverdevil interesting... I take it the whole thing left a bad taste in your mouth. It certainly is a concern what they've trained the model on. But as far as abstracted code goes, they could very well have done a good job using well-known open source software to train on. Of course, technical breakdowns will tell us more.
But I don't see how shutting it down and rethinking it would help. Every time they do, someone might come up with a new issue which they'll have to respond to.
You're right about open sourcing the model though - it's in line with what we've come to expect from the open source world, even though GitHub per se hasn't always been a good caretaker of that.
@cleverdevil This kind of makes me want to move my stuff to either GitLab/Bitbucket, or a self-hosted Gitea instance.
Thanks for the link and breakdown of what’s happening with this!
@nitinkhanna well, they trained their model on my open source software without my consent, for their own benefit, which is not only kind of icky, it's also a legal problem for them as they're in violation of many, many, many projects' licenses. It also puts their users at risk as a result. If they don't shut it down and rethink, they're asking for many lawsuits that they'll very likely lose.
@pimoore I'm not quite there just yet, as I think that they're genuinely trying (and succeeding) to do something very cool and innovative. They just misfired a bit on the critical thinking side. Many technologists suffer from this :)
I'm hopeful that they pivot.