The chardet open source library relicensed from LGPL to MIT two days ago thanks to a Claude Code assisted "clean room" rewrite - but original author Mark Pilgrim is disputing that the way this was done justifies the change in license - my notes here: simonwillison.net/2026/Mar/5/c…
in reply to Simon Willison

APIs are copyrightable in the US following Oracle v Google, so by definition the AI output is a derivative work, and the question is whether it constitutes fair use or not?
in reply to Simon Willison

"There are several twists that make this case particularly hard to confidently resolve:"

I really expected one of them to be that LLM output isn't subject to copyright under US law. Since a license is a grant of permissions that would not otherwise exist due to copyright, applying a license to LLM output doesn't make any sense.

No one needs explicit permission to use LLM output.

in reply to Simon Willison

Oh, interesting way to do it. If it were just a language translation, I'd say that's like a book translation and would follow the original copyright; but hmm, splitting it through a design document is pretty clever.
in reply to Simon Willison

On the legal side, I am not an expert. But I understand the concerns about what moving to a more permissive license means for users' freedom.

And my general feeling is, well, generative AI is technically impressive, but it's really making a mess of the planet and of human relations.

I am not entirely stubbornly opposed (:p), otherwise following you would be masochism ;), but I struggle to find benefits in these tools for us, as a society.

in reply to Simon Willison

Pretty much what is happening here: quippd.com/writing/2025/12/17/…
in reply to Simon Willison

Obviously! The source code is the code required to produce the binary. That code was LGPL. It doesn't matter how many algorithms it goes through, or what kind of algorithms they are, on the way to becoming those 1s and 0s.

#law #lawfare #computerScience #intellectualProperty #licensing #FOSS #GNU #LGPL #MIT #code #softwareEngineering #LLM #codeWashing

in reply to Simon Willison

> Claude itself was very likely trained on chardet as part of its enormous quantity of training data—though we have no way of confirming this for sure

A note on that: It would be easy to paste snippets of code from the original codebase into Claude and ask it to analyze, attribute, and fill in the next few lines. Depending on what the answers are they may constitute a near certain confirmation.
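A rough sketch of how that check could be scored, in case it helps. Everything named here is an assumption for illustration: `complete` is a hypothetical stand-in for whatever function sends a prompt to the model and returns its text, and the line counts are arbitrary. It hides a few lines of a file, asks for a continuation of the prefix, and measures how close the answer is to the hidden lines. A high score on distinctive code would be suggestive, not proof, since boilerplate can be guessed without memorization.

```python
import difflib
from typing import Callable


def split_snippet(source: str, prefix_lines: int, held_out_lines: int) -> tuple[str, str]:
    """Split source into a prompt prefix and the lines that follow it."""
    lines = source.splitlines()
    prefix = "\n".join(lines[:prefix_lines])
    held_out = "\n".join(lines[prefix_lines:prefix_lines + held_out_lines])
    return prefix, held_out


def _normalise(text: str) -> str:
    # Ignore indentation and blank lines so formatting differences don't dominate.
    return "\n".join(line.strip() for line in text.splitlines() if line.strip())


def continuation_similarity(completion: str, held_out: str) -> float:
    """Return a 0.0-1.0 similarity between the model's text and the real lines."""
    matcher = difflib.SequenceMatcher(None, _normalise(completion), _normalise(held_out))
    return matcher.ratio()


def probe(
    source: str,
    complete: Callable[[str], str],  # hypothetical: prompt in, model text out
    prefix_lines: int = 5,
    held_out_lines: int = 3,
) -> float:
    """Score how closely the model continues a snippet it was only shown the start of."""
    prefix, held_out = split_snippet(source, prefix_lines, held_out_lines)
    return continuation_similarity(complete(prefix), held_out)
```

Running `probe` over many snippets, and comparing against snippets from code the model cannot have seen, would say more than any single score.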

in reply to Simon Willison

github.com/chardet/chardet/iss…
in reply to Simon Willison

The situation is interesting and the questions are challenging. It gets even more complex and undefined if you create a new AI-based implementation on top of an existing AI-generated implementation:

I created a Rust implementation of chardet based on this particular chardet version 7. I decided to use the original LGPL license for this AI-based-on-AI implementation (which is, by all numbers, much, much faster than v7).

github.com/zopyx/chardet-rust