

An interesting stunt: Malus.sh will take your money and in exchange it will ingest any free/open source code you want, refactor that code using an LLM, and spit out a "clean room" version that is freed from all the obligations imposed by the original project's software license:

404media.co/this-ai-tool-rips-…

-

If you'd like an essay-formatted version of this thread to read or share, here's a link to it on pluralistic.net, my surveillance-free, ad-free, tracker-free blog:

pluralistic.net/2026/04/23/poi…

1/

in reply to Cory Doctorow

Great essay. Corporate America has a lot more to lose from this tool than the Free Software community.

Now every single corporate hack can result in their Golden Goose being freed, permanently.

in reply to Cory Doctorow

I anticipate thousands of "clean room" versions of open source software that will be unmaintained from the moment they are created. Short-term gain for the people doing this will become long-term technical debt paid by the people using it. This is actually a strong incentive for those of us who understand the consequences to rally behind real projects with real security updates.
in reply to Domestic Supply

@ddgulledge I look forward to hearing the horror stories from organizations so unethical as to think this is a good way forward.

Many moons ago I was asked to do some due diligence on a company. They proudly explained to me how they had extended a FOSS product, but had failed to contribute back or to integrate patches from the original. They were a major release out of date, and getting up to date would be major work.

For that reason my recommendation was that the investor give this one a pass.

in reply to Domestic Supply

@ddgulledge
Academic scientific software authors have been slitting each other's throats this way for decades.

You have a grad student re-derive all the features in a competitor's software and swear that they never saw the code. You write a paper with a few cherry-picked benchmarks to show how your software is the same or better.
Then the grad student leaves and new grants don't pay for maintenance on old software.
This is basically a five-year cycle.

@pluralistic

in reply to Domestic Supply

@ddgulledge Honestly, I see no such risk. There could be thousands of forks of open source things today! That's perfectly legal and trivial to do.

Using this to shift open source into a proprietary product (e.g. violating the GPL) is a legit risk.

Using this on proprietary software is another likely outcome (e.g. take that custom SAP app, get a clean-room implementation, stop paying).

in reply to Cory Doctorow

Huh!

You mean... A scummy, scammy business is using LLMs and selling a novel idea: steal other people's work and sell it, with little to no attention paid to actual consequences.

ORIGINAL, amirite??? It's not like LLMs were created exactly like this, or like their entire business model is focused on that practice. Whatsoever. No...

#LLM #ai #scam
in reply to Faraiwe

@faraiwe Ironically, that painting was elevated to a work of international importance (before that it was nothing notable) and the woman who started the restoration became a celebrity. I stopped using it as an example of failure and instead use it as one of a HUGE mistake paying off.
in reply to Scott Galloway

@scottgal She made some money. She also destroyed a wonderful piece of art and generated heaps of restoration work that may not even be feasible to carry out.

She is an obscure, one-hit meme maker, with an even more obscure online shop, cashing in on her shitjob.

Nobody knows her name. You'd have to search HARD.

And the search would need to include the original work title/artist.

It's the PERFECT analogue, what the hell are you talking about, man =D

#LLM #ai #scam

in reply to Faraiwe

@scottgal WITHOUT searching online.... WHAT IS HER NAME?

IS SHE ALIVE?

I remember the NAME of the art work.

I don't even know if she is alive.

So, yeah.

in reply to Faraiwe

@faraiwe @scottgal Cecilia Giménez Zueco. She died recently. She's quite well-known in Spain, and widely admired for her dedication to her community.
in reply to Cory Doctorow

Is that technically still a clean room design? In the original scenario, anybody who had ever even "seen" bits of the original code was deemed contaminated and couldn't be on the clean-room team.

Since every LLM has been trained on the original FOSS code, it must be seen as contaminated in the same way.

in reply to Cory Doctorow

One problem with the LLM-generated code is that the LLM was trained on the source code available on the net, i.e. all existing free software. So the part that recreates the free software from the spec has deep knowledge of the project it is going to rip off.

Which is not the same situation as the IBM case.

In fact, my copyright-police point here would be that any software created by LLMs is based on human works used as training data, and is therefore a derivative work of the training data. Which is all FOSS of one kind or another, different licenses mixed together. You have to comply with all of them.

I.e. LLM-generated code is already the free software apocalypse. It's not public domain. It is derived from copyrighted code.

This time Disney is our friend. They'll jump through all the hoops to make sure AI animation slop will count as Disney-derived work, and pave the road to claim all generated software back to the commons where it was ripped off.

in reply to Cory Doctorow

[PART 1/2]
*Applies an algorithm(compiler) to source code*
Corporations moving government's lips: "machine code is still copyrightable as a book despite not being written by or easily readable by humans".
*Applies an algorithm(LLM) to source code*
Corporations moving government's lips: "new source code is not covered by GPL despite being as readable by humans as a book".

And as usual, there are more than two sides in this: developers/users, AI corpos and publishers.

in reply to Cory Doctorow

Long thread/23
There are two things I don't get: first, the corporation is fine with Malus as long as it doesn't distribute the code (cloud service or compiled binary), isn't it? And second, if it does distribute readable code (not sure how readable minified JavaScript is, for instance), then how do the users know it is not copyrighted? If I look at random code on the Internet, either it explicitly says that it is open source, or I have to assume it is copyrighted, right?