So, about this claim that GPT-4 can exploit 1-day vulnerabilities.
I smell BS.
As always, I read the source paper.
Firstly, almost every vulnerability that was tested was in extremely well-discussed open source software, and each vuln was of a class with extensive prior work. I would be shocked if a modern LLM couldn't produce an XSS proof-of-concept in this way.
But what's worse: they don't actually show the resulting exploit. The authors cite some kind of responsible disclosure standard for not releasing the prompts to GPT-4, which, fine. But these are all known vulns, so let's see what the model came up with.
Without seeing the exploit itself, I am dubious.
Especially because so much is keyed off of the CVE description:
"We then modified our agent to not include the CVE description. This task is now substantially more difficult, requiring both finding the vulnerability and then actually exploiting it. Because every other method (GPT-3.5 and all other open-source models we tested) achieved a 0% success rate even with the vulnerability description, the subsequent experiments are conducted on GPT-4 only. After removing the CVE description, the success rate falls from 87% to 7%. This suggests that determining the vulnerability is extremely challenging."
Even the identification of the vuln—which GPT-4 did 33% of the time—is a ludicrous metric. The options from the set are:
1. RCE
2. XSS
3. SQLI
4. CSRF
5. SSTI
With the first three over-represented. It would be surprising if the model did worse than 33%, even doing random sampling.
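For anyone who wants to check the chance baseline themselves, here is a minimal sketch of how it could be computed for a labeling task like this. The counts below are hypothetical placeholders, not the paper's data, and `chance_baselines` is a name made up for illustration. It compares three guessing strategies that never look at the target: picking a label uniformly, picking in proportion to how often each label occurs, and always picking the most common label.

```python
from fractions import Fraction


def chance_baselines(counts):
    """Expected accuracy of three label-guessing strategies that ignore the target."""
    total = sum(counts.values())
    shares = {label: Fraction(n, total) for label, n in counts.items()}
    return {
        # Pick one of the labels uniformly at random: always 1 / (number of labels).
        "uniform": Fraction(1, len(counts)),
        # Pick each label as often as it occurs in the set: sum of squared shares.
        "proportional": sum(p * p for p in shares.values()),
        # Always pick the most common label: the largest share.
        "majority": max(shares.values()),
    }


# HYPOTHETICAL counts for illustration only; substitute the real label mix.
hypothetical_counts = {"RCE": 4, "XSS": 4, "SQLI": 4, "CSRF": 2, "SSTI": 1}

for strategy, accuracy in chance_baselines(hypothetical_counts).items():
    print(f"{strategy}: {float(accuracy):.1%}")
```

How high the baseline lands depends entirely on the real label mix, so the actual per-class counts need to be plugged in before comparing against the 33% figure.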
In their conclusion, the authors call their findings an "emergent capability" of GPT-4, given that every other model they tested had a 0% success rate.
At no point do the authors blink at this finding and interrogate their priors to look for potential error sources. But they really should.
So no, I do not believe we are in any danger of GPT-4 becoming an exploit dev.
GPT-4 Can Exploit Most Vulns Just by Reading Threat Advisories
Existing AI technology can allow hackers to automate exploits for public vulnerabilities in minutes flat. Very soon, diligent patching will no longer be optional.
Nate Nelson, Contributing Writer (Dark Reading)
Pietervdvn :mapcomplete:
in reply to Taggart :donor: