

I made my LLM stop bullshitting. Nothing leaves your machine.


This entry was edited (4 days ago)
in reply to SuspciousCarrot78

I'm no dev so I don't understand all the technicalities but if I got it right you made it so the AI is itself showing how confident it is about its own answers? That is neat.

Not sure I understand the downvotes? Isn't it a good idea to make it harder for AI to be telling bullshit without blushing?

in reply to Libb

That's exactly what I did. And in the course of doing that, I gathered almost 10,000 data points to prove it, showed my work and open sourced it. (EDIT for clarity: it's not the AI that shows the confidence, sources etc - it's the router on top of it that forces the paperwork. I wouldn't trust an AI as far as I could throw it. But yes, the combined system shows its work).

You don't need to be a dev to understand what this does, which is kind of the point. I don't consider myself a dev - I was just unusually pissed off at ShitGPT, but instead of complaining about it, I did something.

Downvotes: dunno. Knee-jerk reaction to anything AI? It's a known thing. Ironically, the thing I built is aimed exactly at that AI slop shit.

To say I dislike ChatGPT would be to undersell it.

This entry was edited (5 days ago)
in reply to SuspciousCarrot78

TLDR.

So you basically solved humanity's problems with LLMs. You should sell it to NVIDIA and get rich - no more hallucination.

in reply to CodenameDarlen

TL;DR:

The post has a section called "So, wait…are you saying you solved LLM hallucinations?" followed by the word "No." in large letters.

You'd have found it if you'd read past the title. I'll go back and bold it for you.

But if you have a hook up at NVIDIA that wants to buy me a shiny new car, I'll put on a pretty dress and bat my eyelashes.

in reply to SuspciousCarrot78

You should have made it clear in the title. The title is the most important part, and you're literally saying you've made your LLM stop "bullshitting". Pretentious phrasing to draw everyone's attention. Then in the body you correct everybody's assumptions. Dirty move.
in reply to CodenameDarlen

Yeah, I did stop it bullshitting. Quite literally.

Also, "bullshitting" isn't a rhetorical flourish; it's a defined term in the AI ethics literature. The model produces fluent, confident output without any mechanism to assess truth. That's the domain-accepted definition of bullshit. No bullshit. See -

link.springer.com/article/10.1…

This entry was edited (4 days ago)
in reply to SuspciousCarrot78

I was like, why aren't you publishing this to a conference/journal if it's good? Then I realized that you are doing exactly that.
Kudos for the work, looking forward to the progress!
in reply to someacnt

Getting shit published - especially as an outsider to the field - involves getting raked over the coals. If someone in the field can vouch for me on arXiv (later), that might help, because that's at least a low-level signal that what I have is interesting and within the field.

Writing journal articles, especially contentious ones, is usually 6-8 weeks of writing and then 6 months of back and forth with reviewers / trying really hard not to hang yourself from the ceiling fan.

This entry was edited (4 days ago)
in reply to SuspciousCarrot78

Although I'm generally opposed to AI in general and LLMs in particular, this project seems really cool. Might actually change my stance on LLM usage. Kudos and hope this gets more attention and development!
in reply to machiavellian

Me too! I built it to be used, so if people use it, that's my win.
in reply to SuspciousCarrot78

So basically, you created a prompt wrapper that removes position bias by using trust to evaluate both, and forcing an evidence path with scratch. This is a really cool development. It probably will not solve everything, but it solves a lot.

Is llama open source?

in reply to ScoffingLizard

This entry was edited (4 days ago)
in reply to SuspciousCarrot78

I think it's interesting? It's kind of hard to tell.

You are going to have to significantly tone down the editorialization and platitudes to get this to a place where a journal might consider it.

Make the point of how it’s novel or useful by explaining what it does, not by repeating that it’s novel and useful.

in reply to seadoo

Well, this was a social media post, aimed at an intelligent, non-scholarly audience. The preprint is a different document with a different structure entirely: bounded claims, explicit limitations, disclosed adjudication gaps, no words like "novel" or "revolutionary" anywhere in it. Not my first rodeo :)

If the preprint has specific passages that read as editorialized, point them out and I'll fix them. But "tone it down for journals" is feedback for a document that isn't trying to be submitted to journals.

The draft is here

in reply to seadoo

The description has such an unsettling, overconfident, llm-style tone for a project described as something to challenge LLM hallucinations.
in reply to glarf

Hmm. The post has swearing, a personal ASD disclosure, a Feynman quote, statistics, reference to Lawrence of Arabia and ends with "a meat popsicle wrote this," with a link to a blog as proof and a scientific pre-print with almost 10,000 data points (with raw data and errata). If you have an LLM that can do that, kudos to you.

If there are specific passages that pattern-match to LLM output for you, point them out and I'll look.

But "confident tone" and "LLM tone" aren't the same thing - I'm just not apologetic about what the project does.

The data is the data.

I'm not going to alter the way I write to approximate Reddit Common.

This entry was edited (4 days ago)
in reply to SuspciousCarrot78

Good for you, welcome to the internet, where people's opinions abound. I didn't accuse you of writing it with an LLM; I said it was an LLM style. If you don't like my opinion, that's fine with me. I simply found the writing style unsettling. Cheers!
in reply to glarf

"I have introduced myself. You have introduced yourself. This was a very good conversation."

Confidence: Zero | Source: Model

This entry was edited (4 days ago)
in reply to glarf

LLMs were created by reading millions of *social media posts written by neurodivergent people sharing their passions online.

*edit: spelling

This entry was edited (4 days ago)
in reply to SuspciousCarrot78

So I was curious about how you accomplished this and took a look with the robots to figure it out.

TL;DR: the router is a massive decision tree using heuristics and regex to avoid LLM calls on unprefixed prompts.

I think this is an interesting, brute force approach to the problem, but one that will always struggle with edge cases. The other bit it will struggle with is transparency. Yes, it might be deterministic because it is a decision tree, but unless you really understand how that decision tree works under the hood and know where the pitfalls are, you're going to end up talking to the LLM a lot of the time anyhow.

Something you might want to consider is doing a fine-tune of a smol model (think something like qwen3:1.7B, or even smaller, like one of the gemma3n sub-1B) that will do the routing for you. You can easily build the dataset synthetically or harvest your own logs. I think this might end up covering more edge cases more smoothly, without resorting to a big call to a larger model.
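For readers curious what "heuristics and regex" routing looks like in practice, here's a minimal, hypothetical sketch (not the project's actual code - the rule patterns and lane names are made up): cheap deterministic checks run first, and only prompts that match nothing fall through to the model.

```python
import re

# Hypothetical routing table: (pattern, lane). Real routers would have
# many more rules; these are illustrative only.
RULES = [
    # bare arithmetic like "2 + 2" goes to a deterministic calculator
    (re.compile(r"^\s*(\d+(\.\d+)?\s*[-+*/]\s*)+\d+(\.\d+)?\s*$"), "calculator"),
    # simple unit-conversion phrasing goes to a lookup lane
    (re.compile(r"^(what|convert)\b.*\b(km|mi|kg|lb)\b", re.I), "unit_convert"),
    # "define X" goes to a dictionary lane
    (re.compile(r"^define\b", re.I), "dictionary"),
]

def route(prompt: str) -> str:
    """Deterministic-first routing: no LLM call unless nothing matches."""
    for pattern, lane in RULES:
        if pattern.search(prompt):
            return lane   # handled without touching the model
    return "llm"          # fall through to the model lane
```

The appeal is that every non-`"llm"` answer is exactly reproducible; the cost, as noted above, is that edge cases slip through the cracks in the rules.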

in reply to okwhateverdude

This entry was edited (4 days ago)
in reply to SuspciousCarrot78

Cool man. It is really refreshing to see this level of engagement. You've really thought this through. You're right about the routing model moving it up a level, and also about retraining. It's all trade-offs.

Are you intending this for others to use or is this really just for you? Because I think what you're slowly building is a power tool with a whack-a-mole set of routing tweaks specifically for you. Nothing wrong with that, but the barrier to entry for others to use this is reading that routing and understanding the foibles that have been baked in with your preferences in mind, and even adding fixes and tweaks of their own which kinda breaks the magic a little.

This was really the point I was making about transparency.

I appreciate others also doing real work with potato GPUs because I, too, have a potato GPU (6GB). I think there is real utility in continuing to develop this.

I'll give this a star and follow along. It doesn't really fit my mental model of how I'd like my harness to behave, but I will totally steal some of these ideas.

in reply to okwhateverdude

It's for everyone to use :)

I get that it's maybe an acquired taste though.

Steal what you can, make it better, and then I can steal it back.

And thanks for the star!

This entry was edited (4 days ago)
in reply to SuspciousCarrot78

Can't it source other LLM outputs as "verified sources" and thus still say whatever sounds good, like any LLM? Providing "technical" verification, e.g. a SHA, gives no assurance about the content itself being from a reputable source. I don't think adding confidence and sourcing changes anything; the user STILL has to verify that whatever is provided is coherent and that a third party is actually a good source. Thanks for making the process public though - doing better than OpenAI does.
This entry was edited (4 days ago)
in reply to utopiah

This entry was edited (4 days ago)
in reply to SuspciousCarrot78

Isn't "Source: Model" basically roulette? We're back to the initial problem. Also, anything else that is not Model might also be hallucinated if, at any point, the string that gives back "Source:" goes through the model.
in reply to utopiah

Nope.

  1. Source: Model is not pretending otherwise
    It is basically “priors lane.” That’s the point of the label: explicit uncertainty, not fake certainty.
  2. Source footer is harness-generated, not model-authored
    In this stack, footer normalization happens post-generation in Python. I've specifically hardened this because of earlier bleed cases. So the model does not get to self-award Wiki/Docs/Cheatsheets etc.
  3. Model lane is controlled, not roulette
  • deterministic-first routing where applicable
  • fail-loud behavior in grounded lanes
  • provenance downgrade when grounding didn’t actually occur

So yes: Source: Model means “less trustworthy, verify me.” Always do that. Don't trust the stochastic parrot.

But also no: it’s not equivalent to a silent hallucination system pretending to be grounded. That’s exactly what the provenance layer is there to prevent.
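Point 2 - harness-generated footers - might look something like this minimal Python sketch (hypothetical names and regex; not the project's actual hardened implementation). The idea is that any footer the model authored is stripped post-generation and the harness stamps its own, downgrading provenance when grounding didn't actually occur:

```python
import re

# Matches a trailing "Source: <token>" footer, however the model cased it.
FOOTER_RE = re.compile(r"\n*Source:\s*\S+\s*$", re.IGNORECASE)

def stamp_provenance(model_output: str, grounded: bool) -> str:
    """Harness-side footer: the model never gets to self-award a source."""
    # Strip any footer the model tried to author itself (the "bleed" case).
    body = FOOTER_RE.sub("", model_output).rstrip()
    # Provenance downgrade: "Context" only if the harness verified grounding.
    source = "Context" if grounded else "Model"
    return f"{body}\n\nSource: {source}"
```

So even if the model writes "Source: Wiki" in its own output, that claim never survives to the user; the footer the user sees is computed outside the model.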

This entry was edited (4 days ago)
in reply to utopiah

Fair, but that's the same problem human thinkers face. Faulty inputs == faulty outputs. You should always be validating your sources.
in reply to JustinTheGM

Right, but if one person keeps giving me wrong answers, knowingly or not, my distrust in them is not linear. They'll have to "earn" it back, and that's going to be very challenging. If they do learn though, then it might come back faster. In this setup I have no guarantee of any progress. There's no "one" in there trying to fix any mistake.
in reply to utopiah

This entry was edited (4 days ago)
in reply to SuspciousCarrot78

AI will think for you if you prompt it to do so. It's up to the user to use the tool in a way that suits their style.
in reply to SuspciousCarrot78

Thanks for sharing. I've not yet delved into reading it in depth but appreciate your goals and the fact that you documented it all.
in reply to twoBrokenThumbs

You're welcome. Hope it makes sense. If not, you can marvel at the (many, many) nested swears in my commit messages.
in reply to SuspciousCarrot78

Looks interesting. Will give it a whirl on my home server.

In this article, they talk about bringing up a local RAG system to let people run an LLM off a large document corpus: en.andros.dev/blog/aa31d744/fr…

Wonder if this, connected to something like that, and wrapped in an easy end-user friendly script or UI could be a good combination for a local, domain-specific, grounded knowledge-base?

in reply to fubarx

I genuinely don't know. A small part of llama-conductor is a triple pass RAG system, using Qdrant, but the interesting bit is what sits on top of it. It's a thinker/critic/thinker pipeline over RAG retrieval.

  • Step 1 (Thinker): Draft answer using only the retrieved FACTS_BLOCK
  • Step 2 (Critic): Check for overstatement, constraint violations
  • Step 3 (Thinker): Fix issues, output structured answer
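The three passes above could be sketched roughly like this (a hedged illustration only - the `generate` callable and the prompt wording are my assumptions, not llama-conductor's actual code):

```python
def thinker_critic_thinker(facts_block: str, question: str, generate) -> str:
    """Three-pass pipeline over retrieved facts.

    `generate` is any callable that takes a prompt string and returns
    the model's completion (e.g. a wrapped local llama call).
    """
    # Step 1 (Thinker): draft strictly from the retrieved FACTS_BLOCK.
    draft = generate(
        f"Answer using ONLY these facts:\n{facts_block}\n\nQ: {question}"
    )
    # Step 2 (Critic): flag overstatement and constraint violations.
    critique = generate(
        f"List any claims in this draft not supported by the facts.\n"
        f"FACTS:\n{facts_block}\n\nDRAFT:\n{draft}"
    )
    # Step 3 (Thinker): repair flagged issues, emit the structured answer.
    return generate(
        f"Rewrite the draft, fixing these issues:\n{critique}\n\nDRAFT:\n{draft}"
    )
```

Three model calls per answer instead of one, so it's slower; the bet is that the critic pass catches the overreach a single pass would have shipped.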

I built it that way based on what the research shows works best to reduce hallucinations:

  • Let's Verify Step by Step
  • Inverse Knowledge Search over Verifiable Reasoning

To be honest, I have been looking at converting to CAG (Cache Augmented Generation) or GAG (Graph Augmented Generation). The issues are - GAG still has hops, and CAG eats VRAM fast. Technically, for a small, curated domain, CAG potentially outperforms RAG (because you eliminate the retrieval lottery entirely). But on a potato that VRAM ceiling arrives fast.

OTOH, for a domain-specific knowledge base like you're describing, CAG is worth serious evaluation.
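For a rough sense of the RAG/CAG trade-off described above, a hedged sketch (function names are illustrative, not from any real library): RAG gambles on retrieval per query, while CAG pays the context cost up front by carrying the whole curated corpus along every time.

```python
def build_prompt_rag(query: str, retrieve, k: int = 5) -> str:
    """RAG: top-k retrieval per query - cheap, but a 'retrieval lottery'."""
    chunks = retrieve(query, k)  # may miss the relevant passage entirely
    return "\n".join(chunks) + f"\n\nQ: {query}"

def build_prompt_cag(query: str, full_corpus: str) -> str:
    """CAG: the entire curated corpus rides along in context.

    No retrieval misses, but VRAM / context-window cost grows with the
    corpus - which is why the ceiling arrives fast on a potato GPU.
    """
    return full_corpus + f"\n\nQ: {query}"
```

In practice CAG implementations also pre-compute the KV cache over the corpus once and reuse it, which is where the VRAM goes; the sketch above only shows the prompt-assembly difference.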

Needs more braining on my end.

This entry was edited (4 days ago)
in reply to SuspciousCarrot78

The problem with CAG is not just that it hogs memory, but to keep it fresh you have to keep re-indexing. If the corpus is large and dynamic, it can easily fall out of date and, at runtime, blow out the context window.

GraphRAG has some promise. NVidia has a playbook for converting text into a knowledge graph: build.nvidia.com/spark/txt2kg

It'll probably have the same issues with reindexing, but that will be a common problem, until someone comes up with better incremental training/indexing.

in reply to SuspciousCarrot78

I have trouble understanding what makes it list "Context" as its source as opposed to "Model", and how that makes it any more deterministic. Can you give a more detailed example?
This entry was edited (2 days ago)
in reply to iByteABit

This entry was edited (4 days ago)
in reply to SuspciousCarrot78