Why AI Gets It Wrong
I took a little break in December to enjoy the holidays with my family, and I came back thinking a lot about how our kids use AI. In my last post before the break, we dug into AI hallucinations and practiced fact-checking them with lateral reading. Today I want to go one level deeper by not just sharing how to spot when AI is wrong, but also explaining why it gets things wrong in the first place.
Garbage In, Garbage Out: The Foundation Problem
There's an old principle in computer science that perfectly captures AI's biggest limitation: "garbage in, garbage out."
Here's what that means: AI can only be as good as the data it learns from. Remember how we talked about AI being a text predictor trained on the internet? Well, the internet is a mixed bag. Alongside accurate, well-researched content, there are outdated Wikipedia entries from 2010, opinion pieces presented as facts, forum posts where people are guessing, blog articles with subtle inaccuracies, and websites promoting conspiracy theories or misinformation.
AI doesn’t have a built-in “truth detector.” It learns from everything it sees, good and bad alike. So when flawed data goes in, flawed understanding comes out.
That "100,000 kilometers of blood vessels" fact my son asked about that I shared in my last post? It's repeated all over the internet, which is why AI confidently gave us that answer. The system learned that pattern from thousands of websites, many of which are just copying each other without checking the science. The AI didn't know it was perpetuating a myth. It just knew that pattern was extremely common in its training data.
Even Good Content Can Create Bad Patterns
Here's something fascinating I've learned through my work. Even when the source material is accurate, AI can still get things wrong.
In my job, I track the accuracy of technical content about my company's products when it appears in chatbot responses. We've been finding inaccuracies, and here's the interesting part: we're seeing our own technical documentation cited in those inaccurate responses. Our content isn't wrong. But we're discovering that if our documentation isn't crystal clear, or if it's organized in ways that make patterns hard for AI to identify, the chatbot can misinterpret it or pull together pieces in ways that create new inaccuracies.
Sometimes our documentation needs updating to reflect current features. Sometimes the language needs to be more explicit about relationships between concepts. Sometimes the formatting needs to be more consistent so AI can parse it correctly.
This is the critical insight: AI doesn't think about accuracy or make judgments the way humans do; it finds patterns. And if those patterns are unclear, outdated, or organized in confusing ways, even in otherwise accurate content, the AI will follow the patterns it detects, not the truth we intended.
It's a humbling reminder that the "garbage in, garbage out" principle isn't just about obvious misinformation. It's also about how information is structured, how clearly it's written, and whether it's current. Even good content can lead AI astray if it's not optimized for pattern recognition.
The Internet Doesn't Actually Have Everything
Here's another piece of the puzzle. AI chatbots are working with an incomplete picture of human knowledge.
Why? Because huge amounts of valuable information simply aren't available on the public internet that AI systems can access during training.
Books are a perfect example. Most books, especially recent ones, are protected by copyright and never make it into AI training data. When I asked ChatGPT about character relationships in a novel I was reading, it couldn't access the actual book. So it filled in the gaps with patterns from similar stories, creating confident fiction that was totally wrong about both the relationships and how the characters developed.
Academic research often lives behind paywalls. Company databases, medical records, government archives, subscription-only journals, and proprietary research all stay locked away.
Even content that exists on the web might be in formats AI can't process well, behind login walls, or simply never indexed by the systems that gather training data.
So when your child asks AI about something that's not freely available online, the AI doesn't say "I don't have access to that source" (which I wish it would). Instead, it generates an answer based on similar patterns it has seen. It makes an educated guess that sounds authoritative. That's the hallucination problem we talked about last time, but now you know why it happens.
The Wider the Net, the Messier the Catch
Here's something critical that I've learned through trial and error using AI. When AI casts a wider net across the internet to find information, accuracy often goes down, not up.
It seems counterintuitive, right? Shouldn't more sources mean better information?
But think about what happens when AI pulls from a broader range of internet content. It's mixing high-quality research papers with random forum posts. Authoritative sources get blended with someone's blog from 2008. Current information gets muddied by outdated content that's still searchable. That myth about blood vessels gets reinforced because it appears on hundreds of sites.
What’s the result? AI responses that are incomplete, inaccurate, or both. All this happens because AI is pattern-matching across such a wide variety of quality levels and time periods that the signal gets lost in the noise.
This is exactly what we're seeing with our technical documentation at work. When AI pulls from multiple versions of our docs, some current and some outdated but still accessible to AI, it can mash together patterns from different time periods, creating answers that sound authoritative but describe features that no longer work that way.
The key takeaways are that broader isn't better and source quality matters more than source quantity.
What This Means for Our Kids
Understanding these mechanics changes how we think about AI with our kids. They need to know:
AI is predicting based on patterns, not retrieving verified facts. Every response is a statistical best-guess based on what usually comes next in similar contexts.
The training data has huge gaps. If information isn't on the public internet in a form AI can access, the chatbot will fill in the blanks with plausible-sounding patterns.
Even accurate content can lead to inaccurate answers. If source material isn't clearly written, well-organized, or current, AI can misinterpret patterns and create new errors.
Broader isn't better. Pulling from more internet sources doesn't guarantee accuracy—it often introduces more noise, outdated information, and conflicting patterns.
Verification is still essential. Even when AI pulls from good sources, it can misinterpret or mash together information in unexpected ways. That's why the lateral reading habit we practiced last time matters so much.
Remember that washing machine incident I mentioned in my last post? I gave the AI a photo and specific model information, which were pretty good inputs. But it still confidently made things up because it didn't have reliable source material about that particular machine in its training data. It followed patterns from similar appliances instead of admitting it didn't know.
Why This Understanding Matters
Our kids are growing up in a world where AI will be everywhere, in their homework tools, their phones, their future workplaces. If we want them to use it effectively, they need to understand both its power and its fundamental limitations.
They need to know that AI is working from an incomplete dataset using pattern-matching rather than actual knowledge. They need to learn that casting a wider net doesn't automatically mean better information. They need to understand that even good content can lead AI astray when it's not clearly written or organized.
This isn't about making them distrust AI. It's about helping them develop healthy skepticism combined with smart strategies.
What's Next
Now that you understand why AI gets things wrong, the natural question is: "What do I actually DO about this with my kids?"
That's next. I'll share specific strategies for helping your kids give AI better inputs through practical exercises, conversation starters, and a simple framework that makes this manageable.
Here's why we started with the conceptual foundation. When you understand that AI is pattern-matching from messy, incomplete data, the strategies will make so much more sense.
This Week's Try-It
This week, when your child mentions something they learned from AI, ask them: "Where do you think the AI got that information? What sources might it have used?" You're not fact-checking yet, but you're helping them start thinking about the invisible sources behind every AI answer. This awareness is the foundation for the strategies we'll practice in the next post.
A Note on Process
I used Claude, ChatGPT, and Gemini to help draft and refine this post. My process: I provided notes summarizing the blog's purpose and asked each AI to review my prior posts. Then I worked with them as collaborative editors by going back and forth to help me shape the ideas and language. The examples of AI in my family's life are all real experiences, not AI inventions.