journearn.com
Chart of the Week: AI Is a Black Box

By info@journearn.com | June 18, 2026 | 4 Mins Read
A strange thing happened last week.

Anthropic was forced to take its newest AI models offline only days after releasing them.

The company’s new Fable 5 and Mythos 5 systems were designed to be some of the most powerful AI models ever released. But shortly after launch, researchers discovered ways to get around some of the models’ built-in safety measures.

Government officials soon got involved as fears spread that these systems could become powerful cybersecurity weapons in the wrong hands.

Maybe those concerns were justified, and maybe they weren’t.

But to me, they raise an obvious question that not enough people are asking.

How would anyone know?

What’s Inside the Box?

Modern AI systems aren’t like traditional software.

Engineers don’t sit down and write lines of code telling them exactly how to reason through a problem.

Instead, researchers train these systems and then observe their behavior.

The result is what many researchers call a black box.

We can see what goes in, and we can see what comes out.

But what happens in between is often much harder to explain.

That’s why companies like Anthropic spend so much time studying AI interpretability, or the science of understanding how these systems arrive at their conclusions.

And that brings us to this week’s chart.

Because a group of researchers recently performed a strange experiment.

They secretly modified an AI model’s internal state. Then they asked whether the model could detect that something had changed.

Chart: AI interpretability experiment (Image: Uzay Macar and Li Yang)

This chart might look complicated, but the basic idea is simple.

Researchers injected information directly into an AI model’s internal processing, then tested whether it could tell the difference between those injections and its normal thought process.
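To make "injecting information" a little more concrete, here is a toy sketch in Python. It is not the researchers' code. It assumes that a model's internal state can be treated as a list of numbers and that a concept can be represented as a direction in that space. The names inject, projection, hidden and concept are made up for illustration, and the numbers are invented.

```python
import math

def inject(hidden_state, concept_vector, strength):
    """Shift a hidden state along the unit-length concept direction (toy assumption)."""
    norm = math.sqrt(sum(c * c for c in concept_vector))
    return [h + strength * c / norm for h, c in zip(hidden_state, concept_vector)]

def projection(vector, direction):
    """Measure how far a vector points along a direction."""
    norm = math.sqrt(sum(d * d for d in direction))
    return sum(v * d for v, d in zip(vector, direction)) / norm

# Invented numbers standing in for a real model's internal state.
hidden = [0.5, -1.0, 2.0, 0.0]
concept = [3.0, 0.0, 4.0, 0.0]

steered = inject(hidden, concept, strength=2.0)

before = projection(hidden, concept)
after = projection(steered, concept)

# The state moves along the concept direction by the injection strength.
print(round(after - before, 6))
```

In this sketch the "change" is easy to measure from the outside because we wrote it ourselves. The experiment asks a harder question: whether the model can report, in its own words, that such a shift happened.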

The chart compares three versions of the same model.

The first is the Base model, the raw AI system before it receives additional training.

The second is the Instruct model, which was trained to behave more like the helpful AI assistants most people interact with today.

The third is an Abliterated version of the model, where some of the refusal and safety behaviors were removed.

The blue line shows how often the model correctly detected a real change, while the orange line shows how often it falsely claimed that something changed when nothing had actually happened.
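The two lines can be thought of as two simple ratios. The sketch below shows one way to compute them in Python, assuming each trial records whether a change was really injected and whether the model reported one. The Trial class, the detection_rates function and the eight trials are invented for illustration and are not data from the study.

```python
from dataclasses import dataclass

@dataclass
class Trial:
    injected: bool          # was the internal state really changed?
    reported_change: bool   # did the model say something changed?

def detection_rates(trials):
    """Return (detection rate, false-alarm rate) for a list of trials."""
    injected = [t for t in trials if t.injected]
    control = [t for t in trials if not t.injected]
    detection = sum(t.reported_change for t in injected) / len(injected) if injected else 0.0
    false_alarm = sum(t.reported_change for t in control) / len(control) if control else 0.0
    return detection, false_alarm

# Made-up trials: four with a real injection, four without.
trials = [
    Trial(True, True), Trial(True, True), Trial(True, False), Trial(True, True),
    Trial(False, False), Trial(False, True), Trial(False, False), Trial(False, False),
]

detection, false_alarm = detection_rates(trials)
print(detection, false_alarm)  # 0.75 0.25
```

A model is only doing something useful when the first number is clearly higher than the second. If the two are close, its "yes, something changed" answers carry little information.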

And the results are surprising.

The Base model performed poorly. When researchers secretly altered its internal processing, it often couldn’t tell the difference between a real change and a false alarm.

But the Instruct model performed much better.

Somewhere during the additional training process, the model appears to have developed an ability to recognize when something unusual had happened inside its own processing.

And in several cases, the Abliterated model performed better still.

In other words, removing some of the AI’s safety and refusal behaviors actually improved the model’s ability to detect what was going on inside it.

That doesn’t mean the model became conscious or self-aware.

You can compare it to a computer server that detects when someone has tampered with its memory. The server isn’t aware of anything, but it can still recognize when something unusual has happened.

Researchers believe something similar happened here.

More importantly, they think capabilities like this could eventually help us better understand what’s happening inside advanced AI systems.

After all, these models have access to information that remains largely hidden from the people studying them.

Which means one way researchers could eventually learn more about advanced AI systems is by asking the systems themselves.

That might seem counterintuitive.

But it would give researchers something they’ve never really had before.

A window into what’s happening inside the model itself.

Here’s My Take

The primary goal of the AI industry has been to build more capable models.

But another challenge is gaining urgency.

Understanding them.

The controversy surrounding Anthropic’s latest models shows why we need to get a handle on this issue sooner rather than later.

Because it’s one thing to build a powerful AI system. It’s something else entirely to create a new form of intelligence yet only partially understand how it works.

So here’s my question to you:

If future AI systems become too complex for humans to fully understand on their own, would you trust AI to help explain what’s happening inside other AI models?

Or does that sound like asking the fox to guard the henhouse?

I’d love to hear what you think.

Let me know at dailydisruptor@banyanhill.com.

We won’t reveal your full name in the event we publish a response, so feel free to share your honest opinion.

Regards,

Ian King
Chief Strategist, Banyan Hill Publishing




