
Another Crazy Day in AI: How to Teach Models to Say “I Don’t Know”

Another Crazy Day in AI: An Almost Daily Newsletter

Hello, AI Enthusiasts.


If your brain's a bit fried by now, Google Research just gave us something fresh to chew on. They’re rethinking RAG systems—not just for relevance but for sufficiency. The takeaway? It’s not enough for AI to find facts. It needs enough of the right ones to actually answer (or know when not to).


Meanwhile, marketing pros are hitting the brakes on over-AI’d content. And if you’re on TikTok, your still photos are about to start moving. Literally.


Wednesdays used to be slow. Not anymore.


Here's another crazy day in AI:

  • Why RAG systems still hallucinate

  • Avoiding the AI marketing trap

  • TikTok introduces AI Alive to animate your photos

  • Some AI tools to try out


TODAY'S FEATURED ITEM: When AI Should Say Nothing


A robotic scientist in a classic white lab coat labeled 'AI Scientist' stands beside a human scientist whose coat reads 'Human Scientist'; the human looks toward the robot.

Image Credit: Wowza (created with Ideogram)


What if your AI model could tell you when it shouldn’t answer?


Researchers Cyrus Rashtchian and Da-Cheng Juan propose a new way to evaluate retrieval-augmented generation (RAG) systems—not just by how relevant the retrieved context is, but by whether it’s sufficient. Their paper, “Sufficient Context: A New Lens on Retrieval Augmented Generation Systems,” was presented at ICLR 2025 and dives into why RAG models hallucinate, how to classify when context is actually enough to answer a question, and what we can do to reduce false outputs.


They introduce the concept of “sufficient context”—the idea that the retrieved information must include everything needed to produce a correct answer. Using a new automatic rating method and a selective generation approach, they examine when it might be better for a model to skip answering altogether. Interestingly, their work shows that providing more context can sometimes make things worse—leading to hallucinations when the extra information isn’t quite right.
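If you’re curious what that rating step could look like in practice, here’s a minimal sketch in Python. To be clear, this is our illustration, not the paper’s implementation: call_llm is a hypothetical stand-in for whatever LLM client you use, and the prompt wording is our own.

# Minimal sketch of an LLM-based "sufficiency" autorater.
# Assumptions: call_llm is a hypothetical stand-in for your own
# LLM client, and the prompt wording below is ours, not the paper's.

AUTORATER_PROMPT = """\
Question: {question}

Retrieved context:
{context}

Does the context contain all the information needed to answer the
question? Answer with exactly one word: SUFFICIENT or INSUFFICIENT.
"""

def call_llm(prompt: str) -> str:
    """Hypothetical LLM call; swap in your provider's client."""
    raise NotImplementedError

def context_is_sufficient(question: str, context: str) -> bool:
    """Ask the rater model whether the context fully supports an answer."""
    reply = call_llm(AUTORATER_PROMPT.format(question=question, context=context))
    return reply.strip().upper().startswith("SUFFICIENT")

The appeal of this setup is that it needs no ground-truth answers: the rater only judges the question against the retrieved text.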


Source: Google Research

What they discovered about how RAG systems really behave:

  • Relevance ≠ Sufficiency: Context may be on-topic but still not have the information needed to answer a question accurately.

  • Introducing the “autorater”: An LLM-based tool classifies context as sufficient or insufficient with 93%+ accuracy—no ground truth answers needed.

  • Top models still struggle: Even the best models (Gemini, GPT, Claude) tend to hallucinate when context is insufficient, instead of just saying “I don’t know.”

  • Smaller models hallucinate more: Open-source models often fail even when context is technically sufficient.

  • More context, more confidence... more hallucinations: Adding context increases the risk of confident, wrong answers—especially in models like Gemma.

  • Selective generation helps: Combining model confidence with sufficiency ratings reduces hallucinations without sacrificing too many correct answers (a sketch of this rule follows this list).

  • Dataset matters: Datasets like FreshQA, with human-curated supporting docs, provide more sufficient context than others like HotPotQA.
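As promised above, here’s what a selective-generation rule might look like, paired with the autorater sketch from earlier. Again, this is illustrative rather than the paper’s exact method: the confidence score, the 0.7 threshold, and the abstention rule are all our assumptions.

from typing import Optional

def selective_generate(answer: str,
                       confidence: float,
                       context_sufficient: bool,
                       threshold: float = 0.7) -> Optional[str]:
    """Return the model's answer, or None to abstain ("I don't know").

    Combines the two signals the paper highlights: the model's own
    confidence and the autorater's sufficiency label. The specific
    rule and threshold here are illustrative assumptions.
    """
    if not context_sufficient and confidence < threshold:
        return None  # better to say nothing than to hallucinate
    return answer

# Example: insufficient context plus shaky confidence -> abstain.
result = selective_generate("Paris", confidence=0.4, context_sufficient=False)
print(result if result is not None else "I don't know")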


Source: Google Research

The implications are far-reaching for anyone working with RAG pipelines or trying to improve trust in AI-generated responses. Instead of focusing only on document relevance or retrieval hit rates, the paper encourages a closer look at whether the input truly supports a reliable answer. This shift in evaluation could influence how future systems are trained, optimized, and judged in real-world use.


For builders, this opens up a practical takeaway: sometimes, less is more. Knowing when not to answer—or when the retrieved context doesn’t cut it—might be just as important as knowing the right thing to say. As more teams deploy AI assistants into high-stakes settings, understanding and applying the idea of “sufficient context” could be key to reducing costly errors and improving user trust.




Read the full blog here.

Read the full paper here.

OTHER INTERESTING AI HIGHLIGHTS:


Avoiding the AI Marketing Trap

/Olivia Bunescu, Senior Associate Editor, on Multi-Housing News


As AI takes center stage in modern marketing strategies, experts warn against over-reliance. While AI tools streamline workflows and spark ideas, they can’t replace human intuition, emotional intelligence, or strategic thinking. From mishandling negative reviews to generating tone-deaf content, the risks of letting AI steer your messaging are real. Marketers are encouraged to treat AI as a creative collaborator—not the lead driver.



Read more here.


TikTok Introduces AI Alive To Animate Your Photos

/TikTok Newsroom


TikTok’s newest feature, AI Alive, transforms still images into dynamic video stories with atmospheric motion and expressive effects—no editing experience required. Integrated directly into the Story Camera, this creative tool brings static moments to life through subtle animation and sound, unlocking fresh storytelling possibilities for everyday users. Safety and transparency are built in, including visible AI-generated labels and behind-the-scenes content checks. It's another step in TikTok’s push to democratize creative tools for its global user base.





Read more here.

Source: TikTok

SOME AI TOOLS TO TRY OUT:


  • Fluig – Turn documents and ideas into diagrams instantly with AI.

  • Willow Voice – Fast, accurate AI dictation that works across any app.

  • Gleo AI – Rehearse tough conversations and get feedback on your speaking style.


That’s a wrap on today’s Almost Daily craziness.


Catch us almost every day—almost! 😉

EXCITING NEWS:

The Another Crazy Day in AI newsletter is on LinkedIn!!!



Wowza, Inc.

Leveraging AI for Enhanced Content: As part of our commitment to exploring new technologies, we used AI to help curate and refine our newsletters. This enriches our content and keeps us at the forefront of digital innovation, ensuring you stay informed with the latest trends and developments.








Copyright Wowza, Inc. 2025