Another Crazy Day in AI: New Model Targets Knowledge Work and Coding
- Wowza Team

- 12 hours ago

Hello, AI Enthusiasts.
Made it through another week? Take a breath. Here’s what moved the needle in AI.
OpenAI just rolled out a new model built for people doing real, messy work. Early testers say it handles long-running tasks, heavy tooling, and complex reasoning better than earlier models, though results still depend on how you use it.
IBM and Pearson, meanwhile, are focusing on how AI can help people learn new skills faster without losing the human side of education.
And if your browser already feels chaotic, Google Labs is quietly experimenting with a new way to turn open tabs into something actually useful.
That’s enough future-thinking for now. Enjoy the weekend.
Here's another crazy day in AI:
GPT-5.2 launches with expert-level claims
IBM, Pearson launch global AI education partnership
Google Labs launches GenTabs for task-driven web navigation
Some AI tools to try out
TODAY'S FEATURED ITEM: The Newest Model from OpenAI

Image Credit: Wowza (created with Ideogram)
What if the next leap in technology isn’t just smarter… but finally capable of handling the work you never have time for?
OpenAI has released GPT-5.2, its newest and most advanced model, designed for deep professional work, long-running tasks, and tool-heavy projects. OpenAI highlights major improvements across coding, analytics, reasoning, safety, speed, and real-world task execution. A range of early testers, from productivity platforms to developer tools, reported stronger results in their own environments, though performance will depend on the specific task and how the model is used.
Here's what the release includes:
- Performs at expert level on 70.9% of professional tasks across 44 occupations in the GDPval benchmark, completing the work notably faster and at lower cost than human professionals
- Achieves 55.6% on SWE-Bench Pro, which covers real-world software engineering problems across four programming languages, and scores 80% on the earlier SWE-bench Verified test
- Produces 30% fewer errors than its predecessor, though OpenAI notes users should still verify outputs for anything critical
- Maintains accuracy across documents up to 256,000 tokens, making it more practical for analyzing lengthy contracts, reports, or complex multi-file projects
- Scores 98.7% on tool-calling benchmarks that measure the ability to coordinate multiple steps, such as customer service workflows that require accessing different systems
- Reaches 93.2% on GPQA Diamond, a graduate-level science assessment, and solves 40.3% of expert-level mathematics problems
- Offers three configurations: Instant for quick everyday tasks, Thinking for complex reasoning work, and Pro for situations where maximum accuracy justifies longer wait times
The numbers look impressive, and companies like Notion, Shopify, and Box reported seeing real improvements during testing. But there's an obvious question here: how much of this actually matters when you're sitting at your desk trying to get work done? Benchmarks measure specific things under ideal conditions: well-defined problems with clear success criteria. Most professional work doesn't look like that. It's messy, it changes halfway through, and it requires judgment calls based on context that's hard to explain, let alone feed into a prompt. OpenAI's own advice to double-check critical work suggests the company knows there's a gap between what performs well in tests and what you can rely on without supervision.
What's interesting is that GPT-5.2 seems to address some of the more frustrating limitations of earlier models—better at handling long documents, fewer instances of making things up, more reliable when it needs to use tools or complete multi-step tasks. Those are practical improvements that could genuinely save time if you're working with the kinds of tasks where AI already fits reasonably well. The pricing went up for API users, which tells you OpenAI thinks the quality boost is worth it, but also that not everyone will need or want to pay for that extra capability. Whether GPT-5.2 lives up to its claims will depend less on benchmark scores and more on whether it can handle the unpredictable, ambiguous situations that make up most people's actual workdays.
Check it out here.
Watch the news here.
OTHER INTERESTING AI HIGHLIGHTS:
IBM, Pearson Launch Global AI Education Partnership
/IBM Newsroom
IBM and Pearson are teaming up to create a new generation of AI-powered learning tools aimed at helping people adapt to a fast-changing workforce. Their collaboration focuses on personalized, skills-based learning solutions built on IBM’s watsonx platform, with the goal of improving how organizations upskill employees at scale. Pearson will also develop a custom AI learning platform with IBM to support better workflows, data-driven decisions, and new educational products. Beyond tools, both companies plan to explore ways to verify AI agents’ capabilities so organizations can deploy them more confidently.
Read more here.
Google Labs Launches GenTabs for Task-Driven Web Navigation
/Manini Roy, Senior Product Manager for AI Innovation, Chrome, and Amit Pitaru, Director, Creative Lab, on Google Blogs – The Keyword
Google Labs is introducing Disco, a new experimental space designed to rethink how we browse and interact with the web. Its first feature, GenTabs, uses Gemini 3 to understand your open tabs and tasks, then builds interactive mini-apps to help you get things done without writing code. Early testers are already using it for everything from trip planning to creating learning tools for kids. Google is opening a waitlist as it gathers feedback on what works, what doesn’t, and what future browsing might look like.
Check it out here.
SOME AI TOOLS TO TRY OUT:
That’s a wrap on today’s Almost Daily craziness.
Catch us almost every day—almost! 😉
EXCITING NEWS:
The Another Crazy Day in AI newsletter is on LinkedIn!!!

Leveraging AI for Enhanced Content: As part of our commitment to exploring new technologies, we used AI to help curate and refine our newsletters. This enriches our content and keeps us at the forefront of digital innovation, ensuring you stay informed with the latest trends and developments.




