AI Digest: the main events of the week, March 24–28, 2026

TL;DR: OpenAI shut down Sora and lost Disney. Anthropic won the lawsuit against the Pentagon and accidentally disclosed data about Claude Mythos. Google released Gemini 3.1 Flash Live and TurboQuant. Figma opened a canvas for AI agents. Reddit is introducing biometric verification. Mistral and Cohere released open-source audio models. ChatGPT got cloud storage. Claude Code — auto mode and auto-dream. Cursor is learning in real time. And 15+ more events — all in one digest.

The week turned out to be one of the busiest this year: the race for next-generation models emerged from the shadows through leaks and rumors, the open-source community responded to corporate developments, and several major players revised their strategies. Let’s go through everything step by step — we won’t miss a single piece of news.

OpenAI: the end of Sora, Codex plugins, and advertising at $60 CPM
Anthropic: Claude Mythos leak, court victory, and computer use on Mac
Google: Gemini 3.1 Flash Live, TurboQuant, and an AI browser
Open-source wave: audio, agentic search, quantization
ARC-AGI-3: humans 100%, models <1%
Developer tools: Figma, Claude Code, Cursor, Codex
ChatGPT Library: 10 GB of cloud storage and file memory
Security: LiteLLM and Reddit
Music and voice: Suno v5.5 and Lyria 3 Pro
Hardware and infrastructure: Flash-MoE on MacBook
AI geopolitics: Manus, the Pentagon, GLM 5.1
Research tools: Feynman and neuraldeep.ru
Week in review
FAQ

1. OpenAI: the end of Sora, Codex plugins, and advertising at $60 CPM

Sora is shutting down — Disney leaves

OpenAI announced the shutdown of Sora as a standalone product: the app, website, and API are ceasing operation. At the same time, Disney canceled its partnership and a $1 billion investment. This is one of the largest public failures in video generation: a product that seemed revolutionary for content production a year ago failed to find a sustainable audience.

The reasons are likely complex: high generation costs, unpredictable quality for professional tasks, and growing competition from Runway, Kling, and Pika. Disney likely assessed the risks and preferred not to bet on a single supplier.

Codex: 1.6 million weekly users and 20+ plugins

OpenAI Codex reached 1.6 million WAU and launched a plugin marketplace with 20+ integrations: Slack, Figma, Notion, Gmail. This turns Codex from a code-writing tool into a full-fledged workflow orchestration hub.

Spud: OpenAI has completed pretraining of its next flagship model

According to The Information, OpenAI has completed pretraining of a new powerful model codenamed Spud. Details have not been disclosed, but the completion of the pretraining stage itself is an important milestone: next come fine-tuning, RLHF, and potentially a release within a few months.

Advertising in ChatGPT: expensive and without analytics

According to The Information, the ChatGPT advertising program is struggling: CPM is holding at $60 — 3 times more expensive than Meta. At the same time, there is no proper analytics system and the entry barrier for advertisers is high. The potential is enormous (ChatGPT’s audience is hundreds of millions), but monetization through ads has not taken off yet.

2. Anthropic: Claude Mythos leak, court victory, and computer use on Mac

Claude Mythos (codename: Capybara)

Fortune reported on a data leak involving Anthropic’s next flagship model — Claude Mythos, codenamed Capybara. According to the leaked data, the model significantly outperforms Claude Opus in capabilities, especially in coding and cybersecurity. Anthropic itself confirmed the leak, describing the model as a “step-change in capabilities,” but did not provide details.

This is a direct response to rumors about OpenAI Spud. The next-generation model race is unfolding faster than anyone expected — and now a broad audience knows about it.

Victory over the Pentagon

Anthropic won an injunction against the Trump administration’s attempt to restrict its work with the Department of Defense. The judge called the attempt “classic revenge” — a rare formulation for a federal case. This is an important precedent: attempts to use regulatory mechanisms as a political tool against AI companies will now face heightened legal scrutiny.

Claude gained computer control on Mac

Claude Computer Use is now available on macOS: the agent controls apps, the browser, the mouse, and the keyboard directly through the dispatch interface. Capabilities range from automating routine tasks to fully agentic execution of multi-step scenarios without human involvement.

3. Google: Gemini 3.1 Flash Live, TurboQuant, and an AI browser

Gemini 3.1 Flash Live — real time for voice agents

Google released Gemini 3.1 Flash Live, a real-time multimodal model optimized for voice agents:

90+ languages with native audio processing
2x longer conversation memory compared with the previous version
Low latency optimized for conversational interfaces

This is a direct competitor to GPT-4o Realtime and ElevenLabs Conversational AI; the voice agent market is becoming one of the most competitive segments.

TurboQuant: 6× less memory, 8× faster on H100

Google Research published TurboQuant, a technique for quantizing the KV cache down to 3 bits:

6× reduction in memory consumption
Up to 8× speedup on NVIDIA H100 GPU
A direct application for reducing the cost of large-model inference
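
To make the idea of a 3-bit KV cache concrete, here is a minimal sketch of plain uniform quantization in Python. It is not TurboQuant's algorithm, whose details are not covered in this digest; the function names and the sample vector are illustrative assumptions, and the sketch only shows what storing each cached value as one of 8 integer codes looks like.

```python
from typing import List, Tuple

BITS = 3            # bit width mentioned for the KV cache above
LEVELS = 2 ** BITS  # 8 representable codes per value


def quantize(values: List[float]) -> Tuple[List[int], float, float]:
    """Map floats to integer codes in [0, LEVELS - 1] using a per-vector scale."""
    lo, hi = min(values), max(values)
    # A constant vector gives hi == lo; fall back to 1.0 to avoid dividing by zero.
    scale = (hi - lo) / (LEVELS - 1) or 1.0
    codes = [round((v - lo) / scale) for v in values]
    return codes, scale, lo


def dequantize(codes: List[int], scale: float, lo: float) -> List[float]:
    """Recover approximate floats from the integer codes."""
    return [c * scale + lo for c in codes]


def payload_bits(n_values: int, bits_per_value: int) -> int:
    """Bits needed for the values alone, ignoring scale/offset metadata."""
    return n_values * bits_per_value


if __name__ == "__main__":
    # Hypothetical slice of a cached key vector, for illustration only.
    key_vector = [-1.0, -0.5, 0.0, 0.25, 0.5, 1.0, 0.75, -0.25]
    codes, scale, lo = quantize(key_vector)
    restored = dequantize(codes, scale, lo)
    worst = max(abs(a - b) for a, b in zip(key_vector, restored))
    print("codes:", codes)
    print("worst-case error:", worst)
    print("16-bit vs 3-bit payload ratio:",
          payload_bits(len(key_vector), 16) / payload_bits(len(key_vector), BITS))
```

Note that a naive 16-bit to 3-bit conversion yields a ratio of 16/3, a little over 5×. The 6× figure above presumably depends on the baseline and on details of Google's method that this sketch does not model.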

The community responded immediately: on Reddit, RotorQuant appeared (based on Clifford algebra, 10–19× faster than TurboQuant) and sparse V dequant for llama.cpp with a +22.8%

AI browser in Google AI Studio

Google showed an experimental AI browser in AI Studio that does not load existing pages, but generates them from scratch based on a prompt. It is still a prototype, but the direction is clear: future browsers may become interfaces for generating content, not just consuming it.

4. Open-source wave: audio, agentic search, quantization

Mistral Voxtral TTS — better than ElevenLabs by human evaluation

Mistral released Voxtral TTS — an open-source text-to-speech model that, according to independent testers, outperforms ElevenLabs. This is a serious blow to paid TTS solutions: Voxtral is available for local deployment and commercial use.

Cohere Transcribe — No. 1 on the HuggingFace Open ASR Leaderboard

Cohere released Transcribe — an open-source ASR model for audio transcription that took first place on the HuggingFace Open ASR Leaderboard. For the Russian market, this is especially interesting: a strong open-source alternative to commercial speech recognition solutions is emerging.

Chroma Context-1: agentic search at 25× lower cost than frontier models

Chroma released Context-1 — an open-source 20B-parameter model for agentic search:

Comparable to frontier models in search quality
10× lower latency compared to GPT-4o/Claude
25× cheaper in inference
License Apache 2.0

This changes the economics of RAG applications: a specialized small model can outperform a giant general-purpose monster on a specific task.

5. ARC-AGI-3: humans 100%, models <1%

ARC-AGI-3 has been released: a new interactive benchmark for agentic reasoning. The gap is enormous: humans solve 100% of tasks, while the best AI models on the current leaderboard solve less than 1%. The tasks test the ability to adapt to new rules during the solution process, something LLMs traditionally handle poorly.

This is a reminder that AGI is still more of a marketing term than a technical fact. Pattern matching on trillions of tokens is not the same as reasoning.

6. Developer tools: Figma, Claude Code, Cursor, Codex

Figma opened a canvas for AI agents via MCP

Figma opened the canvas via an MCP server for AI agents. Claude Code, Codex, Cursor, and others can now read and write directly on the Figma canvas. Free during the beta period. This is the first major design tool with native AI agent integration — potentially changing designers' workflows just as GitHub Copilot changed software development.
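
For readers who have not worked with MCP: the protocol is built on JSON-RPC 2.0, and an agent invokes a server-side tool with a `tools/call` request. The sketch below builds such a message in Python. The tool name `get_node` and its `node_id` argument are hypothetical placeholders, not Figma's actual tool names, which this digest does not list.

```python
import json
from itertools import count
from typing import Any, Dict

# Monotonically increasing ids so each request can be matched to its response.
_request_ids = count(1)


def tool_call_request(tool_name: str, arguments: Dict[str, Any]) -> str:
    """Serialize an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    message = {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    return json.dumps(message)


if __name__ == "__main__":
    # "get_node" and "node_id" are hypothetical, for illustration only.
    print(tool_call_request("get_node", {"node_id": "1:2"}))
```

In practice an MCP client library handles transport and session setup; the point here is only that a canvas read or write from an agent boils down to a structured, auditable tool call.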

Claude Code: Auto Mode and Auto-Dream

Anthropic released Claude Code Auto Mode — an autonomous task execution mode that is safer than the --dangerously-skip-permissions flag. Sonnet decides in real time which actions can be performed without user confirmation.

At the same time, auto-dream arrived: a mechanism for consolidating memory between sessions. In the background, the agent “digests” the experience of previous sessions and forms a compressed context for the next one, analogous to how the human brain processes memories during sleep.

Cursor: real-time RL and self-hosted agents

Cursor trains Composer through real-time RL directly in production — new checkpoints every ~5 hours. This means the model is literally improving from use right now, while you are reading this text.

Additionally, a self-hosted mode has appeared for Cursor’s cloud agents — for companies that care about data isolation and running in their own infrastructure.

7. ChatGPT Library: 10 GB of cloud storage and file memory

OpenAI launched ChatGPT Library — cloud file storage with a capacity of up to 10 GB with memory integration. Now ChatGPT can “remember” the contents of uploaded files across different chats: upload a report once — and you can refer to it in any subsequent conversation.

This qualitatively changes ChatGPT’s use case for working with documents: you no longer need to attach the file to a message every time.

8. Security: LiteLLM and Reddit

Supply chain attack on LiteLLM via PyPI

A serious incident: LiteLLM (3.4 million downloads per day) fell victim to a supply chain attack by the TeamPCP group. The attack chain:

Compromised the security scanner Trivy
Obtained a PyPI token
Uploaded malware stealing SSH keys, cloud tokens, and Kubernetes secrets

If you use LiteLLM — update immediately, check your logs, and rotate SSH keys and API tokens. This is a reminder that the open-source supply chain is a weak point, especially for popular packages in the AI stack.
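
As a first step, it helps to know exactly which version is installed in each environment. The sketch below uses the standard-library `importlib.metadata` to read the installed version and compare it against a list of known-bad releases. The `COMPROMISED_VERSIONS` set is deliberately empty and is an assumption on our part: the affected version numbers are not given in this digest, so fill it in from the official advisory.

```python
from importlib import metadata
from typing import Iterable, Optional

# PLACEHOLDER (assumption): populate from the official LiteLLM security
# advisory. This digest does not list the affected versions.
COMPROMISED_VERSIONS: frozenset = frozenset()


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def assess(version: Optional[str], bad_versions: Iterable[str]) -> str:
    """Classify an installed version against a set of known-bad versions."""
    if version is None:
        return "not installed"
    if version in set(bad_versions):
        return "COMPROMISED: upgrade and rotate credentials"
    return "not in the known-bad list (still review logs)"


if __name__ == "__main__":
    version = installed_version("litellm")
    print(f"litellm {version}: {assess(version, COMPROMISED_VERSIONS)}")
```

A clean result here does not prove the environment is safe: if a malicious release was ever installed, the secrets on that machine should be treated as exposed regardless of what is installed now.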

Reddit introduces biometric verification

Reddit announced biometric verification for suspicious accounts: passkeys, Face ID, and documents. This is a response to the mass spread of AI bots on the platform. The paradox of the moment: content generated by AI is being guarded against with biometrics that are themselves built with AI.

9. Music and voice: Suno v5.5 and Lyria 3 Pro

Suno released version v5.5 with a key new feature — music generation using the user’s cloned voice. Now you can literally write a song in your own voice without leaving the studio.

Google updated Lyria 3 Pro — now the model generates tracks up to 3 minutes long. The previous 30–60 second limit was the main practical barrier to use in real projects.

Audio AI is accelerating: what seemed like fantasy a year ago has become a routine feature.

10. Hardware and infrastructure: Flash-MoE on MacBook

Flash-MoE has been released: an inference engine written in C/Metal that streams a 397B MoE model directly from SSD on a MacBook with 48 GB of RAM at 5.5 tokens/sec.

For tasks that do not require an instant response — document analysis, batch processing, overnight pipelines — this opens up new possibilities for running huge models locally without server infrastructure. A year ago, this was the stuff of science fiction.
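
To get a feel for what 5.5 tokens/sec means in practice, here is a small back-of-the-envelope calculation in Python. The workload sizes (a 500-token answer, an 8-hour overnight window) are illustrative assumptions, and the estimate covers generation only, ignoring prompt processing time.

```python
TOKENS_PER_SECOND = 5.5  # throughput reported above for Flash-MoE


def generation_seconds(n_tokens: int,
                       tokens_per_second: float = TOKENS_PER_SECOND) -> float:
    """Seconds needed to generate `n_tokens` at a steady decode rate."""
    return n_tokens / tokens_per_second


def tokens_in_window(hours: float,
                     tokens_per_second: float = TOKENS_PER_SECOND) -> int:
    """Tokens that can be generated in a time window of `hours`."""
    return int(hours * 3600 * tokens_per_second)


if __name__ == "__main__":
    # Hypothetical workloads, for illustration only.
    print(f"500-token answer: {generation_seconds(500):.0f} s")
    print(f"8-hour overnight run: {tokens_in_window(8):,} tokens")
```

Roughly a minute and a half per 500-token answer is too slow for chat, but an overnight window of around 158,000 generated tokens is enough for a sizable batch of summaries or document analyses.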

11. AI geopolitics: Manus, the Pentagon, GLM 5.1

China detains Manus co-founders

China detained the co-founders of the Manus startup — the company Meta had planned to buy for $2 billion. The official reason has not been disclosed, but the context is obvious: Beijing is concerned about AI companies moving to Western jurisdictions. Manus has already moved its legal entity to Singapore. This is a signal to all Chinese AI startups considering a sale to Western buyers.

The Pentagon: Palantir Maven’s budget grew from $480 million to $13 billion

The U.S. Department of Defense formalized Palantir Maven AI as a core military system. The budget grew from $480 million in 2024 to $13 billion — a 27-fold increase in two years. Military use of AI has moved from the experimental phase to the operational one.

GLM 5.1: 94.6% of Claude Opus 4.6 in coding

GLM 5.1 has been released: the model reaches 94.6% of Claude Opus 4.6’s results on a number of coding benchmarks. The gap between Western and Chinese models continues to narrow, and this is not only a matter of competition between companies but also a question of technological sovereignty.

12. Research tools: Feynman and neuraldeep.ru

Feynman — a new open-source AI agent for scientific research, launched directly from the command line. Designed for working with academic sources, searching databases, and synthesizing information from multiple papers.

neuraldeep.ru — an aggregator of Russian-language skills, MCP, and CLI tools for AI developers. A convenient entry point for those working in the Russian market and looking for ready-made integrations. Telegram: @neuraldeep.

Week in review

Three cross-cutting trends:

1. The next generation of models is emerging from the shadows. Claude Mythos, OpenAI Spud, GLM 5.1 — all of them either leaked or made themselves known precisely this week. The next 3–6 months will be decisive for the balance of power.

2. Open-source is closing the gap faster than expected. Voxtral TTS beats ElevenLabs, Context-1 competes with frontier models at 25× lower cost, RotorQuant outperforms Google TurboQuant by 10–19 times. Corporate monopolies on quality are ending.

3. AI is becoming embedded in infrastructure. Figma + MCP, Cursor real-time RL, Codex plugins, Claude Computer Use, ChatGPT Library — AI is no longer a separate tool and is becoming a layer inside familiar work environments.

Frequently asked questions

Why did OpenAI shut down Sora? OpenAI has not given official reasons, but high infrastructure costs, low user retention, and Disney’s withdrawal of its $1 billion investment likely made it impractical to continue. The video generation functionality will most likely be integrated into other OpenAI products.

What is Claude Mythos and when can we expect it? Claude Mythos (codename Capybara) is Anthropic’s next flagship model. According to leaked data, it significantly outperforms Claude Opus, especially in coding and cybersecurity. The release date has not been officially announced.

What is ARC-AGI-3 and why does the gap between humans and models matter? ARC-AGI-3 is an agentic reasoning benchmark where humans solve 100% of tasks, while the best AI models solve less than 1%. This shows that current LLMs are good at pattern matching, but perform poorly on tasks that require genuine adaptation to new rules.

Is it safe to use LiteLLM after the attack? Update LiteLLM to the latest version, check the logs for suspicious activity, and change SSH keys and API tokens if the package was used in production before the patch.

What does the opening of the Figma canvas mean for AI agents? Through the MCP server, Claude Code, Codex, or Cursor can now directly read and modify Figma designs. This opens the way for automatic UI generation, code-and-design synchronization, and agentic prototyping — without manual export or copy-pasting.

What is Claude Code auto-dream? A memory consolidation mechanism: in the background, the agent processes the experience of previous sessions and forms a compressed context for the next one. It is similar to how the brain processes memories during sleep — hence the name.

*Follow the next digest — the race is only accelerating.*
