Physical Bodies, Digital Minds

Issue #99 | 18th May 2026

Hey Superintelligence Fam 👋

This week, the AI race shifts from chat screens to warehouse floors and coding terminals. Humanoid robots are captivating millions by mastering real-world labor, proving that physical embodiment is no longer a distant dream.

Meanwhile, every major AI lab is relentlessly obsessed with autonomous coding. From mobile developer tools to heavy-thinking reasoning models, the industry’s brightest minds are fiercely competing to automate software engineering.

Let’s dive into what’s new this week.

Figure AI Turns Package Sorting Into Robot TV : Figure AI’s humanoids turned warehouse work into viral content, sorting 30,000+ packages over 24 hours with zero reported failures and drawing 3M+ views on X.
Grok Build Enters the Coding CLI Race : xAI launched Grok Build in early beta for SuperGrok Heavy users, bringing a terminal-native coding agent with plan mode, clean diffs, plugins, hooks, and MCP support.
Codex Moves From Desk to Phone : OpenAI’s Codex for mobile lets developers start, steer, approve, and monitor coding sessions from iOS and Android while staying connected to their existing workspace.
OpenAI Trial Ends, Musk Ecosystem Expands : TechCrunch frames the Musk v. Altman trial around AI leadership trust, alongside SpaceX IPO momentum, Anduril’s $5B Series H, and Musk-linked founder spinouts.

GPT-Rosalind : OpenAI’s GPT-Rosalind brings frontier reasoning to biology and drug discovery, connecting scientists to 50+ tools and ranking above the 95th percentile of human experts on prediction tasks.
Kimi K2.6 : Kimi K2.6 pushes open-source coding agents forward, handling 12-hour tasks and 4,000+ tool calls, and boosting exchange-core throughput by 185%.
Codex on Mobile : Codex now works from iOS and Android, letting developers start, steer, approve, and monitor live coding sessions while work continues on connected machines.
Qwen3.6-35B-A3B : Qwen3.6-35B-A3B is an open-source MoE model with 35B total and 3B active parameters, built for efficient agentic coding and multimodal tasks.

HeavySkill: Internalizing Orchestration with Parallel Reasoning and Deliberation : HeavySkill turns agentic “heavy thinking” into a learnable model skill, lifting GPT-OSS-20B from 69.7% to 85.5% on LiveCodeBench through parallel reasoning and deliberation.
Conductor: Learning to Orchestrate Agents in Natural Language : Sakana’s 7B Conductor beats frontier workers by coordinating them, scoring 93.3% on AIME25, 87.5% on GPQA-Diamond, and 83.93% on LiveCodeBench.
Self-Improving Pretraining: Using Post-Trained Models to Pretrain Better Models : Meta FAIR’s SIP uses stronger models as rewriters and judges during pretraining, improving factuality by 36.2%, safety by 18.5%, and generation quality up to 86.3%.

Also this week: EU cyber-governance, BaFin AI-risk inspections, UK frontier-model safeguards, and the Vatican’s AI study group. Meanwhile, Meta employees protested mouse-click and keystroke tracking, with internal petitions and backlash over workplace surveillance for AI training.

Thank you for tuning in to this week’s edition of Superintelligence Newsletter! Stay connected for more groundbreaking insights and updates on the latest in AI and superintelligence.

For more in-depth articles and expert perspectives, visit our website | Have feedback? Provide feedback.

To explore Superintelligence Media : Explore Here

Stay curious, stay informed, and keep pushing the boundaries of what’s possible!

Until Next Time!

Superintelligence Team.
