A2A v1.0 Ships, Conway Exposed & 96% of Enterprises Running Agents
New Tools & Platform Launches
C3 AI launched C3 Code, an enterprise agentic platform that autonomously handles the full app lifecycle — design, data modeling, testing, and deployment — from natural language. It scored 9.2/10 overall in independent evaluation (vs. Codex 6.0, Palantir AIP 7.7) and a perfect 10 for "domain intelligence," with 40+ industry-specific packages across defense, healthcare, and manufacturing.
GitHub Copilot CLI added a "Rubber Duck" cross-model review agent that uses a second model from a different AI family to catch errors before the primary agent acts. Claude Sonnet 4.6 + Rubber Duck (running GPT-5.4) closed 74.7% of the performance gap between Sonnet and Opus on SWE-Bench Pro, particularly on complex 3+ file changes.
GitHub Copilot CLI now supports BYOK and local models, letting enterprises plug in Azure OpenAI, Anthropic, Ollama, vLLM, or any OpenAI-compatible endpoint. A COPILOT_OFFLINE=true mode enables fully air-gapped development with all telemetry disabled.
Google open-sourced Scion, an experimental multi-agent orchestration testbed that acts as a "hypervisor for agents" — giving Claude Code, Gemini CLI, and Codex each their own isolated container, Git worktree, and credentials so they can work concurrently on the same codebase without collision.
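The per-agent isolation idea can be sketched in a few lines. This is not Scion's implementation (which reportedly uses containers and real Git worktrees); it is a minimal stand-in where each agent gets a private copy of the codebase plus its own scoped credential, and all names are illustrative.

```python
# Sketch of Scion-style isolation: every agent works in a private
# checkout with its own credential, so concurrent edits never touch
# shared state or each other.
import shutil
import tempfile
from pathlib import Path

def provision(agents: list[str], repo: Path) -> dict[str, dict]:
    """Give each agent a private copy of the repo and a scoped token."""
    sandboxes = {}
    for name in agents:
        workdir = Path(tempfile.mkdtemp(prefix=f"{name}-"))
        shutil.copytree(repo, workdir / "repo")   # private checkout
        sandboxes[name] = {
            "workdir": workdir / "repo",
            "token": f"scoped-token-for-{name}",  # per-agent credential
        }
    return sandboxes

# Demo: three agents edit the same file concurrently without collision.
repo = Path(tempfile.mkdtemp()) / "src"
repo.mkdir()
(repo / "main.py").write_text("print('hello')\n")

boxes = provision(["claude-code", "gemini-cli", "codex"], repo)
for name, box in boxes.items():
    (box["workdir"] / "main.py").write_text(f"# edited by {name}\n")

# The shared source is untouched; each sandbox holds only its own edit.
print((repo / "main.py").read_text())
```

The real system adds container boundaries and Git worktrees on top, but the core invariant is the same: no two agents ever write to the same working directory.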
Z.ai released GLM-5.1 under MIT license, an open-source model built for long-running autonomous coding agents. It scored 58.4 on SWE-Bench Pro — above GPT-5.4, Opus 4.6, and Gemini 3.1 Pro — and demonstrated sustained performance over 600+ iterations and 6,000 tool calls, improving a vector database task to 6x its single-session peak.
Atlassian embedded third-party AI agents into Confluence via MCP, including Lovable (turn product ideas into prototypes), Replit (convert docs into starter apps), and Gamma (build presentations). Also launched "Remix" in open beta — a visual tool that turns Confluence data into charts without leaving the app.
Infosys and Harness announced a partnership to automate the full post-code delivery lifecycle — testing, deployment, security, governance, reliability, and cost optimization — targeting large-scale, regulated, hybrid and multi-cloud environments.
A2A v1.0 Ships; MCP Reaches 97M Monthly Downloads
The Agentic AI Foundation (AAIF) formally released A2A v1.0 — the agent-to-agent coordination standard complementary to MCP — at its inaugural North American MCP Dev Summit. AAIF has surpassed CNCF in Linux Foundation membership in just three months.
MCP reached 97 million monthly downloads. Maintainers from Anthropic, AWS, Microsoft, and OpenAI reaffirmed MCP's scope as agent-to-resource connectivity only — observability, identity, and governance are explicitly out of scope, leaving those layers to other standards.
Enterprise keynotes from Uber, Nordstrom, Bloomberg, and PwC at the summit showed production MCP deployments, signaling the protocol has moved beyond experimentation in regulated sectors.
Enterprise Adoption Data: Mainstream but Sprawling
An OutSystems survey of 1,900 IT leaders finds 96% of enterprises now use AI agents, but 94% report AI sprawl is creating complexity, technical debt, and security risks. Only 12% have a centralized AI management platform; 38% mix custom and pre-built agents with no standardization.
An a16z analysis finds 29% of Fortune 500 and ~19% of Global 2000 are active paying customers of AI startups — a faster adoption curve than any prior tech wave. Coding is the dominant enterprise use case (10–20x productivity gains cited); technology, legal, and healthcare are the highest-adoption sectors.
McKinsey Senior Partner Senthil Muthiah warns that two-thirds of organizations running agentic AI are still stuck in pilots — and it's a strategy problem, not a technology one. For every $1 spent on technology, organizations need $2 in change management to realize benefits. No function within enterprises currently owns agent lifecycle management (creation, tuning, sunsetting).
Perplexity's pivot from AI search to task-executing agents drove 50% revenue growth in one month, with ARR topping $450M in March. The company now serves 100M+ monthly users including tens of thousands of enterprise clients.
Anthropic's "Conway": The Always-On Agent Hiding in the Leak
Analysis of the leaked Claude Code source reveals an undisclosed internal project called "Conway" — a standalone always-on agent environment separate from the chat interface, with its own extension format, browser control, external event triggers, and connections to enterprise tools. It was not on Anthropic's public roadmap.
Conway's architecture uses a proprietary .cnw.zip extension format that sits on top of MCP, creating a platform-controlled distribution layer — similar to how Google Play Services sits atop open-source Android. Extensions built for Conway work only inside Conway, not across MCP-compatible clients.
The strategic pattern emerging from the leak: Claude Code Channels (neutralized OpenClaw), Cowork (non-engineer workforce), Claude Marketplace (enterprise procurement), third-party harness restrictions (previously reported), and Conway as a persistent memory layer — executed across five surfaces in one quarter.
The core lock-in risk is behavioral, not data: if an agent accumulates six months of learned workflows, communication patterns, and organizational context, switching providers means losing that compounding — there is no "CSV of how this person thinks" to export. No legal frameworks for "intelligence portability" currently exist.
AI Code Quality Under the Microscope
A Sonar benchmark across 50 models and 4,000+ Java tasks finds Gemini 3.1 Pro leads with an 84% pass rate, but GPT-5.4 requires 4x more lines of code to achieve a comparable result — increasing review burden significantly. Every model tested introduces vulnerabilities; even the best (GPT-5.4 High) introduces 50 security issues per million lines of code with 20 blockers.
The open-source GLM-5 model (MIT license) generates the fewest bugs of any tested model and fewer lines than GPT-5.4, making it a cost-effective option for regulated enterprises that need to self-host and keep code off external APIs.
A METR study found that AI slowed teams by 19% when best practices weren't followed, with the lost time attributed to reviewing AI output, iterative prompting, and waiting on agents. The data comes from a controlled comparison of groups working with and without AI coding assistance.
An InfoQ analysis finds stateful WebSocket connections deliver 15–29% faster execution and 80–86% less data sent vs. stateless HTTP for multi-turn agentic coding loops. OpenAI Codex and Cline already support WebSocket; Claude Code and Cursor still use HTTP. At 1 million concurrent agent sessions, this represents a 144 GB reduction in payload per task.
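The payload gap follows from a simple model: a stateless HTTP loop resends the whole conversation on every turn (quadratic bytes over the session), while a stateful WebSocket sends only each turn's delta (linear). The numbers below are illustrative assumptions (2 KB per turn, 50 turns), not the InfoQ measurements, so the exact percentage differs from the cited 80–86%.

```python
# Back-of-envelope comparison of bytes sent over a multi-turn loop.
# Assumption: each turn appends one 2 KB message to the history.
TURN_KB = 2
TURNS = 50

# Stateless HTTP: turn t resends the entire t-message history.
http_kb = sum(TURN_KB * t for t in range(1, TURNS + 1))

# Stateful WebSocket: each turn sends only the new message.
ws_kb = TURN_KB * TURNS

savings = 1 - ws_kb / http_kb
print(http_kb, ws_kb, f"{savings:.0%}")  # 2550 100 96%
```

Under these toy assumptions the stateful connection sends 100 KB instead of 2,550 KB over the session; real-world savings are smaller because HTTP responses can be compressed and histories truncated.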