A2A v1.0 Ships, Conway Exposed & 96% of Enterprises Running Agents

New Tools & Platform Launches

  • C3 AI launched C3 Code, an enterprise agentic platform that autonomously handles the full app lifecycle — design, data modeling, testing, and deployment — from natural language. It scored 9.2/10 overall in an independent evaluation (vs. 6.0 for Codex and 7.7 for Palantir AIP) and a perfect 10 for "domain intelligence," with 40+ industry-specific packages across defense, healthcare, and manufacturing.

  • GitHub Copilot CLI added a "Rubber Duck" cross-model review agent that uses a second model from a different AI family to catch errors before the primary agent acts. Claude Sonnet 4.6 + Rubber Duck (running GPT-5.4) closed 74.7% of the performance gap between Sonnet and Opus on SWE-Bench Pro, particularly on complex changes spanning three or more files.
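
The Rubber Duck loop itself is unpublished, so the following is only a minimal sketch of the cross-model review pattern it describes: a reviewer model from a different family gates the primary model's output, and the primary revises until the reviewer signs off. All function names and the toy stand-in models are assumptions, not Copilot internals.

```python
from typing import Callable

def cross_model_review(
    task: str,
    primary: Callable[[str], str],      # e.g. a Sonnet-class coding model
    reviewer: Callable[[str], str],     # e.g. a GPT-class model from another family
    max_rounds: int = 3,
) -> str:
    patch = primary(task)
    for _ in range(max_rounds):
        critique = reviewer(f"Review this patch for errors:\n{patch}")
        if critique.strip().upper().startswith("LGTM"):
            break  # reviewer found no blocking issues; ship the patch
        # feed the critique back to the primary model for a revision
        patch = primary(f"{task}\n\nReviewer feedback:\n{critique}")
    return patch

# Toy stand-ins so the sketch runs without API keys:
primary = lambda prompt: "patch-v2" if "feedback" in prompt else "patch-v1"
reviewer = lambda prompt: "LGTM" if "patch-v2" in prompt else "off-by-one in loop bound"

print(cross_model_review("fix pagination bug", primary, reviewer))  # patch-v2
```

The key design point is that the reviewer comes from a *different* model family, so the two models are less likely to share blind spots — the same reasoning behind human cross-team code review.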

  • GitHub Copilot CLI now supports BYOK and local models, letting enterprises plug in Azure OpenAI, Anthropic, Ollama, vLLM, or any OpenAI-compatible endpoint. A COPILOT_OFFLINE=true mode enables fully air-gapped development with all telemetry disabled.
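
What makes BYOK possible is that "OpenAI-compatible" is a de facto wire format: any backend exposing POST {base_url}/chat/completions with the standard JSON body is interchangeable. A rough sketch of that interchangeability — the base URLs are placeholders and none of this is documented Copilot configuration:

```python
import json

# Swappable OpenAI-compatible backends (URLs are illustrative placeholders;
# the Ollama and vLLM ports shown are those tools' defaults).
BACKENDS = {
    "azure":  "https://my-resource.openai.azure.com/openai/v1",
    "ollama": "http://localhost:11434/v1",
    "vllm":   "http://localhost:8000/v1",
}

def chat_request(base_url: str, model: str, prompt: str):
    """Build the request any OpenAI-compatible server accepts."""
    url = f"{base_url}/chat/completions"
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    return url, body

# Pointing the same client code at a local model instead of a cloud API:
url, body = chat_request(BACKENDS["ollama"], "qwen2.5-coder", "write a binary search")
print(url)  # http://localhost:11434/v1/chat/completions
```

Because only the base URL changes, an air-gapped setup like the COPILOT_OFFLINE=true mode described above reduces to pointing the client at a host inside the perimeter.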

  • Google open-sourced Scion, an experimental multi-agent orchestration testbed that acts as a "hypervisor for agents" — giving Claude Code, Gemini CLI, and Codex each their own isolated container, Git worktree, and credentials so they can work concurrently on the same codebase without collision.
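
The worktree half of that isolation model is plain Git: each agent gets its own branch and checkout backed by one shared object store, so concurrent edits never collide on a working directory. A sketch under the assumption that Scion shells out to `git worktree` (the container and credential layers are omitted):

```python
import subprocess, tempfile
from pathlib import Path

def run(*args, cwd=None):
    subprocess.run(args, cwd=cwd, check=True, capture_output=True)

# A throwaway shared repo standing in for the real codebase.
repo = Path(tempfile.mkdtemp()) / "shared-repo"
repo.mkdir()
run("git", "init", "-b", "main", cwd=repo)
run("git", "-c", "user.email=a@b.c", "-c", "user.name=ci",
    "commit", "--allow-empty", "-m", "init", cwd=repo)

worktrees = {}
for agent in ("claude-code", "gemini-cli", "codex"):
    path = repo.parent / f"wt-{agent}"
    # One branch and one checkout per agent, all sharing the same objects.
    run("git", "worktree", "add", "-b", f"agent/{agent}", str(path), cwd=repo)
    worktrees[agent] = path

print(sorted(p.name for p in worktrees.values()))
```

Each agent can now build, test, and commit in its own directory; merging the branches back is where the orchestration layer earns its keep.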

  • Z.ai released GLM-5.1 under MIT license, an open-source model built for long-running autonomous coding agents. It scored 58.4 on SWE-Bench Pro — above GPT-5.4, Opus 4.6, and Gemini 3.1 Pro — and demonstrated sustained performance across 600+ iterations and 6,000 tool calls, improving its result on a vector database task to 6x its single-session peak.

  • Atlassian embedded third-party AI agents into Confluence via MCP, including Lovable (turn product ideas into prototypes), Replit (convert docs into starter apps), and Gamma (build presentations). Also launched "Remix" in open beta — a visual tool that turns Confluence data into charts without leaving the app.

  • Infosys and Harness announced a partnership to automate the full post-code delivery lifecycle — testing, deployment, security, governance, reliability, and cost optimization — targeting large-scale, regulated, hybrid and multi-cloud environments.


A2A v1.0 Ships; MCP Reaches 97M Monthly Downloads

  • The Agentic AI Foundation (AAIF) formally released A2A v1.0 — the agent-to-agent coordination standard complementary to MCP — at its inaugural North American MCP Dev Summit. AAIF has surpassed CNCF in Linux Foundation membership in just three months.

  • MCP reached 97 million monthly downloads. Maintainers from Anthropic, AWS, Microsoft, and OpenAI reaffirmed MCP's scope as agent-to-resource connectivity only — observability, identity, and governance are explicitly out of scope, leaving those layers to other standards.

  • Enterprise keynotes from Uber, Nordstrom, Bloomberg, and PwC at the summit showed production MCP deployments, signaling the protocol has moved beyond experimentation in regulated sectors.


Enterprise Adoption Data: Mainstream but Sprawling

  • An OutSystems survey of 1,900 IT leaders finds 96% of enterprises now use AI agents, but 94% report AI sprawl is creating complexity, technical debt, and security risks. Only 12% have a centralized AI management platform; 38% mix custom and pre-built agents with no standardization.

  • An a16z analysis finds 29% of Fortune 500 and ~19% of Global 2000 are active paying customers of AI startups — a faster adoption curve than any prior tech wave. Coding is the dominant enterprise use case (10–20x productivity gains cited); technology, legal, and healthcare are the highest-adoption sectors.

  • McKinsey Senior Partner Senthil Muthiah warns that two-thirds of organizations running agentic AI are still stuck in pilots — and it's a strategy problem, not a technology one. For every $1 spent on technology, organizations need $2 in change management to realize benefits. No function within enterprises currently owns agent lifecycle management (creation, tuning, sunsetting).

  • Perplexity's pivot from AI search to task-executing agents drove 50% revenue growth in one month, with ARR topping $450M in March. The company now serves 100M+ monthly users including tens of thousands of enterprise clients.


Anthropic's "Conway": The Always-On Agent Hiding in the Leak

  • Analysis of the leaked Claude Code source reveals an undisclosed internal project called "Conway" — a standalone always-on agent environment separate from the chat interface, with its own extension format, browser control, external event triggers, and connections to enterprise tools. It was not on Anthropic's public roadmap.

  • Conway's architecture uses a proprietary .cnw.zip extension format that sits on top of MCP, creating a platform-controlled distribution layer — similar to how Google Play Services sits atop open-source Android. Extensions built for Conway work only inside Conway, not across MCP-compatible clients.
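
The actual .cnw.zip layout has not been published, so the following is purely speculative — every field name is invented. It only illustrates the structural claim: a platform-specific wrapper (manifest, triggers, signing) around an otherwise standard MCP server definition, where a generic MCP client could read the open core but not the proprietary glue.

```python
import io, json, zipfile

# Hypothetical manifest: "mcp" is the open-standard core, "conway" the
# platform-controlled layer a generic MCP client would not understand.
manifest = {
    "name": "crm-sync",
    "entry": "server.py",
    "mcp": {"transport": "stdio"},
    "conway": {"triggers": ["email.received"], "signed": True},
}

# Pack and unpack an in-memory .cnw.zip-style archive.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("manifest.json", json.dumps(manifest))
    zf.writestr("server.py", "# MCP server code would live here\n")

with zipfile.ZipFile(buf) as zf:
    loaded = json.loads(zf.read("manifest.json"))

print(sorted(loaded))  # ['conway', 'entry', 'mcp', 'name']
```

The Android analogy maps cleanly: "mcp" plays the role of AOSP, "conway" the role of Play Services — the part that only runs on the platform owner's surface.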

  • The strategic pattern emerging from the leak: Claude Code Channels (neutralized OpenClaw), Cowork (non-engineer workforce), Claude Marketplace (enterprise procurement), third-party harness restrictions (previously reported), and Conway as a persistent memory layer — executed across five surfaces in one quarter.

  • The core lock-in risk is behavioral, not data: if an agent accumulates six months of learned workflows, communication patterns, and organizational context, switching providers means losing that compounding — there is no "CSV of how this person thinks" to export. No legal frameworks for "intelligence portability" currently exist.


AI Code Quality Under the Microscope

  • A Sonar benchmark across 50 models and 4,000+ Java tasks finds Gemini 3.1 Pro leads with an 84% pass rate, but GPT-5.4 requires 4x more lines of code to achieve a comparable result — increasing review burden significantly. Every model tested introduces vulnerabilities; even the best (GPT-5.4 High) introduces 50 security issues per million lines of code, 20 of them blockers.

  • The open-source GLM-5 model (MIT license) generates the fewest bugs of any tested model and fewer lines than GPT-5.4, making it a cost-effective option for regulated enterprises that need to self-host and keep code off external APIs.

  • A METR study found that AI actually slowed teams down by 19% when best practices weren't followed — attributing the lost time to reviewing AI output, iterative prompting, and waiting on agents. The data comes from a controlled comparison of teams working with and without AI coding assistance.

  • An InfoQ analysis finds stateful WebSocket connections deliver 15–29% faster execution and send 80–86% less data than stateless HTTP for multi-turn agentic coding loops. OpenAI Codex and Cline already support WebSocket; Claude Code and Cursor still use HTTP. At 1 million concurrent agent sessions, that amounts to an aggregate payload reduction of roughly 144 GB per task.
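
The source of the gap is easy to model: a stateless HTTP turn must resend the full accumulated context, while a stateful WebSocket session sends only the new delta. A back-of-envelope sketch with illustrative numbers (these are assumptions, not InfoQ's measurements):

```python
def http_bytes(turns: int, delta: int) -> int:
    # Stateless: turn k resends all k-1 prior deltas plus its own.
    return sum(delta * k for k in range(1, turns + 1))

def ws_bytes(turns: int, delta: int) -> int:
    # Stateful: the connection holds context server-side; each turn sends one delta.
    return delta * turns

turns, delta = 40, 2_000  # 40-turn agent loop, ~2 KB of new content per turn
h, w = http_bytes(turns, delta), ws_bytes(turns, delta)
print(f"HTTP: {h:,} B  WS: {w:,} B  saved: {1 - w / h:.0%}")
# HTTP: 1,640,000 B  WS: 80,000 B  saved: 95%
```

The quadratic-vs-linear growth means the savings compound with loop length; real-world figures land lower than this toy model because HTTP deployments mitigate with compression and server-side context caching.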
