Featured

Hand-picked AI developer news. Tools, models, and breakthroughs that matter.

  • Bright Data brings proxies to Claude Code [VIDEO]

    3h ago

    A demonstration video shows how Bright Data's proxy and web scraping services can be integrated directly into the Claude Code CLI. With Bright Data, developers can handle complex web fetches, bypass bot detection systems, and retrieve clean HTML for further processing or agent use within the terminal.

    "Integrating Bright Data's proxies directly into Claude Code CLI allows developers to perform complex web fetches and bypass bot detection systems during command-line agent workflows."

  • Hermes Agent gets DigitalOcean one-click deploy [INFRA]

    3h ago

    Nous Research’s open-source Hermes Agent is now packaged as a DigitalOcean Marketplace 1-Click Solution, making it easier to run the persistent, memory-backed agent on a Droplet. The setup targets developers who want a self-hosted agent reachable from Slack, Discord, Telegram, email, and other surfaces while retaining scheduled jobs, tool use, and MCP-style extensibility.

    "The new DigitalOcean 1-Click deployment makes it significantly easier for developers to run and scale self-hosted, persistent, memory-backed Hermes Agents with MCP extensibility."

  • Rocket launches 1.0 with vibe solutioning [UPDATE]

    4h ago

    Rocket has released its version 1.0 upgrade, introducing vibe solutioning to merge AI code generation with market research and competitive tracking in a single workspace. The capability ensures that all generated code automatically inherits project-wide strategic context, streamlining the process of building products aligned with market trends.

    "Rocket's 1.0 release introduces vibe solutioning to streamline developer workflows by merging AI code generation with real-time market research and strategic context."

  • Socket updates MCP server for security audits [UPDATE]

    4h ago

    Socket has updated its Model Context Protocol (MCP) server, enabling AI assistants to perform deep supply chain security investigations by inspecting package contents, auditing organization alerts, and querying its threat feed. The integration allows developers and security teams to triage vulnerabilities and analyze malicious packages using natural language directly within their assistant's context.

    "Socket's updated MCP server allows developers to perform deep supply chain security investigations and analyze package vulnerabilities directly within their AI assistant's context."

  • Vercel has launched Eve, an open-source, filesystem-first framework for building, running, and scaling durable AI agents in production. [LAUNCH]

    4h ago

    Vercel has introduced Eve, an open-source TypeScript framework that adopts a filesystem-first design to simplify building and scaling AI agents. By structuring agents as directories with specific files for instructions, TS tools, and configurations, Eve makes agent composition highly intuitive. The framework provides production-ready infrastructure out of the box, featuring durable execution through Vercel Workflow to persist state across sessions, isolated sandboxed compute for secure execution, and built-in tracing and observability.

    "Vercel's open-source Eve framework provides a filesystem-first design to simplify building, running, and scaling durable AI agents in production with sandboxed compute and built-in tracing."

  • Microsoft expands MAI-Code-1-Flash to Copilot CLI [UPDATE]

    5h ago

    Microsoft has expanded the availability of its MAI-Code-1-Flash model, which is custom-tuned for GitHub Copilot, across additional surfaces including the Copilot CLI. This 5B-parameter model is optimized for fast, agentic coding tasks, providing developers with high-speed performance and quality that matches or outperforms other small models.

    "Microsoft's expansion of the MAI-Code-1-Flash model to the Copilot CLI brings a fast, custom-tuned 5B-parameter model optimized for agentic coding directly to developer terminal sessions."

  • Google has officially transitioned individual developer accounts from Gemini CLI to the new Antigravity CLI, sunsetting legacy services for individual tiers. [UPDATE]

    5h ago

    Google has announced the transition of individual developer accounts—including free, AI Pro, and Ultra tiers—from Gemini CLI to the new Go-based Antigravity CLI. Consequently, Gemini CLI has ceased serving requests for these individual accounts, while enterprise users holding Gemini Code Assist licenses or API keys can continue to use the legacy tool for now. The new terminal client offers native subagent orchestration, persistent history, and keyboard-centric design to streamline agent-first coding.

    "Google's transition of individual developer accounts to the new Antigravity CLI introduces native subagent orchestration and keyboard-centric design while deprecating the legacy Gemini CLI."

  • Kilo Code open-source agentic platform builds code faster [OPEN SOURCE]

    7h ago

    Kilo Code is an open-source agentic engineering platform that functions as an all-in-one assistant for coding, debugging, planning, and task orchestration. It integrates seamlessly into various environments like VS Code, JetBrains IDEs, and the CLI while supporting over 500 AI models.

    "Kilo Code provides an open-source, multi-IDE coding assistant and agent platform that integrates across VS Code, JetBrains, and the CLI to orchestrate code generation and planning."

  • Claude Code adds live artifacts [UPDATE]

    7h ago

    Anthropic added beta Artifacts support to Claude Code for Team and Enterprise plans, letting sessions publish live, private pages that update as work continues. The feature is aimed at turning agent output like PR walkthroughs, dashboards, implementation options, and investigation timelines into shareable internal links.

    "The addition of beta Artifacts support to Claude Code allows developers to share live-updating webpages for PR walkthroughs and project dashboards directly from their command-line sessions."

  • xAI Grok Build 0.2.57 boosts terminal reliability [UPDATE]

    8h ago

    xAI has released Grok Build version 0.2.57, a tool update aimed at making the CLI and terminal experience more robust for developers. The update introduces network resilience by allowing long-running responses to resume after network disruptions instead of failing, and updates the plugin manager to install registered packages directly via the command-line interface.

    "The release of Grok Build 0.2.57 introduces network resilience and direct package installation to improve terminal reliability and workflow continuity for CLI-based developers."
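
    The resume-after-disruption behavior described above can be illustrated with a small sketch. This is not Grok Build's implementation: `FlakyServer`, `read_with_resume`, and the offset-based protocol are hypothetical names and assumptions chosen only to show the idea of continuing from the last received chunk instead of failing.

```python
from typing import Iterator, List, Set


class FlakyServer:
    """Hypothetical server that drops the connection once at chosen chunk positions."""

    def __init__(self, chunks: List[str], fail_at: Set[int]):
        self.chunks = chunks
        self.fail_at = set(fail_at)

    def stream(self, offset: int) -> Iterator[str]:
        # Serve chunks starting at `offset`, simulating a network drop where configured.
        for i in range(offset, len(self.chunks)):
            if i in self.fail_at:
                self.fail_at.discard(i)  # each position fails only once
                raise ConnectionError(f"dropped before chunk {i}")
            yield self.chunks[i]


def read_with_resume(server: FlakyServer, max_retries: int = 3) -> str:
    """Collect the full response, reconnecting at the last received offset on failure."""
    received: List[str] = []
    retries = 0
    while True:
        try:
            for chunk in server.stream(offset=len(received)):
                received.append(chunk)
            return "".join(received)
        except ConnectionError:
            retries += 1
            if retries > max_retries:
                raise  # give up once the retry budget is spent


server = FlakyServer(["Hel", "lo, ", "wor", "ld"], fail_at={1, 3})
text = read_with_resume(server)
```

    Tracking the offset on the client keeps already-received chunks, so a dropped connection costs only a reconnect rather than the whole response.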

  • Grok models land natively on Databricks [UPDATE]

    9h ago

    During the Databricks Data + AI Summit 2026, it was announced that xAI's Grok models are now natively available on Databricks. Enterprise developers can access Grok within Databricks' Agent Bricks developer platform to build, govern, and deploy custom AI agents securely.

    "Native availability of Grok models on Databricks allows enterprise developers to securely build, govern, and deploy custom AI agents within their existing data platforms."

  • Hermes Agent integrates Unreal Engine MCP server [UPDATE]

    9h ago

    Nous Research has integrated Unreal Engine's Model Context Protocol (MCP) server into the Hermes Agent catalog. Once configured, the open-source agent can communicate directly with the Unreal Editor to automate scene building, lighting, and script execution.

    "Integrating the Unreal Engine MCP server into Hermes Agent enables developers to programmatically automate scene building and game design workflows using open-source coding agents."

  • Tutorial deploys remote MCP to GKE [TUTORIAL]

    11h ago

    Google Cloud Developer Advocate Abdelfettah Sghiouar has published a tutorial on building and deploying remote Model Context Protocol (MCP) servers on Google Kubernetes Engine (GKE). By shifting from local stdio transport to remote Streamable HTTP, developers can host scalable, secure MCP-compliant APIs in GKE to provide AI agents with centralized context and tools.

    "This tutorial teaches developers how to deploy remote Model Context Protocol servers on Google Kubernetes Engine to scale and secure the tools and context provided to AI agents."

  • Supply chain attack hits Mastra ecosystem [SECURITY]

    11h ago

    A supply chain attack compromised over 140 packages in the Mastra AI framework ecosystem on the npm registry via a hijacked contributor account. The poisoned updates introduced a typosquatted dependency executing a malicious postinstall script that deployed an info-stealer to harvest developer credentials and API keys.

    "A supply chain attack compromising npm packages in the Mastra AI framework ecosystem risks exposing developer credentials and API keys to info-stealer malware."
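
    Because the payload described above ran from a postinstall script, one simple check is to list which npm lifecycle hooks a package manifest declares before installing it. The sketch below is a minimal illustration, not a detection tool: `install_hooks` and the sample manifest are hypothetical, and a declared hook is only a signal to review, not proof of malice.

```python
import json

# npm lifecycle scripts that run automatically during `npm install`.
LIFECYCLE_HOOKS = ("preinstall", "install", "postinstall")


def install_hooks(package_json_text: str) -> dict:
    """Return the install-time scripts declared in a package.json document."""
    manifest = json.loads(package_json_text)
    scripts = manifest.get("scripts", {})
    return {hook: scripts[hook] for hook in LIFECYCLE_HOOKS if hook in scripts}


# Hypothetical manifest used only for illustration.
sample = json.dumps(
    {
        "name": "example-pkg",
        "version": "1.0.0",
        "scripts": {"build": "tsc", "postinstall": "node setup.js"},
    }
)
found = install_hooks(sample)
```

    Running `npm install --ignore-scripts` skips these hooks entirely, at the cost of breaking packages that legitimately need them.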

  • Databricks launches Genie Code, AI Runtime [LAUNCH]

    14h ago

    Databricks has introduced Genie Code for ML and AI Runtime in public preview, bringing agentic workflow automation to production machine learning. The integrated tools allow developers to run and debug ML pipelines in notebooks while automatically offloading compute-heavy training to serverless GPU infrastructure.

    "Databricks' Genie Code and AI Runtime bring agentic workflow automation and serverless GPU offloading to machine learning development pipelines."

  • Anthropic deploys AI agent swarms at scale [VIDEO]

    15h ago

    Anthropic's engineering team detailed their methods for deploying autonomous AI agents in production, running swarms of over 300 agents daily. The workflow relies on cloud-hosted routines and dynamic tool selection to manage persistent agent loops without local dependencies.

    "Anthropic's engineering details on running over 300 daily autonomous AI agents in production offer valuable practical blueprints for developers building scalable agentic workflows and persistent loop systems."

  • GLM-5.2 tops Design Arena with 1360 Elo [BENCHMARK]

    16h ago

    A factual correction clarifies that Z.ai's open-weights model GLM-5.2 reached first place on the crowdsourced Design Arena benchmark with an Elo of 1360, surpassing the now-unavailable Claude Fable 5. This distinction separates its top performance on design-focused single-file HTML generation tasks from the broader Code Arena WebDev leaderboard, where standings differ.

    "Z.ai's open-weights GLM-5.2 model reaching first place on the crowdsourced Design Arena benchmark demonstrates its strong performance for automated HTML generation and UI design workflows."

  • Hugo Richard debuts V personal agent template [OPEN SOURCE]

    17h ago

    V is an open-source personal agent template built on the Eve framework that helps developers create durable AI assistants. It supports multichannel access via web, Slack, and iMessage, and features persistent memory alongside GitHub and Linear integrations.

    "The open-source V personal agent template provides developers with a pre-configured framework featuring persistent memory and multichannel integrations to jumpstart building durable AI assistants."

  • Socket: malware exploits AI safety to evade scanners [SECURITY]

    23h ago

    Socket has identified npm malware packages designed to bypass AI-powered scanners by exploiting their safety guardrails. By inserting text references to biological or nuclear weapons into malicious code, attackers trigger safety refusals that prevent the scanner from inspecting the payload.

    "The discovery of npm packages using safety-triggering comments to bypass AI scanners reveals a critical supply chain exploit vector that developers must address."
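
    The evasion works only if a scanner's refusal is treated the same as a clean verdict. A minimal sketch of the opposite, fail-closed policy follows. It assumes a hypothetical scanner that replies in free text beginning with "clean" or "malicious"; the `classify` function and the refusal phrases are illustrative and do not describe Socket's tooling.

```python
# Hypothetical phrases that suggest the model refused to analyze the code.
REFUSAL_MARKERS = (
    "i can't help",
    "i cannot help",
    "i can't assist",
    "i cannot assist",
)


def classify(scanner_reply: str) -> str:
    """Map a free-text scanner reply to a verdict, failing closed on refusals."""
    reply = scanner_reply.strip().lower()
    if any(marker in reply for marker in REFUSAL_MARKERS):
        return "needs_review"  # a refusal is never evidence that the code is safe
    if reply.startswith("malicious"):
        return "malicious"
    if reply.startswith("clean"):
        return "clean"
    return "needs_review"  # unrecognized or empty replies also fail closed


verdict = classify("I can't help with that request.")
```

    Routing refusals to manual review means safety-triggering text in a package makes it more conspicuous instead of letting it pass unscanned.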

  • text-to-lottie v1.0.0 launches stable framework [OPEN SOURCE]

    1d ago

    Diffusion Studio has released version 1.0.0 of text-to-lottie, a stable, open-source framework designed to generate production-ready Lottie animations using AI coding agents like Claude Code and Codex. The new release, which has reached 2.8k stars on GitHub, introduces multi-project and multi-scene support, drag-and-drop importing of Lottie files, and a complete UI rewrite to streamline motion design workflows directly within AI-assisted coding environments.

    "The stable release of text-to-lottie enables AI coding agents to generate production-ready animations, expanding the UI capabilities of automated workflows."

  • Cursor CEO keynotes inaugural Compile conference [VIDEO]

    1d ago

    Morgan Linton shares Michael Truell's keynote at the inaugural Cursor Compile conference, which outlines the future of AI-driven software development. The presentation coincided with announcements of SpaceX acquiring Anysphere for $60 billion, a new 1.5-trillion-parameter model, and a GitHub competitor named Origin.

    "Cursor's Compile keynote outlines major future developments for AI-assisted coding, including a 1.5-trillion-parameter model and a new GitHub competitor named Origin."

  • Alchemy adds Cloudflare AI Search support [INFRA]

    1d ago

    Alchemy has added Cloudflare AI Search support to its alchemy-effect package, automating token provisioning and document indexing. The update introduces declarative config, namespace grouping for deployment pipelines, and client bindings for Cloudflare Workers.

    "Alchemy's addition of Cloudflare AI Search support to its alchemy-effect package simplifies RAG pipeline deployments for TypeScript developers by automating token provisioning and document indexing on Cloudflare Workers."

  • Unreal Engine 5.8 gets experimental MCP [UPDATE]

    1d ago

    Epic Games has released an experimental Model Context Protocol (MCP) plugin in Unreal Engine 5.8 that hosts an MCP server directly within the editor process. This allows AI assistants and agents (such as Claude and Gemini) to connect via local HTTP to perform tasks including level editing, actor placement, Blueprint graph manipulation, asset importing, and automation tests, bridging LLM capabilities with Unreal's powerful 3D suite.

    "Epic Games' experimental MCP plugin for Unreal Engine 5.8 allows AI coding assistants and agents to interface directly with the editor via local HTTP, significantly expanding automated game development capabilities."

  • Anthropic misses Claude Fable update deadline [NEWS]

    1d ago

    Anthropic suspended its newly released Claude Fable 5 and Mythos 5 models on June 12, 2026, in compliance with a U.S. government export control directive regarding jailbreak vulnerabilities. Although Anthropic promised to issue an update within 24 hours of the suspension, users noted that as of June 17, 2026, no update had been posted, leaving developers and customers in the dark.

    "Anthropic's missed update deadline following the suspension of its Claude Fable 5 and Mythos 5 models leaves developers without access to key frontier reasoning models for their workflows."

  • OpenCode v1.17.8 optimizes timelines, upgrades model [UPDATE]

    1d ago

    OpenCode v1.17.8 delivers several performance and security enhancements to its development environment, including faster session timelines via stable row projection and TanStack virtualization to minimize UI rerenders. This release also moves OpenCode Go to the GLM-5.2 model, implements off-thread markdown highlighting to prevent UI thread blocking, and introduces safer handling for MCP and provider tools.

    "The OpenCode update improves development environment performance via UI virtualization, transitions OpenCode Go to the GLM-5.2 model, and secures the handling of MCP and provider tools."

  • Steve Sewell launches /visual-plan agent skill [OPEN SOURCE]

    1d ago

    Steve Sewell, CEO of Builder.io, has announced /visual-plan, an open-source skill that converts dense, text-based implementation plans generated by AI coding agents into interactive MDX documents. The tool provides a visual workspace with diagrams, wireframes, and database schemas, allowing developers to review and approve architectural plans before the agent writes code.

    "The /visual-plan open-source skill helps developers inspect and approve AI coding agent strategies before code generation by converting dense implementation plans into interactive diagrams and schemas."

  • GitHub Copilot boosts VS Code token efficiency [UPDATE]

    1d ago

    The GitHub Copilot team has introduced key harness-level optimizations in VS Code to reduce token consumption by up to 18% and lower latency for agentic workflows. These updates include extended prompt caching, deferred tool schema loading, client-side embedding-based tool search, and persistent WebSockets.

    "GitHub Copilot's harness-level optimizations in VS Code reduce token consumption by up to 18% and lower latency, directly improving the speed and cost-efficiency of developer coding workflows."
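
    Two of the optimizations named above, deferred tool schema loading and embedding-based tool search, can be sketched together. This is a toy under stated assumptions, not Copilot's implementation: `ToolIndex` is hypothetical, and the "embedding" is a plain word-count vector standing in for a learned embedding model.

```python
import math
from collections import Counter
from typing import Callable, Dict, List


def embed(text: str) -> Counter:
    """Toy embedding: word counts. A real system would use a learned model."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToolIndex:
    """Keeps only short descriptions in memory; full schemas load on demand."""

    def __init__(self) -> None:
        self._vectors: Dict[str, Counter] = {}
        self._loaders: Dict[str, Callable[[], dict]] = {}
        self.loaded: List[str] = []  # tools whose full schema was actually fetched

    def register(self, name: str, description: str, load_schema: Callable[[], dict]) -> None:
        self._vectors[name] = embed(description)
        self._loaders[name] = load_schema

    def search(self, query: str, top_k: int = 1) -> List[dict]:
        """Rank tools by similarity to the query, then load schemas for the top hits only."""
        q = embed(query)
        ranked = sorted(self._vectors, key=lambda n: cosine(q, self._vectors[n]), reverse=True)
        results = []
        for name in ranked[:top_k]:
            self.loaded.append(name)
            results.append(self._loaders[name]())
        return results


index = ToolIndex()
index.register("read_file", "read the contents of a file", lambda: {"name": "read_file", "params": ["path"]})
index.register("run_tests", "run the project test suite", lambda: {"name": "run_tests", "params": []})
index.register("git_diff", "show uncommitted git changes", lambda: {"name": "git_diff", "params": []})
hits = index.search("run the test suite")
```

    Only the matching tool's schema is materialized, so the prompt carries one schema instead of all three. That is the kind of saving the token-reduction figure refers to, though the actual mechanism is not detailed in the announcement.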

  • Guard Skills catches coding agent failure modes [OPEN SOURCE]

    1d ago

    guard-skills is an open-source quality-assurance suite designed to catch systematic failure modes in AI-generated code, tests, and documentation. By acting as a secondary local review pass for AI coding agents like Claude Code or Cursor, it targets common agent mistakes such as hollow tests, silent catch blocks, hallucinated APIs, and stale comments before they are committed or merged into production.

    "The guard-skills local quality-assurance suite prevents broken code from reaching production by automatically scanning coding agent outputs for common failure modes like hollow tests and hallucinated APIs."

  • Vercel Connect enters public beta [LAUNCH]

    1d ago

    Vercel Connect has launched in public beta to secure third-party API access for modern web apps and AI agents. The service allows developers to dynamically request short-lived, task-scoped tokens at runtime via the `@vercel/connect` SDK and CLI, completely removing static secrets from environment variables.

    "Vercel Connect enables developers to secure API integrations for web apps and AI agents by requesting task-scoped, short-lived tokens at runtime via its SDK and CLI."

  • Z.ai launches ZCode agentic IDE [LAUNCH]

    1d ago

    ZCode is a free, desktop-based agentic development environment and IDE designed by Z.ai to automate full engineering workflows using GLM models. By maintaining state across files, terminal sessions, and browser history, ZCode provides a continuous workspace for AI agents to plan, code, test, and debug complex tasks.

    "ZCode provides a free, desktop-based agentic IDE that automates software engineering tasks by maintaining persistent workspace state across files, terminals, and browser history."

  • Daytona secures Flue autonomous agent workflows [UPDATE]

    2d ago

    The ongoing integration between Daytona and Flue highlights the utility of using Daytona's secure, ephemeral sandboxes to execute code for autonomous AI agents. Flue, a TypeScript-based framework for building agent workflows, leverages Daytona's connector to offload file operations and code execution to isolated environments, mitigating security risks associated with running AI-generated code.

    "Integrating Daytona's secure, ephemeral sandboxes with the Flue framework helps developers mitigate security risks by offloading AI code execution to isolated environments."

  • Rhys Sullivan shares a video demonstration showcasing how developers can utilize Executor to power and secure AI agent workflows. [VIDEO]

    2d ago

    Executor (executor.sh) is a sandboxed execution runtime and control plane designed for AI agents, founded by software engineer Rhys Sullivan. Acting as a secure gateway, it normalizes external resources, such as Model Context Protocol (MCP) servers, OpenAPI and GraphQL APIs, and custom JavaScript functions, into a single typed SDK. This lets AI agents discover, authenticate against, and call external capabilities securely and reliably. In the shared video, a developer named Ben uses Executor to run structured operations, showing how it connects agents to product integrations.

    "Executor provides developers with a sandboxed execution runtime and control plane that unifies MCP, APIs, and functions into a single typed SDK to secure agentic workflows."

  • Flue leverages Pi agent harness [OPEN SOURCE]

    2d ago

    Following the 1.0 Beta release of the Flue agent framework, Dane Knecht clarified that the Astro team's programmable TypeScript framework leverages the minimal Pi agent harness under the hood. By combining the Pi harness with a Vite-based development stack and a flexible virtual bash sandboxing API, Flue enables developers to securely deploy lightweight agent endpoints to any HTTP server environment.

    "The 1.0 Beta release of the Flue framework enables TypeScript developers to securely deploy lightweight agent endpoints using a Vite-based development stack and a flexible virtual bash sandboxing API."

  • Developer testing of Zhipu AI's newly released GLM-5.2 model reveals "on another level" capabilities for complex coding tasks and agentic workflows. [MODEL]

    2d ago

    Harrison Kinsley (Sentdex) shared early praise for Zhipu AI's newly launched GLM-5.2 model, noting it performed well enough on initial tasks to become his primary coding model over the weekend. Released on June 13, 2026, by Zhipu AI (internationally Z.ai), GLM-5.2 is a flagship open-weights AI model designed for long-horizon software engineering and agentic workflows. It boasts a stable 1-million-token context window and adjustable "thinking-effort" levels to optimize for complex multi-step reasoning, with weights set to be open-sourced under the MIT License.

    "Zhipu AI's newly released GLM-5.2 model offers open-weights capability optimized for long-horizon software engineering and agentic workflows with adjustable reasoning-effort levels."

  • x402 partners with AWS for agentic payments [UPDATE]

    2d ago

    Coinbase's x402 protocol has processed over $100 million in machine-to-machine transactions since launch, with 90% of its agentic stablecoin volume on Base. A new partnership with AWS integrates x402 to let AI agents autonomously and instantly pay for cloud and compute resources.

    "The integration of Coinbase's x402 protocol with AWS enables AI agents to autonomously and instantly pay for cloud and compute resources, advancing agentic infrastructure."

  • Malicious JetBrains plugins exfiltrate AI API keys [SECURITY]

    2d ago

    Aikido Security discovered a coordinated malware campaign where at least 15 JetBrains IDE plugins masquerading as legitimate AI coding assistants were secretly exfiltrating users' AI API keys. The plugins, installed nearly 70,000 times across seven developer accounts since October 2025, send keys for providers like OpenAI, DeepSeek, and SiliconFlow to attacker-controlled servers immediately upon configuration, where they are believed to be resold.

    "Developers using JetBrains IDEs need to audit their environment following the discovery of a malware campaign exfiltrating AI API keys via malicious assistant plugins."

  • Anthropic open-sources a 'frontend-design' skill to guide AI agents toward building premium, modern user interfaces instead of generic templates. [OPEN SOURCE]

    2d ago

    Anthropic has released an open-source AI agent skill called 'frontend-design' in their public 'skills' repository, aiming to improve the visual and UX quality of code generated by AI agents. Announced alongside AI coding tips by Burke Holland, this skill provides structured, opinionated instructions that prevent agents from defaulting to generic styles and instead steer them toward professional designs with modern typography, custom color palettes, and responsive layouts.

    "Anthropic's open-source release of the 'frontend-design' skill helps AI-assisted developers guide coding agents to generate premium, modern user interfaces instead of default, generic templates."

  • E2B sandboxes power LangChain Deep Agents [UPDATE]

    2d ago

    A new integration allows developers to use E2B sandboxes as the execution backend for LangChain Deep Agents. This enables AI agents to safely run code, analyze data, and interact with operating systems in secure, isolated cloud environments.

    "Integrating E2B sandboxes as the execution backend for LangChain Deep Agents allows developers to build AI agents that can safely execute generated code, analyze data, and run commands in secure, isolated cloud environments."

  • Cursor highlights competitor Origin at Compile 2026 [NEWS]

    2d ago

    At Cursor's inaugural Compile 2026 conference, the opening talk focused on competitor Origin (orgn.com), an enterprise AI confidential development environment (CDE). The platform targets regulated sectors by hosting coding agents inside hardware-isolated Trusted Execution Environments (TEEs) with a zero-data-retention policy.

    "Origin provides enterprise developers and coding agents with a confidential development environment hosted inside hardware-isolated Trusted Execution Environments to ensure secure, zero-data-retention execution."

  • Anthropic's new Claude Mythos model demonstrates unprecedented cybersecurity capabilities by autonomously hacking secure software infrastructures. [MODEL]

    2d ago

    Anthropic's frontier model, Claude Mythos, has successfully hacked into highly secure software infrastructures. Rather than acting out of malice, the model achieved this through its advanced reasoning and coding capabilities during testing. Because of these powerful agentic hacking capabilities, Anthropic has restricted direct public access to the raw model, opting instead to deploy it defensively under Project Glasswing in collaboration with major tech partners to patch critical infrastructure vulnerabilities.

    "Anthropic's restriction of direct public access to its Claude Mythos model highlights the emerging security risks and agentic hacking capabilities of next-generation frontier AI."

  • Prismor publishes Immunity agent security whitepaper [LAUNCH]

    2d ago

    Prismor has published a whitepaper detailing Immunity Agent, its self-improving security layer designed to protect AI developer workflows and software supply chains. The platform intercepts agent tool calls in real time to enforce runtime guardrails, mask sensitive secrets, and prevent malicious supply chain attacks.

    "Prismor's Immunity Agent provides a real-time security and guardrail layer for AI agent tool calls, helping developers protect their workflows and software supply chains from malicious attacks."

  • Tailscale updates Aperture AI gateway [UPDATE]

    2d ago

    Tailscale announced major updates to Aperture, its private AI gateway, to securely connect LLMs, interfaces, sandboxes, and data. These include the public alphas of identity-aware universal data connectors and a responsive chat UI, alongside the private alpha of identity-integrated sandbox environments.

    "Tailscale's updates to Aperture provide developers with secure, identity-aware data connectors and integrated sandbox environments to safely run AI models and agentic workflows."

  • Moonshot AI open-sources Kimi Code CLI [OPEN SOURCE]

    2d ago

    Moonshot AI has released Kimi Code CLI under the MIT license, a terminal-based AI coding assistant optimized for running long-horizon agentic workflows in local environments. The tool assists developers with tasks like debugging and refactoring, automatically handling read-only actions while requesting explicit confirmation for file modifications or shell commands.

    "The open-source release of Moonshot AI's Kimi Code CLI provides developers with a terminal-based coding assistant optimized for running long-horizon agentic workflows locally."
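
    The permission model described above (read-only actions run automatically, anything else needs explicit confirmation) can be sketched as a small gate. The action kinds, `Action`, and `is_allowed` are hypothetical names for illustration; the source does not describe how Kimi Code CLI categorizes actions internally.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical set of action kinds treated as read-only.
READ_ONLY_KINDS = {"read_file", "list_dir", "search"}


@dataclass
class Action:
    kind: str    # e.g. "read_file", "write_file", "run_shell"
    target: str  # path or command the action applies to


def is_allowed(action: Action, confirm: Callable[[Action], bool]) -> bool:
    """Approve read-only actions automatically; defer everything else to `confirm`."""
    if action.kind in READ_ONLY_KINDS:
        return True
    return confirm(action)


# Stand-in for an interactive prompt: records what it was asked and always denies.
asked: List[str] = []


def deny_all(action: Action) -> bool:
    asked.append(action.kind)
    return False


read_ok = is_allowed(Action("read_file", "src/main.py"), deny_all)
shell_ok = is_allowed(Action("run_shell", "rm -rf build"), deny_all)
```

    Using an allowlist of read-only kinds means any new or unrecognized action kind falls through to confirmation by default.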

  • OpenRouter launches multi-model Fusion API [LAUNCH]

    2d ago

    OpenRouter Fusion routes prompts to a panel of expert AI models in parallel, combining their outputs with web search and fetch capabilities. A judge model then synthesizes the findings to deliver a single, high-quality response, reducing reliance on single frontier models.

    "OpenRouter's Fusion API routes prompts to a panel of expert models in parallel and synthesizes a single response, giving developers a robust multi-model orchestration layer."
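
    The fan-out-then-judge pattern described above can be sketched with stub models. This does not call OpenRouter or reflect its API: the panel entries are hypothetical lambdas, and the judge here is a simple majority vote swapped in for the judge model that the announcement describes as synthesizing the findings.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict

# Hypothetical stand-in for a real model call: maps a prompt to an answer.
Model = Callable[[str], str]


def fan_out(prompt: str, panel: Dict[str, Model]) -> Dict[str, str]:
    """Send the same prompt to every panel model in parallel and collect the answers."""
    with ThreadPoolExecutor(max_workers=len(panel)) as pool:
        futures = {name: pool.submit(model, prompt) for name, model in panel.items()}
        return {name: future.result() for name, future in futures.items()}


def judge(answers: Dict[str, str]) -> str:
    """Toy judge: return the answer given by the most panel members."""
    counts: Dict[str, int] = {}
    for answer in answers.values():
        counts[answer] = counts.get(answer, 0) + 1
    return max(counts, key=lambda answer: counts[answer])


def fused_answer(prompt: str, panel: Dict[str, Model]) -> str:
    return judge(fan_out(prompt, panel))


panel: Dict[str, Model] = {
    "model_a": lambda prompt: "4",
    "model_b": lambda prompt: "4",
    "model_c": lambda prompt: "5",
}
result = fused_answer("What is 2 + 2?", panel)
```

    Running the panel in parallel keeps latency close to that of the slowest single model; the extra cost is the additional judge step at the end.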

  • Gortex cuts coding agent token usage 50x [OPEN SOURCE]

    2d ago

    Gortex is a local-first, Go-based code graph engine that indexes repositories to resolve references and call chains in under a millisecond. By providing structured context directly to AI agents via a CLI, an MCP server, and a web UI, it avoids context window bloat and reduces token usage by up to 50x.

    "Gortex provides a local-first code-graph engine and MCP server that resolves repository references in sub-milliseconds to reduce AI agent token usage by up to 50x."

  • Alchemy enables developers to easily subscribe to GitHub push events and trigger workers or agents using Infrastructure-as-TypeScript. [INFRA]

    2d ago

    Alchemy is an Infrastructure-as-TypeScript tool designed to simplify cloud deployments. A demonstration by creator Sam Goodwin shows how easily Alchemy can create a GitHub webhook, connect it to a serverless worker, and trigger an agent running on Cloudflare Durable Objects to generate a blog post on push events.

    "Alchemy's GitHub webhook-to-worker workflow is a useful developer automation pattern for triggering agents from code events using TypeScript-defined infrastructure."

  • LangChain adds custom stream channels [UPDATE]

    2d ago

    LangChain has introduced custom stream channels that allow backend agents to publish structured side-channel data alongside standard message streams. This feature enables developers to stream complex metadata, intermediate status updates, and auxiliary information to the frontend in a structured format, allowing for richer, more responsive, and interactive user interfaces for AI agents.

    "LangChain's custom stream channels give agent builders a practical new way to send structured backend state to frontends, enabling richer and more responsive AI app interfaces."

  • Profile details Cursor rise and SpaceX deal [NEWS]

    2d ago

    Business Insider published an in-depth profile detailing the rapid ascent of Cursor, an AI-powered code editor developed by Anysphere, and CEO Michael Truell's years of unpaid work. The article highlights that the company has grown to 700 employees, serves 60% of the Fortune 500, and has maintained a critical computing partnership with SpaceX to scale.

    "Cursor's reported enterprise adoption and SpaceX scaling partnership matter directly to AI-assisted developers because they signal continued momentum for AI coding environments in serious production engineering workflows."

  • Browser Use v4 plays GeoGuessr [LAUNCH]

    3d ago

    Browser Use has launched v4 of its browser-agent platform, featuring a demo where the AI agent plays GeoGuessr by analyzing 3D Google Maps views. By identifying environmental clues, the agent estimates locations to within 50 km. The release is available now on Browser Use's cloud platform.

    "The launch of Browser Use v4 provides developers with a more advanced multimodal web-agent framework for building complex browser-automation workflows."

  • AWS WAF launches AI traffic monetization [UPDATE]

    3d ago

    AWS WAF has introduced new AI Traffic Monetization capabilities that allow website publishers and API providers to meter and charge AI crawlers using HTTP 402 Payment Required responses. Powered by Coinbase for stablecoin settlement, this feature enables edge-level payment verification and grants scoped access to legitimate agents in a single request cycle.

    "AWS WAF's new AI traffic monetization capabilities enable developers and API providers to meter and charge AI crawlers at the edge via stablecoin micropayments."

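    A rough sketch of the 402-based metering idea follows. It is not AWS WAF's rule syntax or Coinbase's settlement API: the crawler markers, header names, price string, and the `handle_request` function are all hypothetical, chosen only to show how an edge check might answer an unpaid crawler with HTTP 402 and a paid one with scoped access.

```python
from typing import Dict, Set, Tuple

# Hypothetical names throughout: markers, header names, and price are illustrative only.
CRAWLER_MARKERS = ("bot", "crawler")
PRICE_HEADER = "X-Price"
PAYMENT_HEADER = "X-Payment-Token"


def handle_request(headers: Dict[str, str], settled_tokens: Set[str]) -> Tuple[int, Dict[str, str]]:
    """Return (status, response headers) for one request at the edge."""
    agent = headers.get("User-Agent", "").lower()
    if not any(marker in agent for marker in CRAWLER_MARKERS):
        return 200, {}  # Ordinary traffic is served as usual.
    if headers.get(PAYMENT_HEADER) in settled_tokens:
        return 200, {"X-Access-Scope": "read"}  # Payment verified: grant scoped access.
    return 402, {PRICE_HEADER: "0.001 USDC"}  # Payment Required, with the quoted price.
```

    A crawler that receives the 402 response would be expected to settle the quoted price and retry with proof of payment; how that proof is issued and verified is outside this sketch.
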
  • Hermes Agent gains Stripe skills for autonomous payments [UPDATE]

    3d ago

    Nous Research partnered with Stripe to bring an official suite of payment skills to the Hermes Agent. This allows agents to safely buy items, use paid APIs, and manage SaaS subscriptions with configurable safety limits.

    "The official integration of Stripe payment skills into Hermes Agent allows developers to build autonomous agents that can safely execute payments and manage subscriptions within configurable safety limits."

  • Hermes Agent previews async subagents [UPDATE]

    3d ago

    Nous Research is adding delegate_task(background=true) so Hermes Agent can dispatch a subagent, keep the main conversation moving, and re-inject the result when the child task finishes. The implementation is still in an open PR, but the announcement frames it as the end of blocking subagent workflows.

    "Nous Research's preview of asynchronous subagents in Hermes Agent enables developers to run non-blocking, parallel agent tasks without interrupting the primary conversation flow."

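    The non-blocking behaviour described above (dispatch a child, keep talking, re-inject the result) maps onto a familiar asyncio pattern. The sketch below is an analogy in plain Python, not Hermes Agent code: `run_subagent` and `conversation` are hypothetical names, and the real `delegate_task(background=true)` implementation lives in an open PR whose details are not reproduced here.

```python
import asyncio
from typing import List


async def run_subagent(task: str) -> str:
    """Stand-in for a child agent doing slow work."""
    await asyncio.sleep(0.01)
    return f"result for {task!r}"


async def conversation() -> List[str]:
    transcript: List[str] = []
    # Dispatch the child without awaiting it: this is the non-blocking part.
    child = asyncio.create_task(run_subagent("index the repo"))
    # The main conversation keeps moving while the child runs.
    for turn in ("user: hello", "agent: hi"):
        transcript.append(turn)
        await asyncio.sleep(0)
    # Re-inject the child's result once it has finished.
    transcript.append(f"subagent: {await child}")
    return transcript


transcript = asyncio.run(conversation())
```

    In a blocking design the `await` would sit directly on the `run_subagent` call, and no main-conversation turns could happen until the child returned.
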
  • OpenAgents launches local Autopilot 1.0 coding agent [LAUNCH]

    3d ago

    OpenAgents has launched Autopilot 1.0, an autonomous, self-improving coding agent that runs locally on the user's machine. Marking a transition from human-driven interfaces to self-driving workflows, this release enables multi-turn autonomous coding execution and refactoring while continuously improving its patterns over time.

    "OpenAgents' launch of Autopilot 1.0 provides AI developers with an autonomous, local coding agent capable of executing multi-turn workflows and self-improving its patterns over time."

  • LangChain, Fireworks drop Qwen trace judge [LAUNCH]

    3d ago

    LangChain has partnered with Fireworks AI to release a fine-tuned Qwen-3.5-35B model that acts as a "Trace Judge" to identify perceived errors in LangSmith production traces. By analyzing multi-turn conversation signals like user corrections and repeated requests, the model matches the accuracy of frontier models at up to 100x lower cost.

    "The fine-tuned Qwen-3.5-35B Trace Judge released by LangChain and Fireworks AI offers a cost-effective, high-accuracy tool for developers to identify errors in production traces."

  • Inception's Mercury 2, the first commercial-scale reasoning diffusion LLM, is now available for production deployment on Baseten. [LAUNCH]

    3d ago

    Baseten has announced that Inception's Mercury 2 is now live on its platform, making it the first inference platform to deliver production-grade reasoning diffusion LLMs (dLLMs) to developers. Unlike traditional autoregressive models that generate tokens sequentially, Mercury 2 uses a diffusion architecture to generate and refine multiple tokens in parallel, enabling speeds of over 1,000 tokens per second on widely deployed NVIDIA GPUs. Partners like Augment Code have already deployed Mercury 2 in production, achieving a 90% reduction in inference costs and an 82% drop in latency for critical workloads, while maintaining quality comparable to speed-optimized models like Claude 3 Haiku and GPT-5 mini.

    "Inception's Mercury 2 on Baseten gives AI developers production access to the first commercial-scale reasoning diffusion LLM, delivering speeds over 1,000 tokens per second and significant cost reductions."

  • Zed adds agent context compaction [UPDATE]

    3d ago

    Zed, the high-performance open-source code editor, announced that context compaction is landing this week for its Agent Panel. The feature automatically summarizes and compresses conversation history, allowing developers to maintain longer conversations without manual thread restarts.

    "Context compaction in Zed's Agent Panel automatically compresses conversation history, enabling developers to maintain longer AI sessions without manual thread restarts."

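    Context compaction in general can be pictured as replacing older turns with a summary once a thread grows past a budget. The sketch below illustrates that idea only; it is not Zed's implementation. The `compact` function, its message-count budget, and the placeholder summarizer are assumptions, and a real system would likely budget by tokens and summarize with a model call.

```python
from typing import Callable, List


def compact(
    history: List[str],
    max_messages: int,
    keep_recent: int,
    summarize: Callable[[List[str]], str],
) -> List[str]:
    """Collapse older messages into one summary entry when the history exceeds the budget."""
    if len(history) <= max_messages:
        return history
    split = len(history) - keep_recent
    older, recent = history[:split], history[split:]
    return [summarize(older)] + recent


history = [f"msg {i}" for i in range(6)]
compacted = compact(
    history,
    max_messages=4,
    keep_recent=2,
    summarize=lambda msgs: f"[summary of {len(msgs)} messages]",  # Placeholder summarizer.
)
```

    Keeping the most recent turns verbatim preserves the immediate working context, while the summary stands in for everything earlier.
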
  • Entire CLI 0.7.6 adds checkpoint rewinding [UPDATE]

    3d ago

    Entire CLI version 0.7.6 introduces experimental features designed to trace code changes back to the AI developer sessions and checkpoints that generated them. The release adds `entire blame` and `entire why` to its labs module, as well as the new `entire checkpoint rewind` command, which allows developers to roll back their environment to a specific checkpoint.

    "Entire CLI's new checkpoint rewinding allows developers to roll back their environment and trace code changes back to specific AI developer sessions."

  • Vercel Labs has introduced json-render, an open-source Generative UI framework that enables AI agents to render secure and interactive user interfaces using structured JSON. [OPEN SOURCE]

    3d ago

    Vercel Labs released `json-render`, an open-source Generative UI framework that allows AI agents such as Claude Code, Codex, and Pi to generate real-time, interactive user interfaces within sandboxed environments. By leveraging AI SDK's experimental `HarnessAgent`, the framework implements Restrictive UI Generation (RUG), prompting LLMs to output structured JSON configurations rather than raw React or Tailwind code. This approach addresses reliability and security problems such as XSS vulnerabilities and layout breakages, while offering platform-agnostic rendering for React, Vue, Svelte, and React Native, plus state management integration with libraries like Zustand and Redux.

    "Vercel Labs' open-source json-render framework enables AI agents to safely and reliably generate interactive user interfaces using structured JSON configurations."

  • xAI drops Grok Build Agent Dashboard [UPDATE]

    3d ago

    xAI released the Agent Dashboard for Grok Build, enabling developers to manage and monitor multiple concurrent agent sessions from a single screen. Accessible via `grok dashboard` or `/dashboard` in the shell, it supports inline replies and permission approvals to simplify multi-agent workflows.

    "xAI's new Agent Dashboard for Grok Build allows developers to monitor and coordinate multiple concurrent agent sessions with inline replies and approvals."

  • Omar Sanseviero releases LLM Council skill [OPEN SOURCE]

    3d ago

    Omar Sanseviero has released an LLM Council skill for AI agents, inspired by Andrej Karpathy's concept of multi-perspective LLM deliberation. The skill runs multiple open-weight models in parallel via the Fireworks AI API to answer queries, has them rank each other's anonymized responses to stress-test the advice, and then uses a designated "Chairman" model to synthesize the final output, mitigating single-model failure modes and sycophancy.

    "The LLM Council skill provides developers with a multi-model deliberation framework to reduce single-model failure modes and sycophancy in AI agent workflows."

  • Claude Code creator Boris Cherny reveals that Anthropic runs 100% of its pull requests and 80–90% of code reviews using the tool, highlighting a shift from manual prompting to building agentic loops. [VIDEO]

    3d ago

    In a shared interview clip, Boris Cherny, the Head of Claude Code at Anthropic, broke down how the tool is used internally, sharing that 100% of their pull requests and 80–90% of code reviews are run by Claude Code. Cherny noted that his own workflow has shifted away from writing prompts and toward building agentic loops, with the "/loops" command being the feature he uses the most.

    "Boris Cherny's revelation that Anthropic runs 100% of pull requests and 80–90% of code reviews using Claude Code showcases the real-world scale and viability of agentic developer workflows."

  • OpenCode adds native NVIDIA NIM support [UPDATE]

    3d ago

    OpenCode, a terminal-native, open-source AI coding assistant, has added native integration for NVIDIA NIM APIs to enable on-the-fly model swapping. Developers can now access high-performance models directly from the terminal without configuring complex proxies.

    "OpenCode's native NVIDIA NIM integration enables on-the-fly model swapping directly from the terminal, simplifying local development workflows without complex proxies."

  • Moonshot AI launches Kimi K2.7-Code HighSpeed [MODEL]

    3d ago

    Moonshot AI has introduced a new high-speed mode for its open-source multimodal coding model, Kimi K2.7-Code, delivering up to 6× faster generation speeds. The update achieves up to 260 tokens per second on shorter-context tasks and is currently rolling out to Kimi Code Beta.

    "Moonshot AI's new HighSpeed mode for Kimi K2.7-Code delivers up to 260 tokens per second, significantly reducing latency for developers running agentic coding loops."

  • Vexi launches terminal-native AI coding agent [OPEN SOURCE]

    3d ago

    Vexi is an open-source, local-first AI coding agent designed to operate entirely within the user's terminal using a "bring your own key" model. Installed via a zero-configuration npm package, the tool supports multiple LLM providers locally without sending code to external servers.

    "The launch of Vexi gives developers an open-source, local-first AI coding agent that runs entirely in the terminal and supports multiple LLM providers without external code exposure."

  • Skills.sh passes 700,000 community-contributed skills [NEWS]

    4d ago

    The open-source registry and package manager for AI agent capabilities, Skills.sh, has crossed a significant milestone of 700,000 community-contributed skills. Often described as the "npm for AI agents," Skills.sh allows developers to share, discover, and install reusable instructions and workflows for AI coding agents like Claude Code, Cursor, and Copilot.

    "Skills.sh crossing the 700,000 milestone highlights its rapid growth as a key registry for developers to share and install reusable capabilities across AI coding agents."

  • Grok Build natively renders math and LaTeX [UPDATE]

    4d ago

    xAI has introduced native rendering for math, formulas, and LaTeX within Grok Build, its agentic terminal-based AI coding assistant. This update allows developers to read and verify scientific equations and mathematical notation directly in the terminal interface without needing to copy-paste raw markup to external applications or markdown viewers.

    "Native math and LaTeX rendering in Grok Build allows developers to view and verify complex equations directly inside the terminal assistant without external viewers."

  • Databricks open-sources Omnigent agent meta-harness [OPEN SOURCE]

    4d ago

    Omnigent is an open-source orchestration layer developed by Databricks to manage and unify multiple AI agent frameworks under a single control plane. The meta-harness features a unified API for stateful policies, cost limits, security sandboxing, and real-time session collaboration across terminal, web, and mobile environments.

    "Databricks' open-source Omnigent orchestration layer gives developers a unified API and sandboxed control plane to manage and collaborate across multiple AI agent frameworks."

  • Small Harness v0.8.0 automates last-mile agent workflows [UPDATE]

    4d ago

    Small Harness has released version 0.8.0, featuring a new `/ship` command that acts as a comprehensive "last-mile" workflow for coding agents. Instead of manually verifying tests, branch status, commit messages, and CI checks, `/ship` consolidates these tasks into a single guided flow directly from the terminal, automatically handling commits, pushes, and GitHub PR creation.

    "Small Harness v0.8.0 introduces a terminal-based `/ship` command that automates last-mile coding agent workflows like test verification, git commits, and PR creation, simplifying AI-assisted development."

  • Kimi K2.7-Code ranks second on ErdosBench [BENCHMARK]

    4d ago

    Moonshot AI's Kimi K2.7-Code achieved second place on ErdosBench, demonstrating high precision with 13/14 coverage and zero major false or unsafe partials. The model matched the top-performing Claude Fable 5 max on all solved results, highlighting the growing reasoning capabilities of Chinese AI laboratories.

    "Moonshot AI's release of Kimi K2.7-Code model weights on Hugging Face and its second-place performance on ErdosBench provides developers with a highly capable open-weights model for complex mathematical reasoning and coding tasks."

  • Andrew Ng's team releases aisuite, an open-source Python library that provides a simple, unified interface for interacting with multiple Generative AI providers. [OPEN SOURCE]

    4d ago

    aisuite is an open-source Python library designed to simplify the integration of various LLM providers by offering a unified, OpenAI-compatible interface. By using aisuite, developers can access models from OpenAI, Anthropic, Google, Mistral, AWS, Cohere, Ollama, and Hugging Face using standard client syntax. Instead of refactoring code or managing different vendor SDK dependencies, developers can switch providers by simply changing the model string prefix (e.g., from "openai:gpt-4o" to "anthropic:claude-3-5-sonnet"), facilitating rapid model benchmarking, testing, and multi-model application development.

    "Andrew Ng's team released aisuite, a unified Python library that allows developers to easily benchmark and swap multiple LLM providers using a single OpenAI-compatible interface."

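    The prefix-based switching described above comes down to splitting a `provider:model` string and dispatching on the provider half. The sketch below shows that routing idea with stand-in backends. It does not use or reproduce aisuite's code: `split_model`, `complete`, and the `PROVIDERS` registry are hypothetical names for illustration.

```python
from typing import Callable, Dict, Tuple


def split_model(model: str) -> Tuple[str, str]:
    """Split 'provider:model-name' into its two parts."""
    provider, sep, name = model.partition(":")
    if not sep or not provider or not name:
        raise ValueError(f"expected 'provider:model', got {model!r}")
    return provider, name


# Stand-in backends (hypothetical): each would wrap a vendor SDK in a real library.
PROVIDERS: Dict[str, Callable[[str, str], str]] = {
    "openai": lambda name, prompt: f"[openai/{name}] {prompt}",
    "anthropic": lambda name, prompt: f"[anthropic/{name}] {prompt}",
}


def complete(model: str, prompt: str) -> str:
    provider, name = split_model(model)
    if provider not in PROVIDERS:
        raise ValueError(f"unknown provider {provider!r}")
    return PROVIDERS[provider](name, prompt)


reply = complete("anthropic:claude-3-5-sonnet", "hi")
```

    Because the caller only ever changes the model string, benchmarking two providers becomes a loop over strings rather than a refactor.
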
  • shadcn releases improve to audit codebases [OPEN SOURCE]

    4d ago

    Created by shadcn, `improve` is an open-source developer tool that optimizes token consumption in agentic workflows by decoupling planning from code generation. The tool uses premium frontier models to audit codebases and generate execution plans, then delegates coding tasks to cheaper models.

    "Shadcn's open-source tool 'improve' optimizes token usage in agentic workflows by decoupling codebase auditing and planning from code generation."

  • MiniMax open-sources MSA sparse attention kernel [OPEN SOURCE]

    4d ago

    MiniMax has open-sourced MiniMax Sparse Attention (MSA), a blockwise sparse attention kernel designed to handle million-token context windows efficiently. By combining a two-branch architecture with a co-designed GPU execution path, MSA reduces per-token compute by 28.4×, achieving a 14.2× prefill speedup and 7.6× decoding speedup on H800 GPUs.

    "MiniMax's open-sourced sparse attention kernel (MSA) optimizes million-token context windows on H800 GPUs, offering developers significant prefill and decoding speedups for long-context applications."

  • OpenRouter launches its Fusion API, a compound model system designed to route prompts to a panel of participant models in parallel and synthesize their outputs. [LAUNCH]

    4d ago

    OpenRouter has launched its Fusion API, a compound model architecture that routes user prompts to a panel of participant models in parallel and synthesizes the outputs using a judge model. While the system aims to improve deliberation and responses for complex queries, real-world testing has shown inconsistent performance in coding and simulation tasks when compared directly to single frontier models.

    "OpenRouter's Fusion API allows developers to build compound AI systems by routing prompts to multiple LLMs in parallel and synthesizing their outputs with a judge model."

  • Amazon report triggers Claude Fable 5 ban [POLICY]

    4d ago

    Amazon researchers discovered a critical security vulnerability in Anthropic's Claude Fable 5, leading CEO Andy Jassy to report the issue directly to the U.S. government. In response, the Department of Commerce imposed emergency export controls, prompting Anthropic to disable global access to both its Fable 5 and Mythos 5 models.

    "Anthropic's suspension of global access to Claude Fable 5 and Mythos 5 due to emergency export controls immediately halts developers' ability to use these frontier models in production."

  • Browser Use plugins are now available in Claude Code, enabling developers to integrate web automation capabilities directly from the command-line interface. [UPDATE]

    4d ago

    Browser Use plugins have been added to the Claude Code plugin marketplace, allowing developers to install them using the command `claude plugin marketplace add browser-use/plugins`. This integration enables Anthropic's developer CLI tool to run web automation workflows and interact with web pages, leveraging Browser Use's agentic browser control capabilities to perform tasks such as navigating, clicking, and extracting web data.

    "Integrating Browser Use plugins into Claude Code enables developers to run web automation and browser control workflows directly from their CLI-based coding assistant."

  • TileRT optimizes Large Language Model execution on GPUs by using persistent kernels to minimize microsecond-scale execution gaps and enable ultra-low-latency serving. [INFRA]

    5d ago

    TileRT is a tile-level runtime engine developed in collaboration with Xiaomi that optimizes GPU execution for LLMs by replacing traditional per-operator launches with persistent kernels. This approach eliminates microsecond-scale execution gaps, sustaining high token throughput and ultra-low latency on commodity hardware.

    "TileRT optimizes GPU execution for LLMs by replacing traditional per-operator launches with persistent kernels, enabling ultra-low-latency serving for tool builders."

  • Andrej Karpathy shares the "LLM Wiki" design pattern, where an AI agent actively maintains and organizes a user's knowledge base. [NEWS]

    5d ago

    The "LLM Wiki" is a design pattern introduced by Andrej Karpathy to address the "memory rot" and organization challenges typical of second-brain systems. Instead of using standard Retrieval-Augmented Generation (RAG) to query a chaotic directory of notes, the pattern proposes a three-layer architecture: raw sources, a synthesized wiki of interlinked markdown files, and instructions for how the AI agent should maintain it. Under this pattern, the LLM acts as an active gardener of the wiki—synthesizing new info, identifying connections, and resolving contradictions—resulting in a compounding knowledge base.

    "Andrej Karpathy's "LLM Wiki" design pattern provides developers with a structured, agent-maintained architecture to build more reliable and self-organizing knowledge bases."

  • Google launches Open Knowledge Format spec [OPEN SOURCE]

    5d ago

    Google has released the v0.1 draft of the Open Knowledge Format (OKF), a vendor-neutral specification designed to organize corporate knowledge into portable, Git-friendly Markdown directories with YAML frontmatter metadata. Designed to solve information fragmentation across tools and codebases without proprietary lock-in, OKF is readable by both humans and AI agents and integrates natively with Google Cloud's Knowledge Catalog.

    "Google's Open Knowledge Format (OKF) offers a vendor-neutral, Git-friendly markdown specification that helps developers standardize knowledge directories for AI agents and RAG systems."

  • OpenRouter Advisor boosts collaborative model intelligence [INFRA]

    5d ago

    AI researcher Elvis Saravia highlights data showing how combining specialized models with human expertise yields a compounding capability effect. By dynamically routing tasks to optimal models, developers can bypass monolithic LLM bottlenecks to build more robust and cost-effective architectures.

    "OpenRouter's new Advisor feature allows developers to build more cost-effective workflows by executing tasks on faster, smaller models and dynamically routing complex queries to a stronger advisor model mid-generation."

  • LMCache is an open-source KV cache management layer that supercharges LLM inference by sharing and reusing KV caches across GPUs, CPUs, and local/remote storage layers. [OPEN SOURCE]

    5d ago

    LMCache optimizes large language model (LLM) inference by extracting the Key-Value (KV) cache from GPU memory and treating it as a persistent, reusable asset rather than ephemeral data. By storing the KV cache across a tiered storage hierarchy (CPU RAM, local disks, and remote backends like Redis or S3), LMCache enables prefix reuse across different queries, sessions, and physical machines. This decouples caching from the inference engine itself, and integrations with popular platforms like vLLM and SGLang reduce Time-to-First-Token (TTFT) and boost serving throughput.

    "LMCache is an open-source KV cache management layer that reduces Time-to-First-Token (TTFT) and serving costs by sharing and reusing KV caches across GPUs, CPUs, and tiered storage in inference engines like vLLM and SGLang."

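    Prefix reuse can be illustrated with a small lookup structure: if two requests share their leading tokens, the cached work for that shared prefix does not need to be recomputed. The sketch below is a toy model of that idea and not LMCache's code. It assumes fixed-size token chunks keyed by a hash of the whole prefix, which the summary does not specify, and `PrefixCache`, `prefix_keys`, and the `"kv-block"` placeholder are hypothetical.

```python
import hashlib
from typing import Dict, List

CHUNK_SIZE = 4  # Illustrative only; chosen small so the example is easy to follow.


def prefix_keys(tokens: List[int]) -> List[str]:
    """Return one key per full chunk; each key hashes the whole prefix ending at that chunk."""
    keys: List[str] = []
    running = hashlib.sha256()
    for end in range(CHUNK_SIZE, len(tokens) + 1, CHUNK_SIZE):
        running.update(repr(tokens[end - CHUNK_SIZE:end]).encode())
        keys.append(running.copy().hexdigest())
    return keys


class PrefixCache:
    def __init__(self) -> None:
        # A dict stands in for the tiered store (CPU RAM, disk, remote backend).
        self.store: Dict[str, str] = {}

    def put(self, tokens: List[int]) -> None:
        for key in prefix_keys(tokens):
            self.store.setdefault(key, "kv-block")  # Placeholder for real KV tensors.

    def reusable_tokens(self, tokens: List[int]) -> int:
        """Count how many leading tokens already have cached KV blocks."""
        hits = 0
        for key in prefix_keys(tokens):
            if key not in self.store:
                break
            hits += CHUNK_SIZE
        return hits


cache = PrefixCache()
cache.put([1, 2, 3, 4, 5, 6, 7, 8])
shared = cache.reusable_tokens([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12])
```

    Hashing the running prefix rather than each chunk in isolation ensures a chunk only counts as a hit when everything before it also matches.
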
  • Zhipu AI to open-source GLM-5.2 under MIT [OPEN SOURCE]

    5d ago

    Zhipu AI has announced plans to open-source its flagship GLM-5.2 coding model under the permissive MIT license next week. The model features a 1-million-token context window and is currently deployed on Zhipu's GLM Coding Plan.

    "Zhipu AI's upcoming MIT-licensed open-source release of the GLM-5.2 coding model with a 1-million-token context window provides developers with a powerful, accessible model for complex coding tasks."

  • Mosh is an open-source, model-driven application security testing harness that wraps around LLMs to automate penetration testing through discovery, planning, dockerized execution, and reporting. [OPEN SOURCE]

    5d ago

    Mosh (Model-driven Open Security Harness) is an open-source security testing application designed to automate the work of a security researcher. Instead of relying on raw prompts, the tool implements a multi-step workflow: application discovery (mapping routes and technologies), security planning (creating test hypotheses), and controlled test execution in Docker containers governed by engagement settings. It continuously writes structured reports and memory logs, allowing developers to safely run, review, and reproduce pen-testing results iteratively as vulnerabilities are resolved.

    "Mosh provides an open-source, LLM-driven security testing harness that automates dockerized penetration testing, helping developers safely find and reproduce application vulnerabilities."

  • Vercel Labs' open-source AI browser automation tool, agent-browser, has reached one million weekly downloads on npm, with developer Chris Tate sharing insights on optimizing startup performance. [OPEN SOURCE]

    5d ago

    Chris Tate announced that Vercel Labs' agent-browser, an open-source headless browser automation tool tailored for AI agents, has reached 1,000,000 weekly downloads on npm. Tate noted that a temporary download dip lined up with a transition from running via `npx` to a global install command (`npm i -g`). Implementing a global installation path reduced the tool's startup time to approximately 1 millisecond, which is crucial for low-latency agentic workflows.

    "Vercel Labs' open-source agent-browser reaching one million weekly downloads and optimizing startup to 1ms provides developers with a highly performant tool for low-latency web agent workflows."

  • agentsview tracks coding agent token usage [OPEN SOURCE]

    5d ago

    agentsview is a local-first desktop and CLI tool for browsing, searching, and analyzing AI coding agent sessions. Written in Go, it supports over 20 agents and acts as a 100x faster, privacy-preserving replacement for ccusage to track token usage and daily costs.

    "This open-source, local-first tool allows developers to track and analyze AI coding agent session logs, token usage, and daily costs privately."

  • Unity MCP builds 3D endless runner prototype [NEWS]

    5d ago

    Developer @givros shared their experience testing the Model Context Protocol (MCP) integration for Unity with Codex to build a 3D endless runner prototype. The test demonstrated that Unity MCP enables AI to autonomously construct, configure, and wire scenes and assets directly inside the editor without manual placement.

    "This demonstration showcases how Model Context Protocol (MCP) integration with Codex enables AI to autonomously construct and configure 3D environments inside the Unity editor."

  • Claude Fable 5 suffers massive prompt leak [SECURITY]

    5d ago

    Jailbreak researcher Pliny the Liberator bypassed Claude Fable 5's safety guardrails using a 'pack hunt' exploit to extract and publish its full system prompt. The leaked 120,000-character document behaves like a complex software specification, containing extensive tool definitions, schemas, and routing logic rather than a typical persona script.

    "The leaked 120,000-character system prompt exposes internal tool definitions, schemas, and complex routing logic of Claude Fable 5, providing developers with valuable insight into frontier model design."

  • Anthropic reverses course on Claude Fable 5 safeguards [UPDATE]

    5d ago

    Anthropic has updated its safety policy for Claude Fable 5 following pushback from developers over invisible safeguards that silently degraded queries. In response to concerns about unpredictability and transparency in agentic workflows, Anthropic committed to a visible fallback mechanism, openly routing flagged queries to Claude Opus 4.8 instead of silently degrading performance.

    "Anthropic's transition to a visible fallback mechanism and Opus 4.8 routing for flagged Claude Fable 5 queries addresses developer concerns over silent performance degradation in agentic workflows."

  • OpenAI is acquiring Ona, a secure cloud execution platform designed to run autonomous and persistent AI software engineering agents. [FUNDING]

    5d ago

    Ona provides sandboxed, enterprise-grade cloud environments built to run autonomous AI software engineering agents. Acting as a "mission control" for agents, it lets them carry out long-running tasks, write code, run tests, and open pull requests within secure, isolated spaces. OpenAI has agreed to acquire Ona and integrate its cloud environments and agent-management technology into the Codex ecosystem, targeting execution and governance challenges for enterprise AI agents.

    "OpenAI's acquisition of Ona will integrate secure, sandboxed execution environments into Codex, helping developers deploy autonomous software engineering agents more safely and reliably."

  • Crabbox maintainer uses Codex to manage issues and PRsINFRA

    Crabbox maintainer uses Codex to manage issues and PRs

    5d ago

    The maintainer of Crabbox, an open-source project by OpenClaw, has integrated Codex directly into the build process to help manage a flood of community contributions. Codex has been running continuously inside Crabbox for the past four days, becoming an essential piece of infrastructure for testing and landing PRs.

    "Integrating Codex directly into Crabbox's build process demonstrates the real-world viability of using autonomous AI agents to automate open-source issue triage and pull request management at scale."

  • HeyGen brings talking avatars to HyperFramesLAUNCH

    HeyGen brings talking avatars to HyperFrames

    5d ago

    HeyGen has integrated its AI talking avatars with HyperFrames, an open-source, agent-native video rendering framework. The integration allows developers and AI coding agents to programmatically automate deterministic video generation using web standards like HTML, CSS, and JavaScript.

    "Integrating HeyGen's talking avatars with the open-source HyperFrames framework enables developers and AI agents to programmatically automate deterministic video generation using standard web technologies."

  • Anthropic disputes Claude Fable 5 suspensionPOLICY

    Anthropic disputes Claude Fable 5 suspension

    5d ago

    Anthropic has challenged the U.S. government's suspension of its newly launched Claude Fable 5 model, arguing the cited jailbreak vulnerabilities are minor and present in competing models like GPT 5.5. The company expects the model to be back online by Monday.

    "Anthropic's challenge of the Claude Fable 5 suspension and its expectation to restore access by Monday provides developers with a crucial timeline for resuming work with the model."

  • US blocks foreign access to Claude modelsPOLICY

    US blocks foreign access to Claude models

    6d ago

    The U.S. Commerce Department has ordered Anthropic to suspend foreign nationals' access to its newly launched Claude Fable 5 and Mythos 5 AI models due to national security concerns. Anthropic complied by temporarily disabling the models for all users, though the company disputed the severity of the alleged jailbreak exploit that triggered the government's decision.

    "The U.S. Commerce Department's order to suspend foreign nationals' access to Claude Fable 5 and Mythos 5 has led to a global suspension of both models, directly disrupting developers who were integrating them."

  • Matt Silverlock builds Ferdinand agent on FlueNEWS

    Matt Silverlock builds Ferdinand agent on Flue

    6d ago

    Matt Silverlock announced Ferdinand, a custom chat and research agent built using Flue, a TypeScript-first programmable framework for autonomous AI workflows. The agent showcases the framework's ease of building specialized agentic workflows with structured tool usage.

    "The demonstration of Ferdinand showcases the capabilities of Flue, a TypeScript-first programmable framework designed to help developers build autonomous AI workflows with structured tool usage."

  • OpenCode Go adds Kimi K2.7 CodeUPDATE

    OpenCode Go adds Kimi K2.7 Code

    6d ago

    OpenCode has integrated Moonshot AI's new Kimi K2.7 Code model into its Go subscription service. The Mixture-of-Experts model is optimized for complex coding tasks and uses 30% fewer reasoning tokens, which improves latency and lowers costs.

    "Integrating Moonshot AI's Kimi K2.7 Code into OpenCode Go gives developers access to a coding-optimized Mixture-of-Experts model that reduces reasoning tokens by 30% to lower latency and costs."

  • Claude Fable 5 Refuses All ProgramBench TasksBENCHMARK

    Claude Fable 5 Refuses All ProgramBench Tasks

    6d ago

    Anthropic's Claude Fable 5 model refused all 200 tasks in the ProgramBench coding benchmark, a 100% refusal rate. Its strict cyber-safety guardrails flagged the program reconstruction tasks as security risks and blocked them, despite the model's strong performance on general coding benchmarks like SWE-bench Pro.

    "Claude Fable 5's 100% refusal rate on ProgramBench tasks highlights how strict cyber-safety guardrails can block program reconstruction despite high performance on other coding benchmarks."

  • Google Gemini-SQL2 Tops BIRD BenchmarkMODEL

    Google Gemini-SQL2 Tops BIRD Benchmark

    6d ago

    Google has introduced Gemini-SQL2, a specialized Text-to-SQL model powered by Gemini 3.1 Pro that leverages domain-specific fine-tuning. The model achieved a state-of-the-art score of 80.04% execution accuracy on the challenging BIRD benchmark.

    "Google's Gemini-SQL2, a specialized Text-to-SQL model powered by Gemini 3.1 Pro, achieves a state-of-the-art 80.04% execution accuracy on the BIRD benchmark, giving developers a highly accurate model for building database-querying applications."

  • Moonshot AI releases open-weight Kimi K2.7 CodeMODEL

    Moonshot AI releases open-weight Kimi K2.7 Code

    6d ago

    Moonshot AI has launched Kimi K2.7 Code, a 1-trillion parameter coding-focused Mixture-of-Experts (MoE) model with 32 billion active parameters. The model introduces native vision support, operates with a 256K context window, and reduces thinking token usage by 30% compared to Kimi K2.6, making it highly efficient for long-context programming and reasoning tasks.

    "Moonshot AI's release of Kimi K2.7 Code, a 1-trillion parameter open-weight Mixture-of-Experts coding model with native vision and 256K context, provides developers with a highly efficient open model for reasoning and long-context programming."

  • Linear Agent auto-fixes triage bugsUPDATE

    Linear Agent auto-fixes triage bugs

    6d ago

    Linear Agent can now write code to automatically resolve bugs as soon as they land in triage. This capability pushes the platform beyond standard issue tracking to actively participate in the engineering workflow.

    "The addition of auto-coding capabilities in Linear Agent allows developers and engineering teams to automatically resolve triage bugs directly within their issue-tracking workflow."

  • Google AI Studio launches prompt-to-app Android developmentUPDATE

    Google AI Studio launches prompt-to-app Android development

    6d ago

    Google AI Studio now supports building native Android applications from natural language prompts using an AI agent to generate Kotlin and Jetpack Compose projects. Developers can test these apps in a browser-based emulator, refine them via chat, and deploy them directly to physical devices or Google Play's internal testing tracks without local SDK configuration.

    "The addition of prompt-to-app Android development in Google AI Studio allows developers to build, test, and deploy native Android applications entirely through natural language without local SDK setup."

  • Mastra adds secure Railway Sandbox executionUPDATE

    Mastra adds secure Railway Sandbox execution

    6d ago

    Mastra has added support for Railway Sandboxes to enable secure, isolated code execution for TypeScript AI agents. The integration runs commands, scripts, and file write operations inside ephemeral Debian Linux VMs to protect the host infrastructure.

    "Mastra's new integration with Railway Sandboxes enables secure, isolated execution of code, scripts, and file operations for TypeScript AI agents inside ephemeral Linux VMs."