October 2025: AI updates from the past month

OpenAI announces agentic security researcher that can find and fix vulnerabilities

OpenAI has launched a private beta for a new AI agent called Aardvark that acts as a security researcher, finding vulnerabilities and applying fixes at scale.

“Software security is one of the most critical, and challenging, frontiers in technology. Each year, tens of thousands of new vulnerabilities are discovered across enterprise and open-source codebases. Defenders face the daunting tasks of finding and patching vulnerabilities before their adversaries do. At OpenAI, we are working to tip that balance in favor of defenders,” OpenAI wrote in a blog post.

The agent continuously analyzes source code repositories to identify vulnerabilities, assess their exploitability, prioritize severity, and propose patches. Instead of using traditional analysis methods like fuzzing or software composition analysis, Aardvark uses LLM-powered reasoning and tool use.

Cursor 2.0 enables eight agents to work in parallel without interfering with each other

The AI coding editor Cursor announced the launch of Cursor 2.0, the next iteration of the platform, featuring a new interface for working with multiple agents and its first-ever coding model.

The new multi-agent interface centers around agents instead of files. With this new interface, up to eight agents can work in parallel, using git worktrees and remote trees to prevent them from interfering with each other. It also allows developers to have multiple models attempt the same problem and see which one produces the best output.
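To illustrate how git worktrees keep parallel work separate, here is a minimal sketch that builds one `git worktree add` command per agent, so each agent gets its own branch and working directory. It is not Cursor's implementation: the branch naming scheme, the directory layout, and the `worktree_commands` helper are assumptions made for illustration. Only the limit of eight comes from the announcement.

```python
from pathlib import Path

MAX_AGENTS = 8  # limit mentioned in the announcement


def worktree_commands(repo: Path, agents: list[str], base: str = "main") -> list[list[str]]:
    """Build one `git worktree add` command per agent.

    Branch names and directory layout are hypothetical; the commands are
    only constructed here, not executed.
    """
    if len(agents) > MAX_AGENTS:
        raise ValueError(f"at most {MAX_AGENTS} agents can run in parallel")
    commands = []
    for name in agents:
        branch = f"agent/{name}"                    # hypothetical branch name
        path = repo.parent / f"{repo.name}-{name}"  # sibling directory per agent
        commands.append(
            ["git", "-C", str(repo), "worktree", "add", "-b", branch, str(path), base]
        )
    return commands


if __name__ == "__main__":
    for cmd in worktree_commands(Path("/tmp/demo-repo"), ["a1", "a2"]):
        print(" ".join(cmd))
```

Because every worktree has its own checkout and branch, edits made by one agent never touch the files another agent is working on. The results can later be compared or merged like any other branches.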

While this new interface is designed for agents, developers will still be able to open files or switch back to the classic IDE as needed.

The new coding model, Composer, is four times faster than comparable models, the company claims. It was designed for low-latency agentic coding tasks in Cursor, and it can complete most turns in less than 30 seconds.

Workato launches Enterprise MCP for SaaS platforms

Organizations are spending big dollars on AI agents, but are finding that integrating the agents into all the systems the business needs to function is a very high hurdle.

To help make SaaS platforms agent-ready, integration orchestration company Workato launched Workato Enterprise MCP, which the company said in its announcement can “turn existing workflows, integrations, and APIs into rich, multi-step agent skills that any large-language-model (LLM)-based agent can call, including ChatGPT, Claude, Gemini, and Cursor.”

Adam Seligman, chief technology officer at Workato, told SD Times that “the thing we keep coming back to over and over is agents show a lot of promise, but to really work for business, they have to get access to business data. And they have to be able to do things inside your business, but do it in a way that you trust. And it’s really hard to get those two things right.”

JetBrains launches open benchmarking platform for measuring AI productivity

JetBrains has launched a new tool designed to enable developers to measure their actual productivity gains from AI tools.

The company’s Developer Productivity AI Arena (DPAI Arena) is an open benchmarking platform for measuring how well AI development tools complete real-world software engineering tasks. According to the company, current benchmarks that LLMs are run against rely on outdated datasets, cover a narrow range of technologies, and focus primarily on issue-to-patch workflows.

“As AI coding tools advance rapidly, the industry still lacks a neutral, standards-based framework to measure their real impact on developer productivity,” the company wrote in a blog post.

DPAI Arena uses a flexible, track-based architecture to enable reproducible comparisons across workflows like patching, bug fixes, PR review, test generation, static analysis, and more.

GitHub unveils Agent HQ, the next evolution of its platform that focuses on agent-based development

During its annual conference, GitHub Universe, GitHub shared its plans for Agent HQ, its vision for the future of the platform where AI agents are natively integrated across all of GitHub.

As part of this Agent HQ initiative, over the next several months, paid GitHub Copilot users will gain direct access to popular coding agents from Anthropic, OpenAI, Google, Cognition, xAI, and more.

Agent HQ brings with it several new capabilities to support this next evolution, the first of which is mission control, a central command center for assigning, steering, and tracking the work of multiple agents across GitHub, Copilot CLI, and VS Code.

Mission control’s branch controls give developers granular oversight over running checks for code created by the agents. Identity features will also be introduced to allow developers to manage agents like they would other coworkers and control which agent is building a task, manage access, and enforce policies.

OpenAI completes restructuring, strikes new deal with Microsoft

OpenAI today announced that it has completed the restructuring of its business. When the company was founded in 2015, it was launched as a non-profit organization, and that non-profit has controlled the for-profit arm of the business.

Today’s restructuring turns the for-profit arm into a public benefit corporation called OpenAI PBC. The OpenAI Foundation, the new name for the non-profit, will still control the for-profit and hold a 26% equity stake in OpenAI PBC, a stake currently valued at around $130 billion.
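As a quick back-of-the-envelope check, the two figures above imply a total company valuation. The sketch below assumes that the roughly $130 billion figure refers to the Foundation's 26% stake rather than to the company as a whole. The variable names are illustrative and the result is an approximation, not a reported number.

```python
# Figures taken from the paragraph above; both are approximate.
stake_value_usd = 130e9  # assumed to be the value of the 26% stake
stake_fraction = 0.26


def implied_valuation(stake_value: float, fraction: float) -> float:
    """Total valuation implied by a stake worth `stake_value` at `fraction` ownership."""
    return stake_value / fraction


total = implied_valuation(stake_value_usd, stake_fraction)
print(f"Implied valuation: about ${total / 1e9:.0f} billion")
```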

A public benefit corporation differs from traditional corporate structures in that it is “required to advance its stated mission and consider the broader interests of all stakeholders, ensuring the company’s mission and commercial success advance together,” OpenAI’s website explains.

Microsoft announces public preview for planning capability that improves how Copilot in Visual Studio handles complex tasks

Microsoft has announced a public preview for a new feature that aims to enable Copilot in Visual Studio to tackle more complex projects.

With its new planning capability in Agent Mode, Copilot will research the codebase to break down big tasks into smaller and more manageable tasks, while also iterating on its plan as it works through the steps.

“Planning makes Copilot more predictable and consistent by giving it a structured way to reason about your project. It builds on techniques from hierarchical and closed-loop planning research – enabling Copilot to plan at a high level, execute step by step, and adjust dynamically as it learns more about your codebase and issues encountered during implementation,” Rhea Patel, product manager at Microsoft, wrote in a blog post.

GitKraken releases Insights to help companies measure ROI of AI

GitKraken, a software engineering intelligence company that specializes in improving the developer experience, announced the launch of GitKraken Insights to provide companies with better insights into AI’s impact on developer productivity.

Matt Johnston, CEO of GitKraken, told SD Times that despite the incremental investments in and perceived speed gains from AI, companies struggle to understand the impact. “I was talking to a VP of developer experience at a large Silicon Valley company, and he was basically saying, ‘We’ve made investments of thousands of seats in Cursor and Copilot and Claude, and we can’t really tell what’s getting used… and how on earth do I measure this in a way that’s compelling to my business leaders.’”

GitKraken Insights brings together several different metrics (DORA metrics, code quality analysis, technical debt tracking, AI impact measurement, and developer experience indicators) to paint a picture of what’s happening across the development lifecycle.

Mabl announces updates to Agentic Testing Teammate

The Agentic Testing Teammate works alongside human testers to make the process more efficient. New updates include AI vectorizations and test semantic search, improvements to test coverage, and enhancements to the MCP Server that enable testers to do a range of tasks directly within their IDE, including Test Impact Analysis, intelligent test creation, and failure recommendations.

“This new work is built on the idea that an agent can become an integral part of your testing team,” said Dan Belcher, co-founder of mabl. “Unlike scripting frameworks and general-purpose large language models, mabl builds deep knowledge about your application over time and uses that knowledge to make it–and your team–more effective.”

Couchbase 8.0 adds three new vector indexing and retrieval capabilities

These new capabilities are designed to support diverse vector workloads that facilitate real-time AI applications.

Hyperscale Vector Index is based on the DiskANN nearest-neighbor search algorithm and enables operation across partitioned disks for distributed processing. Composite Vector Index supports pre-filtered queries that can scope the specific vector being sought. Search Vector Index supports hybrid searches containing vectors, lexical search, and structured query criteria in a single SQL++ request.
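The idea behind a pre-filtered vector query can be sketched in a few lines: a structured predicate narrows the candidate set first, and only the survivors are ranked by vector similarity. This is a brute-force illustration in plain Python, not Couchbase's API, SQL++ syntax, or the DiskANN algorithm. The document fields, the `prefiltered_search` helper, and the sample data are all hypothetical.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def prefiltered_search(docs, query_vec, predicate, k=2):
    """Apply the structured filter first, then rank the survivors by similarity."""
    candidates = [d for d in docs if predicate(d)]
    ranked = sorted(candidates, key=lambda d: cosine(d["vec"], query_vec), reverse=True)
    return [d["id"] for d in ranked[:k]]


# Hypothetical documents with a scalar field and a tiny 2-D embedding.
DOCS = [
    {"id": "a", "category": "shoes", "vec": [1.0, 0.0]},
    {"id": "b", "category": "shoes", "vec": [0.6, 0.8]},
    {"id": "c", "category": "hats", "vec": [1.0, 0.1]},
]

result = prefiltered_search(DOCS, [1.0, 0.0], lambda d: d["category"] == "shoes")
print(result)
```

Document "c" is close to the query vector but is never scored, because the category filter removes it before the similarity ranking runs.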

Anthropic expands memory to all paid Claude users

Anthropic announced that the recent memory feature in Claude is being rolled out to Pro and Max plan users, making it available to all paid users now.

Memory was originally announced in early September, but was only available to Team and Enterprise users to start with.

Memory allows Claude to remember your projects and preferences so that you don’t need to re-explain important context across sessions. “Great work builds over time. With memory, each conversation with Claude improves the next,” Anthropic wrote in its initial announcement.

Harness brings vibe coding to database migration with new AI-Powered Database Migration Authoring feature

Harness is on a mission to make it easier for developers to do database migrations with its new AI-Powered Database Migration Authoring feature. This new capability allows users to describe schema changes in natural language to receive a production-ready migration.

For example, a developer could ask “Create a table named animals with columns for genus_species and common_name. Then add a related table named birds that tracks unladen airspeed and proper name. Add rows for Captain Canary, African swallow, and European swallow.”

Harness’ platform would then analyze the current schema and policies, generate a backward-compatible migration, validate the change for safety and compliance, commit it to Git for testing, and create rollback migrations.
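To make the example prompt concrete, the sketch below shows what a forward migration paired with a rollback might look like, using Python's built-in `sqlite3` module. It is one possible reading of the prompt and not output from Harness. The column types, the `animal_id` foreign key, and the choice to store the three names as `common_name` values are assumptions, and fields the prompt leaves unspecified are left empty rather than invented.

```python
import sqlite3

# Forward migration: one possible interpretation of the natural-language request.
UP = """
CREATE TABLE animals (
    id INTEGER PRIMARY KEY,
    genus_species TEXT,
    common_name TEXT NOT NULL
);
CREATE TABLE birds (
    id INTEGER PRIMARY KEY,
    animal_id INTEGER NOT NULL REFERENCES animals(id),  -- assumed relationship
    proper_name TEXT,
    unladen_airspeed REAL
);
"""

# Rollback migration: drop the dependent table first.
DOWN = """
DROP TABLE birds;
DROP TABLE animals;
"""

NAMES = ["Captain Canary", "African swallow", "European swallow"]


def migrate_up(conn: sqlite3.Connection) -> None:
    conn.executescript(UP)
    conn.executemany("INSERT INTO animals (common_name) VALUES (?)", [(n,) for n in NAMES])
    conn.commit()


def migrate_down(conn: sqlite3.Connection) -> None:
    conn.executescript(DOWN)


def table_names(conn: sqlite3.Connection) -> list[str]:
    rows = conn.execute("SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")
    return [r[0] for r in rows]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    migrate_up(conn)
    print(table_names(conn))
    migrate_down(conn)
    print(table_names(conn))
```

Keeping the rollback next to the forward migration lets both be reviewed and tested together before the change reaches production.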

Red Hat Developer Lightspeed brings AI assistance to Red Hat’s Developer Hub and migration toolkit

Red Hat Developer Lightspeed has been integrated into both the Red Hat Developer Hub and the migration toolkit for applications (MTA).

In the Red Hat Developer Hub, it acts as an assistant to speed up non-coding tasks, like exploring application design approaches, writing documentation, generating test plans, and troubleshooting applications.

In the migration toolkit, Red Hat Developer Lightspeed automates source code refactoring within the IDE. It leverages MTA’s static code analysis to understand migration issues and how to fix them, and also improves over time by learning what made past changes successful.

MariaDB unifies transactional, analytical, and vector databases in MariaDB Enterprise Platform 2026 release

MariaDB’s Enterprise Platform 2026 release was announced this week, with the promise that it will act as “the definitive database platform for building next-generation intelligent applications.”

To support agentic AI, the company added native RAG for grounding LLMs with context from MariaDB without needing embeddings, vector stores, or retrieval pipelines. The company also added ready-to-use agents within the platform, including a developer copilot that connects to the database and can respond to natural language queries, and a DBA copilot that can handle tasks like performance tuning and debugging.

Additionally, the company added an integrated MCP server so that agents can interact with MariaDB databases. The MCP interface in MariaDB allows users to combine vector search, LLMs, and standard SQL operations, and allows agents to launch serverless databases in the cloud.

Spotify Portal now generally available and packed with features for improving dev experience

Spotify Portal for Backstage provides developers with a ready-to-use version of Backstage, its open source solution for building internal developer portals (IDPs).

AiKA, which is an AI assistant for Portal, can now connect to third-party MCP servers and trigger actions in Portal. AiKA itself also functions as an MCP server, allowing developers to connect it up to tools like Cursor or Copilot and access Portal data.

“The general availability of Spotify Portal marks a pivotal moment in how organizations build, measure, and optimize developer experience. What began as an internal tool for Spotify engineers is now a fully-fledged platform for enterprises, combining the reliability of Backstage, the insight of Confidence, and the speed of AI-driven workflows,” Spotify wrote.

Sonar announces new solution to optimize training datasets for coding LLMs

Sonar, a company that specializes in code quality, announced a new solution that will improve how LLMs are trained for coding purposes.

According to the company, LLMs that are used to help with software development are often trained on publicly available, open source code containing security issues and bugs, which become amplified throughout the training process. “Even a small amount of flawed data can degrade models of any size, disproportionately degrading their output,” Sonar wrote in an announcement.

SonarSweep (now in early access) aims to mitigate these issues by ensuring that models are learning from high-quality, secure examples.

It works by identifying and fixing code quality and security issues in the training data itself. After analyzing the dataset, it applies a strict filtering process to remove low-quality code while also balancing the updated dataset to ensure it will still offer diverse and representative learning.
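The filter-then-balance step described above can be sketched generically. Sonar has not published the details in this summary, so everything here is an assumption for illustration: the per-sample `quality` score, the threshold, the per-language cap used as a stand-in for "balancing", and the `filter_and_balance` helper. The sketch also skips the fixing step entirely.

```python
from collections import defaultdict


def filter_and_balance(samples, min_quality, per_language_cap):
    """Drop low-quality samples, then cap how many each language contributes.

    `quality` is a hypothetical score in [0, 1]; the cap is one simple way to
    keep a single language from dominating the filtered dataset.
    """
    kept = [s for s in samples if s["quality"] >= min_quality]
    by_language = defaultdict(list)
    for sample in kept:
        by_language[sample["language"]].append(sample)

    balanced = []
    for language in sorted(by_language):
        best_first = sorted(by_language[language], key=lambda s: s["quality"], reverse=True)
        balanced.extend(best_first[:per_language_cap])
    return balanced


# Hypothetical dataset entries.
SAMPLES = [
    {"id": 1, "language": "python", "quality": 0.90},
    {"id": 2, "language": "python", "quality": 0.80},
    {"id": 3, "language": "python", "quality": 0.20},
    {"id": 4, "language": "go", "quality": 0.70},
    {"id": 5, "language": "python", "quality": 0.95},
]

selected = filter_and_balance(SAMPLES, min_quality=0.5, per_language_cap=2)
print([s["id"] for s in selected])
```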

Amazon launches Quick Suite to provide agentic AI across applications and AWS services

Amazon Quick Suite allows users to ask questions, conduct deep research, analyze and visualize data, and create automations.

It can connect to internal repositories, like wikis or intranet, and AWS services. Amazon also offers 50+ built-in connectors to applications like Adobe Analytics, SharePoint, Snowflake, Google Drive, OneDrive, Outlook, ServiceNow, and Databricks, as well as support for 1,000+ apps via connecting to their MCP servers.

This deep connection across the enterprise enables Quick Sight to analyze data across all of a company’s systems and create complex business workflows across multiple applications and departments.

“Unlike traditional business intelligence tools that work only with databases and data warehouses, Quick Sight’s agentic experience analyzes all forms of data across all your systems and apps, including your documents,” Amazon wrote in a blog post.

Google unveils Gemini Enterprise to offer companies a more unified platform for AI innovation

Google is announcing a new offering built around Gemini, designed specifically with large enterprise use in mind.

Gemini Enterprise consolidates six core components:

Advanced Gemini models
A no-code workbench for analyzing information and orchestrating agents
Pre-built Google agents for tasks like deep research or data insights
The ability to connect to company data
A central governance framework for visualizing and securing all agents
Access to an ecosystem of over 100,000 industry partners

“By bringing all of these components together through a single interface, Gemini Enterprise transforms how teams work. It moves beyond simple tasks to automate entire workflows and drive smarter business outcomes, all on Google’s secure, enterprise-grade architecture,” Thomas Kurian, CEO of Google Cloud, wrote in a blog post.

Atlassian shares major updates to its genAI assistant Rovo at Team ’25 Europe

Atlassian is hosting its annual user conference Team ’25 Europe this week in Barcelona, and during the event, the company shared several new and upcoming updates to its generative AI assistant Rovo.

Atlassian announced the general availability of its AI coding agent Rovo Dev. Rovo Dev can help with code reviews, documentation, dependency cleanups, and more, and it leverages context from tickets, docs, incidents, and business goals to provide developers with information that can help them make more informed decisions.

Additionally, starting early next year, Rovo Search will become the default search in Jira, which will allow Jira’s search to suggest relevant issues and projects.

Rovo Chat will also be getting over 100 out-of-the-box modular capabilities from Atlassian and its partners that can be used in chat, agents, and workflows. Other new Chat capabilities include the ability to remember past conversations and preferences and a new collaborative workspace called Canvas.

Google launches ecosystem of extensions for Gemini CLI

Google is launching Gemini CLI extensions to allow different development tools to connect up to the Gemini CLI.

Each extension includes a playbook that teaches the CLI how to effectively use that tool, eliminating the need for developers to configure them. “If you want to look under the hood, Gemini CLI extensions package instructions, MCP servers and custom commands into a familiar and user-friendly format,” Google wrote in a blog post.

Twenty-two extensions are available at launch from Google partners Atlassian, Canva, Confluent, Dynatrace, Elastic, Figma, GitLab, Grafana Labs, Harness, HashiCorp, MongoDB, Neo4j, Pinecone, Postman, Qodo, Shopify, Snyk, Sonar, Stripe, ThoughtSpot, Weights & Biases by CoreWeave, and WIX.

IBM adds new capabilities to watsonx Orchestrate to facilitate agentic AI at scale

As IBM kicked off its annual developer event TechXchange 2025, it announced several new capabilities to enable organizations to unlock value from agentic AI.

“There’s really been a lot of buzz in the industry,” said Bruno Aziza, vice president of Data, AI, and Analytics Strategy at IBM Software. “I think if you look at the context of everything that’s going on, customers are struggling. They’re struggling to get value from their investment.”

The company announced many updates to its AI agent orchestration platform, watsonx Orchestrate. The platform now includes AgentOps, an observability and governance layer for AI agents; Agentic Workflows, standardized and reusable flows that can be used to build and sequence multi-agent systems; and Langflow integration to reduce agent setup time.

OpenAI DevDay: ChatGPT Apps, AgentKit, and GA launch of Codex

OpenAI held its annual Developer Day event this week where it announced several updates to its products.

The company unveiled apps in ChatGPT as well as an SDK for developers to build them. Companies that have created apps that are already available include Booking.com, Canva, Coursera, Figma, Expedia, Spotify, and Zillow.

When a user says the name of an available app in a prompt, ChatGPT will automatically surface that app in the chat. For example, saying “Spotify, make a playlist for my party this Friday” will bring in the Spotify app. ChatGPT will also be able to suggest apps when it thinks they’re relevant to the conversation, such as suggesting Zillow’s app in a conversation about buying a house.

Google’s coding agent Jules now works in the command line

Google’s coding agent Jules can now be used directly in developers’ command lines so that it can act as more of a coding companion.

According to Google, it created this new command line interface, called Jules Tools, out of a recognition that the terminal is where developers spend most of their time.

Jules Tools allows developers to spin up tasks, inspect what Jules is doing, and integrate Jules into automation. “Think of Jules Tools as both a dashboard and a command surface for your coding agent,” Google wrote in a blog post.

Amazon Bedrock AgentCore MCP server now available

The AgentCore MCP server offers built-in support for runtime, gateway integration, identity management, and agent memory. It was created to speed up the process of creating components that are compatible with Bedrock AgentCore.

“What typically takes significant time and effort, for example learning about Bedrock AgentCore services, integrating Runtime and Tools Gateway, managing security configurations, and deploying to production can now be completed in minutes through conversational commands with your coding assistant,” AWS wrote in a blog post.

DigitalOcean updates Gradient AI Platform

The Gradient AI Platform is a platform for building AI agents without needing to manage the underlying infrastructure. New features that have been added include support for image generation, auto-indexing of knowledge bases, and VPC integration.

Additionally, DigitalOcean revealed that it will be expanding the platform further in the next few weeks with new offerings like the Gradient AI AgentDevelopmentKit and Gradient AI Genie, which integrates into IDEs and can be used to manage multi-agent systems using natural language.

Microsoft announces preview of its new Agent Framework

Microsoft has announced a preview of the Microsoft Agent Framework, an open-source development kit for .NET and Python for creating AI agents and multi-agent workflows.

It supports creating individual agents as well as graph-based workflows to connect up multiple agents.

According to Microsoft, the Agent Framework is a direct successor to its other projects Semantic Kernel and AutoGen, using foundations from both. It brings together Semantic Kernel’s enterprise-grade features like thread-based state management, type safety, filters, telemetry, and model and embedding support, with AutoGen’s abstractions for single- and multi-agent patterns.

Mendix updates its low-code platform with agentic AI features

New agent and genAI features include an agent builder, the ability to create project plans using generative AI, the ability to create microflows and workflows with AI, and support for MCP.

Another focus area of the release is business process automation, and new features related to that include the ability for Mendix Workflows to call AI agents, dynamic case management, and Global Inbox, a single view for all tasks from multiple distributed workflows.

California passes law to ensure safe innovation of frontier AI models

Earlier this week, California’s governor Gavin Newsom signed a new law designed to ensure safe development and deployment of frontier AI models.

“California has proven that we can establish regulations to protect our communities while also ensuring that the growing AI industry continues to thrive,” Newsom said. “This legislation strikes that balance. AI is the new frontier in innovation, and California is not only here for it – but stands strong as a national leader by enacting the first-in-the-nation frontier AI safety legislation that builds public trust as this emerging technology rapidly evolves.”

The law, SB 53, establishes requirements for companies developing frontier AI models, spanning five categories: transparency, innovation, safety, accountability, and responsiveness.

Slack evolves to support agentic capabilities built on conversation data

Salesforce is announcing several major updates to Slack that will enable customers to leverage their conversation history for AI apps and agents.

The company is announcing a real-time search (RTS) API, which surfaces up-to-date discussions, files, and channels to provide agents with access to context-aware information. To ensure secure use of data, data stays in Slack and the API adheres to existing user access permissions and only retrieves data that is relevant to the query.
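The two guarantees described above, respecting the requesting user's permissions and returning only query-relevant data, can be illustrated with a small in-memory sketch. This is not the Slack RTS API. The message structure, the `search_messages` helper, and the keyword-matching rule are hypothetical stand-ins chosen to show the order of operations: filter by access first, then by relevance.

```python
def search_messages(messages, user_channels, query, limit=5):
    """Return recent messages the user may see that match the query.

    Access is checked before relevance, so content from channels the user
    cannot read is never scored or returned.
    """
    terms = query.lower().split()
    visible = [m for m in messages if m["channel"] in user_channels]
    hits = [m for m in visible if any(t in m["text"].lower() for t in terms)]
    hits.sort(key=lambda m: m["ts"], reverse=True)  # newest first
    return hits[:limit]


# Hypothetical conversation data.
MESSAGES = [
    {"channel": "eng", "ts": 3, "text": "Deploy postponed to Friday"},
    {"channel": "eng", "ts": 1, "text": "Lunch options?"},
    {"channel": "exec", "ts": 2, "text": "Deploy budget approved"},
]

results = search_messages(MESSAGES, user_channels={"eng"}, query="deploy")
print([m["text"] for m in results])
```

The message in the "exec" channel also mentions "deploy", but it is excluded because the requesting user has no access to that channel.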

“It unlocks your organization’s collective intelligence, securely connecting agents to conversations and decisions that were once trapped in silos,” Salesforce wrote in a blog post.

Anthropic claims its newly released Claude Sonnet 4.5 is the “best coding model in the world”

Claude Sonnet 4.5 achieves 77.2% on SWE-bench for software engineering, compared to 74.5% for Claude Opus 4.1 and 72.7% for Claude Sonnet 4. For external comparison, GPT-5 Codex scored 74.5%, GPT-5 scored 72.8%, and Gemini 2.5 Pro scored 67.2%.

Additionally, it leads in the OSWorld benchmark, which tests AI models on real-world computer tasks. It scored 61.4% on that benchmark, beating out Claude Sonnet 4, which scored 42.2%.
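To put the reported scores side by side, the short sketch below computes Claude Sonnet 4.5's lead over each other model in percentage points. The scores are the ones quoted above. The `lead_over` helper is just illustrative arithmetic and says nothing about how the benchmarks themselves are run.

```python
# Scores (percent) as reported in the paragraphs above.
SWE_BENCH = {
    "Claude Sonnet 4.5": 77.2,
    "Claude Opus 4.1": 74.5,
    "GPT-5 Codex": 74.5,
    "GPT-5": 72.8,
    "Claude Sonnet 4": 72.7,
    "Gemini 2.5 Pro": 67.2,
}
OSWORLD = {"Claude Sonnet 4.5": 61.4, "Claude Sonnet 4": 42.2}


def lead_over(scores: dict[str, float], leader: str) -> dict[str, float]:
    """Percentage-point gap between the leader and every other model."""
    top = scores[leader]
    return {name: round(top - s, 1) for name, s in scores.items() if name != leader}


swe_gaps = lead_over(SWE_BENCH, "Claude Sonnet 4.5")
osworld_gaps = lead_over(OSWORLD, "Claude Sonnet 4.5")

for name, gap in swe_gaps.items():
    print(f"SWE-bench: +{gap} points over {name}")
for name, gap in osworld_gaps.items():
    print(f"OSWorld: +{gap} points over {name}")
```

On these numbers the SWE-bench lead over the nearest competitors is under three points, while the OSWorld gain over Claude Sonnet 4 is far larger.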

“Sonnet 4.5 can produce near-instant responses or extended, step-by-step thinking that’s made visible to the user,” Anthropic says.

According to Anthropic, Claude Sonnet 4.5 also shows better domain-specific knowledge and reasoning in the fields of finance, law, and medicine.

Workato announces MCP platform

Workato Enterprise MCP provides customers with access to over 100 fully managed MCP servers that can connect with different LLMs and agents, including ChatGPT, Claude.AI, Amazon Q, Cursor, and Google Gemini. Some of the MCP servers available in the platform include ones from Atlassian, Box, Reddit, Salesforce, Okta, and Shopify.

“At Workato, we hear daily that while MCP is exciting, enterprises still face challenges making MCP work securely, effectively, and reliably at scale,” said Adam Seligman, Chief Technology Officer at Workato. “Workato Enterprise MCP changes that by bringing the full spectrum of business processes, from the front office to the back office and everything in between, to AI agents through MCP. With pre-built, enterprise-grade servers and skills, we’re giving global enterprises a first-of-its-kind solution that unlocks AI agents to safely execute real business processes at scale, delivering measurable business value.”

VibeSec embeds security analysis into AI coding models to prevent generation of insecure code

OX Security is shifting security as far left as it can go with the launch of VibeSec, which it says can stop insecure AI-generated code before the code even gets generated.

It does this by embedding dynamic security context into the coding model so that it doesn’t suggest code that contains security issues.

“VibeSec doesn’t just accelerate security – it fundamentally changes how security operates. For the first time, security moves faster than vulnerabilities,” said Neatsun Ziv, co-founder and CEO at OX Security.

OutSystems launches Agent Workbench

Agent Workbench allows users to create and orchestrate AI agents that leverage their company’s data sets and workflows. For example, in early access, Axos Bank built a log analysis agent to interpret error logs and Thermo Fisher Scientific used it to build a Customer Escalation Agent that interprets unstructured data from customer interactions.

“Agent Workbench was created to give our customers the tools they need to build the agentic future with OutSystems. Our Early Access Program participants have realized impressive results with Agent Workbench, positioning them as industry leaders in agentic AI,” said Woodson Martin, CEO of OutSystems.