{"id":13968,"date":"2026-04-20T18:05:57","date_gmt":"2026-04-20T18:05:57","guid":{"rendered":"https:\/\/techtrendfeed.com\/?p=13968"},"modified":"2026-04-20T18:05:57","modified_gmt":"2026-04-20T18:05:57","slug":"toolsimulator-scalable-device-testing-for-ai-brokers","status":"publish","type":"post","link":"https:\/\/techtrendfeed.com\/?p=13968","title":{"rendered":"ToolSimulator: scalable tool testing for AI agents"},"content":{"rendered":"<p> <br \/>\n<\/p>\n<div id=\"\">\n<p>You can use<strong> ToolSimulator, <\/strong>an\u00a0LLM-powered tool simulation framework inside Strands Evals, to thoroughly and safely test AI agents that rely on external tools, at scale. Instead of risking live API calls that expose personally identifiable information (PII) or trigger unintended actions, or settling for static mocks that break in multi-turn workflows, you can use ToolSimulator\u2019s large language model (LLM)-powered simulations to validate your agents. Available today as part of the <a href=\"https:\/\/strandsagents.com\/docs\/user-guide\/evals-sdk\/quickstart\/\" target=\"_blank\" rel=\"nofollow noopener\">Strands Evals Software Development Kit (SDK)<\/a>, ToolSimulator helps you catch integration bugs early, test edge cases comprehensively, and ship production-ready agents with confidence.<\/p>\n<table class=\"styled-table\" style=\"height: 159px\" border=\"1px\" width=\"743\" cellpadding=\"10px\">\n<tbody>\n<tr>\n<td style=\"padding: 10px;border: 1px solid #dddddd\"><strong>In this post, you&#8217;ll learn how to:<\/strong><\/p>\n<ul>\n<li>Set up ToolSimulator and register tools for simulation<\/li>\n<li>Configure stateful tool simulations for multi-turn agent workflows<\/li>\n<li>Enforce response schemas with Pydantic models<\/li>\n<li>Integrate ToolSimulator into a complete Strands Evals evaluation pipeline<\/li>\n<li>Apply best practices for simulation-based agent evaluation<\/li>\n<\/ul>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<h2>Prerequisites<\/h2>\n<p>Before you begin, make sure that you have the following:<\/p>\n<ul>\n<li>Python 3.10 or later installed in your environment<\/li>\n<li>Strands Evals SDK installed: <code>pip install strands-evals<\/code><\/li>\n<li>Basic familiarity with Python, including decorators and type hints<\/li>\n<li>Familiarity with AI agents and tool-calling concepts (API calls, function schemas)<\/li>\n<li>Pydantic knowledge is helpful for the advanced schema examples, but isn&#8217;t required to get started<\/li>\n<li>An AWS account isn&#8217;t required to run ToolSimulator locally<\/li>\n<\/ul>\n<h2>Why tool testing challenges your development workflow<\/h2>\n<p>Modern AI agents don\u2019t just reason. They call APIs, query databases, invoke Model Context Protocol (MCP) servers, and interact with external systems to complete tasks. Your agent\u2019s behavior depends not only on its reasoning, but on what those tools return. When you test these agents against live APIs, you run into three challenges that slow you down and put your systems at risk.<\/p>\n<p>Three challenges that live APIs create:<\/p>\n<ul>\n<li><strong>External dependencies slow you down. <\/strong>Live APIs impose rate limits, experience downtime, and require network connectivity. When you\u2019re running hundreds of test cases, these constraints make comprehensive testing impractical.<\/li>\n<li><strong>Test isolation becomes risky. <\/strong>Real tool calls trigger real side effects. You risk sending actual emails, modifying production databases, or booking actual flights during testing. Your agent tests shouldn\u2019t interact with the systems that they\u2019re testing against.<\/li>\n<li><strong>Privacy and security create barriers. 
<\/strong>Many tools handle sensitive data, including user information, financial records, and PII. Running tests against live systems unnecessarily exposes that data and creates compliance risks.<\/li>\n<\/ul>\n<h2>Why static mocks fall short<\/h2>\n<p>You might consider static mocks as an alternative. Static mocks work for straightforward, predictable scenarios, but they require constant maintenance as your APIs evolve. More importantly, they break down in the multi-turn, stateful workflows that real agents perform.<\/p>\n<p>Consider a flight booking agent. It searches for flights with one tool call, then checks booking status with another. The second response should depend on what the first call did. A hardcoded response can\u2019t reflect a database that changes state between calls. Static mocks can\u2019t capture this.<\/p>\n<h2>What makes ToolSimulator different<\/h2>\n<p>ToolSimulator solves these challenges with three essential capabilities that work together to give you safe, scalable agent testing without sacrificing realism.<\/p>\n<ul>\n<li><strong>Adaptive response generation. <\/strong>Tool outputs reflect what your agent actually requested, not a fixed template. When your agent searches for Seattle-to-New York flights, ToolSimulator returns plausible options with realistic prices and times, not a generic placeholder.<\/li>\n<li><strong>Stateful workflow support. <\/strong>Many real-world tools maintain state across calls. A write operation should affect subsequent reads. ToolSimulator maintains consistent shared state across tool calls, making it safe to test database interactions, booking workflows, and multi-step processes without touching production systems.<\/li>\n<li><strong>Schema enforcement. <\/strong>Developers typically add a post-processing layer that parses raw tool output into a structured format. When a tool returns a malformed response, this layer breaks. ToolSimulator validates responses against Pydantic schemas that you define, catching malformed responses before they reach your agent.<\/li>\n<\/ul>\n<h2>How ToolSimulator works<\/h2>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone wp-image-128784 size-full\" src=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2026\/04\/15\/ml-20730-image-1.png\" alt=\"ToolSimulator architecture diagram showing how tool calls are intercepted and routed to an LLM-based response generator\" width=\"1612\" height=\"1612\"\/>Figure 1: ToolSimulator (TS) intercepts tool calls and routes them to an LLM-based response generator<\/p>\n<p>ToolSimulator intercepts calls to your registered tools and routes them to an LLM-based response generator. The generator uses the tool schema, your agent\u2019s input, and the current simulation state to produce a realistic, context-appropriate response. No handwritten fixtures required.<\/p>\n<p>Your workflow follows three steps: decorate and register your tools, optionally steer the simulation with context, then let ToolSimulator mock the tool responses when your agent runs.<\/p>\n<p><img decoding=\"async\" loading=\"lazy\" class=\"alignnone wp-image-128786 size-full\" src=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2026\/04\/15\/ml-20730-image-2.png\" alt=\"Process flow diagram showing the three-step ToolSimulator workflow: Decorate &amp; Register, Steer, and Mock, illustrating how tools are registered, configured, and provided to agents for simulation.\" width=\"2014\" height=\"1323\"\/>Figure 2: The three-step ToolSimulator (TS) workflow \u2014 Decorate &amp; Register, Steer, Mock<\/p>\n<h2>Getting started with ToolSimulator<\/h2>\n<p>The following sections walk you through each step of the ToolSimulator workflow, from initial setup to running your first simulation.<\/p>\n<h3>Step 1: Decorate and register<\/h3>\n<p>Create a ToolSimulator instance, then wrap your tool function with the <code>@tool_simulator.tool()<\/code> decorator to register it for simulation. The real function body can stay empty. ToolSimulator intercepts calls before they reach the implementation:<\/p>\n<pre><code class=\"lang-python\">from strands_evals.simulation.tool_simulator import ToolSimulator\n\ntool_simulator = ToolSimulator()\n\n@tool_simulator.tool()\ndef search_flights(origin: str, destination: str, date: str) -&gt; dict:\n    \"\"\"Search for available flights between two airports on a given date.\"\"\"\n    pass  # The real implementation is never called during simulation<\/code><\/pre>\n<h3>Step 2: Steer (optional configuration)<\/h3>\n<p>By default, ToolSimulator automatically infers how each tool should behave from its schema and docstring. No additional configuration is required to get started. 
When you need more control, you can use these three optional parameters to customize simulation behavior:<\/p>\n<ul>\n<li><code>share_state_id<\/code>: Links tools that share the same backend under a common state key. State changes made by one tool (for example, a setter) are immediately visible to subsequent calls by another (for example, a getter).<\/li>\n<li><code>initial_state_description<\/code>: Seeds the simulation with a natural language description of pre-existing state. Richer context produces more realistic and consistent responses.<\/li>\n<li><code>output_schema<\/code>: A Pydantic model defining the expected response structure. ToolSimulator generates responses that conform strictly to this schema.<\/li>\n<\/ul>\n<h3>Step 3: Mock<\/h3>\n<p>When your agent calls a registered tool, the ToolSimulator wrapper intercepts the call and routes it to the dynamic response generator. The generator validates the agent\u2019s parameters against the tool schema, produces a response that matches the <code>output_schema<\/code>, and updates the state registry so subsequent tool calls see a consistent world.<\/p>\n<p><img decoding=\"async\" loading=\"lazy\" class=\"alignnone wp-image-128785 size-full\" src=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2026\/04\/15\/ml-20730-image-3.png\" alt=\"Process flow diagram showing four sequential steps of ToolSimulator: Agent Calls Tool, Validate Parameters, Generate Response, and Update State, with arrows connecting each step and returning to agent.\" width=\"2596\" height=\"1023\"\/>Figure 3: The ToolSimulator (TS) simulation flow when the agent calls a registered tool<\/p>\n<p>The following example simulates a flight search tool attached to a flight search assistant:<\/p>\n<pre><code class=\"lang-python\">from strands import Agent\nfrom strands_evals.simulation.tool_simulator import ToolSimulator\n\n# 1. Create a simulator instance\ntool_simulator = ToolSimulator()\n\n# 2. Register a tool for simulation with initial state context\n@tool_simulator.tool(\n    initial_state_description=\"Flight database: SEA-&gt;JFK flights available at 8am, 12pm, and 6pm. Prices range from $180 to $420.\",\n)\ndef search_flights(origin: str, destination: str, date: str) -&gt; dict:\n    \"\"\"Search for available flights between two airports on a given date.\"\"\"\n    pass\n\n# 3. Create an agent with the simulated tool and run it\nflight_tool = tool_simulator.get_tool(\"search_flights\")\nagent = Agent(\n    system_prompt=\"You are a flight search assistant.\",\n    tools=[flight_tool],\n)\n\nresponse = agent(\"Find me flights from Seattle to New York on March 15.\")\nprint(response)\n# Expected output: a structured list of simulated SEA-&gt;JFK flights with times\n# and prices consistent with the initial_state_description you provided.<\/code><\/pre>\n<h2>Advanced ToolSimulator usage<\/h2>\n<p>The following sections cover three advanced capabilities that give you more control over simulation behavior: running independent instances for parallel testing, configuring shared state for multi-turn workflows, and enforcing custom response schemas.<\/p>\n<h3>Run independent simulator instances<\/h3>\n<p>You can create multiple ToolSimulator instances side by side. 
Each instance maintains its own tool registry and state, so you can run parallel experiment configurations in the same codebase:<\/p>\n<pre><code class=\"lang-python\">simulator_a = ToolSimulator()\nsimulator_b = ToolSimulator()\n# Each instance has an independent tool registry and state --\n# ideal for comparing agent behavior across different tool setups.<\/code><\/pre>\n<h3>Configure shared state for multi-turn workflows<\/h3>\n<p>For stateful tools such as database getters and setters, ToolSimulator maintains consistent shared state across tool calls. Use <code>share_state_id<\/code> to link tools that operate on the same backend, and <code>initial_state_description<\/code> to seed the simulation with pre-existing context:<\/p>\n<pre><code class=\"lang-python\">@tool_simulator.tool(\n    share_state_id=\"flight_booking\",\n    initial_state_description=\"Flight booking system: SEA-&gt;JFK flights available at 8am, 12pm, and 6pm. No bookings currently active.\",\n)\ndef search_flights(origin: str, destination: str, date: str) -&gt; dict:\n    \"\"\"Search for available flights between two airports on a given date.\"\"\"\n    pass\n\n@tool_simulator.tool(\n    share_state_id=\"flight_booking\",\n)\ndef get_booking_status(booking_id: str) -&gt; dict:\n    \"\"\"Retrieve the current status of a flight booking by booking ID.\"\"\"\n    pass\n\n# Both tools share \"flight_booking\" state.\n# When search_flights is called, get_booking_status sees the same\n# flight availability data in subsequent calls.<\/code><\/pre>\n<p>Inspect the state before and after agent execution to validate that tool interactions produced the expected changes:<\/p>\n<pre><code class=\"lang-python\">initial_state = tool_simulator.get_state(\"flight_booking\")\n# ... run the agent ...\nfinal_state = tool_simulator.get_state(\"flight_booking\")\n# Verify not just the final output, but the full sequence of tool interactions.<\/code><\/pre>\n<table class=\"styled-table\" border=\"1px\" cellpadding=\"10px\">\n<tbody>\n<tr>\n<td style=\"padding: 10px;border: 1px solid #dddddd\">\n<p><strong>Tip:\u00a0<\/strong><strong>Seeding state from real data<\/strong><\/p>\n<p>Because <code>initial_state_description<\/code> accepts natural language, you can get creative with how you seed context. For tools that interact with tabular data, use a <code>DataFrame.describe()<\/code> call to generate statistical summaries and pass those statistics directly as the state description. ToolSimulator will generate responses that reflect realistic data distributions, without ever accessing the actual data.<\/p>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<h3>Enforce a custom response schema<\/h3>\n<p>By default, ToolSimulator infers a response structure from the tool\u2019s docstring and type hints. 
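<\/p>\n<p>As a concrete illustration of the seeding tip above, the following sketch derives an <code>initial_state_description<\/code> string from a <code>DataFrame.describe()<\/code> summary. The <code>orders<\/code> table and its columns are hypothetical, and pandas is assumed to be installed; only the resulting string would be passed to ToolSimulator:<\/p>

```python
import pandas as pd

# Hypothetical table backing a tool; in practice, load your real data.
orders = pd.DataFrame({
    "price": [19.99, 5.49, 102.00, 33.25],
    "quantity": [1, 3, 2, 1],
})

# describe() yields count, mean, std, min, quartiles, and max per numeric column.
summary = orders.describe().to_string()
state_description = (
    "Orders table with columns price and quantity. "
    "Summary statistics:\n" + summary
)
# state_description can now be passed as initial_state_description when
# registering the tool -- the raw rows never leave your machine.
```

<p>Seeding from a summary like this keeps simulated responses statistically plausible without exposing individual records.<\/p>\n<p>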
For tools that follow strict specifications such as OpenAPI or MCP schemas, define the expected response as a Pydantic model and pass it using <code>output_schema<\/code>:<\/p>\n<pre><code class=\"lang-python\">from pydantic import BaseModel, Field\n\nclass FlightSearchResponse(BaseModel):\n    flights: list[dict] = Field(..., description=\"List of available flights with flight number, departure time, and price\")\n    origin: str = Field(..., description=\"Origin airport code\")\n    destination: str = Field(..., description=\"Destination airport code\")\n    status: str = Field(default=\"success\", description=\"Search operation status\")\n    message: str = Field(default=\"\", description=\"Additional status message\")\n\n@tool_simulator.tool(output_schema=FlightSearchResponse)\ndef search_flights(origin: str, destination: str, date: str) -&gt; dict:\n    \"\"\"Search for available flights between two airports on a given date.\"\"\"\n    pass\n\n# ToolSimulator validates parameters strictly and returns only valid JSON\n# responses that conform to the FlightSearchResponse schema.<\/code><\/pre>\n<h2>Integration with Strands evaluation pipelines<\/h2>\n<p>ToolSimulator fits naturally into the Strands Evals evaluation framework. The following example shows a complete pipeline, from simulation setup to experiment report, using the <code>GoalSuccessRateEvaluator<\/code> to score agent performance on tool-calling tasks:<\/p>\n<pre><code class=\"lang-python\">from typing import Any\nfrom pydantic import BaseModel, Field\nfrom strands import Agent\nfrom strands_evals import Case, Experiment\nfrom strands_evals.evaluators import GoalSuccessRateEvaluator\nfrom strands_evals.simulation.tool_simulator import ToolSimulator\nfrom strands_evals.mappers import StrandsInMemorySessionMapper\nfrom strands_evals.telemetry import StrandsEvalsTelemetry\n\n# Set up telemetry and the tool simulator\ntelemetry = StrandsEvalsTelemetry().setup_in_memory_exporter()\nmemory_exporter = telemetry.in_memory_exporter\ntool_simulator = ToolSimulator()\n\n# Define the response schema\nclass FlightSearchResponse(BaseModel):\n    flights: list[dict] = Field(..., description=\"Available flights with number, departure time, and price\")\n    origin: str = Field(..., description=\"Origin airport code\")\n    destination: str = Field(..., description=\"Destination airport code\")\n    status: str = Field(default=\"success\", description=\"Search operation status\")\n    message: str = Field(default=\"\", description=\"Additional status message\")\n\n# Register tools for simulation\n@tool_simulator.tool(\n    share_state_id=\"flight_booking\",\n    initial_state_description=\"Flight booking system: SEA-&gt;JFK flights at 8am, 12pm, and 6pm. No bookings currently active.\",\n    output_schema=FlightSearchResponse,\n)\ndef search_flights(origin: str, destination: str, date: str) -&gt; dict[str, Any]:\n    \"\"\"Search for available flights between two airports on a given date.\"\"\"\n    pass\n\n@tool_simulator.tool(share_state_id=\"flight_booking\")\ndef get_booking_status(booking_id: str) -&gt; dict[str, Any]:\n    \"\"\"Retrieve the current status of a flight booking by booking ID.\"\"\"\n    pass\n\n# Define the evaluation task\ndef user_task_function(case: Case) -&gt; dict:\n    initial_state = tool_simulator.get_state(\"flight_booking\")\n    print(f\"[State before]: {initial_state.get('initial_state')}\")\n\n    search_tool = tool_simulator.get_tool(\"search_flights\")\n    status_tool = tool_simulator.get_tool(\"get_booking_status\")\n    agent = Agent(\n        trace_attributes={\"gen_ai.conversation.id\": case.session_id, \"session.id\": case.session_id},\n        system_prompt=\"You are a flight booking assistant.\",\n        tools=[search_tool, status_tool],\n        callback_handler=None,\n    )\n\n    agent_response = agent(case.input)\n    print(f\"[User]: {case.input}\")\n    print(f\"[Agent]: {agent_response}\")\n\n    final_state = tool_simulator.get_state(\"flight_booking\")\n    print(f\"[State after]: {final_state.get('previous_calls', [])}\")\n\n    finished_spans = memory_exporter.get_finished_spans()\n    mapper = StrandsInMemorySessionMapper()\n    session = mapper.map_to_session(finished_spans, session_id=case.session_id)\n    return {\"output\": str(agent_response), \"trajectory\": session}\n\n# Define test cases, run the experiment, and display the report\ntest_cases = [\n    Case(\n        name=\"flight_search\",\n        input=\"Find me flights from Seattle to New York on March 15.\",\n        metadata={\"category\": \"flight_booking\"},\n    ),\n]\nexperiment = Experiment[str, str](\n    cases=test_cases,\n    evaluators=[GoalSuccessRateEvaluator()],\n)\n\nreports = experiment.run_evaluations(user_task_function)\nreports[0].run_display()<\/code><\/pre>\n<p>The task function retrieves the simulated tools, creates an agent, runs the interaction, and returns both the agent\u2019s output and the full telemetry trajectory. The trajectory gives evaluators like <code>GoalSuccessRateEvaluator<\/code> access to the complete sequence of tool calls and model invocations, not just the final response.<\/p>\n<h2>Best practices for simulation-based evaluation<\/h2>\n<p>The following practices help you get the most out of ToolSimulator across development and evaluation workflows:<\/p>\n<ol>\n<li><strong>Start with the default configuration for broad coverage.<\/strong> Add configuration overrides only for the specific tool environments that you want to control precisely. ToolSimulator\u2019s defaults are designed to produce realistic behavior without requiring setup.<\/li>\n<li><strong>Provide rich <\/strong><code>initial_state_description<\/code><strong> values for stateful tools.<\/strong> The more context that you seed, the more realistic and consistent the simulated responses will be. Include data ranges, entity counts, and relationship context.<\/li>\n<li><strong>Use <\/strong><code>share_state_id<\/code><strong> for tools that interact with the same backend,<\/strong> so write operations are visible to subsequent reads. This is essential for testing multi-turn workflows like booking, cart management, or database updates.<\/li>\n<li><strong>Apply <\/strong><code>output_schema<\/code><strong> for tools that follow strict specifications,<\/strong> such as OpenAPI or MCP schemas. 
Schema enforcement catches malformed responses before they reach your agent and break your post-processing layer.<\/li>\n<li><strong>Validate tool interaction sequences, not just final outputs.<\/strong> Inspect state changes before and after agent execution to confirm that tool calls occurred in the right order and produced the right state transitions.<\/li>\n<li><strong>Start small and expand.<\/strong> Begin with your most common tool interaction scenarios, then expand to edge cases as your evaluation practice matures. Supplement simulation-based testing with targeted live API tests for critical production paths.<\/li>\n<\/ol>\n<h2>Conclusion<\/h2>\n<p>ToolSimulator transforms how you test AI agents by replacing risky live API calls with intelligent, adaptive simulations. You can now safely validate complex, stateful workflows at scale, catching integration bugs early and shipping production-ready agents with confidence. Combining ToolSimulator with Strands Evals evaluation pipelines gives you full visibility into agent behavior without managing test infrastructure or risking real-world side effects.<\/p>\n<h3>Next steps<\/h3>\n<p>Start testing your AI agents safely today. Install ToolSimulator with the following command:<\/p>\n<pre><code class=\"lang-bash\">pip install strands-evals<\/code><\/pre>\n<p>To continue exploring ToolSimulator and Strands Evals, take these next steps:<\/p>\n<ul>\n<li>Read the <a href=\"https:\/\/github.com\/strands-agents\/evals\" target=\"_blank\" rel=\"nofollow noopener noreferrer\">Strands Evals documentation<\/a> to explore all configuration options, including advanced state management and custom evaluators.<\/li>\n<li>Try the <a href=\"https:\/\/github.com\/strands-agents\/docs\/blob\/main\/docs\/examples\/evals-sdk\/tool_simulator.py\" target=\"_blank\" rel=\"nofollow noopener noreferrer\">example<\/a> to see ToolSimulator in action. Extend the example by adding more tools and testing multi-step agent workflows.<\/li>\n<li>Explore <a href=\"https:\/\/aws.amazon.com\/bedrock\/\" target=\"_blank\" rel=\"nofollow noopener noreferrer\">Amazon Bedrock<\/a> for the LLM backend options that power ToolSimulator\u2019s response generation.<\/li>\n<li>Learn about <a href=\"https:\/\/aws.amazon.com\/lambda\/\" target=\"_blank\" rel=\"nofollow noopener noreferrer\">AWS Lambda<\/a> for serverless agent deployment strategies that pair well with ToolSimulator-based testing.<\/li>\n<li>Join the Strands community forums to ask questions, share your evaluation setups, and connect with other agent developers.<\/li>\n<\/ul>\n<table class=\"styled-table\" border=\"1px\" cellpadding=\"10px\">\n<tbody>\n<tr>\n<td style=\"padding: 10px;border: 1px solid #dddddd\"><strong>Share your feedback. <\/strong>We\u2019d love to hear how you\u2019re using ToolSimulator. Share your feedback, report issues, and suggest features through the Strands Evals GitHub repository or community forums.<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<hr\/>\n<h2>About The Authors<\/h2>\n<footer>\n<div class=\"blog-author-box\">\n<div class=\"blog-author-image\">\n          <img decoding=\"async\" loading=\"lazy\" class=\"alignnone size-full wp-image-128270\" src=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2026\/04\/09\/1517408072307-copy.png\" alt=\"Darren Wang\" width=\"100\" height=\"100\"\/>\n         <\/div>\n<h3 class=\"lb-h4\">Darren Wang<\/h3>\n<p>Darren Wang is a Research Engineer at Amazon Web Services, where he bridges cutting-edge AI research and production systems. With a Ph.D. background in speech recognition and 5 years of experience in email anti-spam engineering, Darren transforms early-stage machine learning research into scalable, production-ready solutions that deliver measurable customer impact. Specializing in agent simulation and evaluation frameworks, he empowers developers to build more reliable, testable AI agents through robust testing infrastructure. Outside of work, he enjoys bouldering, playing violin, and anything related to cats.<\/p>\n<\/p><\/div>\n<div class=\"blog-author-box\">\n<div class=\"blog-author-image\">\n          <img decoding=\"async\" loading=\"lazy\" class=\"alignnone size-full wp-image-128214\" src=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2026\/04\/09\/image-4-1-1.png\" alt=\"Xuan Qi\" width=\"100\" height=\"100\"\/>\n         <\/div>\n<h3 class=\"lb-h4\">Xuan Qi<\/h3>\n<p>Xuan Qi is an Applied Scientist at Amazon Web Services, where she applies her background in physics to tackle complex challenges in machine learning and artificial intelligence. 
Specializing in ML modeling and simulation, Xuan is passionate about translating scientific principles into practical applications that drive meaningful technological advancements. Her work focuses on creating more intuitive and efficient AI systems that can better understand and interact with the world. Outside of her professional pursuits, Xuan finds balance and creativity through dancing and playing the violin, bringing the precision and harmony of these arts into her scientific endeavors.<\/p>\n<\/p><\/div>\n<div class=\"blog-author-box\">\n<div class=\"blog-author-image\">\n          <img decoding=\"async\" loading=\"lazy\" class=\"alignnone size-full wp-image-128283\" src=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2026\/04\/09\/smeetd-1.jpg\" alt=\"Smeet Dhakecha\" width=\"100\" height=\"133\"\/>\n         <\/div>\n<h3 class=\"lb-h4\">Smeet Dhakecha<\/h3>\n<p>Smeet Dhakecha is a Research Engineer at Amazon Web Services, working within the Agentic AI Science team. His work spans agent simulation and evaluation systems, as well as the design and deployment of data transformation pipelines that support fast-moving scientific research for model post-training and RL training.<\/p>\n<\/p><\/div>\n<div class=\"blog-author-box\">\n<div class=\"blog-author-image\">\n          <img decoding=\"async\" loading=\"lazy\" class=\"alignnone size-full wp-image-128242\" src=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2026\/04\/09\/varannil-2-1.jpg\" alt=\"Vinayak Arannil\" width=\"100\" height=\"133\"\/>\n         <\/div>\n<h3 class=\"lb-h4\">Vinayak Arannil<\/h3>\n<p>Vinayak is a Sr. Applied Scientist at Amazon Web Services. With several years of experience, he has worked across various domains of AI, such as computer vision, natural language processing, and recommendation systems. Currently, Vinayak helps build new capabilities in AgentCore and Strands, enabling customers to evaluate their agentic applications with ease, accuracy, and efficiency.<\/p>\n<\/p><\/div>\n<\/footer>\n<p>       \n      <\/div>\n\n","protected":false},"excerpt":{"rendered":"<p>You can use ToolSimulator, an\u00a0LLM-powered tool simulation framework inside Strands Evals, to thoroughly and safely test AI agents that rely on external tools, at scale. Instead of risking live API calls that expose personally identifiable information (PII), trigger unintended actions, or settling for static mocks that break with multi-turn workflows, you can [&hellip;]<\/p>\n","protected":false},"author":2,"featured_media":13970,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[55],"tags":[617,739,508,509,8748],"class_list":["post-13968","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-machine-learning","tag-agents","tag-scalable","tag-testing","tag-tool","tag-toolsimulator"],"_links":{"self":[{"href":"https:\/\/techtrendfeed.com\/index.php?rest_route=\/wp\/v2\/posts\/13968","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/techtrendfeed.com\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/techtrendfeed.com\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/techtrendfeed.com\/index.php?rest_route=\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/techtrendfeed.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=13968"}],"version-history":[{"count":1,"href":"https:\/\/techtrendfeed.com\/index.php?rest_route=\/wp\/v2\/posts\/13968\/revisions"}],"predecessor-version":[{"id":13969,"href":"https:\/\/techtrendfeed.com\/index.php?rest_route=\/wp\/v2\/posts\/13968\/revisions\/13969"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/techtrendfeed.co
m\/index.php?rest_route=\/wp\/v2\/media\/13970"}],"wp:attachment":[{"href":"https:\/\/techtrendfeed.com\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=13968"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/techtrendfeed.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=13968"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/techtrendfeed.com\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=13968"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}