• About Us
  • Privacy Policy
  • Disclaimer
  • Contact Us
TechTrendFeed
  • Home
  • Tech News
  • Cybersecurity
  • Software
  • Gaming
  • Machine Learning
  • Smart Home & IoT
No Result
View All Result
  • Home
  • Tech News
  • Cybersecurity
  • Software
  • Gaming
  • Machine Learning
  • Smart Home & IoT
No Result
View All Result
TechTrendFeed
No Result
View All Result

Construct dependable Agentic AI resolution with Amazon Bedrock: Study from Pushpay’s journey on GenAI analysis

Admin by Admin
January 29, 2026
Home Machine Learning
Share on FacebookShare on Twitter


This submit was co-written with Saurabh Gupta and Todd Colby from Pushpay.

Pushpay is a market-leading digital giving and engagement platform designed to assist church buildings and faith-based organizations drive neighborhood engagement, handle donations, and strengthen generosity fundraising processes effectively. Pushpay’s church administration system supplies church directors and ministry leaders with insight-driven reporting, donor growth dashboards, and automation of economic workflows.

Utilizing the ability of generative AI, Pushpay developed an progressive agentic AI search characteristic constructed for the distinctive wants of ministries. The strategy makes use of pure language processing so ministry workers can ask questions in plain English and generate real-time, actionable insights from their neighborhood information. The AI search characteristic addresses a vital problem confronted by ministry leaders: the necessity for fast entry to neighborhood insights with out requiring technical experience. For instance, ministry leaders can enter “present me people who find themselves members in a gaggle, however haven’t given this 12 months” or “present me people who find themselves not engaged in my church,” and use the outcomes to take significant motion to higher assist people of their neighborhood. Most neighborhood leaders are time-constrained and lack technical backgrounds; they’ll use this resolution to acquire significant information about their congregations in seconds utilizing pure language queries.

By empowering ministry workers with sooner entry to neighborhood insights, the AI search characteristic helps Pushpay’s mission to encourage generosity and connection between church buildings and their neighborhood members. Early adoption customers report that this resolution has shortened their time to insights from minutes to seconds. To realize this end result, the Pushpay group constructed the characteristic utilizing agentic AI capabilities on Amazon Net Companies (AWS) whereas implementing sturdy high quality assurance measures and establishing a speedy iterative suggestions loop for steady enhancements.

On this submit, we stroll you thru Pushpay’s journey in constructing this resolution and discover how Pushpay used Amazon Bedrock to create a customized generative AI analysis framework for steady high quality assurance and establishing speedy iteration suggestions loops on AWS.

Resolution overview: AI powered search structure

The answer consists of a number of key elements that work collectively to ship an enhanced search expertise. The next determine exhibits the answer structure diagram and the general workflow.

Figure 1: AI Search Solution Architecture

Determine 1: AI Search Resolution Structure

  • Person interface layer: The answer begins with Pushpay customers submitting pure language queries by way of the prevailing Pushpay software interface. Through the use of pure language queries, church ministry workers can get hold of information insights utilizing AI capabilities with out studying new instruments or interfaces.
  • AI search agent: On the coronary heart of the system lies the AI search agent, which consists of two key elements:
    • System immediate: Accommodates the big language mannequin (LLM) position definitions, directions, and software descriptions that information the agent’s conduct.
    • Dynamic immediate constructor (DPC): routinely constructs further personalized system prompts primarily based on the person particular data, resembling church context, pattern queries, and software filter stock. Additionally they use semantic search to pick out solely related filters amongst a whole bunch of obtainable software filters. The DPC improves response accuracy and person expertise.
  • Amazon Bedrock superior characteristic: The answer makes use of the next Amazon Bedrock managed companies:
    • Immediate caching: Reduces latency and prices by caching steadily used system immediate.
    • LLM processing: Makes use of Claude Sonnet 4.5 to course of prompts and generate JSON output required by the applying to show the specified question outcomes as insights to customers.
  • Analysis system: The analysis system implements a closed-loop enchancment resolution the place person interactions are instrumented, captured and evaluated offline. The analysis outcomes feed right into a dashboard for product and engineering groups to investigate and drive iterative enhancements to the AI search agent. Throughout this course of, the info science group collects a golden dataset and constantly curates this dataset primarily based on the precise person queries coupled with validated responses.

The challenges of preliminary resolution with out analysis

To create the AI search characteristic, Pushpay developed the primary iteration of the AI search agent. The answer implements a single agent configured with a fastidiously tuned system immediate that features the system position, directions, and the way the person interface works with detailed clarification of every filter device and their sub-settings. The system immediate is cached utilizing Amazon Bedrock immediate caching to cut back token value and latency. The agent makes use of the system immediate to invoke an Amazon Bedrock LLM which generates the JSON doc that Pushpay’s software makes use of to use filters and current question outcomes to customers.

Nevertheless, this primary iteration rapidly revealed some limitations. Whereas it demonstrated a 60-70% success fee with fundamental enterprise queries, the group reached an accuracy plateau. The analysis of the agent was a handbook and tedious course of Tuning the system immediate past this accuracy threshold proved difficult given the various spectrum of person queries and the applying’s protection of over 100 distinct configurable filters. These offered vital blockers for the group’s path to manufacturing.

Figure 2: AI Search First Solution

Determine 2: AI Search First Resolution

Enhancing the answer by including a customized generative AI analysis framework

To deal with the challenges of measuring and bettering agent accuracy, the group carried out a generative AI analysis framework built-in into the prevailing structure, proven within the following determine. This framework consists of 4 key elements that work collectively to offer complete efficiency insights and allow data-driven enhancements.

Figure 3: Introducing the GenAI Evaluation Framework

Determine 3: Introducing the GenAI Analysis Framework

  1. The golden dataset: A curated golden dataset containing over 300 consultant queries, every paired with its corresponding anticipated output, varieties the muse of automated analysis. The product and information science groups fastidiously developed and validated this dataset to attain complete protection of real-world use circumstances and edge circumstances. Moreover, there’s a steady curation strategy of including consultant precise person queries with validated outcomes.
  2. The evaluator: The evaluator element processes person enter queries and compares the agent-generated output in opposition to the golden dataset utilizing the LLM as a decide sample This strategy generates core accuracy metrics whereas capturing detailed logs and efficiency information, resembling latency, for additional evaluation and debugging.
  3. Area class: Area classes are developed utilizing a mix of generative AI area summarization and human-defined common expressions to successfully categorize person queries. The evaluator determines the area class for every question, enabling nuanced, category-based analysis as an extra dimension of analysis metrics.
  4. Generative AI analysis dashboard: The dashboard serves because the mission management for Pushpay’s product and engineering groups, displaying area category-level metrics to evaluate efficiency and latency and information choices. It shifts the group from single combination scores to nuanced, domain-based efficiency insights.

The accuracy dashboard: Pinpointing weaknesses by area

As a result of person queries are categorized into area classes, the dashboard incorporates statistical confidence visualization utilizing a 95% Wilson rating interval to show accuracy metrics and question volumes at every area degree. Through the use of classes, the group can pinpoint the AI agent’s weaknesses by area. Within the following instance , the “exercise” area exhibits considerably decrease accuracy than different classes.

Figure 4: Pinpointing Agent Weaknesses by Domain

Determine 4: Pinpointing Agent Weaknesses by Area

Moreover, a efficiency dashboard, proven within the following determine, visualizes latency indicators on the area class degree, together with latency distributions from p50 to p90 percentiles. Within the following instance, the exercise area reveals notably greater latency than others.

Identifying Latency Bottlenecks by Domain

Determine 5: Figuring out Latency Bottlenecks by Area

Strategic rollout by way of domain-Stage insights

Area-based metrics revealed various efficiency ranges throughout semantic domains, offering essential insights into agent effectiveness. Pushpay used this granular visibility to make strategic characteristic rollout choices. By briefly suppressing underperforming classes—resembling exercise queries—whereas present process optimization, the system achieved 95% total accuracy. Through the use of this strategy, customers skilled solely the highest-performing options whereas the group refined others to manufacturing requirements.

Determine 6: Attaining 95% Accuracy with Area-Stage Function Rollout

Strategic prioritization: Specializing in high-impact domains

To prioritize enhancements systematically, Pushpay employed a 2×2 matrix framework plotting matters in opposition to two dimensions (proven within the following determine): Enterprise precedence (vertical axis) and present efficiency or feasibility (horizontal axis). This visualization positioned matters with each excessive enterprise worth and powerful present efficiency within the top-right quadrant. The group then targeted on these areas as a result of they required much less heavy lifting to attain additional accuracy enchancment from already-good ranges to an distinctive 95% accuracy for the enterprise targeted matters.

The implementation adopted an iterative cycle: after every spherical of enhancements, they re-analyze the outcomes to determine the following set of high-potential matters. This systematic, cyclical strategy enabled steady optimization whereas sustaining deal with business-critical areas.

Figure 7: Strategic Prioritization Framework for Domain Category Optimization

Determine 7: Strategic Prioritization Framework for Area Class Optimization

Dynamic immediate development

The insights gained from the analysis framework led to an architectural enhancement: the introduction of a dynamic immediate constructor. This element enabled speedy iterative enhancements by permitting fine-grained management over which area classes the agent might deal with. The structured area stock – beforehand embedded within the system immediate – was remodeled right into a dynamic aspect, utilizing semantic search to assemble contextually related prompts for every person question. This strategy tailors the immediate filter stock primarily based on three key contextual dimensions: question content material, person persona, and tenant-specific necessities. The result’s a extra exact and environment friendly system that generates extremely related responses whereas sustaining the flexibleness wanted for steady optimization.

Enterprise influence

The generative AI analysis framework grew to become the cornerstone of Pushpay’s AI characteristic growth, delivering measurable worth throughout three dimensions:

  • Person expertise: The AI search characteristic diminished time-to-insight from roughly 120 seconds (skilled customers manually navigating advanced UX) to below 4 seconds – a 15-fold acceleration that straight helps improve ministry leaders’ productiveness and decision-making velocity. This characteristic democratized information insights, in order that customers of various technical ranges can entry significant intelligence with out requiring specialised experience.
  • Growth velocity: The scientific analysis strategy remodeled optimization cycles. Somewhat than debating immediate modifications, the group now validates adjustments and measures domain-specific impacts inside minutes, changing extended deliberations with data-driven iteration.
  • Manufacturing readiness: Enhancements from 60–70% accuracy to greater than 95% accuracy utilizing high-performance domains supplied the quantitative confidence required for customer-facing deployment, whereas the framework’s structure allows steady refinement throughout different area classes.

Key takeaways to your AI agent journey

The next are key takeaways from Pushpay’s expertise that you need to use in your individual AI agent journey.

1/ Construct with manufacturing in thoughts from day one

Constructing agentic AI programs is easy, however scaling them to manufacturing is difficult. Builders ought to undertake a scaling mindset through the proof-of-concept part, not after. Implementing sturdy tracing and analysis frameworks early, supplies a transparent pathway from experimentation to manufacturing. Through the use of this methodology, groups can determine and deal with accuracy points systematically earlier than they change into blockers.

2/ Make the most of the superior options of Amazon Bedrock

Amazon Bedrock immediate caching considerably reduces token prices and latency by caching steadily used system prompts. For brokers with giant, steady system prompts, this characteristic is important for production-grade efficiency.

3/ Assume past combination metrics

Mixture accuracy scores can generally masks vital efficiency variations. By evaluating agent efficiency on the area class degree, Pushpay uncovered weaknesses past what a single accuracy metric can seize. This granular strategy allows focused optimization and knowledgeable rollout choices, ensuring customers solely expertise high-performing options whereas others are refined.

4/ Knowledge safety and accountable AI

When creating agentic AI programs, contemplate data safety and LLM safety concerns from the outset, following the AWS Shared Duty Mannequin, as a result of safety necessities basically influence the architectural design. Pushpay’s prospects are church buildings and faith-based organizations who’re stewards of delicate data—together with pastoral care conversations, monetary giving patterns, household struggles, prayer requests and extra. On this implementation instance, Pushpay set a transparent strategy to incorporating AI ethically inside its product ecosystem, sustaining strict safety requirements to make sure church information and personally identifiable data (PII) stays inside its safe partnership ecosystem. Knowledge is shared solely with safe and acceptable information protections utilized and isn’t used to coach exterior fashions. To study extra about Pushpay’s requirements for incorporating AI inside their merchandise, go to the Pushpay Information Middle for a extra in-depth assessment of firm requirements.

Conclusion: Your Path to Manufacturing-Prepared AI Brokers

Pushpay’s journey from a 60–70% accuracy prototype to a 95% correct production-ready AI agent demonstrates that constructing dependable agentic AI programs requires extra than simply refined prompts—it calls for a scientific, data-driven strategy to analysis and optimization. The important thing breakthrough wasn’t within the AI expertise itself, however in implementing a complete analysis framework constructed on sturdy observability basis that supplied granular visibility into agent efficiency throughout completely different domains. This systematic strategy enabled speedy iteration, strategic rollout choices, and steady enchancment.

Able to construct your individual production-ready AI agent?

  • Discover Amazon Bedrock: Start constructing your agent with Amazon Bedrock
  • Implement LLM-as-a-judge: Create your individual analysis system utilizing the patterns described on this LLM-as-a-judge on Amazon Bedrock Mannequin Analysis
  • Construct your golden dataset: Begin curating consultant queries and anticipated outputs to your particular use case

In regards to the authors

Roger Wang is a Senior Resolution Architect at AWS. He’s a seasoned architect with over 20 years of expertise within the software program business. He helps New Zealand and international software program and SaaS corporations use cutting-edge expertise at AWS to unravel advanced enterprise challenges. Roger is captivated with bridging the hole between enterprise drivers and technological capabilities and thrives on facilitating conversations that drive impactful outcomes.

Melanie LiMelanie Li, PhD, is a Senior Generative AI Specialist Options Architect at AWS primarily based in Sydney, Australia, the place her focus is on working with prospects to construct options leveraging state-of-the-art AI and machine studying instruments. She has been actively concerned in a number of Generative AI initiatives throughout APJ, harnessing the ability of Giant Language Fashions (LLMs). Previous to becoming a member of AWS, Dr. Li held information science roles within the monetary and retail industries.

Frank Huang, PhD, is a Senior Analytics Specialist Options Architect at AWS primarily based in Auckland, New Zealand. He focuses on serving to prospects ship superior analytics and AI/ML options. All through his profession, Frank has labored throughout a wide range of industries resembling monetary companies, Web3, hospitality, media and leisure, and telecommunications. Frank is keen to make use of his deep experience in cloud structure, AIOps, and end-to-end resolution supply to assist prospects obtain tangible enterprise outcomes with the ability of knowledge and AI.

Saurabh Gupta is a knowledge science and AI skilled at Pushpay primarily based in Auckland, New Zealand, the place he focuses on implementing sensible AI options and statistical modeling. He has intensive expertise in machine studying, information science, and Python for information science purposes, with specialised expertise coaching in database brokers and AI implementation. Previous to his present position, he gained expertise in telecom, retail and monetary companies, creating experience in advertising and marketing analytics and buyer retention packages. He has a Grasp’s in Statistics from College of Auckland and a Grasp’s in Enterprise Administration from the Indian Institute of Administration, Calcutta.

Todd Colby is a Senior Software program Engineer at Pushpay primarily based in Seattle. His experience is concentrated on evolving advanced legacy purposes with AI, and translating person wants into structured, high-accuracy options. He leverages AI to extend supply velocity and produce leading edge metrics and enterprise resolution instruments.

Tags: AgenticAmazonBedrockBuildEvaluationGenAIJourneyLearnPushpaysReliableSolution
Admin

Admin

Next Post
LLMs Hijacked, Monetized in ‘Operation Weird Bazaar’

LLMs Hijacked, Monetized in 'Operation Weird Bazaar'

Leave a Reply Cancel reply

Your email address will not be published. Required fields are marked *

Trending.

Reconeyez Launches New Web site | SDM Journal

Reconeyez Launches New Web site | SDM Journal

May 15, 2025
Safety Amplified: Audio’s Affect Speaks Volumes About Preventive Safety

Safety Amplified: Audio’s Affect Speaks Volumes About Preventive Safety

May 18, 2025
Flip Your Toilet Right into a Good Oasis

Flip Your Toilet Right into a Good Oasis

May 15, 2025
Apollo joins the Works With House Assistant Program

Apollo joins the Works With House Assistant Program

May 17, 2025
Discover Vibrant Spring 2025 Kitchen Decor Colours and Equipment – Chefio

Discover Vibrant Spring 2025 Kitchen Decor Colours and Equipment – Chefio

May 17, 2025

TechTrendFeed

Welcome to TechTrendFeed, your go-to source for the latest news and insights from the world of technology. Our mission is to bring you the most relevant and up-to-date information on everything tech-related, from machine learning and artificial intelligence to cybersecurity, gaming, and the exciting world of smart home technology and IoT.

Categories

  • Cybersecurity
  • Gaming
  • Machine Learning
  • Smart Home & IoT
  • Software
  • Tech News

Recent News

Overwatch 2 Is Ditching the ‘2’ Amid Launch of ‘New, Story-Pushed Period’ With 10 New Heroes

Overwatch 2 Is Ditching the ‘2’ Amid Launch of ‘New, Story-Pushed Period’ With 10 New Heroes

February 5, 2026
Forescout menace roundup – IT Safety Guru

Forescout menace roundup – IT Safety Guru

February 5, 2026
  • About Us
  • Privacy Policy
  • Disclaimer
  • Contact Us

© 2025 https://techtrendfeed.com/ - All Rights Reserved

No Result
View All Result
  • Home
  • Tech News
  • Cybersecurity
  • Software
  • Gaming
  • Machine Learning
  • Smart Home & IoT

© 2025 https://techtrendfeed.com/ - All Rights Reserved