Cluely says its ARR hit $7M after signing a public company; a startup called Pickle says it built Glass, an open source, free product similar to Cluely (Marina Temkin/TechCrunch)

Featured Podcasts


Hard Fork:


Hard Fork Live, Part 2: Patrick Collison of Stripe + Kathryn Zealand of Skip + Listener Questions

The future is already here. Each week, journalists Kevin Roose and Casey Newton explore and make sense of the latest in the rapidly changing world of tech.


Subscribe to Hard Fork.


Lenny’s Podcast:


I’ve run 75+ businesses. Here’s why you’re probably chasing the wrong idea. | Andrew Wilkinson (co-founder of Tiny)

Interviews with world-class product leaders and growth experts to uncover actionable advice to help you build, launch, and grow your own product.


Subscribe to Lenny’s Podcast.


Great Chat:


AI is killing the summer slowdown

A podcast mostly about tech. Brought to you weekly by Angela Du, Sally Shin, Mac Bohannon, Helen Min, and Ashley Mayer.


Subscribe to Great Chat.


Techmeme Ride Home:


In Meta v. OpenAI, Who’s Desperate And Who’s Scared?

The day’s tech news, daily at 5pm ET. Fifteen minutes and you’re up to date.


Subscribe to Techmeme Ride Home.


Channels with Peter Kafka:


Black Mirror’s Charlie Brooker on the problem with tech – and people

Media and tech aren’t just intersecting – they’re fully intertwined. To understand how these worlds work, Peter Kafka talks to industry leaders, upstarts, and observers.


Subscribe to Channels with Peter Kafka.


Spotlight On:


Freshworks CEO Dennis Woodside on translating bold visions into operational excellence

A podcast from Accel about how companies are built, from the people doing the building.


Subscribe to Spotlight On.

Transforming network operations with AI: How Swisscom built a network assistant using Amazon Bedrock

In the telecommunications industry, managing complex network infrastructures requires processing vast amounts of data from multiple sources. Network engineers often spend considerable time manually gathering and analyzing this data, taking away valuable hours that could be spent on strategic initiatives. This challenge led Swisscom, Switzerland’s leading telecommunications provider, to explore how AI can transform their network operations.

Swisscom’s Network Assistant, built on Amazon Bedrock, represents a significant step forward in automating network operations. The solution combines generative AI capabilities with a sophisticated data processing pipeline to help engineers quickly access and analyze network data. Swisscom used AWS services to create a scalable solution that reduces manual effort and provides accurate, timely network insights.

In this post, we explore how Swisscom developed their Network Assistant. We discuss the initial challenges and how they implemented a solution that delivers measurable benefits. We examine the technical architecture, discuss key learnings, and look at future enhancements that can further transform network operations. We also highlight best practices for handling sensitive data so Swisscom can comply with the strict regulations governing the telecommunications industry. This post gives telecommunications providers and other organizations managing complex infrastructure valuable insights into how you can use AWS services to modernize operations through AI-powered automation.

The opportunity: Improve network operations

Network engineers at Swisscom faced the daily challenge of managing complex network operations while maintaining optimal performance and compliance. These skilled professionals were tasked with monitoring and analyzing vast amounts of data from multiple, decoupled sources. The work was repetitive and demanded considerable time and attention to detail; in some scenarios, the assigned tasks consumed more than 10% of an engineer’s capacity. The manual nature of the work presented several critical pain points. Consolidating data from multiple network entities into a coherent overview was particularly challenging: engineers had to navigate various tools and systems to retrieve telemetry information about data sources and network parameters from extensive documentation, verify KPIs through complex calculations, and identify potential issues of varying nature. This fragmented approach consumed valuable time and introduced the risk of human error in data interpretation and analysis. The situation called for a solution that addressed three main concerns:

  • Efficiency in data retrieval and analysis
  • Accuracy in calculations and reporting
  • Scalability to accommodate growing data sources and use cases

The team required a streamlined approach to access and analyze network data, maintain compliance with defined metrics and thresholds, and deliver fast, accurate responses to events, all while upholding the highest standards of data security and sovereignty.

Solution overview

Swisscom’s approach to developing the Network Assistant was methodical and iterative. The team chose Amazon Bedrock as the foundation for their generative AI application and implemented a Retrieval Augmented Generation (RAG) architecture using Amazon Bedrock Knowledge Bases to enable precise and contextual responses to engineer queries. The RAG approach runs in three distinct phases (a minimal query sketch follows the list):

  • Retrieval – User queries are matched with relevant knowledge base content through embedding models
  • Augmentation – The context is enriched with the retrieved information
  • Generation – The large language model (LLM) produces informed responses
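
Amazon Bedrock Knowledge Bases exposes these three phases through a single API call. The following minimal sketch shows the flow using boto3; the knowledge base ID, model ARN, Region, and example query are placeholder assumptions, not Swisscom’s actual configuration.

```python
import boto3

# Runtime client for Amazon Bedrock Knowledge Bases
client = boto3.client("bedrock-agent-runtime", region_name="eu-central-1")

# One call covers all three phases: the query is embedded and matched
# against the knowledge base (retrieval), the hits enrich the prompt
# (augmentation), and the LLM writes the answer (generation).
response = client.retrieve_and_generate(
    input={"text": "Which telemetry source reports cell availability KPIs?"},
    retrieveAndGenerateConfiguration={
        "type": "KNOWLEDGE_BASE",
        "knowledgeBaseConfiguration": {
            "knowledgeBaseId": "EXAMPLEKBID",  # hypothetical knowledge base ID
            "modelArn": "arn:aws:bedrock:eu-central-1::foundation-model/anthropic.claude-3-sonnet-20240229-v1:0",
        },
    },
)

print(response["output"]["text"])
# Citations link each answer span back to the retrieved chunks
for citation in response.get("citations", []):
    for ref in citation.get("retrievedReferences", []):
        print(ref.get("location"))
```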

The following diagram illustrates the solution architecture.

Network Assistant Architecture

The solution architecture evolved through several iterations. The initial implementation established basic RAG functionality by feeding the Amazon Bedrock knowledge base with tabular data and documentation. However, the Network Assistant struggled to handle large input files containing thousands of rows of numerical values across multiple parameter columns. This complexity highlighted the need for a more selective approach that could identify only the rows relevant to a specific KPI calculation. At that point, the retrieval process wasn’t returning the precise set of vector embeddings required to compute the formulas, prompting the team to refine the solution for greater accuracy.

Subsequent iterations enhanced the assistant with agent-based processing and action groups. The team implemented AWS Lambda functions using Pandas or Spark for data processing, enabling accurate numerical calculations driven by the user’s natural language prompt.
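
As an illustration, such an action-group Lambda function with Pandas might look like the sketch below. The event and response shapes follow the Bedrock Agents function-calling contract, while the S3 path, column names, and availability formula are hypothetical stand-ins for Swisscom’s actual schema (reading s3:// paths with Pandas also assumes s3fs is installed).

```python
import json
import pandas as pd

def lambda_handler(event, context):
    # Parameters extracted by the agent from the user's natural language prompt
    params = {p["name"]: p["value"] for p in event.get("parameters", [])}
    cell_id = params.get("cell_id")

    # Daily telemetry batch in the data lake (hypothetical path and schema)
    df = pd.read_csv("s3://example-telemetry-bucket/daily/telemetry.csv")
    rows = df[df["cell_id"] == cell_id]

    # Hypothetical KPI: availability = uptime / (uptime + downtime)
    uptime, downtime = rows["uptime_s"].sum(), rows["downtime_s"].sum()
    availability = uptime / (uptime + downtime) if (uptime + downtime) else None

    # Response shape expected by Bedrock Agents for function-based action groups
    return {
        "messageVersion": "1.0",
        "response": {
            "actionGroup": event.get("actionGroup"),
            "function": event.get("function"),
            "functionResponse": {
                "responseBody": {
                    "TEXT": {"body": json.dumps({"availability": availability})}
                }
            },
        },
    }
```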

A significant advancement came with the implementation of a multi-agent approach, using Amazon Bedrock Agents, where specialized agents handle different aspects of the system (a conceptual routing sketch follows the list):

  • Supervisor agent – Orchestrates interactions between the documentation management and calculator agents to provide comprehensive and accurate responses.
  • Documentation management agent – Helps network engineers efficiently access information in large volumes of data and extract insights about data sources, network parameters, configuration, or tooling.
  • Calculator agent – Helps network engineers understand complex network parameters and perform precise calculations on telemetry data. This produces numerical insights that support network management tasks; optimize performance; maintain network reliability, uptime, and compliance; and assist in troubleshooting.
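
The division of responsibilities can be pictured in a few lines of plain Python. The sketch below is purely conceptual — in the actual solution this orchestration is handled by Amazon Bedrock Agents, not hand-written routing — and the keyword heuristic and canned responses are invented for illustration.

```python
class DocumentationAgent:
    def answer(self, query: str) -> str:
        # Would query the knowledge base (RAG) for documentation lookups
        return f"[documentation excerpt relevant to: {query}]"

class CalculatorAgent:
    def answer(self, query: str) -> str:
        # Would translate the query to SQL and run it against the data lake
        return f"[numeric KPI result for: {query}]"

class SupervisorAgent:
    """Routes each request to the right specialist and merges the answers."""

    def __init__(self):
        self.docs = DocumentationAgent()
        self.calc = CalculatorAgent()

    def answer(self, query: str) -> str:
        parts = []
        if any(word in query.lower() for word in ("kpi", "calculate", "ratio")):
            parts.append(self.calc.answer(query))
        parts.append(self.docs.answer(query))
        return "\n".join(parts)

print(SupervisorAgent().answer("Calculate the availability KPI for cell 42"))
```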

The following diagram illustrates the enhanced extract, transform, and load (ETL) pipeline’s interaction with Amazon Bedrock.

Data pipeline

To achieve the desired accuracy in KPI calculations, the data pipeline was refined for consistent and precise performance, which leads to meaningful insights. The team implemented an ETL pipeline with Amazon Simple Storage Service (Amazon S3) as the data lake to store input files following a daily batch ingestion approach, AWS Glue for automated data crawling and cataloging, and Amazon Athena for SQL querying. At this point, the calculator agent could forgo the Pandas or Spark data processing implementation. Instead, using Amazon Bedrock Agents, the agent translates natural language user prompts into SQL queries, then runs the relevant queries, selected dynamically through analysis of various input parameters, giving the calculator agent an accurate result. This serverless architecture supports scalability and cost-effectiveness while maintaining high accuracy in KPI calculations. The system integrates with Swisscom’s on-premises data lake through daily batch data ingestion, with careful attention to data security and sovereignty requirements.
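
The Athena leg of this flow can be sketched with boto3 as follows: a query of the kind the agent might generate is started, polled to completion, and read back. The database, table, column names, and S3 result location are hypothetical.

```python
import time
import boto3

athena = boto3.client("athena", region_name="eu-central-1")

# SQL of the kind the agent might generate from a natural language prompt
sql = """
SELECT cell_id,
       SUM(uptime_s) * 1.0 / SUM(uptime_s + downtime_s) AS availability
FROM network_telemetry
WHERE day = DATE '2025-07-01'
GROUP BY cell_id
"""

run = athena.start_query_execution(
    QueryString=sql,
    QueryExecutionContext={"Database": "telemetry_db"},
    ResultConfiguration={"OutputLocation": "s3://example-athena-results/"},
)
query_id = run["QueryExecutionId"]

# Poll until the query finishes (simplified; production code should back off)
while True:
    state = athena.get_query_execution(QueryExecutionId=query_id)[
        "QueryExecution"]["Status"]["State"]
    if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
        break
    time.sleep(1)

if state == "SUCCEEDED":
    result = athena.get_query_results(QueryExecutionId=query_id)
    for row in result["ResultSet"]["Rows"][:5]:  # header plus first rows
        print([col.get("VarCharValue") for col in row["Data"]])
```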

To strengthen data security and ensure appropriate ethics in the Network Assistant’s responses, a series of guardrails were defined in Amazon Bedrock. The application implements a comprehensive set of data security guardrails to protect against malicious inputs and safeguard sensitive information. These include content filters that block harmful categories such as hate, insults, violence, and prompt-based threats like SQL injection. Specific denied topics and sensitive identifiers (for example, IMSI, IMEI, MAC address, or GPS coordinates) are filtered through manual word filters and pattern-based detection, including regular expressions (regex). Sensitive data such as personally identifiable information (PII), AWS access keys, and serial numbers are blocked or masked. The system also uses contextual grounding and relevance checks to verify that model responses are factually accurate and appropriate. When input or output is restricted, standardized messaging notifies the user that the request can’t be processed. These guardrails help prevent data leaks, reduce the risk of DDoS-driven cost spikes, and maintain the integrity of the application’s outputs.
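
As a rough illustration, a guardrail combining content filters with PII and regex-based masking could be defined through the create_guardrail API as in the sketch below. The filter strengths, regex pattern, and messages are illustrative choices, not Swisscom’s published configuration.

```python
import boto3

bedrock = boto3.client("bedrock", region_name="eu-central-1")

guardrail = bedrock.create_guardrail(
    name="network-assistant-guardrail",
    description="Blocks harmful content and masks telco identifiers",
    contentPolicyConfig={
        "filtersConfig": [
            {"type": "HATE", "inputStrength": "HIGH", "outputStrength": "HIGH"},
            {"type": "INSULTS", "inputStrength": "HIGH", "outputStrength": "HIGH"},
            {"type": "VIOLENCE", "inputStrength": "HIGH", "outputStrength": "HIGH"},
            # Prompt attacks are only filtered on input, so output must be NONE
            {"type": "PROMPT_ATTACK", "inputStrength": "HIGH", "outputStrength": "NONE"},
        ]
    },
    sensitiveInformationPolicyConfig={
        "piiEntitiesConfig": [
            {"type": "AWS_ACCESS_KEY", "action": "BLOCK"},
            {"type": "MAC_ADDRESS", "action": "ANONYMIZE"},
        ],
        "regexesConfig": [
            {
                "name": "imsi",
                "description": "IMSI: 14-15 digits (simplified pattern)",
                "pattern": r"\b\d{14,15}\b",
                "action": "ANONYMIZE",
            }
        ],
    },
    blockedInputMessaging="Sorry, this request can't be processed.",
    blockedOutputsMessaging="Sorry, this response was blocked.",
)
print(guardrail["guardrailId"], guardrail["version"])
```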

Results and benefits

The implementation of the Network Assistant is set to deliver substantial, measurable benefits to Swisscom’s network operations. The most significant impact is time savings: network engineers are estimated to see a 10% reduction in time spent on routine data retrieval and analysis tasks. This efficiency gain translates to nearly 200 hours saved per engineer annually and represents a significant improvement in operational efficiency. The financial impact is equally impressive. The solution is projected to deliver substantial cost savings per engineer annually, with operational costs at less than 1% of the total value generated. The return on investment grows as more teams and use cases are onboarded, demonstrating strong scalability potential.

Beyond the quantifiable benefits, the Network Assistant is expected to transform how engineers interact with network data. The improved data pipeline supports accuracy in KPI calculations, crucial for network health monitoring, and the multi-agent approach provides orchestrated, comprehensive responses to complex natural language queries.

As a result, engineers have instant access to a wide range of network parameters, data source information, and troubleshooting guidance from a single personalized endpoint they can quickly interact with through natural language. This lets them focus on strategic tasks rather than routine data gathering and analysis, leading to a significant reduction in toil that aligns with Swisscom’s SRE principles.

Lessons learned

Throughout the development and implementation of the Swisscom Network Assistant, several learnings emerged that shaped the solution. The team needed to address the solution’s data sovereignty and security requirements, particularly when processing data on AWS. This led to careful attention to data classification and compliance with applicable regulatory requirements in the telecommunications sector, to make sure sensitive data is handled appropriately. The application also underwent a strict threat model evaluation, verifying the robustness of its interfaces against vulnerabilities and proactively hardening them. The threat model was used to assess worst-case scenarios, and data flow diagrams were created to depict major data flows within and beyond the application boundaries. The AWS architecture was laid out in detail, and trust boundaries were set to indicate which parts of the application trusted each other. Threats were identified following the STRIDE methodology (Spoofing, Tampering, Repudiation, Information disclosure, Denial of service, Elevation of privilege), and countermeasures, including Amazon Bedrock Guardrails, were defined to avoid or mitigate threats in advance.

A critical technical insight was that complex calculations over significant data volumes required a different approach than pure AI model interpretation. The team implemented an enhanced data processing pipeline that combines the contextual understanding of AI models with direct database queries for numerical calculations. This hybrid approach delivers both accuracy in calculations and richness in contextual responses.

The choice of a serverless architecture proved particularly beneficial: it minimized the need to manage compute resources and provides automatic scaling capabilities. The pay-per-use model of AWS services helped keep operational costs low and maintain high performance. Additionally, the team’s decision to implement a multi-agent approach provided the flexibility needed to handle different types of queries and use cases effectively.

Next steps

Swisscom has ambitious plans to further enhance the Network Assistant’s capabilities. A key upcoming feature is a network health tracker agent that provides proactive monitoring of network KPIs. This agent will automatically generate reports that categorize issues by criticality, enable faster response times, and improve the quality of issue resolution for potential network problems. The team is also exploring integration with Amazon Simple Notification Service (Amazon SNS) to enable proactive alerting on critical network status changes. This could include direct integration with the operational tools that alert on-call engineers, to further streamline the incident response process. The enhanced notification system will help engineers address potential issues before they critically impact network performance, and will provide a detailed action plan covering the affected network entities, the severity of the event, and precisely what went wrong.

The roadmap also includes expanding the system’s data sources and use cases. Integration with additional internal network systems will provide more comprehensive network insights. The team is also working on more sophisticated troubleshooting features, using the growing knowledge base and agentic capabilities to provide increasingly detailed guidance to engineers.

Additionally, Swisscom is adopting infrastructure as code (IaC) principles by implementing the solution with AWS CloudFormation. This approach introduces automated, consistent deployments while providing version control of infrastructure components, facilitating simpler scaling and management of the Network Assistant solution as it grows.

Conclusion

The Network Assistant represents a significant advancement in how Swisscom manages its network operations. By using AWS services and implementing a sophisticated AI-powered solution, they have successfully addressed the challenges of manual data retrieval and analysis. As a result, they have improved both accuracy and efficiency, so network engineers can respond quickly and decisively to network events. The solution’s success rests not only on the quantifiable benefits in time and cost savings but also on its potential for future expansion. The serverless architecture and multi-agent approach provide a solid foundation for adding new capabilities and scaling across different teams and use cases.

As organizations worldwide grapple with similar challenges in network operations, Swisscom’s implementation serves as a valuable blueprint for using cloud services and AI to transform traditional operations. The combination of Amazon Bedrock with careful attention to data security and accuracy demonstrates how modern AI solutions can help solve real-world engineering challenges.

As the complexity of managing network operations continues to grow, the lessons from Swisscom’s journey can be applied to many engineering disciplines. We encourage you to consider how Amazon Bedrock and similar AI solutions might help your organization overcome its own comprehension and process improvement barriers. To learn more about implementing generative AI in your workflows, explore Amazon Bedrock Resources or contact AWS.

Additional resources

For more information about Amazon Bedrock Agents and its use cases, refer to the following resources:


About the authors

Pablo García Benedicto is an experienced Data & AI Cloud Engineer with strong expertise in cloud hyperscalers and data engineering. With a background in telecommunications, he currently works at Swisscom, where he leads and contributes to projects involving generative AI applications and agents using Amazon Bedrock. Aiming for AI and data specialization, his latest projects focus on building intelligent assistants and autonomous agents that streamline enterprise information retrieval, leveraging cloud-native architectures and scalable data pipelines to reduce toil and drive operational efficiency.

Rajesh Sripathi is a Generative AI Specialist Solutions Architect at AWS, where he partners with global Telecommunications and Retail & CPG customers to develop and scale generative AI applications. With over 18 years of experience in the IT industry, Rajesh helps organizations use cutting-edge cloud and AI technologies for business transformation. Outside of work, he enjoys exploring new places through his passion for travel and driving.

Ruben Merz is a Principal Solutions Architect at AWS. With a background in distributed systems and networking, his work with customers at AWS focuses on digital sovereignty, AI, and networking.

Jordi Montoliu Nerin is a Data & AI Leader currently serving as a Senior AI/ML Specialist at AWS, where he helps worldwide telecommunications customers implement AI strategies, after previously driving Data & Analytics business across EMEA regions. He has over 10 years of experience, during which he has led multiple Data & AI implementations at scale, led executions of data strategy and data governance frameworks, and driven strategic technical and business development programs across multiple industries and continents. Outside of work, he enjoys sports, cooking, and traveling.

How iFood built a platform to run hundreds of machine learning models with Amazon SageMaker Inference

Headquartered in São Paulo, Brazil, iFood is a national private company and the leader in food-tech in Latin America, processing millions of orders monthly. iFood has stood out for its strategy of incorporating cutting-edge technology into its operations. With the support of AWS, iFood has developed a robust machine learning (ML) inference infrastructure, using services such as Amazon SageMaker to efficiently create and deploy ML models. This partnership has allowed iFood not only to optimize its internal processes, but also to offer innovative solutions to its delivery partners and restaurants.

iFood’s ML platform comprises a set of tools, processes, and workflows developed with the following objectives:

  • Accelerate the development and training of AI/ML models, making them more reliable and reproducible
  • Make sure that deploying these models to production is reliable, scalable, and traceable
  • Facilitate the testing, monitoring, and evaluation of models in production in a transparent, accessible, and standardized way

To achieve these objectives, iFood uses SageMaker, which simplifies the training and deployment of models. Additionally, the integration of SageMaker features into iFood’s infrastructure automates critical processes, such as generating training datasets, training models, deploying models to production, and continuously monitoring their performance.

In this post, we show how iFood uses SageMaker to revolutionize its ML operations. By harnessing the power of SageMaker, iFood streamlines the entire ML lifecycle, from model training to deployment. This integration not only simplifies complex processes but also automates critical tasks.

AI inference at iFood

iFood has harnessed the power of a robust AI/ML platform to elevate the customer experience across its many touchpoints. Using cutting-edge AI/ML capabilities, the company has developed a suite of transformative solutions to address a multitude of customer use cases:

  • Personalized recommendations – At iFood, AI-powered recommendation models analyze a customer’s past order history, preferences, and contextual factors to suggest the most relevant restaurants and menu items. This personalized approach helps customers discover new cuisines and dishes tailored to their tastes, improving satisfaction and driving increased order volumes.
  • Intelligent order tracking – iFood’s AI systems track orders in real time, predicting delivery times with a high degree of accuracy. By understanding factors like traffic patterns, restaurant preparation times, and courier locations, the AI can proactively notify customers of their order status and expected arrival, reducing uncertainty and anxiety during the delivery process.
  • Automated customer service – To handle the thousands of daily customer inquiries, iFood has developed an AI-powered chatbot that can quickly resolve common issues and questions. This intelligent virtual agent understands natural language, accesses relevant data, and provides personalized responses, delivering fast and consistent support without overburdening the human customer service team.
  • Grocery shopping assistance – Integrating advanced language models, iFood’s app lets customers simply speak or type their recipe needs or grocery list, and the AI automatically generates a detailed shopping list. This voice-enabled grocery planning feature saves customers time and effort, enhancing their overall shopping experience.

Through these AI-powered initiatives, iFood is able to anticipate customer needs, streamline key processes, and deliver a consistently exceptional experience—further strengthening its position as the leading food-tech platform in Latin America.

Solution overview

The following diagram illustrates iFood’s legacy architecture, which had separate workflows for the data science and engineering teams, creating challenges in efficiently deploying accurate, real-time machine learning models into production systems.

In the past, the data science and engineering teams at iFood operated independently. Data scientists would build models using notebooks, adjust weights, and publish them to services. Engineering teams would then struggle to integrate these models into production systems. This disconnect between the two teams made it challenging to deploy accurate real-time ML models.

To overcome this challenge, iFood built an internal ML platform that helped bridge the gap. This platform has streamlined the workflow, providing a seamless experience for creating, training, and delivering models for inference. It provides a centralized environment where data scientists can build, train, and deploy models in an integrated way that matches the teams’ development workflow, while engineering teams can consume these models and integrate them into applications from both an online and an offline perspective, enabling a more efficient and streamlined workflow.

By breaking down the barriers between data science and engineering, AWS AI platforms empowered iFood to use the full potential of their data and accelerate the development of AI applications. The automated deployment and scalable inference capabilities provided by SageMaker made sure that models were readily available to power intelligent applications and provide accurate predictions on demand. This centralization of ML services as a product has been a game changer for iFood, allowing them to focus on building high-performing models rather than the intricate details of inference.

One of the core capabilities of iFood’s ML platform is providing the infrastructure to serve predictions. Several use cases are supported by the inference made available through ML Go!, which is responsible for deploying SageMaker pipelines and endpoints. The former are used to schedule offline prediction jobs, and the latter are employed to create model services to be consumed by the application services. The following diagram illustrates iFood’s updated architecture, which incorporates an internal ML platform built to streamline workflows between the data science and engineering teams, enabling efficient deployment of machine learning models into production systems.

Integrating model deployment into the service development process was a key initiative to enable data scientists and ML engineers to deploy and maintain these models. The ML platform empowers the building and evolution of ML systems. Several other integrations with other important platforms, like the feature platform and data platform, were delivered to improve the experience for users as a whole. The process of consuming ML-based decisions was streamlined—but it doesn’t end there. iFood’s ML platform, ML Go!, is now focusing on new inference capabilities, supported by recent features whose ideation and development the iFood team helped drive. The following diagram illustrates the final architecture of iFood’s ML platform, showcasing how model deployment is integrated into the service development process, the platform’s connections with the feature and data platforms, and its focus on new inference capabilities.

One of the biggest changes is the creation of a single abstraction for connecting with SageMaker endpoints and jobs, called the ML Go! Gateway, along with the separation of concerns within endpoints through the Inference Components feature, making serving faster and more efficient. In this new inference structure, the endpoints are also managed by the ML Go! CI/CD, leaving the pipelines to deal only with model promotions and not the infrastructure itself. This reduces the lead time for changes and the change failure rate across deployments.

Using SageMaker Inference model serving containers

One of the key features of modern machine learning platforms is the standardization of machine learning and AI services. By encapsulating models and dependencies as Docker containers, these platforms ensure consistency and portability across different environments and stages of ML. Using SageMaker, data scientists and developers can use pre-built Docker containers, making it straightforward to deploy and manage ML services. As a project progresses, they can spin up new instances and configure them according to their specific requirements. SageMaker provides Docker containers that are designed to work seamlessly with the service, offering a standardized and scalable environment for running ML workloads.

SageMaker provides a set of pre-built containers for popular ML frameworks and algorithms, such as TensorFlow, PyTorch, XGBoost, and many others. These containers are optimized for performance and come with the necessary dependencies and libraries pre-installed, making it straightforward to get started with your ML projects. In addition to the pre-built containers, SageMaker offers the option to bring your own custom containers, which include your specific ML code, dependencies, and libraries. This can be particularly useful if you’re using a less common framework or have specific requirements that the pre-built containers don’t meet.

iFood focused heavily on using custom containers for the training and deployment of ML workloads, providing a consistent and reproducible environment for ML experiments and making it straightforward to track and replicate results. The first step in this journey was to standardize the ML custom code, which is the piece of code the data scientists should focus on. Without a notebook, and with BruceML, the way to create the code to train and serve models changed: it is encapsulated from the start as container images. BruceML was responsible for creating the scaffolding required to integrate seamlessly with the SageMaker platform, allowing the teams to take advantage of its various features, such as hyperparameter tuning, model deployment, and monitoring. By standardizing ML services and using containerization, modern platforms democratize ML, enabling iFood to rapidly build, deploy, and scale intelligent applications.
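
As an illustration, deploying a model packaged in a custom container with the SageMaker Python SDK could look like the following sketch; the ECR image URI, model artifact path, IAM role, and endpoint name are placeholders rather than iFood’s actual values.

```python
import sagemaker
from sagemaker.model import Model

session = sagemaker.Session()
role = "arn:aws:iam::111122223333:role/SageMakerExecutionRole"  # placeholder

# Custom (BruceML-style) serving image pushed to ECR, plus trained artifacts
model = Model(
    image_uri="111122223333.dkr.ecr.us-east-1.amazonaws.com/custom-serving:latest",
    model_data="s3://example-models/recsys/model.tar.gz",
    role=role,
    sagemaker_session=session,
)

# Real-time hosted endpoint for online predictions
predictor = model.deploy(
    initial_instance_count=1,
    instance_type="ml.m5.xlarge",
    endpoint_name="recsys-endpoint",
)
```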

Automating model deployment and ML system retraining

When running ML models in production, it’s critical to have a robust, automated process for deploying and recalibrating those models across different use cases. This helps make sure the models remain accurate and performant over time. The team at iFood understood this challenge well—it’s not just the model that gets deployed. Instead, they rely on another concept to keep things running well: ML pipelines.

Using Amazon SageMaker Pipelines, they were able to build a CI/CD system for ML that delivers automated retraining and model deployment. They also integrated this entire system with the company’s existing CI/CD pipeline, making it efficient while maintaining the good DevOps practices used at iFood. It starts with the ML Go! CI/CD pipeline pushing the latest code artifacts containing the model training and deployment logic. It includes the training process, which uses different containers to implement the entire pipeline. When training is complete, the inference pipeline can be executed to begin the model deployment. This can be an entirely new model, or the promotion of a new version to improve the performance of an existing one. Every model available for deployment is secured and registered automatically by ML Go! in Amazon SageMaker Model Registry, providing versioning and tracking capabilities.

The final step depends on the intended inference requirements. For batch prediction use cases, the pipeline creates a SageMaker batch transform job to run large-scale predictions. For real-time inference, the pipeline deploys the model to a SageMaker endpoint, carefully selecting the appropriate container variant and instance type to handle the expected production traffic and latency needs. This end-to-end automation has been a game changer for iFood, allowing them to rapidly iterate on their ML models and deploy updates and recalibrations quickly and confidently across their various use cases. SageMaker Pipelines has provided a streamlined way to orchestrate these complex workflows, making sure model operationalization is efficient and reliable.
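
A stripped-down pipeline of this kind, expressed with the SageMaker Python SDK, might look like the sketch below. A production pipeline would add data processing, evaluation, and model-registration steps; the image URI, role, and names here are placeholders.

```python
from sagemaker.estimator import Estimator
from sagemaker.workflow.pipeline import Pipeline
from sagemaker.workflow.steps import TrainingStep

role = "arn:aws:iam::111122223333:role/SageMakerExecutionRole"  # placeholder

# Training job that runs inside a custom container image
estimator = Estimator(
    image_uri="111122223333.dkr.ecr.us-east-1.amazonaws.com/custom-training:latest",
    role=role,
    instance_count=1,
    instance_type="ml.m5.xlarge",
    output_path="s3://example-models/recsys/",
)

train_step = TrainingStep(name="TrainModel", estimator=estimator)

pipeline = Pipeline(name="recsys-retraining-pipeline", steps=[train_step])
pipeline.upsert(role_arn=role)  # create or update the pipeline definition
pipeline.start()                # in practice, triggered by the ML Go! CI/CD
```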

Running inference in different SLA formats

iFood uses the inference capabilities of SageMaker to power its intelligent applications and deliver accurate predictions to its customers. By integrating the robust inference options available in SageMaker, iFood has been able to seamlessly deploy ML models and make them available for real-time and batch predictions. For iFood’s online, real-time prediction use cases, the company uses SageMaker hosted endpoints to deploy their models. These endpoints are integrated into iFood’s customer-facing applications, allowing for immediate inference on incoming data from users. SageMaker handles the scaling and management of these endpoints, making sure that iFood’s models are readily available to provide accurate predictions and improve the user experience.

In addition to real-time predictions, iFood also uses SageMaker batch transform to perform large-scale, asynchronous inference on datasets. This is particularly useful for iFood’s data preprocessing and batch prediction requirements, such as generating recommendations or insights for their restaurant partners. SageMaker batch transform jobs enable iFood to efficiently process vast amounts of data, further enhancing their data-driven decision-making.
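
For the batch path, a transform job takes only a few lines of the SageMaker Python SDK, as in this sketch; the model name, instance sizing, and S3 paths are placeholders.

```python
from sagemaker.transformer import Transformer

# Batch-scoring job over a dataset in S3, using a previously created model
transformer = Transformer(
    model_name="recsys-model",  # placeholder for an existing SageMaker model
    instance_count=2,
    instance_type="ml.m5.xlarge",
    output_path="s3://example-predictions/recsys/",
)

transformer.transform(
    data="s3://example-datasets/recsys/latest/",
    content_type="text/csv",
    split_type="Line",  # one record per line
)
transformer.wait()  # blocks until the batch transform job completes
```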

Building on the success of standardizing on SageMaker Inference, iFood has been instrumental in partnering with the SageMaker Inference team to build and enhance key AI inference capabilities within the SageMaker platform. Since the early days of ML, iFood has provided the SageMaker Inference team with valuable input and expertise, enabling the introduction of several new features and optimizations (one of them, inference components, is sketched after the list):

  • Cost and performance optimizations for generative AI inference – iFood helped the SageMaker Inference team develop innovative techniques to optimize accelerator utilization, enabling SageMaker Inference to reduce foundation model (FM) deployment costs by 50% on average and latency by 20% on average with inference components. This breakthrough delivers significant cost savings and performance improvements for customers running generative AI workloads on SageMaker.
  • Scaling improvements for AI inference – iFood’s expertise in distributed systems and auto scaling has also helped the SageMaker team develop advanced capabilities to better handle the scaling requirements of generative AI models. These improvements reduce auto scaling times by up to 40% and auto scaling detection by six times, making sure that customers can rapidly scale their inference workloads on SageMaker to meet spikes in demand without compromising performance.
  • Streamlined generative AI model deployment for inference – Recognizing the need for simplified model deployment, iFood collaborated with AWS to introduce the ability to deploy open source large language models (LLMs) and FMs with just a few clicks. This user-friendly functionality removes the complexity traditionally associated with deploying these advanced models, empowering more customers to harness the power of AI.
  • Scale-to-zero for inference endpoints – iFood played a crucial role in collaborating with SageMaker Inference to develop and launch the scale-to-zero feature for SageMaker inference endpoints. This innovative capability allows inference endpoints to automatically shut down when not in use and rapidly spin up on demand when new requests arrive. This feature is particularly beneficial for dev/test environments, low-traffic applications, and inference use cases with varying demand, because it eliminates idle resource costs while maintaining the ability to quickly serve requests when needed. The scale-to-zero functionality represents a major advancement in cost-efficiency for AI inference, making it more accessible and economically viable for a wider range of use cases.
  • Packaging AI model inference more efficiently – To further simplify the AI model lifecycle, iFood worked with AWS to enhance SageMaker’s capabilities for packaging LLMs and models for deployment. These improvements make it straightforward to prepare and deploy these AI models, accelerating their adoption and integration.
  • Multi-model endpoints for GPU – iFood collaborated with the SageMaker Inference team to launch multi-model endpoints for GPU-based instances. This enhancement lets you deploy multiple AI models on a single GPU-enabled endpoint, significantly improving resource utilization and cost-efficiency. Drawing on iFood’s expertise in GPU optimization and model serving, SageMaker now offers a solution that can dynamically load and unload models on GPUs, reducing infrastructure costs by up to 75% for customers with multiple models and varying traffic patterns.
  • Asynchronous inference – Recognizing the need to handle long-running inference requests, the team at iFood worked closely with the SageMaker Inference team to develop and launch Asynchronous Inference in SageMaker. This feature enables you to process large payloads or time-consuming inference requests without the constraints of real-time API calls. iFood’s experience with large-scale distributed systems helped shape this solution, which allows for better management of resource-intensive inference tasks and the ability to handle inference requests that may take several minutes to complete. This capability has opened up new use cases for AI inference, particularly in industries dealing with complex data processing tasks such as genomics, video analysis, and financial modeling.
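
To make one of these features concrete, the following sketch packs a model onto a shared endpoint as an inference component using boto3, reserving an explicit slice of accelerator and memory; the endpoint, model, and sizing values are placeholders.

```python
import boto3

sm = boto3.client("sagemaker")

sm.create_inference_component(
    InferenceComponentName="recsys-model-a",
    EndpointName="shared-gpu-endpoint",  # existing endpoint (placeholder)
    VariantName="AllTraffic",
    Specification={
        "ModelName": "recsys-model-a",  # existing SageMaker model (placeholder)
        "ComputeResourceRequirements": {
            "NumberOfAcceleratorDevicesRequired": 1,
            "MinMemoryRequiredInMb": 4096,
        },
    },
    RuntimeConfig={"CopyCount": 1},  # copies can scale up, or down to zero
)
```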

By closely partnering with the SageMaker Inference team, iFood has played a pivotal role in driving the rapid evolution of AI inference and generative AI inference capabilities in SageMaker. The features and optimizations introduced through this collaboration are empowering AWS customers to unlock the transformative potential of inference with greater ease, cost-effectiveness, and performance.

“At iFood, we have been at the forefront of adopting transformative machine learning and AI technologies, and our partnership with the SageMaker Inference product team has been instrumental in shaping the future of AI applications. Together, we’ve developed strategies to efficiently manage inference workloads, allowing us to run models with speed and price-performance. The lessons we’ve learned supported us in the creation of our internal platform, which can serve as a blueprint for other organizations looking to harness the power of AI inference. We believe the features we have built in collaboration will broadly help other enterprises that run inference workloads on SageMaker, unlocking new frontiers of innovation and business transformation by solving recurring and critical problems in the universe of machine learning engineering.”

– Daniel Vieira, ML Platform Manager at iFood

Conclusion

Using the capabilities of SageMaker, iFood transformed its approach to ML and AI, unlocking new possibilities for enhancing the customer experience. By building a robust, centralized ML platform, iFood has bridged the gap between its data science and engineering teams, streamlining the model lifecycle from development to deployment. The integration of SageMaker features has enabled iFood to deploy ML models for both real-time and batch-oriented use cases. For real-time, customer-facing applications, iFood uses SageMaker hosted endpoints to provide immediate predictions and enhance the user experience. Additionally, the company uses SageMaker batch transform to efficiently process large datasets and generate insights for its restaurant partners. This flexibility in inference options has been key to iFood’s ability to power a diverse range of intelligent applications.

The automation of deployment and retraining through ML Go!, supported by SageMaker Pipelines and SageMaker Inference, has been a game changer for iFood. It has enabled the company to rapidly iterate on its ML models, deploy updates with confidence, and maintain the ongoing performance and reliability of its intelligent applications. Moreover, iFood’s strategic partnership with the SageMaker Inference team has been instrumental in driving the evolution of AI inference capabilities within the platform. Through this collaboration, iFood has helped shape cost and performance optimizations, scaling improvements, and simplified model deployment features—all of which are now benefiting a wider range of AWS customers.

By taking advantage of the capabilities SageMaker offers, iFood has been able to unlock the transformative potential of AI and ML, delivering innovative solutions that enhance the customer experience and strengthen its position as the leading food-tech platform in Latin America. This journey serves as a testament to the power of cloud-based AI infrastructure and the value of strategic partnerships in driving technology-led business transformation.

By following iFood’s example, you can unlock the full potential of SageMaker for your business, driving innovation and staying ahead in your industry.


About the authors

Daniel Vieira is a seasoned Machine Learning Engineering Manager at iFood, with a strong academic background in computer science, holding both a bachelor’s and a master’s degree from the Federal University of Minas Gerais (UFMG). With over a decade of experience in software engineering and platform development, Daniel leads iFood’s ML platform, building a robust, scalable ecosystem that drives impactful ML solutions across the company. In his spare time, Daniel enjoys music, philosophy, and learning about new things over a cup of coffee.

Debora Fanin serves as a Senior Customer Solutions Manager at AWS for the Digital Native Business segment in Brazil. In this role, Debora manages customer transformations, creating cloud adoption strategies to support cost-effective, timely deployments. Her responsibilities include designing change management plans, guiding solution-focused decisions, and addressing potential risks to align with customer objectives. Debora’s academic path includes a Master’s degree in Administration at FEI and certifications such as AWS Solutions Architect Associate and Agile credentials. Her professional history spans IT and project management roles across diverse sectors, where she developed expertise in cloud technologies, data science, and customer relations.

Saurabh Trikande is a Senior Product Manager for Amazon Bedrock and Amazon SageMaker Inference. He is passionate about working with customers and partners, motivated by the goal of democratizing AI. He focuses on core challenges related to deploying complex AI applications, inference with multi-tenant models, cost optimizations, and making the deployment of generative AI models more accessible. In his spare time, Saurabh enjoys hiking, learning about innovative technologies, following TechCrunch, and spending time with his family.

Gopi Mudiyala is a Senior Technical Account Manager at AWS. He helps customers in the financial services industry with their operations on AWS. As a machine learning enthusiast, Gopi works to help customers succeed in their ML journey. In his spare time, he likes to play badminton, spend time with family, and travel.

IBM releases a new mainframe built for the age of AI

IBM is releasing the latest version of its mainframe hardware, with new updates meant to accelerate AI adoption.

The hardware and consulting company on Monday announced IBM z17, the latest version of its mainframe computer hardware. The fully encrypted mainframe is powered by an IBM Telum II processor and is designed for more than 250 AI use cases, the company says, including AI agents and generative AI.

Mainframes might seem like old hat, but they’re used by 71% of Fortune 500 companies today, according to one source. In 2024, the mainframe market was worth an estimated $5.3 billion, per consulting firm Market Research Future.

The z17 can process 450 billion inference operations a day, a 50% increase over its predecessor, the IBM z16, which launched in 2022 and ran on the company’s original Telum processor. The system is designed to fully integrate with other hardware, software, and open source tools.

Tina Tarquinio, VP of product management and design for IBM Z, told TechCrunch that this mainframe upgrade has been in the works for five years — well before the current AI frenzy that started with the release of OpenAI’s ChatGPT in November 2022.

IBM spent more than 2,000 research hours getting feedback from over 100 customers as it built the z17, Tarquinio said. She thinks it’s interesting to see that, now, five years later, the feedback they got aligned with where the market ended up heading.

“It has been wild knowing that we’re introducing an AI accelerator, and then seeing, especially in the later half of 2022, all the changes in the industry regarding AI,” Tarquinio told TechCrunch. “It’s been really exciting. I think the biggest point has been [that] we don’t know what we don’t know about what’s coming, right? So the possibilities are really limitless in terms of what AI can help us do.”

The z17 is set up to adapt to and accommodate wherever the AI market heads, Tarquinio said. The mainframe will support 48 IBM Spyre AI accelerator chips at launch, with a plan to bring that number up to 96 within 12 months.

“We’re purposely building in headroom,” Tarquinio said. “We’re purposely building in AI agility. So as new models are released, [we’re] making sure that we’ve built in the headroom for bigger, larger models — models that maybe need more local memory to talk to each other. We’ve built that in because we know it’s really the approach that will change, right? The new models will come and go.”

Tarquinio said that one of the highlights of this latest hardware — although she joked it was like being asked to pick her favorite child — is that the z17 is more energy-efficient than its predecessor, and supposedly rivals, too.

“On-chip, we’re increasing the AI acceleration by seven and a half times, but that’s five and a half times less energy than you would need to do, like, multi-model on another kind of accelerator or platform in the industry,” Tarquinio said.

The z17 mainframes will become generally available on June 18.
