Headquartered in São Paulo, Brazil, iFood is a nationwide private company and the leader in food-tech in Latin America, processing millions of orders monthly. iFood has stood out for its strategy of incorporating cutting-edge technology into its operations. With the help of AWS, iFood has developed a robust machine learning (ML) inference infrastructure, using services such as Amazon SageMaker to efficiently create and deploy ML models. This partnership has allowed iFood not only to optimize its internal processes, but also to offer innovative solutions to its delivery partners and restaurants.
iFood's ML platform comprises a set of tools, processes, and workflows developed with the following objectives:
- Accelerate the development and training of AI/ML models, making them more reliable and reproducible
- Make sure that deploying these models to production is reliable, scalable, and traceable
- Facilitate the testing, monitoring, and evaluation of models in production in a transparent, accessible, and standardized way
To achieve these objectives, iFood uses SageMaker, which simplifies the training and deployment of models. Additionally, the integration of SageMaker features into iFood's infrastructure automates critical processes, such as generating training datasets, training models, deploying models to production, and continuously monitoring their performance.
In this post, we show how iFood uses SageMaker to revolutionize its ML operations. By harnessing the power of SageMaker, iFood streamlines the entire ML lifecycle, from model training to deployment. This integration not only simplifies complex processes but also automates critical tasks.
AI inference at iFood
iFood has harnessed the power of a robust AI/ML platform to elevate the customer experience across its various touchpoints. Using cutting-edge AI/ML capabilities, the company has developed a suite of transformative solutions to address a multitude of customer use cases:
- Personalized recommendations – At iFood, AI-powered recommendation models analyze a customer's past order history, preferences, and contextual factors to suggest the most relevant restaurants and menu items. This personalized approach helps customers discover new cuisines and dishes tailored to their tastes, enhancing satisfaction and driving increased order volumes.
- Intelligent order tracking – iFood's AI systems track orders in real time, predicting delivery times with a high degree of accuracy. By understanding factors like traffic patterns, restaurant preparation times, and courier locations, the AI can proactively notify customers of their order status and expected arrival, reducing uncertainty and anxiety during the delivery process.
- Automated customer service – To handle the thousands of daily customer inquiries, iFood has developed an AI-powered chatbot that can quickly resolve common issues and questions. This intelligent virtual agent understands natural language, accesses relevant data, and provides personalized responses, delivering fast and consistent support without overburdening the human customer service team.
- Grocery shopping assistance – Integrating advanced language models, iFood's app lets customers simply speak or type their recipe needs or grocery list, and the AI automatically generates a detailed shopping list. This voice-enabled grocery planning feature saves customers time and effort, enhancing their overall shopping experience.
Through these AI-powered initiatives, iFood is able to anticipate customer needs, streamline key processes, and deliver a consistently exceptional experience, further strengthening its position as the leading food-tech platform in Latin America.
Solution overview
The following diagram illustrates iFood's legacy architecture, which had separate workflows for the data science and engineering teams, creating challenges in efficiently deploying accurate, real-time machine learning models into production systems.
In the past, the data science and engineering teams at iFood operated independently. Data scientists would build models using notebooks, adjust weights, and publish them onto services. Engineering teams would then struggle to integrate these models into production systems. This disconnect between the two teams made it challenging to deploy accurate real-time ML models.
To overcome this challenge, iFood built an internal ML platform that helped bridge this gap. This platform has streamlined the workflow, providing a seamless experience for creating, training, and delivering models for inference. It gives data scientists a centralized place to build, train, and deploy models in an integrated way that reflects the teams' development workflow. Engineering teams can then consume these models and integrate them into applications, both online and offline, enabling a more efficient and streamlined workflow.
By breaking down the barriers between data science and engineering, AWS AI platforms empowered iFood to use the full potential of their data and accelerate the development of AI applications. The automated deployment and scalable inference capabilities provided by SageMaker made sure that models were readily available to power intelligent applications and provide accurate predictions on demand. This centralization of ML services as a product has been a game changer for iFood, allowing them to focus on building high-performing models rather than the intricate details of inference.
One of the core capabilities of iFood's ML platform is the ability to provide the infrastructure to serve predictions. Several use cases are supported by the inference made available through ML Go!, which is responsible for deploying SageMaker pipelines and endpoints. The former are used to schedule offline prediction jobs, and the latter are used to create model services that are consumed by application services. The following diagram illustrates iFood's updated architecture, which incorporates an internal ML platform built to streamline workflows between data science and engineering teams, enabling efficient deployment of machine learning models into production systems.
Integrating model deployment into the service development process was a key initiative to enable data scientists and ML engineers to deploy and maintain those models. The ML platform empowers the building and evolution of ML systems. Several integrations with other critical platforms, such as the feature platform and the data platform, were delivered to improve the overall user experience. The process of consuming ML-based decisions was streamlined, but it doesn't end there. iFood's ML platform, ML Go!, is now focusing on new inference capabilities, supported by recent features whose ideation and development the iFood team helped drive. The following diagram illustrates the final architecture of iFood's ML platform, showcasing how model deployment is integrated into the service development process, the platform's connections with the feature and data platforms, and its focus on new inference capabilities.
One of the biggest changes is the creation of a single abstraction for connecting with SageMaker endpoints and jobs, called the ML Go! Gateway, along with the separation of concerns within endpoints through the Inference Components feature, which makes serving faster and more efficient. In this new inference structure, the endpoints are also managed by the ML Go! CI/CD, leaving the pipelines to deal only with model promotions rather than the infrastructure itself. This reduces the lead time for changes and the change failure rate of deployments.
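To illustrate how this separation of concerns looks in practice, the following is a minimal sketch of creating an inference component on an existing endpoint with the boto3 SageMaker client. The endpoint, variant, model, and resource values are hypothetical placeholders; the actual configuration behind ML Go! is not public.

```python
import boto3

sm = boto3.client("sagemaker")

# Attach a model to an existing inference-component-based endpoint.
# Endpoint, variant, and model names below are illustrative placeholders.
sm.create_inference_component(
    InferenceComponentName="recommendation-model-v3",
    EndpointName="ml-go-shared-endpoint",
    VariantName="AllTraffic",
    Specification={
        "ModelName": "recommendation-model-v3",  # a model already created in SageMaker
        "ComputeResourceRequirements": {
            # Per-copy resources; the endpoint packs components onto its instances.
            "NumberOfCpuCoresRequired": 2,
            "MinMemoryRequiredInMb": 4096,
        },
    },
    RuntimeConfig={"CopyCount": 2},  # number of copies serving traffic
)
```

Because each component declares its own compute requirements, several models can share the same endpoint fleet, and a CI/CD flow can promote or roll back a single model without touching the underlying infrastructure.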
Using SageMaker Inference model serving containers
One of the key features of modern machine learning platforms is the standardization of machine learning and AI services. By encapsulating models and dependencies as Docker containers, these platforms ensure consistency and portability across different environments and stages of ML. Using SageMaker, data scientists and developers can use pre-built Docker containers, making it straightforward to deploy and manage ML services. As a project progresses, they can spin up new instances and configure them according to their specific requirements. SageMaker provides Docker containers that are designed to work seamlessly with the platform, offering a standardized and scalable environment for running ML workloads.
SageMaker provides a set of pre-built containers for popular ML frameworks and algorithms, such as TensorFlow, PyTorch, XGBoost, and many others. These containers are optimized for performance and come with the necessary dependencies and libraries pre-installed, making it straightforward to get started with your ML projects. In addition to the pre-built containers, SageMaker offers the option to bring your own custom containers, which include your specific ML code, dependencies, and libraries. This can be particularly useful if you're using a less common framework or have specific requirements that aren't met by the pre-built containers.
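As a small illustration of the pre-built container approach, the following sketch looks up the URI of an AWS-managed PyTorch inference image with the SageMaker Python SDK. The framework version, Python version, region, and instance type are arbitrary examples, not iFood's actual configuration.

```python
from sagemaker import image_uris

# Look up the AWS-managed deep learning container for PyTorch inference.
# Framework version, Python version, region, and instance type are illustrative.
image_uri = image_uris.retrieve(
    framework="pytorch",
    region="us-east-1",
    version="2.1",
    py_version="py310",
    image_scope="inference",
    instance_type="ml.m5.xlarge",
)
print(image_uri)  # ECR URI of the matching pre-built serving container
```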
iFood was highly focused on using custom containers for the training and deployment of ML workloads, providing a consistent and reproducible environment for ML experiments and making it straightforward to track and replicate results. The first step in this journey was to standardize the custom ML code, which is the piece of code that data scientists should focus on. Without a notebook, and with BruceML, the way the code to train and serve models is created has changed: it is encapsulated from the start as container images. BruceML was responsible for creating the scaffolding required to seamlessly integrate with the SageMaker platform, allowing teams to take advantage of its various features, such as hyperparameter tuning, model deployment, and monitoring. By standardizing ML services and using containerization, modern platforms democratize ML, enabling iFood to rapidly build, deploy, and scale intelligent applications.
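For the bring-your-own-container path, a serving image produced by scaffolding such as BruceML can be referenced directly when defining a model in the SageMaker Python SDK. The sketch below uses hypothetical image and artifact locations; BruceML is an internal iFood tool, so its actual output layout is assumed here.

```python
import sagemaker
from sagemaker.model import Model

session = sagemaker.Session()
role = sagemaker.get_execution_role()  # assumes an execution role is available

# Both the ECR image and the model artifact path are hypothetical examples.
model = Model(
    image_uri="123456789012.dkr.ecr.us-east-1.amazonaws.com/bruceml-serving:latest",
    model_data="s3://example-bucket/models/recommendation/model.tar.gz",
    role=role,
    sagemaker_session=session,
)

# Deploy the custom container as a real-time endpoint.
predictor = model.deploy(
    initial_instance_count=1,
    instance_type="ml.m5.xlarge",
    endpoint_name="bruceml-recommendation-demo",
)
```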
Automating model deployment and ML system retraining
When running ML models in production, it's critical to have a robust and automated process for deploying and recalibrating those models across different use cases. This helps make sure the models remain accurate and performant over time. The team at iFood understood this challenge well: it's not only the model that gets deployed. Instead, they rely on another concept to keep things running well: ML pipelines.
Using Amazon SageMaker Pipelines, they were able to build a CI/CD system for ML that delivers automated retraining and model deployment. They also integrated this entire system with the company's existing CI/CD pipeline, making it efficient while maintaining the good DevOps practices used at iFood. The process starts with the ML Go! CI/CD pipeline pushing the latest code artifacts containing the model training and deployment logic. It includes the training process, which uses different containers to implement the entire pipeline. When training is complete, the inference pipeline can be executed to begin the model deployment. It can be an entirely new model, or the promotion of a new version to improve the performance of an existing one. Every model available for deployment is also secured and registered automatically by ML Go! in Amazon SageMaker Model Registry, providing versioning and tracking capabilities.
The final step depends on the intended inference requirements. For batch prediction use cases, the pipeline creates a SageMaker batch transform job to run large-scale predictions. For real-time inference, the pipeline deploys the model to a SageMaker endpoint, carefully selecting the appropriate container variant and instance type to handle the expected production traffic and latency needs. This end-to-end automation has been a game changer for iFood, allowing them to rapidly iterate on their ML models and deploy updates and recalibrations quickly and confidently across their various use cases. SageMaker Pipelines has provided a streamlined way to orchestrate these complex workflows, making sure model operationalization is efficient and reliable.
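The following is a minimal sketch of the kind of pipeline this flow describes: a training step followed by automatic registration in SageMaker Model Registry, written with the SageMaker Python SDK. The image URIs, S3 paths, role ARN, and model package group name are placeholders, and the real ML Go! pipelines include additional steps not shown here.

```python
from sagemaker.estimator import Estimator
from sagemaker.model import Model
from sagemaker.workflow.model_step import ModelStep
from sagemaker.workflow.pipeline import Pipeline
from sagemaker.workflow.pipeline_context import PipelineSession
from sagemaker.workflow.steps import TrainingStep

session = PipelineSession()
role = "arn:aws:iam::123456789012:role/SageMakerExecutionRole"  # placeholder

# Training step: runs the training container pushed by the CI/CD pipeline.
estimator = Estimator(
    image_uri="123456789012.dkr.ecr.us-east-1.amazonaws.com/bruceml-training:latest",
    role=role,
    instance_count=1,
    instance_type="ml.m5.xlarge",
    sagemaker_session=session,
)
train_step = TrainingStep(
    name="TrainModel",
    step_args=estimator.fit({"train": "s3://example-bucket/datasets/train/"}),
)

# Register the trained model in the Model Registry so it can be promoted later.
model = Model(
    image_uri="123456789012.dkr.ecr.us-east-1.amazonaws.com/bruceml-serving:latest",
    model_data=train_step.properties.ModelArtifacts.S3ModelArtifacts,
    role=role,
    sagemaker_session=session,
)
register_step = ModelStep(
    name="RegisterModel",
    step_args=model.register(
        content_types=["application/json"],
        response_types=["application/json"],
        inference_instances=["ml.m5.xlarge"],
        transform_instances=["ml.m5.xlarge"],
        model_package_group_name="recommendation-models",  # placeholder group
    ),
)

pipeline = Pipeline(
    name="ml-go-train-and-register",
    steps=[train_step, register_step],
    sagemaker_session=session,
)
pipeline.upsert(role_arn=role)
pipeline.start()
```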
Running inference in different SLA formats
iFood uses the inference capabilities of SageMaker to power its intelligent applications and deliver accurate predictions to its customers. By integrating the robust inference options available in SageMaker, iFood has been able to seamlessly deploy ML models and make them available for real-time and batch predictions. For iFood's online, real-time prediction use cases, the company uses SageMaker hosted endpoints to deploy its models. These endpoints are integrated into iFood's customer-facing applications, allowing for immediate inference on incoming data from users. SageMaker handles the scaling and management of these endpoints, making sure iFood's models are readily available to provide accurate predictions and enhance the user experience.
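From an application service's point of view, calling one of these hosted endpoints reduces to a single API call through the SageMaker runtime. The sketch below is a generic example with a hypothetical endpoint name and payload, not iFood's actual request contract.

```python
import json

import boto3

runtime = boto3.client("sagemaker-runtime")

# Hypothetical request payload and endpoint name for a recommendation model.
payload = {"customer_id": "abc-123", "city": "sao-paulo", "time_of_day": "dinner"}

response = runtime.invoke_endpoint(
    EndpointName="recommendation-endpoint",
    ContentType="application/json",
    Body=json.dumps(payload),
)

predictions = json.loads(response["Body"].read())
print(predictions)
```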
In addition to real-time predictions, iFood also uses SageMaker batch transform to perform large-scale, asynchronous inference on datasets. This is particularly useful for iFood's data preprocessing and batch prediction requirements, such as generating recommendations or insights for its restaurant partners. SageMaker batch transform jobs enable iFood to efficiently process vast amounts of data, further enhancing its data-driven decision-making.
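As a rough sketch of this batch path, the SageMaker Python SDK's Transformer can run a batch transform job over a dataset in Amazon S3. The model name, S3 locations, and instance settings below are illustrative only.

```python
from sagemaker.transformer import Transformer

# Model name and S3 locations are placeholders for illustration.
transformer = Transformer(
    model_name="restaurant-insights-model",
    instance_count=2,
    instance_type="ml.m5.xlarge",
    output_path="s3://example-bucket/batch-predictions/output/",
    strategy="MultiRecord",
    assemble_with="Line",
)

# Run large-scale, offline inference over the input dataset.
transformer.transform(
    data="s3://example-bucket/batch-predictions/input/",
    content_type="text/csv",
    split_type="Line",
)
transformer.wait()
```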
Building on the success of standardizing on SageMaker Inference, iFood has been instrumental in partnering with the SageMaker Inference team to build and enhance key AI inference capabilities within the SageMaker platform. Since the early days of its ML journey, iFood has provided the SageMaker Inference team with valuable input and expertise, enabling the introduction of several new features and optimizations:
- Cost and performance optimizations for generative AI inference – iFood helped the SageMaker Inference team develop innovative techniques to optimize the use of accelerators, enabling SageMaker Inference to reduce foundation model (FM) deployment costs by 50% on average and latency by 20% on average with inference components. This breakthrough delivers significant cost savings and performance improvements for customers running generative AI workloads on SageMaker.
- Scaling improvements for AI inference – iFood's expertise in distributed systems and auto scaling has also helped the SageMaker team develop advanced capabilities to better handle the scaling requirements of generative AI models. These improvements reduce auto scaling times by up to 40% and improve auto scaling detection sixfold, so customers can rapidly scale their inference workloads on SageMaker to meet spikes in demand without compromising performance.
- Streamlined generative AI model deployment for inference – Recognizing the need for simplified model deployment, iFood collaborated with AWS to introduce the ability to deploy open source large language models (LLMs) and FMs with just a few clicks. This user-friendly functionality removes the complexity traditionally associated with deploying these advanced models, empowering more customers to harness the power of AI.
- Scale-to-zero for inference endpoints – iFood played a crucial role in collaborating with the SageMaker Inference team to develop and launch the scale-to-zero feature for SageMaker inference endpoints. This capability allows inference endpoints to automatically shut down when not in use and rapidly spin up on demand when new requests arrive. It is particularly beneficial for dev/test environments, low-traffic applications, and inference use cases with varying demand, because it eliminates idle resource costs while maintaining the ability to quickly serve requests when needed. The scale-to-zero functionality represents a major advancement in cost-efficiency for AI inference, making it more accessible and economically viable for a wider range of use cases.
- Packaging AI model inference more efficiently – To further simplify the AI model lifecycle, iFood worked with AWS to enhance SageMaker's capabilities for packaging LLMs and models for deployment. These improvements make it straightforward to prepare and deploy these AI models, accelerating their adoption and integration.
- Multi-model endpoints for GPU – iFood collaborated with the SageMaker Inference team to launch multi-model endpoints for GPU-based instances. This enhancement lets you deploy multiple AI models on a single GPU-enabled endpoint, significantly improving resource utilization and cost-efficiency. Drawing on iFood's expertise in GPU optimization and model serving, SageMaker now offers a solution that can dynamically load and unload models on GPUs, reducing infrastructure costs by up to 75% for customers with multiple models and varying traffic patterns.
- Asynchronous inference – Recognizing the need to handle long-running inference requests, the team at iFood worked closely with the SageMaker Inference team to develop and launch Asynchronous Inference in SageMaker. This feature lets you process large payloads or time-consuming inference requests without the constraints of real-time API calls. iFood's experience with large-scale distributed systems helped shape this solution, which now allows for better management of resource-intensive inference tasks and the ability to handle inference requests that can take several minutes to complete. This capability has opened up new use cases for AI inference, particularly in industries dealing with complex data processing tasks such as genomics, video analysis, and financial modeling. A minimal sketch of deploying an asynchronous inference endpoint follows this list.
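The sketch below shows one way to deploy a model with an asynchronous inference configuration using the SageMaker Python SDK and submit a request that is processed in the background. The S3 locations are placeholders, and `model` is assumed to be a SageMaker Model object defined earlier, as in the custom container example above.

```python
from sagemaker.async_inference import AsyncInferenceConfig

# `model` is assumed to be a sagemaker.model.Model defined earlier;
# the S3 locations are illustrative placeholders.
async_config = AsyncInferenceConfig(
    output_path="s3://example-bucket/async-inference/output/",
)

predictor = model.deploy(
    initial_instance_count=1,
    instance_type="ml.m5.xlarge",
    async_inference_config=async_config,
)

# Submit a long-running request; the result is written to the output S3 location.
response = predictor.predict_async(
    input_path="s3://example-bucket/async-inference/input/request.json"
)
print(response.output_path)  # where the completed prediction will land
```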
By closely partnering with the SageMaker Inference team, iFood has played a pivotal role in driving the rapid evolution of AI inference and generative AI inference capabilities in SageMaker. The features and optimizations introduced through this collaboration are empowering AWS customers to unlock the transformative potential of inference with greater ease, cost-effectiveness, and performance.
"At iFood, we have been at the forefront of adopting transformative machine learning and AI technologies, and our partnership with the SageMaker Inference product team has been instrumental in shaping the future of AI applications. Together, we've developed strategies to efficiently manage inference workloads, allowing us to run models with speed and price-performance. The lessons we've learned supported us in the creation of our internal platform, which can serve as a blueprint for other organizations looking to harness the power of AI inference. We believe the features we have built in collaboration will broadly help other enterprises that run inference workloads on SageMaker, unlocking new frontiers of innovation and business transformation by solving recurring and critical problems in the universe of machine learning engineering."
– Daniel Vieira, ML platform manager at iFood
Conclusion
Using the capabilities of SageMaker, iFood transformed its approach to ML and AI, unlocking new possibilities for enhancing the customer experience. By building a robust and centralized ML platform, iFood has bridged the gap between its data science and engineering teams, streamlining the model lifecycle from development to deployment. The integration of SageMaker features has enabled iFood to deploy ML models for both real-time and batch-oriented use cases. For real-time, customer-facing applications, iFood uses SageMaker hosted endpoints to provide immediate predictions and enhance the user experience. Additionally, the company uses SageMaker batch transform to efficiently process large datasets and generate insights for its restaurant partners. This flexibility in inference options has been key to iFood's ability to power a diverse range of intelligent applications.
The automation of deployment and retraining through ML Go!, supported by SageMaker Pipelines and SageMaker Inference, has been a game changer for iFood. It has enabled the company to rapidly iterate on its ML models, deploy updates with confidence, and maintain the ongoing performance and reliability of its intelligent applications. Moreover, iFood's strategic partnership with the SageMaker Inference team has been instrumental in driving the evolution of AI inference capabilities within the platform. Through this collaboration, iFood has helped shape cost and performance optimizations, scaling enhancements, and simplified model deployment features, all of which are now benefiting a wider range of AWS customers.
By taking advantage of the capabilities SageMaker offers, iFood has been able to unlock the transformative potential of AI and ML, delivering innovative solutions that enhance the customer experience and strengthen its position as the leading food-tech platform in Latin America. This journey serves as a testament to the power of cloud-based AI infrastructure and the value of strategic partnerships in driving technology-led business transformation.
By following iFood's example, you can unlock the full potential of SageMaker for your business, driving innovation and staying ahead in your industry.
About the Authors
Daniel Vieira is a seasoned machine learning engineering manager at iFood, with a strong academic background in computer science, holding both a bachelor's and a master's degree from the Federal University of Minas Gerais (UFMG). With over a decade of experience in software engineering and platform development, Daniel leads iFood's ML platform, building a robust, scalable ecosystem that drives impactful ML solutions across the company. In his spare time, Daniel enjoys music, philosophy, and learning about new things over a cup of coffee.
Debora Fanin serves as a Senior Customer Solutions Manager at AWS for the Digital Native Business segment in Brazil. In this role, Debora manages customer transformations, creating cloud adoption strategies to support cost-effective, timely deployments. Her responsibilities include designing change management plans, guiding solution-focused decisions, and addressing potential risks to align with customer objectives. Debora's academic path includes a master's degree in Administration from FEI and certifications such as Amazon Solutions Architect Associate and Agile credentials. Her professional history spans IT and project management roles across diverse sectors, where she developed expertise in cloud technologies, data science, and customer relations.
Saurabh Trikande is a Senior Product Manager for Amazon Bedrock and Amazon SageMaker Inference. He is passionate about working with customers and partners, motivated by the goal of democratizing AI. He focuses on core challenges related to deploying complex AI applications, inference with multi-tenant models, cost optimizations, and making the deployment of generative AI models more accessible. In his spare time, Saurabh enjoys hiking, learning about innovative technologies, following TechCrunch, and spending time with his family.
Gopi Mudiyala is a Senior Technical Account Manager at AWS. He helps customers in the financial services industry with their operations in AWS. As a machine learning enthusiast, Gopi works to help customers succeed in their ML journey. In his spare time, he likes to play badminton, spend time with family, and travel.