{"id":5175,"date":"2025-08-02T06:42:08","date_gmt":"2025-08-02T06:42:08","guid":{"rendered":"https:\/\/techtrendfeed.com\/?p=5175"},"modified":"2025-08-02T06:42:08","modified_gmt":"2025-08-02T06:42:08","slug":"debugging-and-tracing-llms-like-a-professional","status":"publish","type":"post","link":"https:\/\/techtrendfeed.com\/?p=5175","title":{"rendered":"Debugging and Tracing LLMs Like a Professional"},"content":{"rendered":"<p> <br \/>\n<\/p>\n<div id=\"post-\">\n<p>    <center><img decoding=\"async\" alt=\"Debugging and Tracing LLMs Like a Pro\" width=\"100%\" class=\"perfmatters-lazy\" src=\"https:\/\/www.kdnuggets.com\/wp-content\/uploads\/Phoenix-Tracing-And-Debugging-LLMs-like-a-Pro.png\"\/><br \/><span>Image by Author | Canva<\/span><\/center><br \/>\n\u00a0<\/p>\n<h2><span>#\u00a0<\/span>Introduction<\/h2>\n<p>\u00a0<br \/>Traditional debugging with <code style=\"background: #F5F5F5;\">print()<\/code> or logging works, but it\u2019s slow and clunky with LLMs. Phoenix provides a timeline view of every step, prompt and response inspection, error detection with retries, visibility into latency and costs, and a complete visual understanding of your app. Phoenix by Arize AI is a powerful open-source observability and tracing tool designed specifically for LLM applications. It helps you monitor, debug, and trace everything happening in your LLM pipelines visually. In this article, we\u2019ll walk through what Phoenix does and why it matters, how to integrate Phoenix with LangChain step by step, and how to visualize traces in the Phoenix UI.<\/p>\n<p>\u00a0<\/p>\n<h2><span>#\u00a0<\/span>What Is Phoenix?<\/h2>\n<p>\u00a0<br \/>Phoenix is an open-source observability and debugging tool made for large language model applications. 
It captures detailed telemetry data from your LLM workflows, including prompts, responses, latency, errors, and tool usage, and presents this information in an intuitive, interactive dashboard. Phoenix allows developers to understand in depth how their LLM pipelines behave, identify and debug issues with prompt outputs, analyze performance bottlenecks, monitor token usage and the associated costs, and trace any errors or retry logic during execution. It integrates with popular frameworks like LangChain and LlamaIndex, and also offers OpenTelemetry support for more customized setups.<\/p>\n<p>\u00a0<\/p>\n<h2><span>#\u00a0<\/span>Step-by-Step Setup<\/h2>\n<p>\u00a0<\/p>\n<h4><span>\/\/\u00a0<\/span>1. Installing Required Libraries<\/h4>\n<p>Make sure you have Python 3.8+ and install the dependencies:<\/p>\n<div style=\"width: 98%; overflow: auto; padding-left: 10px; padding-bottom: 10px; padding-top: 10px; background: #F5F5F5;\">\n<pre><code>pip install arize-phoenix langchain langchain-together openinference-instrumentation-langchain langchain-community<\/code><\/pre>\n<\/div>\n<p>\u00a0<\/p>\n<h4><span>\/\/\u00a0<\/span>2. Launching Phoenix<\/h4>\n<p>Add these lines to launch the Phoenix dashboard:<\/p>\n<div style=\"width: 98%; overflow: auto; padding-left: 10px; padding-bottom: 10px; padding-top: 10px; background: #F5F5F5;\">\n<pre><code>import phoenix as px\npx.launch_app()<\/code><\/pre>\n<\/div>\n<p>\u00a0<br \/>This starts a local dashboard at <a rel=\"nofollow\" target=\"_blank\" href=\"http:\/\/localhost:6006\" target=\"_blank\">http:\/\/localhost:6006<\/a>.<\/p>\n<p>\u00a0<\/p>\n<h4><span>\/\/\u00a0<\/span>3. Building the LangChain Pipeline with Phoenix Callback<\/h4>\n<p>Let\u2019s understand Phoenix through a use case. We&#8217;re building a simple LangChain-powered chatbot. 
Now, we want to:<\/p>\n<ul>\n<li>Debug whether the prompt is working<\/li>\n<li>Monitor how long the model takes to respond<\/li>\n<li>Track prompt structure, model usage, and outputs<\/li>\n<li>See all this visually instead of logging everything manually<\/li>\n<\/ul>\n<p>\u00a0<\/p>\n<h4><span>\/\/\u00a0<\/span>Step 1: Launch the Phoenix Dashboard in the Background<\/h4>\n<div style=\"width: 98%; overflow: auto; padding-left: 10px; padding-bottom: 10px; padding-top: 10px; background: #F5F5F5;\">\n<pre><code>import threading\nimport phoenix as px\n\n# Launch Phoenix app locally (access at http:\/\/localhost:6006)\ndef run_phoenix():\n    px.launch_app()\n\nthreading.Thread(target=run_phoenix, daemon=True).start()<\/code><\/pre>\n<\/div>\n<p>\u00a0<\/p>\n<h4><span>\/\/\u00a0<\/span>Step 2: Register Phoenix with OpenTelemetry &amp; Instrument LangChain<\/h4>\n<div style=\"width: 98%; overflow: auto; padding-left: 10px; padding-bottom: 10px; padding-top: 10px; background: #F5F5F5;\">\n<pre><code>from phoenix.otel import register\nfrom openinference.instrumentation.langchain import LangChainInstrumentor\n\n# Register OpenTelemetry tracer\ntracer_provider = register()\n\n# Instrument LangChain with Phoenix\nLangChainInstrumentor().instrument(tracer_provider=tracer_provider)<\/code><\/pre>\n<\/div>\n<p>\u00a0<\/p>\n<h4><span>\/\/\u00a0<\/span>Step 3: Initialize the LLM (Together API)<\/h4>\n<div style=\"width: 98%; overflow: auto; padding-left: 10px; padding-bottom: 10px; padding-top: 10px; background: #F5F5F5;\">\n<pre><code>from langchain_together import Together\n\nllm = Together(\n    model=\"meta-llama\/Llama-3-8b-chat-hf\",\n    temperature=0.7,\n    max_tokens=256,\n    together_api_key=\"your-api-key\",  # Replace with your actual API key\n)<\/code><\/pre>\n<\/div>\n<p>\u00a0<br \/>Please 
don\u2019t forget to replace \u201cyour-api-key\u201d with your actual together.ai API key. You can get it using this <a rel=\"nofollow\" target=\"_blank\" href=\"https:\/\/api.together.xyz\" target=\"_blank\">link<\/a>.<\/p>\n<p>\u00a0<\/p>\n<h4><span>\/\/\u00a0<\/span>Step 4: Define the Prompt Template<\/h4>\n<div style=\"width: 98%; overflow: auto; padding-left: 10px; padding-bottom: 10px; padding-top: 10px; background: #F5F5F5;\">\n<pre><code>from langchain.prompts import ChatPromptTemplate\n\nprompt = ChatPromptTemplate.from_messages([\n    (\"system\", \"You are a helpful assistant.\"),\n    (\"human\", \"{question}\"),\n])<\/code><\/pre>\n<\/div>\n<p>\u00a0<\/p>\n<h4><span>\/\/\u00a0<\/span>Step 5: Combine Prompt and Model into a Chain<\/h4>\n<p>A minimal way to do this, assuming LangChain\u2019s pipe composition, is <code>chain = prompt | llm<\/code>; it is included at the top of the Step 6 code so that the script runs as written.<\/p>\n<p>\u00a0<\/p>\n<h4><span>\/\/\u00a0<\/span>Step 6: Ask Multiple Questions and Print Responses<\/h4>\n<div style=\"width: 98%; overflow: auto; padding-left: 10px; padding-bottom: 10px; padding-top: 10px; background: #F5F5F5;\">\n<pre><code># Step 5: pipe the prompt into the model to form the chain\n# (assumed composition; the chain must exist before it is invoked below)\nchain = prompt | llm\n\nquestions = [\n    \"What is the capital of France?\",\n    \"Who discovered gravity?\",\n    \"Give me a motivational quote about perseverance.\",\n    \"Explain photosynthesis in one sentence.\",\n    \"What is the speed of light?\",\n]\n\nprint(\"Phoenix running at http:\/\/localhost:6006\\n\")\n\nfor q in questions:\n    print(f\" Question: {q}\")\n    # The key must match the {question} placeholder in the prompt template\n    response = chain.invoke({\"question\": q})\n    print(\" Answer:\", response, \"\\n\")<\/code><\/pre>\n<\/div>\n<p>\u00a0<\/p>\n<h4><span>\/\/\u00a0<\/span>Step 7: Keep the App Alive for Monitoring<\/h4>\n<div style=\"width: 98%; overflow: auto; padding-left: 10px; padding-bottom: 10px; padding-top: 10px; background: #F5F5F5;\">\n<pre><code>import time\n\ntry:\n    while True:\n        time.sleep(1)  # Idle without spinning the CPU\nexcept KeyboardInterrupt:\n    print(\" 
Exiting.\")<\/code><\/pre>\n<\/div>\n<p>\u00a0<\/p>\n<h2><span>#\u00a0<\/span>Understanding Phoenix Traces &amp; Metrics<\/h2>\n<p>\u00a0<br \/>Before looking at the output, we should first understand Phoenix metrics, which starts with understanding what traces and spans are:<br \/><strong>Trace:<\/strong> Each trace represents one full run of your LLM pipeline. For example, each question like \u201cWhat is the capital of France?\u201d generates a new trace.<br \/><strong>Spans:<\/strong> Each trace is composed of multiple spans, each representing a stage in your chain:<\/p>\n<ul>\n<li>ChatPromptTemplate.format: Prompt formatting<\/li>\n<li>TogetherLLM.invoke: LLM call<\/li>\n<li>Any custom components you add<\/li>\n<\/ul>\n<p><strong>Metrics Shown per Trace<\/strong><br \/>\u00a0<\/p>\n<table style=\"width: 100%; border-collapse: collapse; font-family: Arial, sans-serif; font-size: 14px; color: #333;\">\n  <\/p>\n<thead>\n<tr style=\"background-color: #ffd29a;\">\n<th style=\"padding: 12px; border: 1px solid #ddd; text-align: left;\">Metric<\/th>\n<th style=\"padding: 12px; border: 1px solid #ddd; text-align: left;\">Meaning &amp; Importance<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<td style=\"padding: 12px; border: 1px solid #ddd;\">Latency (ms)<\/td>\n<td style=\"padding: 12px; border: 1px solid #ddd;\">\n        Measures the total time for the full LLM chain execution, including prompt formatting, LLM response, and post-processing. Helps identify performance bottlenecks and debug slow responses.\n      <\/td>\n<\/tr>\n<tr>\n<td style=\"padding: 12px; border: 1px solid #ddd;\">Input Tokens<\/td>\n<td style=\"padding: 12px; border: 1px solid #ddd;\">\n        Number of tokens sent to the model. 
Important for tracking input size and controlling API costs, since most usage is billed per token.\n      <\/td>\n<\/tr>\n<tr>\n<td style=\"padding: 12px; border: 1px solid #ddd;\">Output Tokens<\/td>\n<td style=\"padding: 12px; border: 1px solid #ddd;\">\n        Number of tokens generated by the model. Useful for understanding verbosity, response quality, and cost impact.\n      <\/td>\n<\/tr>\n<tr>\n<td style=\"padding: 12px; border: 1px solid #ddd;\">Prompt Template<\/td>\n<td style=\"padding: 12px; border: 1px solid #ddd;\">\n        Displays the full prompt with inserted variables. Helps confirm whether prompts are structured and filled in correctly.\n      <\/td>\n<\/tr>\n<tr>\n<td style=\"padding: 12px; border: 1px solid #ddd;\">Input \/ Output Text<\/td>\n<td style=\"padding: 12px; border: 1px solid #ddd;\">\n        Shows both the user input and the model\u2019s response. Useful for checking interaction quality and spotting hallucinations or incorrect answers.\n      <\/td>\n<\/tr>\n<tr>\n<td style=\"padding: 12px; border: 1px solid #ddd;\">Span Durations<\/td>\n<td style=\"padding: 12px; border: 1px solid #ddd;\">\n        Breaks down the time taken by each step (like prompt creation or model invocation). Helps identify performance bottlenecks within the chain.\n      <\/td>\n<\/tr>\n<tr>\n<td style=\"padding: 12px; border: 1px solid #ddd;\">Chain Name<\/td>\n<td style=\"padding: 12px; border: 1px solid #ddd;\">\n        Specifies which part of the pipeline a span belongs to (e.g., <code>prompt.format<\/code>, <code>TogetherLLM.invoke<\/code>). Helps isolate where issues are occurring.\n      <\/td>\n<\/tr>\n<tr>\n<td style=\"padding: 12px; border: 1px solid #ddd;\">Tags \/ Metadata<\/td>\n<td style=\"padding: 12px; border: 1px solid #ddd;\">\n        Additional information like model name, temperature, etc. 
Useful for filtering runs, comparing results, and analyzing parameter impact.\n      <\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>\u00a0<\/p>\n<p>Now visit <a rel=\"nofollow\" target=\"_blank\" href=\"http:\/\/localhost:6006\" target=\"_blank\">http:\/\/localhost:6006<\/a> to view the Phoenix dashboard. You will see something like:<br \/>\u00a0<br \/><img decoding=\"async\" alt=\"Phoenix dashboard\" width=\"100%\" class=\"perfmatters-lazy\" src=\"https:\/\/www.kdnuggets.com\/wp-content\/uploads\/image-1-1.png\"\/><br \/>\u00a0<br \/>Open the first trace to view its details.<br \/>\u00a0<br \/><img decoding=\"async\" alt=\"Phoenix first trace\" width=\"100%\" class=\"perfmatters-lazy\" src=\"https:\/\/www.kdnuggets.com\/wp-content\/uploads\/image-2-1.png\"\/><\/p>\n<p>\u00a0<\/p>\n<h2><span>#\u00a0<\/span>Wrapping Up<\/h2>\n<p>\u00a0<br \/>To wrap it up, Arize Phoenix makes it incredibly easy to debug, trace, and monitor your LLM applications. You don\u2019t have to guess what went wrong or dig through logs. Everything\u2019s right there: prompts, responses, timings, and more. It helps you spot issues, understand performance, and build better AI experiences with far less stress.<br \/>\u00a0<br \/>\u00a0<\/p>\n<p><b><a rel=\"nofollow\" target=\"_blank\" href=\"https:\/\/www.linkedin.com\/in\/kanwal-mehreen1\/\" rel=\"noopener\"><strong><a rel=\"nofollow\" target=\"_blank\" href=\"https:\/\/www.linkedin.com\/in\/kanwal-mehreen1\/\" target=\"_blank\" rel=\"noopener noreferrer\">Kanwal Mehreen<\/a><\/strong><\/a><\/b> is a machine learning engineer and a technical writer with a profound passion for data science and the intersection of AI with medicine. 
She co-authored the ebook &#8220;Maximizing Productivity with ChatGPT&#8221;. As a Google Generation Scholar 2022 for APAC, she champions diversity and academic excellence. She&#8217;s also recognized as a Teradata Diversity in Tech Scholar, Mitacs Globalink Research Scholar, and Harvard WeCode Scholar. Kanwal is an ardent advocate for change, having founded FEMCodes to empower women in STEM fields.<\/p>\n<\/p><\/div>\n\n","protected":false},"excerpt":{"rendered":"<p>Image by Author | Canva \u00a0 #\u00a0Introduction \u00a0Traditional debugging with print() or logging works, but it\u2019s slow and clunky with LLMs. Phoenix provides a timeline view of every step, prompt and response inspection, error detection with retries, visibility into latency and costs, and a complete visual understanding of your app. Phoenix by Arize AI is [&hellip;]<\/p>\n","protected":false},"author":2,"featured_media":5177,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[55],"tags":[3888,1112,401,4433],"class_list":["post-5175","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-machine-learning","tag-debugging","tag-llms","tag-pro","tag-tracing"],"_links":{"self":[{"href":"https:\/\/techtrendfeed.com\/index.php?rest_route=\/wp\/v2\/posts\/5175","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/techtrendfeed.com\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/techtrendfeed.com\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/techtrendfeed.com\/index.php?rest_route=\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/techtrendfeed.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=5175"}],"version-history":[{"count":1,"href":"https:\/\/techtrendfeed.com\/index.php?rest_route=\/wp\/v2\/posts\/5175\/revisions"}],"predecessor-version":[{"id":5176,"href":"https:\/
\/techtrendfeed.com\/index.php?rest_route=\/wp\/v2\/posts\/5175\/revisions\/5176"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/techtrendfeed.com\/index.php?rest_route=\/wp\/v2\/media\/5177"}],"wp:attachment":[{"href":"https:\/\/techtrendfeed.com\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=5175"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/techtrendfeed.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=5175"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/techtrendfeed.com\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=5175"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}