{"id":3521,"date":"2025-06-14T07:28:30","date_gmt":"2025-06-14T07:28:30","guid":{"rendered":"https:\/\/techtrendfeed.com\/?p=3521"},"modified":"2025-06-14T07:28:30","modified_gmt":"2025-06-14T07:28:30","slug":"deploy-qwen-fashions-with-amazon-bedrock-customized-mannequin-import","status":"publish","type":"post","link":"https:\/\/techtrendfeed.com\/?p=3521","title":{"rendered":"Deploy Qwen fashions with Amazon Bedrock Customized Mannequin Import"},"content":{"rendered":"<p> <br \/>\n<\/p>\n<div id=\"\">\n<p>We\u2019re excited to announce that\u00a0<a rel=\"nofollow\" target=\"_blank\" href=\"https:\/\/aws.amazon.com\/bedrock\/custom-model-import\/\" target=\"_blank\" rel=\"noopener noreferrer\">Amazon Bedrock Customized Mannequin Import<\/a>\u00a0now helps\u00a0<a rel=\"nofollow\" target=\"_blank\" href=\"https:\/\/qwenlm.github.io\/blog\/qwen2.5\/\" target=\"_blank\" rel=\"noopener noreferrer\">Qwen<\/a> fashions.\u00a0Now you can import customized weights\u00a0for Qwen2, Qwen2_VL, and Qwen2_5_VL architectures, together with fashions like Qwen 2, 2.5 Coder, Qwen 2.5 VL, and QwQ 32B.\u00a0You possibly can deliver your personal custom-made Qwen fashions into Amazon Bedrock and deploy them in a completely managed, serverless atmosphere\u2014with out having to handle infrastructure or mannequin serving.<\/p>\n<p>On this put up, we cowl the right way to deploy Qwen 2.5 fashions with Amazon Bedrock Customized Mannequin Import, making them accessible to organizations wanting to make use of state-of-the-art AI capabilities throughout the AWS infrastructure at an efficient value.<\/p>\n<h2>Overview of Qwen fashions<\/h2>\n<p>Qwen 2 and a couple of.5 are households of enormous language fashions, out there in a variety of sizes and specialised variants to go well with numerous wants:<\/p>\n<ul>\n<li><strong>Common language fashions<\/strong>: Fashions starting from 0.5B to 72B parameters, with each base and instruct variations for general-purpose 
tasks<\/li>\n<li><strong>Qwen 2.5-Coder<\/strong>: Specialized for code generation and completion<\/li>\n<li><strong>Qwen 2.5-Math<\/strong>: Focused on advanced mathematical reasoning<\/li>\n<li><strong>Qwen 2.5-VL (vision-language)<\/strong>: Image and video processing capabilities, enabling multimodal applications<\/li>\n<\/ul>\n<h2>Overview of Amazon Bedrock Custom Model Import<\/h2>\n<p>Amazon Bedrock Custom Model Import enables the import and use of your customized models alongside existing foundation models (FMs) through a single serverless, unified API. You can access your imported custom models on demand and without the need to manage the underlying infrastructure. Accelerate your generative AI application development by integrating your supported custom models with native Amazon Bedrock tools and features like Amazon Bedrock Knowledge Bases, Amazon Bedrock Guardrails, and Amazon Bedrock Agents. Amazon Bedrock Custom Model Import is generally available in the US-East (N. Virginia), US-West (Oregon), and Europe (Frankfurt) AWS Regions. Now, we\u2019ll explore how you can use Qwen 2.5 models for two common use cases: as a coding assistant and for image understanding. Qwen2.5-Coder is a state-of-the-art code model, matching capabilities of proprietary models like GPT-4o. It supports over 90 programming languages and excels at code generation, debugging, and reasoning. Qwen 2.5-VL brings advanced multimodal capabilities. 
According to Qwen, Qwen 2.5-VL isn&#8217;t only proficient at recognizing objects such as flowers and animals, but also at analyzing charts, extracting text from images, interpreting document layouts, and processing long videos.<\/p>\n<h2>Prerequisites<\/h2>\n<p>Before importing the Qwen model with Amazon Bedrock Custom Model Import, make sure that you have the following in place:<\/p>\n<ol>\n<li>An active AWS account<\/li>\n<li>An <a rel=\"nofollow\" target=\"_blank\" href=\"http:\/\/aws.amazon.com\/s3\" target=\"_blank\" rel=\"noopener noreferrer\">Amazon Simple Storage Service (Amazon S3)<\/a> bucket to store the Qwen model files<\/li>\n<li><a rel=\"nofollow\" target=\"_blank\" href=\"https:\/\/docs.aws.amazon.com\/bedrock\/latest\/userguide\/model-import-iam-role.html\" target=\"_blank\" rel=\"noopener noreferrer\">Sufficient permissions<\/a> to create Amazon Bedrock model import jobs<\/li>\n<li>Verified that your <a rel=\"nofollow\" target=\"_blank\" href=\"https:\/\/docs.aws.amazon.com\/bedrock\/latest\/userguide\/model-customization-import-model.html\" target=\"_blank\" rel=\"noopener noreferrer\">Region supports Amazon Bedrock Custom Model Import<\/a><\/li>\n<\/ol>\n<h2>Use case 1: Qwen coding assistant<\/h2>\n<p>In this example, we&#8217;ll demonstrate how to build a coding assistant using the Qwen2.5-Coder-7B-Instruct model.<\/p>\n<ol>\n<li>Go to <a rel=\"nofollow\" target=\"_blank\" href=\"https:\/\/huggingface.co\/\" target=\"_blank\" rel=\"noopener noreferrer\">Hugging Face<\/a> and search for and copy the Model ID Qwen\/Qwen2.5-Coder-7B-Instruct:<\/li>\n<\/ol>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-108717\" style=\"margin: 10px 0px 10px 0px;border: 1px solid #CCCCCC\" src=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2025\/06\/13\/ML-18505-image-1.jpeg\" alt=\"\" width=\"748\" 
height=\"590\"\/><\/p>\n<p>You&#8217;ll use <code>Qwen\/Qwen2.5-Coder-7B-Instruct<\/code>\u00a0for the remainder of the walkthrough.\u00a0We don\u2019t exhibit fine-tuning steps, however you may also <a rel=\"nofollow\" target=\"_blank\" href=\"https:\/\/docs.aws.amazon.com\/sagemaker\/latest\/dg\/jumpstart-fine-tune.html\" target=\"_blank\" rel=\"noopener noreferrer\">fine-tune<\/a> earlier than importing.<\/p>\n<ol start=\"2\">\n<li>Use the next command to obtain a snapshot of the mannequin regionally. The Python library for Hugging Face supplies a utility referred to as snapshot obtain for this:<\/li>\n<\/ol>\n<div class=\"hide-language\">\n<pre><code class=\"lang-python\">from huggingface_hub import snapshot_download\n\nsnapshot_download(repo_id=\"\u00a0Qwen\/Qwen2.5-Coder-7B-Instruct\", \n\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 local_dir=f\".\/extractedmodel\/\")<\/code><\/pre>\n<\/p><\/div>\n<p>Relying in your mannequin dimension, this might take a couple of minutes. 
When completed, your Qwen Coder 7B model folder will contain the following files.<\/p>\n<ul>\n<li><strong>Configuration files<\/strong>: Including <code>config.json<\/code>, <code>generation_config.json<\/code>, <code>tokenizer_config.json<\/code>, <code>tokenizer.json<\/code>, and <code>vocab.json<\/code><\/li>\n<li><strong>Model files<\/strong>: Four <code>safetensors<\/code> files and <code>model.safetensors.index.json<\/code><\/li>\n<li><strong>Documentation<\/strong>: <code>LICENSE<\/code>, <code>README.md<\/code>, and <code>merges.txt<\/code><\/li>\n<\/ul>\n<p><img decoding=\"async\" loading=\"lazy\" class=\"alignnone size-full wp-image-108718\" style=\"margin: 10px 0px 10px 0px;border: 1px solid #CCCCCC\" src=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2025\/06\/13\/ML-18505-image-2.jpeg\" alt=\"\" width=\"287\" height=\"384\"\/><\/p>\n<ol start=\"3\">\n<li>Upload the model to Amazon S3, using <code>boto3<\/code>\u00a0or the command line:<\/li>\n<\/ol>\n<p><code>aws s3 cp .\/extractedmodel s3:\/\/yourbucket\/path\/ --recursive<\/code><\/p>\n<ol start=\"4\">\n<li>Start the model import job using the following API call:<\/li>\n<\/ol>\n<div class=\"hide-language\">\n<pre><code class=\"lang-python\">response = bedrock_client.create_model_import_job(\n\u00a0\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0jobName=\"uniquejobname\",\n\u00a0\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0importedModelName=\"uniquemodelname\",\n\u00a0\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0roleArn=\"fullrolearn\",\n\u00a0\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0modelDataSource={\n\u00a0\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0's3DataSource': {\n\u00a0\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0's3Uri': \"s3:\/\/yourbucket\/path\/\"\n\u00a0\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 
\u00a0 \u00a0 \u00a0}\n\u00a0\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0}\n\u00a0\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0)\n\u00a0\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0<\/code><\/pre>\n<\/p><\/div>\n<p>You can also do this using the AWS Management Console for Amazon Bedrock.<\/p>\n<ol start=\"5\">\n<li>In the Amazon Bedrock console, choose <strong>Imported models<\/strong> in the navigation pane.<\/li>\n<li>Choose <strong>Import a model<\/strong>.<\/li>\n<\/ol>\n<p><img decoding=\"async\" loading=\"lazy\" class=\"alignnone size-full wp-image-108719\" style=\"margin: 10px 0px 10px 0px;border: 1px solid #CCCCCC\" src=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2025\/06\/13\/ML-18505-image-3.jpeg\" alt=\"\" width=\"774\" height=\"554\"\/><\/p>\n<ol start=\"7\">\n<li>Enter the details, including a <strong>Model name<\/strong>, <strong>Import job name<\/strong>, and model <strong>S3 location<\/strong>.<\/li>\n<\/ol>\n<p><img decoding=\"async\" class=\"alignnone size-full wp-image-108720\" src=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2025\/06\/13\/ML-18505-image-4.jpeg\" alt=\"\" height=\"1008\"\/><\/p>\n<ol start=\"8\">\n<li>Create a new service role or use an existing service role. 
Then choose <strong>Import model<\/strong>.<\/li>\n<\/ol>\n<p><img decoding=\"async\" loading=\"lazy\" class=\"alignnone size-full wp-image-108721\" style=\"margin: 10px 0px 10px 0px;border: 1px solid #CCCCCC\" src=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2025\/06\/13\/ML-18505-image-5.jpeg\" alt=\"\" width=\"733\" height=\"250\"\/><\/p>\n<ol start=\"9\">\n<li>After you choose <strong>Import<\/strong> on the console, you should see the status as Importing while the model is being imported:<\/li>\n<\/ol>\n<p><img decoding=\"async\" loading=\"lazy\" class=\"alignnone size-full wp-image-108722\" style=\"margin: 10px 0px 10px 0px;border: 1px solid #CCCCCC\" src=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2025\/06\/13\/ML-18505-image-6.jpeg\" alt=\"\" width=\"1338\" height=\"252\"\/><\/p>\n<p><img decoding=\"async\" loading=\"lazy\" class=\"alignnone size-full wp-image-108723\" style=\"margin: 10px 0px 10px 0px;border: 1px solid #CCCCCC\" src=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2025\/06\/13\/ML-18505-image-7.jpeg\" alt=\"\" width=\"1057\" height=\"338\"\/><\/p>\n<p>If you\u2019re using your own role, make sure to add the required trust relationship as described in\u00a0<a rel=\"nofollow\" target=\"_blank\" href=\"https:\/\/docs.aws.amazon.com\/bedrock\/latest\/userguide\/model-import-iam-role.html\" target=\"_blank\" rel=\"noopener noreferrer\">Create a service role for model import<\/a>.<\/p>\n<p>After your model is imported, wait for model inference to be ready, and then chat with the model in the playground or through the API. In the following example, we append <code>Python<\/code> to prompt the model to directly output Python code to list items in an S3 bucket. Remember to use the right chat template to enter prompts in the format required. 
For example, you can get the right chat template for any compatible model on Hugging Face using the following code:<\/p>\n<div class=\"hide-language\">\n<pre><code class=\"lang-python\">from transformers import AutoTokenizer\ntokenizer = AutoTokenizer.from_pretrained(\"Qwen\/Qwen2.5-Coder-7B-Instruct\")\n\n# Instead of using model.chat(), we directly use model.generate()\n# But you need to use tokenizer.apply_chat_template() to format your inputs as shown below\nprompt = \"Write sample boto3 python code to list files in a bucket stored in the variable `my_bucket`\"\nmessages = [\n\u00a0\u00a0 \u00a0{\"role\": \"system\", \"content\": \"You are a helpful coding assistant.\"},\n\u00a0\u00a0 \u00a0{\"role\": \"user\", \"content\": prompt}\n]\ntext = tokenizer.apply_chat_template(\n\u00a0\u00a0 \u00a0messages,\n\u00a0\u00a0 \u00a0tokenize=False,\n\u00a0\u00a0 \u00a0add_generation_prompt=True\n)<\/code><\/pre>\n<\/p><\/div>\n<p>Note that when using the <code>invoke_model<\/code>\u00a0APIs, you must use the full Amazon Resource Name (ARN) for the imported model. 
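Putting the templated prompt and the model ARN together, one way to call the imported model is sketched below. This is a hedged example: the ARN and Region are placeholders, `build_invoke_body` is a hypothetical helper, and the request fields mirror the shape used elsewhere in this post, so adjust them to your deployment.

```python
import json

def build_invoke_body(prompt, temperature=0.3, max_gen_len=512, top_p=0.9):
    # Request body shape used with Custom Model Import in this post;
    # adapt the field names to match your imported model if they differ.
    return json.dumps({
        "prompt": prompt,
        "temperature": temperature,
        "max_gen_len": max_gen_len,
        "top_p": top_p,
    })

def invoke_imported_model(model_arn, prompt, region="us-east-1"):
    # boto3 is imported lazily so the body builder above works without AWS deps
    import boto3
    client = boto3.client("bedrock-runtime", region_name=region)
    response = client.invoke_model(
        modelId=model_arn,  # full ARN of the imported model
        body=build_invoke_body(prompt),
        accept="application/json",
        contentType="application/json",
    )
    return json.loads(response["body"].read())
```

Here `prompt` would be the chat-templated string produced by `tokenizer.apply_chat_template` in the snippet above.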
You can find the Model ARN in the Bedrock console by navigating to the Imported models section and then viewing the Model details page, as shown in the following figure.<\/p>\n<p><img decoding=\"async\" loading=\"lazy\" class=\"alignnone size-full wp-image-108724\" style=\"margin: 10px 0px 10px 0px;border: 1px solid #CCCCCC\" src=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2025\/06\/13\/ML-18505-image-8.jpeg\" alt=\"\" width=\"452\" height=\"214\"\/><\/p>\n<p>After the model is ready for inference, you can use the Chat Playground in the Bedrock console or APIs to invoke the model.<\/p>\n<p><img decoding=\"async\" loading=\"lazy\" class=\"alignnone size-full wp-image-108725\" style=\"margin: 10px 0px 10px 0px;border: 1px solid #CCCCCC\" src=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2025\/06\/13\/ML-18505-image-9.jpeg\" alt=\"\" width=\"726\" height=\"551\"\/><\/p>\n<h2>Use case 2: Qwen 2.5 VL image understanding<\/h2>\n<p>Qwen2.5-VL-* offers multimodal capabilities, combining vision and language understanding in a single model. 
This section demonstrates how to deploy Qwen2.5-VL using Amazon Bedrock Custom Model Import and test its image understanding capabilities.<\/p>\n<h3>Import Qwen2.5-VL-7B to Amazon Bedrock<\/h3>\n<p>Download the model from Hugging Face and upload it to Amazon S3:<\/p>\n<div class=\"hide-language\">\n<pre><code class=\"lang-python\">import os\nfrom huggingface_hub import snapshot_download\n\nhf_model_id = \"Qwen\/Qwen2.5-VL-7B-Instruct\"\nlocal_directory = \"extractedmodel\"\n\n# Enable faster downloads\nos.environ[\"HF_HUB_ENABLE_HF_TRANSFER\"] = \"1\"\n\n# Download model locally\nsnapshot_download(repo_id=hf_model_id, local_dir=f\".\/{local_directory}\")<\/code><\/pre>\n<\/p><\/div>\n<p>Next, import the model to Amazon Bedrock (through either the console or the API):<\/p>\n<div class=\"hide-language\">\n<pre><code class=\"lang-python\">response = bedrock.create_model_import_job(\n\u00a0\u00a0 \u00a0jobName=job_name,\n\u00a0\u00a0 \u00a0importedModelName=imported_model_name,\n\u00a0\u00a0 \u00a0roleArn=role_arn,\n\u00a0\u00a0 \u00a0modelDataSource={\n\u00a0\u00a0 \u00a0 \u00a0 \u00a0's3DataSource': {\n\u00a0\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0's3Uri': s3_uri\n\u00a0\u00a0 \u00a0 \u00a0 \u00a0}\n\u00a0\u00a0 \u00a0}\n)<\/code><\/pre>\n<\/p><\/div>\n<h3>Test the vision capabilities<\/h3>\n<p>After the import is complete, test the model with an image input. 
The Qwen2.5-VL-* model requires proper formatting of multimodal inputs:<\/p>\n<div class=\"hide-language\">\n<pre><code class=\"lang-python\">def generate_vl(messages, image_base64, temperature=0.3, max_tokens=4096, top_p=0.9):\n\u00a0\u00a0 \u00a0processor = AutoProcessor.from_pretrained(\"Qwen\/Qwen2.5-VL-7B-Instruct\")\n\u00a0\u00a0 \u00a0prompt = processor.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)\n\u00a0\u00a0 \u00a0\n\u00a0\u00a0 \u00a0response = client.invoke_model(\n\u00a0\u00a0 \u00a0 \u00a0 \u00a0modelId=model_id,\n\u00a0\u00a0 \u00a0 \u00a0 \u00a0body=json.dumps({\n\u00a0\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0'prompt': prompt,\n\u00a0\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0'temperature': temperature,\n\u00a0\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0'max_gen_len': max_tokens,\n\u00a0\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0'top_p': top_p,\n\u00a0\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0'images': [image_base64]\n\u00a0\u00a0 \u00a0 \u00a0 \u00a0}),\n\u00a0\u00a0 \u00a0 \u00a0 \u00a0accept=\"application\/json\",\n\u00a0\u00a0 \u00a0 \u00a0 \u00a0contentType=\"application\/json\"\n\u00a0\u00a0 \u00a0)\n\u00a0\u00a0 \u00a0\n\u00a0\u00a0 \u00a0return json.loads(response['body'].read().decode('utf-8'))\n\n# Using the model with an image\nfile_path = \"cat_image.jpg\"\nbase64_data = image_to_base64(file_path)\n\nmessages = [\n\u00a0\u00a0 \u00a0{\n\u00a0\u00a0 \u00a0 \u00a0 \u00a0\"role\": \"user\",\n\u00a0\u00a0 \u00a0 \u00a0 \u00a0\"content\": [\n\u00a0\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0{\"image\": base64_data},\n\u00a0\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0{\"text\": \"Describe this image.\"}\n\u00a0\u00a0 \u00a0 \u00a0 \u00a0]\n\u00a0\u00a0 \u00a0}\n]\n\nresponse = generate_vl(messages, base64_data)\n\n# Print response\nprint(\"Model Response:\")\nif 'choices' in response:\n\u00a0\u00a0 \u00a0print(response['choices'][0]['text'])\nelif 'outputs' in response:\n\u00a0\u00a0 
\u00a0print(response['outputs'][0]['text'])\nelse:\n\u00a0\u00a0 \u00a0print(response)\n\u00a0\u00a0\u00a0\u00a0<\/code><\/pre>\n<\/p><\/div>\n<p>When provided with an example image of a cat (such as the following image), the model accurately describes key features such as the cat\u2019s position, fur color, eye color, and general appearance. This demonstrates the Qwen2.5-VL-* model\u2019s ability to process visual information and generate relevant text descriptions.<\/p>\n<p><img decoding=\"async\" loading=\"lazy\" class=\"alignnone size-full wp-image-108726\" style=\"margin: 10px 0px 10px 0px;border: 1px solid #CCCCCC\" src=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2025\/06\/13\/ML-18505-image-10.jpeg\" alt=\"\" width=\"2560\" height=\"1920\"\/><\/p>\n<p>The model\u2019s response:<\/p>\n<div class=\"hide-language\">\n<pre><code class=\"lang-python\">This image features a close-up of a cat lying down on a soft, textured surface, likely a couch or a bed. The cat has a tabby coat with a mix of dark and light brown fur, and its eyes are a striking green with vertical pupils, giving it a captivating look. The cat's whiskers are prominent and extend outward from its face, adding to the detailed texture of the image. The background is softly blurred, suggesting a cozy indoor setting with some furniture and possibly a window letting in natural light. The overall atmosphere of the image is warm and serene, highlighting the cat's relaxed and content demeanor. <\/code><\/pre>\n<\/p><\/div>\n<h2>Pricing<\/h2>\n<p>You can use Amazon Bedrock Custom Model Import to use your custom model weights within Amazon Bedrock for supported architectures, serving them alongside Amazon Bedrock hosted FMs in a fully managed way through On-Demand mode. Custom Model Import doesn\u2019t charge for model import. 
You are charged for inference based on two factors: the number of active model copies and their duration of activity. Billing occurs in 5-minute increments, starting from the first successful invocation of each model copy. The price per model copy per minute varies based on factors including architecture, context length, Region, and compute unit version, and is tiered by model copy size. The custom model units required for hosting depend on the model\u2019s architecture, parameter count, and context length. Amazon Bedrock automatically manages scaling based on your usage patterns. If there are no invocations for 5 minutes, it scales to zero and scales back up when needed, though this can involve cold-start latency of up to a minute. Additional copies are added if inference volume consistently exceeds single-copy concurrency limits. The maximum throughput and concurrency per copy is determined during import, based on factors such as input\/output token mix, hardware type, model size, architecture, and inference optimizations.<\/p>\n<p>For more information, see <a rel=\"nofollow\" target=\"_blank\" href=\"https:\/\/aws.amazon.com\/bedrock\/pricing\/\" target=\"_blank\" rel=\"noopener noreferrer\">Amazon Bedrock pricing<\/a>.<\/p>\n<h2>Clean up<\/h2>\n<p>To avoid ongoing charges after completing the experiments:<\/p>\n<ol>\n<li>Delete your imported Qwen models from Amazon Bedrock Custom Model Import using the console or the API.<\/li>\n<li>Optionally, delete the model files from your S3 bucket if you no longer need them.<\/li>\n<\/ol>\n<p>Remember that while Amazon Bedrock Custom Model Import doesn\u2019t charge for the import process itself, you&#8217;re billed for model inference usage and storage.<\/p>\n<h2>Conclusion<\/h2>\n<p>Amazon Bedrock Custom Model Import empowers organizations to 
use powerful publicly available models like Qwen 2.5, among others, while benefiting from enterprise-grade infrastructure. The serverless nature of Amazon Bedrock eliminates the complexity of managing model deployments and operations, allowing teams to focus on building applications rather than infrastructure. With features like auto scaling, pay-per-use pricing, and seamless integration with AWS services, Amazon Bedrock provides a production-ready environment for AI workloads. The combination of Qwen 2.5\u2019s advanced AI capabilities and Amazon Bedrock managed infrastructure offers an optimal balance of performance, cost, and operational efficiency. Organizations can start with smaller models and scale up as needed, while maintaining full control over their model deployments and benefiting from AWS security and compliance capabilities.<\/p>\n<p>For more information, refer to the <a rel=\"nofollow\" target=\"_blank\" href=\"https:\/\/docs.aws.amazon.com\/bedrock\/latest\/userguide\/model-customization-import-model.html\" target=\"_blank\" rel=\"noopener noreferrer\">Amazon Bedrock User Guide.<\/a><\/p>\n<hr\/>\n<h3>About the Authors<\/h3>\n<p style=\"clear: both\"><strong><img decoding=\"async\" loading=\"lazy\" class=\"size-full wp-image-108714 alignleft\" src=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2025\/06\/13\/Ajit.png\" alt=\"\" width=\"100\" height=\"116\"\/>Ajit Mahareddy<\/strong> is an experienced\u00a0Product and Go-To-Market (GTM) leader\u00a0with over 20 years of experience in\u00a0Product Management, Engineering, and Go-To-Market. Prior to his current role, Ajit led product management building\u00a0AI\/ML products\u00a0at leading technology companies, including Uber, Turing, and eHealth. 
He&#8217;s passionate about\u00a0advancing Generative AI technologies\u00a0and\u00a0driving real-world impact with Generative AI.<\/p>\n<p style=\"clear: both\"><img decoding=\"async\" loading=\"lazy\" class=\"size-full wp-image-64683 alignleft\" src=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2023\/10\/26\/Shreyas-Subramanian-100.jpg\" alt=\"\" width=\"100\" height=\"134\"\/><strong>Shreyas Subramanian<\/strong> is a Principal Data Scientist who helps customers use generative AI and deep learning to solve their business challenges with AWS services. Shreyas has a background in large-scale optimization and ML and in using ML and reinforcement learning to accelerate optimization tasks.<\/p>\n<p style=\"clear: both\"><strong><img decoding=\"async\" loading=\"lazy\" class=\"size-full wp-image-78760 alignleft\" src=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2024\/06\/16\/yanyan.png\" alt=\"\" width=\"100\" height=\"100\"\/>Yanyan Zhang<\/strong>\u00a0is a Senior Generative AI Data Scientist at Amazon Web Services, where she has been working on cutting-edge AI\/ML technologies as a Generative AI Specialist, helping customers use generative AI to achieve their desired outcomes. Yanyan graduated from Texas A&amp;M University with a PhD in Electrical Engineering. Outside of work, she loves traveling, working out, and exploring new things.<\/p>\n<p style=\"clear: both\"><strong><img decoding=\"async\" loading=\"lazy\" class=\"size-full wp-image-108715 alignleft\" src=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2025\/06\/13\/dharini.jpeg\" alt=\"\" width=\"100\" height=\"138\"\/>Dharinee Gupta<\/strong> is an Engineering Manager at AWS Bedrock, where she focuses on enabling customers to seamlessly utilize open source models through serverless offerings. 
Her team focuses on optimizing these models to deliver the best cost-performance balance for customers. Prior to her current role, she gained extensive experience in authentication and authorization systems at Amazon, developing secure access solutions for Amazon offerings. Dharinee is passionate about making advanced AI technologies accessible and efficient for AWS customers.<\/p>\n<p style=\"clear: both\"><img decoding=\"async\" loading=\"lazy\" class=\"size-full wp-image-108727 alignleft\" src=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2025\/06\/13\/ravi.png\" alt=\"\" width=\"100\" height=\"133\"\/><strong>Lokeshwaran Ravi<\/strong> is a Senior Deep Learning Compiler Engineer at AWS, specializing in ML optimization, model acceleration, and AI security. He focuses on improving efficiency, reducing costs, and building secure ecosystems to democratize AI technologies, making cutting-edge ML accessible and impactful across industries.<\/p>\n<p style=\"clear: both\"><img decoding=\"async\" loading=\"lazy\" class=\"size-full wp-image-46115 alignleft\" src=\"https:\/\/d2908q01vomqb2.cloudfront.net\/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59\/2022\/11\/15\/June-Won-.jpg\" alt=\"\" width=\"100\" height=\"133\"\/><a rel=\"nofollow\" target=\"_blank\" href=\"http:\/\/www.linkedin.com\/in\/june-jung-ho-won-a2872a2a\/\" target=\"_blank\" rel=\"noopener noreferrer\"><strong>June Won<\/strong> <\/a>is a Principal Product Manager with Amazon SageMaker JumpStart. He focuses on making foundation models easily discoverable and usable to help customers build generative AI applications. 
His experience at Amazon also includes mobile shopping applications and last mile delivery.<\/p>\n<p>       \n      <\/div>\n\n","protected":false},"excerpt":{"rendered":"<p>We\u2019re excited to announce that\u00a0Amazon Bedrock Custom Model Import\u00a0now supports\u00a0Qwen models.\u00a0You can now import custom weights\u00a0for Qwen2, Qwen2_VL, and Qwen2_5_VL architectures, including models like Qwen 2, 2.5 Coder, Qwen 2.5 VL, and QwQ 32B.\u00a0You can bring your own customized Qwen models into Amazon Bedrock and deploy them in a fully managed, serverless environment\u2014with [&hellip;]<\/p>\n","protected":false},"author":2,"featured_media":3523,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[55],"tags":[387,1289,2451,2309,3296,358,266,3295],"class_list":["post-3521","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-machine-learning","tag-amazon","tag-bedrock","tag-custom","tag-deploy","tag-import","tag-model","tag-models","tag-qwen"],"_links":{"self":[{"href":"https:\/\/techtrendfeed.com\/index.php?rest_route=\/wp\/v2\/posts\/3521","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/techtrendfeed.com\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/techtrendfeed.com\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/techtrendfeed.com\/index.php?rest_route=\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/techtrendfeed.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=3521"}],"version-history":[{"count":1,"href":"https:\/\/techtrendfeed.com\/index.php?rest_route=\/wp\/v2\/posts\/3521\/revisions"}],"predecessor-version":[{"id":3522,"href":"https:\/\/techtrendfeed.com\/index.php?rest_route=\/wp\/v2\/posts\/3521\/revisions\/3522"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/techtre
ndfeed.com\/index.php?rest_route=\/wp\/v2\/media\/3523"}],"wp:attachment":[{"href":"https:\/\/techtrendfeed.com\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=3521"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/techtrendfeed.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=3521"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/techtrendfeed.com\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=3521"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}