Name: Optimize and Accelerate Cloud AI Infrastructure with Autoscaling | 通过自动缩放优化和加速云AI基础设施 - Yuan Mo, Alibaba Cloud
Start: 2024-08-22T15:35:00+0800
End: 2024-08-22T16:10:00+0800

In-person
21-23 August, 2024
Learn More and Register to Attend

The Sched app allows you to build your schedule but is not a substitute for your event registration. You must be registered for KubeCon + CloudNativeCon + Open Source Summit + AI_Dev China 2024 to participate in the sessions. If you have not registered but would like to join us, please go to the event registration page to purchase a registration.

Please note: This schedule is automatically displayed in Hong Kong Standard Time (UTC +8). To see the schedule in your preferred timezone, please select from the drop-down menu to the right, above "Filter by Date." The schedule is subject to change and session seating is available on a first-come, first-served basis.

亲临现场

2024年8月21-23日

了解更多并注册参加

Sched应用程序允许您创建自己的日程安排，但不能替代您的活动注册。您必须注册参加KubeCon + CloudNativeCon + Open Source Summit + AI_Dev China 2024，才能参加会议。如果您尚未注册但希望加入我们，请访问活动注册页面购买注册。

请注意：本日程自动显示为香港标准时间（UTC +8）。要查看您偏好的时区的日程，请从右侧“按日期筛选”上方的下拉菜单中选择。日程可能会有变动，会议席位先到先得。

Thursday August 22, 2024 3:35pm - 4:10pm HKT

Level 1 | Hung Hom Room 7

With the rise of generative AI technology, more and more applications are starting to integrate with the capabilities of generative AI. However, the high costs of training and inference can be daunting for developers. In this talk, we will discuss the issues and solutions that need additional consideration when using elastic scaling in generative AI scenarios, including: ● How to enhance the elastic startup efficiency of generative AI ● How to address the efficiency of inference when separating compute and storage in generative AI ● How to reduce the costs of training and inference ● How to solve the interruption problem in AI training scenarios using Spot instances ● How to address the issue of capacity elasticity in LLM scenarios Finally, we will introduce the practical experience of the world's leading generative AI service provider: HaiYi (seaart.ai), allowing more developers to understand the architectural methods of elastic cloud AI infrastructure.

随着生成式人工智能技术的兴起，越来越多的应用程序开始与生成式人工智能的能力集成。然而，训练和推理的高成本可能会让开发人员望而却步。在这次演讲中，我们将讨论在生成式人工智能场景中使用弹性扩展时需要额外考虑的问题和解决方案，包括： ● 如何提高生成式人工智能的弹性启动效率 ● 如何在生成式人工智能中分离计算和存储时解决推理效率的问题 ● 如何降低训练和推理的成本 ● 如何使用Spot实例解决AI训练场景中的中断问题 ● 如何解决LLM场景中的容量弹性问题最后，我们将介绍世界领先的生成式人工智能服务提供商海艺（seaart.ai）的实际经验，让更多开发人员了解弹性云AI基础设施的架构方法。

Speakers

Yuan Mo

Staff Engineer, Alibaba Cloud

Senior technical expert at Alibaba Cloud, the maintainer of the Kubernetes elastic component autoscaler, the founder of the cloud-native gaming community and OpenKruiseGame, and has given several talks at kubecon before. Focus on the cloud-native transformation of the gaming industry... Read More →

Thursday August 22, 2024 3:35pm - 4:10pm HKT
Level 1 | Hung Hom Room 7

KubeCon + CloudNativeCon Sessions, Platform Engineering

Experience Level | 内容经验水平 高级 (Advanced)
Language | 语言 英语 (English)

KubeCon + CloudNativeCon + Open Source Summit + AI_dev China 2024

Yuan Mo

Sign up or log in to save this to your schedule, view media, leave feedback and see who's attending!