Name: Write Once Run Anywhere, but for GPUs | GPU 时代的“一次编写，到处运行” - Michael Yuan, Second State
Start: 2024-08-23T13:20:00+0800
End: 2024-08-23T13:55:00+0800

In-person
21-23 August, 2024

Friday August 23, 2024 1:20pm - 1:55pm HKT

Level 1 | Hung Hom Room 3

As LLM applications grow in popularity, demand is increasing for running and scaling AI workloads both in the cloud and on edge devices. Rust and Wasm offer a solution by providing a portable bytecode that abstracts away hardware complexity. LlamaEdge is a lightweight, high-performance, cross-platform LLM inference runtime. Written in Rust and built on WasmEdge, it gives developers a standard WASI-NN API: they write against that API and compile to Wasm. The resulting Wasm file can run on any device, where WasmEdge translates and routes Wasm calls to underlying native libraries such as llama.cpp. This talk will discuss the design and implementation of LlamaEdge and show how it enables cross-platform LLM app development and deployment. We will also walk through several code examples, ranging from a basic sentence completion app, to a chatbot, to a RAG agent app with external knowledge in vector databases, to a Kubernetes-managed app running across a heterogeneous cluster.
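
The pattern the abstract describes is application code written once against a single inference interface, with the runtime routing calls to whatever native backend the host provides. It can be sketched in plain Rust. This is an illustration only: `InferenceBackend`, `CpuBackend`, `GpuBackend`, `Device`, `select_backend`, and `sentence_completion` are hypothetical names invented for this sketch and are not part of the WASI-NN, WasmEdge, or LlamaEdge APIs. The backends return canned strings rather than running a model.

```rust
// Illustrative sketch only. Every type and function here is hypothetical and
// is NOT the WASI-NN, WasmEdge, or LlamaEdge API. It models one portable
// interface served by interchangeable, device-specific backends.

/// The portable interface that application code is written against.
trait InferenceBackend {
    /// Short label identifying the backend.
    fn name(&self) -> &'static str;
    /// Pretend to complete a prompt; a real backend would run a model here.
    fn complete(&self, prompt: &str) -> String;
}

/// Stand-in for a CPU-only native library.
struct CpuBackend;

/// Stand-in for a GPU-accelerated native library.
struct GpuBackend;

impl InferenceBackend for CpuBackend {
    fn name(&self) -> &'static str {
        "cpu"
    }
    fn complete(&self, prompt: &str) -> String {
        format!("[{}] {} ...", self.name(), prompt)
    }
}

impl InferenceBackend for GpuBackend {
    fn name(&self) -> &'static str {
        "gpu"
    }
    fn complete(&self, prompt: &str) -> String {
        format!("[{}] {} ...", self.name(), prompt)
    }
}

/// The hardware the host happens to offer (assumed to be detected elsewhere).
#[derive(Clone, Copy)]
enum Device {
    Cpu,
    Gpu,
}

/// Plays the role of the runtime: chooses the backend that matches the host.
fn select_backend(device: Device) -> Box<dyn InferenceBackend> {
    match device {
        Device::Cpu => Box::new(CpuBackend),
        Device::Gpu => Box::new(GpuBackend),
    }
}

/// Application code: written once, unaware of which backend serves the call.
fn sentence_completion(backend: &dyn InferenceBackend, prompt: &str) -> String {
    backend.complete(prompt)
}
```

In this sketch, `sentence_completion` never changes when the device does; only the value returned by `select_backend` differs. That separation is the property the abstract attributes to compiling against a standard API and letting the runtime handle the native libraries.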

Speakers

Michael Yuan

Product Manager, Second State

Dr. Michael Yuan is a maintainer of WasmEdge Runtime (a project under CNCF) and a co-founder of Second State. He is the author of 5 books on software engineering published by Addison-Wesley, Prentice-Hall, and O'Reilly. Michael is a long-time open-source developer and contributor...

AI_dev: Open Source GenAI & ML Summit Sessions, Foundations + Frameworks + Tools for Machine Learning

Experience Level: Any
Language: English

KubeCon + CloudNativeCon + Open Source Summit + AI_dev China 2024
