In-person
21-23 August, 2024

The Sched app allows you to build your schedule but is not a substitute for your event registration. You must be registered for KubeCon + CloudNativeCon + Open Source Summit + AI_Dev China 2024 to participate in the sessions. If you have not registered but would like to join us, please go to the event registration page to purchase a registration.

Please note: This schedule is automatically displayed in Hong Kong Standard Time (UTC +8). To see the schedule in your preferred timezone, please select from the drop-down menu to the right, above "Filter by Date." The schedule is subject to change and session seating is available on a first-come, first-served basis. 

Friday August 23, 2024 3:15pm - 3:50pm HKT
Seeking a way to ship LLMs more seamlessly? Is it way too complicated to manage, compose, and set up a runtime with Python, C++, CUDA, and GPUs when deploying LLMs? Tired of fighting dependencies, model sizes, and syncing deliverable model images across nodes? People often find it hard to bundle, distribute, deploy, and scale their own LLM workloads, but no worries: here is Ollama Operator, a scheduler and utilizer for LLM models powered by Modelfile, the format introduced by Ollama. You can now enjoy a unified, bundled runtime powered by llama.cpp with a few lines of CRD definition, or with a single command from the natively included kollama CLI; bundling, distributing, deploying, and scaling LLMs has never been accomplished so easily and seamlessly across OSes and environments. Let's dive in and find out what Ollama Operator with Ollama can do to deploy our own large language models, and how we can combine these features with Modelfile and bring them into the Kubernetes world!
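For a sense of how few lines of CRD definition are involved, here is a minimal sketch of a Model resource, modeled on the examples in the Ollama Operator project README; the apiVersion, field names, and the phi model image are taken from those examples and may differ in newer releases:

    # Minimal Ollama Operator Model resource (sketch, per the project README)
    apiVersion: ollama.ayaka.io/v1
    kind: Model
    metadata:
      name: phi        # name of the Model resource in the cluster
    spec:
      image: phi       # Ollama model image for the operator to pull and serve

Applied with kubectl apply, this asks the operator to pull the phi model and stand up a llama.cpp-backed server for it; the bundled CLI aims at the same result in one line, along the lines of kollama deploy phi --expose.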

Speakers
Neko Ayaka

Software Engineer, DaoCloud
Cloud native developer, AI researcher, and Gopher with 5 years of experience across AI, data science, backend, and frontend development. Co-founder of https://github.com/nolebase
Level 1 | Hung Hom Room 2
KubeCon + CloudNativeCon Sessions, AI + ML
