Seminar
AI-Accelerated HW/SW Systems in the Era of Large Language Models and On-Device AI
  • 2024-05-24


Topic: AI-Accelerated HW/SW Systems in the Era of Large Language Models and On-Device AI

Speaker: Prof. Jongse Park (KAIST)

Date & Time: Monday, May 27, 4:00 PM

Venue: K304

Abstract

The advent of Generative AI (GenAI), exemplified by Large Language Models (LLMs), is rapidly changing the way our society works, and the IT industry is actively deploying GenAI in real-world applications. However, GenAI inference requires a massive amount of computing and memory resources, necessitating cloud-based inference serving systems to amortize operational costs. There is therefore an urgent need for efficient, cost-effective, and scalable LLM inference serving systems to unlock the full potential of these remarkable algorithmic advances. Separately, the IT industry is also turning its attention toward on-device systems, including autonomous systems (e.g., vehicles and robots) and AR/VR devices. Building efficient on-device AI systems poses a different set of requirements and constraints than cloud-based systems, calling for aggressive research efforts in this domain. To this end, I will introduce two research projects: (1) NeuPIMs, a Processing-in-Memory (PIM) based accelerator for LLM inference, and (2) DaCapo, an on-device continuous learning accelerator for autonomous systems. Both works aim to build hardware-software co-designed acceleration solutions, which will be discussed in depth in this talk.


Host: Prof. Suk-Ju Kang's research lab