
AI PCs and architectures

Jim Hsiao
DIGITIMES Research observes that on-device inference of large-scale AI models is determined not only by the computing performance of the xPU, but also by model compression and memory bandwidth, all of which affect the inference performance of AI PCs.
Abstract

As Microsoft, Meta, and others actively launch lightweight AI models, and notebook processor vendors introduce system architectures and designs that enhance AI computing performance, AI PCs launching in 2024 will be able to execute multiple generative AI tasks offline.

The original versions of large language models (LLM) or large vision models (LVM) cannot run on notebooks because of their sheer size and enormous computing requirements. Through compression techniques such as model pruning and knowledge distillation, AI models with tens of billions of parameters can be shrunk to roughly one-tenth of their original size. Quantizing the parameters, for example storing weights as 8-bit or 4-bit integers instead of 32-bit floats, compresses the model by a further factor of four or eight, making large models usable on notebooks while retaining acceptable accuracy.
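To make the quantization factor concrete, below is a minimal sketch of post-training symmetric INT8 weight quantization in NumPy. The function names and the per-tensor scaling scheme are illustrative assumptions, not the specific method used in any vendor's toolchain; production quantizers typically work per-channel or per-group and may use 4-bit formats for the 8x case.

```python
import numpy as np

def quantize_int8(weights: np.ndarray):
    """Symmetric per-tensor INT8 quantization: FP32 weights become int8 values
    plus one FP32 scale, cutting weight storage roughly 4x versus FP32."""
    scale = np.abs(weights).max() / 127.0               # largest magnitude maps to 127
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover an FP32 approximation of the original weights."""
    return q.astype(np.float32) * scale

# Toy usage: quantize a random weight matrix and check size and error.
w = np.random.randn(1024, 1024).astype(np.float32)
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)
print("bytes fp32:", w.nbytes, "bytes int8:", q.nbytes)   # ~4x smaller
print("mean abs error:", float(np.abs(w - w_hat).mean()))
```

The same idea with 4-bit integer formats (two weights packed per byte) yields the roughly 8x reduction mentioned above, at the cost of more quantization error.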

The general matrix multiplication (GEMM) and general matrix-vector multiplication (GEMV) kernels, which account for most of the computation in large-scale AI models, are compute-bound and memory-bound respectively, as the rough arithmetic-intensity estimate below illustrates.
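A simple way to see the distinction is arithmetic intensity: FLOPs performed per byte of data moved. The sketch below assumes square matrices of size n and 2-byte (FP16) operands, and counts only one read/write per operand; these simplifying assumptions are mine, not figures from the report.

```python
# Rough arithmetic-intensity estimate (FLOPs per byte moved) for GEMM vs GEMV.

def gemm_intensity(n: int, bytes_per_elem: int = 2) -> float:
    flops = 2 * n**3                          # one multiply-add per inner-product term
    traffic = 3 * n**2 * bytes_per_elem       # read A and B, write C once each
    return flops / traffic

def gemv_intensity(n: int, bytes_per_elem: int = 2) -> float:
    flops = 2 * n**2
    traffic = (n**2 + 2 * n) * bytes_per_elem  # the matrix dominates data traffic
    return flops / traffic

for n in (1024, 4096):
    print(n, round(gemm_intensity(n), 1), round(gemv_intensity(n), 2))
```

GEMM intensity grows with n (hundreds of FLOPs per byte, so the xPU's compute throughput is the limit), while GEMV stays near 1 FLOP per byte regardless of n. Token-by-token LLM decoding is dominated by GEMV-like operations, which is why memory bandwidth, alongside model compression, governs on-device inference performance.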

Download full report (subscription required)

Published: June 11, 2024
