解剖Tyr：Linux首个Rust GPU驱动的代码实战分析

2025年9月，Linux内核合并了首个Rust GPU驱动Tyr（commit cf4fd52e3236），标志着Rust在内核图形子系统的正式落地。本文通过剖析Tyr的实际代码，展示Rust GPU驱动的架构设计、DRM抽象层的具体实现，以及从Panthor（C）移植到Tyr（Rust）的关键挑战。这是Rust在Linux内核从抽象到实战的完整技术案例。

引言：从理论到代码

在前两篇文章中，我们分析了Rust在Linux内核的整体状态和ABI稳定性¹²。这些讨论主要停留在宏观层面：代码统计、政策争议、技术保证。但实际的Rust内核代码长什么样？如何与C内核交互？遇到了哪些具体挑战？

本文通过解剖Tyr项目——Linux内核首个合并的Rust GPU驱动——来回答这些问题。我们将：

分析实际代码：基于commit cf4fd52e3236的真实代码
对比C/Rust实现：Panthor（C）vs Tyr（Rust）
揭示技术挑战：为何上游代码如此精简？
理解DRM抽象层：rust/kernel/drm/如何工作？

这不是一篇科普文章，而是代码级的技术剖析。

背景知识：GPU驱动与DRM子系统

GPU驱动的双层架构

在Linux中，GPU驱动分为两个部分：

1. 内核模式驱动（Kernel-mode Driver）

位置：Linux内核的drivers/gpu/drm/目录
职责：
- 管理GPU硬件资源
- 提供内存分配和映射
- 处理多进程的GPU访问调度
- 电源管理和故障恢复
Tyr就是内核模式驱动

2. 用户模式驱动（Userspace Driver）

典型代表：Mesa（实现OpenGL/Vulkan）
职责：
- 实现图形API（OpenGL、Vulkan等）
- 将API调用翻译为GPU命令
- 着色器编译
通过ioctl与内核驱动通信

┌─────────────────────────────┐
│   游戏/应用程序              │
└──────────┬──────────────────┘
           │ OpenGL/Vulkan API
           ↓
┌─────────────────────────────┐
│   Mesa (用户模式驱动)        │
│   - panfrost_dri.so (Panthor)│
└──────────┬──────────────────┘
           │ ioctl系统调用
           ↓
┌─────────────────────────────┐
│   Tyr (内核模式驱动)         │ ← 本文重点
│   drivers/gpu/drm/tyr/      │
└──────────┬──────────────────┘
           │ 硬件寄存器操作
           ↓
┌─────────────────────────────┐
│   Mali GPU 硬件              │
└─────────────────────────────┘

什么是DRM子系统？

DRM（Direct Rendering Manager） 是Linux内核的图形子系统，管理所有GPU驱动。

核心组件：

DRM Core（drivers/gpu/drm/drm_*.c）
- 提供通用GPU管理框架
- 处理显示模式设置（KMS）
- 管理图形内存（GEM）
GEM（Graphics Execution Manager）
- GPU内存对象管理
- 处理CPU/GPU内存共享
- 管理用户空间映射（mmap）
GPUVM（GPU Virtual Address Management）
- GPU虚拟地址空间管理
- 类似CPU的虚拟内存
- 支持多进程GPU内存隔离
GPU调度器（drm_gpu_scheduler）
- 管理GPU任务队列
- 处理任务依赖关系
- 实现公平调度

学习资源：

DRM Internals Documentation - 官方内核文档
Linux Graphics Stack Overview - Bootlin培训材料
DRM/KMS Overview - Intel图形文档

ARM Mali GPU架构

Mali GPU家族：

架构	代表型号	特点	Tyr支持
Midgard	Mali-T760	早期架构	❌
Bifrost	Mali-G71, G52	引入四边形着色器	❌
Valhall	Mali-G77, G78	超标量引擎	✅
Valhall CSF	Mali-G610, G710	命令流前端	✅ Tyr目标

CSF（Command Stream Frontend）架构：

GPU固件（MCU）直接管理任务调度
驱动通过命令流与固件通信
减轻CPU负担，提高效率

Mali GPU硬件结构：

┌─────────────────────────────────────┐
│  MCU (Microcontroller Unit)        │
│  - Cortex-M7核心 @ GHz             │
│  - 运行固件，管理GPU调度            │
└──────────┬──────────────────────────┘
           │ 内部总线
┌──────────┴──────────────────────────┐
│  Shader Cores (着色器核心)          │
│  - 执行计算/图形任务                │
│  - 多核并行（8-32核心不等）          │
└──────────┬──────────────────────────┘
           │
┌──────────┴──────────────────────────┐
│  L2 Cache + Memory System           │
│  - 共享L2缓存                       │
│  - MMU（内存管理单元）               │
└─────────────────────────────────────┘

MCU固件的关键作用：

任务调度：决定哪个任务在哪个核心执行
电源管理：动态开关核心和调节频率
故障恢复：检测和处理GPU挂起

学习资源：

ARM Mali GPU Datasheet - 官方技术文档
Panfrost Driver Documentation - Mesa的Mali开源驱动文档
Mali GPU Architecture - ARM官方博客

为什么要用Rust重写GPU驱动？

GPU驱动的复杂性：

海量内存操作：
- CPU/GPU共享内存
- 用户空间映射（mmap）
- DMA传输
- 常见bug：use-after-free、double-free
并发密集：
- 多进程同时访问GPU
- 中断处理
- 任务队列管理
- 常见bug：数据竞争、死锁
用户空间交互频繁：
- ioctl暴露大量攻击面
- 需要严格验证用户输入
- 常见bug：权限提升漏洞

历史数据（来自前文¹）：

Linux内核CVE中，约70%是内存安全问题
GPU驱动是CVE高发区

Rust的解决方案：

问题类别	C的困境	Rust的保证
内存安全	手动管理，易出错	所有权系统，编译时检查
并发安全	锁靠约定	借用检查器，编译时防数据竞争
资源泄漏	手动cleanup	RAII自动管理
空指针	运行时崩溃	`Option<T>`编译时消除

Greg Kroah-Hartman（内核维护者）的评价¹：

“The majority of bugs we have are due to the stupid little corner cases in C that are totally gone in Rust.”

Panthor vs Tyr：移植关系

Panthor是Mali CSF GPU的C驱动（已上游）：

位置：drivers/gpu/drm/panthor/
作者：Collabora工程师（Boris Brezillon等）
状态：生产就绪，功能完整

Tyr是Panthor的Rust移植：

目标：功能对等（feature parity）
策略：暴露相同的uAPI（用户空间API），兼容Mesa
当前状态：基础功能，依赖GPUVM等抽象完善

为什么不直接用Panthor？

技术演进：验证Rust在GPU驱动的可行性
安全提升：消除Panthor的潜在内存安全bug
生态建设：为其他GPU驱动提供Rust参考

快速入门：如何学习GPU驱动开发

前置知识

必备基础：

✅ C语言（指针、结构体、位操作）
✅ Linux系统编程（系统调用、设备驱动基础）
✅ 计算机体系结构（虚拟内存、DMA、中断）

Rust特有：

✅ 所有权和借用
✅ 生命周期
✅ unsafe Rust（FFI互操作）

学习路径（推荐顺序）

第1步：DRM基础（2-3周）

📚 DRM Driver Development Guide
💻 实践：编译并加载简单DRM驱动（vkms）
🎯 目标：理解GEM对象、ioctl处理流程

第2步：Rust内核编程（3-4周）

📚 Rust for Linux官方文档
📚 Kernel Module in Rust
💻 实践：编写简单的Rust platform驱动
🎯 目标：理解Pin, Opaque, #[pin_data]等内核特有概念

第3步：阅读现有代码（持续）

📖 rvkms（最简单的Rust DRM驱动）
📖 Nova（完整的Rust GPU驱动，Nvidia GSP）
📖 Tyr（本文重点）
📖 Asahi（Apple Silicon GPU，最成熟）

第4步：理解GPU硬件（按需）

📚 Mali GPU Architecture
📚 Panfrost Wiki（Mali开源驱动项目）
🎯 目标：理解着色器核心、MMU、MCU固件

关键资源汇总

官方文档：

Linux DRM Documentation - 内核DRM子系统文档
Rust for Linux - 官方项目网站
freedesktop.org DRM - 社区Wiki

代码仓库：

Linux Kernel
DRM Rust Tree - Rust DRM开发树
Mesa - 用户空间驱动

社区资源：

Rust for Linux邮件列表
DRM开发者IRC - #dri-devel频道
Collabora博客 - Tyr团队的技术博客

书籍推荐：

《Linux Device Drivers》（3rd Edition）- 经典驱动开发书籍
《Programming Rust》（2nd Edition）- Rust语言深入
《The Rust Reference》- Rust语言规范

从哪里开始贡献？

难度递增的任务：

⭐ 初级：
- 为Rust抽象添加文档注释
- 修复编译警告
- 添加单元测试
⭐⭐ 中级：
- 实现缺失的寄存器定义
- 添加新的GPU型号支持
- 改进错误处理
⭐⭐⭐ 高级：
- 开发GPUVM Rust抽象
- 实现GPU调度器
- 移植其他GPU驱动到Rust

如何参与：

订阅Rust for Linux邮件列表
在GitLab上关注DRM Rust项目
参与代码审查（学习最快的方式！）
从小patch开始提交

Tyr项目概览：第一手资料

Git Commit信息

提交哈希：cf4fd52e3236 作者：Daniel Almeida daniel.almeida@collabora.com 日期：2025年9月10日 合作方：Collabora、Arm、Google

Commit message核心摘录（原文）³：

Add a Rust driver for ARM Mali CSF-based GPUs. It is a port of Panthor and therefore exposes Panthor’s uAPI and name to userspace, and the product of a joint effort between Collabora, Arm and Google engineers.

The downstream code is capable of booting the MCU, doing sync VM_BINDS through the work-in-progress GPUVM abstraction and also doing (trivial) submits through Asahi’s drm_scheduler and dma_fence abstractions.

This first patch, however, only implements a subset of the current features available downstream, as the rest is not implementable without pulling in even more abstractions. In particular, a lot of things depend on properly mapping memory on a given VA range, which itself depends on the GPUVM abstraction that is currently work-in-progress. For this reason, we still cannot boot the MCU and thus, cannot do much for the moment.

关键信息解读

下游分支功能完整：
- ✅ MCU启动（Mali GPU的微控制器）
- ✅ 同步VM_BINDS（虚拟内存绑定）
- ✅ 基础任务提交
上游代码受限：
- ❌ 无法启动MCU
- ❌ GPUVM抽象缺失
- ❌ 只能查询GPU信息
战略转变：
- 之前尝试C+Rust混合（失败）
- 现在改为纯Rust，分阶段上游

Tyr代码结构：实际文件布局

代码树（基于commit cf4fd52e3236）

drivers/gpu/drm/tyr/
├── tyr.rs        # 模块入口，platform_driver声明
├── driver.rs     # 驱动核心，TyrDriver和TyrData实现
├── file.rs       # DRM file操作，处理用户空间连接
├── gem.rs        # GEM对象管理
├── gpu.rs        # GPU信息查询（GpuInfo结构体）
├── regs.rs       # GPU寄存器定义和访问
├── Kconfig       # 内核配置选项
└── Makefile      # 构建配置

对比Panthor（C驱动）：

$ cd /Users/weli/works/linux
$ ls drivers/gpu/drm/panthor/
panthor_devfreq.c  panthor_fw.c   panthor_gem.c  panthor_gpu.c
panthor_device.c   panthor_fw.h   panthor_gem.h  panthor_gpu.h
panthor_device.h   panthor_heap.c panthor_mmu.c  panthor_regs.h
...（共24个文件）

Tyr更精简：8个文件 vs Panthor的24个文件。但这并非优势，而是功能缺失的体现。

代码分析1：Tyr驱动入口

文件：`drivers/gpu/drm/tyr/tyr.rs`

// SPDX-License-Identifier: GPL-2.0 or MIT

//! Arm Mali Tyr DRM driver.
//!
//! The name "Tyr" is inspired by Norse mythology, reflecting Arm's tradition of
//! naming their GPUs after Nordic mythological figures and places.

use crate::driver::TyrDriver;

mod driver;
mod file;
mod gem;
mod gpu;
mod regs;

kernel::module_platform_driver! {
    type: TyrDriver,
    name: "tyr",
    authors: ["The Tyr driver authors"],
    description: "Arm Mali Tyr DRM driver",
    license: "Dual MIT/GPL",
}

关键点：

module_platform_driver! 宏：
- 自动生成平台驱动注册代码
- 等价于C中的module_platform_driver(tyr_driver)
模块组织：
- 清晰的模块划分（driver、file、gem、gpu、regs）
- 私有模块，不暴露内部细节

对比C版本（panthor_drv.c）：

static struct platform_driver panthor_driver = {
    .probe = panthor_probe,
    .remove = panthor_remove,
    .driver = {
        .name = "panthor",
        .pm = &panthor_pm_ops,
        .of_match_table = dt_match,
    },
};
module_platform_driver(panthor_driver);

Rust的优势：

类型安全：type: TyrDriver编译时检查
生命周期自动管理：probe/remove的资源管理通过RAII

代码分析2：驱动核心实现

文件：`drivers/gpu/drm/tyr/driver.rs`（部分）

2.1 设备树匹配

kernel::of_device_table!(
    OF_TABLE,
    MODULE_OF_TABLE,
    <TyrDriver as platform::Driver>::IdInfo,
    [
        (of::DeviceId::new(c_str!("rockchip,rk3588-mali")), ()),
        (of::DeviceId::new(c_str!("arm,mali-valhall-csf")), ())
    ]
);

解释：

支持Rockchip RK3588 SoC的Mali GPU
兼容ARM Mali Valhall CSF架构
c_str!宏：编译时C字符串，零运行时开销

对比C版本：

static const struct of_device_id dt_match[] = {
    { .compatible = "arm,mali-valhall-csf" },
    { .compatible = "rockchip,rk3588-mali" },
    {}
};
MODULE_DEVICE_TABLE(of, dt_match);

Rust的类型安全：

编译时检查字符串有效性
of::DeviceId::new确保格式正确

2.2 驱动数据结构

#[pin_data(PinnedDrop)]
pub(crate) struct TyrData {
    pub(crate) pdev: ARef<platform::Device>,

    #[pin]
    clks: Mutex<Clocks>,

    #[pin]
    regulators: Mutex<Regulators>,

    /// Some information on the GPU.
    ///
    /// This is mainly queried by userspace, i.e.: Mesa.
    pub(crate) gpu_info: GpuInfo,
}

关键设计：

#[pin_data] 属性：
- 保证内存不移动（pin到堆上）
- 必需，因为C代码可能持有指针
ARef<platform::Device>：
- 引用计数的平台设备
- 等价于C中的struct platform_device *
Mutex<Clocks> 和 Mutex<Regulators>：
- 内核互斥锁，保护共享资源
- #[pin]：这些字段不能移动

2.3 初始化流程（probe函数）

impl platform::Driver for TyrDriver {
    type IdInfo = ();
    const OF_ID_TABLE: Option<of::IdTable<Self::IdInfo>> = Some(&OF_TABLE);

    fn probe(
        pdev: &platform::Device<Core>,
        _info: Option<&Self::IdInfo>,
    ) -> Result<Pin<KBox<Self>>> {
        // 1. 获取时钟
        let core_clk = Clk::get(pdev.as_ref(), Some(c_str!("core")))?;
        let stacks_clk = OptionalClk::get(pdev.as_ref(), Some(c_str!("stacks")))?;
        let coregroup_clk = OptionalClk::get(pdev.as_ref(), Some(c_str!("coregroup")))?;

        // 2. 启用时钟
        core_clk.prepare_enable()?;
        stacks_clk.prepare_enable()?;
        coregroup_clk.prepare_enable()?;

        // 3. 获取并启用电源调节器
        let mali_regulator = Regulator::<regulator::Enabled>::get(pdev.as_ref(), c_str!("mali"))?;
        let sram_regulator = Regulator::<regulator::Enabled>::get(pdev.as_ref(), c_str!("sram"))?;

        // 4. 映射MMIO寄存器
        let request = pdev.io_request_by_index(0).ok_or(ENODEV)?;
        let iomem = Arc::pin_init(request.iomap_sized::<SZ_2M>(), GFP_KERNEL)?;

        // 5. 软复位GPU
        issue_soft_reset(pdev.as_ref(), &iomem)?;

        // 6. L2缓存上电
        gpu::l2_power_on(pdev.as_ref(), &iomem)?;

        // 7. 读取GPU信息
        let gpu_info = GpuInfo::new(pdev.as_ref(), &iomem)?;
        gpu_info.log(pdev);

        // 8. 创建DRM设备
        let data = try_pin_init!(TyrData {
            pdev: platform.clone(),
            clks <- new_mutex!(Clocks { ... }),
            regulators <- new_mutex!(Regulators { ... }),
            gpu_info,
        });

        let tdev: ARef<TyrDevice> = drm::Device::new(pdev.as_ref(), data)?;
        drm::driver::Registration::new_foreign_owned(&tdev, pdev.as_ref(), 0)?;

        // 9. 返回驱动实例
        let driver = KBox::pin_init(try_pin_init!(TyrDriver { device: tdev }), GFP_KERNEL)?;

        dev_info!(pdev.as_ref(), "Tyr initialized correctly.\n");
        Ok(driver)
    }
}

详细分析：

步骤1-2：时钟管理

Rust的Clk::get + prepare_enable自动管理生命周期：

let core_clk = Clk::get(pdev.as_ref(), Some(c_str!("core")))?;
core_clk.prepare_enable()?;
// 当core_clk离开作用域时，自动disable + unprepare

对比C版本：

core_clk = devm_clk_get(dev, "core");
if (IS_ERR(core_clk))
    return PTR_ERR(core_clk);

ret = clk_prepare_enable(core_clk);
if (ret)
    return ret;

// ...
// 忘记disable？内存泄漏！
// clk_disable_unprepare(core_clk);  // 必须手动

步骤3：电源调节器的类型状态

let mali_regulator = Regulator::<regulator::Enabled>::get(pdev.as_ref(), c_str!("mali"))?;

类型系统保证：

Regulator<Enabled>：类型上已启用
Regulator<Disabled>：类型上已禁用
编译时防止操作未启用的调节器

C中无此保证，完全依赖运行时检查。

步骤4：MMIO映射的大小检查

let iomem = Arc::pin_init(request.iomap_sized::<SZ_2M>(), GFP_KERNEL)?;

iomap_sized::<SZ_2M>()：编译时指定映射大小为2MB
SZ_2M是常量（kernel::sizes::SZ_2M），编译时检查

C版本：

iomem = devm_ioremap_resource(dev, res);
// 没有大小检查，运行时越界访问可能！

步骤5：软复位实现

fn issue_soft_reset(dev: &Device<Bound>, iomem: &Devres<IoMem>) -> Result {
    regs::GPU_CMD.write(dev, iomem, regs::GPU_CMD_SOFT_RESET)?;

    // TODO: We cannot poll, as there is no support in Rust currently, so we
    // sleep. Change this when read_poll_timeout() is implemented in Rust.
    kernel::time::delay::fsleep(time::Delta::from_millis(100));

    if regs::GPU_IRQ_RAWSTAT.read(dev, iomem)? & regs::GPU_IRQ_RAWSTAT_RESET_COMPLETED == 0 {
        dev_err!(dev, "GPU reset failed with errno\n");
        dev_err!(
            dev,
            "GPU_INT_RAWSTAT is {}\n",
            regs::GPU_IRQ_RAWSTAT.read(dev, iomem)?
        );

        return Err(EIO);
    }

    Ok(())
}

TODO注释揭示的问题：

Rust内核还没有read_poll_timeout()
被迫用固定延迟（100ms）替代轮询
这是基础设施缺失的直接体现

步骤7：GPU信息查询

这是当前Tyr唯一能做的事情。详见下一节。

代码分析3：GPU信息查询

文件：`drivers/gpu/drm/tyr/gpu.rs`

/// Struct containing information that can be queried by userspace. This is read from
/// the GPU's registers.
///
/// # Invariants
///
/// - The layout of this struct identical to the C `struct drm_panthor_gpu_info`.
#[repr(C)]
pub(crate) struct GpuInfo {
    pub(crate) gpu_id: u32,
    pub(crate) gpu_rev: u32,
    pub(crate) csf_id: u32,
    pub(crate) l2_features: u32,
    pub(crate) tiler_features: u32,
    pub(crate) mem_features: u32,
    pub(crate) mmu_features: u32,
    pub(crate) thread_features: u32,
    pub(crate) max_threads: u32,
    pub(crate) thread_max_workgroup_size: u32,
    pub(crate) thread_max_barrier_size: u32,
    pub(crate) coherency_features: u32,
    pub(crate) texture_features: [u32; 4],
    pub(crate) as_present: u32,
    pub(crate) pad0: u32,
    pub(crate) shader_present: u64,
    pub(crate) l2_present: u64,
    pub(crate) tiler_present: u64,
    pub(crate) core_features: u32,
    pub(crate) pad: u32,
}

关键设计：

#[repr(C)]：
- 保证与C结构体drm_panthor_gpu_info内存布局完全相同
- 用户空间通过ioctl读取这个结构体
Invariants注释：
- Rust文档化不变量
- 编译器无法检查（需要人工审查）

GpuInfo初始化

impl GpuInfo {
    pub(crate) fn new(dev: &Device<Bound>, iomem: &Devres<IoMem>) -> Result<Self> {
        let gpu_id = regs::GPU_ID.read(dev, iomem)?;
        let csf_id = regs::GPU_CSF_ID.read(dev, iomem)?;
        let gpu_rev = regs::GPU_REVID.read(dev, iomem)?;
        let core_features = regs::GPU_CORE_FEATURES.read(dev, iomem)?;
        let l2_features = regs::GPU_L2_FEATURES.read(dev, iomem)?;
        let tiler_features = regs::GPU_TILER_FEATURES.read(dev, iomem)?;
        let mem_features = regs::GPU_MEM_FEATURES.read(dev, iomem)?;
        let mmu_features = regs::GPU_MMU_FEATURES.read(dev, iomem)?;
        let thread_features = regs::GPU_THREAD_FEATURES.read(dev, iomem)?;
        let max_threads = regs::GPU_THREAD_MAX_THREADS.read(dev, iomem)?;
        let thread_max_workgroup_size = regs::GPU_THREAD_MAX_WORKGROUP_SIZE.read(dev, iomem)?;
        let thread_max_barrier_size = regs::GPU_THREAD_MAX_BARRIER_SIZE.read(dev, iomem)?;
        let coherency_features = regs::GPU_COHERENCY_FEATURES.read(dev, iomem)?;

        let texture_features = regs::GPU_TEXTURE_FEATURES0.read(dev, iomem)?;

        let as_present = regs::GPU_AS_PRESENT.read(dev, iomem)?;

        // 64位寄存器，分两次读取
        let shader_present = u64::from(regs::GPU_SHADER_PRESENT_LO.read(dev, iomem)?);
        let shader_present =
            shader_present | u64::from(regs::GPU_SHADER_PRESENT_HI.read(dev, iomem)?) << 32;

        let tiler_present = u64::from(regs::GPU_TILER_PRESENT_LO.read(dev, iomem)?);
        let tiler_present =
            tiler_present | u64::from(regs::GPU_TILER_PRESENT_HI.read(dev, iomem)?) << 32;

        let l2_present = u64::from(regs::GPU_L2_PRESENT_LO.read(dev, iomem)?);
        let l2_present = l2_present | u64::from(regs::GPU_L2_PRESENT_HI.read(dev, iomem)?) << 32;

        Ok(Self {
            gpu_id,
            gpu_rev,
            csf_id,
            l2_features,
            tiler_features,
            mem_features,
            mmu_features,
            thread_features,
            max_threads,
            thread_max_workgroup_size,
            thread_max_barrier_size,
            coherency_features,
            // TODO: Add texture_features_{1,2,3}.
            texture_features: [texture_features, 0, 0, 0],
            as_present,
            pad0: 0,
            shader_present,
            l2_present,
            tiler_present,
            core_features,
            pad: 0,
        })
    }
}

技术细节：

错误传播：
- 每次regs::XXX.read()?都可能失败
- ?运算符自动传播错误
- 无需手动if (ret < 0) return ret;
64位寄存器读取：
- Mali GPU的64位寄存器分成LO/HI两个32位寄存器
- Rust明确显示位运算：| u64::from(...) << 32
- C中容易出错（符号扩展问题）
TODO注释：
- texture_features只读取了第一个
- 其余3个硬编码为0
- 说明这是WIP（Work-in-Progress）

代码分析4：DRM抽象层

Tyr依赖rust/kernel/drm/提供的抽象层。让我们深入分析。

文件：`rust/kernel/drm/gem/mod.rs`

4.1 BaseDriverObject trait

/// GEM object functions, which must be implemented by drivers.
pub trait BaseDriverObject<T: BaseObject>: Sync + Send + Sized {
    /// Create a new driver data object for a GEM object of a given size.
    fn new(dev: &drm::Device<T::Driver>, size: usize) -> impl PinInit<Self, Error>;

    /// Open a new handle to an existing object, associated with a File.
    fn open(
        _obj: &<<T as IntoGEMObject>::Driver as drm::Driver>::Object,
        _file: &drm::File<<<T as IntoGEMObject>::Driver as drm::Driver>::File>,
    ) -> Result {
        Ok(())
    }

    /// Close a handle to an existing object, associated with a File.
    fn close(
        _obj: &<<T as IntoGEMObject>::Driver as drm::Driver>::Object,
        _file: &drm::File<<<T as IntoGEMObject>::Driver as drm::Driver>::File>,
    ) {
    }
}

设计解析：

PinInit<Self, Error>：
- 就地初始化（in-place init）
- 避免在栈上构造后移动到堆
- 关键：C指针可能指向这块内存
open/close回调：
- 默认实现为空
- 驱动可选择性覆盖
- 对比C：必须提供函数指针或NULL
类型约束：
- Sync + Send：可安全跨线程
- Sized：大小已知（非trait object）

4.2 引用计数机制

// SAFETY: All gem objects are refcounted.
unsafe impl<T: IntoGEMObject> AlwaysRefCounted for T {
    fn inc_ref(&self) {
        // SAFETY: The existence of a shared reference guarantees that the refcount is non-zero.
        unsafe { bindings::drm_gem_object_get(self.as_raw()) };
    }

    unsafe fn dec_ref(obj: NonNull<Self>) {
        // SAFETY: We either hold the only refcount on `obj`, or one of many - meaning that no one
        // else could possibly hold a mutable reference to `obj` and thus this immutable reference
        // is safe.
        let obj = unsafe { obj.as_ref() }.as_raw();

        // SAFETY:
        // - The safety requirements guarantee that the refcount is non-zero.
        // - We hold no references to `obj` now, making it safe for us to potentially deallocate it.
        unsafe { bindings::drm_gem_object_put(obj) };
    }
}

SAFETY注释的重要性：

inc_ref：
- 调用C函数drm_gem_object_get
- 假设：已有&self，所以refcount非零
- 这是不变量，违反=UB（未定义行为）
dec_ref：
- 详细的SAFETY论证：
  - 持有唯一或多个引用之一
  - 没有可变引用冲突
  - refcount非零（由调用者保证）
- 可能释放内存（refcount降到0）

对比C版本：

static inline void drm_gem_object_get(struct drm_gem_object *obj)
{
    kref_get(&obj->refcount);
}

static inline void drm_gem_object_put(struct drm_gem_object *obj)
{
    kref_put(&obj->refcount, drm_gem_object_free);
}

C中完全没有安全论证：

编译器不检查refcount一致性
开发者完全凭经验
常见bug：double-free、use-after-free

4.3 open/close回调的FFI桥接

extern "C" fn open_callback<T: BaseDriverObject<U>, U: BaseObject>(
    raw_obj: *mut bindings::drm_gem_object,
    raw_file: *mut bindings::drm_file,
) -> core::ffi::c_int {
    // SAFETY: `open_callback` is only ever called with a valid pointer to a `struct drm_file`.
    let file = unsafe {
        drm::File::<<<U as IntoGEMObject>::Driver as drm::Driver>::File>::as_ref(raw_file)
    };
    // SAFETY: `open_callback` is specified in the AllocOps structure for `Object<T>`, ensuring that
    // `raw_obj` is indeed contained within a `Object<T>`.
    let obj = unsafe {
        <<<U as IntoGEMObject>::Driver as drm::Driver>::Object as IntoGEMObject>::as_ref(raw_obj)
    };

    match T::open(obj, file) {
        Err(e) => e.to_errno(),
        Ok(()) => 0,
    }
}

FFI桥接技巧：

extern "C"：
- 使用C ABI（调用约定）
- C代码可以调用这个函数
unsafe转换：
- raw_obj和raw_file是C指针
- 转换为Rust引用需要unsafe
- SAFETY注释论证为何安全
错误处理：
- Rust的Result<T>转换为C的int
- Err(e) => e.to_errno()：错误码映射

这是Rust/C互操作的经典模式：

C kernel → extern "C" fn → unsafe转换 → 安全Rust trait方法 → Result → C错误码

代码分析5：Nova驱动对比

Nova是另一个Rust GPU驱动（Nvidia GSP），结构与Tyr类似。

文件：`drivers/gpu/drm/nova/driver.rs`（部分）

#[vtable]
impl drm::Driver for NovaDriver {
    type Data = NovaData;
    type File = File;
    type Object = gem::Object<NovaObject>;

    const INFO: drm::DriverInfo = INFO;

    kernel::declare_drm_ioctls! {
        (NOVA_GETPARAM, drm_nova_getparam, ioctl::RENDER_ALLOW, File::get_param),
        (NOVA_GEM_CREATE, drm_nova_gem_create, ioctl::AUTH | ioctl::RENDER_ALLOW, File::gem_create),
        (NOVA_GEM_INFO, drm_nova_gem_info, ioctl::AUTH | ioctl::RENDER_ALLOW, File::gem_info),
    }
}

declare_drm_ioctls!宏分析：

// 宏展开后（简化版）
const IOCTLS: &'static [drm::ioctl::DrmIoctlDescriptor] = &[
    drm::ioctl::DrmIoctlDescriptor {
        cmd: drm::ioctl::IOWR::<drm_nova_getparam>(DRM_COMMAND_BASE + 0),
        flags: ioctl::RENDER_ALLOW,
        func: nova_get_param_wrapper,  // 自动生成的C包装器
    },
    // ...
];

自动生成的工作：

计算ioctl号（_IOWR宏）
生成C→Rust的包装函数
类型安全检查（编译时）

对比C版本（手动）：

#define DRM_NOVA_GETPARAM 0x00
#define DRM_IOCTL_NOVA_GETPARAM \
    DRM_IOWR(DRM_COMMAND_BASE + DRM_NOVA_GETPARAM, struct drm_nova_getparam)

static const struct drm_ioctl_desc nova_ioctls[] = {
    DRM_IOCTL_DEF_DRV(NOVA_GETPARAM, nova_get_param, DRM_RENDER_ALLOW),
    // 魔数0x00容易重复或冲突
};

Rust的宏：

自动分配ioctl号（按顺序）
类型检查：drm_nova_getparam必须存在
编译时验证File::get_param签名

为何上游代码如此精简？GPUVM抽象缺失

回到最核心的问题：为何Tyr上游只能查询GPU信息，无法启动MCU？

Commit message的关键解释³：

In particular, a lot of things depend on properly mapping memory on a given VA range, which itself depends on the GPUVM abstraction that is currently work-in-progress. For this reason, we still cannot boot the MCU.

技术分解

启动MCU需要什么？

分配GPU内存：存放MCU固件（数百KB）
映射到GPU虚拟地址：MCU通过VA访问内存
配置MCU寄存器：设置入口地址
启动MCU：发送启动命令

当前Tyr能做什么？

✅ 步骤1：分配物理内存（通过GEM）
❌ 步骤2：映射到GPU VA（需要GPUVM抽象）
❌ 步骤3-4：后续全阻塞

GPUVM抽象是什么？

C实现（drivers/gpu/drm/drm_gpuvm.c）：

/**
 * DOC: Overview
 *
 * The GPU VA Manager, represented by struct drm_gpuvm, keeps track of a
 * GPU's virtual address (VA) space and manages the corresponding virtual
 * mappings represented by &drm_gpuva objects.
 *
 * The DRM GPUVM tracks GPU VA space with &drm_gpuva objects backed by a
 * &drm_gem_object representing the actual memory backing the VA range.
 */
struct drm_gpuvm {
    struct drm_gem_object *r_obj;
    struct drm_device *drm;
    const char *name;

    struct rb_root_cached rb;  // 红黑树，存储VA映射
    // ...
};

Rust需要什么？

// 理想的GPUVM Rust API（概念性）
pub struct GpuVm<T: drm::Driver> {
    inner: Opaque<bindings::drm_gpuvm>,
    _phantom: PhantomData<T>,
}

impl<T: drm::Driver> GpuVm<T> {
    /// 映射GEM对象到GPU虚拟地址
    pub fn map(
        &self,
        gem_obj: &gem::Object<...>,
        va: u64,
        size: usize,
    ) -> Result<GpuVa> {
        // 调用C的drm_gpuva_insert()
    }

    /// 取消映射
    pub fn unmap(&self, va: &GpuVa) -> Result {
        // 调用C的drm_gpuva_remove()
    }
}

问题：

drm_gpuvm结构体复杂
涉及红黑树、引用计数、锁
Rust封装需要保证内存安全和生命周期正确

Alice Ryhl的工作

根据新闻报道和commit message，Alice Ryhl正在开发GPUVM的Rust抽象，基于Asahi Lina的前期工作。

挑战：

生命周期管理：GEM对象和VA映射的关系
锁顺序：避免死锁（C代码有隐式锁顺序）
红黑树抽象：Rust需要安全的树操作

这是高难度的内核Rust工作，需要深入理解C实现和Rust所有权模型。

技术洞察：从Tyr学到的经验

1. 类型状态模式的威力

电源调节器示例：

pub struct Regulator<S: State> {
    inner: *mut bindings::regulator,
    _state: PhantomData<S>,
}

pub struct Enabled;
pub struct Disabled;

impl Regulator<Disabled> {
    pub fn enable(self) -> Result<Regulator<Enabled>> {
        // unsafe调用C API
        // 转换到Enabled状态
    }
}

impl Regulator<Enabled> {
    pub fn set_voltage(&self, min_uV: i32, max_uV: i32) -> Result {
        // 只有Enabled状态才能设置电压
    }

    pub fn disable(self) -> Result<Regulator<Disabled>> {
        // 转换回Disabled状态
    }
}

// 编译错误：Disabled状态没有set_voltage方法
let reg = Regulator::<Disabled>::get(...)?;
reg.set_voltage(1000000, 1000000)?;  // ❌ 编译失败！

// 正确用法
let reg = reg.enable()?;  // 转换到Enabled
reg.set_voltage(1000000, 1000000)?;  // ✅ 编译通过

优势：

编译时防止错误状态操作
零运行时开销：PhantomData<S>不占内存
自文档化：类型签名即文档

C中完全没有这种保证：

struct regulator *reg = regulator_get(...);
// 忘记enable
regulator_set_voltage(reg, 1000000, 1000000);  // 运行时错误或崩溃！

2. RAII消除资源泄漏

时钟管理示例：

{
    let clk = Clk::get(dev, Some(c_str!("core")))?;
    clk.prepare_enable()?;

    do_work()?;  // 即使这里失败提前返回

    // clk离开作用域，自动调用Drop
} // <- 这里自动disable+unprepare

Drop trait实现（简化）：

impl Drop for Clk {
    fn drop(&mut self) {
        unsafe {
            bindings::clk_disable_unprepare(self.inner);
        }
    }
}

C版本的问题：

ret = clk_prepare_enable(clk);
if (ret)
    return ret;

ret = do_work();
if (ret) {
    // 忘记cleanup！
    return ret;  // 时钟泄漏
}

clk_disable_unprepare(clk);  // 只有成功路径执行

统计数据（来自前文）：

内核CVE中，~70%是内存/资源管理错误
RAII在编译时消除这类错误

3. 错误传播的简洁性

Rust的?运算符：

fn initialize() -> Result {
    let clk = Clk::get(dev, Some(c_str!("core")))?;  // 失败则返回
    let reg = Regulator::get(dev, c_str!("mali"))?;  // 失败则返回
    let iomem = iomap()?;  // 失败则返回

    // 全部成功才继续
    Ok(())
}

C版本：

int initialize(void) {
    clk = clk_get(dev, "core");
    if (IS_ERR(clk)) {
        ret = PTR_ERR(clk);
        goto err_clk;
    }

    reg = regulator_get(dev, "mali");
    if (IS_ERR(reg)) {
        ret = PTR_ERR(reg);
        goto err_reg;
    }

    iomem = ioremap(...);
    if (!iomem) {
        ret = -ENOMEM;
        goto err_iomem;
    }

    return 0;

err_iomem:
    regulator_put(reg);
err_reg:
    clk_put(clk);
err_clk:
    return ret;
}

差异：

Rust：4行
C：25行（含错误处理）
Rust的RAII自动cleanup，无需goto

4. FFI安全边界的明确化

Tyr代码中，所有unsafe都在特定位置：

寄存器读写：regs::XXX.read()内部
C结构体转换：as_ref()方法
引用计数操作：drm_gem_object_get/put

驱动代码本身几乎全是安全Rust：

// drivers/gpu/drm/tyr/driver.rs - probe函数
// 没有任何unsafe！
fn probe(pdev: &platform::Device<Core>, ...) -> Result<Pin<KBox<Self>>> {
    let core_clk = Clk::get(pdev.as_ref(), Some(c_str!("core")))?;
    core_clk.prepare_enable()?;
    // ... 全部安全代码
}

unsafe集中在抽象层：

// rust/kernel/drm/gem/mod.rs
unsafe impl<T: IntoGEMObject> AlwaysRefCounted for T {
    fn inc_ref(&self) {
        unsafe { bindings::drm_gem_object_get(self.as_raw()) };
        // ^^^ unsafe在这里，驱动无需接触
    }
}

这是Rust在内核的核心价值：

驱动开发者：写安全代码
抽象层维护者：处理unsafe，详细论证安全性

与已有Blog的体系关联

Blog1：Rust in the Linux Kernel - Reality Check¹

该文关注：

宏观数据：338个Rust文件，135,662行代码
Android Binder案例：18文件，~8,000行
GPU驱动：Nova（47文件，~15,000行）

本文补充：

Tyr的具体代码实现
DRM抽象层的实际工作原理
Nova的IOCTL宏展开

Blog2：Rust and Linux Kernel ABI Stability²

该文关注：

用户空间ABI稳定性
#[repr(C)]的保证
System V ABI兼容性

本文补充：

GpuInfo的#[repr(C)]实战应用
ioctl处理的FFI桥接
C/Rust互操作的实际代码

形成的知识体系

Blog1 (宏观) → Blog2 (ABI) → Blog3 (代码实战)
     ↓              ↓                ↓
  数据统计      技术保证        具体实现
  政策争议      接口规范        挑战分析
  整体趋势      系统设计        代码细节

三篇文章从不同角度完整覆盖了Rust在Linux内核的状态。

未来展望：Tyr的Roadmap

短期（2026年上半年）

依赖的抽象层（根据commit message）：

✅ GEM shmem（Lyude Paul负责）
✅ GPUVM（Alice Ryhl负责）
✅ io-pgtable（Alice Ryhl负责）

期望效果（原文）³：

Once we can handle those items, we expect to quickly become able to boot the GPU firmware and then progress unhindered until it is time to discuss job submission.

中期（2026-2027）

整合Nova的贡献：

register!宏：类型安全的寄存器访问
Bounded integers：编译时范围检查

完善功能：

电源管理（DVFS）
GPU恢复机制
通过Vulkan CTS

长期（2027+）

JobQueue架构：

替代drm_gpu_scheduler
首个C驱动可调用的Rust组件
双向互操作的里程碑

结论：代码层面的洞察

通过解剖Tyr项目的实际代码，我们得到了超越宏观讨论的具体认识：

技术层面

Rust的类型系统价值：
- 类型状态模式（Regulator）
- 编译时状态机（设备初始化）
- RAII资源管理（时钟、锁）
FFI互操作的实践：
- extern "C"的C ABI桥接
- #[repr(C)]的ABI兼容
- SAFETY注释的严格论证
抽象层的分层设计：
- 驱动层：安全Rust
- 抽象层：处理unsafe
- C层：bindings自动生成

挑战层面

基础设施缺失的实际影响：
- GPUVM抽象→无法启动MCU
- read_poll_timeout()缺失→用固定延迟
- 工具链不成熟→Send/Sync workaround
上游策略的务实性：
- 不再C+Rust混合（失败过）
- 分阶段上游（避免下游分叉）
- 与Nova/rvkms协同演进

对开发者的启示

学习路径：
- 先掌握Rust基础（所有权、生命周期）
- 学习内核概念（DRM、GEM、GPUVM）
- 阅读实际代码（Tyr、Nova、Asahi）
贡献机会：
- GPUVM抽象开发
- 其他DRM抽象补全
- Tyr驱动功能实现
技术趋势：
- Rust在DRM子系统的采用不可逆
- 基础设施建设是当前瓶颈
- 2027年可能禁止新C驱动⁴

Rust在Linux内核已经从”实验”进入”生产”，Tyr项目是这一转变的代码级见证。

参考资料

Rust boosted by permanent adoption for Linux kernel code - DevClass, 2025-12-15
Rust is here to stay: the experimental phase in the Linux Kernel has ended - DesdeLinux Blog, 2025
The future for Tyr – OSnews - OSnews转载LWN文章

代码仓库：

Linux Kernel: /Users/weli/works/linux（本地分析用）
官方仓库：https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
DRM Rust Tree: https://gitlab.freedesktop.org/drm/rust/kernel

相关项目：

Collabora: Introducing Tyr - 官方介绍
Rust for Linux - 官方项目网站

Rust in the Linux Kernel: A Reality Check from Code to Controversy - 本系列第一篇 ↩ ↩² ↩³ ↩⁴
Rust and Linux Kernel ABI Stability: A Technical Deep Dive - 本系列第二篇 ↩ ↩²
Linux Kernel Git Commit cf4fd52e3236 - “rust: drm: Introduce the Tyr driver for Arm Mali GPUs”, Daniel Almeida, 2025-09-10. 可通过git show cf4fd52e3236查看完整commit message。 ↩ ↩² ↩³
Dave Airlie在2025 Maintainers Summit的声明，报道来源： ↩

阿男的小窝 / Published on Feb 18, 2026