.. |msrv| replace:: 1.63.0

Rust in QEMU
============

Rust in QEMU is a project to enable using the Rust programming language
to add new functionality to QEMU.

Right now, the focus is on making it possible to write devices that inherit
from ``SysBusDevice`` in `*safe*`__ Rust. Later, it may become possible
to write other kinds of devices (e.g. PCI devices that can do DMA),
complete boards, or backends (e.g. block device formats).

__ https://doc.rust-lang.org/nomicon/meet-safe-and-unsafe.html

Building the Rust in QEMU code
------------------------------

The Rust in QEMU code is included in the emulators via Meson. Meson
invokes rustc directly, building static libraries that are then linked
together with the C code. This is completely automatic when you run
``make`` or ``ninja``.

However, QEMU's build system also tries to be easy to use for people who
are accustomed to the more "normal" Cargo-based development workflow.
In particular:

* the set of warnings and lints that are used to build QEMU always
  comes from the ``rust/Cargo.toml`` workspace file

* it is also possible to use ``cargo`` for common Rust-specific coding
  tasks, in particular to invoke ``clippy``, ``rustfmt`` and ``rustdoc``.

To this end, QEMU includes a ``build.rs`` build script that picks up
generated sources from QEMU's build directory and puts them in Cargo's
output directory (typically ``rust/target/``).
A vanilla invocation of Cargo will complain that it cannot find the
generated sources, which can be fixed in different ways:

* by using special shorthand targets in the QEMU build directory::

    make clippy
    make rustfmt
    make rustdoc

* by invoking ``cargo`` through the Meson `development environment`__
  feature::

    pyvenv/bin/meson devenv -w ../rust cargo clippy --tests
    pyvenv/bin/meson devenv -w ../rust cargo fmt

  If you are going to use ``cargo`` repeatedly, ``pyvenv/bin/meson devenv``
  will enter a shell where commands like ``cargo clippy`` just work.

__ https://mesonbuild.com/Commands.html#devenv

* by pointing the ``MESON_BUILD_ROOT`` environment variable to the top of
  your QEMU build tree. This third method is useful if you are using
  ``rust-analyzer``; you can set the environment variable through the
  ``rust-analyzer.cargo.extraEnv`` setting.

As shown above, you can use the ``--tests`` option as usual to operate on test
code. Note however that you cannot *build* or run tests via ``cargo``, because
they need support C code from QEMU that Cargo does not know about. Tests can
be run via ``meson test`` or ``make``::

    make check-rust

Building Rust code with ``--enable-modules`` is not supported yet.

Supported tools
'''''''''''''''

QEMU supports rustc version 1.63.0 and newer. Notably, the following features
are missing:

* ``cast_mut()``/``cast_const()`` (1.65.0). Use ``as`` instead.

* "let ... else" (1.65.0). Use ``if let`` instead. This is currently patched
  in QEMU's vendored copy of the bilge crate.

* Generic Associated Types (1.65.0)

* ``CStr::from_bytes_with_nul()`` as a ``const`` function (1.72.0).

* "Return position ``impl Trait`` in Traits" (1.75.0, blocker for including
  the pinned-init crate).

* ``MaybeUninit::zeroed()`` as a ``const`` function (1.75.0).
  QEMU's ``Zeroable`` trait can be implemented without
  ``MaybeUninit::zeroed()``, so this would be just a cleanup.

* ``c""`` literals (stable in 1.77.0). QEMU provides a ``c_str!()`` macro
  to define ``CStr`` constants easily.

* ``offset_of!`` (stable in 1.77.0). QEMU uses ``offset_of!()`` heavily; it
  provides a replacement in the ``qemu_api`` crate, but it does not support
  lifetime parameters and therefore ``&'a Something`` fields in the struct
  may have to be replaced by ``NonNull<Something>``. *Nested* ``offset_of!``
  was only stabilized in Rust 1.82.0, but it is not used.

* inline const expressions (stable in 1.79.0), currently worked around with
  associated constants in the ``FnCall`` trait.

* associated constants have to be explicitly marked ``'static`` (`changed in
  1.81.0`__)

* ``&raw`` (stable in 1.82.0). Use ``addr_of!`` and ``addr_of_mut!`` instead,
  though hopefully the need for raw pointers will go down over time.

* ``new_uninit`` (stable in 1.82.0). This is used internally by the
  ``pinned_init`` crate, which is planned for inclusion in QEMU, but it can
  be easily patched out.

* referencing statics in constants (stable in 1.83.0). For now use a const
  function; this is an important limitation for QEMU's migration stream
  architecture (VMState). Right now, VMState lacks type safety because
  it is hard to place the ``VMStateField`` definitions in traits.

* associated const equality would be nice to have for some users of
  ``callbacks::FnCall``, but is still experimental. ``ASSERT_IS_SOME``
  replaces it.

__ https://github.com/rust-lang/rust/pull/125258

It is expected that QEMU will advance its minimum supported version of
rustc to 1.77.0 as soon as possible; as of January 2025, the blockers
are Debian bookworm and 32-bit MIPS processors.
This unfortunately means that references to statics in constants will
remain an issue.

QEMU also supports version 0.60.x of bindgen, which is missing the
``--generate-cstr`` option. This option requires version 0.66.x and will
be adopted as soon as supporting these older versions is not necessary
anymore.

Writing Rust code in QEMU
-------------------------

QEMU includes four crates:

* ``qemu_api`` for bindings to C code and useful functionality

* ``qemu_api_macros`` defines several procedural macros that are useful when
  writing C code

* ``pl011`` (under ``rust/hw/char/pl011``) and ``hpet`` (under ``rust/hw/timer/hpet``)
  are sample devices that demonstrate ``qemu_api`` and ``qemu_api_macros``, and are
  used to further develop them. These two crates are functional\ [#issues]_ replacements
  for the ``hw/char/pl011.c`` and ``hw/timer/hpet.c`` files.

.. [#issues] The ``pl011`` crate is synchronized with ``hw/char/pl011.c``
   as of commit 02b1f7f61928. The ``hpet`` crate is synchronized as of
   commit 1433e38cc8. Both are lacking tracing functionality.

This section explains how to work with them.

Status
''''''

Modules of ``qemu_api`` can be defined as:

- *complete*: ready for use in new devices; if applicable, the API supports the
  full functionality available in C

- *stable*: ready for production use, the API is safe and should not undergo
  major changes

- *proof of concept*: the API is subject to change but allows working with safe
  Rust

- *initial*: the API is in its initial stages; it requires a large amount of
  unsafe code; it might have soundness or type-safety issues

The status of the modules is as follows:

================ ======================
module           status
================ ======================
``assertions``   stable
``bitops``       complete
``callbacks``    complete
``cell``         stable
``c_str``        complete
``errno``        complete
``irq``          complete
``memory``       stable
``module``       complete
``offset_of``    stable
``qdev``         stable
``qom``          stable
``sysbus``       stable
``timer``        stable
``vmstate``      proof of concept
``zeroable``     stable
================ ======================

.. note::
   API stability is not a promise, if anything because the C APIs are not a stable
   interface either. Also, ``unsafe`` interfaces may be replaced by safe interfaces
   later.

Naming convention
'''''''''''''''''

C function names usually are prefixed according to the data type that they
apply to, for example ``timer_mod`` or ``sysbus_connect_irq``. Furthermore,
both functions and structs sometimes have a ``qemu_`` or ``QEMU`` prefix.
Generally speaking, these are all removed in the corresponding Rust functions:
``QEMUTimer`` becomes ``timer::Timer``, ``timer_mod`` becomes ``Timer::modify``,
``sysbus_connect_irq`` becomes ``SysBusDeviceMethods::connect_irq``.

Sometimes however a name appears multiple times in the QOM class hierarchy,
and the only difference is in the prefix.
An example is ``qdev_realize`` and ``sysbus_realize``. In such cases,
whenever a name is not unique in the hierarchy, always add the prefix to
the classes that are lower in the hierarchy; for the top class, decide on
a case by case basis.

For example:

========================== =========================================
``device_cold_reset()``    ``DeviceMethods::cold_reset()``
``pci_device_reset()``     ``PciDeviceMethods::pci_device_reset()``
``pci_bridge_reset()``     ``PciBridgeMethods::pci_bridge_reset()``
========================== =========================================

Here, the name is not exactly the same, but nevertheless ``PciDeviceMethods``
adds the prefix to avoid confusion, because the functionality of
``device_cold_reset()`` and ``pci_device_reset()`` is subtly different.

In this case, however, the top class needs no prefix:

========================== =========================================
``device_realize()``       ``DeviceMethods::realize()``
``sysbus_realize()``       ``SysBusDeviceMethods::sysbus_realize()``
``pci_realize()``          ``PciDeviceMethods::pci_realize()``
========================== =========================================

Here, the lower classes do not add any functionality, and mostly
provide extra compile-time checking; the basic *realize* functionality
is the same for all devices. Therefore, ``DeviceMethods`` does not
add the prefix.

Whenever a name is unique in the hierarchy, instead, you should
always remove the class name prefix.

Common pitfalls
'''''''''''''''

Rust has very strict rules with respect to how you get an exclusive (``&mut``)
reference; failure to respect those rules is a source of undefined behavior.
In particular, even if a value is loaded from a raw mutable pointer (``*mut``),
it *cannot* be cast to ``&mut`` unless the value was stored to the ``*mut``
from a mutable reference.
Furthermore, it is undefined behavior if any
shared reference was created between the store to the ``*mut`` and the load::

    let mut p: u32 = 42;
    let p_mut = &mut p;                              // 1
    let p_raw = p_mut as *mut u32;                   // 2

    // p_raw keeps the mutable reference "alive"

    let p_shared = &p;                               // 3
    println!("access from &u32: {}", *p_shared);

    // Bring back the mutable reference, its lifetime overlaps
    // with that of a shared reference.
    let p_mut = unsafe { &mut *p_raw };              // 4
    println!("access from &mut u32: {}", *p_mut);

    println!("access from &u32: {}", *p_shared);     // 5

These rules can be tested with `MIRI`__, for example.

__ https://github.com/rust-lang/miri

Almost all Rust code in QEMU will involve QOM objects, and pointers to these
objects are *shared*, for example because they are part of the QOM composition
tree. This creates exactly the above scenario:

1. a QOM object is created

2. a ``*mut`` is created, for example as the opaque value for a ``MemoryRegion``

3. the QOM object is placed in the composition tree

4. a memory access dereferences the opaque value to a ``&mut``

5. but the shared reference is still present in the composition tree

Because of this, QOM objects should almost always use ``&self`` instead
of ``&mut self``; access to internal fields must use *interior mutability*
to go from a shared reference to a ``&mut``.

Whenever C code provides you with an opaque ``void *``, avoid converting it
to a Rust mutable reference, and use a shared reference instead. The
``qemu_api::cell`` module provides wrappers that can be used to tell the
Rust compiler about interior mutability, and optionally to enforce locking
rules for the "Big QEMU Lock". In the future, similar cell types might
also be provided for ``AioContext``-based locking.
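
As a minimal sketch of the ``&self`` plus interior mutability pattern
described above, the following standalone example uses ``std::cell::Cell``
and ``std::cell::RefCell`` as stand-ins for QEMU's cell types. ``ToyDevice``,
its fields and ``demo()`` are hypothetical names invented for illustration,
and unlike the ``qemu_api::cell`` wrappers the standard library cells do not
check the "Big QEMU Lock".

```rust
use std::cell::{Cell, RefCell};

/// Toy device state (hypothetical, not a QEMU type). `Cell` and `RefCell`
/// are stand-ins for the wrappers in `qemu_api::cell`; they provide
/// interior mutability but perform no lock checking.
#[derive(Default)]
pub struct ToyDevice {
    /// Small `Copy` value, updated with `get`/`set`.
    irq_level: Cell<u32>,
    /// Larger state, borrowed dynamically when it is needed.
    fifo: RefCell<Vec<u8>>,
}

impl ToyDevice {
    /// A "register write" callback: it takes `&self`, not `&mut self`.
    pub fn write(&self, value: u8) {
        self.fifo.borrow_mut().push(value);
        self.irq_level.set(1);
    }

    /// A "register read" callback; it also takes `&self`.
    pub fn read(&self) -> Option<u8> {
        let mut fifo = self.fifo.borrow_mut();
        let value = if fifo.is_empty() {
            None
        } else {
            Some(fifo.remove(0))
        };
        // Lower the interrupt once the FIFO has been drained.
        if fifo.is_empty() {
            self.irq_level.set(0);
        }
        value
    }

    pub fn irq_level(&self) -> u32 {
        self.irq_level.get()
    }
}

/// Two shared references to the same device coexist, much like a QOM
/// object that is reachable both from the composition tree and from an
/// opaque pointer. No `&mut ToyDevice` is ever created.
pub fn demo() -> (Option<u8>, u32) {
    let dev = ToyDevice::default();
    let from_tree = &dev;
    let from_opaque = &dev;

    from_opaque.write(0x41);
    let byte = from_tree.read();
    (byte, from_opaque.irq_level())
}
```

Because every method takes ``&self``, the aliasing situation from the
numbered list never requires an exclusive reference to the whole object;
mutation is confined to the cells.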

In particular, device code will usually rely on the ``BqlRefCell`` and
``BqlCell`` types to ensure that data is accessed correctly under the
"Big QEMU Lock". These cell types are also known to the ``vmstate``
module, which is able to "look inside" them when building an in-memory
representation of a ``struct``'s layout. Note that the same is not true
of a ``RefCell`` or ``Mutex``.

Bindings code instead will usually use the ``Opaque`` type, which hides
the contents of the underlying struct and can be easily converted to
a raw pointer, for use in calls to C functions. It can be used for
example as follows::

    #[repr(transparent)]
    #[derive(Debug, qemu_api_macros::Wrapper)]
    pub struct Object(Opaque<bindings::Object>);

where the special ``derive`` macro provides useful methods such as
``from_raw``, ``as_ptr``, ``as_mut_ptr`` and ``raw_get``. The bindings will
then manually check for the big QEMU lock with assertions, which allows
the wrapper to be declared thread-safe::

    unsafe impl Send for Object {}
    unsafe impl Sync for Object {}

Writing bindings to C code
''''''''''''''''''''''''''

Here are some things to keep in mind when working on the ``qemu_api`` crate.

**Look at existing code**
  Very often, similar idioms in C code correspond to similar tricks in
  Rust bindings. If the C code uses ``offsetof``, look at qdev properties
  or ``vmstate``. If the C code has a complex const struct, look at
  ``MemoryRegion``. Reuse existing patterns for handling lifetimes;
  for example use ``&T`` for QOM objects that do not need a reference
  count (including those that can be embedded in other objects) and
  ``Owned<T>`` for those that need it.

**Use the type system**
  Bindings often will need access to information that is specific to a type
  (either a builtin one or a user-defined one) in order to pass it to C
  functions.
  Put it in a trait and access it through generic parameters.
  The ``vmstate`` module has examples of how to retrieve type information
  for the fields of a Rust ``struct``.

**Prefer unsafe traits to unsafe functions**
  Unsafe traits are much easier to prove correct than unsafe functions.
  They are an excellent place to store metadata that can later be accessed
  by generic functions. C code usually places metadata in global variables;
  in Rust, they can be stored in traits and then turned into ``static``
  variables. Often, unsafe traits can be generated by procedural macros.

**Document limitations due to old Rust versions**
  If you need to settle for an inferior solution because of the currently
  supported set of Rust versions, document it in the source and in this
  file. This ensures that it can be fixed when the minimum supported
  version is bumped.

**Keep locking in mind**
  When marking a type ``Sync``, be careful of whether it needs the big
  QEMU lock. Use ``BqlCell`` and ``BqlRefCell`` for interior data,
  or assert ``bql_locked()``.

**Don't be afraid of complexity, but document and isolate it**
  It's okay to be tricky; device code is written more often than bindings
  code and it's important that it is idiomatic. However, you should strive
  to isolate any tricks in a place (for example a ``struct``, a trait
  or a macro) where it can be documented and tested. If needed, include
  toy versions of the code in the documentation.

Writing procedural macros
'''''''''''''''''''''''''

By convention, procedural macros are split into two functions, one
returning ``Result<proc_macro2::TokenStream, MacroError>`` with the body of
the procedural macro, and the second returning ``proc_macro::TokenStream``
which is the actual procedural macro. The former's name is the same as
the latter with the ``_or_error`` suffix.
The code for the latter is more or less fixed; it follows this template,
in which only the type after ``as`` in the invocation of
``parse_macro_input!`` changes::

    #[proc_macro_derive(Object)]
    pub fn derive_object(input: TokenStream) -> TokenStream {
        let input = parse_macro_input!(input as DeriveInput);
        let expanded = derive_object_or_error(input).unwrap_or_else(Into::into);

        TokenStream::from(expanded)
    }

The ``qemu_api_macros`` crate has utility functions to examine a
``DeriveInput`` and perform common checks (e.g. looking for a struct
with named fields). These functions return ``Result<..., MacroError>``
and can be used easily in the procedural macro function::

    fn derive_object_or_error(input: DeriveInput) ->
        Result<proc_macro2::TokenStream, MacroError>
    {
        is_c_repr(&input, "#[derive(Object)]")?;

        let name = &input.ident;
        let parent = &get_fields(&input, "#[derive(Object)]")?[0].ident;
        ...
    }

Use procedural macros with care. They are mostly useful for two purposes:

* Performing consistency checks; for example ``#[derive(Object)]`` checks
  that the structure has ``#[repr(C)]`` and that the type of the first field
  is consistent with the ``ObjectType`` declaration.

* Extracting information from Rust source code into traits, typically based
  on types and attributes. For example, ``#[derive(TryInto)]`` builds an
  implementation of ``TryFrom``, and it uses the ``#[repr(...)]`` attribute
  as the ``TryFrom`` source and error types.

Procedural macros can be hard to debug and test; if the code generation
exceeds a few lines of code, it may be worthwhile to delegate work to
"regular" declarative (``macro_rules!``) macros and write unit tests for
those instead.


Coding style
''''''''''''

Code should pass clippy and be formatted with rustfmt.

Right now, only the nightly version of ``rustfmt`` is supported. This
might change in the future. While CI checks for correct formatting via
``cargo fmt --check``, maintainers can fix this for you when applying patches.

It is expected that ``qemu_api`` provides full ``rustdoc`` documentation for
bindings that are in their final shape or close.

Adding dependencies
-------------------

Generally, the set of dependent crates is kept small. Think twice before
adding a new external crate, especially if it comes with a large set of
dependencies itself. Sometimes QEMU only needs a small subset of the
functionality; see for example QEMU's ``assertions`` or ``c_str`` modules.

On top of this recommendation, adding external crates to QEMU is a
slightly complicated process, mostly due to the need to teach Meson how
to build them. While Meson has initial support for parsing ``Cargo.lock``
files, it is still highly experimental and is therefore not used.

Therefore, external crates must be added as subprojects for Meson to
learn how to build them, as well as to the relevant ``Cargo.toml`` files.
The versions specified in ``rust/Cargo.lock`` must be the same as the
subprojects; note that the ``rust/`` directory forms a Cargo `workspace`__,
and therefore there is a single lock file for the whole build.

__ https://doc.rust-lang.org/cargo/reference/workspaces.html#virtual-workspace

First, choose a version of the crate that works with QEMU's minimum supported
Rust version (|msrv|).

Second, a new ``wrap`` file must be added to teach Meson how to download the
crate. The wrap file must be named ``NAME-SEMVER-rs.wrap``, where ``NAME``
is the name of the crate and ``SEMVER`` is the version up to and including the
first non-zero number. For example, a crate with version ``0.2.3`` will use
``0.2`` for its ``SEMVER``, while a crate with version ``1.0.84`` will use ``1``.
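
The ``SEMVER`` truncation rule can be sketched as a small function. This is
only an illustration of the naming rule stated above: ``wrap_semver`` and
``wrap_file_name`` are hypothetical helpers that do not exist in QEMU or
Meson, and the sketch assumes plain dot-separated numeric versions without
pre-release or build suffixes.

```rust
/// Hypothetical helper: keep version components up to and including the
/// first one that is not zero ("0.2.3" -> "0.2", "1.0.84" -> "1").
/// Assumes a plain dot-separated numeric version string.
pub fn wrap_semver(version: &str) -> String {
    let mut kept = Vec::new();
    for component in version.split('.') {
        kept.push(component);
        // Stop after the first component that does not parse as zero.
        if component.parse::<u64>() != Ok(0) {
            break;
        }
    }
    kept.join(".")
}

/// Hypothetical helper: build the wrap file name for a crate,
/// following the NAME-SEMVER-rs.wrap pattern.
pub fn wrap_file_name(name: &str, version: &str) -> String {
    format!("{}-{}-rs.wrap", name, wrap_semver(version))
}
```

With these definitions, a placeholder crate called ``foo`` at version
``0.2.3`` maps to ``foo-0.2-rs.wrap``, and the same crate at version
``1.0.84`` maps to ``foo-1-rs.wrap``.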

Third, the Meson rules to build the crate must be added at
``subprojects/NAME-SEMVER-rs/meson.build``. Generally this includes:

* ``subproject`` and ``dependency`` lines for all dependent crates

* a ``static_library`` or ``rust.proc_macro`` line to perform the actual build

* ``declare_dependency`` and ``meson.override_dependency`` lines to expose
  the result to QEMU and to other subprojects

Remember to add ``native: true`` to ``dependency``, ``static_library`` and
``meson.override_dependency`` for dependencies of procedural macros.
If a crate is needed in both procedural macros and QEMU binaries, everything
apart from ``subproject`` must be duplicated to build both native and
non-native versions of the crate.

It's important to specify the right compiler options. These include:

* the language edition (which can be found in the ``Cargo.toml`` file)

* the ``--cfg`` options (which have to be "reverse engineered" from the
  ``build.rs`` file of the crate).

* usually, a ``--cap-lints allow`` argument to hide warnings from rustc
  or clippy.

After every change to the ``meson.build`` file you have to update the patched
version with ``meson subprojects update --reset NAME-SEMVER-rs``. This might
be automated in the future.

Also, after every change to the ``meson.build`` file it is strongly suggested to
do a dummy change to the ``.wrap`` file (for example adding a comment like
``# version 2``), which will help Meson notice that the subproject is out of date.

As a last step, add the new subproject to ``scripts/archive-source.sh``,
``scripts/make-release`` and ``subprojects/.gitignore``.