Structs blog post initial version
parent
3d79740786
commit
c694f1ab3d
Binary file not shown.
|
Before Width: | Height: | Size: 32 KiB |
Binary file not shown.
|
After Width: | Height: | Size: 54 KiB |
Binary file not shown.
|
After Width: | Height: | Size: 44 KiB |
Binary file not shown.
|
After Width: | Height: | Size: 104 KiB |
Binary file not shown.
|
After Width: | Height: | Size: 140 KiB |
Binary file not shown.
|
After Width: | Height: | Size: 91 KiB |
@ -0,0 +1,309 @@
title = 'How the struct gets made'
subtitle = 'In which we peek behind the curtain to see how compilers represent our data types.'
author = 'Tom Panton'
tags = []
---
A while back I came across a question online asking why Rust uses a different layout for structs
than C. "Layout" here refers to the way a struct gets represented as a sequence of bytes in memory.
I think it's an excellent question, and it gives us an excuse to mess around with a debugger to see
what's going on in memory, so let's have a go at answering it!

To know _why_ the two languages lay out structs differently, we first need to know _how_ they lay
them out. Let's define a little test struct in C for us to poke and prod at:

```C
#include <stdint.h>

struct TestStruct {
    uint32_t x; // 32 bits = 4 bytes
    uint64_t y; // 64 bits = 8 bytes
    uint32_t z; // 32 bits = 4 bytes
};
```

At a glance, representing `TestStruct` in memory doesn't seem like a particularly difficult thing
to do. My first guess is that it can be 16 contiguous bytes where the first 4 bytes represent `x`,
the next 8 represent `y` and the last 4 represent `z`. Something like this:



Let's do a quick experiment to see if I'm right! We'll use C's `sizeof` operator to find the size of
`TestStruct` in bytes. If my prediction is correct, it should be 16 bytes.

```C
@@struct_size.c@@
#include <stdint.h>
#include <stdio.h>

struct TestStruct {
    uint32_t x;
    uint64_t y;
    uint32_t z;
};

int main() {
    printf("%zu bytes\n", sizeof (struct TestStruct));
    return 0;
}
```

Let's run it and see what we get:

```
$ clang struct_size.c
$ ./a.out
24 bytes
```

24 bytes?! I was wrong! So what the heck is C doing here?

Well, let's take a look using a debugger! We'll create a `TestStruct` variable and fill its fields
with some values that will be easy to spot later:

```C
@@struct_layout.c@@
#include <stdint.h>

struct TestStruct {
    uint32_t x;
    uint64_t y;
    uint32_t z;
};

int main() {
    struct TestStruct test;
    test.x = 0xcafebabe;
    test.y = 0x0123456789abcdef;
    test.z = 0xfeedface;
    return 0;
}
```

Now let's compile it and load it into LLDB:

```
$ clang -g struct_layout.c
$ lldb a.out
Current executable set to '/home/tom/structs/a.out' (x86_64).
```

Let's put a breakpoint to pause the program right before `main` returns on line 14, and then we can
read the 24 bytes of memory representing our `test` variable. To make it a little easier to read,
we'll ask LLDB to organise the bytes into groups of four.

```
(lldb) breakpoint set --file struct_layout.c --line 14
(lldb) run
(lldb) memory read --format x --size 4 --count `24 / 4` `&test`
0x7fffffffea10: 0xcafebabe 0x00000000 0x89abcdef 0x01234567
0x7fffffffea20: 0xfeedface 0x00000000
```

We can see `0xcafebabe` which is the value we stored in `test.x`, `0x0123456789abcdef` which we
stored in `test.y` (the two groups of four bytes are displayed in reverse because my machine is
[little-endian](https://en.wikipedia.org/wiki/Endianness)) and `0xfeedface` which we stored in
`test.z`. However, there are also some bytes that we didn't tell C to store: there are four bytes
of zeroes sitting between `test.x` and `test.y`, and another four bytes of zeroes after `test.z`!
Although we don't know what these extra 8 bytes are doing there yet, at least we now have a more
accurate idea of how `TestStruct` looks in memory:



To understand what those extra bytes are there for, we first need to know about _alignment_.

## So, what's this "alignment" stuff?

Every type has both a size and an alignment. Whereas the size determines how much memory is
required to represent the type, the alignment determines where that memory is allowed to be.
The rule for alignment is simple:

> A value should be stored at a memory address that is a multiple of its alignment.

Most modern CPUs expect this rule to be followed; if it's broken, a variety of platform-dependent
Bad Things can happen such as performance penalties, [crashes](https://www.oracle.com/technetwork/server-storage/sun-sparc-enterprise/documentation/140521-ua2011-d096-p-ext-2306580.pdf#page=93), and [changes to the atomicity guarantees of instructions](https://www.amd.com/system/files/TechDocs/24593.pdf#page=252).

Let's look at some examples. Similar to the `sizeof` operator, we can use the `alignof` operator
to find the alignment of a particular type:

```C
@@alignment.c@@
#include <stdalign.h>
#include <stdint.h>
#include <stdio.h>

int main() {
    printf("uint8_t: %zu\n", alignof (uint8_t));
    printf("uint32_t: %zu\n", alignof (uint32_t));
    printf("uint64_t: %zu\n", alignof (uint64_t));
    return 0;
}
```

```
$ clang alignment.c
$ ./a.out
uint8_t: 1
uint32_t: 4
uint64_t: 8
```

C tells us that `uint8_t` has an alignment of 1. Every memory address is a multiple of 1, so that
means it's ok for a `uint8_t` to live at any memory address. `uint32_t`, on the other hand, has
an alignment of 4, so it can only live at memory addresses 0, 4, 8, 12, 16 and so on.
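
The rule itself boils down to a one-line check. Here is a minimal sketch of it in Rust, the other language this post looks at; `is_aligned` is a hypothetical helper of my own rather than anything from the standard library, and `std::mem::align_of` plays the role of C's `alignof`. The exact alignments printed depend on the target platform.

```rust
use std::mem::align_of;

// Hypothetical helper: an address obeys the rule of alignment when it is a
// multiple of the alignment.
fn is_aligned(address: usize, alignment: usize) -> bool {
    address % alignment == 0
}

fn main() {
    // Rust's counterpart to C's `alignof`.
    println!("u8:  {}", align_of::<u8>());
    println!("u32: {}", align_of::<u32>());
    println!("u64: {}", align_of::<u64>());

    // 12 is a multiple of a u32's alignment, 14 is not.
    println!("address 12 ok for u32? {}", is_aligned(12, align_of::<u32>()));
    println!("address 14 ok for u32? {}", is_aligned(14, align_of::<u32>()));
}
```
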

Now we can explain what the mysterious extra bytes in `TestStruct` are there for! The 4 bytes
between `x` and `y` are _padding_ to ensure that `y` follows the rule of alignment. `y` is a
`uint64_t` which has an alignment of 8 (on 64-bit platforms); without the padding it would be at
offset 4, which is not a multiple of 8, but when we add the 4 bytes of padding it ends up at offset
8 instead, which is of course a multiple of 8.



The 4 bytes of padding after `z` are there to make sure the rule of alignment is followed when we
have an _array_ of `TestStruct`. Suppose we have an array `struct TestStruct a[2]`; arrays are
represented by just storing the elements contiguously in memory, so without the
padding after `z`, `a[1].y` would be at offset 20 + 8 = 28 from the start of the array, which is
not a multiple of 8 so it would break the rule of alignment. With the padding after `z` included,
`a[1].y` is at offset 24 + 8 = 32 from the start of the array, which is a multiple of 8.



Ok, so now we have a sense of how C lays out structs; the fields are put in memory in the same
order as we wrote them in the struct definition, and extra padding is inserted after some of the
fields when it is needed to follow the rule of alignment.
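
That rule fits in a few lines of code. Here is a minimal sketch, written in Rust rather than C purely for illustration; `round_up` and `c_layout` are hypothetical names of my own, and real compilers deal with plenty of cases (bit-fields, packed structs, over-aligned types) that this ignores.

```rust
// Round `offset` up to the next multiple of `alignment`.
fn round_up(offset: usize, alignment: usize) -> usize {
    (offset + alignment - 1) / alignment * alignment
}

// Lay out fields in declaration order. Each field is a (size, alignment)
// pair; the result is every field's offset plus the total struct size.
fn c_layout(fields: &[(usize, usize)]) -> (Vec<usize>, usize) {
    let mut offsets = Vec::new();
    let mut end = 0; // first free byte after the fields placed so far
    let mut struct_align = 1; // a struct is as aligned as its strictest field

    for &(size, alignment) in fields {
        let offset = round_up(end, alignment); // padding before the field
        offsets.push(offset);
        end = offset + size;
        struct_align = struct_align.max(alignment);
    }

    // Trailing padding keeps every element of an array aligned.
    (offsets, round_up(end, struct_align))
}

fn main() {
    // x: uint32_t, y: uint64_t, z: uint32_t
    let (offsets, size) = c_layout(&[(4, 4), (8, 8), (4, 4)]);
    println!("offsets {:?}, size {}", offsets, size);
}
```

For the three fields of `TestStruct` this prints `offsets [0, 8, 16], size 24`, which agrees with what the debugger showed.
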

## Turning our attention to Rust

Time to find out what Rust does differently to C! Let's start off by defining a Rust equivalent
of `TestStruct` and checking its size:

```Rust
@@struct_size.rs@@
#![allow(dead_code)]

use std::mem::size_of;

struct TestStruct {
    x: u32,
    y: u64,
    z: u32,
}

fn main() {
    println!("{} bytes", size_of::<TestStruct>());
}
```

```
$ rustc struct_size.rs
$ ./struct_size
16 bytes
```

16 bytes is smaller than the 24 bytes used by C, so Rust can't be laying out `TestStruct` the same
way. To find out what it's doing, let's use the same trick from before of filling the fields of
the struct with some dummy values then reading the memory using LLDB:

```Rust
@@struct_layout.rs@@
// `std::hint::black_box` has been stable since Rust 1.66, so no feature gate is needed.
#![allow(dead_code)]

use std::hint::black_box;

struct TestStruct {
    x: u32,
    y: u64,
    z: u32,
}

fn main() {
    let test = TestStruct {
        x: 0xcafebabe,
        y: 0x0123456789abcdef,
        z: 0xfeedface,
    };

    // Our test value is not actually used for anything in the program, so the
    // Rust compiler wants to optimise it out. We encourage it not to do this
    // by using the black box function.
    black_box(test);
}
```

```
$ rustc -g struct_layout.rs
$ lldb struct_layout
(lldb) breakpoint set --file struct_layout.rs --line 22
(lldb) run
(lldb) memory read --format x --size 4 --count `16 / 4` `&test`
0x7fffffffe3d8: 0x89abcdef 0x01234567 0xcafebabe 0xfeedface
```

Two things jump out: there are no padding bytes, and the value we stored in `y` appears before the
value we stored in `x`. It looks like Rust has **changed the order of the fields**! This is a cool
little optimisation; by switching the order of `x` and `y` in memory, all of the fields obey the
rule of alignment without the need for any padding. `y` is now at offset 0 which is a multiple of
8, `x` is at offset 8 which is a multiple of 4, and `z` is at offset 12 which is a multiple of 4.



Getting rid of the padding can have some practical performance benefits; since the overall size
of the struct is smaller, we can fit more in the CPU's limited cache memory, which is _much_
faster to access than RAM.

## "Let me choose the order, dammit!"

Rust's way of doing things might improve performance, but, (angrily shaking fist), what right does
the compiler have to mess with the order of our fields without our permission?! We specifically
said that `x` comes before `y` when we defined `TestStruct`; wouldn't it be better for Rust to
just tell us that it's a suboptimal ordering rather than silently moving the fields around? Then,
we could decide whether or not we want to listen to the compiler's recommendation and manually
change the order of the fields, which would give us more control.

Unfortunately, this manual approach has problems; in particular, it doesn't play nice with
generics. Suppose we have a generic struct like this:

```Rust
struct GenericStruct<T, U> {
    x: T,
    y: U,
    z: u32,
}
```

There's no single ordering of the struct's fields that's optimal (in terms of the amount of
padding required) for all possible choices of `T` and `U`. For example, the only two orderings
that are optimal for both `GenericStruct<u32, u64>` and `GenericStruct<u64, u32>` are `x, z, y`
and `y, z, x`, but neither of these two orderings is optimal for `GenericStruct<u16, u16>`.
Whatever ordering we pick, there's going to be some choice of `T` and `U` that uses more padding
than the minimum possible amount.
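
The claim is easy to check by brute force. The sketch below computes the size each ordering would need if the fields were laid out C-style in that order; `padded_size` is a hypothetical helper of my own, and it assumes each field's alignment equals its size, which holds for these integer types on typical 64-bit targets.

```rust
// Size of a struct whose fields are placed in the given order, assuming
// every field's alignment equals its size.
fn padded_size(field_sizes: &[usize]) -> usize {
    let round_up = |n: usize, to: usize| (n + to - 1) / to * to;
    let mut end = 0;
    let mut align = 1;
    for &size in field_sizes {
        end = round_up(end, size) + size; // pad, then place the field
        align = align.max(size);
    }
    round_up(end, align) // trailing padding
}

fn main() {
    // Sizes of T and U for each instantiation; `z` is always a u32.
    let choices = [("u32, u64", 4, 8), ("u64, u32", 8, 4), ("u16, u16", 2, 2)];
    for &(name, t, u) in choices.iter() {
        let (x, y, z) = (t, u, 4);
        println!("GenericStruct<{}>", name);
        println!("  x, y, z: {} bytes", padded_size(&[x, y, z]));
        println!("  x, z, y: {} bytes", padded_size(&[x, z, y]));
        println!("  y, z, x: {} bytes", padded_size(&[y, z, x]));
    }
}
```

Under those assumptions, `x, z, y` and `y, z, x` both come out at 16 bytes for the first two instantiations, but at 12 bytes for `GenericStruct<u16, u16>`, where `x, y, z` needs only 8.
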

That's why it's useful for Rust to pick the order of the fields for us; Rust can use different
orderings depending on the values of the generic parameters so that padding is always minimised.
For `GenericStruct<u32, u64>` it can use the ordering `x, z, y`, and for
`GenericStruct<u16, u16>` it can use a different ordering `x, y, z`.

Despite this, there are still going to be situations where we _need_ to manually specify the order
of the fields in memory, so Rust provides us with the `#[repr(C)]` attribute which lets us use
C's memory layout for a particular struct.
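
Here is a short sketch of the attribute in action, using the same three fields as `TestStruct`. The struct names `Reordered` and `DeclarationOrder` are made up for this example.

```rust
use std::mem::size_of;

// Default representation: the compiler may reorder the fields.
#[allow(dead_code)]
struct Reordered {
    x: u32,
    y: u64,
    z: u32,
}

// C representation: fields stay in declaration order, padded as in C.
#[allow(dead_code)]
#[repr(C)]
struct DeclarationOrder {
    x: u32,
    y: u64,
    z: u32,
}

fn main() {
    println!("default: {} bytes", size_of::<Reordered>());
    println!("repr(C): {} bytes", size_of::<DeclarationOrder>());
}
```

On a 64-bit machine like the one used earlier, I'd expect the `repr(C)` struct to take the same 24 bytes that C reported, while the default one should match the 16 bytes we measured for Rust.
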

## Back to the original question

Time to answer the question we started with: _why_ do the two languages use different layouts?
Since C is often used for very low-level tasks like FFI and interfacing with hardware, it's
important that data has a consistent and predictable layout in memory; therefore the programmer
is given complete control over the ordering of fields. If you were writing an IP implementation
in C by
[casting the received bytes](https://github.com/torvalds/linux/blob/c1084b6c5620a743f86947caca66d90f24060f56/include/linux/ip.h#L21)
to a
[struct representing the header format](https://github.com/torvalds/linux/blob/c1084b6c5620a743f86947caca66d90f24060f56/include/uapi/linux/ip.h#L86),
and the compiler decided to rearrange the order of that struct's fields, then your program would
misinterpret the IP headers!

Since casting between bytes and structs
[can't be done in safe Rust](https://doc.rust-lang.org/nomicon/transmutes.html), it's more
acceptable for Rust to take a bit of control away from the programmer and reorder the fields.
This means that structs will always have the optimal size without the need for the programmer to
think about alignment, even for generic structs that are impossible to optimise by hand.
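
To make that concrete, here is a rough sketch of how safe Rust code might read a header out of received bytes without any cast. The `Header` type and its three-byte wire format are invented for this example and are far simpler than a real IP header; the point is only that each field is decoded explicitly, so the in-memory order of the struct's fields never matters.

```rust
// Hypothetical header, much simpler than a real IP header.
#[derive(Debug, PartialEq)]
struct Header {
    version: u8,
    length: u16,
}

// Decode field by field instead of casting the buffer to `Header`, so the
// result doesn't depend on how the compiler lays the struct out.
fn parse_header(bytes: &[u8]) -> Option<Header> {
    if bytes.len() < 3 {
        return None; // not enough bytes for the made-up wire format
    }
    Some(Header {
        version: bytes[0],
        // Network byte order is big-endian.
        length: u16::from_be_bytes([bytes[1], bytes[2]]),
    })
}

fn main() {
    println!("{:?}", parse_header(&[4, 0x01, 0x02]));
}
```
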
@ -1,42 +0,0 @@
title = 'Testing'
subtitle = 'In which we test my post renderer works correctly.'
author = 'Tom Panton'
tags = []
published = '2022-06-05T11:00:13.318217Z'
---
# Heading
## Subheading
Here is some **text**!!

testing that _italics_ work and ~~strikethrough~~!

| Column A | Column 2 |
|:--------:|----------|
| aiwdio w | apwidho |

```ruby
@@main.rb@@
def fib(n)
  if n == 0 then
    0
  elsif n == 1 then
    1
  else
    fib(n - 2) + fib(n - 1)
  end
end
```

Here is some code without any specific language:

```
func foo(f) {
    if halts(f) {
        print("Turings hate them for discovering this one simple trick");
    }
}
```

Here is a [shameless plug](https://smolbotbot.com)!

