(Not updated recently) CS:APP learning notes

(All assembly code in these notes uses AT&T syntax)

Friendly reminder:

When reading this book, notes, exercises, and experiments are all indispensable.

If you find the Chinese translation subpar and prefer to read the original English version, do not choose the Global Edition! Its exercise sections are riddled with errors, and many answers do not match the questions at all, which severely hurts both the reading experience and understanding of the material! (I ended up using a free copy from GitHub. It is not a scanned copy and the main text is high quality, but the problems with the exercises outweigh every other advantage.) I now read the English original while doing the exercises from the Chinese version (lll￢ω￢).

A Tour of Computer Systems

An outline covering all chapters, ~~which can serve as review material for an introduction to computer systems.~~

Representing and Manipulating Information

I skimmed this chapter earlier without doing the exercises, and I am now unclear on how the smallest and largest denormalized values and the smallest and largest normalized values of a floating-point format are calculated.

I'll come back to fill in the details when I have time.
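Until those details are filled in, here is a minimal sketch of the four boundary values for single precision. It assumes float is the IEEE 754 32-bit format (1 sign bit, 8 exponent bits, 23 fraction bits); the function names are made up for illustration.

```c
#include <stdint.h>
#include <string.h>

/* Reinterpret a 32-bit pattern as a float (assumes IEEE 754 single precision). */
float float_from_bits(uint32_t bits) {
    float f;
    memcpy(&f, &bits, sizeof f);   /* copies the raw bits, no value conversion */
    return f;
}

/* exp = 0, frac = 0...01: smallest positive denormalized value, 2^-149 */
float smallest_denorm(void) { return float_from_bits(0x00000001u); }

/* exp = 0, frac = all ones: largest denormalized value, (1 - 2^-23) * 2^-126 */
float largest_denorm(void) { return float_from_bits(0x007FFFFFu); }

/* exp = 1, frac = 0: smallest positive normalized value, 2^-126 */
float smallest_norm(void) { return float_from_bits(0x00800000u); }

/* exp = 254, frac = all ones: largest normalized value, (2 - 2^-23) * 2^127 */
float largest_norm(void) { return float_from_bits(0x7F7FFFFFu); }
```

The pattern is the same for double: only the field widths (11 exponent bits, 52 fraction bits) and the bias change.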

Machine-Level Representation of Programs

Historical Perspectives

Discussed the history of Intel processors and briefly mentioned the SSE and AVX instruction sets (I feel like I'll have to learn both in the future—time to start losing hair again (lll￢ω￢)).

Program Coding

How to view the assembly code that GCC generates, in AT&T format, on Linux. (Note: GDB can switch to Intel format with set disassembly-flavor intel.)

Data Types

char
short
int
long (8 bytes on 64-bit Linux, the same size as long long; the same applies below)
float
double
pointer (this book assumes a 64-bit environment; all pointers are 8 bytes)

Accessing Information

The various registers (rax~rdx, rsi, rdi, rbp, rsp, r8~r15, and their corresponding 32-bit, 16-bit, and 8-bit versions) along with their conventional usages are shown in the figure below.

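The mov instructions (movq, movl, movw, movb, movzbw, movsbq, etc.). The last two widen a value while copying it: the movz family zero-extends and the movs family sign-extends, which is how casts to a larger integer type are implemented.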
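As a small sketch of where the extending moves come from, the two hypothetical C functions below widen a byte. A compiler will typically use a zero-extending move for the first and a sign-extending move for the second, though the exact instruction chosen depends on the compiler and flags.

```c
/* Widening an unsigned byte fills the new high bits with zeros
   (the job of the movz family, e.g. movzbw). */
unsigned short widen_unsigned(unsigned char b) {
    return b;
}

/* Widening a signed byte copies the sign bit into the new high bits
   (the job of the movs family, e.g. movsbq). */
long widen_signed(signed char b) {
    return b;
}
```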

Arithmetic and Logical Operations

lea: load effective address. It also doubles as a simple arithmetic instruction for expressions of the form a + b*c, where c is 1, 2, 4, or 8 (the size suffixes q, l, w, b are omitted here and below).
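A sketch of lea used as arithmetic, modeled on the kind of example the book uses. The function name and the assembly in the comments are illustrative: they assume the arguments arrive in %rdi, %rsi, %rdx, and a real compiler may pick different registers or instructions.

```c
/* Computes x + 4*y + 12*z using only pieces of the form a + b*c,
   with c restricted to 1, 2, 4, or 8. */
long scale(long x, long y, long z) {
    long t = x + 4 * y;   /* could be: leaq (%rdi,%rsi,4), %rax */
    long u = z + 2 * z;   /* 3*z, could be: leaq (%rdx,%rdx,2), %rdx */
    return t + 4 * u;     /* could be: leaq (%rax,%rdx,4), %rax */
}
```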

sf, zf, pf, cf, af, of — Various flags.

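add, sub, and the shifts shl (sal), shr, sar: update the condition flags according to the result. (i)mul only gives meaningful CF and OF, and (i)div leaves the flags undefined.

inc, dec: also update the flags, except that CF is left unchanged.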

cqto: sign-extends the 64-bit value in %rax to 128 bits in %rdx:%rax, which sets up the dividend for a signed 128-bit by 64-bit division with idivq.
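A sketch of the C code that usually leads to cqto. For a signed 64-bit division, a compiler typically places the dividend in %rax, emits cqto to fill %rdx with the sign, and then emits idivq; the quotient lands in %rax and the remainder in %rdx. The function names are hypothetical.

```c
/* Signed quotient: after cqto + idivq, this is the value left in %rax. */
long quot_of(long x, long y) {
    return x / y;   /* y must not be 0 */
}

/* Signed remainder: after cqto + idivq, this is the value left in %rdx. */
long rem_of(long x, long y) {
    return x % y;   /* y must not be 0 */
}
```

C division truncates toward zero, so the remainder takes the sign of the dividend.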

Control Flow

The assembly implementations of if, switch, while, do-while, and for statements are where mistakes are easiest to make in this section's exercises.

A mispredicted branch is expensive, even though the CPU's branch predictor can achieve up to 90% accuracy for certain statements. For simple if statements, the compiler may therefore compute the results of both branches in advance and select one with a conditional move (cmov), avoiding the jump altogether.
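The idea of computing both branches and then selecting one can be sketched in C. The second function below mirrors what a conditional-move translation does; whether a compiler actually emits cmov depends on the compiler and the optimization level. Both function names are illustrative.

```c
/* Ordinary form: control jumps to one of the two assignments. */
long absdiff(long x, long y) {
    long result;
    if (x < y)
        result = y - x;
    else
        result = x - y;
    return result;
}

/* Conditional-move form: both results are computed first,
   then one is selected without any jump. */
long cmovdiff(long x, long y) {
    long rval = y - x;      /* result of the "then" branch */
    long eval = x - y;      /* result of the "else" branch */
    if (x >= y)
        rval = eval;        /* candidate for a single cmovge */
    return rval;
}
```

This only works when evaluating both branches is cheap and has no side effects, which is why it is limited to simple statements.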

For switch statements with multiple branches and closely clustered values, the compiler often constructs a jump table to replace various conditional jump instructions like jz and jbe, thereby enhancing code execution efficiency. This is because the performance of a jump table is not affected by the number of branches.
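A sketch of a switch that is a good jump-table candidate: the case values are dense (100 to 106), so a compiler can subtract 100, check the range once, and jump indirectly through a table. The function and its case bodies are made up for illustration, and a compiler is free to choose a different strategy.

```c
/* Dense case values 100..106; missing values fall through to default. */
long switch_eg(long x) {
    switch (x) {
    case 100:
        return x * 13;
    case 102:
        return x + 10;
    case 103:
        return x + 11;
    case 104:
    case 106:               /* two labels sharing one table target */
        return x * x;
    default:                /* covers 101, 105, and everything out of range */
        return 0;
    }
}
```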

Procedures

This section reviews how procedure calls use the stack; each part of a stack frame (return address, saved registers, local variables, argument build area) is explained in detail.

The first six integer or pointer arguments are passed in registers, in the order %rdi, %rsi, %rdx, %rcx, %r8, %r9; any further arguments are passed on the stack.
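A small sketch to make the convention concrete. Under the Linux x86-64 calling convention, the hypothetical function below would receive a1 through a6 in the six registers listed above and a7 and a8 on the stack; the comments describe that convention, not anything visible in the C code itself.

```c
/* Eight integer arguments: six fit in registers, two spill to the stack. */
long sum8(long a1, long a2, long a3, long a4,   /* %rdi, %rsi, %rdx, %rcx */
          long a5, long a6,                     /* %r8, %r9 */
          long a7, long a8) {                   /* passed on the stack */
    return a1 + a2 + a3 + a4 + a5 + a6 + a7 + a8;
}
```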

Array Allocation and Access

The lea instruction and scaled-index addressing are a natural fit for array access, since the address of element i is base + i * element size.

Compilers prefer to traverse arrays by stepping a pointer by a fixed amount. Even for multidimensional arrays, they try to avoid the formula i * Wid + j that we learned in data structures courses, because multiplication is comparatively expensive.
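A sketch of the two ways to walk down one column of a two-dimensional array. The array size, the sample data, and the function names are assumptions made for illustration. The pointer version advances by one whole row per step, which is the fixed-stride pattern an optimizing compiler tends to produce from the indexed version anyway.

```c
#define ROWS 3
#define COLS 4

/* Sample data, chosen only for illustration. */
static const int grid[ROWS][COLS] = {
    {1,  2,  3,  4},
    {5,  6,  7,  8},
    {9, 10, 11, 12},
};

/* Index form: each access conceptually computes i * COLS + k. */
long column_sum_indexed(const int a[ROWS][COLS], int k) {
    long sum = 0;
    for (int i = 0; i < ROWS; i++)
        sum += a[i][k];
    return sum;
}

/* Pointer form: a row pointer moves forward by one row
   (COLS * sizeof(int) bytes) per iteration, so the loop body
   needs an addition instead of a multiplication. */
long column_sum_pointer(const int a[ROWS][COLS], int k) {
    long sum = 0;
    for (const int (*row)[COLS] = a; row != a + ROWS; row++)
        sum += (*row)[k];
    return sum;
}
```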

Heterogeneous Data Structures - 1/15/2022

Discussed struct, union, and memory alignment.

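The size of a union is the size of its largest member (rounded up to the union's alignment).

A union lets you reinterpret the same bytes as another type (e.g., reading the bit pattern of a double as an unsigned long). Unlike a cast, this does not convert the value.

Each member of a structure must sit at an offset that is a multiple of its alignment, which for the basic types on x86-64 equals the type's size. The compiler inserts padding where needed, including at the end of the structure.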
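A sketch of both points, with hypothetical type names. On x86-64 Linux the structure below would typically place c at offset 0, i at 4, d at 8, and v at 16, for a total size of 24 bytes; other platforms may differ. Reading a union member other than the one last written is allowed in C but not in C++.

```c
#include <stddef.h>
#include <stdint.h>

/* Mixed member sizes force the compiler to insert padding. */
struct rec {
    char   c;   /* followed by padding so that i is aligned */
    int    i;
    char   d;   /* followed by padding so that v is aligned */
    double v;
};

/* Both members share the same 8 bytes (assuming a 64-bit double). */
union pun {
    double   d;
    uint64_t u;
};

/* Return the raw bit pattern of a double; no numeric conversion happens. */
uint64_t double_bits(double d) {
    union pun p;
    p.d = d;
    return p.u;
}
```

Compare with the cast (uint64_t)1.0, which converts the value and yields 1 rather than the bit pattern 0x3FF0000000000000.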

Combining Control and Data at the Machine Level—1/16/2022

Precedence of Pointers and Address Arithmetic—Determine the output of the following program:

#include<stdio.h>
int arr[20]={10,9,8,7,6,5,4,3,2,1,5};
int main()
{
	int *p=arr;	/* arr decays to int *; &arr would have type int (*)[20] */
	*(p+8)+=9;
	int m=*((char*)p+8);
	int n=*((char*)(p+8));
	printf("%d %d",m,n);
	return 0;
}

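The answer is 8 11 (on a little-endian machine with 4-byte int). (char*)p+8 is a byte offset, so m reads the low byte of arr[2], which is 8; p+8 is an element offset, so n reads the low byte of arr[8], which is now 2+9=11.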

int (*p)(int, int *): a function pointer. Assign it with p = fun and call it with res = p(a, &b); the call returns an int.

int *p(int, int *): a function that returns a pointer (an int *).
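A sketch that puts the two declarations side by side. All function names here are hypothetical; only the two declaration shapes come from the notes.

```c
/* An ordinary function whose type matches int (*)(int, int *). */
int fun(int a, int *b) {
    return a + *b;
}

/* A function that returns a pointer: adds a to *b and hands b back. */
int *pick(int a, int *b) {
    *b += a;
    return b;
}

/* Calls fun through a function pointer. */
int call_through_pointer(int a, int b) {
    int (*p)(int, int *) = fun;   /* the parentheses make p a pointer */
    return p(a, &b);
}

/* Calls pick directly and dereferences the pointer it returns. */
int call_pointer_function(int a, int b) {
    int *r = pick(a, &b);         /* r points at the local copy of b */
    return *r;
}
```

The parentheses around *p are what decide the meaning: without them, the * binds to the return type.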

Various GDB debugging commands (for peda users, b, si, ni, x/s, and x/wx cover most needs, since peda already displays the registers, disassembly, stack, etc.):

Errors in the main text of Exercise 3.46:

Global edition: get_line is called with the return address equal to ~~0x400776~~ 0x400076.

Chinese edition: The ASCII codes for characters 0–9 are ~~0x3~0x39~~ 0x30~0x39.

In stack diagrams, addresses increase from bottom to top and from right to left. Therefore, for a dword (qword), data is stored from left to right (little-endian); while for a single byte, it is stored from right to left.

Question:

In part D, when the get_line function returns, the corrupted registers should also include %rip, not just %rbx as stated in the answer.

Three main defenses against stack buffer overflow attacks: Address Space Layout Randomization (ASLR), stack protection (canary), and non-executable memory (NX).

Assembly support for variable-length stacks—base pointer rbp:

pushq %rbp
movq %rsp, %rbp
# function body goes here
leave
# leave is equivalent to:
#   movq %rbp, %rsp
#   popq %rbp
ret

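~~The book mentions that recent compilers have abandoned the convention of using a base pointer.~~ In practice, even with the latest compilers, a simple "hello world" program compiled without optimization still uses the base pointer, so the convention is still alive there. (With optimization enabled, GCC on x86-64 omits the frame pointer by default, which matches what the book describes.)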
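A sketch of the kind of function that needs a variable-size frame. Because the array length is only known at run time, the compiler cannot address locals at fixed offsets from %rsp, and it commonly falls back to %rbp as a stable base. The function name is made up, and the exact code generated depends on the compiler.

```c
/* Sum of i*i for i in [0, n). Requires n >= 1, since a
   variable-length array must have a positive size. */
long sum_squares(long n) {
    long buf[n];                 /* frame size depends on the argument */
    for (long i = 0; i < n; i++)
        buf[i] = i * i;
    long sum = 0;
    for (long i = 0; i < n; i++)
        sum += buf[i];
    return sum;
}
```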

Questions:

Why is s2 rounded to the "nearest multiple of 8" in Exercise B, and why is "the offset of s1 preserved to the nearest multiple of 16" in Exercise D?

Floating-Point Code — 1/19/2022

Registers:

%ymm0–%ymm15 are 256-bit floating-point registers.

%xmm0–%xmm15 are 128-bit floating-point registers, occupying the lower 128 bits of the corresponding ymm registers.

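The first 8 registers (%xmm0–%xmm7) can carry floating-point function arguments, and %xmm0 also holds the floating-point return value, much as %rax does for integers. Under the Linux convention the book uses, all 16 registers are caller-saved; none is preserved across a call.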

Moving floating-point values between memory and registers:

movss: move single-precision float (between %xmm and a 32-bit memory location)

movsd: move double-precision double (between %xmm and a 64-bit memory location)

Moving between registers:

movaps: move aligned packed single-precision floats

movapd: move aligned packed double-precision doubles

Similar to integer registers, you cannot move a value directly from one memory location to another.

Conversion between floating-point and integer types (src can also be memory; omitted here for brevity):

(src type)\(dst type)	int	long	float	double
int	\	cltq	cvtsi2ssl %eax, %xmm0	cvtsi2sdl %eax, %xmm0
long	use %eax (low 32 bits of %rax)	\	cvtsi2ssq %rax, %xmm0	cvtsi2sdq %rax, %xmm0
float	cvttss2si %xmm0, %eax	cvttss2siq %xmm0, %rax	\	cvtss2sd mem, %xmm0
double	cvttsd2si %xmm0, %eax	cvttsd2siq %xmm0, %rax	cvtsd2ss mem, %xmm0	\

The instructions in CS:APP differ noticeably from what gcc emits by default today. This table follows gcc's default output. The differences are:

All floating-point instructions in CS:APP carry the prefix "v" because the book uses the AVX forms. gcc (e.g., MinGW 10.3) emits the SSE forms without the prefix unless AVX is enabled, for example with -mavx. The book also omits the "l" suffix on the int-to-float and int-to-double conversion instructions.

The three-operand floating-point instructions shown in CS:APP belong to AVX; the SSE forms take two operands, with the destination doubling as a source.

Instead of the obscure in-register conversion sequences for float↔double (vunpcklps, vcvtps2pd, vmovddup, vcvtpd2psx) used in CS:APP, gcc first copies the register value to memory and then uses a single cvt instruction.
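A sketch of the C casts behind the conversion table. Each function is a single cast, so its compiled body should consist mostly of the instruction named in its comment; the function names are hypothetical. The extra "t" in the cvtt instructions stands for truncation, which matches C's rule that a float-to-integer cast rounds toward zero.

```c
int    double_to_int(double x)   { return (int)x; }     /* cvttsd2si  */
long   float_to_long(float x)    { return (long)x; }    /* cvttss2siq */
double int_to_double(int x)      { return (double)x; }  /* cvtsi2sdl  */
float  long_to_float(long x)     { return (float)x; }   /* cvtsi2ssq  */
double float_to_double(float x)  { return (double)x; }  /* cvtss2sd   */
float  double_to_float(double x) { return (float)x; }   /* cvtsd2ss   */
```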

Floating-point operators: Omitted, as the assembly statements are self-explanatory.

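Bitwise operations on floating-point values: They operate on all 128 bits of an XMM register. The destination must be an XMM register; the source may be an XMM register or a 16-byte-aligned memory operand.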

Immediate values in floating-point operations: Pre-stored in memory according to the IEEE standard and copied into %xmm registers when used in operations.
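A sketch of what the bitwise instructions are used for. Negating a double or taking its absolute value only touches the sign bit, so a compiler typically loads a mask constant from memory and applies xorpd or andpd. The C functions below do the same bit manipulation by hand; they assume a 64-bit IEEE 754 double, and their names are made up.

```c
#include <stdint.h>
#include <string.h>

/* Negate by flipping bit 63 (what xorpd with a sign mask does). */
double flip_sign(double x) {
    uint64_t bits;
    memcpy(&bits, &x, sizeof bits);
    bits ^= UINT64_C(1) << 63;
    memcpy(&x, &bits, sizeof x);
    return x;
}

/* Absolute value by clearing bit 63 (what andpd with an inverted sign mask does). */
double clear_sign(double x) {
    uint64_t bits;
    memcpy(&bits, &x, sizeof bits);
    bits &= ~(UINT64_C(1) << 63);
    memcpy(&x, &bits, sizeof x);
    return x;
}
```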

Floating-point comparison instructions:

comiss: compare single-precision floats.

comisd: compare double-precision doubles.

The comparison sets three flags: CF, ZF, and PF. PF is set to 1 if and only if either operand is NaN, in which case the result is Unordered.
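A sketch of how the unordered case shows up in C, modeled on the classification example in the book. Every ordered comparison with NaN is false, so NaN falls through all three tests below; in the compiled code that fall-through is typically detected with a jump on PF. The helper is_unordered is an addition for illustration.

```c
#include <math.h>

typedef enum { NEG, ZERO, POS, OTHER } range_t;

/* Classify x; OTHER is reached only when x is NaN. */
range_t find_range(float x) {
    if (x < 0)
        return NEG;
    else if (x == 0)
        return ZERO;
    else if (x > 0)
        return POS;
    else
        return OTHER;
}

/* True exactly when comparing a and b would set PF (unordered result). */
int is_unordered(float a, float b) {
    return isnan(a) || isnan(b);
}
```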

(The practice problems are good, but unfortunately, there's no answer key. I'll skip them.)

Summary - 1/22/2022

Finally finished learning x86 assembly! ✿✿ヽ(°▽°)ノ✿ Celebration time!