x86-64 Memory Architecture and mov Instructions: Deep Dive into Addressing Mechanisms, Stack Operati

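This article is hand-typed, original, in-depth material aimed at students of computer organization, assembly, and CS:APP. Genuine reading and discussion are welcome.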


Based on the x86-64 architecture, this article starts with the matrix-based physical implementation of main memory, systematically breaks down the memory addressing mechanism, the family of data transfer instructions, and the logic of stack operations. It will help you fully grasp the underlying principles of CPU-memory interaction and thoroughly understand the essence of the mov instruction.


I. Physical Implementation of Main Memory and Addressing

1.1 Matrix-Based Storage Structure

Modern main memory is implemented using a matrix structure, rather than a simple linear arrangement. The core motivation behind this design is to avoid the physical wiring challenges caused by “extremely long memory address lines.”

Core Idea: Map a one-dimensional address into a two-dimensional physical space through row and column decoding, significantly reducing the complexity of address decoding circuits.
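The row/column split can be sketched in a few lines of C. This is only a toy model under assumed parameters: the 10-bit row and 10-bit column widths, the `cell_pos` type, and the function names are hypothetical, and real DRAM adds banks, ranks, and multiplexed address pins on top of this idea.

```c
#include <stdint.h>

/* Hypothetical toy geometry: 2^ROW_BITS rows by 2^COL_BITS columns of cells. */
enum { ROW_BITS = 10, COL_BITS = 10 };

typedef struct {
    uint32_t row;
    uint32_t col;
} cell_pos;

/* Split a linear cell address into the row and column seen by the two decoders. */
cell_pos decode_address(uint32_t addr)
{
    cell_pos pos;
    pos.row = (addr >> COL_BITS) & ((1u << ROW_BITS) - 1u); /* upper bits select the row */
    pos.col = addr & ((1u << COL_BITS) - 1u);               /* lower bits select the column */
    return pos;
}

/* Select lines needed by one flat decoder covering every cell. */
uint32_t flat_select_lines(void)
{
    return 1u << (ROW_BITS + COL_BITS);
}

/* Select lines needed by a row decoder plus a column decoder. */
uint32_t matrix_select_lines(void)
{
    return (1u << ROW_BITS) + (1u << COL_BITS);
}
```

With these assumed widths, a flat decoder would need 1,048,576 select lines, while the matrix arrangement needs 1,024 + 1,024 = 2,048, which is the wiring saving the section describes.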

1.2 Memory Model in x86-64

In the x86-64 architecture, the memory system has the following key characteristics:

| Feature | Description |
| --- | --- |
| Address Space | Theoretically supports $2^{64}$ distinct memory addresses |
| Address Length | Each memory address requires 64 bits in binary representation |
| Addressing Granularity | Each address points to a single byte (1 byte = 8 bits) |
| Byte Order | Little-Endian |

Although the architecture supports a $2^{64}$ address space, actual implementations typically use only 48 bits (256 TB) to avoid excessively large page tables.
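To make the little-endian entry concrete, here is a minimal C sketch that lays out a 32-bit value byte by byte in little-endian order. The helper names `store_le32` and `load_le32` are hypothetical; the code writes the order explicitly, so it behaves the same on any host.

```c
#include <stddef.h>
#include <stdint.h>

/* Store a 32-bit value at p in little-endian order: least significant byte first. */
void store_le32(uint8_t *p, uint32_t value)
{
    for (size_t i = 0; i < 4; i++) {
        p[i] = (uint8_t)(value >> (8 * i));
    }
}

/* Rebuild the value: byte i contributes bits 8*i .. 8*i+7. */
uint32_t load_le32(const uint8_t *p)
{
    uint32_t value = 0;
    for (size_t i = 0; i < 4; i++) {
        value |= (uint32_t)p[i] << (8 * i);
    }
    return value;
}
```

Storing 0x11223344 this way puts 0x44 at the lowest address and 0x11 at the highest, which is the layout an x86-64 machine uses for a 4-byte integer in memory.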

II. Data Formats and mov Instruction Rules

x86-64 Data Types

x86-64 assembly instructions use a single-character suffix to explicitly indicate the operand size. This is fundamental to understanding instruction behavior:

| Suffix | Full Name | Size | Example Instruction |
| --- | --- | --- | --- |
| b | byte | 1 byte (8 bits) | movb |
| w | word | 2 bytes (16 bits) | movw |
| l | long/double word | 4 bytes (32 bits) | movl |
| q | quad word | 8 bytes (64 bits) | movq |

2.1 Data Movement Instructions (mov)

| Instruction | Effect |
| --- | --- |
| movb | move byte |
| movw | move word |
| movl | move double word |
| movq | move quad word |

2.2 Special Transfer Instruction: movabsq

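A regular movq can only carry a 32-bit immediate, which is sign-extended to 64 bits before it is written to the destination. A constant that does not fit in a sign-extended 32-bit field would lose its upper bits, so a full 64-bit immediate requires movabsq, whose destination must be a register:

    # Not encodable as a regular movq: keeping only the low 32 bits and sign-extending
    # them would give %rax = 0x0000000044556677. Assemblers typically reject this form
    # or fall back to the movabs encoding rather than truncating silently.
    movq $0x0011223344556677, %rax      # ❌
    # Correct: full 64-bit immediate transfer
    movabsq $0x0011223344556677, %rax   # %rax = 0x0011223344556677 ✓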
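The 32-bit immediate rule can be checked with a small C model. This is a sketch, not an assembler: `fits_sign_extended_imm32` and `movq_imm32_result` are hypothetical helper names, and the unsigned-to-signed casts assume the usual two's-complement behavior of mainstream compilers.

```c
#include <stdbool.h>
#include <stdint.h>

/* True when a 64-bit constant can be produced by sign-extending a 32-bit immediate. */
bool fits_sign_extended_imm32(uint64_t value)
{
    int64_t as_signed = (int64_t)value; /* assumes two's-complement conversion */
    return as_signed >= INT32_MIN && as_signed <= INT32_MAX;
}

/* What a regular movq leaves in a 64-bit register for a given 32-bit immediate. */
uint64_t movq_imm32_result(int32_t imm)
{
    return (uint64_t)(int64_t)imm; /* sign-extend 32 -> 64 bits */
}
```

Under this model, 0x0011223344556677 does not fit and therefore needs movabsq, while -1 fits and expands to 0xFFFFFFFFFFFFFFFF.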

III. x86-64 Register System

3.1 Hierarchical Structure of General-Purpose Registers

x86-64 registers support partial access, which is a key design for backward compatibility with 32/16/8-bit code:

[Figure: Common x86-64 Registers]

Data Consistency of Registers:
As shown above, the registers of different sizes are nested views of the same underlying storage unit.

This means the aliases of one register share their contents:

  • Write synchronization: Writing to a narrower alias (such as %eax) also changes the corresponding 64-bit register (%rax).
  • Partial read: Whichever alias you use, you are accessing the same data. If you write a 64-bit integer into %rax, you can later read its lower 32 bits through %eax or its lowest byte through %al.
  • Upper bits: Writing an 8-bit or 16-bit alias (%al, %ax) leaves the remaining upper bits of %rax unchanged, whereas writing a 32-bit alias (such as %eax) automatically zeroes the upper 32 bits.
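The partial-write rules can be modeled in C by treating the 64-bit register as a `uint64_t`. The function names below are hypothetical; each one returns the new value of the full register after a write of the given width.

```c
#include <stdint.h>

/* movb-style write (e.g. to %al): replaces bits 7..0, keeps bits 63..8. */
uint64_t write_low8(uint64_t reg, uint8_t value)
{
    return (reg & ~UINT64_C(0xFF)) | value;
}

/* movw-style write (e.g. to %ax): replaces bits 15..0, keeps bits 63..16. */
uint64_t write_low16(uint64_t reg, uint16_t value)
{
    return (reg & ~UINT64_C(0xFFFF)) | value;
}

/* movl-style write (e.g. to %eax): the old contents are discarded and
   the upper 32 bits become zero. */
uint64_t write_low32(uint64_t reg, uint32_t value)
{
    (void)reg; /* nothing of the old value survives */
    return (uint64_t)value;
}
```

Starting from 0x0011223344556677, an 8-bit write of 0xFF gives 0x00112233445566FF, a 16-bit write of 0xFFFF gives 0x001122334455FFFF, and a 32-bit write of 0xFFFFFFFF gives 0x00000000FFFFFFFF.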

3.2 Common Registers Overview

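| 64-bit | 32-bit | 16-bit | 8-bit | Usage |
| --- | --- | --- | --- | --- |
| %rax | %eax | %ax | %al | Accumulator, return value |
| %rbx | %ebx | %bx | %bl | Base register |
| %rcx | %ecx | %cx | %cl | Counter, loop variable |
| %rdx | %edx | %dx | %dl | Data register |
| %rsi | %esi | %si | %sil | Source index |
| %rdi | %edi | %di | %dil | Destination index |
| %r8-%r15 | %r8d-%r15d | %r8w-%r15w | %r8b-%r15b | Extended registers |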

Under the System V AMD64 calling convention used on Linux and macOS, %rdi holds a function's first integer or pointer argument and %rsi the second. %rcx is traditionally the counter register, and in the examples below it holds the C index "i".
Note that these 64-bit names correspond to 8-byte types such as long and pointers. If the parameter has a different type, the matching register alias is used. For example, an int parameter arrives in %edi or %esi.


IV. Operand Addressing Modes

x86-64 provides flexible operand forms, which fall into three categories: Immediate, Register, and Memory.
A memory operand refers to the value stored at an address computed from registers and constants; that value may itself be data or another address.

4.1 Addressing Mode Quick Reference

| Type | Form | Operand Value | Name |
| --- | --- | --- | --- |
| Immediate | $Imm | Imm | Immediate |
| Register | ra | R[ra] | Register |
| Memory | Imm | M[Imm] | Absolute |
| Memory | (ra) | M[R[ra]] | Indirect |
| Memory | Imm(rb) | M[Imm + R[rb]] | Base + displacement |
| Memory | (rb, ri) | M[R[rb] + R[ri]] | Indexed |
| Memory | (rb, ri, s) | M[R[rb] + R[ri]·s] | Scaled indexed |

A displacement can be combined with the indexed forms as well: the most general form is Imm(rb, ri, s), which names M[Imm + R[rb] + R[ri]·s].

📝 Scale factor s: can only be 1, 2, 4, or 8 (convenient for array element access)

4.2 Address Calculation Example

Assume the following register and memory state:

| Address | Value | Register | Value |
| --- | --- | --- | --- |
| 0x100 | 0xFF | %rax | 0x100 |
| 0x104 | 0xAB | %rcx | 0x1 |
| 0x108 | 0x13 | %rdx | 0x3 |
| 0x10C | 0x11 | - | - |

Then the operand values are:

| Operand | Calculation | Result |
| --- | --- | --- |
| %rax | R[%rax] | 0x100 |
| 0x104 | M[0x104] | 0xAB |
| $0x108 | 0x108 | 0x108 |
| (%rax) | M[0x100] | 0xFF |
| 4(%rax) | M[0x100 + 4] | 0xAB |
| 9(%rax, %rdx) | M[0x100 + 0x3 + 9] | 0x11 |
| 260(%rcx, %rdx) | M[0x1 + 0x3 + 0x104] | 0x13 |
| (%rax, %rdx, 4) | M[0x100 + 0x3 * 4] | 0x11 |

  • Common scenario: %rdi holds the array parameter A, and %rcx holds the index i
    • Registers and addressing logic
      %rdi = A (array base address/pointer)
      %rcx = i (element index)
    • Addressing mapping for different data types (the scale factor equals the element size)
      int *A: (%rdi, %rcx, 4) -> A[i]
      char *A: (%rdi, %rcx, 1) -> A[i]
      long *A: (%rdi, %rcx, 8) -> A[i]
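The address arithmetic in the example can be reproduced with a short C sketch. `effective_address` and `toy_memory_read` are hypothetical names, and the toy memory contains only the four locations from the state table above.

```c
#include <stdint.h>

/* Address named by Imm(rb, ri, s): Imm + R[rb] + R[ri] * s.
   The scale s is expected to be 1, 2, 4, or 8. */
uint64_t effective_address(int64_t imm, uint64_t base, uint64_t index, unsigned scale)
{
    return (uint64_t)imm + base + index * scale;
}

/* Toy memory holding only the four locations from the example state. */
uint64_t toy_memory_read(uint64_t addr)
{
    switch (addr) {
    case 0x100: return 0xFF;
    case 0x104: return 0xAB;
    case 0x108: return 0x13;
    case 0x10C: return 0x11;
    default:    return 0; /* unmapped in this sketch */
    }
}
```

For instance, with %rax = 0x100 and %rdx = 0x3, the operand (%rax, %rdx, 4) corresponds to `effective_address(0, 0x100, 0x3, 4)`, which is 0x10C, and the toy memory returns 0x11 there. The array mappings follow the same formula with the scale set to the element size.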

V. Data Extension Transfer Instructions

5.1 Zero Extension

Extend smaller data to larger data, filling high bits with zero:

| Instruction | Source | Destination | Effect |
| --- | --- | --- | --- |
| movzbw | 8-bit | 16-bit | zero-extend to word |
| movzbl | 8-bit | 32-bit | zero-extend to double word |
| movzbq | 8-bit | 64-bit | zero-extend to quad word |
| movzwl | 16-bit | 32-bit | zero-extend to double word |
| movzwq | 16-bit | 64-bit | zero-extend to quad word |

  • Special case: there is no separate movzlq instruction. movl %ecx, %eax already does the job of zero-extending %ecx into %rax, because a 32-bit move automatically zeroes the upper 32 bits of the destination.

5.2 Sign Extension

Extend smaller data to larger data, copying the sign bit into higher bits:

| Instruction | Source | Destination |
| --- | --- | --- |
| movsbw | 8-bit | 16-bit |
| movsbl | 8-bit | 32-bit |
| movsbq | 8-bit | 64-bit |
| movswl | 16-bit | 32-bit |
| movswq | 16-bit | 64-bit |
| movslq | 32-bit | 64-bit |
| cltq | %eax | %rax |

  • cltq is an abbreviation of movslq %eax, %rax => %rax = sign-extend(%eax)

5.3 Extension Instruction Comparison Experiment

    movq $0x0011223344556677, %rax      # as a sign-extended 32-bit immediate this could only yield 0x0000000044556677
    movabsq $0x0011223344556677, %rax   # %rax = 0x0011223344556677
    movb $0xAA, %dl                     # %dl = 0xAA (10101010)
    movb %dl, %al                       # %rax = 0x00112233445566AA (only lowest byte modified)
    movsbq %dl, %rax                    # %rax = 0xFFFFFFFFFFFFFFAA (sign-extended, 1-filled)
    movzbq %dl, %rax                    # %rax = 0x00000000000000AA (zero-extended, 0-filled)
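The same extensions can be expressed with C casts, which is roughly what a compiler relies on when it picks movz or movs instructions. The function names mirror the instructions but are only illustrative C helpers; the casts to signed types assume two's-complement conversion, as on mainstream x86-64 compilers.

```c
#include <stdint.h>

/* movzbq: widen an 8-bit value to 64 bits, filling the upper bits with zeros. */
uint64_t movzbq(uint8_t src)
{
    return (uint64_t)src;
}

/* movsbq: widen an 8-bit value to 64 bits, copying bit 7 into the upper bits. */
uint64_t movsbq(uint8_t src)
{
    return (uint64_t)(int64_t)(int8_t)src;
}

/* movslq (and cltq when the source is %eax): widen 32 bits to 64, copying bit 31. */
uint64_t movslq(uint32_t src)
{
    return (uint64_t)(int64_t)(int32_t)src;
}
```

Feeding 0xAA to both 8-bit helpers reproduces the last two rows of the experiment: the sign-extending version yields 0xFFFFFFFFFFFFFFAA and the zero-extending version yields 0x00000000000000AA.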

VI. Stack Operations: Push and Pop

In the x86-64 architecture, the stack is not merely an abstract data structure; it is a region of memory that the hardware tracks through the %rsp register. Its core logic is that push and pop each combine a memory access with an adjustment of %rsp, and that the stack grows toward lower addresses.

6.1 Core Concept: Counterintuitive “Downward Growth”

The stack is a special region of memory used for temporarily storing data (such as local variables and saved register states).

  • Stack pointer: The %rsp register always holds the lowest address of the element currently on top of the stack.
  • Downward growth: A push operation causes %rsp to move toward lower addresses. In other words, as more data is pushed, the numerical value in %rsp becomes smaller.
  • Operation unit: In x86-64 systems, the standard stack operation unit is a Quadword (8 bytes / 64 bits).

6.2 Instruction Decomposition: Micro-Operation Equivalence

push and pop are highly encapsulated instructions. Their execution flow can be decomposed into pointer arithmetic and memory access:

  • Push
    Executing pushq %rax causes the CPU to perform:
    1. Pointer decrement: subtract 8 from %rsp (move downward by 8 bytes to allocate space).
    2. Memory write: store the data in %rax at the new address pointed to by %rsp.
    pushq %rax has the same effect on %rsp and memory as:
        subq $8, %rsp        # 1. stack pointer moves downward (address decreases)
        movq %rax, (%rsp)    # 2. write data to new top address
    Unlike this two-instruction sequence, pushq leaves the condition flags untouched and is encoded as a single, shorter instruction.
  • Pop
    Executing popq %rax performs the reverse sequence:
    1. Memory read: copy the value pointed to by %rsp into %rax.
    2. Pointer increment: add 8 to %rsp (reclaim the 8-byte space).
    popq %rax has the same effect on %rax and %rsp as:
        movq (%rsp), %rax    # 1. read the top element
        addq $8, %rsp        # 2. stack pointer moves upward (address increases)
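The decomposition above can be mirrored in a toy C model of the stack. Everything here is an assumption made for illustration: the `toy_cpu` type, the 64-byte memory window mapped at 0x7FC0 to 0x7FFF, and the helper names. There are no bounds checks, so it is only meant for short traces.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical toy machine: 64 bytes of stack memory mapped at 0x7FC0..0x7FFF. */
enum { STACK_BASE = 0x7FC0, STACK_SIZE = 0x40 };

typedef struct {
    uint64_t rsp;             /* simulated stack pointer */
    uint8_t  mem[STACK_SIZE]; /* simulated stack memory */
} toy_cpu;

/* Start with an empty stack: %rsp points just past the mapped window (0x8000). */
void cpu_init(toy_cpu *cpu)
{
    cpu->rsp = STACK_BASE + STACK_SIZE;
    memset(cpu->mem, 0, sizeof cpu->mem);
}

/* Read the 8 bytes stored at a simulated address without touching %rsp. */
uint64_t peek64(const toy_cpu *cpu, uint64_t addr)
{
    uint64_t value;
    memcpy(&value, &cpu->mem[addr - STACK_BASE], sizeof value);
    return value;
}

/* pushq: decrement %rsp by 8, then store the value at the new top. */
void pushq(toy_cpu *cpu, uint64_t value)
{
    cpu->rsp -= 8;
    memcpy(&cpu->mem[cpu->rsp - STACK_BASE], &value, sizeof value);
}

/* popq: load the value at the top, then increment %rsp by 8. */
uint64_t popq(toy_cpu *cpu)
{
    uint64_t value = peek64(cpu, cpu->rsp);
    cpu->rsp += 8;
    return value;
}
```

Pushing 0x123 and then 0x22 moves the simulated %rsp from 0x8000 to 0x7FF8 and then 0x7FF0. A following pop returns 0x22 and restores %rsp to 0x7FF8, while `peek64` at 0x7FF0 still reads 0x22, because popping never erases memory.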

6.3 State Trace: Register and Memory Instantaneous States

Below is a trace analysis assuming %rsp initially equals 0x8000:

| Instruction | %rax Value | %rsp (Stack Top Address) | Memory State | Description |
| --- | --- | --- | --- | --- |
| movq $0x123, %rax | 0x123 | 0x8000 | M[0x8000] = ? | Initialize register |
| pushq %rax | 0x123 | 0x7FF8 | M[0x7FF8] = 0x123 | Pointer moves down 8 bytes and writes |
| movq $0x22, %rax | 0x22 | 0x7FF8 | M[0x7FF8] = 0x123 | Register overwritten, memory unchanged |
| pushq %rax | 0x22 | 0x7FF0 | M[0x7FF0] = 0x22 | Pointer moves down again and writes |
| popq %rax | 0x22 | 0x7FF8 | M[0x7FF8] = 0x123, M[0x7FF0] = 0x22 | Data loaded into register, pointer restored |

Observe the last row, popq %rax:
Apart from loading the register, the pop operation only moves the %rsp pointer; it does not clear or erase old data in memory!
The 0x22 at address 0x7FF0 is not physically erased. It stays there as garbage data until a later push overwrites it.

VII. Common Errors and Notes

7.1 Illegal Operand Combinations

| Incorrect Instruction | Reason | Correct Form |
| --- | --- | --- |
| movb $0xF, (%ebx) | Addresses are 64 bits wide, so %ebx (32-bit) should not serve as the address register | movb $0xF, (%rbx) |
| movl %rax, (%rsp) | Suffix l does not match 64-bit register rax | movq %rax, (%rsp) |
| movw (%rax), 4(%rsp) | Source and destination cannot both be memory | Use register as intermediary |
| movb %al, %sl | %sl register does not exist | movb %al, %sil |
| movq %rax, $0x123 | Immediate cannot be destination operand | movq $0x123, %rax |
| movl %eax, %dx | Destination operand size incorrect | movl %eax, %edx |
| movb %si, 8(%rbp) | Suffix b does not match 16-bit register si | movw %si, 8(%rbp) |

7.2 Key Principles

  1. mov is just a copy: No persistent link is established between source and destination; it is only a value copy.
  2. Memory-to-memory prohibited: a mov instruction cannot have memory as both source and destination; copying between two memory locations takes a load into a register followed by a store.
  3. Immediate constraints: Immediate values cannot be destination operands, and their size is limited by the instruction suffix.

VIII. Practice: C Code and Assembly Comparison

8.1 Swap Function

    void swap(long *xp, long *yp) {
        long t0 = *xp;
        long t1 = *yp;
        *xp = t1;
        *yp = t0;
    }

Corresponding x86-64 assembly:

    swap:
        movq (%rdi), %rax    # t0 = *xp (xp in %rdi)
        movq (%rsi), %rdx    # t1 = *yp (yp in %rsi)
        movq %rdx, (%rdi)    # *xp = t1
        movq %rax, (%rsi)    # *yp = t0
        ret

Execution trace:

  • Initial: %rdi=0x120 (the memory there holds 123), %rsi=0x100 (the memory there holds 456)
  • After execution: the value at 0x120 becomes 456, and the value at 0x100 becomes 123
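The trace can be replayed in C. The function body is the swap from this section, repeated so the block compiles on its own; `swap_trace_ok` is a hypothetical helper, and the local variables stand in for the memory at the illustrative addresses 0x120 and 0x100.

```c
/* Same logic as the swap function above. */
void swap(long *xp, long *yp)
{
    long t0 = *xp; /* movq (%rdi), %rax */
    long t1 = *yp; /* movq (%rsi), %rdx */
    *xp = t1;      /* movq %rdx, (%rdi) */
    *yp = t0;      /* movq %rax, (%rsi) */
}

/* Replays the trace: returns 1 when the two values were exchanged. */
int swap_trace_ok(void)
{
    long x = 123; /* plays the role of the memory at 0x120 */
    long y = 456; /* plays the role of the memory at 0x100 */
    swap(&x, &y);
    return x == 456 && y == 123;
}
```

Both values are loaded into registers before either store happens, which is why the exchange works without a memory-to-memory move.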

8.2 Array Addressing

Assume long long array[8] base address is in %rdx and index 4 is in %rcx.
Goal: store array[4] into %rax

    movq (%rdx, %rcx, 8), %rax
    # 1. Compute address %rdx + 4*8
    # 2. Move 8 bytes from that memory address into %rax
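In C terms, the scaled-index load is what `array[index]` means. The sketch below spells out the byte arithmetic explicitly; `load_element` is a hypothetical helper, and the comments map each step to the registers assumed in this section.

```c
#include <stddef.h>
#include <string.h>

/* Load array[index] by computing the byte address by hand, the way
   (%rdx, %rcx, 8) does: base + index * element size. */
long long load_element(const long long *array, long index)
{
    const unsigned char *base = (const unsigned char *)array;             /* %rdx */
    const unsigned char *addr = base + (size_t)index * sizeof(long long); /* %rdx + %rcx*8 */
    long long value;
    memcpy(&value, addr, sizeof value); /* move 8 bytes into the result (%rax) */
    return value;
}
```

For an 8-element array and index 4, this reads the 8 bytes that start 32 bytes past the base, which is the same element that `array[4]` names.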

IX. Summary and Outlook

Starting from the matrix implementation of main memory, this article systematically reviewed the following aspects of the x86-64 architecture:

  1. Addressing system: $2^{64}$ byte address space, byte-level addressing
  2. Data formats: b/w/l/q sizes with explicit instruction suffixes
  3. Register design: hierarchical partial access mechanism
  4. Flexible addressing: immediate, register, and memory operands, with memory forms ranging from absolute addresses to scaled indexing
  5. Data extension: precise control of zero and sign extension
  6. Stack mechanism: push/pop operations growing toward lower addresses

Understanding these low-level mechanisms forms a solid foundation for subsequent study of arithmetic logic instructions, control flow, procedure calls, and memory hierarchy structures. Main memory is not only a container for data but also the core interface between the CPU and software. Mastering its working principles is essential to truly understanding the complete picture of computer systems.


References

  • “Computer Organization” course notes
  • Computer Systems: A Programmer’s Perspective (CS:APP), Chapter 3
  • Intel 64 and IA-32 Architectures Software Developer’s Manual

Last updated: 2026-02-26
