Storage Tech
- SRAM: faster but more expensive
- DRAM: slower and cheaper, sensitive to disturbance; its cells are organized into supercells
How to read from a DRAM
![[{33C9AD35-F29C-4A5B-8185-DE954359B2F7}.png]]
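- Connect the DRAM chip to a memory controller.
- The controller selects a ROW; the chip copies that entire row into its internal row buffer.
- The controller selects a COL; the chip picks that supercell out of the row buffer.
- The chip outputs the supercell's contents to the memory controller.
A supercell holds only ONE byte of information.
Several DRAM chips make up a memory module.
To read an 8-byte word, the module fetches one byte from each of its 8 DRAM chips, all at the same (row, col).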
RAM is what memory is built from (and it is volatile); nonvolatile storage covers ROM and disk.
SSDs use flash memory technology.
Read & Write from main memory
What happens on a load:
`movq A, %rax`
![[{593C2E54-1E72-43AC-850F-4376524BEDDD}.png]]
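- The CPU places address A on the system bus through the bus interface; the I/O bridge forwards it over the memory bus to main memory.
- Main memory reads the word at A from DRAM and sends it back over the memory bus and system bus to the bus interface.
- The CPU copies the word from the bus interface into register %rax in the register file.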
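What about a write:
`movq %rax, A`
- The CPU places address A on the bus; main memory reads the address and waits for the data.
- The CPU places the contents of %rax on the system bus.
- Main memory reads the data from the memory bus and stores it at address A.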
Disk and SSD
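Disk: rotating platters, vs. SSD: flash memory.
Disk access time consists of:
- seek time: move the arm so the head is over the target track.
- rotational latency: wait for the target sector to rotate under the head.
- transfer time: read the bits of the sector as it passes under the head.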
![[{25626AC0-E79C-447B-9EA0-55A47D8D7A4F}.png]]
The I/O bus links the I/O devices to the I/O bridge.
Memory-mapped I/O
A block of addresses in the address space is reserved for communicating with I/O devices.
Each device is mapped to one or more of these addresses, called I/O ports.
Example: how the CPU reads data from a disk
1. The CPU tells the disk to start a read by writing a command (with some parameters) to the disk's port.
2. It tells the disk which logical block to read.
3. It tells the disk where in main memory to store the data.
4. The disk transfers the data to memory without involving the CPU (DMA), then interrupts the CPU to signal that it is done.
Locality
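Temporal locality: the same variable is referenced repeatedly.
Spatial locality: nearby addresses are referenced, e.g., array elements with a stride-1 pattern.
Example: iterate over a matrix row by row (row index in the outer loop, column index in the inner loop), which matches C's row-major layout.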
Memory Hierarchy
![[{50BB4D7A-0CF5-4261-B2C9-8CB906523928}.png]]
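Level k serves as a cache for level k+1.
Data moves between adjacent levels in block-sized transfer units.
Hit and Miss
Miss types: cold miss (the cache is still empty), conflict miss (several blocks compete for the same slot), capacity miss (the working set is larger than the cache).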
the management of cache
Type | Managed by |
---|---|
registers | compiler |
L1 to L3 caches | Hardware |
virtual memory | Hardware + OS |
Cache memory
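Cache parameters: S, E, B, m
S sets, E lines per set, B bytes per block, m address bits; capacity C = S × E × B.
Remember: the set index is in the middle of the address (tag | set index | block offset).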
![[{193715AB-31C0-4A8A-AEF2-4309DF454FAE}.png]]
direct-mapped cache
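One line per set (E = 1).
How to read:
- Set selection: use the set index bits.
- Line matching: hit only if the line is valid && its tag matches.
- Word selection: use the block offset.
- On a miss, fetch the block and place it in the line chosen by its set index, replacing whatever was there.
Common array indexing: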
![[{A93926E3-225D-4E63-82D6-D9F552F817F9}.png]]
Note: two addresses map to the same set in only two cases:
- Same block: same tag, different block offset.
- Different tag: the two blocks compete for the line and evict each other.
Be careful of conflict misses when computing over two 8-element arrays:
```c
float dotprod(float x[8], float y[8]) {
    float sum = 0.0;
    int i;
    for (i = 0; i < 8; i++) {
        sum += x[i] * y[i];
    }
    return sum;
}
```
set associative cache
New issue: when a set is full, which line should a new block replace?
- random
- least frequently used (LFU)
- least recently used (LRU)
fully associative cache
Only one set (E = C/B), so a block can go in any line.
Only practical for small caches, such as the TLB.
write
write hit
The word w being written is already cached.
- write-through: immediately write w's block to the next lower level (e.g., main memory). Drawback: bus traffic on every write.
- write-back: write the block to the lower level only when it is evicted. Requires a dirty bit per line to record whether it has been modified.
write miss
- write-allocate: load the block from the lower level into the cache, then write it in the cache.
- no-write-allocate: write directly to the lower level, bypassing the cache.
Tip: when in doubt, assume write-back with write-allocate; the two are usually paired.