April 2020 - Page 2 - victoryang00’ blog

April 11, 2020 (updated February 6, 2022)

[Signal and Systems] Zero-state response of linear systems

Focus on continuous-time systems

Zero-input response

Zero-state response

cond

A brief summary of why we have to learn S&S

Note:

The homogeneous equation (齐次方程) and the general solution of the homogeneous equation (齐次方程的通解) are different things.
An example of why \(f(0_-)\) is not 0, since \(f\) is a causal system: the state before the impulse is the state at \(0_-\), and the effect of the impulse is the zero-state response. So in terms of physical meaning, the two are different.
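To make the zero-input / zero-state split concrete, here is a minimal sketch. It assumes a hypothetical first-order system \(y'(t) + a\,y(t) = f(t)\) with initial state \(y(0_-) = y_0\) and a unit-step input; neither the system nor the numbers come from the lecture. The zero-input response depends only on the initial state, the zero-state response only on the input, and by linearity their sum should match a direct simulation of the full system.

```python
import math

def zero_input(t, a, y0):
    # Response to the initial state alone (input set to zero).
    return y0 * math.exp(-a * t)

def zero_state(t, a):
    # Response to a unit-step input alone (initial state set to zero).
    return (1.0 - math.exp(-a * t)) / a

def simulate(t_end, a, y0, steps=100_000):
    # Forward-Euler integration of y' = -a*y + f(t), with f(t) = 1 for t >= 0.
    dt = t_end / steps
    y = y0
    for _ in range(steps):
        y += dt * (-a * y + 1.0)
    return y

a, y0, t = 2.0, 3.0, 1.5  # assumed values, for illustration only
total = zero_input(t, a, y0) + zero_state(t, a)
print(total, simulate(t, a, y0))  # the two numbers should agree closely
```

The simulation never separates the two parts; it only sees the initial state and the input together, which is the same reason a numerical tool needs to be run twice (once with zero input, once with zero state) to get each component.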

Matlab intro

Matlab can't distinguish between the zero-state and zero-input responses

April 11, 2020 (updated February 9, 2022)

[Computer Architecture] superscalar

Greater Instruction-Level Parallelism (ILP)

Multiple issue “superscalar”
- Replicate pipeline stages ⇒ multiple pipelines
- Start multiple instructions per clock cycle – CPI < 1, so use Instructions Per Cycle (IPC)
- E.g., 4GHz 4-way multiple-issue
  - 16 BIPS, peak CPI = 0.25, peak IPC = 4
- But dependencies reduce this in practice
“Out-of-Order” execution
- Reorder instructions dynamically in hardware to reduce impact of hazards
Hyper-threading
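
The peak numbers in the 4 GHz, 4-way example above follow from simple arithmetic. A small sketch (the function name is made up for illustration):

```python
def peak_throughput(clock_hz, issue_width):
    # Peak IPC equals the issue width; peak CPI is its reciprocal.
    peak_ipc = issue_width
    peak_cpi = 1 / issue_width
    # Instructions per second at peak = clock rate * instructions per cycle.
    peak_ips = clock_hz * peak_ipc
    return peak_ipc, peak_cpi, peak_ips

ipc, cpi, ips = peak_throughput(4e9, 4)
print(ipc, cpi, ips / 1e9)  # IPC, CPI, billions of instructions per second
```

These are upper bounds only; dependencies and hazards keep a real pipeline below them.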

Pipelining recap

pipeline complexities explained

GPRs FPRs

More than one Functional Unit
Floating point execution!
- Fadd & Fmul: fixed number of cycles; > 1
- Fdiv: unknown number of cycles!
Memory access: unknown number of cycles on a cache miss
Issue: Assign instruction to functional unit
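
A toy model of why variable latencies complicate the pipeline. This is only a sketch: the latency numbers are invented, every functional unit is assumed to be fully pipelined, and dependencies and structural hazards are ignored. Instructions issue in order, one per cycle, yet they complete out of order.

```python
# Hypothetical latencies in cycles; real values depend on the microarchitecture.
FIXED_LATENCY = {"add": 1, "fadd": 3, "fmul": 4}

def completion_cycles(program):
    """Issue one instruction per cycle, in order, and return when each finishes.

    Each entry of `program` is (name, op, latency_or_None). Ops with a fixed
    latency take it from FIXED_LATENCY; ops such as fdiv or a load that misses
    the cache carry their own latency, since it is not known in advance.
    """
    done = []
    for issue_cycle, (name, op, latency) in enumerate(program):
        cycles = FIXED_LATENCY[op] if latency is None else latency
        done.append((issue_cycle + cycles, name))
    return sorted(done)

program = [
    ("i0", "fdiv", 12),   # assumed data-dependent latency
    ("i1", "fmul", None),
    ("i2", "add", None),
    ("i3", "lw", 20),     # assumed cache-miss latency
]
for cycle, name in completion_cycles(program):
    print(f"cycle {cycle}: {name} completes")
```

Here the add issued third finishes first, while the fdiv issued first finishes third, which is the situation the issue logic has to manage.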

summary

Static multiple issue

VLIW: very long instruction word

The solution can be easily found

Quiz

[ ] A. In-order processors have a CPI >=1

[x] B. more stages allow a higher clock frequency

[x] D. OoO pipelines need speculation

[ ] E. superscalar processor can execute

April 1, 2020 (updated February 9, 2022)

[Computer Architecture] data path

intro - where are we now?

the cpu

Processor - datapath - control

scenario in a RISC-V machine

The datapath and control

Overview

problem: a single monolithic block
solution: break up the process of executing an instruction into stages
- smaller stages are easier to design
- easy to optimize, and modular

five stages

Instruction Fetch
- pc+=4
- make full use of the memory hierarchy
Instruction Decode
- first, read the opcode to determine the instruction type and field lengths
- second (at the same time!), read in data from all necessary registers
  - for add, read two registers
  - for addi, read one register
- third, generate the immediates
ALU
- the real work of most instructions is done here: arithmetic (+, -, *, /), shifting, logic (&, |)
- loads and stores also use this stage, for instance:
  - lw t0, 40(t1)
  - the address we are accessing in memory = the value in t1 PLUS the value 40
  - so we do this addition in this stage
Mem access
- Actually, only the load and store instructions do anything during this stage.
- it's fast but unavoidable
Register write
- write the result of some computation into a register.
- idle for stores and branches
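
The five stages above can be sketched as a toy interpreter. This is an illustration under loud assumptions: instructions are pre-split tuples instead of 32-bit words, registers and data memory are plain dictionaries, and only add, addi, lw and sw are handled (no branches). None of the names come from the lecture.

```python
def run_one(pc, imem, regs, dmem):
    """Push one instruction through the five stages of a toy datapath."""
    # 1. Instruction fetch: read the instruction and compute pc + 4.
    inst = imem[pc // 4]
    next_pc = pc + 4

    # 2. Instruction decode: find the type, read registers, pick the immediate.
    op, rd, rs1, rs2, imm = inst
    a = regs[rs1]
    store_data = regs[rs2] if op == "sw" else None
    b = regs[rs2] if op == "add" else imm

    # 3. ALU: arithmetic for add/addi, address computation for lw/sw.
    alu_out = a + b

    # 4. Memory access: only loads and stores do anything here.
    mem_out = None
    if op == "lw":
        mem_out = dmem[alu_out]
    elif op == "sw":
        dmem[alu_out] = store_data

    # 5. Register write: stores have nothing to write back.
    if op in ("add", "addi"):
        regs[rd] = alu_out
    elif op == "lw":
        regs[rd] = mem_out
    return next_pc

regs = {"t0": 0, "t1": 100, "t2": 7}
dmem = {140: 42}
imem = [
    ("lw", "t0", "t1", None, 40),    # lw   t0, 40(t1)
    ("addi", "t2", "t0", None, 1),   # addi t2, t0, 1
    ("sw", None, "t1", "t2", 44),    # sw   t2, 44(t1)
]
pc = 0
while pc < 4 * len(imem):
    pc = run_one(pc, imem, regs, dmem)
print(regs["t0"], regs["t2"], dmem[144])
```

For `lw t0, 40(t1)` the ALU stage computes 100 + 40 = 140, the memory stage reads that address, and the write stage puts the loaded word into t0.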
  
misc
memory alignment

data path elements state and sequencing

register

instruction level

add

time diagram for add

addi

lw

sw - critical path

combined I+S Immediate generation
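
A sketch of what a combined I+S immediate generator computes, based on the standard RV32I encodings: both formats keep imm[11:5] in inst[31:25], so only the low five bits need to be selected by format. The helper names and the encoders are hypothetical and exist only for this example.

```python
def sign_extend(value, bits):
    # Interpret the low `bits` bits of `value` as a two's-complement number.
    sign = 1 << (bits - 1)
    return (value & (sign - 1)) - (value & sign)

def imm_gen(inst, fmt):
    upper = inst >> 25                   # imm[11:5], same bits for I and S
    if fmt == "I":
        lower = (inst >> 20) & 0x1F      # imm[4:0] = inst[24:20]
    elif fmt == "S":
        lower = (inst >> 7) & 0x1F       # imm[4:0] = inst[11:7]
    else:
        raise ValueError(f"unsupported format: {fmt}")
    return sign_extend((upper << 5) | lower, 12)

def encode_i(imm, rs1, funct3, rd, opcode):
    # Helper for the demo: pack an I-type instruction word.
    return ((imm & 0xFFF) << 20) | (rs1 << 15) | (funct3 << 12) | (rd << 7) | opcode

def encode_s(imm, rs2, rs1, funct3, opcode):
    # Helper for the demo: pack an S-type instruction word.
    imm &= 0xFFF
    return (((imm >> 5) << 25) | (rs2 << 20) | (rs1 << 15)
            | (funct3 << 12) | ((imm & 0x1F) << 7) | opcode)

T0, T1 = 5, 6                                       # t0 = x5, t1 = x6
lw = encode_i(40, T1, 0b010, T0, 0b0000011)         # lw t0, 40(t1)
sw = encode_s(-4, T0, T1, 0b010, 0b0100011)         # sw t0, -4(t1)
print(hex(lw), imm_gen(lw, "I"), imm_gen(sw, "S"))
```

The only format-dependent part is the choice of `lower`, which in hardware corresponds to a 5-bit multiplexer driven by the instruction type.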

branches

pipelining

summary

quiz

for D

for E