EEE · Revision Notes

Microprocessor and Interfacing

8086 Architecture · Assembly Programming · Peripheral Interfacing

By Dr. Mithun Mondal · Department of EEE, BITS Pilani – Hyderabad Campus · Academic Year 2025–26

Section 1

Number Systems and Data Representation

The 8086 sees the world as bits. Four representations every assembly programmer must master:

Base | Digits | Notation
Binary | 0, 1 | 1011B or 0b1011
Octal | 0–7 | 17O or 017
Decimal | 0–9 | 11D (default)
Hexadecimal | 0–9, A–F | 0BH or 0x0B
⭐ Why Hex Dominates

One hex digit = exactly 4 bits (\(2^4 = 16\)). A byte = 2 hex digits; a word = 4. Easy to read and convert to/from binary mentally.

Signed Numbers: Two’s Complement

Definition

Invert all bits (one’s complement) and add 1. MSB carries the sign: 0 = positive, 1 = negative.
8-bit range: −128 to +127  |  16-bit range: −32768 to +32767

Worked Example: Encode −5 in 8 bits

+5 = 0000 0101₂
Invert: 1111 1010₂
Add 1: 1111 1011₂ = FBH
Check: 5+(−5)=00H, carry discarded.
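The same encoding can be reproduced on the 8086 itself. A minimal sketch, assuming AL and BL are free scratch registers; NEG performs the invert-and-add-one in a single instruction:

```asm
        MOV     AL, 05H       ; AL = 0000 0101B (+5)
        NOT     AL            ; AL = 1111 1010B = FAH (one's complement)
        ADD     AL, 1         ; AL = 1111 1011B = FBH (-5)

        MOV     BL, 05H       ; one-instruction equivalent:
        NEG     BL            ; BL = FBH
        ADD     BL, 05H       ; check: BL = 00H, CF = 1 (carry out is discarded)
```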

⭐ Why Two’s Complement?

The same adder hardware works for signed and unsigned addition. Subtraction becomes A−B = A+B̅+1.

BCD and ASCII

BCD – Binary Coded Decimal

Each decimal digit (0–9) stored in 4 bits.
Packed BCD: two digits/byte (59 → 59H).
Unpacked BCD: one digit/byte (5 → 05H).
8086 adjusts via DAA/DAS, AAA/AAS/AAM/AAD.

ASCII – American Standard Code

7-bit code; digits ‘0’–‘9’ at 30H–39H.

ASCII digit → BCD: AND AL,0FH
BCD digit → ASCII: OR  AL,30H

Quick Conversion

Character ‘7’ in AL = 37H. Subtract 30H (or AND 0FH) → numeric value 7.
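Combining the two masks, a pair of ASCII digits can be packed into one BCD byte. A minimal sketch, assuming the tens digit arrives in AH and the units digit in AL (a register convention chosen only for this illustration):

```asm
; Input:  AH = '5' (35H), AL = '9' (39H)
; Output: AL = 59H (packed BCD); AH and CL are modified
        AND     AX, 0F0FH     ; strip the 3xH zone: AH = 05H, AL = 09H
        MOV     CL, 4
        SHL     AH, CL        ; AH = 50H (tens digit moved to the upper nibble)
        OR      AL, AH        ; AL = 59H
```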

Section 2

Introduction to Microprocessors

Definition: Microprocessor

A microprocessor (μP) is a programmable, clock-driven, register-based electronic device that reads binary instructions from memory, accepts binary data as input, processes the data, and provides results as output.

Four Pillars of Any Microprocessor Architecture

  • Word length – size of the natural operand (4, 8, 16, 32, 64-bit)
  • Instruction set – the vocabulary of operations it understands
  • Addressing modes – ways an operand can be specified
  • Register set – the on-chip scratchpad

Everything connects to the outside world via three buses: Address, Data, and Control.

Three-bus microprocessor architecture: CPU connected to memory and I/O via address bus, data bus, and control bus
Three-bus microprocessor architecture. The unidirectional address bus carries location information; the bidirectional data bus transfers operands; the control bus carries timing signals (RD̅, WR̅, ALE, INTR, RESET…).

Microprocessor vs. Microcontroller vs. Microcomputer

Aspect | Microprocessor | Microcontroller | Microcomputer
Chip contents | CPU only | CPU + RAM + ROM + I/O + Timers + ADC | CPU + memory + peripherals on board
Typical use | General-purpose computing | Dedicated/embedded control | Personal computing
Memory | External | On-chip | External (modular)
Power | Higher | Very low | Highest
Cost | Moderate | Very low | High
Examples | 8085, 8086, Pentium | 8051, AVR, PIC, ARM Cortex-M | IBM-PC, Raspberry Pi
⭐ Rule of Thumb

If the silicon alone can blink an LED with no external memory or I/O, it’s a microcontroller. If it needs a motherboard full of chips, it’s a microprocessor.

Von Neumann vs. Harvard Architecture

Von Neumann (Princeton)

Single memory for code and data; one set of buses; simpler, cheaper. Cannot fetch instruction and data simultaneously. Used in 8085, 8086, x86 PCs.

Harvard

Separate memories and buses for code and data; enables parallel access; faster but more complex. Used in 8051, AVR, ARM Cortex-M (modified Harvard).

⭐ Modern Reality

Most modern CPUs are modified Harvard at the cache level (separate I-cache and D-cache) while remaining Von Neumann at main memory.

Evolution of Intel Microprocessors

Processor | Year | Data Bus | Addr. Bus (bits) | Clock | Transistors | Highlights
4004 | 1971 | 4-bit | 12 | 740 kHz | 2,300 | First μP, calculators
8008 | 1972 | 8-bit | 14 | 800 kHz | 3,500 | 8-bit successor
8080 | 1974 | 8-bit | 16 | 2 MHz | 6,000 | First general-purpose 8-bit
8085 | 1976 | 8-bit | 16 | 3 MHz | 6,500 | Single +5 V supply
8086 | 1978 | 16-bit | 20 | 5–10 MHz | 29,000 | First 16-bit, 1 MB memory
8088 | 1979 | 8-bit ext. | 20 | 5 MHz | 29,000 | Original IBM-PC
80286 | 1982 | 16-bit | 24 | 6–25 MHz | 134,000 | Protected mode, 16 MB
80386 | 1985 | 32-bit | 32 | 16–33 MHz | 275,000 | First 32-bit, paging
80486 | 1989 | 32-bit | 32 | 25–100 MHz | 1.2 M | Integrated FPU + cache
Pentium | 1993 | 64-bit | 32 | 60–300 MHz | 3.1 M | Superscalar
Core i-series | 2008+ | 64-bit | 36/40 | GHz range | >10⁹ | Multi-core, SIMD, AVX
Memory hierarchy pyramid from fast registers at top to slow HDD at bottom
The memory hierarchy. Each level down increases capacity and latency while reducing cost per bit. The locality principle means most programs spend most of their time in a small region of code and data, enabling caches to bridge the speed gap.
Section 3

The 8085 Microprocessor (Foundations)

The Intel 8085 (1976) is the conceptual ancestor of every later x86 chip and the starting point for understanding the 8086.

Specifications
  • 8-bit data bus, 16-bit address bus ⇒ 64 KB memory space
  • Multiplexed AD0–AD7 (lower address + data), demultiplexed by ALE
  • Clock: 3–5 MHz; single +5 V supply
  • 74 instruction types encoded as 246 opcodes; 8-bit accumulator-based
  • 5 hardware interrupts: TRAP, RST 7.5, 6.5, 5.5, INTR
⭐ Why Learn 8085 First?

It is small enough to fit on one whiteboard but contains every fundamental idea — registers, ALU, instruction fetch–decode–execute, interrupts, multiplexed buses — that will be seen in the 8086.

8085 Register Set

Register | Function
A (Accumulator) | 8-bit result register for the ALU
B, C, D, E, H, L | General-purpose 8-bit; paired as BC, DE, HL (16-bit)
SP (Stack Pointer) | 16-bit, points to top of stack
PC (Program Counter) | 16-bit, address of the next instruction
Flag Register | S, Z, AC, P, CY (5 flags)
⭐ HL Pair as Memory Pointer

The M operand in instructions like MOV A,M refers to memory pointed to by HL — the 8085’s primary indirect-addressing mechanism.

Section 4

8086 Microprocessor: Architecture

Salient Features

  • 16-bit data bus, 20-bit address bus; addressable memory: \(2^{20}\) bytes = 1 MB
  • Clock: 5, 8, 10 MHz versions; 40-pin DIP, HMOS, +5 V, ~29,000 transistors
  • Pipelined architecture (BIU + EU), 6-byte instruction queue
  • Two operating modes: Minimum and Maximum
  • Memory segmentation (4 segments × 64 KB)
  • Supports multiprocessor configurations
Why Two Internal Units?

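The 8086 is internally pipelined: while the Execution Unit (EU) executes one instruction, the Bus Interface Unit (BIU) prefetches the following bytes into the queue. This overlap hides much of the instruction-fetch time behind execution, a clear throughput gain over the largely sequential fetch-then-execute cycle of the 8085.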

⭐ Backward Compatibility

The 8086 is not binary-compatible with the 8080/8085, but its registers and instructions were chosen so that 8080 assembly source can be translated almost mechanically; old code could be ported with minor effort.

Internal Architecture: BIU and EU

8086 internal block diagram: Bus Interface Unit (left) and Execution Unit (right) connected by internal 16-bit data bus
The 8086 internal architecture. The BIU holds segment registers, the instruction pointer (IP), address-generation logic, bus control logic, and the 6-byte instruction queue. The EU holds general-purpose registers (AX, BX, CX, DX), pointer/index registers, the 16-bit ALU, the flag register, and EU control logic. Both operate simultaneously over the internal 16-bit data bus.

General-Purpose Registers

All four are 16-bit and split into two 8-bit halves:

Register | Halves | Primary Purpose
AX (Accumulator) | AH, AL | Arithmetic, I/O, MUL/DIV implicit operand
BX (Base) | BH, BL | Base address for memory addressing (with DS)
CX (Counter) | CH, CL | Loop counter (LOOP), shift count, REP
DX (Data) | DH, DL | I/O port address; high word in MUL/DIV
⭐ Implicit Operands

Many instructions require a specific register: MUL src multiplies AL or AX with src; LOOP decrements CX; IN/OUT uses DX for ports > 255.

Pointer, Index and Segment Registers

Pointers (16-bit)
  • SP – Stack Pointer (offset in SS)
  • BP – Base Pointer (default SS; stack frames)
  • IP – Instruction Pointer (offset in CS)
Index Registers (16-bit)
  • SI – Source Index (DS by default)
  • DI – Destination Index (DS by default; ES in string instructions)
Segment Registers (16-bit)
  • CS – Code Segment
  • DS – Data Segment
  • SS – Stack Segment
  • ES – Extra Segment
Default Pairings — Memorise!

CS:IP  |  SS:SP  |  SS:BP  |  DS:BX/SI/DI  |  ES:DI (strings)

Flag Register (16-bit, 9 Flags Used)

Bit:  15–12 | 11 | 10 | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0
Flag: - | OF | DF | IF | TF | SF | ZF | - | AF | - | PF | - | CF
(- = unused bit)

Status Flags (set by ALU result):

  • CF – Carry from MSB (unsigned overflow)
  • PF – Parity (even = 1)
  • AF – Auxiliary carry (bit 3→4, for BCD)
  • ZF – Zero result
  • SF – Sign (MSB of result)
  • OF – Signed overflow

Control Flags (set by programmer):

  • TF – Trap (single-step debug)
  • IF – Interrupt enable (STI/CLI)
  • DF – Direction (string ops, STD/CLD)

Memory Segmentation

The 8086 has 16-bit registers but a 20-bit address bus. Memory is divided into segments of 64 KB each, addressed by a 16-bit segment register + 16-bit offset.

⭐ Physical Address Computation
\[\text{Physical Address} = (\text{Segment Register}) \times 16 + \text{Offset}\]

Equivalently: shift the segment left by 4 bits (one hex digit), then add the offset.

Example: CS=2000H, IP=0050H
\[2000\text{H}\times 10\text{H} + 0050\text{H} = 20000\text{H}+0050\text{H} = \mathbf{20050\text{H}}\]
Advantages of Segmentation
  • Programs are relocatable (just change the segment register)
  • Logical separation of code, data, stack, extra segments
  • Allows multiple programs to coexist without conflict
  • Memory > 64 KB usable without widening internal registers
Exam Tip

Given any SEG:OFFSET, compute the physical address in under 10 seconds. Practise with at least 20 random pairs before any examination.
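A few practice pairs, worked the same way (the register values are arbitrary, chosen only for illustration):

\[1234\text{H}:0022\text{H} \;\Rightarrow\; 12340\text{H} + 0022\text{H} = 12362\text{H}\]
\[5000\text{H}:\text{FFFEH} \;\Rightarrow\; 50000\text{H} + \text{FFFEH} = 5\text{FFFEH}\]
\[1000\text{H}:0010\text{H} \;\Rightarrow\; 10010\text{H}, \qquad 1001\text{H}:0000\text{H} \;\Rightarrow\; 10010\text{H}\]

The last line shows that different SEG:OFFSET pairs can alias the same physical byte, because segments start every 16 bytes and overlap.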

Stack in the 8086

Stack Basics
  • Reserved area in the stack segment (SS)
  • SP holds the offset of the top; grows downward (toward lower addresses)
  • Operates word-by-word: PUSH writes 2 bytes; POP reads 2 bytes

PUSH src: SP ← SP−2; [SP] ← src
POP  dst: dst ← [SP]; SP ← SP+2
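A short trace makes the pointer arithmetic concrete. This is a sketch under assumed starting values (SP = 0100H, AX = 1234H, BX = 5678H); it also shows the classic register swap through the stack:

```asm
; Assumed starting state: SP = 0100H, AX = 1234H, BX = 5678H
        PUSH    AX            ; SP = 00FEH; SS:[00FEH] = 34H, SS:[00FFH] = 12H
        PUSH    BX            ; SP = 00FCH; SS:[00FCH] = 78H, SS:[00FDH] = 56H
        POP     AX            ; AX = 5678H; SP = 00FEH
        POP     BX            ; BX = 1234H; SP = 0100H (registers swapped)
```

The low byte lands at the lower address because the 8086 is little-endian.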

Pin Diagram and Bus De-multiplexing

8086 40-pin DIP pin diagram showing multiplexed AD0-AD15, ALE, BHE, RD, WR, INTR, NMI, RESET
The Intel 8086 40-pin DIP package. Pins AD0–AD15 are multiplexed address/data lines. ALE (Address Latch Enable) strobes the address during T1. BH̅E̅ selects the high-byte bank. M/I̅O̅, RD̅, WR̅ form the primary control signals in minimum mode.

During T1 of every bus cycle the 8086 outputs the address with ALE high; an octal latch (74LS373) captures it on the falling edge of ALE, freeing AD0–AD15 for data in T2–T4.

Minimum vs. Maximum Mode

Minimum Mode

MN/M̅X̅ pin tied to +5 V. The 8086 generates all control signals directly: M/I̅O̅, RD̅, WR̅, ALE, DT/R̅, DE̅N̅, HOLD/HLDA. Used for single-CPU boards and small embedded systems.

Maximum Mode

MN/M̅X̅ pin tied to GND. Three encoded status lines S̅2, S̅1, S̅0 decoded by the external 8288 Bus Controller. Used for multiprocessor designs and when an 8087 NDP or 8089 I/O processor is present.

8288 Bus Controller status decoding (Maximum Mode)
S̅2 | S̅1 | S̅0 | Bus Cycle Generated
0 | 0 | 0 | Interrupt acknowledge
0 | 0 | 1 | Read I/O port
0 | 1 | 0 | Write I/O port
0 | 1 | 1 | Halt
1 | 0 | 0 | Code (instruction) fetch
1 | 0 | 1 | Read memory
1 | 1 | 0 | Write memory
1 | 1 | 1 | Passive (no bus cycle)

Bus Cycle T-States

  • T1: address output + ALE high
  • T2: address tri-stated, RD̅ or WR̅ asserted
  • T3: data on bus
  • T4: data latched, control signals deasserted

If READY = 0 during T3, wait states Tw are inserted until the memory or peripheral is ready.
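As a rough timing estimate, with \(n_w\) (a symbol introduced here) denoting the number of inserted wait states:

\[T_{\text{bus}} = (4 + n_w)\,T_{\text{clk}}\]
\[5\ \text{MHz},\ n_w = 0:\ T_{\text{bus}} = 4 \times 200\,\text{ns} = 800\,\text{ns}; \qquad n_w = 1:\ T_{\text{bus}} = 5 \times 200\,\text{ns} = 1000\,\text{ns}\]

With no wait states and aligned word transfers, the peak bus rate is therefore about 2 bytes per 800 ns, i.e. 2.5 MB/s.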

Section 5

Addressing Modes of 8086

The 8086 supports seven basic addressing modes for memory/register references, plus I/O and control-transfer modes.

# | Mode | Operand Form | Example
1 | Immediate | Constant in instruction | MOV AL, 25H
2 | Register | Operand in register | MOV AX, BX
3 | Direct | 16-bit address in instruction | MOV AX, [1234H]
4 | Register Indirect | Address in BX/SI/DI | MOV AX, [BX]
5 | Based | BX or BP + displacement | MOV AX, [BX+4]
6 | Indexed | SI or DI + displacement | MOV AX, [SI+2]
7 | Based-Indexed | Base + Index (+ displacement) | MOV AX, [BX+SI+6]

I/O modes: Direct (IN AL, 60H) and Indirect (IN AL, DX).
Control transfer: Intra-segment (NEAR) and Inter-segment (FAR), each direct or indirect.

Effective Address (EA) Computation

\[\text{EA} = \underbrace{\{\text{BX or BP}\}}_{\text{base}} + \underbrace{\{\text{SI or DI}\}}_{\text{index}} + \underbrace{\text{displacement}}_{\text{8 or 16-bit}}\] \[\text{Physical Address} = \text{Segment Register}\times 16 + \text{EA}\]
⭐ Default Segment Selection

If BP is used ⇒ SS is the default segment; otherwise DS is the default. Override with a segment prefix, e.g., MOV AX, ES:[BX].

Worked Example

DS=1000H, BX=0200H, SI=0050H; instruction MOV AX, [BX+SI+6]:
EA = 0200H + 0050H + 6 = 0256H;  PA = 1000H × 10H + 0256H = 10256H.
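The memory modes can be compared side by side on one small table. A minimal sketch, assuming a hypothetical word array TABLE located in the segment addressed by DS:

```asm
; Hypothetical data, assumed to live in the segment addressed by DS
TABLE   DW      10, 20, 30, 40

        MOV     BX, OFFSET TABLE  ; base register -> start of the table
        MOV     SI, 4             ; index = byte offset of the third word
        MOV     AX, [BX]          ; register indirect:            AX = 10
        MOV     AX, [BX+2]        ; based:                        AX = 20
        MOV     AX, [BX+SI]       ; based-indexed:                AX = 30
        MOV     AX, [BX+SI+2]     ; based-indexed + displacement: AX = 40
        MOV     AX, TABLE[SI]     ; indexed (displacement = OFFSET TABLE): AX = 30
```

Offsets are in bytes, so consecutive words are 2 apart.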

Section 6

8086 Instruction Set

Data Transfer Instructions

None of these affect flags (except SAHF/POPF, which restore them).

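Instruction | Operation | Example
MOV dst, src | dst ← src | MOV AX, BX
XCHG dst, src | swap dst ↔ src | XCHG AX, BX
PUSH src | SP ← SP−2; [SP] ← src | PUSH AX
POP dst | dst ← [SP]; SP ← SP+2 | POP BX
LEA dst, src | dst ← offset of src | LEA SI, MSG
LDS reg, src | reg ← [src], DS ← [src+2] | LDS SI, FAR_PTR
LES reg, src | similar but loads ES | LES DI, FAR_PTR
IN AL, port | AL ← port | IN AL, 60H
OUT port, AL | port ← AL | OUT 80H, AL
XLAT | AL ← [BX+AL] | XLAT
LAHF / SAHF | AH ↔ low byte of FLAGS | LAHF
PUSHF / POPF | flags ↔ stack | PUSHF
(FAR_PTR is a placeholder name for a 4-byte pointer variable; PTR itself is a reserved operator in MASM and cannot be used as a symbol.)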

Arithmetic Instructions

Instruction | Operation | Flags
ADD dst, src | dst ← dst+src | all status
ADC dst, src | dst ← dst+src+CF | all status
SUB dst, src | dst ← dst−src | all status
SBB dst, src | dst ← dst−src−CF | all status
INC / DEC dst | ±1 | all except CF
NEG dst | dst ← −dst (two’s comp.) | all status
CMP dst, src | dst−src (discarded) | all status
MUL src | AX ← AL×src (8-bit); DX:AX ← AX×src (16-bit); unsigned | CF, OF
IMUL src | signed multiply | CF, OF
DIV src | AL=AX/src, AH=AX mod src (or DX:AX/src) | undefined
IDIV src | signed divide | undefined
DAA / DAS | BCD adjust after add/subtract | status
CBW / CWD | sign-extend AL→AX, AX→DX:AX | none
⭐ MUL/DIV Implicit Operands

There is no MUL AX,BX. Always MUL src: AL/AX is the implicit other operand; result goes to AX or DX:AX.
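A short sketch of the implicit operands in use; the operand values are arbitrary illustrations. Note that DX must be cleared before an unsigned 16-bit divide, since DIV treats DX:AX as the 32-bit dividend:

```asm
        MOV     AX, 1234H
        MOV     BX, 0100H
        MUL     BX            ; DX:AX = 0012H:3400H (1234H x 100H = 123400H)

        MOV     AX, 1000      ; dividend, low word
        XOR     DX, DX        ; dividend, high word = 0
        MOV     BX, 7
        DIV     BX            ; AX = 142 (quotient), DX = 6 (remainder)
```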

Logical, Shift and Rotate

Instruction | Operation | Notes
AND, OR, XOR | bit-wise | CF=OF=0; SF, ZF, PF set
NOT | one’s complement | no flags
TEST dst, src | dst AND src (discarded) | sets flags only
SHL / SAL dst, n | shift left (n=CL if count>1) | CF = last bit shifted out
SHR dst, n | logical right shift | MSB filled with 0
SAR dst, n | arithmetic right shift | MSB retained (sign)
ROL / ROR | rotate without carry | wraps around
RCL / RCR | rotate through carry | CF is part of the chain
⭐ Multiply/Divide by Powers of 2

SHL AX,1 ≡ AX×2; SAR AX,1 ≡ signed÷2. Much faster than MUL/DIV.
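Multiplication by a constant that is not a power of 2 can be built from shifts and adds. A minimal sketch for ×10, using 10x = 8x + 2x and assuming BX is free and the result fits in 16 bits:

```asm
; AX = AX * 10, valid while AX <= 6553 (result <= 65530)
        SHL     AX, 1         ; AX = 2x
        MOV     BX, AX        ; BX = 2x (saved)
        SHL     AX, 1         ; AX = 4x
        SHL     AX, 1         ; AX = 8x
        ADD     AX, BX        ; AX = 8x + 2x = 10x
```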

Control Transfer Instructions

Instruction | Condition / Action
Unconditional
JMP target | unconditional jump (short/near/far)
CALL target | push return address, jump
RET / RETF | return (near / far)
Conditional (after CMP or arithmetic)
JE/JZ, JNE/JNZ | ZF=1, ZF=0
JC/JB/JNAE, JNC/JAE/JNB | CF=1, CF=0
JA/JNBE, JBE/JNA | unsigned > / unsigned ≤
JG/JNLE, JL/JNGE | signed > / signed <
JGE/JNL, JLE/JNG | signed ≥ / signed ≤
JS, JNS | SF=1, SF=0
JO, JNO | OF=1, OF=0
JCXZ | CX = 0
Loop
LOOP target | CX ← CX−1; if CX≠0, jump
LOOPE / LOOPZ | CX ← CX−1; if CX≠0 and ZF=1, jump
LOOPNE / LOOPNZ | CX ← CX−1; if CX≠0 and ZF=0, jump

Signed vs. Unsigned Conditional Jumps

Easy to Confuse!

After CMP A,B, choose the jump that matches the intended interpretation.

Relation | Unsigned (Above/Below) | Signed (Greater/Less)
A > B | JA / JNBE | JG / JNLE
A ≥ B | JAE / JNB | JGE / JNL
A < B | JB / JNAE | JL / JNGE
A ≤ B | JBE / JNA | JLE / JNG
A = B | JE / JZ | JE / JZ
⭐ Mnemonic

Use “A/B” (above/below) for unsigned, “G/L” (greater/less) for signed.
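The difference shows up as soon as the MSB is set. In this sketch (labels are hypothetical), the same CMP leads to opposite decisions under the two interpretations of 0FFH:

```asm
        MOV     AL, 0FFH      ; 255 if unsigned, -1 if signed
        CMP     AL, 01H       ; 0FFH - 01H = 0FEH: CF=0, ZF=0, SF=1, OF=0
        JG      SIGNED_GT     ; not taken: SF != OF, so (-1 > 1) is false
        JA      UNSIGNED_GT   ; taken: CF=0 and ZF=0, so (255 > 1) is true
SIGNED_GT:
        NOP                   ; placeholder: reached only if the signed test had passed
UNSIGNED_GT:
        NOP                   ; placeholder: execution continues here
```

Conditional jumps do not modify flags, so both tests can follow a single CMP.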

String Instructions

SI (source, DS) and DI (destination, ES); DF=0 auto-increments, DF=1 auto-decrements.

Instruction | Operation | Prefix
MOVSB / MOVSW | ES:[DI] ← DS:[SI], update SI, DI | REP
CMPSB / CMPSW | compare DS:[SI] with ES:[DI] | REPE / REPNE
SCASB / SCASW | compare AL/AX with ES:[DI] | REPE / REPNE
LODSB / LODSW | AL/AX ← DS:[SI] | none
STOSB / STOSW | ES:[DI] ← AL/AX | REP
Block Copy Idiom (100 bytes)
CLD           ; direction = forward
MOV CX, 100
REP MOVSB     ; copy 100 bytes from DS:SI to ES:DI

Processor Control Instructions

Instruction | Function
CLC / STC / CMC | clear / set / complement CF
CLD / STD | clear / set DF
CLI / STI | clear / set IF
HLT | halt until interrupt or reset
NOP | no operation (XCHG AX,AX)
WAIT | wait until the TE̅S̅T̅ pin is active (low)
ESC opcode, src | coprocessor escape (8087)
LOCK | assert LOC̅K̅ for next instruction (bus arbitration)

Instruction Execution Time — T-States

T-State

One T-state = one clock period. T-states = decoding + ALU work + bus cycles (each bus cycle = 4 T-states minimum).

Instruction | Typical T-States | Notes
MOV reg, reg | 2 | no memory access
MOV reg, mem | 8 + EA | EA varies with addressing mode
ADD reg, reg | 3 | ALU work only
MUL r/m8 | 70–77 | multi-cycle iterative
DIV r/m16 | 144–162 | among the most expensive instructions
LOOP target | 17 / 5 | taken / not taken
INT n | 51 | stacks FLAGS, CS, IP; fetches vector
⭐ Execution Time
\[T_{\text{inst}} = N_T \times T_{\text{clk}}\]

At 5 MHz, Tclk=200 ns, so DIV r/m16 can take ~32 μs.
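The same formula sizes a software delay loop. As a sketch, take the loop MOV CX, N / NOP / LOOP, where LOOP jumps back to the NOP. The LOOP timing (17 taken, 5 not taken) comes from the table; the figures of 4 T-states for MOV reg, imm and 3 for NOP are assumed from the Intel 8086 timing tables:

\[N_T = 4 + 3N + 17(N-1) + 5 = 20N - 8\]
\[N = 1000:\quad N_T = 19\,992, \qquad T = 19\,992 \times 200\,\text{ns} \approx 4.0\,\text{ms at 5 MHz}\]

Queue refills and any CALL/RET wrapper add a little on top, so treat the result as a lower bound.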

Section 7

Assembly Language Programming

Assembler Directives (MASM/TASM)

Directive | Purpose
DB / DW / DD / DQ / DT | Define Byte / Word / DWord / QWord / 10-byte
EQU | Symbolic constant: N EQU 10
ORG | Set location counter: ORG 100H
SEGMENT … ENDS | Begin / end a segment
ASSUME | Inform assembler which segment register is which
PROC … ENDP | Define a procedure (NEAR/FAR)
END | End of source file (with entry-point label)
PUBLIC / EXTRN | Multi-file linkage
OFFSET / SEG | Return offset / segment of a symbol
PTR | Override default size (BYTE PTR [SI])
DUP | Allocate repeated data: ARR DB 100 DUP(0)

A Complete 8086 Program: Sum of First N Natural Numbers

;----- Sum of first N natural numbers -----
.MODEL TINY
.CODE
ORG     100H
START:  MOV     CX, 10        ; N = 10
        XOR     AX, AX        ; AX = sum = 0
        MOV     BX, 1         ; BX = i = 1
NEXT:   ADD     AX, BX
        INC     BX
        LOOP    NEXT          ; CX-- ; jump if CX != 0
        ; AX now holds the sum (37H = 55 decimal)
        MOV     AH, 4CH       ; DOS terminate
        INT     21H
        END     START

Procedures

;----- Procedure: square AL, return in AX -----
SQUARE  PROC    NEAR
        PUSH    BX
        MOV     BL, AL
        MUL     BL            ; AX = AL * BL
        POP     BX
        RET
SQUARE  ENDP

        MOV     AL, 7
        CALL    SQUARE        ; AX = 49
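Parameters can also be passed on the stack and reached through BP. A minimal sketch with a hypothetical procedure ADD2; the offsets follow from the NEAR call pushing a 2-byte return address and the procedure pushing BP:

```asm
;----- Procedure: AX = sum of two words passed on the stack -----
ADD2    PROC    NEAR
        PUSH    BP
        MOV     BP, SP        ; [BP] = old BP, [BP+2] = return IP
        MOV     AX, [BP+4]    ; last parameter pushed (7)
        ADD     AX, [BP+6]    ; first parameter pushed (5)
        POP     BP
        RET     4             ; return and discard 4 bytes of parameters
ADD2    ENDP

        MOV     AX, 5
        PUSH    AX            ; the 8086 has no PUSH immediate
        MOV     AX, 7
        PUSH    AX
        CALL    ADD2          ; AX = 12
```

[BP+n] defaults to the stack segment, which is why BP rather than BX is used for stack frames.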

Macros

;----- Macro to display a $-terminated string -----
PRINT   MACRO   MSG
        MOV     AH, 09H
        LEA     DX, MSG
        INT     21H
        ENDM

        PRINT   HELLO_MSG
        PRINT   GOODBYE_MSG

Procedure vs. Macro

Aspect | Procedure | Macro
Code size | Once in memory | Once per invocation
Execution speed | Slower (CALL/RET overhead) | Faster (inline)
Parameter passing | Registers / stack | Textual substitution
Recursion | Supported | Not supported
Use when | Logic is long, called often | Logic is short, called rarely
⭐ Rule of Thumb

Short 2–3-line patterns → macro. Anything longer or called many times → procedure.

Section 8

Sample Programs and Worked Examples

Example 1: Largest Number in an Array

;----- Find the largest of N bytes at DS:SI -----
;  Inputs:  SI -> array, CX = N (assume N >= 1)
;  Output:  AL = largest byte
        MOV     AL, [SI]      ; first element = current max
        DEC     CX
        JZ      DONE
NEXT:   INC     SI
        CMP     AL, [SI]
        JAE     SKIP          ; AL >= mem -> no change
        MOV     AL, [SI]      ; new maximum
SKIP:   LOOP    NEXT
DONE:   ; AL holds the result

Why JAE? Unsigned bytes. For signed data use JGE.

Example 2: Factorial (Iterative)

;----- Factorial of N (in CL, N <= 8), result in DX:AX -----
        MOV     AX, 1
        XOR     DX, DX
        XOR     CH, CH
        OR      CL, CL
        JZ      DONE          ; 0! = 1
LOOP1:  MOV     BX, CX
        MUL     BX            ; DX:AX = AX * BX
        LOOP    LOOP1
DONE:   ; result in DX:AX
⭐ Why DX:AX?

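MUL BX produces a 32-bit unsigned product in DX:AX even when both operands are 16-bit. Each iteration multiplies only AX, however, so the loop is correct only while the running product fits in 16 bits: N ≤ 8, since 8! = 40320 but 9! = 362880.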

Example 3: Block Move (REP MOVSB)

;----- Copy 256 bytes from DS:SRC to ES:DST -----
        MOV     AX, SEG SRC
        MOV     DS, AX
        MOV     SI, OFFSET SRC
        MOV     AX, SEG DST
        MOV     ES, AX
        MOV     DI, OFFSET DST
        MOV     CX, 256
        CLD
        REP     MOVSB

Example 4: Counting Even and Odd Numbers

;----- Count even/odd bytes; BL=#even, BH=#odd -----
;  Inputs:  SI -> array, CX = N (assume N >= 1)
        XOR     BX, BX
SCAN:   MOV     AL, [SI]
        TEST    AL, 01H
        JZ      EVEN1
        INC     BH
        JMP     CONT
EVEN1:  INC     BL
CONT:   INC     SI
        LOOP    SCAN

Idiom: TEST AL,01H inspects the LSB without modifying AL.

Example 5: BCD-to-ASCII Conversion

;----- Convert packed BCD in AL (e.g. 59H) to AH:AL ('5','9') -----
        MOV     AH, AL
        AND     AL, 0FH       ; lower nibble -> BCD digit
        OR      AL, 30H       ; -> ASCII
        MOV     CL, 4
        SHR     AH, CL
        OR      AH, 30H       ; upper nibble -> ASCII

Example 6: 32-bit Addition with ADC

;----- (DX:AX) = (DX:AX) + (CX:BX) -----
        ADD     AX, BX        ; low halves; CF set if overflow
        ADC     DX, CX        ; high halves + carry
⭐ ADC = ADD + CF

CF is the bridge between successive words of a multi-precision operation. Same pattern with SBB for subtraction.
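The subtraction counterpart, as a sketch with the same register assignment; the values in the trace are arbitrary:

```asm
;----- (DX:AX) = (DX:AX) - (CX:BX) -----
        SUB     AX, BX        ; low words; CF = 1 if a borrow is needed
        SBB     DX, CX        ; high words minus the borrow
; Trace for 0001:0000H - 0000:0001H:
;   SUB: 0000H - 0001H = FFFFH, CF = 1
;   SBB: 0001H - 0000H - 1 = 0000H   ->   DX:AX = 0000:FFFFH
```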

Example 7: String Length ($-terminated)

;----- Length of $-terminated string at ES:DI; output AX -----
        MOV     AL, '$'
        MOV     CX, 0FFFFH
        CLD
        REPNE   SCASB
        MOV     AX, 0FFFFH
        SUB     AX, CX
        DEC     AX

REPNE = repeat while not equal. Stops when AL = ES:[DI] (sentinel found).

Section 9

Interrupts of 8086

An interrupt suspends normal program execution and transfers control to an Interrupt Service Routine (ISR).

Sequence on an Interrupt
  1. CPU completes the current instruction.
  2. Pushes FLAGS, CS, IP onto the stack.
  3. Clears IF and TF.
  4. Reads the 4-byte vector from the IVT.
  5. Loads CS:IP from the vector ⇒ ISR begins.
  6. IRET pops IP, CS, FLAGS and resumes.
⭐ Interrupt vs. Subroutine

A subroutine CALL pushes only the return address (IP for a near call, CS:IP for a far call). An interrupt pushes FLAGS as well as CS:IP and clears IF/TF. Return is IRET, not RET.

Interrupt Vector Table (IVT)

  • Located at physical addresses 00000H–003FFH (first 1 KB).
  • 256 vectors, each 4 bytes: IP_low, IP_high, CS_low, CS_high.
  • For type n: IVT address = n × 4.
Type | Source | Description
0 | Internal | Divide-by-zero error
1 | Internal | Single-step (TF=1)
2 | External | NMI (non-maskable pin)
3 | Internal | Breakpoint (INT 3, 1-byte)
4 | Internal | Overflow (INTO, if OF=1)
5–31 | Reserved | Reserved by Intel
32–255 | User | Software / external (INT n)
Vector Address Quick Math

INT 21H ⇒ vector at 21H×4 = 84H. CPU reads 4 bytes starting at 0000:0084H (physical address 00084H).
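Installing a handler means writing its offset and segment into that 4-byte slot. A minimal sketch, assuming a user vector of type 60H and a FAR procedure named MY_ISR (both hypothetical):

```asm
; Vector slot for type 60H starts at 60H * 4 = 0180H
        CLI                               ; no interrupts while the vector is half-written
        XOR     AX, AX
        MOV     ES, AX                    ; ES = 0000H, base of the IVT
        MOV     WORD PTR ES:[60H*4], OFFSET MY_ISR   ; IP  -> 0000:0180H
        MOV     AX, SEG MY_ISR
        MOV     WORD PTR ES:[60H*4+2], AX            ; CS  -> 0000:0182H
        STI
```

Under DOS the same job is normally done through INT 21H services rather than by writing the table directly.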

Hardware vs. Software Interrupts

Hardware (External)
  • Asynchronous, triggered by external pins
  • NMI: non-maskable, type 2
  • INTR: maskable (IF=1 to enable); type number supplied during INTA cycles
  • Usually routed through 8259A PIC
Software (Internal)
  • Synchronous, executed by program
  • INT n — call ISR at vector n
  • INTO — type 4 if OF=1
  • Used for DOS/BIOS services (INT 21H, INT 10H…)
⭐ Priority Order

Internal exceptions > NMI > INTR > Single-step. Maskable interrupts can be temporarily disabled with CLI.

ISR Skeleton

;----- ISR template -----
ISR_X   PROC    FAR
        PUSH    AX
        PUSH    BX
        PUSH    DS
        ; ... actual work ...
        MOV     AL, 20H       ; EOI to 8259A
        OUT     20H, AL
        POP     DS
        POP     BX
        POP     AX
        IRET
ISR_X   ENDP

Three sacred rules: (1) preserve every register you change, (2) issue EOI to 8259A if hardware-driven, (3) end with IRET, not RET.

8259A Programmable Interrupt Controller

Expands the single INTR pin to 8 prioritized lines (IR0–IR7); cascadable to 64 inputs.

Key Features
  • 8 prioritized inputs; cascadable up to 64
  • Programmable priority modes: fixed, rotating, specific
  • Programmable interrupt vectors (no fixed mapping)
  • Edge- or level-triggered inputs; masking via IMR

Internal registers: IRR latches requests; IMR blocks selected ones; Priority Resolver picks the highest non-masked; ISR tracks which is being serviced.

Word | Purpose
ICW1 | Edge/level, single/cascade, need-ICW4 flag
ICW2 | Interrupt vector base address (T7–T3)
ICW3 | Master: which IR has slave; Slave: slave ID
ICW4 | 8086/8085 mode, AEOI, buffered, SFNM
OCW1 | Set/clear IMR bits (masking)
OCW2 | EOI commands, rotate priority
OCW3 | Poll, read IRR/ISR, special mask
⭐ End-Of-Interrupt (EOI)

Until the ISR writes an EOI, lower-priority interrupts remain blocked. Simplest EOI: MOV AL,20H; OUT 20H,AL.

⭐ Master–Slave Cascade

Up to 9 chips cascaded (1 master + 8 slaves) give 64 interrupt inputs. The original IBM-PC used one 8259A; the PC-AT added a second for 15 usable IRQs.
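A possible initialisation sequence for a single, non-cascaded 8259A. This is a sketch under assumptions: the chip is decoded at ports 20H (A0 = 0) and 21H (A0 = 1), inputs are edge-triggered, and the vector base is 08H so that IR0–IR7 map to types 08H–0FH. The bit patterns are as recalled from the 8259A datasheet:

```asm
        CLI
        MOV     AL, 13H       ; ICW1 = 0001 0011B: edge-triggered, single, ICW4 needed
        OUT     20H, AL
        MOV     AL, 08H       ; ICW2: vector base (T7-T3 = 00001)
        OUT     21H, AL
                              ; ICW3 is skipped in single mode
        MOV     AL, 01H       ; ICW4: 8086 mode, normal EOI, non-buffered
        OUT     21H, AL
        MOV     AL, 0FEH      ; OCW1: unmask IR0 only (0 = enabled, 1 = masked)
        OUT     21H, AL
        STI
```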

Section 10

Memory Interfacing

Type | Characteristic | Example
ROM | Mask-programmed, non-volatile | monitor / firmware
PROM | One-time user programmable | 27xx
EPROM | UV erasable | 2716, 2732, 2764
EEPROM | Electrically erasable | 28Cxx
Flash | Block-erasable EEPROM | 29Fxx
SRAM | Bistable latch, fast, volatile | 6116 (2K×8)
DRAM | Capacitor, needs refresh, volatile | 4116, 4164

Memory Banking in the 8086

  • Even bank (low byte D0–D7): selected by A0=0
  • Odd bank (high byte D8–D15): selected by BH̅E̅=0
BH̅E̅ | A0 | Operation
0 | 0 | Whole word (aligned)
0 | 1 | Upper byte only (odd address)
1 | 0 | Lower byte only (even address)
1 | 1 | None
Misaligned Access Penalty

Reading a word at an odd address takes two bus cycles instead of one — always align 16-bit data on even addresses!

Address Decoding

Absolute (Full) Decoding

All unused address lines participate in CS̅ generation. No overlap; uses a 3-to-8 decoder (74LS138).

Linear (Partial) Decoding

A single high-order address line used as CS̅. Cheaper but causes address foldback.
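A worked case, under an assumed configuration: two 2764 EPROMs (8 K × 8 each) form the even and odd banks at the top of the 1 MB space. Together they hold 16 KB = \(2^{14}\) bytes, so A13–A1 address a location inside each chip, A0 and BH̅E̅ pick the bank, and A19–A14 remain for decoding.

\[\text{Full decoding: } A_{19}\ldots A_{14} = 111111_2 \;\Rightarrow\; \text{FC000H} \le \text{address} \le \text{FFFFFH}\]
\[\text{Partial decoding } (A_{19} = 1 \text{ only}): \quad \frac{512\ \text{KB}}{16\ \text{KB}} = 32 \text{ images of the same 16 KB block}\]

In the partial case the EPROM pair answers anywhere in 80000H–FFFFFH, which is the foldback described above.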

Memory-Mapped vs. I/O-Mapped I/O

Aspect | Memory-Mapped I/O | I/O-Mapped I/O
Address space | Part of 1 MB | Separate 64 K I/O ports
Instructions | Any memory instruction | Only IN / OUT
Control signal | M/I̅O̅ = 1 | M/I̅O̅ = 0
Memory loss | Yes (ports steal addresses) | No
Flexibility | Rich (any instruction on ports) | Limited (IN/OUT only)
Section 11

8255A Programmable Peripheral Interface

Pins and Ports
  • 40-pin DIP, +5 V
  • Three 8-bit ports: Port A, Port B, Port C
  • Port C split into upper (PC7–PC4) and lower (PC3–PC0)
  • A0, A1 select port; CS̅ from address decoding
  • One 8-bit Control Word Register (CWR)
A1 | A0 | Selected Register
0 | 0 | Port A
0 | 1 | Port B
1 | 0 | Port C
1 | 1 | CWR

Operating Modes

Mode 0 – Simple I/O

Each port independently input or output. No handshaking.

Mode 1 – Strobed I/O

Port A and B with handshaking (STB, IBF, INTR for input; OB̅F̅, AC̅K̅, INTR for output).

Mode 2 – Bidirectional

Port A only; full-duplex with 5 handshake lines from Port C.

⭐ BSR Mode

Bit Set/Reset mode lets you set/clear individual Port-C bits without affecting A or B. Selected by D7 = 0 in the control word.
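A BSR sketch for one pin. Assumptions: the control register is at 86H (the address used in the application examples), and the BSR word has D7 = 0, D3–D1 = bit number, D0 = 1 to set or 0 to reset, as recalled from the 8255A datasheet:

```asm
        MOV     AL, 07H       ; 0000 0111B: bit select 011 = PC3, D0 = 1 -> set PC3
        OUT     86H, AL
        MOV     AL, 06H       ; 0000 0110B: same bit, D0 = 0 -> reset PC3
        OUT     86H, AL
```

Each write changes one bit only, so a pulse on PC3 needs no read-modify-write of Port C.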

Control Word (I/O Mode, D7=1)

Bit | Meaning
D7 | 1 = I/O mode select
D6, D5 | Group A mode: 00=Mode 0, 01=Mode 1, 1x=Mode 2
D4 | Port A: 1=input, 0=output
D3 | PC upper: 1=input, 0=output
D2 | Group B mode: 0=Mode 0, 1=Mode 1
D1 | Port B: 1=input, 0=output
D0 | PC lower: 1=input, 0=output
Control Word Example

PA=output, PB=input, PC upper=output, PC lower=input, all Mode 0:
CW = 1 00 0 0 0 1 1₂ = 83H
MOV AL, 83H
OUT CWR, AL

8255A Applications

App 1: LED Bar Display

; Port addresses: PA=80H, PB=82H, PC=84H, CWR=86H
START:  MOV  AL, 80H   ; all ports output, Mode 0
        OUT  86H, AL
        MOV  AL, 01H   ; first LED on
NEXT:   OUT  80H, AL
        CALL DELAY
        ROL  AL, 1
        JMP  NEXT

App 2: Stepper Motor (Full-Step Sequence)

Step | A | B | C | D | Hex
1 | 1 | 0 | 0 | 0 | 01H
2 | 0 | 1 | 0 | 0 | 02H
3 | 0 | 0 | 1 | 0 | 04H
4 | 0 | 0 | 0 | 1 | 08H
        MOV  AL, 80H
        OUT  86H, AL
        MOV  BL, 11H        ; initial phase, mirrored in both nibbles so the
                            ; low nibble cycles 1,2,4,8 under an 8-bit ROL
        MOV  CX, N
ROT:    MOV  AL, BL
        OUT  80H, AL
        CALL DELAY
        ROL  BL, 1          ; CW; use ROR for CCW
        LOOP ROT

App 3: Reading 8 Switches

        MOV   AL, 82H    ; PA=out, PB=in, Mode 0
        OUT   86H, AL
POLL:   IN    AL, 82H    ; read switches on Port B (WAIT is a reserved mnemonic)
        TEST  AL, 01H
        JNZ   POLL       ; still open (pull-up=1)
        MOV   AL, 0FFH
        OUT   80H, AL    ; all LEDs on
Section 12

8254 Programmable Interval Timer

Three independent 16-bit counters (Counter 0, 1, 2). Each counts in binary or BCD. Inputs: CLK, GATE; Output: OUT.

Six Modes per Counter
  1. Mode 0 – Interrupt on terminal count
  2. Mode 1 – Hardware retriggerable one-shot
  3. Mode 2 – Rate generator (divide-by-N)
  4. Mode 3 – Square-wave generator
  5. Mode 4 – Software-triggered strobe
  6. Mode 5 – Hardware-triggered strobe

Control Word

Bit | Meaning
SC1, SC0 | Counter select: 00=C0, 01=C1, 10=C2, 11=read-back
RW1, RW0 | 00=latch, 01=LSB, 10=MSB, 11=LSB then MSB
M2, M1, M0 | Mode (000 to 101)
BCD | 0=binary 16-bit, 1=4-decade BCD
Generate 1 kHz Square Wave from 1 MHz CLK on Counter 0

Divisor = 1 MHz / 1 kHz = 1000. Mode 3, binary, LSB+MSB, Counter 0:
CW = 0011 0110₂ = 36H

MOV AL, 36H   ; control word
OUT CWR, AL
MOV AX, 1000  ; divisor
OUT C0, AL    ; LSB
MOV AL, AH
OUT C0, AL    ; MSB
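To read the running count without disturbing it, the RW = 00 latch command can be issued first. A minimal sketch using the same symbolic ports CWR and C0, and assuming BX is free to hold the result:

```asm
; Read Counter 0 on the fly
        MOV     AL, 00H       ; SC=00, RW=00: counter-latch command for Counter 0
        OUT     CWR, AL
        IN      AL, C0        ; latched LSB
        MOV     BL, AL
        IN      AL, C0        ; latched MSB
        MOV     BH, AL        ; BX = count at the moment of the latch command
```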
Section 13

8251A USART (Serial Communication)

Asynchronous serial frame: start bit, 8 data bits, parity, stop bits on TxD line
Asynchronous serial frame. One start bit (logic 0) signals the beginning; 5–8 data bits (LSB first) follow; an optional parity bit provides error detection; one or two stop bits (logic 1) mark the end. The 8251A handles this framing in hardware.
8251A USART

A programmable serial communication chip designed to interface with the 8085/8086 family. Supports asynchronous (5–8 data bits, 1/1.5/2 stop bits, baud factor ×1/×16/×64) and synchronous modes; full-duplex, double-buffered; built-in parity, framing, and overrun error detection.

Important pins:

  • TxD, RxD – serial data lines
  • TxC, RxC – transmit/receive clocks
  • TxRDY, RxRDY – status signals to CPU
  • DTR, DSR, RTS, CTS – modem control
  • C/D̅ – control word / data select
⭐ Initialisation Order

After reset: Mode word → (optional sync chars) → Command word, then data transfer begins.

8 data, no parity, 1 stop, ×16 async

Mode word = 0100 1110₂ = 4EH
Command = 0011 0111₂ = 37H (TxEN, RxEN, ER, DTR, RTS)
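The initialisation order and the two bytes above can be sketched as follows. The port addresses (USART_DATA, USART_CTRL) and the label TX_WAIT are hypothetical, and the status bit used (D0 = TxRDY) is as recalled from the 8251A datasheet. At ×16, a 9600-baud link would need TxC = RxC = 9600 × 16 = 153.6 kHz.

```asm
USART_DATA  EQU 0F0H           ; hypothetical address, C/D = 0
USART_CTRL  EQU 0F2H           ; hypothetical address, C/D = 1

        MOV     AL, 4EH        ; mode word: async x16, 8 data bits, no parity, 1 stop
        OUT     USART_CTRL, AL
        MOV     AL, 37H        ; command word: TxEN, DTR, RxEN, ER, RTS
        OUT     USART_CTRL, AL
TX_WAIT: IN     AL, USART_CTRL ; read the status register
        TEST    AL, 01H        ; D0 = TxRDY
        JZ      TX_WAIT        ; loop until the transmitter can accept a byte
        MOV     AL, 'A'
        OUT     USART_DATA, AL ; send one character
```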

RS-232C Standard

RS-232C

EIA serial interface between DTE and DCE. Negative logic: Logic 1 (mark) = −3 to −15 V; Logic 0 (space) = +3 to +15 V. TTL↔RS-232 conversion via MAX232.

⭐ Why MAX232?

Microcontroller pins (5 V / 3.3 V TTL) cannot drive RS-232 levels directly. MAX232 generates ±10 V from a single 5 V supply using charge pumps.

Section 14

8237A DMA Controller

Direct Memory Access

A dedicated hardware engine transfers data directly between memory and an I/O device, bypassing the CPU after initial setup.

⭐ Bus Handshake

Peripheral → DMAC: DREQ.   DMAC → CPU: HOLD.
CPU → DMAC: HLDA (bus granted).   DMAC → peripheral: DACK.

Transfer Modes

  • Single transfer – one byte per DREQ; bus returned after each byte.
  • Block transfer – DMAC keeps bus until entire block done. Fastest; CPU stalled.
  • Demand transfer – continues while DREQ active; pauses otherwise.
  • Cascade – one 8237A acts as slave to another (PC/AT: two 8237As, 7 usable channels).
8237A Features
  • Four independent DMA channels
  • Each channel has 16-bit base address and 16-bit word count
  • Supports read, write, verify, and memory-to-memory transfers
  • Programmable priority (fixed or rotating); auto-initialise on terminal count
16-bit Address Limit

The 8237A natively addresses 64 KB. For 1 MB systems, an external page register supplies the upper bits, with the DMAC walking inside a 64 KB page.

Section 15

8279 Keyboard/Display Controller

8279

A programmable controller handling a matrix keyboard (up to 8×8) and a multiplexed 7-segment display (up to 16 digits).

⭐ Why a Dedicated Chip?

Keyboard scanning and display refreshing are repetitive housekeeping tasks. Off-loading them to the 8279 frees the CPU and removes flicker/missed-keystroke issues.

Three Sub-blocks

  • Keyboard section – scan, debounce, FIFO of detected keys
  • Display section – 16×8 display RAM + scan logic
  • Scan section – shared row/column scan lines

Keyboard modes:

  • Scanned keyboard: 2-key lockout or N-key rollover
  • Scanned sensor matrix
  • Strobed input (external strobe latches data)

Display modes:

  • 8 or 16 character display
  • Left-entry (typewriter style)
  • Right-entry (calculator style)
⭐ Interrupt-Driven Read

When a key is detected and debounced, 8279 asserts IRQ; the CPU reads the FIFO to get the (row, column) code.
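
A small Python sketch of how the CPU might unpack the byte read from the FIFO. It assumes the scanned-keyboard data format described in the 8279 datasheet (bit 7 = CNTL, bit 6 = SHIFT, bits 5–3 = scan line, bits 2–0 = return line); the function name and the row/column labels are illustrative choices, not part of the original notes.

```python
def decode_key(fifo_byte: int) -> dict:
    """Split an 8279 scanned-keyboard FIFO byte into its fields.

    Assumed layout: CNTL | SHIFT | SCAN(3 bits) | RETURN(3 bits).
    """
    if not 0 <= fifo_byte <= 0xFF:
        raise ValueError("FIFO entries are one byte wide")
    return {
        "cntl": (fifo_byte >> 7) & 1,
        "shift": (fifo_byte >> 6) & 1,
        "row": (fifo_byte >> 3) & 0x7,   # scan line that was active
        "column": fifo_byte & 0x7,       # return line that was pulled low
    }


if __name__ == "__main__":
    print(decode_key(0b11101010))
```

With 3 bits each for the scan and return lines, the code covers exactly the 8×8 = 64 key positions mentioned above.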

Section 16

ADC and DAC Interfacing

The Bridge to the Analog World

Real-world signals are analog. ADC converts analog to digital for the CPU; DAC converts digital back to analog for actuators/outputs.

Key ADC parameters:

  • Resolution (bits)
  • Conversion time
  • Reference voltage VREF
  • Input range (unipolar/bipolar)
  • Linearity, INL/DNL errors

ADC architectures:

  • Flash – fastest, costly
  • Successive Approximation (SAR) – balanced
  • Dual-slope integrating – accurate, slow
  • Sigma-Delta – audio/precision

ADC 0808/0809 – Classic 8-bit SAR

⭐ Resolution
\[\text{Step size} = \frac{V_{REF}}{2^n} = \frac{5\,\text{V}}{256} \approx 19.5\,\text{mV per LSB}\]
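
The same arithmetic as a quick Python check. This is a sketch under the ideal-converter assumption (no offset, gain, or linearity error); the function names are illustrative.

```python
def step_size(v_ref: float, bits: int) -> float:
    """Ideal LSB size in volts: V_REF / 2^n."""
    return v_ref / (2 ** bits)


def code_to_voltage(code: int, v_ref: float = 5.0, bits: int = 8) -> float:
    """Ideal input voltage at the bottom of the given output code."""
    if not 0 <= code < 2 ** bits:
        raise ValueError("code out of range for this resolution")
    return code * step_size(v_ref, bits)


if __name__ == "__main__":
    print(f"LSB = {step_size(5.0, 8) * 1000:.2f} mV")
    print(f"code 128 -> {code_to_voltage(128):.3f} V")
```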

Read Channel 3: Worked Example

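; Assumed port map: control port = 80H, data port = 81H, status port = 82H
        MOV   AL, 03H        ; select channel 3
        OUT   80H, AL
        OR    AL, 18H        ; ALE=1, SOC=1 (bits 3 and 4)
        OUT   80H, AL
        AND   AL, 0E7H       ; clear ALE and SOC to end the pulse
        OUT   80H, AL
WAIT_EOC:                    ; WAIT is an 8086 mnemonic, so it is not used as a label
        IN    AL, 82H        ; read status port
        TEST  AL, 01H        ; EOC on bit 0
        JZ    WAIT_EOC       ; loop until EOC = 1
        IN    AL, 81H        ; 8-bit digital sample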
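
To see which bytes the listing actually writes to the control port, the three OUT values can be reproduced in Python. This is a sketch that assumes the same bit assignment as the listing (channel in bits 0–2, ALE and SOC in bits 3 and 4); the helper name is hypothetical.

```python
ALE_SOC_MASK = 0x18  # bits 3 and 4, matching OR AL,18H in the listing


def control_sequence(channel: int) -> list[int]:
    """Return the three control-port bytes: select, pulse high, pulse low."""
    if not 0 <= channel <= 7:
        raise ValueError("ADC0808 has 8 input channels (0-7)")
    select = channel                      # MOV AL, channel
    pulse_high = select | ALE_SOC_MASK    # OR  AL, 18H
    pulse_low = pulse_high & 0xE7         # AND AL, 0E7H
    return [select, pulse_high, pulse_low]


if __name__ == "__main__":
    print([f"{b:02X}H" for b in control_sequence(3)])
```

The channel bits stay stable across all three writes, so the multiplexer address is valid while ALE latches it.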

DAC 0808 – 8-bit Current-Output DAC

Output equation: \(V_{out} = V_{REF} \cdot D / 2^n\) where D = input code (0–255).

⭐ R-2R Ladder

Many DACs use an R-2R resistor network: it needs only two resistor values regardless of resolution, which makes matching and fabrication easy. It is the workhorse of integrated DAC design.

Generate a Triangular Wave

Increment a counter 0→255, writing each value to the DAC port with OUT; then decrement 255→0; repeat. The op-amp converts the DAC's output current into a voltage, so the output is a fine staircase that approximates a triangle.
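
The code sequence sent to the DAC can be sketched in Python. This version, an illustrative assumption rather than the only option, emits each peak value once per period so every step is exactly one LSB.

```python
def triangle_codes(top: int = 255) -> list[int]:
    """One period of DAC codes: 0..top rising, then top-1..1 falling."""
    rising = list(range(0, top + 1))
    falling = list(range(top - 1, 0, -1))
    return rising + falling


if __name__ == "__main__":
    print(triangle_codes(3))        # small example that is easy to inspect
    print(len(triangle_codes()))    # codes per period for an 8-bit DAC
```

With `top = 255` one period contains 510 codes, so the output frequency is the port update rate divided by 510.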

Section 17

Advanced Intel Processors

⭐ One Thread of Progress

Each generation widened registers, added new modes, deepened pipelines, and integrated more on-chip (cache, FPU, MMU) — all while preserving backward compatibility with 8086 code.

80286 (1982) – Protected Mode is Born

  • 16-bit data bus, 24-bit address bus ⇒ 16 MB physical memory
  • Real mode: 8086-compatible (1 MB). Protected mode: descriptor tables, 4 privilege rings.
  • Hardware support for multitasking and memory protection; no paging yet.

80386 (1985) – The 32-bit Leap

  • 32-bit registers (EAX, EBX, …, EIP, EFLAGS); 32-bit address bus ⇒ 4 GB
  • Adds Virtual-8086 mode; built-in paging with 4 KB pages and two-level page tables
⭐ Paging in One Line

A 32-bit linear address is split [Dir | Table | Offset] = [10 | 10 | 12] bits, walking a two-level page-table tree — the foundation of every modern OS.
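
The [10 | 10 | 12] split is plain shifting and masking, as this Python sketch shows. The function name is illustrative, and the actual table walk (reading CR3, the directory entry, and the table entry) is deliberately left out.

```python
def split_linear(addr: int) -> tuple[int, int, int]:
    """Split a 32-bit linear address into (directory, table, offset) indices."""
    if not 0 <= addr <= 0xFFFFFFFF:
        raise ValueError("linear address must fit in 32 bits")
    directory = (addr >> 22) & 0x3FF   # top 10 bits
    table = (addr >> 12) & 0x3FF       # middle 10 bits
    offset = addr & 0xFFF              # low 12 bits, byte within the 4 KB page
    return directory, table, offset


if __name__ == "__main__":
    print(split_linear(0x00403004))
```

Each 10-bit index selects one of 1024 entries, and 1024 × 1024 × 4 KB gives the full 4 GB linear space.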

80486 (1989) – Integration and Pipelining

  • On-chip 8 KB L1 cache (write-through); integrated FPU; five-stage pipeline
  • DX2/DX4 variants: internal clock doubled/tripled vs. external bus

Pentium (1993) and Beyond

  • Superscalar: two pipelines (U and V) — up to 2 IPC
  • Split L1: 8 KB code + 8 KB data cache; 64-bit data bus; branch prediction (BTB)
  • AMD’s x86-64 (2003) extended to 64-bit; Intel adopted as Intel 64
⭐ Trend in One Phrase

More parallelism per clock + more cores per chip + more memory hierarchy on-die.

CISC vs. RISC — The Quiet Convergence

CISC

Complex Instruction Set Computer: many instructions, variable length, complex addressing modes. Intel x86 family.

RISC

Reduced Instruction Set Computer: few simple instructions, fixed length, load/store architecture. ARM, RISC-V, MIPS.

⭐ Inside Modern x86

The decoder splits each CISC instruction into smaller micro-ops, executed by a RISC-style out-of-order engine. Today’s x86 is effectively RISC-on-the-inside, CISC-on-the-outside.

Section 18

Bonus Module: 8051 Microcontroller

Microcontroller

A self-contained chip with CPU + RAM + ROM + I/O + Timers on one die. Designed for embedded control.

8051 Architecture at a Glance

  • 8-bit CPU; Harvard architecture
  • 4 KB internal ROM; 128 bytes internal RAM + 128 bytes SFR area
  • Four 8-bit I/O ports: P0, P1, P2, P3
  • Two 16-bit Timers/Counters: T0, T1
  • One full-duplex UART; 5 interrupt sources (2 external, 2 timer, 1 serial)
  • 64 KB external code memory + 64 KB external data memory

8051 Memory Organisation

Internal RAM (00H–7FH)
  • 00H–1FH: 4 banks of R0–R7
  • 20H–2FH: bit-addressable area (128 bits)
  • 30H–7FH: general-purpose RAM and stack
SFR Area (80H–FFH)
  • Accumulator (A), B register
  • Port latches P0–P3; Timer regs; SCON, SBUF; IE, IP
  • PSW, SP, DPTR (DPH:DPL), PCON
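
Two address calculations follow directly from this layout: which byte holds a given bit address in the 20H–2FH area, and where Rn of a given bank lives. The Python sketch below covers only the internal-RAM bit addresses 00H–7FH; the function names are illustrative.

```python
def bit_to_byte(bit_addr: int) -> tuple[int, int]:
    """Map a bit address (00H-7FH) to (byte address, bit position)."""
    if not 0 <= bit_addr <= 0x7F:
        raise ValueError("internal-RAM bit addresses run from 00H to 7FH")
    return 0x20 + (bit_addr >> 3), bit_addr & 0x7


def register_address(bank: int, n: int) -> int:
    """Internal RAM address of register Rn in the given bank (0-3)."""
    if not (0 <= bank <= 3 and 0 <= n <= 7):
        raise ValueError("bank must be 0-3 and register 0-7")
    return bank * 8 + n


if __name__ == "__main__":
    byte, bit = bit_to_byte(0x0A)
    print(f"bit 0AH -> byte {byte:02X}H, bit {bit}")
    print(f"bank 3, R7 -> {register_address(3, 7):02X}H")
```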

Selected Special Function Registers

SFR | Address | Purpose
ACC (A) | E0H | Accumulator; primary working register
B | F0H | Auxiliary; used in MUL/DIV
PSW | D0H | Program Status Word (CY, AC, F0, RS1, RS0, OV, –, P)
SP | 81H | Stack pointer (defaults to 07H after reset)
DPTR | 82H/83H | Data pointer for external memory
P0–P3 | 80H/90H/A0H/B0H | Port latches
TMOD | 89H | Timer mode control
TCON | 88H | Timer/interrupt control (bit-addressable)
SCON | 98H | Serial port control (bit-addressable)
SBUF | 99H | Serial data buffer
IE | A8H | Interrupt enable
IP | B8H | Interrupt priority

8051 “Hello, Blink!” in Assembly

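         ORG  0000H            ; reset vector
         SJMP MAIN

         ORG  0030H            ; start above the interrupt vectors
MAIN:    CLR  P1.0             ; LED pin low initially
LOOP:    CPL  P1.0             ; toggle LED
         ACALL DELAY
         SJMP  LOOP

; Delay timings assume a 12 MHz crystal (1 us per machine cycle)
DELAY:   MOV  R7, #200
D2:      MOV  R6, #250
D1:      DJNZ R6, D1           ; 250 x 2 cycles = ~500 us inner
         DJNZ R7, D2           ; 200 outer passes = ~100 ms total
         RET

         END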
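
The delay figures in the comments can be checked by counting machine cycles. The Python sketch below assumes a classic 12-clock 8051 core (one machine cycle = 12 oscillator periods) and the standard cycle counts MOV Rn,#data = 1, DJNZ = 2, RET = 2; the function names and the 12 MHz default are assumptions for illustration.

```python
def delay_cycles(outer: int = 200, inner: int = 250) -> int:
    """Machine cycles spent inside DELAY, from the first MOV through RET."""
    MOV, DJNZ, RET = 1, 2, 2
    per_outer_pass = MOV + inner * DJNZ + DJNZ   # reload R6, inner loop, DJNZ R7
    return MOV + outer * per_outer_pass + RET    # load R7, all passes, return


def delay_ms(crystal_hz: float = 12_000_000, outer: int = 200, inner: int = 250) -> float:
    """Delay in milliseconds for a 12-clock-per-machine-cycle core."""
    machine_cycle_s = 12 / crystal_hz
    return delay_cycles(outer, inner) * machine_cycle_s * 1000


if __name__ == "__main__":
    print(delay_cycles(), "machine cycles")
    print(f"{delay_ms():.3f} ms at 12 MHz")
```

At the common 11.0592 MHz crystal the same loop runs a little longer, since each machine cycle is about 1.085 µs instead of 1 µs.
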
⭐ 8051 Still Going Strong

Forty years on, the 8051 core lives inside SIM cards, USB controllers, RF transceivers, and countless industrial micros. Its tiny size, deterministic timing, and well-known toolchain keep it relevant.

Section 19

Practice Problems

Problem 1 – Physical Address

Given CS=1234H, IP=5678H, find the physical address of the instruction being fetched.

Solution: shift CS left by 4 bits (×16) to get 12340H, then add IP: \(\text{PA}=\texttt{12340H}+\texttt{5678H}=\mathbf{\texttt{179B8H}}\) (96,696 bytes, ≈94.4 KB into the 1 MB address space)

⭐ Always 5 Hex Digits

A 20-bit physical address fits in exactly 5 hex digits (max FFFFFH = 1 MB−1).
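
The segment:offset rule is a one-liner in Python. This sketch also masks the result to 20 bits, which models the 8086 address wrap-around at 1 MB; the function name is illustrative.

```python
def physical_address(segment: int, offset: int) -> int:
    """8086 real-mode address: segment * 16 + offset, truncated to 20 bits."""
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset are 16-bit values")
    return ((segment << 4) + offset) & 0xFFFFF


if __name__ == "__main__":
    print(f"{physical_address(0x1234, 0x5678):05X}H")
```

Formatting with `:05X` keeps the answer at five hex digits, matching the rule above.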

Problem 2 – Addressing Mode

DS=2000H, BX=0100H, SI=0050H. Find EA and PA for MOV AX,[BX+SI+20H].

Mode: based-indexed with displacement.
EA = 0100H+0050H+0020H = 0170H
PA = 20000H+0170H = 20170H. Word: bytes 20170H (low) and 20171H (high).
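
A Python sketch of the same two steps. The effective address is kept to 16 bits because the 8086 computes offsets modulo 64 KB; the function names are illustrative.

```python
def effective_address(base: int, index: int, disp: int) -> int:
    """Offset for [base + index + disp], wrapped to 16 bits."""
    return (base + index + disp) & 0xFFFF


def word_bytes(segment: int, ea: int) -> tuple[int, int]:
    """Physical addresses of the low and high byte of a word operand."""
    low = ((segment << 4) + ea) & 0xFFFFF
    high = ((segment << 4) + ((ea + 1) & 0xFFFF)) & 0xFFFFF
    return low, high


if __name__ == "__main__":
    ea = effective_address(0x0100, 0x0050, 0x20)
    low, high = word_bytes(0x2000, ea)
    print(f"EA = {ea:04X}H, low byte at {low:05X}H, high byte at {high:05X}H")
```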

Problem 3 – Flag Settings

After MOV AL,7FH then ADD AL,01H, find CF, ZF, SF, OF, AF, PF.

Result: 7FH+01H = 80H (1000 0000).

  • CF=0 — no carry out of bit 7
  • ZF=0 — result non-zero
  • SF=1 — MSB of result is 1
  • OF=1 — signed overflow (positive + positive → negative)
  • AF=1 — carry from bit 3 to bit 4
  • PF=0 — odd number of 1-bits (just bit 7)
⭐ OF vs CF

CF signals unsigned overflow; OF signals signed overflow. They are set independently by the same arithmetic.
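
The six flags can be derived mechanically from the operands and the result. The Python sketch below models only 8-bit ADD; the function name and the dictionary layout are illustrative.

```python
def add8_flags(a: int, b: int) -> dict:
    """Result and status flags for an 8-bit ADD of a and b."""
    total = a + b
    result = total & 0xFF
    return {
        "result": result,
        "CF": int(total > 0xFF),                             # carry out of bit 7
        "ZF": int(result == 0),
        "SF": result >> 7,                                   # copy of the MSB
        "OF": int(((a ^ result) & (b ^ result) & 0x80) != 0),  # both operands differ in sign from result
        "AF": int((a & 0x0F) + (b & 0x0F) > 0x0F),           # carry from bit 3 to bit 4
        "PF": int(bin(result).count("1") % 2 == 0),          # 1 when the count of 1-bits is even
    }


if __name__ == "__main__":
    print(add8_flags(0x7F, 0x01))
    print(add8_flags(0xFF, 0x01))
```

Comparing 7FH+01H with FFH+01H shows the independence: the first sets OF but not CF, the second sets CF but not OF.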

Problem 4 – Memory Banking

The 8086 reads a word from physical address 2050H. Which bank(s) are accessed?

2050H is even ⇒ aligned word access. B̅H̅E̅=0, A0=0 ⇒ both banks selected in one bus cycle. Even bank supplies D0–D7; odd bank D8–D15.
If address were 2051H (odd): two bus cycles required (misalignment penalty).
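
The bank-select rules can be tabulated with a short Python sketch. Each bus cycle is represented as a pair (BHE, A0) using the active-low value of BHE; the function name is illustrative.

```python
def access_cycles(address: int, word: bool = True) -> list[tuple[int, int]]:
    """Return (BHE, A0) for each bus cycle; BHE is active low (0 = odd bank enabled)."""
    odd = address & 1
    if not word:
        # Byte access: exactly one bank is enabled.
        return [(0, 1)] if odd else [(1, 0)]
    if not odd:
        # Aligned word: both banks in a single cycle.
        return [(0, 0)]
    # Misaligned word: odd byte first (odd bank), then the next even byte (even bank).
    return [(0, 1), (1, 0)]


if __name__ == "__main__":
    print(access_cycles(0x2050))
    print(access_cycles(0x2051))
```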

Problem 5 – Stack Trace

SS=3000H, SP=0100H, AX=1234H, BX=ABCDH. After PUSH AX; PUSH BX; POP AX — state?

Step | SP | AX | BX | 300FEH/FF | 300FCH/FD
Start | 0100 | 1234 | ABCD | - | -
PUSH AX | 00FE | 1234 | ABCD | 34/12 | -
PUSH BX | 00FC | 1234 | ABCD | 34/12 | CD/AB
POP AX | 00FE | ABCD | ABCD | 34/12 | CD/AB
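
The trace can be replayed with a tiny stack model in Python. This is a sketch, not an emulator: memory is a dictionary keyed by physical address, and the class name is hypothetical.

```python
class Stack8086:
    """Minimal model of the 8086 stack: SP moves by 2, words are little-endian."""

    def __init__(self, ss: int, sp: int):
        self.ss, self.sp, self.mem = ss, sp, {}

    def _pa(self, offset: int) -> int:
        return ((self.ss << 4) + (offset & 0xFFFF)) & 0xFFFFF

    def push(self, value: int) -> None:
        self.sp = (self.sp - 2) & 0xFFFF               # decrement first
        self.mem[self._pa(self.sp)] = value & 0xFF      # low byte at SP
        self.mem[self._pa(self.sp + 1)] = value >> 8    # high byte at SP+1

    def pop(self) -> int:
        low = self.mem[self._pa(self.sp)]
        high = self.mem[self._pa(self.sp + 1)]
        self.sp = (self.sp + 2) & 0xFFFF                # increment afterwards
        return (high << 8) | low


if __name__ == "__main__":
    s = Stack8086(ss=0x3000, sp=0x0100)
    s.push(0x1234)           # PUSH AX
    s.push(0xABCD)           # PUSH BX
    ax = s.pop()             # POP AX
    print(f"SP={s.sp:04X}H AX={ax:04X}H")
```

POP does not erase memory, which is why the CD/AB bytes remain at 300FCH/300FDH in the last row of the table.
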
Problem 6 – T-States and Execution Time

An 8086 runs at 5 MHz. An instruction takes 17 T-states. Execution time?

\[T_{CLK}=\frac{1}{5\,\text{MHz}}=200\,\text{ns}\] \[T_{exec}=17\times 200\,\text{ns}=3.4\,\mu\text{s}\]
Loop Timing

A loop body of 50 T-states executed 1000 times takes 50×1000×200 ns = 10 ms.
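
Both timing results follow from one formula, time = T-states × iterations / clock frequency, sketched here in Python. The function name is illustrative, and the model ignores wait states and prefetch-queue effects.

```python
def exec_time_s(t_states: int, clock_hz: float, iterations: int = 1) -> float:
    """Execution time in seconds for a fixed T-state count."""
    return t_states * iterations / clock_hz


if __name__ == "__main__":
    print(f"{exec_time_s(17, 5e6) * 1e6:.1f} us")         # single instruction
    print(f"{exec_time_s(50, 5e6, 1000) * 1e3:.1f} ms")   # 1000-pass loop
```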

Summary

Five Ideas That Will Stay With You

⭐ Big Picture
  1. A microprocessor is just a very fast sequencer that fetches, decodes, and executes instructions on data living in memory.
  2. Segmentation, paging, privilege levels — successive layers of abstraction over the same physical memory.
  3. Interrupts and DMA are the two great mechanisms by which I/O escapes the CPU’s serial bottleneck.
  4. Almost every peripheral chip — 8255, 8254, 8259, 8251, 8237 — shares the same recipe: control register + data registers + status bits.
  5. The boundary between hardware and software is a useful fiction. Reading a datasheet is half the engineering.

Recommended Textbooks

  • Douglas V. Hall, Microprocessors and Interfacing — Programming and Hardware, 2nd ed., TMH. The classic 8086 reference.
  • A.K. Ray & K.M. Bhurchandi, Advanced Microprocessors and Peripherals, 3rd ed., McGraw Hill.
  • Barry B. Brey, The Intel Microprocessors: 8086 to Pentium 4, 8th ed., Pearson.
  • Y. Liu & G.A. Gibson, Microcomputer Systems: The 8086/8088 Family, 2nd ed., Prentice Hall.
  • M.A. Mazidi et al., The 8051 Microcontroller and Embedded Systems, 2nd ed., Pearson.
  • Ramesh S. Gaonkar, Microprocessor Architecture, Programming, and Applications with the 8085, 6th ed., Penram.

Where Next?

Embedded systems: ARM Cortex-M (STM32, NXP), AVR (Arduino), RISC-V (SiFive, ESP32-C).

Operating systems: processes, threads, scheduling, virtual memory and TLB.

Computer architecture: pipelining, hazards, branch prediction, cache coherence, out-of-order execution.

Hardware design: Verilog/VHDL, FPGAs, soft-core CPUs.

⭐ One Project per Topic

The fastest way to internalise this material is to build something: a stepper-motor driver, a UART terminal, a tiny RTOS scheduler. Theory sticks when soldered to practice.