- Overview:
DSP overview, DSP lecture (www.cs.mtu.edu/~soner)
BDTi processor/DSP info
DSP families
CPU Info Center (great microprocessors)
CPU history list
SPEC table (DiMarco)
Choosing a DSP
CPU und Chipset Guide, SIMD, Post-RISC Era
Dynamic Scheduling/Pipelining/ILP (German)
Signal Processing: Compression
Concepts of parallelism:
Real CPU/DSP implementations very often employ combinations of
the following concepts to process/move information/data in parallel
and keep units busy for maximum hardware utilization.
Static:
determined at compile-time by compiler (often advantageous for
systems with predictable behavior, e.g. fixed repetitive numerical
algorithm on DSPs or vector computers)
Dynamic:
determined at run-time by CPU (often advantageous for systems
with non-predictable behavior, e.g. general purpose processors
for multi-threaded/process applications)
Pipeline:
overlapping execution of subsequent instructions, each instruction
is in a different execution stage/phase within the pipeline, fixed
cycle-count instruction set advantageous to avoid pipeline stalls,
typically called "superpipeline" if stage number >7
Vector
instruction:
single instruction for identical operation on multiple data elements
in parallel (SIMD), multiple parallel units (per pipeline) in
CPU
Multi-function instruction:
single instruction for non-identical operations on multiple data
elements in parallel (not to be confused with VLIW, however sometimes
also referred to as VLIW), multiple independent units (per pipeline)
in CPU
Very long/variable length instruction word ("VLIW"):
grouping of multiple instructions to be executed by CPU in parallel
(MIMD), "statically" determined at compile-time (compare
with superscalar), multiple pipelines in CPU
Fixed number ("length"):
unused slots to be filled with NOPs by compiler
Variable number ("length"):
group of instructions encoded (e.g. via link-Bit) by compiler
Superscalar:
CPU executes multiple instructions in parallel, "dynamically"
determined by CPU at run-time (compare with VLIW), multiple pipelines
in CPU
Instruction-level parallelism ("ILP"):
CPU issues multiple instructions in one cycle ("multiple-issue"),
multiple pipelines in CPU
Static:
scheduling determined by compiler at compile-time ("VLIW")
Dynamic:
scheduling determined by CPU at run-time ("superscalar")
Thread-level parallelism ("TLP"):
execution of multiple threads by CPU in parallel ("simultaneous
multi-threading"="SMT")
Dynamic scheduling:
CPU reorders instruction sequence at run-time
Register renaming:
by CPU at run-time to avoid write-after-read and write-after-write
hazards with dynamic scheduling
Branch prediction:
speculative execution to avoid cycle overhead with branches/function
calls
Static:
at compile-time by compiler ("trace scheduling")
Dynamic:
at run-time by CPU
Out-of-order execution:
dynamic scheduling + register renaming + dynamic branch prediction
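The register renaming entry above can be illustrated with a minimal sketch. It assumes a simplified machine with an unlimited pool of physical registers and three-operand instructions; the function name `rename_registers` and the tuple encoding are hypothetical and chosen only for illustration. Real CPUs use a finite free list and recycle physical registers when instructions retire.

```python
def rename_registers(program, num_arch_regs):
    """Give every register write a fresh physical register.

    program: list of (dest, src1, src2) architectural register numbers.
    Returns a list of (pdest, psrc1, psrc2) physical register numbers.
    Assumption: the physical register pool is unlimited (no free list).
    """
    # Initially, architectural register i lives in physical register i.
    mapping = list(range(num_arch_regs))
    next_free = num_arch_regs
    renamed = []
    for dest, src1, src2 in program:
        # Sources use the newest mapping, so true data dependencies survive.
        psrc1, psrc2 = mapping[src1], mapping[src2]
        # The destination gets a new physical register, so a later write to
        # the same architectural register no longer conflicts with earlier
        # readers or writers of that name.
        mapping[dest] = next_free
        renamed.append((next_free, psrc1, psrc2))
        next_free += 1
    return renamed


program = [
    (1, 2, 3),  # r1 = r2 op r3
    (4, 1, 5),  # r4 = r1 op r5  (really needs the r1 produced above)
    (1, 6, 7),  # r1 = r6 op r7  (only reuses the name r1)
]
print(rename_registers(program, 8))
# → [(8, 2, 3), (9, 8, 5), (10, 6, 7)]
```

After renaming, the third instruction writes physical register 10 instead of overwriting the register that the second instruction still reads, so a dynamic scheduler would be free to execute it early.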
DSP vs. CPU:
By definition, a DSP (Digital Signal Processor) is a specialized
CPU, typically used for real-time signal processing in embedded
systems. Some features of general purpose CPUs, such as memory
management (virtual memory, paging) or protection mechanisms
(kernel/user modes or memory protection for operating system
support), are therefore left out of a DSP design. The main goals
are high numerical performance, low interrupt latencies (for
real-time), deterministic behavior (for real-time) and low power
consumption (for embedded). Typical features are high data transfer
rates through multiple data and address busses (generalized Harvard
architecture), specialized instruction sets for certain numerical
algorithms (like the bit-reversed and circular addressing modes for
FFT, MAC instructions and zero-overhead loop support for vector
operations) and on-chip directly addressable multiport SRAMs
(sometimes instead of caches for deterministic behavior). DSPs
are typically RISC designs (especially load-store architectures),
employing pipelines with only a few stages (e.g. 3, in conjunction
with on-chip interrupt and loop stacks) to achieve low latencies, or
pipelines with highly specialized stage sequences. The use of VLIW
and other static (and therefore deterministic) parallelizing concepts
(like SIMD and multifunction instructions) provides high data
throughput and chip utilization. DSPs very often support massive
multiprocessing "farms" (with interprocessor communication ports) and
have large multiported register files, 1-cycle MAC-ALUs, Bit
manipulation units, powerful address generators (with their own
register sets), DMA units and multiple I/O units/busses. The number
of supported data types is often very limited (e.g. only 32-Bit
integer and float). Software (i.e., instruction set binary)
compatibility is usually not important. Typical application fields
range from audio/video (e.g. in synthesizers, graphics accelerators,
MPEG/JPEG codecs) and telecommunication (e.g. in cell phone
transceiver base stations) to specialized coprocessor designs (e.g.
"number crunching") and, as the name DSP suggests, processing of
digital signals (e.g. in medical instruments, radar, scientific
experimental setups).
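The addressing modes and MAC instructions mentioned above can be mimicked in software. The following is a minimal sketch, not a model of any particular DSP: the names `bit_reverse` and `CircularFir` are hypothetical, and using the circular buffer as a FIR filter delay line is an assumption chosen for illustration. On a real DSP the modulo wrap, the bit reversal and the multiply-accumulate would each be done by hardware, typically in a single cycle.

```python
def bit_reverse(index, bits):
    """Reverse the lowest `bits` bits of index (FFT-style reordering)."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)
        index >>= 1
    return result


class CircularFir:
    """FIR filter built from a circular delay line and a MAC loop."""

    def __init__(self, coeffs):
        self.coeffs = list(coeffs)
        self.delay = [0] * len(self.coeffs)
        self.pos = 0  # slot that receives the next input sample

    def step(self, sample):
        n = len(self.coeffs)
        self.delay[self.pos] = sample      # overwrite the oldest sample
        acc = 0
        idx = self.pos
        for c in self.coeffs:
            acc += c * self.delay[idx]     # one multiply-accumulate per tap
            idx = (idx - 1) % n            # circular (modulo) address update
        self.pos = (self.pos + 1) % n
        return acc


fir = CircularFir([1, 2, 3])
print([fir.step(x) for x in [1, 0, 0, 0]])   # → [1, 2, 3, 0]
print([bit_reverse(i, 3) for i in range(8)])  # → [0, 4, 2, 6, 1, 5, 3, 7]
```

Feeding a unit impulse returns the coefficients in order, which is a quick check that the modulo indexing walks the delay line correctly without ever shifting the stored samples.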
- DEC
PDP-8 chips (12-Bit):
IM6100 ("CMOS-8", 6100 (CPU), 6101 (PIE), 6102/6103
(support chip, address extension ala PDP-8/E), by Intersil,
CMOS, 1976, used in DECstation/VT78)
HM6120 (6100 + 6102, 6121 I/O controller, 10 MHz, by Harris, 1981,
used in DECmate-I, II, III and III+)
- DEC
PDP-11 chips (16-Bit, chips,
performance):
for instruction sets (base+, EIS, FIS, FPP, CIS) see PDP-11
LSI-11 (by Western Digital, MCP-1600 chip set, 1611 data path
+ 1621 control + 2x1631 microcode ROM, EIS ("Extended Instruction
Set") + FIS ("Floating point Instruction Set")
ROM available (KEV11), base+, 1975, used on LSI-11 board and PDP-11/03),
DCF-11 ("Fonz" or "F11", consisted of a 2-chip
hybrid (21-15541 data path and 23-001C7 control chip), KTF-11
22-Bit MMU chip (21-15542), KEF-11 FPP floating point 2-chip hybrid
option (23-002C7 and 23-203C7), alternative: FPF-11 floating point
option (picture),
CIS option as 2 chips, dual register set, base+, EIS, 1979, used
in PDP-11/23, PDP-11/23+ and MicroPDP-11/23, DEC
picture),
DCT-11 ("T11", 1983, single-chip CPU (21-17311), for
embedded applications, used in Falcon, RQDX and DEUNA, picture),
DCJ-11 ("Jaws" or "J11" or "PDP-11/70
on a chip", by Harris, 2-chip hybrid (21-17679 and 21-17677),
FPJ-11 FPU chip (21-21858), CMOS, 15 MHz and 18 MHz, 1984, DEC
picture, picture,
lacked the WCS (writeable control store) and CIS ("Commercial
Instruction Set") options, 22-Bit MMU, separate I/D, dual
register set, base+, EIS, FPP, used on LSI-11/73 board, MicroPDP-11/73,
MicroPDP-11/83, PDP-11/84, new version with improved system layout
("cache-only") used in MicroPDP-11/93,
PDP-11/94)
- DEC
VAX chips (32-Bit, clocks,
interview,
VAX
processors):
VAX systems see here
Note:
MOSAIC ("Motorola Oxide Self Aligned Implanted Circuits")
by Motorola is a high-density
bipolar chip technology
VAX 700:
780 (TTL modules, SID=0x01, 1977)
750 (TTL gate-array modules, SID=0x02, 1980)
730 (TTL modules, SID=0x03, 1982)
VAX 8000:
8600 (MOSAIC-I ECL gate-array modules, SID=0x04, 1984)
8800 (MOSAIC-I ECL gate-array modules, SID=0x06, 1986)
V-11:
V-11 (single-chip prototype, SID=0x05, 1983?)
MicroVAX I:
MicroVAX I data path chip (NMOS 4 micron, rest of CPU was spread
over lots of TTL chips, part of the KD32 (=KA610) MicroVAX I CPU
("DAP" data path module + "MCT" memory controller
module), SID=0x07, 1984)
MicroVAX II (chip generation 1):
MicroVAX-32
(78032=DC333 CPU + 78132=DC337 FPU, SID=0x08, ZMOS (4-type NMOS)
3.0 micron, 101k trans., 40 MHz master clock for 8-phase 200 nsec
microcycle, used in MicroVAX II and VAXstation 2000 and VAX 8200,
1985)
CVAX (chip generation 2):
CVAX
(78034 CPU + 78134 FPU, SID=0x0a, CMOS 2.0 micron, 44.44 MHz master
clock for 4-phase 90 nsec microcycle, 134k trans., used in VAXstation
3500/3200 and VAXstation 3100 M30 and VAX
6000/200, 1987),
CVAX+
(enhanced CVAX, SID=0x0a, CMOS 1.5 micron, 60 nsec, used in VAX
6000/300, 1988)
SOC
("System On a Chip", DC222, SID=0x14, CMOS-3 1 micron,
35/40 nsec, for low cost VAX designs, used in VAX 4000/VLC, 1990)
Rigel/Mariah (chip generation 3):
Rigel
(REX520 CPU + DC523 FPU, SID=0x0b, CMOS-2 1.5 micron, 28 nsec,
first VAX vector instruction set, used in VAXstation 3100 M76
and VAX 4000/300 and VAX 6000/400, 1989),
Mariah
(Rigel variant, SID=0x12, CMOS-3 1.0 micron, 16 nsec, used in
MicroVAX 3100 and VAXstation 4000/60 and VAX
6000/500, 1990)
VAX 9000:
9000 (MOSAIC-II ECL gate-arrays, SID=0x0e, 1990)
NVAX (chip generation 4):
NVAX
(SID=0x13, CMOS-4 0.75 micron, 1.3M trans., 12/14/16 nsec, macropipelined,
based on VAX
9000 (ECL gate arrays) microarchitecture, used in VAXstation
4000/90 and VAX 4000/400 and VAX
6000/600, paper,
1991),
NVAX+ (NVAX for Alpha AXP Bus, SID=0x17, CMOS-4 0.75 micron, 11
nsec, used in VAX 7000/600 and VAX 10000/600, paper,
1992),
NVAX5 ("NVAX++", SID=0x17, CMOS-5 0.5 micron, 7.5 nsec,
used in VAX 7000/700 and VAX 10000/700, 1994)
- AMD
AM2900 family (Bit-slice processor chip set,
introduction
to the AM2900 family + Bit-slice design):
AM2901, AM2902, AM2903, AM2904 (4-Bit slice processor, ALU + register
file, first member: 2901 (1975, same as MMI 67901, Xilinx
C2901))
AM2909, AM2911 (microprogram controller (CCU + micro instruction
register))
AM2910 (microprogram sequencer, AM2910
info, Xilinx
C2910A)
AM2914, AM2913 (interrupt controller, AM2914
info, AM2913
info)
AM2919, AM2918 (instruction register)
AM2925 (programmable microcycle timing support)
AM2930, AM2932 (main memory program control)
AM2940, AM2942 (DMA support)
AM2950 (bus I/O)
IA32
compatible CPUs:
Am386 (1989)
Am486 (1993, info)
Am5x86 ("80486DX5", 4-times overdrive 486, 0.35
micron, 1995, info,
info2)
AMD5K86 ("AMD-SSA/5", first K5 core, 1995, info)
AMD-K5 (improved AMD5K86, 0.35 micron, 4.3M trans., 6-issue
RISC microcore, out-of-order execution, 1996, info)
NexGen Nx586 (0.44 micron, 3.5M trans., 4-issue RISC86
micro architecture, superscalar, out-of-order execution, 1995,
announcement,
info,
info2)
NexGen Nx686 (never released, announcement)
AMD acquires NexGen
AMD-K6 (based on Nx686 work, 0.25 micron, 8.8M trans.,
6-issue RISC86, MMX (integer SIMD extensions), 1997, info,
datasheet)
AMD-K6-2 (0.25 micron, 9.3M trans., 6-issue RISC86, 3D-Now!
(floating-point SIMD extension), 1998, info,
datasheet)
Mobile AMD-K6-2-P (info)
AMD-K6-III (0.25 micron, 21.3M trans., TriLevel cache,
6-issue RISC86, info,
datasheet)
Mobile AMD-K6-III-P (info)
Mobile AMD-K6-2+ (0.18 micron, PowerNow!, info)
Mobile AMD-K6-III+ (0.18 micron, PowerNow!, info)
K7 core:
AMD Athlon ("AMD-K7", 0.25 micron, 22M trans., 3-way x86
superscalar, 9-issue RISC86, DEC Alpha EV6 bus, Slot-A, 1999,
info)
K75 core:
AMD Athlon (0.18 micron)
Thunderbird core:
AMD Athlon (copper, 0.18 micron, 37M trans., on-die L2
cache, Socket-A)
AMD Duron ("Spitfire", socket A)
Mobile AMD Duron
Palomino core:
Mobile AMD Athlon 4 (0.18 micron, 2001, info)
AMD Athlon MP ("Mustang", for multi-processor
systems)
AMD Athlon XP (MMX/ISSE/3DNow!, QuantiSpeed 9-issue micro
architecture, info)
AMD
Duron ("Morgan", info)
Mobile AMD Duron
X86-64:
(IA32 compatible 64-Bit architecture, whitepaper,
x86-64.org, info)
AMD-K8 ("Hammer" architecture, scheduled for
2002)
- MIPS
("Microprocessor without Interlocked Pipeline Stages",
but newer versions (MIPS III, ...) with interlocked pipeline,
concept roots in Stanford MIPS project, one of the first RISC
microprocessors, history,
Linux
MIPS-HOWTO, ISA=instruction set architecture, ASE=application
specific extension):
MIPS I (1985):
R2000 + R2010 FPU (as COP1) + 4xR2020 WB (32-Bit RISC, 5-stage
(6-stage FPU) pipeline, CMOS, 110k trans., 8.3/12.5/15 MHz, 1985/1987
FPU, imp=0),
R3000 + R3010 FPU (enhanced R2000, CMOS, 115k trans., 20/25/33.3
MHz, 1988, imp=1),
R2000A + R2010A FPU (R3000 core, 12.5/16.67 MHz, 1988, imp=1,
pin compatible replacement for R2000),
R3000A (enhanced R3000, 20/25/33.3/40 MHz, 1989, imp=2, paper),
R3400 (integrated R3000 + R3010, by PACEMIPS),
RISCore3000 family (microcontroller family by IDT, based on R3000A,
1994, only "E"-versions contain MMU, IDT
list, paper):
R3051, R3052, R3071, R3081 (incl. FPU), 3041
RISCore 32300 family (microcontroller family, by IDT, based on
R3000A, 2000, IDT
list)
R39xx (MIPS I + parts of MIPS II and III for embedded applications,
Toshiba TX39 series, 1995, Philips PR31700 contains R3900 core)
MIPS II (1990):
R6000 + R6010 FPU + B3110 MULT/DIV (connected to FPU) + R6020
system bus controller/WB (by BIT Technology, enhanced R3000, ECL,
90k trans., 66.7/80 MHz, 1990, imp=3, has nothing to do with IBM's
RS/6000, different MMU than MIPS I),
R6000A (imp=6)
MIPS III (1991):
R4000 (first 64-Bit MIPS, 8-stage superpipelined, CMOS 0.7 micron,
1.35M trans., 50(100) MHz, PC=low cost/SC=secondary cache interface/MC=multiprocessing,
1991, imp=4, rev=0x00), R4400 (enhanced R4000, CMOS 0.6 micron,
2.3M trans., 250 MHz, 1992, imp=4, rev=0x40),
low cost (non-superpipelined, 5-stage pipeline) versions:
R4200
("VRX", by NEC, CMOS
0.6 micron, 80 MHz, 1993, imp=0xa, FPU combined with IU, MIPS
III architecture for embedded applications),
R4300i
(low cost version of R4200, 100 MHz, CMOS 0.35 micron, 1997, imp=0xb),
newer NEC
families: VR4100, VR4300
R4600 ("Orion", by IDT,
100/133/150 MHz, 1993, imp=0x20, designed as low cost alternative
to R4400, comparison
with R4200),
R4700
(successor of R4600, 175 MHz, 1995)
MIPS IV (1994):
R8000 + R8010
FPU (5-stage pipeline, superscalar, CMOS, 2.6M + 830k trans.,
75/90 MHz, 1994, imp=0x10),
R10000 (successor
of R8000, "ANDES" out-of-order
execution, CMOS 0.35 micron, 6.7M trans., 200/275 MHz, 1996,
imp=0x9, die
photo),
R5000
(low cost version based on R10000, CMOS 0.32 micron, 3.6M trans.,
250 MHz, 1996, optimized for single-precision FP),
R5230, R5260,
R12000
(improved R10000, CMOS 0.25 micron, 300/400 MHz, 1997),
R14000 (improved R12000, by NEC, 500 MHz, 1999),
NEC
families: VR5000, VR10000
MIPS V and MDMX
("MIPS digital media extension") (1996):
both are SIMD extensions, actually no special CPU ever built,
superseded by MIPS64
new ISA standards (for embedded applications):
MIPS32:
32-Bit, MIPS II + parts of MIPS IV + ..., also for 64-Bit CPUs
in 32-Bit mode
MIPS64:
64-Bit extension of MIPS32, MIPS IV + MIPS V + ...
MIPS16:
compressed 16-Bit instructions for code reduction
MIPS-3D ASE:
SIMD 3D-processing extensions to MIPS64
custom designs:
Sony "Playstation"
("PS-X", custom R3000A CPU at 33.8688 MHz, 2 MB RAM,
1MB graphics RAM, 512 KB sound RAM, 3D graphics coprocessor, MPEG
decoder, 1994 (Japan)/1995 (USA))
"PS one" (Playstation successor, 2000)
"Reality Immersion Engine" ("Reality Coprocessor"
or "RCP", by NEC, coprocessor in Nintendo64,
CMOS 0.35 micron, approx. 4M trans. incl. CPU, 1996, announcement,
MIPS Nintendo64
history,
Nintendo64 info, processor units in Nintendo64:
CPU: custom version of R4300i, 93.75 MHz
RCP: 64-Bit vector coprocessor, 62.5 MHz)
ICE
("Image Compression Engine", R4xxx-derived control logic
unit + 128-Bit SIMD MDMX-style CPU, by SGI,
66 MHz, 1997, used in SGI O2
workstations as additional CPU to R5000/R10000 main CPU, accelerates
JPEG and OpenGL imaging extensions, O2 uses UMA
("Unified Memory Architecture") with multi-ported main
memory to CPU/ICE/IOE/MRE, O2
info)
"emotion
engine" (by Sony/Toshiba
for PlayStation2,
CMOS 0.25 micron, 13.5M trans., 296 MHz, 2000, units on chip:
CPU: MIPS III CPU
(2x64-Bit IU) + FPU (FMAC + FDIV, as COP1) + VU0
(4xFMAC + FDIV, 128-Bit SIMD, as COP2 or as separate VLIW processor)
geometry processing: VU1
(4xFMAC + FDIV + EFU (FMAC + FDIV), 128-Bit SIMD, separate VLIW
processor) + GIF (graphics interface)
IPU: MPEG2 decoding accelerator
DMA controller
DRAM and I/O interface)
- DEC/Compaq
Alpha
(AXP) 21x64: (64-Bit RISC successor of VAX, high clock
rate ("short-tick design"), newer versions exploit static
and dynamic instruction level parallelism (dynamic
scheduling info) with out-of-order execution and simultaneous
multi-threading ("SMT") (in contrast to static "EPIC"
in IA64), superpipelined + superscalar, roots in DEC's PRISM design
(has nothing to do with Apollo's PRISM), PALcode instructions
for emulation and OS support, performance,
Compaq
documentation, Linux
Alpha-HOWTO, not to be confused with 21xxx ADI SHARC):
EV4 (7-stage INT/10-stage FP pipeline, dual-issue, 1992):
21064 (0.75 micron, 1.7M trans., 150/166 MHz, info),
EV4S (1993):
21064 (200 MHz)
LCA4S:
21066 (low cost EV4, documentation)
EV45 (1994):
21064A
(improved EV4, 0.5 micron, 233/266/275/300 MHz, documentation)
LCA45:
21066A (low cost EV4 core + EV45 FPU, 100/233 MHz)
EV5 (7-stage INT/9-stage FP superpipeline, quad-issue,
1995):
21164 (0.5 micron, 266/333MHz, documentation,
info)
EV56 (1997):
21164A (0.35 micron, 9.6M trans., 366/433/500/612 MHz)
PCA56 (1997):
21164PC (0.35 micron, 400/466/533 MHz, documentation)
PCA57 (1997):
21164PC (improved PCA56, 0.25 micron, MVI ("motion video
instructions"), 400/466/533 MHz, documentation)
EV6 (MVI ("motion video instructions"), out-of-order
execution, register
renaming, 1998):
21264 (0.35 micron, 15.2M trans., 466/500/525/575 MHz, info,
info2,
documentation,
manual,
Microprocessor
Report 1996)
EV67 (2000):
21264A (0.28 micron, 15.2M trans., 600/667/700/730/750/833 MHz,
info,
manual)
EV68AL (2001):
21264B (0.18 micron (aluminium), 15.2M trans., 833-1250 MHz, info,
announcement,
manual)
EV68CB (2001):
21264C (copper)
EV68CX (2001):
21264D
EV68DC (2001):
21264E (140M trans., 1250 MHz)
Intel acquires Alpha technology from Compaq (2001)
EV69 (planned for 2002)
(0.125 micron, SOI)
EV7 (planned for 2002):
21364
EV78:
21364A (0.125 micron)
EV79
EV8 (planned for 2004, 8-way superscalar, 4-way simultaneous
multi-threading ("SMT"))
21464 (0.125 micron, SOI, copper, 250M trans.)
- Apollo
PRISM ("Parallel Reduced Instruction Set Machine"
also called "88k", has nothing to do with DEC's PRISM
or Motorola's 88000, used in Apollo DN10000)
- HP
PA-RISC ("Precision Architecture" RISC, influenced
by Apollo PRISM, info,
hardware database):
HP-PA 1.0 (32-Bit):
PA7000 PCX (used in HP9000-8xx/9xx)
HP-PA 1.1a:
PA7000 PCX-S
HP-PA 1.1b:
PA7100/PA7150 PCX-T
HP-PA 1.1c:
PA7100LC PCX-L
HP-PA 1.1d:
PA7200 PCX-T'
HP-PA 1.1e:
PA7300LC PCX-L2
HP-PA 2.0 (64-Bit):
PA8000 PCX-U (first 64-Bit PA-RISC)
PA8200 PCX-V/U+
PA8500 PCX-W
PA8600 PCX-W+
- Sun
SPARC ("Scalable
Processor Architecture" processor specification, register
windows, concept roots in UC Berkeley RISC project 1984-1988,
one of the first RISC microprocessors, floating
point options, models+CPUs,
hardware+CPUs,
Linux
SPARC-HOWTO):
Specifications/Implementations:
SPARC-V7 (32-Bit, 4-stage integer pipeline)
SPARC
SPARC-V8 (32-Bit, superscalar)
SuperSPARC
MicroSPARC
TurboSPARC
HyperSPARC
SPARC-V9 (64-Bit)
UltraSPARC
UltraSPARC II
UltraSPARC III
UltraSPARC IV
Chips:
Fujitsu SF9010IU = Fujitsu MB86900 (IU) + Fujitsu SF910FPC = Fujitsu
MB86910 (FPC) + Weitek 1164 (MULTIPLY FPU) + Weitek 1165 (ALU
FPU) + MMU (IU&FPC are gate-arrays, used in Sun-4/260)
Fujitsu MB86901 = LSI L64801 + Weitek 3170 (FPU) (used in SPARCstation
I)
Fujitsu MB86902 = LSI L64911 (used in SPARCstation/IPC)
Fujitsu MB86903 = Weitek W8701 (FPU on chip, used in SPARCstation/ELC
and IPX)
Ross RT601 = Cypress CY7C601 + TI 8847 or TI TMS390C601A or Ross
RT602 = Cypress CY7C602 (FPU) + Ross RT605 = Cypress CY7C605 (MMU,
SPARC reference MMU implementation) (used in SPARCstation 2)
MicroSPARC = TI TMS390S10 (FPU + MMU on chip, 3-level MMU, used
in SPARCstation 4/xx)
MicroSPARC II = Fujitsu MB86904 (FPU + MMU on chip, used in SPARCstation
5/xx)
TurboSPARC = Fujitsu MB86907
SuperSPARC = TI TMX390Z50 (FPU + MMU on chip, 3-level MMU, used
in SPARCstation 10/xx)
HyperSPARC = Ross RT620 (IU + FPU) + RT625 (MMU) (used in SPARCstation
20/xx)
- IBM
ROMP ("032", RISC CPU, 1986)
ancestor: IBM 801
used in PC/RT
POWER ("Second Generation RISC" = "SRG",
superscalar 32-Bit RISC architecture, 1990)
ancestors: "AMERICA" (three-processor design) and later
"RIOS"
used in "RISC System/6000" = RS/6000 computers (has
nothing to do with MIPS R6000)
manuals
literature: Dipto Chakravarty, "POWER RISC System/6000",
McGraw-Hill, 1994 (ISBN 0-07-011047-6)
logical components: (central electronic complex = "CEC")
BPU (branch execution unit, 2-stage pipeline)
FXU (fixed-point execution unit, 4-stage pipeline)
FPU (floating-point execution unit, 6-stage pipeline)
ICU (instruction cache functional unit, combined with BPU)
DCU (data cache functional unit)
SCU (storage control functional
unit, memory management)
POWER implementations:
Multi-chip:
* RS 1.0 (chipset: 1x SC, 1x FX, 1x FP, 1x IC/BP, 4x DC)
* RS .9 (chipset: 1x SC, 1x FX, 1x FP, 1x IC/BP, 2x DC)
Single-chip:
* RSC
boards:
CPU planar
I/O planar
standard I/O planar
CPU designation:
SRG-xxyy (xx=clock in MHz, yy=data cache in KB)
PowerPC (CPU architecture based on POWER, 32-Bit
and later on 64-Bit, used in newer RS/6000 systems, 1992)
for details see Motorola section
POWER2 (32-Bit, successor of POWER, used in high-end
RS/6000 systems, 1993)
info
two floating point units
Multi-chip and single-chip ("P2SC") designs
POWER3 (superscalar 64-Bit RISC CPU, based on PowerPC/POWER/POWER2
architecture, used in high-end RS/6000 systems, 1997)
info
POWER4 (2001)
info,
info
POWER5 (scheduled for 2004)
info,
info
POWER6 (scheduled for 2006)
info
- Motorola
Microprocessors:
68xx (8-Bit):
6800 (8-Bit data/16-Bit addresses, 4000 trans., 1974, datasheet)
6809, 6809E (improved 6800, 8/16-Bit data, 1 MHz, 1979, datasheet,
disassembler)
68A09E (1.5 MHz)
68B09E (2 MHz)
6800 based microcontrollers (microcontroller
documentation):
6802 (6800 compatible, datasheet),
6808 (6802 w/o onchip RAM)
6801 family (superset of 6800):
6801, 6803
6805 family (6800 subset, low cost):
6805, 68705 (EPROM version), 68HC05, 68HC705, 68HC805
6811 family (superset of 6801):
68HC11
68xx 16-Bit microcontrollers (microcontroller
documentation):
6812 family (superset of 6811): 68HC12
6816 family (superset of 6811): 68HC16
68k (16/32-Bit):
68000 (HMOS 4.0 micron, 68k trans., 1979),
68008 (68000 with external 8-Bit data bus, 70k trans.)
68010 (enhanced 68000 for virtual memory support, 68451/68851
MMU, 84k trans., 1983)
68020 (CPU) + 68881/2 (FPU) + 68851 (MMU) (HCMOS 2.0 micron, 190k
trans., full 32-Bit virtual memory, 1984)
68030 (CPU+MMU) + 68881/2 (FPU) (CMOS 1.3 micron, 273k trans.,
1987)
68040 (CPU+MMU+FPU, CMOS 0.8 micron, 1.17M trans., 1990, info)
68060 (superscalar, 2.5M trans.)
683xx series (microcontroller, for embedded applications, 68000
core + 68010 and 68020 extensions)
PowerPC
(32-Bit RISC CPU family, based on the IBM "POWER" CPU,
Apple/IBM/Motorola alliance, overview,
old
overview, AIX
assembler):
MPC601 (first PowerPC, 32-Bit, CMOS 0.6 micron, 2.8M trans.,
1992, 50-80 MHz)
MPC602
(low cost, 66 MHz)
MPC603
(low power, 1993, 66-80 MHz), MPC603e
(1995, 100-200MHz), MPC603ev (1995, 160-240 MHz)
MPC604
(1994, 100-180 MHz), MPC604e
(0.35 micron, 5M trans., 1995, 166-223 MHz)
MPC620 (64-Bit),
MPC740,
MPC750
("G3", 32-Bit)
MPC7400
("G4", 32-Bit)
ColdFire (MCF5xxx series, VL RISC, for embedded applications,
Motorola
info)
Version 1: (68000 and ColdFire VL RISC ISA)
MCF5102
Version 2: (ColdFire VL RISC ISA)
MCF5202
MCF5204
MCF5206, MCF5206E
MCF5272
Version 3:
MCF5307
Version 4: (Harvard memory architecture, dual-issue superscalar)
MCF5407
DSPs:
DSP56000 family (24-Bit fixed-point DSP, 56-Bit accumulators,
16-Bit address, info)
56000, 56001 (1987)
56002 (1992)
56001A (1994, successor of 56001 and 56002)
56005
56011
56004
56007
DSP96000 family (32-Bit floating-point DSP, 96-Bit accumulators,
32-Bit address, architecture derived from DSP56000 family)
96001, 96002 (1988/1990, 750000 trans., info,
disassembler)
DSP56100 family (16-Bit fixed-point DSP, info)
DSP56300 family (24-Bit fixed-point DSP, 24-Bit address,
successor of DSP56000 family, info)
DSP56301 (1995)
DSP56303
DSP56600 family (16-Bit fixed-point DSP, 16-Bit
version of 24-Bit DSP56300 family, for low-power applications,
info)
DSP56800 family (16-Bit fixed-point DSP, info)
DSP56850 family (enhanced DSP56800 family, info)
StarCore DSPs (see StarCore, Motorola and Agere (formerly
Lucent Technologies Microelectronics Group))
- StarCore
DSPs (Motorola and Agere Systems (formerly Lucent
Technologies Microelectronics Group) alliance, 1998)
SC140 core (high performance, info)
MSC8101
SC110 core (power efficient)
- Hitachi
63xx
HD6309 (superset of Motorola 6809, 1 MHz)
HD63B09 (2 MHz)
HD63C09 (3 MHz)
HD6301, HD6303 (based on Motorola 68xx microcontrollers, datasheet)
HD64180Z (compatible with Zilog Z180)
- MOS
Technology (MOSTEK)/Rockwell
6500 series
6502
(8-Bit data/16-Bit addresses, 8.0 micron, 1 MHz, 1976, strongly
influenced by Motorola 6800, archive,
cross-development
tools, info)
- Intel
(microprocessor
history, evolution)
Chip development:
Technologies:
1968: silicon gate PMOS, Schottky TTL (bipolar)
1971: NMOS
1972: CMOS
Intel
article
Microprocessors:
4-Bit, MCS-4, MCS-40 (1971):
4004
(CPU) + 4001 (ROM+I/O) + 4002 (RAM+O) + 4003 (shift register)
+ 4008/4009 (mem+I/O interface) (the first microprocessor, 4-Bit
data /8&16-Bit instr./12-Bit PC, 4004: 2300 trans., 10.0 micron,
108 kHz, 1971, datasheets),
4040 ("MCS-40 family", enhanced 4004, Japanese
info)
8-Bit, MCS-8 (1972):
8008 (8-Bit data/14-Bit PC, 10.0 micron, 3500 trans., 200
kHz, 1972, datasheet),
8-Bit, MCS-80 (1974):
8080 (8-Bit data/16-Bit PC, 6.0 micron, 6000 trans., 1974,
datasheet),
8085 (enhanced 8080, 3.0 micron, 6500 trans., 1976),
see Z80 (by Zilog, 1976)
16-Bit, MCS-86 (1978):
8086 (16-Bit data/20-Bit segmented (64KB) addresses, 3.0
micron, 29k trans., 8 MHz, 1978, registers),
8088 (8086 with 8-Bit external data bus, 3.0 micron, 29k trans.,
1979, used 1981 in the first IBM PC (4.77 MHz, with 160 KB floppy)
and later in PC/XT (with harddisk)),
80186/80188 (enhanced 8086/8088 for embedded applications),
8087 (FPU for 8086, "NPX" ("Numeric Processor Extension")
1980)
i286=80286 + i287 FPU (16-Bit data/24-Bit addresses in
PVAM ("Protected Virtual Address Mode"), HMOS 1.5 micron,
134k trans., 6 MHz, 8086 successor, 1982, history,
used in IBM PC/AT)
32-Bit, IA32 (1985):
i386 + i387 FPU (32-Bit data/32-Bit segmented + paged addresses
in "Protected Mode", CMOS 1.5 micron, 275k trans., 80286
successor, 1985, prog.
reference, icomp
diagram, used in some IBM PS/2):
386DX,
386SX (only 16-Bit I/O, 16 MHz, 1988),
386SLC (2x overdrive, IBM)
i486 (based on i386 + FPU, 5-stage pipeline, cache, 1989):
486DX (CMOS 1.0 micron, 1.18M trans., 1989),
486SX (w/o FPU),
486SLC2 (386 pinout, 2x overdrive, IBM),
486DX2 (2x overdrive, CMOS 0.8 micron, 1992),
IntelDX4 ("486DX4", P24C, 4x overdrive, BiCMOS 0.6 micron,
1.6M trans., 1994)
Pentium: (2-way superscalar, 1993, list)
P5 (BiCMOS 0.8 micron, 3.1M trans., 60/66 MHz, 1993),
P54C (BiCMOS 0.6 micron, 3.2M trans., 75/90/100 MHz, 1994),
P54M (upgrade for P54C),
P54CQS (BiCMOS 0.35 micron, 3.2M trans., 120 MHz),
P54CS (BiCMOS 0.35 micron, 3.3M trans., 133/150/166/200 MHz, 1995),
P5T (CMOS 0.8 micron, overdrive for P5, 120/133 MHz),
P54T (for notebooks),
P54CT (overdrive for P54C, CMOS 0.35 micron, 125/150/166 MHz),
P55C (MMX (integer SIMD extension), CMOS 0.35 micron, 4.5M trans.,
166/200/233/266 MHz),
P54CTB (MMX overdrive for P54C, 0.35 micron, 4.5M trans., 125/150/166/180/200
MHz)
Pentium Pro (P6, 0.35 micron, 5.5M trans., RISC core with
CISC interpreter, out-of-order execution, 1995, info)
Pentium II (0.35 micron, 7.5M trans., 300 MHz, improved Pentium
Pro, MMX (integer SIMD extension), 1997), (0.25 micron,
7.5M trans., 333 MHz, 1998)
Pentium II Xeon (0.25 micron, 7.5M trans., 400 MHz, Pentium
II with larger L2 cache, 1998)
Celeron (Pentium II with smaller or no L2 cache)
Pentium III (0.25 micron, 9.5M trans., ISSE (floating-point SIMD
extension), 1999)
Pentium III Xeon (0.25 micron, 9.9M trans., 550 MHz, 1999),
(0.18 micron, 28.0M trans., 733 MHz, 1999)
Mobile Pentium II (0.18 micron, 27.4M trans., 400 MHz,
1999)
Pentium 4 (microarchitecture)
Mobile Pentium III
64-Bit,
IA64 (2001): (VLIW derived static instruction level parallelism
EPIC
("Explicit Parallel Instruction Computing") (e.g. in
contrast to dynamic instruction level parallelism with out-of-order
execution and simultaneous multi-threading in Alpha AXP))
Itanium (info)
McKinley
Microcontrollers:
8-Bit, MCS-48 family (1976):
8048 (8-Bit, 6 MHz, 1976, datasheet,
Japanese
info)
8035 (ROMless 8048)
8021 (low cost 8048 subset, 1977)
8022 (8021 + A/D-converter, 1978)
8049 (enhanced 8048, 11 MHz, 1978)
8039 (ROMless 8049)
8041 (Universal Peripheral Interface "UPI-41", based
on 8048, datasheet)
8042 (improved 8041)
87xx (EPROM versions)
8-Bit, MCS-51 family (1980):
8051 (architecture
info)
8031 (ROMless 8051)
8052 (enhanced 8051)
8032 (ROMless 8052)
87xx (EPROM versions)
DSPs:
2920 (24-Bit instruction word, NMOS 5 micron, 400nsec, 1979, info)
- Zilog
CPUs:
(search)
Z80 core
(enhanced 8080, 8/16-Bit data/address, dual register banks)
Z80 (2.5MHz, 1976)
Z80A (4MHz)
Z80B (6MHz)
Z80H (8MHz)
Z08400
Z84C00
Z8000 (16-Bit, 1979)
Z80000 (32-Bit, 6-stage pipeline, 1983)
Z80320 ("Z320", CMOS Z80000)
Z280 (enhanced Z80, 16-Bit, MMU, 16MB physical address space,
1987)
Z180
core
(enhanced Z80, 8/16-Bit data/logical address, on-chip MMU, 1MB
physical address space)
Z80180 ("Z180", compatible with Hitachi HD64180Z, MMU
info)
Z80181
Z80S183
Z80L183 (low voltage)
S180 core
(improved Z180)
Z8S180
Z80182
Z8L182 (low voltage)
Z80185
Z80189
Z80195 (ROMless Z80185)
eZ80 core
(enhanced Z80, 16MB linear address space)
eZ80190 ("eZ80 Webserver", 50MHz, 2000)
Z380 core
Z80380 ("Z380", 16/32-Bit)
Z80382
- Analog
Devices
DSPs:
ADSP-2100
DSP family:
ADSP-2100 (16-bit fixed-point DSP, 1986, datasheet)
ADSP-2100A (1988)
ADSP-21csp01 (1995, improved ADSP-2100 architecture)
ADSP-218x family
ADSP-219x family
2192 (2000)
SHARC
DSP family (ADSP-21xxx, "Super Harvard Architecture",
32/40-Bit floating point DSP, 32-Bit addresses, 48-Bit multifunction
instructions, not to be confused with 21x64 DEC Alpha CPUs):
21020 (first "SHARC", 1991, info,
history,
ADDS-21020-EZ-LAB)
21010
21060, 21062,
21061,
21065
(SHARC core + SRAM + DMA + ..., SRAM: 60: 4MBit/62: 2MBit/61:
1MBit/65: 544KBit, 1994, info,
history,
ADSP-2106X-EZLITE,
ADDS-21065L-EZLITE)
21160,
21161 ("Hammerhead", 2106x compatible, SIMD extensions,
64-Bit data busses, 1999, info,
history,
ADDS-21160M-EZLITE)
TigerSHARC family (ADSP-TSxxx, "static superscalar",
8/16/32-Bit fixed-point and 32-Bit floating point, 128-Bit busses,
64-Bit external port, 2000, info):
ADSP-TS001
(product
brief)
- Texas
Instruments (TI, TI
DSP home, user
manuals)
TMS320 DSP products:
TMS320XX family (NMOS)
TMS32010 (16-Bit fixed-point DSP, NMOS 2.7micron, 200nsec (390nsec
MAC), 1982, info)
TMS32011
TMS32020 (NMOS 2.4micron, 200nsec, 1985)
TMS320C1X family (16-Bit fixed-point DSP, CMOS successor
of TMS3201X)
TMS320C15
TMS320C17
TMS320C2X family (16-Bit fixed-point DSP, CMOS successor
of TMS3202X)
TMS320C25 (1987)
TMS320C3X family (32/40-Bit floating point & 24/32-Bit
fixed-point DSP, info)
TMS320C30 (1988)
TMS320C33
TMS320C4X family (32/40-Bit floating point and 32-Bit fixed-point
DSP, successor of TMS320C3X, info)
TMS320C40 (1990)
TMS320C5X family (16-Bit fixed point DSP, successor of
TMS320C2X)
TMS320C50 (1990)
TMS320C20X family (16-Bit fixed-point DSP, successor of
TMS320C2X/5X)
TMS320C2000
platform (for control applications)
TMS320C24X
DSP generation (16-Bit fixed-point DSP)
TMS320C28X DSP generation (32-Bit fixed-point DSP)
TMS320C5000 platform (power efficient)
TMS320C54X
DSP generation (16-Bit fixed-point DSP)
TMS320C55X DSP generation (16-Bit fixed-point DSP, variable
instruction length)
TMS320C6000 platform (high performance, VLIW)
TMS320C62X DSP generation (16-Bit fixed-point DSP, VLIW,
info)
TMS320C6201
TMS320C6202
TMS320C6203
TMS320C6204
TMS320C6205
TMS320C6211
TMS320C67X DSP generation (32/64-Bit floating point
DSP, VLIW, info)
TMS320C6701
TMS320C6711
TMS320C6712
TMS320C64X DSP generation (16-Bit fixed-point DSP, VLIW,
TMS320C62X successor)
TMS320C6414
TMS320C6415
TMS320C6416
TMS320C8X family (multi-processor CPU/DSP)
TMS320C80
other DSPs:
TMS57002 ("DASP"="Digital Audio Signal Processor",
24-Bit fixed-point, two engines, I2S serial audio I/Os)
- NEC
DSPs:
NEC DSP history
µPD7720 (16-Bit fixed-point DSP, NMOS 4.5micron, 250nsec
cycle, 1981)
µPD77230 (1986)
- Fujitsu
CPUs:
see SPARC chips
DSPs:
MB8764 (CMOS 2.3micron, 100nsec, 1983)
- AT&T
/ Bell Labs (now Lucent)
DSPs:
DSP-1 (NMOS 4.5micron, 800nsec, 1980)
DSP16 (16-Bit integer)
DSP16A (1986)
DSP32C (32-Bit floating-point, CMOS 1.5micron, 250nsec, 1985,
info)
DSP1600 core (16-Bit fixed-point, based on DSP16, info)
DSP1610 (1990)
DSP1611
DSP1616
DSP1617
DSP1618
DSP1615 (1995, info)
DSP1620 (1996)
DSP1628
DSP3200 core (32-Bit floating-point, based on DSP32C, info)
DSP3210 (1991)
DSP3207
DSP16000 core (info,
info)
DSP16210 (1997)
see StarCore DSP (Motorola and Agere (formerly Lucent Technologies
Microelectronics Group))
-
American Microsystems Inc.
(AMI)
DSPs:
S2811 (16-Bit integer "Signal Processing Peripheral"
in conjunction with 6800 CPU, "V-groove"-MOS 4.5micron,
300nsec, 1978)