Computer Architecture

updated 2002-12-21.

Some notes on Computer Architecture. Very incomplete.

Contents:

[FIXME: should I put links to Beowulf here ?]

David also maintains related files:

``I wisdom dwell with prudence, and find out knowledge of witty inventions.'' -- Proverbs 8:12

news

computer architecture news

The latest computer architecture information is at

computer architecture comic strips

The Freedom CPU Project

DLX

is one of the architectures suggested for the F-CPU. Its major advantage is: gcc has already been ported to it, so we don't need to write a new compiler on top of everything else. So we can immediately compile the Linux kernel and other code for this chip *now*, before we even start doing anything else, which should make it easy to cache analysis and other trade-off analysis. Also, it means that when the hardware is turned on for the first time, it could boot directly into Linux, which would be cool.

(Can we really go to 64 bit architecture and still use this compiler ?)

(Can we really use a TTA architecture for the attached FPUs ?)

MMIX

is one of the architectures suggested for the F-CPU.

TTA

is one of the architectures suggested for the F-CPU.

Using FPGAs to simulate CPUs

(only with FPGAs can you have reconfigurable computing)

see also robot_links.html#PLD Programmable Logic (FPGA, PLD, CPLD, etc.) and the devices needed to program them.

see also simple_cpu for simple CPU architectures that one would think would be simple to implement in a FPGA.

see also vlsi.html#PCI_on_FPGA for FPGA devices compatible with the PCI buss.

Extremely Simple CPU Architectures

see also "Tiny robots" robot_links.html#tiny

I am fascinated by CPU designs that are extremely simple, approaching "minimal", in several (sometimes incompatible) senses of the word "simple".

Here "CPU", "MCU", and "PE" are almost interchangeable.

Often these characteristics exhibit synergy -- when you eliminate some opcodes, that eliminates some gates (making it lower-cost) and makes it easier to understand (less documentation required). Occasionally, though, driving a design to extreme simplicity according to one measure causes extra complexity in another area to compensate. This is called the the Turing tarpit #tarpit .

For purposes of robotics, "low-cost" and "Simple interface" are usually the dominant considerations. Some of the concepts of massively parallel processing are also present when one builds swarms of simple robots, but the kind of random communication between constantly-changing arrangements of simple robots is very different than the communication between the rigidly connected (most commonly in a 2 D mesh or a hypercube) elements of common cellular automata and multi-processors.

See also Using FPGAs to simulate CPUs #FPGA for some very simple CPUs designed to fit onto FPGAs. 2 very different reasons for simplicity there: (a) a simpler CPU can fit onto a smaller FPGA, making it much less expensive. (b) making the CPU smaller allows you to fit more copies of the CPU on a given FPGA, increasing MIPs at no cost. [FIXME: should combine these into 1 section ? Since the same op-code set may be implemented many ways, in TTL, in FPGA, in custom VLSI, it doesn't really make sense to split them into seperate sections ... On the other hand, ``less hardware'' means something a little different in these 3 technologies. ]

[here I *list* simple CPUs I know about; Opcode considerations #considerations and #FPGA also talk about tips for designing new CPU architectures. ]

... Here I also list all the CPUs I know about that were built up out of TTL.

cellular automata

cellular automata is related to other interests I have:

[FIXME: CA stuff scattered elsewhere ...]

The cellular automata questions I'm most interested in are:

There's a little loop here -- first we use (simulated) cellular automata to learn about replication, then we use that information to design replicating tools. We also use (simulated) cellular automata to explore good rules and good patterns for computronium. Then we use those replicating tools to (build enough copies of themselves to) build computronium. Then we use *that* computronium (hardware cellular automata) as a better computer.

self-replication

It has been shown that self-replicators can be designed in cellular automata #cellular_automata . some of this deals with ``real'', physical replication. While other parts -- cellular automata, quines, etc -- deal only with patterns in a computer. Should I separate them ? But some of the theory applies to both.

related to reconfigurable robots robot_links.html#reconfigurable and robot construction (humans building robots) robot_links.html#construction and tool closure 3d_design.html#closure . and the bootstrap problem [FIXME:] [FIXME: cross-link all the self-replication stuff on my web pages. nano, robot, cellular automata, etc. Point back and forth from ``replication'' section to computer architecture # cellular automata robots nanotech idea_space and unknowns ]

Minimum Instruction Set Computing (MISC)

see #simple_cpu and minimal_instruction_set.html

other attempts at a minimal instruction set

Steamer16: a high-performance homebrewer's microprocessor http://www3.sympatico.ca/myron.plichota/steamer1.htm by Myron Plichota <myron.plichota at sympatico.ca>. written in VHDL and prototyped on a single wire-wrapped protoboard. Has only 7 instructions. Packets of 5 instructions (5 instructions * 3 bits = 15 bits/packet) execute in 6 cycles/packet at a cycle rate of 20 MHz. Both the address and data busses are 16 bits. Unusual ``ArF'' call/return protocol.

qUark ../mirror/quark.txt a viable stack-computer with 4-bit opcodes (c) vic plichota, original concept by Myron Plichota Dec '98.

------------------------------
Date: Sat, 20 Feb 1999 12:54:34 +0300
From: "Stas Pereverzev" 
To: MISC
Subject: Re: nFORTH v2.3
Content-Type: text/plain;
	charset="koi8-r"
Content-Transfer-Encoding: 7bit

>Comments, folks?


You need only five instructions, not 16. They are:

ALU:
1. nand
2. shr

RAM:
3. store ( addr n -- )
4. lit   ( -- n )

CONTROL:
5. ncret  \ JUMP to addr in N, if carry flag isn't set in T,
          \ also drop both T and N


Also, if PC is memory variable (or can be addressed as memory variable)
we can awoid "ncret" instruction:
In that case we sholud use NCSTORE instead STORE:
ncstore (addr n flag -- )

ncstore (addr n 0 ) - same as store,
ncstore (PC n -1 )  - same as ncret

We need only 2 bits per instruction in that case.

That all folks ;-)

Stas.

------------------------------

(does this really work ???)

Turing tarpit

Turing Tar Pit.

Opcode considerations

Some things you might think about if you are designing a new instruction set (for a Von Neumann machine).

subroutines

[FIXME: I have a lot of information on subroutines and the importance of well-factored code scattered about ... should I gather it all together here, or since it's so very important, leave a brief reference at each place ?]

Taxonomy of Multiprocessors

also see Koopman's taxonomy of microprocessors [FIXME: ]

_Computation Structures_ by Stephen A. Ward and Robert H. Halstead, Jr. (c)1990 by The Massachusetts Institute of Technology The MIT Press

p.606 Models of Computation The von Nuemann model of computation ... alternative models ... include:

The above list is neither exhaustive nor even representative...

... p.612 Taxonomy of Multiprocessors ... Michael Flynn's taxonomy ...

Computation Structures

Ward and Halstead

quotes from _Computation Structures_ by Stephen A. Ward and Robert H. Halstead, Jr. (c)1990 by The Massachusetts Institute of Technology The MIT Press

... p. 347 Programming languages offer several /classes/ (allocation disciplines) for storage. These differ ... in the degree to which responsibility for allocation and deallocation of storage is assumed implicitly by the system (as opposed as being left to the programmer).

... common storage classes: