In this thesis, a novel computer architecture called Computational RAM (C·RAM) is proposed and implemented. C·RAM is semiconductor random access memory with processors incorporated into the design, while retaining a memory interface. C·RAM can be used to build an inexpensive massively-parallel computer. Applications that contain the appropriate parallelism will typically run thousands of times faster on C·RAM than on the CPU. This work includes the design and implementation of the architecture as a working chip with 64 processor elements (PEs), a PE design for a 2048-PE 4 Mbit DRAM, and applications.
C·RAM is the first processor-in-memory architecture that is scalable across many generations of DRAM. This scalability is obtained by pitch-matching narrow 1-bit PEs to the memory and by restricting communication to 1-dimensional interconnects. The PEs are pitch-matched to the memory columns so that they can be connected directly to the sense amplifiers. The 1-bit-wide datapath is suitable for a narrow, arrayable VLSI implementation, is compatible with memory redundancy, and has the highest performance/cost ratio among hardware arithmetic algorithms. For scalability, the memory arrays and memory-style packaging limit the internal interprocessor communication to 1-dimensional networks. Of these networks, both a broadcast bus network and a left-right nearest-neighbour network are implemented.
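The 1-bit SIMD datapath and the two 1-dimensional networks described above can be illustrated with a small software model. The following is a minimal sketch, not the thesis implementation: the bit-sliced layout (one memory row holds one bit position for every PE) and all function names (`to_rows`, `simd_add`, `shift_right`, `shift_left`, `broadcast`) are assumptions introduced for illustration only.

```python
from typing import List

# Assumed model: one memory row holds one bit per PE (one PE per memory column).
Row = List[int]


def to_rows(values: List[int], width: int) -> List[Row]:
    """Bit-slice one integer per PE: rows[i][pe] is bit i of values[pe]."""
    return [[(v >> i) & 1 for v in values] for i in range(width)]


def from_rows(rows: List[Row]) -> List[int]:
    """Reassemble one integer per PE from bit-sliced rows (LSB first)."""
    n = len(rows[0])
    return [sum(rows[i][pe] << i for i in range(len(rows))) for pe in range(n)]


def simd_add(a: List[Row], b: List[Row]) -> List[Row]:
    """Bit-serial addition: all PEs process the same bit position in each step."""
    carry = [0] * len(a[0])  # one carry bit held in each PE
    out = []
    for row_a, row_b in zip(a, b):  # LSB first, one row pair per step
        out.append([x ^ y ^ c for x, y, c in zip(row_a, row_b, carry)])
        carry = [(x & y) | (c & (x ^ y)) for x, y, c in zip(row_a, row_b, carry)]
    return out  # result is modulo 2**width; the final carry is dropped


def shift_right(row: Row, fill: int = 0) -> Row:
    """Nearest-neighbour step: each PE receives the bit of its left neighbour."""
    return [fill] + row[:-1]


def shift_left(row: Row, fill: int = 0) -> Row:
    """Nearest-neighbour step: each PE receives the bit of its right neighbour."""
    return row[1:] + [fill]


def broadcast(row: Row, src: int) -> Row:
    """Broadcast bus (assumed semantics): every PE receives the bit held by PE src."""
    return [row[src]] * len(row)
```

In this model an addition of `width`-bit operands takes `width` row steps regardless of the number of PEs, which is the sense in which adding columns adds parallelism without widening each datapath.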
C·RAM requires little overhead over the existing memory to exploit much of the internal memory bandwidth. When C·RAM PEs are added to DRAM, more than 25% of the internal memory bandwidth is exploited at a cost of less than 25% in silicon area and power. The memory bandwidth internal to memory chips at the sense amplifiers can be 3000 times the memory bandwidth at the CPU. By placing SIMD PEs adjacent to those sense amplifiers, this internal memory bandwidth can be better utilized.
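The comparison between internal and CPU-side bandwidth can be expressed as a simple ratio. The sketch below is a back-of-the-envelope model under stated assumptions: internal bandwidth is taken as the number of sense amplifiers accessed per row cycle, and CPU-side bandwidth as the bus width per bus cycle. The function and parameter names are hypothetical, and no parameter values are taken from the thesis.

```python
def bandwidth_bits_per_s(bits_per_access: int, cycle_time_s: float) -> float:
    """Bandwidth of a port that moves bits_per_access bits every cycle_time_s."""
    return bits_per_access / cycle_time_s


def internal_to_cpu_ratio(
    sense_amps: int,       # hypothetical: sense amplifiers active per row access
    row_cycle_s: float,    # hypothetical: DRAM row cycle time in seconds
    bus_width_bits: int,   # hypothetical: width of the CPU memory bus
    bus_cycle_s: float,    # hypothetical: CPU memory bus cycle time in seconds
) -> float:
    """Ratio of bandwidth at the sense amplifiers to bandwidth at the CPU bus."""
    internal = bandwidth_bits_per_s(sense_amps, row_cycle_s)
    external = bandwidth_bits_per_s(bus_width_bits, bus_cycle_s)
    return internal / external
```

With equal cycle times the ratio reduces to the number of sense amplifiers divided by the bus width, which shows why the gap grows with the number of memory columns accessed in parallel.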
The performance of C·RAM has been demonstrated in a wide range of application areas. Speed-ups of several orders of magnitude over a typical workstation have been shown in signal and image processing, databases, and CAD.