Zarcon Dee Grissom's Idea Page
EV9
Updated Sunday 10 DEC 2000
Backside RAM
The Faster Computer Part 2.0

A lot of thought went into how the basic computer architecture could be improved. This part covers the motherboard connections and memory architecture.

Idea#003

After a lot of consideration, I came up with a draft of a real performance PC. The computer I have drafted isn't much different from current designs, just applied a little differently from the norm. Imagine a computer that didn't have to waste clock cycles swapping out memory blocks in its L2 cache. Or, for that matter, didn't need a cache at ALL. What if the front-side bus could be dedicated to interfacing with all the cool gadgets in your PC, and not bogged down by the memory? Imagine a 200MHz computer that could blow away the fastest and greatest 1GHz machine available.

Current computers use the CPU's Front-Side Bus (FSB) for EVERYTHING! Worse, current computers have a huge bottleneck in the width and speed of the FSB. They limit the throughput of the CPU to how fast it can get data through its FSB... to RAM, hard disk, graphics card, etc. What about the peripheral interfaces? They're nowhere close to what the CPUs can do. Not much can be done about the standard interfaces like ISA, PCI, etc.; that is up to big corporations agreeing on standards and applying them together. As for the other stuff, like the hard drives, well, there is only so fast you can spin a steel disk before you run into problems with overheating bearings, platter tensile strength, and shorter product lifespan. In my opinion, HDD manufacturers have done amazing things with hard drives and their transfer rates. What's left to improve? The motherboard and the FSB. Computer manufacturers have had no problem inventing all kinds of CPU sockets, slots, cards, etc. Well, let's pick at the problems with the FSB and try to fix them.
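To put a number on that bottleneck: the peak bandwidth of any bus is just its width times its clock. Here is a minimal sketch; the 64-bit, 100MHz figures are only illustrative examples of a typical year-2000 front-side bus, not from any particular board:

```python
def peak_bandwidth_mb_per_s(width_bits, clock_mhz):
    """Peak bus bandwidth in MB/s: bytes per transfer times transfers per second (in MHz)."""
    return (width_bits / 8) * clock_mhz

# An illustrative front-side bus: 64 bits wide at 100MHz.
fsb = peak_bandwidth_mb_per_s(64, 100)  # 800.0 MB/s, shared by RAM, disk, video...
```

Every device on that one bus is carving its traffic out of the same 800MB/s ceiling, which is exactly the problem.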

The manufacturers have made a big fuss about this backside L2-cache bus. Well, what if that was RAM, and not just another cache one more step removed from memory? Most of the traffic that goes across the FSB is grabbing instructions and data from memory to process. If the memory was on its own backside bus, the FSB could be more responsive to all the computer peripherals. The CPU would have a more direct link to the system memory. The system would definitely improve in performance. Catch-22: peripherals and ports use memory addresses to function, so the chipset would have to trap these signals and route them directly to the L2-RAM. Well, current PCs do this already. The graphics port is not a standard VGA card, so the chipset traps signals for a nonexistent CGA/EGA/VGA card and routes them to the graphics port. We could expand this option to work with all base memory access. It could definitely work, and it's not too difficult to implement; we'd just rename the backside L2-cache bus the backside L2-RAM bus.
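The trapping the chipset does can be pictured as a simple address decoder. A hypothetical sketch, where the ranges are the classic PC real-mode memory map and the routing labels are my own:

```python
def route_address(addr):
    """Decide which bus services a physical address (toy decode logic)."""
    if addr < 0xA0000:                 # the 640kB of base memory
        return "backside L2-RAM"
    elif 0xA0000 <= addr < 0xC0000:    # legacy CGA/EGA/VGA frame buffer window
        return "graphics port"         # trapped and re-routed, like current chipsets do
    else:
        return "front-side bus"        # everything else: peripherals, ROM, etc.
```

The point is that the decode already exists in today's chipsets; this design just widens the first case to cover all of main memory.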

Wait a minute, why stop there? We could change the CPU's cache to actual RAM, like L1-RAM: on the chip, extremely fast, and there goes all the memory load from cache refreshing. Almost all Windows computers have the basic 640kB of base memory, some without any memory DIMMs installed. Let's go further: most "E" machines are sold with more than 32 megs of memory, so let's make the L1-RAM 64 megs, or 128 megs. Then you expand by adding memory to the L2-RAM slots. That works quite well.

What would this look like?
New Bus. This CPU and motherboard have all the buses of current PCs; they're just configured a little bit differently. Granted, the ALU, execution handler, and registers are not functional; it's more for demonstration. The current CPUs have very good cores in them; we're just picking at the deficiencies of the rest of the system.

This CPU would use the FSB Controller (CTRLR) to access RAM. I mentioned that this would free up the FSB; it does. The CPU is the computer's primary router of data. When a program transfers data from RAM to something else... it goes through the CPU. That means one clock cycle to read one byte of data from memory, and at least one clock cycle to send the data to something else; the reverse order to get data from something else and put it in RAM. The backside RAM system reduces this operation to as little as one clock cycle, because the FSB merely interfaces with the something else, and not the RAM. The FSB controller is the CPU's interface engine to the rest of the computer, so it does not matter whether the RAM is accessed from the chipset or the CPU; the RAM access happens just as fast either way. But the chipset route does take up precious bandwidth on the FSB from everything else, so using the on-chip FSB controller is a very good option. Besides, that allows some compatibility trapping to be implemented.
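The cycle counting above can be sketched out. This is a simplification I'm assuming for illustration: one cycle per word in each direction, ignoring wait states and bus arbitration that real hardware adds:

```python
def cycles_shared_fsb(words):
    """RAM -> CPU -> device over one shared bus: read, then send, serially."""
    return 2 * words  # one cycle to read each word, one to forward it

def cycles_backside_ram(words):
    """Backside RAM: the memory read and the FSB send overlap,
    so the transfer costs roughly one cycle per word."""
    return words
```

For a 1000-word transfer that's 2000 cycles on a shared bus versus about 1000 with backside RAM: the transfer rate effectively doubles before the clock speed changes at all.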

So when this computer boots up, it merely uses the FSB to read data from the mass-storage controller (hard drive), and the CPU's FSB controller puts the data into memory. There is no need for a cache here; the execution system can get data from the L1-RAM just as fast as it needs it, without blocking everything else from talking to the CPU. The L2-RAM has an extremely wide interface, so running at a mere 50MHz, the L2-RAM can easily keep pace with a 200MHz CPU.
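The width-versus-clock trade works out directly. A sketch, assuming (my assumption, for illustration) that the CPU consumes one 8-byte word per cycle:

```python
def required_width_bytes(cpu_mhz, ram_mhz, cpu_word_bytes=8):
    """How wide the slow L2-RAM bus must be to keep the fast CPU fed:
    the clock ratio times the CPU's per-cycle appetite."""
    return (cpu_mhz / ram_mhz) * cpu_word_bytes

# A 50MHz L2-RAM feeding a 200MHz CPU needs 4x the width:
width = required_width_bytes(200, 50)  # 32.0 bytes, i.e. a 256-bit bus
```

So "extremely wide" here means on the order of 256 data lines, which is why this only makes sense on a short backside bus rather than across the whole motherboard.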

The interface is extremely wide... Notice that all the lines in the core system are doubled: one path for send and another for receive. That means it can send and receive data simultaneously, at double the throughput per clock cycle. Well, that means a 200MHz computer can transfer 400MB/s, and it all works perfectly. And it does. There is a paradox here: how can one CPU handle twice the data throughput of its core system? Because it is sending and receiving simultaneously, which removes the need to temporarily store data between read and write cycles. You see, as an operation is completed it sends the data to where it needs to go, and at the same time the CPU is getting the next instruction and data. Granted, not all executions will require a send of data, but the sending of data does not stop the CPU from getting data to process.
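The 400MB/s figure comes straight from counting both directions. A sketch, assuming one byte per cycle per direction as in the text:

```python
def duplex_throughput_mb_per_s(clock_mhz, bytes_per_cycle=1):
    """Aggregate throughput when send and receive run simultaneously
    on separate paths, each moving bytes_per_cycle every clock."""
    send = clock_mhz * bytes_per_cycle
    receive = clock_mhz * bytes_per_cycle
    return send + receive

# 200MHz, one byte each way per cycle: 400 MB/s aggregate.
total = duplex_throughput_mb_per_s(200)
```

The same clock moves twice the data only because the two directions never contend for the same wires.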

This design presents RAM manufacturers with two possible standards for producing memory modules. One is a simple single read/write bus operating at the CPU's clock speed. The other is a more complex set of multi-write/multi-read buses operating at a fraction of the CPU's clock speed. The former is what I believe manufacturers will be more willing to produce; the latter requires built-in protection against writing to the same address simultaneously, and a form of parallel-to-serial bus caching.

Multi-read, multi-write: the multi-wide RAM module.
Single-write, single-read: the single-wide RAM module.

Part of the reason the single-wide is easier to implement is that there is no need to protect against multiple writes to the same address. Reading from the same address that is being written to, should that happen, can be sped up by forwarding the data back out of the control logic, rather than waiting for the write cycle to finish before reading the data from the DRAM. Any other cell can be read from at the same time writes are happening.
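That read-forwarding behavior amounts to control logic checking a pending-write buffer before touching the DRAM array. This is my own toy model of the idea, not any real module's logic:

```python
class ForwardingRAM:
    """Toy single-wide module: a read of an address with a write still
    in flight is answered from the control logic, not the DRAM array."""

    def __init__(self, size):
        self.cells = [0] * size    # the DRAM array
        self.pending = {}          # writes still in their (slow) write cycle

    def begin_write(self, addr, data):
        self.pending[addr] = data  # latched into control logic immediately

    def finish_writes(self):
        for addr, data in self.pending.items():
            self.cells[addr] = data  # the write cycle finally completes
        self.pending.clear()

    def read(self, addr):
        # Forward pending data if this address is mid-write; any other
        # cell reads straight from the array while writes are happening.
        return self.pending.get(addr, self.cells[addr])
```

A read of the in-flight address returns the new data right away, while reads of every other cell proceed against the array undisturbed.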

My email link, copy and paste: zarcondeegrissom@yahoo.com
Or look for the status "coffee is good" on Yahoo Messenger.