EMMA HS3 Advanced Hardware Outline – Week #6

Hardware - CPU – Central Processing Unit

CPU Models

Brands – Intel, AMD

Processor Types – Desktop, Server, Mobile

Series (Desktop)

AMD – Sempron, Athlon 64, Athlon 64 FX, Athlon 64 X2/X3/X4, Phenom I/II

Intel – Celeron, Pentium 4, Core 2 Duo, Core 2 Quad, Core i3, Core i5, Core i7

CPU Socket Type

AMD – 754, 939, 940, AM2, AM2+, AM3

Intel – 478, 604, 771, 775, 1156, 1366

Technical Specifications

Single-Core, Dual-Core, Quad-Core, Hexa-Core

Operating Frequency – clock speed alone is no longer a reliable measure of performance; review CPU charts for real-world comparisons

Cache – L1: primary; the smallest, fastest memory on the PC, built directly into the CPU itself

– L2: secondary; larger and slower, external to the CPU unless built into the CPU package

– L3: present when L2 is built into the CPU; the slowest cache, external to the CPU

Manufacturing Tech – Size and spacing of the processor's transistors measured in nanometers

CPU Voltage

HyperTransport & QuickPath Interconnect – replace the FSB; QPI is rated in gigatransfers per second (GT/s)

HyperThreading

Virtualization Technology - Videos

List of Intel microprocessors from Wikipedia

Tomshardware.com CPU Charts

Newegg.com pricing and specification information

Compare price to performance

Online documents – CPU Cache & Cache L1, L2, L3

Processor Glossary

CPU Virtualization

HyperTransport & HyperThreading

QuickPath Interconnect

i3 vs i5 vs i7 Microprocessors

Homework Online – Newegg Wish List – CPU Selection

CPU Online Quiz

CPU Cache

Level 1 (Primary) Cache

Level 1 or primary cache is the fastest memory on the PC. It is, in fact, built directly into the processor itself. This cache is very small, generally 8 KB to 64 KB, but it is extremely fast; it runs at the same speed as the processor. If the processor requests information and finds it in the level 1 cache, that is the best case: the information is available immediately and the system does not have to wait.

Note: Level 1 cache is also sometimes called "internal" cache since it resides within the processor.

Level 2 (Secondary) Cache

The level 2 cache is a secondary cache to the level 1 cache; it is larger and slightly slower. It is used to catch recent accesses that are not caught by the level 1 cache, and is usually 64 KB to 2 MB in size. Level 2 cache is usually found either on the motherboard or on a daughterboard that inserts into the motherboard. Pentium Pro processors actually have the level 2 cache in the same package as the processor itself (though not on the same silicon as the processor and level 1 cache), which means it runs much faster than level 2 cache that is separate and resides on the motherboard. Pentium II processors are in the middle; their L2 cache runs at half the speed of the CPU.

Note: Level 2 cache is also sometimes called "external" cache since it resides outside the processor. (Even on Pentium Pros... it is on a separate chip in the same package as the processor.)

Level 3 cache (L3 cache)

Some microprocessor manufacturers now offer central processing units (CPUs) with both level 1 (L1) and level 2 (L2) cache memory, located on the surface of the chip or within its single-edge cartridge. When this is the case, the cache memory that resides outside the processor and on the motherboard (which is referred to as L2 cache in some cases) is called level 3 (L3) cache.
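The lookup order across these levels can be pictured with a small sketch. The code below is a toy model only: the level names follow the descriptions above, but the cycle costs and the access helper are made-up illustrations, not measurements or behavior of any real CPU.

    # Toy model of a cache hierarchy lookup: each miss falls through to the
    # next, slower level. Cycle costs are illustrative round numbers.
    LEVELS = [("L1", 4), ("L2", 12), ("L3", 40), ("RAM", 200)]

    def access(address, caches):
        """Return (level that served the request, total cycles spent)."""
        cycles = 0
        for name, cost in LEVELS:
            cycles += cost
            if name == "RAM" or address in caches[name]:
                # Fill the faster levels on the way back (very simplified).
                for upper, _ in LEVELS:
                    if upper == name:
                        break
                    caches[upper].add(address)
                return name, cycles

    caches = {"L1": set(), "L2": set(), "L3": set()}
    print(access(0x1000, caches))   # first access falls all the way to RAM
    print(access(0x1000, caches))   # repeat access is now a fast L1 hit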

Disk Cache

A disk cache is a portion of system memory used to cache reads and writes to the hard disk. In some ways this is the most important type of cache on the PC, because the greatest differential in speed between the layers mentioned here is between the system RAM and the hard disk. While the system RAM is slightly slower than the level 1 or level 2 cache, the hard disk is much slower than the system RAM.

Unlike the level 1 and level 2 cache memory, which are entirely devoted to caching, system RAM is used partially for caching but of course for other purposes as well. Disk caches are usually implemented using software (like DOS's SmartDrive).
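As a rough sketch of the software side, the snippet below keeps recently read disk blocks in system RAM so that a repeat read never touches the disk. The block size, cache size, and the read_block helper are all illustrative choices for this handout, not the design of SmartDrive or any real driver.

    # Minimal software read cache: recently read blocks stay in RAM so a repeat
    # request is served without touching the (much slower) disk.
    from collections import OrderedDict

    BLOCK_SIZE = 4096                  # illustrative block size
    CACHE_BLOCKS = 256                 # keep at most 1 MB of cached data
    _cache = OrderedDict()             # (path, block number) -> bytes, LRU order

    def read_block(path, block_no):
        key = (path, block_no)
        if key in _cache:              # cache hit: no disk access at all
            _cache.move_to_end(key)
            return _cache[key]
        with open(path, "rb") as f:    # cache miss: go out to the disk
            f.seek(block_no * BLOCK_SIZE)
            data = f.read(BLOCK_SIZE)
        _cache[key] = data
        if len(_cache) > CACHE_BLOCKS: # evict the least recently used block
            _cache.popitem(last=False)
        return data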

Cache Levels 1, 2, and 3 – by Patrick Schmid and Achim Roos, October 6, 2009

Every modern processor comes with a dedicated cache that holds processor instructions and data meant for almost immediate use. This is referred to as the first level cache, or L1; it first appeared on Intel's 486DX and is still an integral part of microprocessors today. AMD processors have standardized on 64KB of L1 cache per core, while Intel processors use 32KB each of dedicated data and instruction L1 cache.

The second level cache (L2) has been available on all processors since the Pentium III, although the first in-package implementation arrived with the Pentium Pro (on a separate die, not on the processor die itself). Today's processors offer up to 6MB of on-die L2 cache; this is the amount shared between the two cores of Intel's Core 2 Duo, for example. Typical L2 cache configurations offer 512KB or 1MB per core, and processors with less L2 cache are usually found in lower-end products. Here is an overview of early L2 cache configurations:

The Pentium Pro had its L2 cache inside the processor package. The Pentium III and Athlon generation that followed implemented L2 cache with the surface-mounted SRAM chips common at the time (1998–1999).

The introduction of 180nm manufacturing processes allowed manufacturers to finally integrate L2 caches within the processor die.

The first dual-core processors simply reused existing designs and duplicated them. AMD did this on a single die, adding a memory controller and a crossbar switch, while Intel placed two single-core dies into one processor package to create its first dual-core.

The first cache that was shared between two cores was the Core 2 Duo's L2. AMD labored away and created its Phenom quad-core from scratch, while Intel decided once again to pair two dies—this time two Core 2 dual-cores—in an effort to create economical quad-cores.

Third level cache has existed since the early days of the Alpha 21164 (96KB, released in 1995) and IBM's Power 4 (256KB, 2001). However, it wasn't until the advent of Intel's Itanium 2, the Pentium 4 Extreme Edition (Gallatin, both in 2003), and the Xeon MP (2006) that L3 caches were used on x86 and related architectures.

Early implementations simply added another cache level, while recent architectures use the L3 cache as a large, shared data buffer for multi-core processors. The high associativity of L3 caches underlines this role: it is preferable to search a little longer inside cache memory than to have several cores trigger slow main-memory accesses. AMD was first to introduce L3 cache on a desktop product, namely the Phenom family. The 65nm Phenom X4 offered 2MB of shared L3 cache, while the current 45nm Phenom II X4 comes with 6MB of shared L3. Intel's Core i7 and i5 both feature 8MB of L3 cache.

The latest quad-core processors come with dedicated L1 and L2 caches for each core and a larger L3 cache shared by all cores. This shared L3 also serves as a place for the cores to exchange data they may be working on in parallel.

It makes sense to equip multi-core processors with a dedicated memory used jointly by all available cores. In this role, a fast third-level cache (L3) can accelerate access to frequently needed data, so that the cores avoid falling back to the slower main memory (RAM) whenever possible.

That's the theory, at least. AMD's recent launch of the Athlon II X4, which is fundamentally a Phenom II X4 without the L3, implies that the tertiary cache may not always be necessary. We decided to do an apples-to-apples comparison using both options to find out.

How Cache Works – The principle of caches is rather simple: they buffer data as close as possible to the processing core(s) so that the CPU does not have to fetch it from more distant, slower memory. Today's desktop cache hierarchies consist of three cache levels before system memory is reached. The second and especially the third levels aren't just for buffering data; their purpose is also to prevent the CPU bus from being choked with unnecessary data traffic between cores.
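A small experiment makes the buffering effect visible. The sketch below walks the same large table twice, once row by row (touching memory in order, so cached data gets reused) and once column by column (jumping around, so far more accesses fall through to slower levels). In a low-level language the gap is dramatic; in Python, interpreter overhead shrinks it, but the row-major walk is still usually the faster one.

    # Same number of element accesses, very different memory access patterns.
    import time

    ROWS, COLS = 4000, 4000
    table = [[0] * COLS for _ in range(ROWS)]

    def row_major():
        total = 0
        for r in range(ROWS):
            for c in range(COLS):      # walks each row contiguously
                total += table[r][c]
        return total

    def col_major():
        total = 0
        for c in range(COLS):
            for r in range(ROWS):      # jumps from row to row on every access
                total += table[r][c]
        return total

    for name, fn in (("row-major", row_major), ("column-major", col_major)):
        start = time.perf_counter()
        fn()
        print(f"{name:>12}: {time.perf_counter() - start:.2f} s")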

Processor Glossary Definitions
Architecture
The size and spacing of the processor's transistors (silicon etchings), which partially determine the switching speed. Transistor size was traditionally measured in microns (one micron is one-millionth of a meter) and is now measured in nanometers (one-billionth of a meter). The 90 nm process combines higher-performance, lower-power transistors, strained silicon, high-speed copper interconnects, and a new low-k dielectric material.

Chipset
The motherboard chipset consists of a north bridge, or Memory Controller Hub (MCH), which controls communication between system memory, the processor, AGP, and the south bridge, or I/O Controller Hub (ICH). The ICH controls communication between PCI devices, the system management bus, ATA devices, AC'97 audio, USB, IEEE 1394 (FireWire), and the LPC controller. These controllers are soldered onto the motherboard and cannot be changed or upgraded.

Clock Speed
The speed at which the processor executes instructions. Every processor contains an internal clock that regulates the rate at which instructions are executed. It is expressed in megahertz (MHz), which is 1 million cycles per second, or gigahertz (GHz), which is 1 billion cycles per second.

Front Side Bus Speed
The speed of the bus that connects the processor to main memory (RAM). As processors have become faster and faster, the system bus has become one of the chief bottlenecks in modern PCs. Typical bus speeds are 400 MHz, 533 MHz, 667 MHz, and 800 MHz (a worked bandwidth example follows this glossary).

L2 Cache
The size of the 2nd-level cache. L2 cache is ultra-fast memory that buffers information being transferred between the processor and the slower RAM in an attempt to speed up these transfers.

L3 Cache
The size of the 3rd-level cache, typically larger than L2. L3 cache is ultra-fast memory that buffers information being transferred between the processor and the slower RAM in an attempt to speed up these transfers. Integrated level 3 cache provides a faster path to large data sets stored in cache on the processor, which reduces average memory latency and increases throughput for larger high-end desktop workloads.

Memory Type
Random Access Memory (RAM) is fast but temporary data storage space. Each chipset supports one type of memory: SDR SDRAM, DDR SDRAM, or RDRAM. SDR (Single Data Rate) SDRAM and RDRAM (Rambus) are older memory technologies that are no longer supported by current Intel chipsets. DDR (Double Data Rate) SDRAM performs two transfers for every one transfer with SDR SDRAM, and dual-channel DDR SDRAM transfers data four times for every one transfer with SDR SDRAM.

Package
The physical packaging or form factor (size, shape, number and layout of the pins or contacts) in which the processor is manufactured. There are many different package types for Intel® processors; see the Processor Package Type Guide for photos and details.

Pin Count
When processors are manufactured using pin grid array (PGA) packaging, the back side of the processor has protruding pins. The number of pins on the processor, along with their layout, is a gating factor for which processors a particular motherboard can support. The socket that is soldered onto a motherboard cannot be changed, so only pin-compatible processors will be supported.

Slot/Socket Type
A motherboard is designed for a certain range of processors, and one of the determining factors of processor compatibility is the slot or socket connector soldered onto the board. 242-contact and 330-contact slot connectors were used for a short time to allow L2 cache to be packaged close to the processor die. Manufacturing advancements now allow L2 cache to be built on the same die as the processor, permitting smaller processor packaging. PGA (pin grid array) sockets are more common, flexible, and compact, but vary widely in the number of pins and pin layouts.

sSpec Number
Also known as the specification number: a five-character string (SL36W, XL2XL, etc.) printed on the processor and used to identify it. From a processor's sSpec number you can look up its core speed, cache size and speed, core voltage, maximum operating temperature, and so on.
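The clock speed, front side bus, and memory entries above all feed into the same peak-bandwidth arithmetic: effective transfers per second times the width of the bus in bytes. The sketch below works through two representative examples (an 800 MHz, 64-bit front side bus and single-channel DDR-400 memory); the figures are common published values and the peak_bandwidth_gb_s helper is an illustrative name, used here only as assumptions for the calculation.

    # Peak bandwidth = effective transfers per second * bus width in bytes.
    def peak_bandwidth_gb_s(effective_mhz, bus_width_bits):
        """Theoretical peak throughput in GB/s (decimal gigabytes)."""
        return effective_mhz * 1_000_000 * (bus_width_bits / 8) / 1_000_000_000

    # 800 MHz effective front side bus, 64 bits wide -> 6.4 GB/s
    print("FSB 800 MHz :", peak_bandwidth_gb_s(800, 64), "GB/s")

    # Single-channel DDR-400 (PC3200): 400 MT/s on a 64-bit bus -> 3.2 GB/s
    print("DDR-400     :", peak_bandwidth_gb_s(400, 64), "GB/s")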

What Is CPU Virtualization?

wiseGEEK.com

CPU virtualization involves a single CPU acting as if it were two separate CPUs. In effect, this is like running two separate computers on a single physical machine. Perhaps the most common reason for doing this is to run two different operating systems on one machine.

The CPU, or central processing unit, is arguably the most important component of the computer. It is the part of the computer which physically carries out the instructions of the applications which run on the computer. The CPU is often known simply as a chip or microchip.

The way in which the CPU interacts with applications is determined by the computer's operating system. The best-known operating systems are Microsoft Windows®, Mac OS®, and the various open-source systems under the Linux banner. In principle, a CPU can run only one operating system at a time. It is possible to install more than one system on a computer's hard drive, but normally only one can be running at a time.

The aim of CPU virtualization is to make one CPU run in the same way that two separate CPUs would. A very simplified explanation of how this is done: virtualization software is set up so that it, and it alone, communicates directly with the CPU. Everything else that happens on the computer passes through this software, which then splits its communications with the rest of the computer as if it were connected to two different CPUs.

One use of CPU virtualization is to allow two different operating systems to run at once. For example, an Apple computer could use virtualization to run a version of Windows® as well, allowing the user to run Windows®-only applications, and a Linux-based computer could run Windows® in the same way. It is also possible to run Mac OS® and Linux at the same time.

Another benefit of virtualization is that it allows a single computer to be used by multiple people at once. One machine runs the virtualization software and connects to multiple "desks," each with its own keyboard, mouse, and monitor, and each user then runs their own copy of the operating system through the same CPU. This set-up is particularly popular in places such as schools in developing markets where budgets are tight, and it works best where users mainly run applications with relatively low processing demands, such as web browsing and word processing.

CPU virtualization should not be confused with multitasking or hyperthreading. Multitasking is simply the act of running more than one application at a time; every modern operating system allows this on a single CPU, though technically only one application is being dealt with at any particular moment. Hyperthreading is where a compatible CPU presents itself to the operating system as two logical processors, allowing it to work on two threads at the same time.
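On a Linux machine you can see whether the CPU advertises the hardware virtualization extensions this section alludes to by looking at the flags in /proc/cpuinfo: Intel VT-x shows up as vmx and AMD-V as svm. The sketch below (virtualization_flags is an illustrative helper name) is a minimal, Linux-only check; virtualization software can still run without these flags, just less efficiently.

    # Minimal Linux-only check for hardware virtualization support flags.
    def virtualization_flags(path="/proc/cpuinfo"):
        with open(path) as cpuinfo:
            for line in cpuinfo:
                if line.startswith("flags"):
                    flags = set(line.split(":", 1)[1].split())
                    return {"vmx", "svm"} & flags   # vmx = Intel VT-x, svm = AMD-V
        return set()

    if __name__ == "__main__":
        found = virtualization_flags()
        print("Hardware virtualization flags:", ", ".join(sorted(found)) or "none found")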

AMD HyperTransport technology

By AMD definition "the HyperTransport technology I/O link is a narrow, high speed, lower power I/O bus that has been designed to meet the requirements of the embedded markets, the desktop, workstation, and server markets, and networking and communication markets."

To de-PR babble this statement, HyperTransport technology simply means a faster connection that is able to transfer more data between two chips. This does not mean that the chip itself is faster. It means that the capability exists via the HyperTransport pathway for one chip to "talk" to another chip or device at a faster speed and with greater data throughput.

Think of a HyperTransport I/O link as a highway between two cities, with the cars being data. If there are a lot of cars on a two-lane highway, there are going to be traffic jams and possibly a few fender benders and scrapes. The HyperTransport bus makes the highway wider and faster, allowing for better traffic flow. This does not mean the cars themselves are any faster (that is up to the car builder), but the road can accommodate more cars, which may have bigger engines and the ability to carry more.

This highway, or bus, is an internal connection. At the motherboard level, the HyperTransport bus connects all parts of the motherboard, such as the PCI slots, AGP slot, and USB ports, to the CPU and memory, and also provides the connection between the CPU and memory itself (although it is a bit more complicated than this).
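To put rough numbers on "a wider, faster highway," point-to-point link throughput can be estimated from the transfer rate and the link width. The figures below are representative published values for a 16-bit HyperTransport 3.0 link and a QuickPath Interconnect link, and link_throughput_gb_s is an illustrative helper name; both are assumptions for the arithmetic, not a spec sheet.

    # Peak one-way link throughput = transfers per second * link width in bytes.
    def link_throughput_gb_s(gigatransfers_per_s, link_width_bits):
        """Theoretical peak throughput in GB/s for one direction of a link."""
        return gigatransfers_per_s * link_width_bits / 8

    # HyperTransport 3.0: about 5.2 GT/s on a 16-bit link -> 10.4 GB/s each way
    print("HT 3.0 link:", link_throughput_gb_s(5.2, 16), "GB/s per direction")

    # QuickPath Interconnect: 6.4 GT/s, 16 data bits per transfer -> 12.8 GB/s each way
    print("QPI link   :", link_throughput_gb_s(6.4, 16), "GB/s per direction")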

So what...will it be faster?

The simple answer is yes, but how much faster depends on how HyperTransport technology is implemented. Keep in mind the old saying, "you are only as good as your weakest link." HyperTransport is technology that can be incorporated into any particular component or device in a PC. It's like a tune-up for a car engine: if all vehicles had the same tune-up, they would all run faster, have more horsepower, or at least get better fuel efficiency, each in their own way. HyperTransport technology raises the performance bar in two ways.