The future of 3D



Page 1: The future of 3D

Issue 226 May 2009 £5.99 Outside UK & ROI £6.49

How to build an EVE Corp

GODFATHER 2: THOU SHALT NOT KILL

WORLD IN CONFLICT: SOVIET ASSAULT

PERFORMANCE GEAR & GAMING

BRAIN-MELTING MOTHERBOARD SUPERTEST

BUDGET PROJECTION: BUILD A SCREEN AND MOUNT ON THE CHEAP

Big screen action on a shoe-string budget

HARDCORE PC ADVICE: Pro tips on Hacking, Tweaking, Overclocking and Modding

I’VE BEEN IONIZED, BUT I’M OK NOW

WWW.PCFORMAT.CO.UK ISSUE 226 MAY 2009

THERE’S MORE… £550 Intel vs AMD PCs | How to fix any PC problem | The tech behind new MMOs


Page 2: The future of 3D


Page 3: The future of 3D

Are we addicted to graphics power? Are NVIDIA and AMD, in reality, the pushers of the most addictive shader skag? In any other walk of life you’d be dragged off to rehab for this intensely self-destructive cycle of substance abuse. But just like that heroin addict, who’s currently trying to break into your home and steal your pride-and-joy gaming system, the PC industry is addicted to graphics. It’s the driving force behind the games we love, the reason we love the PC so much in the first place, and why consoles will always be toys in comparison.

Of course, we could turn around, throw our hands heavenwards and shout: ‘Enough is enough, this madness must end! Stop the development! My graphics card is good enough.’ But where would that get us? We’d still be playing Tomb Raider on an original 3dfx Voodoo card. The desperate truth is that we need, we long for, that hit of hardcore 3D acceleration. We need help, we need treatment; what we need is the latest graphics card. Sliding that long card into a tight PCI Express slot always feels so good.

For many years now we’ve been happy in this abusive relationship: clinging to our ageing card, trying to scrape the last remnants of a decent frame rate together by installing hacked drivers and dropping the resolution, until we end up crawling back to our favourite green or red dealer for a fresh hit of delicious 3D. But today that magic hit isn’t just about graphics: from HD decoding and physics acceleration to GP-GPU features, that graphics card is offering a lot of technology. The next generation of cards is set to take this technology to a new level, and with the advent of a new pusher on the scene – in the form of chip giant Intel – the entire graphics market is set for an enormous shake-up. It’ll be a combination of new competition, changing demands and evolving technology that brings general processing and graphics processing closer and closer together. But what will happen when these two worlds collide? Let’s find out…

A new generation of incredible graphics power is less than a year away. Neil Mohr lines up the new technology.


Page 4: The future of 3D

Not that we want to dwell on the past, but how did this addictive relationship start? The answer lies with what PCs were back in the early nineties and how 3D images are generated in the first place. So let’s take you back, back, back in time, to when the original Doom, Duke Nukem 3D and Wing Commander adorned our screens. 3D gaming was a simplistic affair – sometimes referred to as ‘vector games’. 3D line objects were made of vectors: a mathematical construct, nothing more than a line in space defined by two points. Put three vectors together and you get a triangle; put enough triangles together and you can form anything.

Luckily for your average 7MHz, 16-bit processor, vectors can be manipulated using simple matrix operations, so they can be scaled and rotated in our imaginary space before being drawn to the screen. But lines aren’t very exciting, unless they’re white and Bolivian in origin. As a stepping stone to true 3D, Doom and its clones were based on 2D maps with simple height information; the actual 3D effect was a textured wall projection. Similarly, the monsters were flat bitmaps positioned on that same 2D map, scaled according to their distance from the player. Combined with pseudo-lighting effects, this enabled id Software to generate a basic, fully textured 3D world on a lowly 386 PC. Faster processors enabled devs to combine the texture handling used in Doom with a true full-3D vector engine, creating the likes of Descent in 1995 and, in 1996, the seminal Quake. But despite all the cleverness of these engines, incredibly basic abilities, such as texture filtering, were and remain simply too processor-intensive for a standard CPU to even consider attempting in real time.
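That matrix maths really is simple enough for a 7MHz chip. Here’s a minimal C++ sketch of our own (not from any actual engine) showing a single vertex being rotated and scaled, the kind of per-vertex work those early engines did every frame:

```cpp
// Rotate a 3D vertex about the Z axis and scale it with one 3x3 matrix.
#include <cmath>
#include <cstdio>

struct Vec3 { float x, y, z; };

Vec3 rotateZScale(const Vec3& v, float angle, float scale) {
    float c = std::cos(angle) * scale;
    float s = std::sin(angle) * scale;
    // Equivalent to multiplying by the matrix [c -s 0; s c 0; 0 0 scale]
    return { c * v.x - s * v.y, s * v.x + c * v.y, v.z * scale };
}

int main() {
    Vec3 v{1.0f, 0.0f, 0.0f};
    Vec3 r = rotateZScale(v, 3.14159265f / 2.0f, 2.0f); // 90 degrees, doubled
    std::printf("(%f, %f, %f)\n", r.x, r.y, r.z);       // roughly (0, 2, 0)
}
```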

ACCELERATION HEAVEN The first time our gelatinous eyeballs gazed upon the smooth textures and lighting effects in Quake, or the explosive effects of Incoming, we were hooked. It was these types of effects and abilities that enabled a mid-nineties PC to pull off arcade-level graphics. While not wanting to delve into degree-level subjects, to really understand why graphics cards exist as they do today, it’s helpful to know what’s required to create that eye-pleasing 3D display we so enjoy. As you’ll see, graphics cards started out handling only a fraction of the total process; today they embrace almost the entire task.

We’ve already mentioned vectors and how they can be used to build up models from triangular meshes. You start with your models; these need to be transformed and scaled to fit into a virtual ‘world view’. The application then applies a ‘view space’, which is how the player will view this world: a pyramid volume cut out of the world space that bounds the only area of interest to the renderer. From this pyramid we get the clipping space, which is the visible square of our virtual viewport, and finally these are translated into the screen space, where the 2D x/y coordinates are calculated ready for pixel rendering. These steps are important because originally all of this was done on the CPU, but stage by stage it was shifted to the GPU.
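As a rough illustration of the last couple of hops in that chain, here’s a hedged C++ sketch of our own – not the article’s, and with made-up focal length and screen size – taking a point already in view (camera) space through perspective projection into pixel coordinates:

```cpp
// View space -> screen space: perspective divide plus viewport mapping.
#include <cstdio>

struct Vec3 { float x, y, z; };
struct Vec2 { float x, y; };

Vec2 viewToScreen(const Vec3& p, float focal, int width, int height) {
    // Project onto the z = focal plane (simple pinhole camera model).
    float sx = focal * p.x / p.z;
    float sy = focal * p.y / p.z;
    // Map the [-1, 1] normalised range onto pixel coordinates.
    return { (sx + 1.0f) * 0.5f * width,
             (1.0f - sy) * 0.5f * height }; // y flipped: screen y grows down
}

int main() {
    Vec3 p{0.5f, 0.5f, 2.0f}; // a point two units in front of the camera
    Vec2 s = viewToScreen(p, 1.0f, 1600, 1200);
    std::printf("pixel (%.0f, %.0f)\n", s.x, s.y);
}
```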

Still with us? That was the simple part. Each of those ‘views’ is required for a different stage of rendering. For instance, to help optimise the rendering it makes sense to discard all the triangles that will never be drawn: occlusion culling removes obscured objects, trivial clipping removes objects outside the ‘view space’, and finally back-face culling determines which triangles face away from the viewer and so can be ignored. The clipping-space view is created from the remaining world space, and any models that bisect the viewing boundary box need to be clipped off and retessellated, leaving only the visible triangles in the final scene.
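Back-face culling is the easiest of those to show in code. A minimal, hypothetical C++ sketch of our own, assuming screen-space vertices with a counter-clockwise front-face winding:

```cpp
// A triangle whose screen-space vertices wind clockwise (negative signed
// area) faces away from the viewer and can be discarded before rasterising.
#include <cstdio>

struct Vec2 { float x, y; };

// Twice the signed area of triangle abc via the 2D cross product.
float signedArea2(const Vec2& a, const Vec2& b, const Vec2& c) {
    return (b.x - a.x) * (c.y - a.y) - (b.y - a.y) * (c.x - a.x);
}

int main() {
    Vec2 a{0, 0}, b{4, 0}, c{0, 4};
    if (signedArea2(a, b, c) > 0.0f)
        std::printf("front-facing: keep it\n");  // counter-clockwise
    else
        std::printf("back-facing: cull it\n");   // clockwise
}
```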

Below: From a little voodoo do mighty pipelines grow

Hidden line removal, totally awesome!


[Diagram: three generations of 3D pipeline. Voodoo Pipeline 1995: rasterisation and a texture unit feeding the screen. DirectX 7 Pipeline 1999: transform and lighting, clipping/rasterisation, a multitexture unit and output merger. DirectX 10 Pipeline 2006: input assembler fed by vertex and index buffers, vertex shader, geometry shader with stream output to a stream buffer, clip/project/setup/early-Z, rasterise, pixel shader and output merger, with texture, sampler and constant resources feeding the programmable stages and depth/stencil plus render targets at the end.]


Page 5: The future of 3D

LIGHT & BRIGHT With an optimised view space created, lighting can be applied. It’s important to understand that this isn’t the visual representation of light; it’s calculating how ‘bright’ every surface is going to be. A scene can have a global light source, along with point sources and spotlight sources. Every triangle surface has material properties, such as ambient, diffuse, specular and emissive material colours. For every source and every triangle, a calculation is made to determine its total luminosity. As you can imagine, the more sources there are, the larger the calculation expense. It’s important to remember that, at this stage, all we know is the luminance of each triangular surface; the actual rendering comes later.
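For the curious, that per-surface sum looks something like the classic ambient/diffuse/specular model. A simplified, greyscale C++ sketch of our own – one call like this per light, per triangle, is where the cost explodes:

```cpp
// Ambient + diffuse + specular (Phong) luminance for one surface, one light.
#include <algorithm>
#include <cmath>
#include <cstdio>

struct Vec3 { float x, y, z; };

float dot(const Vec3& a, const Vec3& b) { return a.x*b.x + a.y*b.y + a.z*b.z; }

// n: surface normal, l: direction to light, v: direction to viewer (all unit
// length); ka/kd/ks are the material's ambient, diffuse and specular terms.
float luminance(Vec3 n, Vec3 l, Vec3 v,
                float ka, float kd, float ks, float shininess) {
    float ndl = dot(n, l);
    float diffuse = std::max(0.0f, ndl);
    // Reflect l about n for the specular highlight: r = 2(n.l)n - l
    Vec3 r{2*ndl*n.x - l.x, 2*ndl*n.y - l.y, 2*ndl*n.z - l.z};
    float specular = std::pow(std::max(0.0f, dot(r, v)), shininess);
    return ka + kd * diffuse + ks * specular; // summed over every light source
}

int main() {
    Vec3 n{0, 0, 1}, l{0, 0, 1}, v{0, 0, 1}; // light and viewer head-on
    std::printf("luminance: %f\n", luminance(n, l, v, 0.1f, 0.7f, 0.2f, 32.0f));
}
```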

If you want to know more about the pixel rendering stage, see the pipeline diagram on page 76. For now, let’s just say each pixel can now be blended with its corresponding lighting values, textures and other effects, such as bump maps and light maps. On top of this, each pixel will have filtering applied, plus fogging, shadow values and even antialiasing, to produce the final image. If you’re feeling a bit dazed and wondering what all that was for, it’s so you have an overview of what goes into creating a single 3D frame, which is on screen for mere milliseconds.

As graphics cards have developed, more and more of that pipeline has moved onto the card. With the original 3D cards, only the final rasterisation and rendering stages were performed on-card, and then only by dumb, fixed units capable of a single render pass. Multi-texturing and multi-pass rendering improved visual quality, and when DirectX 7.0 was released in 1999, graphics cards got a little smarter thanks to Transform and Lighting (T&L). T&L moved the lighting and vertex transformation stages onto the graphics card and was the first move away from CPU-based vertex handling.

It wasn’t until the introduction of DirectX 8 that things really got interesting, as the first shaders appeared. Vertex shaders enabled programmers to manipulate vertices directly on the card, while pixel shaders replaced the fixed multi-texture engines with programmable ones. These gave graphics cards their first smarts, limited though they were: there couldn’t be any branches in the code, there were limits on the number of commands and variables, and the total program length was very short. So while technically these cards were running programs of a sort, the two types of shader unit were different in design and very limited.

THE SMART STUFF It wasn’t until DirectX 9.0c was released in 2004, with Shader Model 3.0, that cards started to look more like a collection of smart processors than dumb fixed logic. Dynamic branching, program lengths of over 512 commands and access to hundreds of registers made graphics cards sound more like mini-supercomputers. The final evolution came with the unified shaders introduced in DirectX 10 and Shader Model 4.0.

DIRECTX 11’S GOT LEGS?

The SDK preview is already available to developers; we’ve even got our hands on it as part of the Windows 7 Beta. The hardware is well on the way and will be out in the latter half of 2009, but what can we expect from DirectX 11?

Built atop DirectX 10’s Windows Graphics Foundation, the new version pushes the idea of the graphics card as a GP-GPU system and adds ever more complex pipeline manipulation into the hardware. The most radical feature is the new ‘compute’ shader, whose sole aim is to lay open the power of the GPU for general processing tasks, including physics and media encoding, to name but two.

Another new ability in DirectX 11 comes from a combination of three new features: the hull shader, the hardware tessellator and the domain shader. Before Shader Model 4.0 and the geometry shader, graphics hardware could only manipulate existing vertices rather than create them. The geometry shader changed that, and tessellation enables DirectX 11 hardware to be fed a model and generate extra detail itself. The initial thinking is that low-polygon models will get enhanced, but equally it’d enable high-end hardware to generate highly detailed models while lower-end hardware makes do with the basic ones.
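The real hull and domain shaders are written in HLSL, but conceptually the tessellator just invents vertices a model never shipped with. A toy C++ sketch of our own, splitting one triangle into four via its edge midpoints:

```cpp
// One triangle becomes four: a crude stand-in for what the DX11
// tessellator stage does before the domain shader displaces the
// new vertices (e.g. pushing them out along a height map).
#include <array>
#include <cstdio>
#include <vector>

struct Vec3 { float x, y, z; };
using Tri = std::array<Vec3, 3>;

Vec3 midpoint(const Vec3& a, const Vec3& b) {
    return { (a.x + b.x) / 2, (a.y + b.y) / 2, (a.z + b.z) / 2 };
}

std::vector<Tri> tessellate(const Tri& t) {
    Vec3 ab = midpoint(t[0], t[1]);
    Vec3 bc = midpoint(t[1], t[2]);
    Vec3 ca = midpoint(t[2], t[0]);
    std::vector<Tri> out;
    out.push_back(Tri{t[0], ab, ca});
    out.push_back(Tri{ab, t[1], bc});
    out.push_back(Tri{ca, bc, t[2]});
    out.push_back(Tri{ab, bc, ca}); // the centre triangle
    return out;
}

int main() {
    Tri t{Vec3{0, 0, 0}, Vec3{1, 0, 0}, Vec3{0, 1, 0}};
    std::printf("%zu triangles from 1\n", tessellate(t).size()); // 4
}
```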

We’re glad to see that Microsoft has recognised that more people have multi-core processors – perhaps it has been looking at the Steam Hardware Survey recently? DirectX 11 finally offers developers ways to multi-thread areas of the 3D pipeline. Much of it is sequential, but the new Immediate Context and Deferred Context resources will enable better use of multi-core chips, as they allow resources to be loaded in separately on different threads (sketched below), and it will even work on DirectX 10 hardware as long as it’s running on Vista or Windows 7. It may not add any outstanding eye candy, but what DirectX 11 does add is a heap of new tools that developers can go to task on – more so than anything DirectX 10 brought to the game.
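This isn’t the Direct3D 11 API itself, but the Immediate/Deferred Context idea maps neatly onto a plain-threads analogy: worker threads record their own command lists in parallel, and a single thread plays them back in order. A hypothetical C++ sketch:

```cpp
// Analogy for DX11 deferred contexts: record in parallel, execute in order.
#include <cstdio>
#include <functional>
#include <thread>
#include <vector>

using CommandList = std::vector<std::function<void()>>;

CommandList recordScenePart(int part) {
    CommandList cmds;
    cmds.push_back([part] { std::printf("draw calls for scene part %d\n", part); });
    return cmds;
}

int main() {
    CommandList listA, listB;
    std::thread t1([&] { listA = recordScenePart(1); }); // 'deferred' recording
    std::thread t2([&] { listB = recordScenePart(2); });
    t1.join(); t2.join();
    for (auto& cmd : listA) cmd(); // the 'immediate context' executes
    for (auto& cmd : listB) cmd(); // everything sequentially, in order
}
```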

“It wasn’t until DirectX 8 that things really got interesting, as the first shaders appeared”

Will DX11 lead to vastly better looking games? Probably not

Our favourite 3D game and no GPU in sight


Page 6: The future of 3D

At this point there’s no distinction between vertex and pixel shaders. Cards have ‘unified’ shaders – akin to hundreds of tiny dedicated processing units – found on the GeForce 8 and Radeon HD 2000 series and later generations of cards. This has enabled both AMD and NVIDIA to start offering GP-GPU features and programming languages for current graphics cards, which allow them to process physics and other mathematically complex data alongside 3D rendering.

LARRY WHO? As testament to the idea that shaders are becoming processors in their own right, Intel is wading into the graphics arena, and the ripples could permanently erode a market that once seemed rock solid. As we already know, the new GPU is codenamed Larrabee, and its heart is based, in part, on the original x86 Pentium core; Intel is on record as saying it can, in theory, run OS kernel-level code. The idea is to take a bunch of optimised, in-order x86 Pentium cores, add in a Vector Processing Unit and tie the whole thing together via each core’s L2 cache using a high-speed ring bus. Alongside the multi-core design there’s a dedicated texture filtering unit, plus the usual extra gubbins for the memory controller, display and system interfaces.

Intel is approaching the problem from the opposite direction to AMD and NVIDIA: it’s almost dumbing-down an x86 core to help fit as many as possible onto a GPU die. All parties are selling these as more than just a graphics solution. Intel is partnering with DreamWorks, which will be using Larrabee as an accelerated computing platform for ray tracing frames within its animated features, and Intel has measured a 1GHz, 24-core Larrabee GPU running almost five times faster at ray tracing than an eight-core Xeon processor at 2.6GHz. This shows the huge acceleration potential GP-GPU solutions have in the real world.

Currently no one has any idea how well Larrabee will perform, if it performs at all. However, we managed to dig some figures out of a paper Intel published, which estimates the performance of a Larrabee processor running F.E.A.R., Gears of War and Half-Life 2: Episode 2. The most interesting section took the DirectX commands generated by a sequence of random frames from each of these games and fed them through a ‘functional model’ of Larrabee rendering at 1,600x1,200 with 4x AA. The test was to see how many 1GHz cores were required to keep a constant 60fps output for each game; the answer is between 10 and 24 cores, depending on the game. Clearly that’s nowhere near the performance of top-end cards – the frame rates would have to be nearer 180fps at that resolution – but since tripling the clock to 3GHz should roughly triple that 60fps, a 3GHz, 24-core part would be achievable and still in the realms of reality.

3D PIPELINES

We went deep down the rabbit hole covering the various ‘view’ transformations and lighting, but there’s still a way to go yet. Before rendering can begin there’s the triangle set-up phase, also known as ‘rasterisation’ or ‘scan-line conversion’. For each triangle, its vertex data is used to calculate which screen pixels it intersects, and a colour value is generated for each of those pixels.
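In code, that set-up phase boils down to a coverage test. A simplified C++ sketch of our own, assuming counter-clockwise winding (and no real renderer would print its pixels, of course):

```cpp
// Which pixels does this triangle cover? Test every pixel centre in the
// triangle's bounding box against its three edge functions.
#include <algorithm>
#include <cstdio>

struct Vec2 { float x, y; };

float edge(const Vec2& a, const Vec2& b, const Vec2& p) {
    return (b.x - a.x) * (p.y - a.y) - (b.y - a.y) * (p.x - a.x);
}

void rasterise(const Vec2& a, const Vec2& b, const Vec2& c) {
    int x0 = (int)std::min({a.x, b.x, c.x}), x1 = (int)std::max({a.x, b.x, c.x});
    int y0 = (int)std::min({a.y, b.y, c.y}), y1 = (int)std::max({a.y, b.y, c.y});
    for (int y = y0; y <= y1; ++y)
        for (int x = x0; x <= x1; ++x) {
            Vec2 p{x + 0.5f, y + 0.5f}; // sample at the pixel centre
            // Inside if p is on the same side of all three edges.
            if (edge(a, b, p) >= 0 && edge(b, c, p) >= 0 && edge(c, a, p) >= 0)
                std::printf("pixel (%d, %d) is covered\n", x, y);
        }
}

int main() {
    rasterise(Vec2{0.0f, 0.0f}, Vec2{4.0f, 0.0f}, Vec2{2.0f, 3.0f});
}
```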

Now we can finally start to render the screen image. The most visible and well-known part is texturing. For each triangle, a corresponding texture is translated, rotated and scaled to the correct coordinates, each texture element being known as a ‘texel’. The lighting and colour data is applied and blended to the correct brightness, along with the texture filtering. Multitexturing is also applied at this stage, if necessary over multiple passes; it’s used to apply bump mapping, light maps, specular lighting and other visual tricks.
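Here’s a toy C++ sketch of that texel fetch and multitexture modulate, using made-up greyscale arrays in place of real texture memory:

```cpp
// Fetch a texel and modulate it by a light map: essentially what a
// mid-nineties multitexture unit did for every single pixel.
#include <cstdint>
#include <cstdio>

constexpr int TEX_SIZE = 256;
// In a real engine these would be loaded from disk; zero-filled here.
static uint8_t baseTexture[TEX_SIZE][TEX_SIZE];
static uint8_t lightMap[TEX_SIZE][TEX_SIZE];

// Nearest-neighbour texel fetch: the cheapest possible 'filtering'.
uint8_t sampleNearest(const uint8_t tex[TEX_SIZE][TEX_SIZE], float u, float v) {
    int x = (int)(u * (TEX_SIZE - 1) + 0.5f);
    int y = (int)(v * (TEX_SIZE - 1) + 0.5f);
    return tex[y][x];
}

uint8_t shadePixel(float u, float v) {
    return (uint8_t)(sampleNearest(baseTexture, u, v) *
                     sampleNearest(lightMap, u, v) / 255);
}

int main() {
    baseTexture[128][128] = 200; // a bright texel at the centre
    lightMap[128][128] = 128;    // lit at half brightness
    std::printf("shaded: %d\n", shadePixel(0.5f, 0.5f)); // ~100
}
```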

This would create a perfectly acceptable finished frame, but there are still a number of effects that can be applied. Fogging is the first, rendered on a per-pixel basis, most likely using a look-up table for volumetric fogging. Opacity is also applied, via per-vertex alpha values, enabling glass and water effects. Shadows are another, often applied via a stencil buffer that generates shadow volumes blended into the final image. Finally, antialiasing is applied, either by the brute force of rendering the frame at twice or four times the resolution and sampling it down, or via more interesting techniques such as multisample and adaptive sampling.
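Those blending passes are all simple linear mixes at heart. A brief, hypothetical C++ sketch of per-pixel fog and ‘src over dst’ alpha blending:

```cpp
// Per-pixel fog and alpha blending as plain linear interpolation.
#include <cstdio>

float lerp(float a, float b, float t) { return a + (b - a) * t; }

// fogAmount runs from 0 at the camera to 1 at the far fog plane
// (in practice often read from a look-up table).
float applyFog(float pixel, float fogColour, float fogAmount) {
    return lerp(pixel, fogColour, fogAmount);
}

// Classic 'src over dst' alpha blend for glass and water effects.
float alphaBlend(float src, float dst, float alpha) {
    return src * alpha + dst * (1.0f - alpha);
}

int main() {
    float p = applyFog(1.0f, 0.5f, 0.25f); // bright pixel, light fog
    p = alphaBlend(0.2f, p, 0.5f);         // half-transparent glass over it
    std::printf("final pixel intensity: %f\n", p);
}
```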

Shadows are annoyingly difficult to generate, so it’s best just to turn all the lights off

“Intel is wading into the graphics arena and the ripples could permanently erode the market”

Above: GP-GPU can help accelerate everything from weather prediction to testing WMDs


Page 7: The future of 3D

WHEN LARRY COMES By the time Larrabee launches it could be almost 2010, and both NVIDIA and AMD will have had next-gen DirectX 11 devices well out of the stable. Intel’s own figures show that its core scaling works well up to and over 48 cores, with apparently only a two to ten per cent drop in performance.

It’s impossible at this stage to know how much a Larrabee card will cost, but we can make several massive assumptions based on existing technology. For example, a 24-core GPU would require 6MB of L2 cache; at six transistors per SRAM bit, that’s roughly 300 million transistors on its own. Let’s guesstimate that the modified x86 Pentium cores are twice their original size at 6 million transistors each, giving around 450 million transistors in total for a 24-core Larrabee GPU. Now, if you accept those transistor counts, and accept that fab costs are closer to those of a full processor than a GPU, then at roughly half the transistor count of a 3GHz Core i7 the consumer price could be up to £230. That’s not including the 1GB of GDDR5, of course. The issue is whether Intel can put out a GPU that’s both affordable and a good performer when Larrabee launches.

At least AMD and NVIDIA will put us out of our misery soon enough, as they’re both expected to field DirectX 11 hardware in the second half of 2009. It will be interesting to see which of the two has the most powerful GP-GPU solution, but regardless, Intel won’t get an easy ride. The quality of Intel’s drivers is going to be a key issue, and dual-GPU or SLI-style dual-card support may be a necessity if it wants to compete for the performance crown. ¤

[Diagram: the Larrabee block layout – multiple multi-threaded, wide-SIMD x86 cores, each with instruction (I$) and data (D$) caches, sharing an L2 cache, alongside fixed-function texture logic, memory controllers and the display and system interfaces]

A FUSION FUTURE

We still remember when computers had separate floating point units – yes, we’re that old – which was insane, but at the time it was the only option that worked. Now, of course, the FPU is integrated into the processor core, the memory controller has been absorbed, and the GPU is next on the list.

AMD has staked its claim with its Fusion processor: an AMD multi-core chip that combines unified shaders and video decoding features. Intel will have a competing product at the end of 2009, based on the dual-core Nehalem-C architecture. Its graphics core is based on Intel’s existing 65nm chipset graphics, and will see serious speed improvements via the increased bandwidth and clock speeds that come from being on the CPU die.

Both of these are low-power, laptop and entry-level desktop technologies, which is bad news for NVIDIA: it’s being pushed out of a large chunk of the lucrative laptop and netbook market, as well as losing its chipset market on the desktop. Without an x86 processor it’s hard to see how NVIDIA can compete, leaving it with just the mid- and high-end discrete graphics market, which it will now have to fight for against both Intel and AMD. NVIDIA has publicly stated that it expects to have a low-power x86 chip within a few years, but without the right licences to produce one, it’ll be interesting to see how it can legally pull that off.

Below: How the Larrabee GPU looks to a cubist

Above: AMD is well advanced with its all-in-one CPU-GPU Fusion tech


With 900 million transistors we’re surprised these things are so cheap

