Differences

This shows you the differences between two versions of the page.

techwiki:gpu [2017/02/15 02:19]
ying [GPU Rendering Final Summary]
techwiki:gpu [2017/02/15 03:24] (current)
ying [GPU Rendering Final Summary]
Line 4: Line 4:
  
 ^ Key Point | Description | ^ Key Point | Description |
-^ GPU rendering doesn't use SLI | GPU rendering works like general GPU computing: each device works independently and the devices don't need to be synced | +^ GPU rendering doesn't use SLI | GPU rendering works like general GPU computing: each device works independently and the devices don't need to be synced (ref: [[http://forums.cgarchitect.com/78108-help-first-build-gpu-rendering-mind.html#post401722|difference between SLI and GPU computing]]) |
-^ GPU rendering is OK with any PCIE speed | unlike SLI, which requires at least an x8 PCIE slot, GPU computing can run in a slot of any speed, because slot speed only affects uploading data assets into VRAM, and that time is minimal compared to the time spent on the GPU computation itself | +^ GPU rendering is OK with any PCIE speed | unlike SLI, which requires at least an x8 PCIE slot, GPU computing can run in a slot of any speed, because slot speed only affects uploading data assets into VRAM, and that time is minimal compared to the time spent on the GPU computation itself (ref: [[https://render.otoy.com/forum/viewtopic.php?f=21&t=27454|x4 slot for render]], [[https://blenderartists.org/forum/showthread.php?346927-GPU-rendering-performance-and-PCI-Express-bus-speed|test with motherboard x4 slot]], [[https://setiathome.berkeley.edu/forum_thread.php?id=74365&postid=1492701|GPU computing doesn't swap VRAM data as often as game content]], [[https://world.taobao.com/item/528792774300.htm?spm=a312a.7700714.0.0.bkAa3J#detail|old motherboard with slow but many slots for mining]], [[http://blog.erratasec.com/2011/06/password-cracking-mining-and-gpus.html#.WKQO-2gRXMg|article on GPU mining with slow but many slots]]) |
-^ CPU max PCIE lane count limits the number of x8 speed slots on the motherboard | this may limit the max GPU device count of an SLI setup, but for GPU computing x4 is also fine, and GPU mining rigs even use x1 speed slots | +^ CPU max PCIE lane count limits the number of x8 speed slots on the motherboard | this may limit the max GPU device count of an SLI setup, but for GPU computing x4 is also fine, and GPU mining rigs even use x1 speed slots (ref: [[https://www.asus.com/sg/Motherboards/Z170-A/overview/|Z170-A with x8/x8 plus many low-speed PCIE slots]], [[https://www.pugetsystems.com/labs/articles/Octane-Render-GPU-Performance-Comparison-790/#TestSetup|CPU means lanes for max SLI]]) |
 ^ A dual-CPU setup doubles the max lane count | a dual-CPU setup provides more PCIE lanes, enough even for 5+5 GPU devices at x8 | ^ A dual-CPU setup doubles the max lane count | a dual-CPU setup provides more PCIE lanes, enough even for 5+5 GPU devices at x8 |
 +^ motherboard PCIE slot count determines the max GPU device count | this counts both PCIE Gen 3 and PCIE Gen 2 slots, and slot size does not matter much if you use a PCIE slot size converter |
 ^ motherboard PCIE slot configuration options determine the slot speed setup | higher-priced motherboards tend to offer higher slot speed configurations for a stack of GPU devices | ^ motherboard PCIE slot configuration options determine the slot speed setup | higher-priced motherboards tend to offer higher slot speed configurations for a stack of GPU devices |
-^ an extra PLX chip lets the motherboard create extra PCIE lanes by multiplexing underused lanes | once the data assets are uploaded to VRAM, a GPU doesn't need its lanes while computing, so the PLX chip can allocate the x16 lanes to another GPU; this lifts the CPU max lane count limitation, at the cost of possible added latency | +^ an extra PLX chip lets the motherboard create extra PCIE lanes by multiplexing underused lanes | once the data assets are uploaded to VRAM, a GPU doesn't need its lanes while computing, so the PLX chip can allocate the x16 lanes to another GPU; this lifts the CPU max lane count limitation, at the cost of possible added latency (ref: [[https://en.wikipedia.org/wiki/PLX_Technology|PLX tech]], [[https://www.asus.com/sg/Motherboards/Z170-WS/specifications/|Z170-WS at x8/x8/x8/x8]], [[http://www.anandtech.com/show/9819/the-gigabyte-z170x-gaming-g1-review-quad-sli-on-skylake-and-now-with-thunderbolt-3|Z170X-Gaming G1 at x8/x8/x8/x8 with PLX8747]], [[https://forums.geforce.com/default/topic/903076/sli/is-there-no-longer-any-3-way-sli-support-in-z170-motherboards-/|Z170 vs X99 SLI]], [[https://www.youtube.com/watch?v=A8hi3gm_xzs|video on X99 vs Z170 4-way SLI]], [[http://www.tomshardware.com/answers/id-3190820/cpu-pcie-lanes-motherboards.html|PLX used not just for max GPU count but also for max x16 full speed slots]], [[https://linustechtips.com/main/topic/287691-build-help-x99-pcie-lanes/|X99 with PLX]]) |
-^ the distance between motherboard GPU slots can limit GPU choices | the gap between 2 GPU slots limits the max "thickness" of the GPU device (a so-called single-slot GPU like a low-profile Quadro card, or a normal double-slot gaming GPU); the GPU cooler also needs to be "blower type" like the reference cards if the gap is too small for "open fan type" cooling, and even tighter spacing such as single-slot distance may require custom water cooling | +^ the distance between motherboard GPU slots can limit GPU choices | the gap between 2 GPU slots limits the max "thickness" of the GPU device (a so-called single-slot GPU like a low-profile Quadro card, or a normal double-slot gaming GPU); the GPU cooler also needs to be "blower type" like the reference cards if the gap is too small for "open fan type" cooling, and even tighter spacing such as single-slot distance may require custom water cooling (ref: [[https://smallformfactor.net/forum/threads/gpu-height-clearance.127/|GPU thickness]], [[http://www.pcgamer.com/what-you-need-to-know-about-gpu-coolers/|GPU cooler type]], [[http://techbuyersguru.com/video-card-comparison-blower-style-vs-open-air-coolers|GPU cooler type and gap distance]], [[https://pcpartpicker.com/forums/topic/73698-best-cooler-design-for-4way-sli|talk of cooler design]], [[https://www.servethehome.com/supermicro-x10drg-q-review-gpu-compute-server-motherboard/|big GPU slot gap case]], [[https://www.asus.com/Motherboards/Z10PED8_WS/|Z10PED8_WS: 7 slots but gaps for 4 GPUs]], [[https://www.youtube.com/watch?v=LXOaCkbt4lI|video on 7 GPU single slot mod]], [[http://rawandrendered.com/Octane-Render-Hepta-GPU-Build|article on 7 GPU mod on X99-E WS]]) |
-^ a PCIE riser can extend a tightly spaced PCIE slot to outside the case for better cooling | it works like a PCIE extension cable, but it requires a case that can hang and hold the extended GPUs | +^ a PCIE riser can extend a tightly spaced PCIE slot to outside the case for better cooling | it works like a PCIE extension cable, but it requires a case that can hang and hold the extended GPUs (ref: [[https://bitcointalk.org/index.php?topic=365181.0|holding lots of GPUs]], [[https://www.moddiy.com/products/PCI%252dExpress-PCI%252dE-8X-to-16X-Riser-Card-Flexible-Ribbon-Extender-Cable-w%7B47%7DMolex-%252b-Solid-Capacitor.html|slot size converter cable]]) |
-^ The power supply must be able to feed all GPU devices | calculate the power usage with a power calculator | +^ The power supply must be able to feed all GPU devices | calculate the power usage with a power calculator, for example 3 GPUs = 850 W, 2 GPUs = 650 W |
 +^ An external GPU device goes through motherboard chipset lanes instead of CPU lanes | an external GPU device connected over Thunderbolt or USB 3.1 will not affect CPU lane usage (?) |
 +^ GPU RAM size determines the max scene size or data size | but GPUs are now quite good at memory management, and cards with 8 GB of VRAM are now quite common (ref: [[https://docs.blender.org/manual/en/dev/render/cycles/gpu_rendering.html|each GPU can only access its own memory in the Cycles case]]) |
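The PCIE speed point in the table above can be made concrete with rough arithmetic. This is a minimal sketch, assuming approximate theoretical per-lane throughput of about 0.5 GB/s for PCIE Gen 2 and about 0.985 GB/s for PCIE Gen 3 (real transfers are slower) and a hypothetical 4 GB scene. `upload_seconds` is an illustrative helper, not part of any renderer.

```python
# Approximate theoretical throughput per PCIE lane in GB/s (after encoding overhead).
# These are assumed round figures for illustration, not measured values.
GB_PER_SEC_PER_LANE = {2: 0.5, 3: 0.985}


def upload_seconds(scene_gb: float, lanes: int, gen: int = 3) -> float:
    """Estimate the time to copy scene data into VRAM over a PCIE link."""
    if lanes not in (1, 4, 8, 16):
        raise ValueError("lanes must be one of 1, 4, 8, 16")
    return scene_gb / (GB_PER_SEC_PER_LANE[gen] * lanes)


if __name__ == "__main__":
    scene_gb = 4.0  # hypothetical scene size
    for lanes in (16, 8, 4, 1):
        print(f"Gen 3 x{lanes}: {upload_seconds(scene_gb, lanes):.2f} s")
```

Under these assumptions even an x1 slot uploads the scene in a few seconds, which is small next to a render that runs for minutes. The estimate does not cover workloads that re-upload data for every frame.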
  
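The two data points in the power supply row of the table above (2 GPUs = 650 W, 3 GPUs = 850 W) imply roughly 200 W per extra GPU on top of about 250 W for the rest of the system. The sketch below extends that rule of thumb linearly. The extrapolation beyond 3 GPUs and the name `psu_watts` are assumptions, and a real power calculator fed with the actual card power draw should take precedence.

```python
# Rule-of-thumb data points from the table above: GPU count -> PSU wattage.
KNOWN_POINTS = {2: 650, 3: 850}

# Slope and intercept of the straight line through the two known points.
WATTS_PER_GPU = KNOWN_POINTS[3] - KNOWN_POINTS[2]   # 200 W per GPU
BASE_WATTS = KNOWN_POINTS[2] - 2 * WATTS_PER_GPU    # 250 W for the rest of the system


def psu_watts(gpu_count: int) -> int:
    """Estimate PSU wattage by extending the two known points linearly (an assumption)."""
    if gpu_count < 0:
        raise ValueError("gpu_count must be non-negative")
    return BASE_WATTS + WATTS_PER_GPU * gpu_count


if __name__ == "__main__":
    for n in (1, 2, 3, 4):
        print(f"{n} GPU(s): about {psu_watts(n)} W")
```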
 +  * case studies for all points above
 +    * [[http://www.tomshardware.com/answers/id-3245704/workstation-build-gpu-based-render-animation.html|10-GPU setup case reading with X10DRX]]: debate on GPU device size, power, slot speed and lane allocation
 +    * [[http://forums.cgarchitect.com/78108-help-first-build-gpu-rendering-mind.html#post401722|4-GPU setup case reading with ASUS Z97-WS]]: PCIE Gen3 is fast, and uploading data to VRAM rarely saturates it
 +    * [[http://www.overclockers.com/asrock-z170-oc-formula-motherboard-review/|source for the ASRock Z170 OC Formula's 4 slot speeds]], [[http://vr-zone.com/articles/asrock-z170-oc-formula-review/99356.html|reference on the CPU lane structure for its x8/x4/x4/x4]], [[http://www.legitreviews.com/asrock-z170-oc-formula-motherboard-review_171946|this review as well]], and a [[http://forum.asrock.com/forum_posts.asp?TID=3434&PN=2&title=z170-oc-forumla-6-gpu-detection-on-windows-10|6-GPU setup case with it]]
 +      * ASRock has wired the second PCIe slot to the Platform Controller Hub (PCH) to tap into the PCIe 3.0 lanes from the chipset and enable better multi-GPU performance
 +      * the same trick is used on the ASRock Z170 Extreme7+ (x8/x4/x4) and the Fatal1ty Z170 Professional Gaming i7, but the slot gaps on these 2 boards physically allow only 3 double-slot cards
 +      * [[https://pcpartpicker.com/products/motherboard/#h=4,8&c=110&sort=a8&page=1|list of all motherboards with 4 PCI-E x16 size slots (both Gen2 and Gen3)]]
 +    * [[https://www.youtube.com/watch?v=5xSsAY15VpE|ASUS X99-E with 7 GPU render speed video]]
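The x8/x4/x4/x4 case above can be checked with a small lane budget calculation. This is a sketch under stated assumptions: the CPU is assumed to expose 16 PCIE lanes, and the mapping of widths to slot positions is hypothetical. Only the x8/x4/x4/x4 speeds and the second slot being wired to the PCH come from the notes above. `Slot`, `OC_FORMULA_LAYOUT` and the helper functions are illustrative names.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Slot:
    name: str
    lanes: int
    source: str  # "cpu" or "pch"


def cpu_lanes_used(slots: List[Slot]) -> int:
    """Sum the lanes of the slots wired directly to the CPU."""
    return sum(s.lanes for s in slots if s.source == "cpu")


def fits_cpu_budget(slots: List[Slot], cpu_lanes: int) -> bool:
    """Check that the CPU-wired slots do not exceed the CPU's lane count."""
    return cpu_lanes_used(slots) <= cpu_lanes


# Hypothetical layout: widths per position are assumed, second slot on the PCH.
OC_FORMULA_LAYOUT = [
    Slot("slot1", 8, "cpu"),
    Slot("slot2", 4, "pch"),
    Slot("slot3", 4, "cpu"),
    Slot("slot4", 4, "cpu"),
]

ASSUMED_CPU_LANES = 16  # assumption for a desktop CPU on this platform

if __name__ == "__main__":
    print("CPU lanes used:", cpu_lanes_used(OC_FORMULA_LAYOUT))
    print("Fits CPU budget:", fits_cpu_budget(OC_FORMULA_LAYOUT, ASSUMED_CPU_LANES))
    # If every slot had to come from the CPU, the same widths would not fit.
    all_cpu = [Slot(s.name, s.lanes, "cpu") for s in OC_FORMULA_LAYOUT]
    print("Fits with all slots on CPU:", fits_cpu_budget(all_cpu, ASSUMED_CPU_LANES))
```

Moving one x4 slot onto chipset lanes is what lets four GPUs fit within the assumed 16 CPU lanes. Without it the same widths would need 20 lanes.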
  
 ====== GPU Practice in Real World ====== ====== GPU Practice in Real World ======