devwiki:nvidia

Differences

devwiki:nvidia [2022/09/20 09:09] – [Nvidia Cuda programming] ying
devwiki:nvidia [2022/09/21 09:00] (current) – [Nvidia for AI] ying
Line 39:
     * CUDA programming is:
       * each thread has its own thread ID, which determines which block of data it works on, and all threads process the data together at the same time.
-      * optimizing how code uses memory can be important: better arrangement, and swapping things in the slots of memory blocks, fits more into the fixed-size memory.
+      * optimizing how code uses memory can be important: better arrangement, and swapping things in memory blocks, fits more into the fixed-size memory.
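
The thread-ID idea in the bullets above can be sketched on the CPU. This is a plain Python emulation, not CUDA code: the helper names `global_thread_id`, `launch`, and `add_one` are hypothetical, and the loops run one after another, whereas a GPU would run the threads in parallel. The only part taken from CUDA is the usual 1-D index formula `blockIdx.x * blockDim.x + threadIdx.x`.

```python
def global_thread_id(block_idx, block_dim, thread_idx):
    """Mirror of the usual CUDA 1-D index: blockIdx.x * blockDim.x + threadIdx.x."""
    return block_idx * block_dim + thread_idx


def launch(kernel, grid_dim, block_dim, *args):
    """Call `kernel` once per emulated thread (sequential here, parallel on a GPU)."""
    for block_idx in range(grid_dim):
        for thread_idx in range(block_dim):
            kernel(block_idx, block_dim, thread_idx, *args)


def add_one(block_idx, block_dim, thread_idx, data):
    """Hypothetical kernel: each thread increments the one element it owns."""
    i = global_thread_id(block_idx, block_dim, thread_idx)
    if i < len(data):  # guard: the grid may hold more threads than elements
        data[i] += 1


data = list(range(10))
block_dim = 4
# Round up so every element is covered by some thread.
grid_dim = (len(data) + block_dim - 1) // block_dim
launch(add_one, grid_dim, block_dim, data)
print(data)
```

With 10 elements and 4 threads per block, the launch needs 3 blocks, so 2 of the 12 emulated threads fall past the end of the data and are skipped by the bounds check.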
  
 +====== CV-CUDA ======
 +
 +  * CV-CUDA: computer vision with CUDA.
 +
 +====== Nvidia RTX stack ======
 +
 +  * 1st Gen: VkRay, DXR, DLSS1
 +  * 2nd Gen:
 +    * real-time denoising: spatial denoising
 +    * caustics
 +    * RTXDI: ray-traced direct illumination, casting shadows from all lights and emissive surfaces
 +    * RTXGI: real-time multi-bounce indirect lighting
 +    * Reflex
 +    * DLSS2: deep learning super sampling, AI-generated pixels
 +  * 3rd Gen:
 +    * displaced micro-meshes
 +    * 2D SGM optical flow, shader execution reordering, real-time path tracing, opacity micro-maps
 +    * DLSS3: deep learning super sampling, AI frame generation
 +
 +====== Nvidia GPU architecture ======
 +
 +| core            ^ Turing ^ Ampere ^ Ada  |
 +| shader (TFLOPS) | 16     | 40     | 90   |
 +| RT (TFLOPS)     | 49     | 78     | 200  |
 +| tensor (TFLOPS) | 130    | 320    | 1400 |
 +| OFA (TOPS)      |        | 126    | 300  |
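
To read the table as generation-over-generation gains, the figures can be divided pairwise. The sketch below is illustrative only: the numbers are copied from the table above, the dictionary layout and the `speedup` helper are hypothetical, and the empty Turing OFA cell is kept as `None` rather than guessed. The ratios are unitless, so they do not depend on the units of the figures.

```python
# Figures copied from the table; None marks the empty Turing OFA cell.
peak = {
    "shader": {"turing": 16, "ampere": 40, "ada": 90},
    "RT": {"turing": 49, "ampere": 78, "ada": 200},
    "tensor": {"turing": 130, "ampere": 320, "ada": 1400},
    "OFA": {"turing": None, "ampere": 126, "ada": 300},
}


def speedup(core, new, old):
    """Return peak[core][new] / peak[core][old], or None if either value is missing."""
    a, b = peak[core][new], peak[core][old]
    if a is None or b is None:
        return None
    return a / b


def fmt(x):
    """Format a ratio for display, showing 'n/a' for missing data."""
    return "n/a" if x is None else f"{x:.2f}x"


for core in peak:
    print(
        f"{core:7s} ampere/turing: {fmt(speedup(core, 'ampere', 'turing'))}"
        f"  ada/ampere: {fmt(speedup(core, 'ada', 'ampere'))}"
    )
```

On these figures the largest single step is the tensor row from Ampere to Ada (1400 / 320), while the shader row grows by 2.5 and then 2.25.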
 +
 +====== Nvidia for AI ======
 +
 +  * large language model (LLM): enables a single model to perform many different tasks with context-aware output, such as text-related and image-related tasks.
 +  * NeMo LLM service: a prompt-learning framework for adapting a pre-trained LLM to a specific task.
 +  * recommender systems: as used in shopping and social networks.
  • Last modified: 2022/09/20 09:09
  • by ying