--- devwiki:nvidia [2022/09/21 02:55] – [Nvidia Cuda programming] ying
+++ devwiki:nvidia [2022/09/21 09:00] (current) – [Nvidia for AI] ying
@@ Line 40: / Line 40: @@
       * each thread has its own thread ID, which determines which block of data it works on; all threads finish processing the data together at the same time.
       * optimizing how code uses memory can be important: better arrangement and swapping of data between memory blocks lets more fit into the fixed-size memory.
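The thread-ID-to-data mapping described above can be illustrated with a small CPU-side sketch. It is not GPU code: it only mimics, in plain Python, the usual CUDA index expression `blockIdx.x * blockDim.x + threadIdx.x` for a 1D launch. The function names `global_thread_id` and `simulate_vector_add`, and the block size of 4, are assumptions made for illustration.

```python
def global_thread_id(block_idx: int, block_dim: int, thread_idx: int) -> int:
    """Mirror of the 1D CUDA expression blockIdx.x * blockDim.x + threadIdx.x."""
    return block_idx * block_dim + thread_idx


def simulate_vector_add(a, b, block_dim=4):
    """Sequentially replay what each (block, thread) pair would compute.

    On a real GPU these iterations run in parallel; here they are plain loops.
    """
    n = len(a)
    out = [0] * n
    # Round up so that the grid covers every element, as in a typical launch.
    grid_dim = (n + block_dim - 1) // block_dim
    for block_idx in range(grid_dim):
        for thread_idx in range(block_dim):
            i = global_thread_id(block_idx, block_dim, thread_idx)
            if i < n:  # bounds guard: the last block may have spare threads
                out[i] = a[i] + b[i]
    return out


if __name__ == "__main__":
    print(simulate_vector_add([1, 2, 3, 4, 5], [10, 20, 30, 40, 50]))
```

Each simulated thread touches exactly one element, selected only by its ID, which is the idea the bullet above describes.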
+====== CV-CUDA ======
+  * CV-CUDA: computer vision with CUDA.
 ====== Nvidia RTX stack ======
@@ Line 45: / Line 49: @@
   * 1st Gen: VkRay, DXR, DLSS1
   * 2nd Gen:
-    * real-time denoise
+    * real-time denoise: spatial denoise
     * caustics
-    * RTXDI
+    * RTXDI: ray-traced direct illumination, casting shadows from all lights and emissive surfaces
-    * RTXGI
+    * RTXGI: real-time multiple bounce indirect lighting
     * Reflex
-    * DLSS2
+    * DLSS2: deep learning super resolution, AI-generated pixels
   * 3rd Gen:
     * Displaced micro-meshes
-    * 2D SGM optical flow, shader execution reordering, real-time path tracing, opacity micro-maps, DLSS3
+    * 2D SGM optical flow, shader execution reordering, real-time path tracing, opacity micro-maps
+    * DLSS3: deep learning super resolution, AI frame generation
+====== Nvidia GPU architecture ======
+| core   ^ Turing ^ Ampere ^ Ada  |
+| shader | 16     | 40     | 90   |
+| RT     | 49     | 78     | 200  |
+| tensor | 130    | 320    | 1400 |
+| OFA    |        | 126    | 300  |
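As a rough way to read the table above, the generation-to-generation ratios can be computed directly from its numbers. This is a minimal sketch: the values are copied from the table as given (the page does not state their units), and the `throughput` dictionary and `speedup` helper are hypothetical names introduced here.

```python
# Values copied from the table above; units are not stated on the page.
throughput = {
    "shader": {"turing": 16, "ampere": 40, "ada": 90},
    "RT": {"turing": 49, "ampere": 78, "ada": 200},
    "tensor": {"turing": 130, "ampere": 320, "ada": 1400},
    "OFA": {"turing": None, "ampere": 126, "ada": 300},  # no Turing value listed
}


def speedup(core, old, new):
    """Return new/old for one core type, or None if either value is missing."""
    before = throughput[core][old]
    after = throughput[core][new]
    if before is None or after is None:
        return None
    return after / before


if __name__ == "__main__":
    for core in throughput:
        for old, new in (("turing", "ampere"), ("ampere", "ada")):
            ratio = speedup(core, old, new)
            label = "n/a" if ratio is None else f"{ratio:.2f}x"
            print(f"{core:6s} {old} -> {new}: {label}")
```

The ratios only compare the figures listed on this page; they say nothing about real application performance.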
+====== Nvidia for AI ======
+  * large language model (LLM): lets a single model perform many different tasks, such as text-related and image-related ones, with context-aware output.
+  * NeMo LLM service: a prompt learning framework for adapting a pre-trained LLM to a specific task.
+  * recommender system: used in areas such as shopping and social networks.