Differences

--- devwiki:nvidia [2022/09/20 09:09] – [Nvidia Cuda programming] ying
+++ devwiki:nvidia [2022/09/21 09:00] (current) – [Nvidia for AI] ying
@@ Line 39: / Line 39: @@
     * Cuda programming is:
       * each thread has its own thread ID, which determines which block of data it works on; all threads process their blocks in parallel and finish at the same time.
-      * optimize how code use memory can be important to fit more thing in the fixed size memory by better arrangement and swapping thing in the slots of memory blocks.
+      * optimizing how code uses memory can be important: better arrangement and swapping of data in memory blocks lets more fit into the fixed-size memory.
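
The thread-ID idea in the list above can be sketched as a small vector addition. This is only an illustrative sketch, not code from the wiki page: the kernel name `vecAdd`, the array size, and the block size of 256 are assumptions, and error checking is left out for brevity.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Each thread derives a global index from its block ID and thread ID,
// then processes exactly one element of the arrays.
__global__ void vecAdd(const float* a, const float* b, float* out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {  // guard: the last block may contain spare threads
        out[i] = a[i] + b[i];
    }
}

int main() {
    const int n = 1000;  // hypothetical problem size
    const size_t bytes = n * sizeof(float);
    float *a, *b, *out;

    // Unified memory keeps the sketch short; explicit cudaMemcpy also works.
    cudaMallocManaged(&a, bytes);
    cudaMallocManaged(&b, bytes);
    cudaMallocManaged(&out, bytes);
    for (int i = 0; i < n; ++i) {
        a[i] = static_cast<float>(i);
        b[i] = 2.0f * static_cast<float>(i);
    }

    // Round up so that every element is covered by some thread.
    const int threadsPerBlock = 256;
    const int blocks = (n + threadsPerBlock - 1) / threadsPerBlock;
    vecAdd<<<blocks, threadsPerBlock>>>(a, b, out, n);
    cudaDeviceSynchronize();  // wait until every thread has finished

    printf("out[10] = %.1f\n", out[10]);

    cudaFree(a);
    cudaFree(b);
    cudaFree(out);
    return 0;
}
```

The bounds check matters because the grid is rounded up to a whole number of blocks, so some threads in the last block have no element to work on.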
+====== CV-CUDA ======
+  * CV-CUDA: computer vision with CUDA.
+====== Nvidia RTX stack ======
+  * 1st Gen: VkRay, DXR, DLSS1
+  * 2nd Gen:
+    * real-time denoising: spatial denoising
+    * caustics
+    * RTXDI: ray-traced direct illumination, casting shadows from all lights and emissive surfaces
+    * RTXGI: real-time multiple bounce indirect lighting
+    * Reflex
+    * DLSS2: deep learning super resolution, AI-generated pixels
+  * 3rd Gen:
+    * Displaced micro-meshes
+    * 2D SGM optical flow, shader execution reordering, real-time path tracing, opacity micro-maps
+    * DLSS3: deep learning super resolution, AI frame generation
+====== Nvidia GPU architecture ======
+^ core   ^ Turing ^ Ampere ^ Ada  ^
+| shader | 16     | 40     | 90   |
+| RT     | 49     | 78     | 200  |
+| tensor | 130    | 320    | 1400 |
+| OFA    |        | 126    | 300  |
+====== Nvidia for AI ======
+  * large language model (LLM): lets a single model perform many different tasks with context-aware output, e.g. text-related and image-related tasks.
+  * NeMo LLM service: a prompt learning framework for adapting a pre-trained LLM to a specific task.
+  * recommender system: e.g. in shopping and social networks