Differences
This shows you the differences between two versions of the page.
devwiki:nvidia [2022/09/21 03:20] – [Nivdia RTX stack] ying | devwiki:nvidia [2022/09/21 09:00] (current) – [Nvidia for AI] ying | ||
---|---|---|---|
Line 40: | Line 40: | ||
* each thread has its own thread ID, which determines which block of data it works on; all threads run in parallel and finish processing their data at the same time. | * each thread has its own thread ID, which determines which block of data it works on; all threads run in parallel and finish processing their data at the same time. | ||
* optimizing how code uses memory is important: better data arrangement and swapping data in and out of memory blocks let more fit into the fixed-size memory. | * optimizing how code uses memory is important: better data arrangement and swapping data in and out of memory blocks let more fit into the fixed-size memory. | ||
+ | |||
+ | ====== CV-CUDA ====== | ||
+ | |||
+ | * CV-CUDA: computer vision with CUDA. | ||
====== Nvidia RTX stack ====== | ====== Nvidia RTX stack ====== | ||
Line 56: | Line 60: | ||
* DLSS 3: Deep Learning Super Sampling, with AI frame generation | * DLSS 3: Deep Learning Super Sampling, with AI frame generation | ||
+ | ====== Nvidia GPU architecture ====== | ||
+ | |||
+ | ^ core ^ Turing ^ Ampere ^ Ada ^ |
+ | | shader (TFLOPS) | 16 | 40 | 90 | |
+ | | RT (TFLOPS) | 49 | 78 | 200 | |
+ | | tensor (TFLOPS) | 130 | 320 | 1400 | |
+ | | OFA (TOPS) | | 126 | 300 | |
+ | |||
+ | ====== Nvidia for AI ====== | ||
+ | |||
+ | * large language model (LLM): enables a single model to handle many different tasks, such as text-related and image-related ones, with context-aware output. | ||
+ | * NeMo LLM service: a prompt learning framework for adapting a pre-trained LLM to a specific task. | ||
+ | * recommender system: used in areas such as shopping and social networks. | ||
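
The thread ID note in the CUDA section of this diff (each thread's ID selects the data it works on) can be sketched on the CPU. This is a plain Python simulation, not GPU code: the function names `global_thread_id` and `simulate_vector_add` are hypothetical, and the parameters `block_idx`, `block_dim`, and `thread_idx` only mirror the CUDA built-ins `blockIdx.x`, `blockDim.x`, and `threadIdx.x`. On a real GPU the two loops would run as parallel threads rather than in sequence.

```python
def global_thread_id(block_idx: int, block_dim: int, thread_idx: int) -> int:
    """Mirror of the usual CUDA expression blockIdx.x * blockDim.x + threadIdx.x."""
    return block_idx * block_dim + thread_idx


def simulate_vector_add(a, b, block_dim=4):
    """Add two equal-length lists the way a 1D CUDA grid would split the work."""
    n = len(a)
    out = [0] * n
    # Round up so the last, partially filled block still gets launched.
    grid_dim = (n + block_dim - 1) // block_dim
    for block_idx in range(grid_dim):          # each block of the grid
        for thread_idx in range(block_dim):    # each thread of the block
            i = global_thread_id(block_idx, block_dim, thread_idx)
            if i < n:                          # guard: surplus threads do nothing
                out[i] = a[i] + b[i]
    return out


print(simulate_vector_add([1, 2, 3, 4, 5], [10, 20, 30, 40, 50]))
```

With five elements and a block size of 4, two blocks are launched and the last three threads of the second block fall outside the data, which is why the bounds check is needed.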