Differences
This shows you the differences between two versions of the page.
devwiki:nvidia [2022/09/20 05:28] – ying | devwiki:nvidia [2022/09/21 09:00] (current) – [Nvidia for AI] ying | ||
---|---|---|---|
Line 17: | Line 17: | ||
* Nvidia OVX servers provide hardware support to build larger-scale digital twins | * Nvidia OVX servers provide hardware support to build larger-scale digital twins | ||
* omniverse system: digital twins+ robotics; design +content creation; integration; | * omniverse system: digital twins+ robotics; design +content creation; integration; | ||
+ | * AI: drive, ISAAC (for move+manipulate stuff), metropolis (auto infrastructure), | ||
* replicator: generate synthetic data to train + test AI models | * replicator: generate synthetic data to train + test AI models | ||
* omnigraph, behavior, animation: run data center scale 3d application | * omnigraph, behavior, animation: run data center scale 3d application | ||
Line 24: | Line 25: | ||
* tut: https:// | * tut: https:// | ||
* tut: https:// | * tut: https:// | ||
+ | |||
+ | ====== Nvidia MDL ====== | ||
+ | * to define physically based material | ||
+ | * store specification for material exchange | ||
+ | * render-algorithm agnostic | ||
+ | * designed for high performance on GPU | ||
+ | |||
+ | ====== Nvidia Cuda programming ====== | ||
+ | |||
+ | * video info: How CUDA Programming Works | ||
+ | * https:// | ||
+ | * CUDA programming is designed around how the GPU hardware works, and the GPU hardware is in turn designed around how GPUs are normally programmed, for best performance | ||
+ | * Cuda programming is: | ||
+ | * each thread has its own thread ID; the ID determines which block of data the thread works on, and all threads process their data at the same time. | ||
+ | * optimizing how code uses memory can be important: better data arrangement and swapping data in and out of memory blocks fit more into the fixed-size memory. | ||
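The thread-ID idea in the notes above can be illustrated with a small CPU-side emulation. This is only a sketch in plain Python, not CUDA code: the function names `global_thread_id` and `launch_add_one` are hypothetical, and the loops run sequentially where a GPU would run the threads in parallel. It assumes the common 1D indexing pattern `blockIdx.x * blockDim.x + threadIdx.x`.

```python
from __future__ import annotations


def global_thread_id(block_idx: int, block_dim: int, thread_idx: int) -> int:
    """Mirror of the usual 1D CUDA expression blockIdx.x * blockDim.x + threadIdx.x."""
    return block_idx * block_dim + thread_idx


def launch_add_one(data: list[float], block_dim: int = 4) -> list[float]:
    """Emulate a 1D kernel launch in which every thread adds 1 to one element.

    Hypothetical helper: the two loops stand in for the grid of blocks and the
    threads inside each block, which real hardware executes concurrently.
    """
    n = len(data)
    grid_dim = (n + block_dim - 1) // block_dim  # enough blocks to cover n elements
    out = list(data)
    for block_idx in range(grid_dim):
        for thread_idx in range(block_dim):
            i = global_thread_id(block_idx, block_dim, thread_idx)
            if i < n:  # guard: the last block may contain spare threads
                out[i] = data[i] + 1.0
    return out
```

Each emulated thread touches exactly one element chosen by its ID, which is why no coordination between threads is needed in this simple case.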
+ | |||
+ | ====== CV-Cuda ====== | ||
+ | |||
+ | * CV-CUDA: computer vision with CUDA. | ||
+ | |||
+ | ====== Nvidia RTX stack ====== | ||
+ | |||
+ | * 1st Gen: VkRay, DXR, DLSS1 | ||
+ | * 2nd Gen: | ||
+ | * real-time denoise: spatial denoise | ||
+ | * caustics | ||
+ | * RTXDI: ray-traced direct illumination, | ||
+ | * RTXGI: real-time multiple bounce indirect lighting | ||
+ | * Reflex | ||
+ | * DLSS2: deep learning super resolution, AI-generated pixels | ||
+ | * 3rd Gen: | ||
+ | * Displaced micro-meshes | ||
+ | * 2D SGM optical flow, shader execution reordering, real-time path tracing, opacity micro-maps, | ||
+ | * DLSS3: deep learning super resolution, AI frame generation | ||
+ | |||
+ | ====== Nvidia GPU architecture ====== | ||
+ | |||
+ | ^ core ^ Turing ^ Ampere ^ Ada ^ | ||
+ | | shader (TFLOPS) | 16 | 40 | 90 | | ||
+ | | RT (TFLOPS) | 49 | 78 | 200 | | ||
+ | | tensor (TFLOPS) | 130 | 320 | 1400 | | ||
+ | | OFA (TOPS) | | 126 | 300 | | ||
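To read the table as generation-over-generation gains, the ratios can be worked out directly. The sketch below copies the numbers from the table as listed; the dictionary `THROUGHPUT` and the function `speedup` are hypothetical names used only for illustration.

```python
# Per-core throughput figures copied from the table (same units as listed there).
THROUGHPUT = {
    "shader": {"turing": 16, "ampere": 40, "ada": 90},
    "RT": {"turing": 49, "ampere": 78, "ada": 200},
    "tensor": {"turing": 130, "ampere": 320, "ada": 1400},
    "OFA": {"ampere": 126, "ada": 300},  # the table has no Turing entry for OFA
}


def speedup(core: str, old: str, new: str) -> float:
    """Return how many times faster `new` is than `old` for the given core type."""
    return THROUGHPUT[core][new] / THROUGHPUT[core][old]


if __name__ == "__main__":
    for core, gens in THROUGHPUT.items():
        names = list(gens)
        # Compare each generation with the one listed immediately before it.
        for old, new in zip(names, names[1:]):
            print(f"{core}: {old} -> {new}: x{speedup(core, old, new):.2f}")
```

For example, `speedup("tensor", "ampere", "ada")` divides 1400 by 320, the largest single-generation jump in the table.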
+ | |||
+ | ====== Nvidia for AI ====== | ||
+ | |||
+ | * large language model: enables a single model to perform many different tasks (text-related, image-related) with context-aware output. | ||
+ | * NeMo LLM service: prompt learning framework, to adapt a pre-trained LLM to a specific task through prompt learning. | ||
+ | * recommender system: e.g. in shopping, social networks |