Nvidia GPU-based toolkit
- Nvidia developer program: https://developer.nvidia.com/
- its AI models: https://catalog.ngc.nvidia.com/models
- Riva is its speech and language processing framework
- Maxine is its AI audio and video effects SDK for video chat (translation, re-composition)
- Merlin is its recommender-system framework, e.g. for shopping or video recommendations
- It also offers computer-vision models, such as image segmentation models
- AI-related: the Nvidia AI Accelerated program
- Digital twins: an exact virtual duplicate of a real-world object, used for testing when the physical original is not accessible.
- A digital twin also stays in sync with the physical original: anything that happens to the original is reflected in the twin.
- Nvidia Omniverse is the tool for simulating the real world and building digital twins.
- It bridges real-time collaboration between different users and different graphics software.
- Omniverse Audio2Face: generates facial animation from audio
- Nvidia OVX servers provide the hardware for building large-scale digital twins
- Omniverse system: digital twins + robotics; design + content creation; integration; rendering; sensors; asset library
- AI: Drive, Isaac (for moving and manipulating things), Metropolis (automated infrastructure), Holoscan (medical robotics)
- Replicator: generates synthetic data for training and testing AI models
- OmniGraph, behavior, animation: run data-center-scale 3D applications
- Avatar (WIP): builds digital humans
- Nvidia's open-source Material Definition Language (MDL): https://developer.nvidia.com/rendering-technologies/mdl-sdk
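
To make the digital-twin idea from the list above concrete, here is a minimal sketch of a twin that starts as an exact copy of a physical asset and then mirrors every change. The `PhysicalAsset` and `DigitalTwin` classes and the robot-arm fields are hypothetical names made up for illustration; this is not how Omniverse implements synchronization.

```python
from dataclasses import dataclass, field


@dataclass
class PhysicalAsset:
    """Hypothetical stand-in for the real machine: holds state, notifies listeners."""
    state: dict = field(default_factory=dict)
    listeners: list = field(default_factory=list)

    def update(self, key, value):
        # Something happens to the physical original...
        self.state[key] = value
        # ...and every attached twin is told about it.
        for listener in self.listeners:
            listener(key, value)


@dataclass
class DigitalTwin:
    """Hypothetical mirror that follows the asset and can be tested in isolation."""
    state: dict = field(default_factory=dict)

    def attach(self, asset):
        self.state = dict(asset.state)          # start as an exact copy
        asset.listeners.append(self.on_change)  # subscribe to later changes

    def on_change(self, key, value):
        self.state[key] = value                 # stay in sync with the original


robot_arm = PhysicalAsset(state={"joint_angle": 0.0, "temperature_c": 21.5})
twin = DigitalTwin()
twin.attach(robot_arm)

robot_arm.update("joint_angle", 45.0)
print(twin.state)  # the twin reflects the change made to the original

# Experiments on the twin do not touch the physical asset.
twin.state["joint_angle"] = 90.0
print(robot_arm.state["joint_angle"])
```

The update flow is one-way on purpose: changes propagate from the original to the twin, so tests on the twin cannot disturb the real asset.
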
Nvidia MDL
- defines physically based materials
- stores material specifications for exchange between applications
- render-algorithm agnostic
- designed for high performance on GPU
Nvidia CUDA programming
- video info: How CUDA Programming Works
- The CUDA programming model is designed around how the GPU hardware works, and the GPU is in turn designed around how GPUs are normally programmed, for best performance.
- CUDA programming is:
- Each thread has its own thread ID, which determines which block of data it works on; all threads work through the data together at the same time.
- Optimizing how code uses memory matters: better data arrangement and swapping data in and out of memory blocks lets more fit into the fixed-size memory.
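
The thread-ID idea above can be sketched in plain Python. This only emulates the index arithmetic of a 1D CUDA launch (`blockIdx.x * blockDim.x + threadIdx.x`); the names `simulated_kernel` and `launch` and the block size of 4 are assumptions for illustration, and the nested loops stand in for threads that a GPU would run in parallel.

```python
def simulated_kernel(block_idx, thread_idx, block_dim, data, out):
    # Same index arithmetic as a 1D CUDA kernel:
    # i = blockIdx.x * blockDim.x + threadIdx.x
    i = block_idx * block_dim + thread_idx
    if i < len(data):  # guard: the last block may have spare threads
        out[i] = data[i] * 2.0


def launch(data, block_dim=4):
    out = [0.0] * len(data)
    # Enough blocks to cover all elements (ceiling division).
    grid_dim = (len(data) + block_dim - 1) // block_dim
    # A GPU runs these threads in parallel; the loops only emulate that.
    for block_idx in range(grid_dim):
        for thread_idx in range(block_dim):
            simulated_kernel(block_idx, thread_idx, block_dim, data, out)
    return out, grid_dim


data = [float(x) for x in range(10)]
result, blocks = launch(data)
print(blocks)   # number of blocks needed for 10 elements with 4 threads each
print(result)
```

With 10 elements and 4 threads per block, three blocks are launched and the last two threads of the final block fall outside the data, which is why the bounds check is needed.
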
CV-CUDA
- CV-CUDA: computer vision with CUDA.
Nvidia RTX stack
- 1st Gen: VkRay, DXR, DLSS1
- 2nd Gen:
  - real-time denoising: spatial denoising
  - caustics
  - RTXDI: ray-traced direct illumination, casting shadows from all lights and emissive surfaces
  - RTXGI: real-time multi-bounce indirect lighting
  - Reflex
  - DLSS2: Deep Learning Super Sampling, AI-generated pixels
- 3rd Gen:
  - displaced micro-meshes
  - 2D SGM optical flow, shader execution reordering, real-time path tracing, opacity micro-maps
  - DLSS3: Deep Learning Super Sampling, AI-generated frames
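
The difference between AI-generated pixels (DLSS2) and AI-generated frames (DLSS3) can be shown with some rough arithmetic. The numbers below are assumptions chosen for illustration: a 3840x2160 output, an internal render resolution of 1920x1080, and frame generation that inserts one generated frame for every rendered frame.

```python
from fractions import Fraction

# Assumed output and internal render resolutions (illustrative only).
out_w, out_h = 3840, 2160
render_w, render_h = 1920, 1080

# Upscaling: share of each displayed frame's pixels that is actually rendered.
rendered_per_frame = Fraction(render_w * render_h, out_w * out_h)
print(rendered_per_frame)

# Frame generation (assumed 1 generated frame per rendered frame):
# only half of the displayed frames are rendered at all.
rendered_with_frame_gen = rendered_per_frame * Fraction(1, 2)
ai_generated = 1 - rendered_with_frame_gen
print(rendered_with_frame_gen, ai_generated)
```

Under these assumptions, upscaling alone renders a quarter of the displayed pixels, and adding frame generation brings that down to one eighth.
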
Nvidia GPU architecture
core | Turing | Ampere | Ada |
---|---|---|---|
shader (TFLOPS) | 16 | 40 | 90 |
RT (TFLOPS) | 49 | 78 | 200 |
tensor (TFLOPS) | 130 | 320 | 1400 |
OFA (TOPS) | - | 126 | 300 |
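
As a quick reading aid, the generation-over-generation ratios can be computed from the three complete rows of the table. The dictionary layout and the `speedup` helper are hypothetical names for this sketch; the incomplete OFA row is left out.

```python
# Figures copied from the shader, RT and tensor rows of the table.
throughput = {
    "shader": {"turing": 16, "ampere": 40, "ada": 90},
    "RT": {"turing": 49, "ampere": 78, "ada": 200},
    "tensor": {"turing": 130, "ampere": 320, "ada": 1400},
}


def speedup(core, old, new):
    """Ratio of the newer generation's figure to the older one's."""
    return throughput[core][new] / throughput[core][old]


for core in throughput:
    print(
        core,
        round(speedup(core, "turing", "ampere"), 2),
        round(speedup(core, "ampere", "ada"), 2),
    )
```

The tensor row shows the largest jump from Ampere to Ada, while the shader and RT rows grow by roughly 2x to 2.5x per generation.
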
Nvidia for AI
- Large language models: let a single model handle many different tasks with context-aware output, e.g. text-related and image-related tasks.
- NeMo LLM service: a prompt-learning framework for adapting a pre-trained LLM to a specific task.
- Recommender systems: e.g. in shopping and social networks