GPU Coder
Generate CUDA code for NVIDIA GPUs
GPU Coder™ generates optimized CUDA® code from MATLAB® code and Simulink® models. The generated code includes CUDA kernels for parallelizable parts of your deep learning, embedded vision, and signal processing algorithms. For high performance, the generated code calls optimized NVIDIA® CUDA libraries, including TensorRT™, cuDNN, cuFFT, cuSolver, and cuBLAS. The code can be integrated into your project as source code, static libraries, or dynamic libraries, and it can be compiled for desktops, servers, and GPUs embedded on NVIDIA Jetson™, NVIDIA DRIVE™, and other platforms. You can use the generated CUDA code within MATLAB to accelerate deep learning networks and other computationally intensive portions of your algorithm. GPU Coder lets you incorporate handwritten CUDA code into your algorithms and into the generated code.
When used with Embedded Coder®, GPU Coder lets you verify the numerical behavior of the generated code via software-in-the-loop (SIL) and processor-in-the-loop (PIL) testing.
Get Started:
Free White Paper
Generating CUDA Code from MATLAB
Deploy Algorithms Royalty-Free
Compile and run your generated code on popular NVIDIA GPUs, from desktop systems to data centers to embedded hardware. The generated code is royalty-free—deploy it in commercial applications to your customers at no charge.
GPU Coder Success Stories
Learn how engineers and scientists in a variety of industries use GPU Coder to generate CUDA code for their applications.
Generate Code from Supported Toolboxes and Functions
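GPU Coder generates code from a broad range of MATLAB language features that design engineers use to develop algorithms as components of larger systems. This includes hundreds of operators and functions from MATLAB and companion toolboxes.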
Incorporate Legacy Code
Use legacy code integration capabilities to incorporate trusted or highly optimized CUDA code into your MATLAB algorithms for testing in MATLAB. Then call the same CUDA code from the generated code as well.
Run Simulations and Generate Optimized Code for NVIDIA GPUs
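When used with Simulink Coder™, GPU Coder accelerates compute-intensive portions of MATLAB Function blocks in your Simulink models on NVIDIA GPUs. You can then generate optimized CUDA code from the Simulink model and deploy it to your NVIDIA GPU target.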
Deploy End-to-End Deep Learning Algorithms
Use a variety of trained deep learning networks (including ResNet-50, SegNet, and LSTM) from Deep Learning Toolbox™ in your Simulink model and deploy to NVIDIA GPUs. Generate code for preprocessing and postprocessing along with your trained deep learning networks to deploy complete algorithms.
Log Signals, Tune Parameters, and Numerically Verify Code Behavior
When used with Simulink Coder, GPU Coder enables you to log signals and tune parameters in real time using external mode simulations. Use Embedded Coder with GPU Coder to run software-in-the-loop and processor-in-the-loop tests that numerically verify the generated code matches the behavior of the simulation.
Deploy End-to-End Deep Learning Algorithms
Deploy a variety of trained deep learning networks (including ResNet-50, SegNet, and LSTM) from Deep Learning Toolbox to NVIDIA GPUs. Use predefined deep learning layers or define custom layers for your specific application. Generate code for preprocessing and postprocessing along with your trained deep learning networks to deploy complete algorithms.
Generate Optimized Code for Inference
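GPU Coder generates code with a smaller footprint than other deep learning solutions because it generates only the code needed to run inference with your specific algorithm. The generated code calls optimized libraries, including TensorRT and cuDNN.
Optimize Further Using TensorRT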
Generate code that integrates with NVIDIA TensorRT, a high-performance deep learning inference optimizer and runtime. Use INT8 or FP16 data types for an additional performance boost over the standard FP32 data type.
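Deep Learning Quantization
Quantize your deep learning network to reduce memory usage and increase inference performance. Use the Deep Network Quantizer app to analyze and visualize the tradeoff between performance and inference accuracy.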
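As a rough illustration of why quantization reduces memory at some cost in accuracy, the sketch below applies symmetric per-tensor INT8 quantization to a handful of made-up weights. It is a generic textbook scheme written in plain Python, not the method used by GPU Coder, the Deep Network Quantizer app, or TensorRT; the function names and the sample weights are hypothetical.

```python
# Illustrative sketch only: symmetric per-tensor INT8 quantization.
# The scheme, names, and data are assumptions for illustration, not the
# algorithm that GPU Coder or TensorRT applies.

def quantize_int8(values):
    """Map floats to integers in [-127, 127] using one shared scale."""
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float values from the stored integers."""
    return [q * scale for q in quantized]


weights = [0.02, -1.27, 0.635, 0.9, -0.31]  # hypothetical FP32 weights
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

# Worst-case rounding error is about half of one quantization step.
max_error = max(abs(w - r) for w, r in zip(weights, restored))

# Storage cost: 4 bytes per FP32 value versus 1 byte per INT8 value.
fp32_bytes = 4 * len(weights)
int8_bytes = 1 * len(weights)

print("scale:", scale)
print("quantized:", q)
print("max abs error:", max_error)
print("bytes (FP32 vs INT8):", fp32_bytes, int8_bytes)
```

The stored values shrink to a quarter of their FP32 size, while each restored weight differs from the original by at most about half a quantization step. That is the performance-versus-accuracy tradeoff the paragraph above refers to.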
Minimize CPU-GPU Memory Transfers and Optimize Memory Usage
GPU Coder automatically analyzes, identifies, and partitions segments of MATLAB code to run on either the CPU or GPU. It also minimizes the number of data copies between CPU and GPU. Use profiling tools to identify other potential bottlenecks.
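To make the idea of minimizing data copies concrete, here is a toy model in plain Python. It assumes a hypothetical pipeline whose steps have already been assigned to the CPU or GPU, and compares copying data around every GPU step with leaving it resident on the device between consecutive GPU steps. It is only a counting exercise under those assumptions and does not represent GPU Coder's actual analysis.

```python
# Toy model only: counts host/device copies for a linear pipeline.
# The placement list and both strategies are assumptions for illustration.

def count_copies_naive(placements):
    """Every GPU step copies its input to the device and its result back."""
    return 2 * sum(1 for device in placements if device == "gpu")


def count_copies_resident(placements):
    """Copy only when the data must move to a different device.

    Data starts on the CPU (host) and must end up there again.
    """
    copies = 0
    location = "cpu"
    for device in placements:
        if device != location:
            copies += 1
            location = device
    if location != "cpu":
        copies += 1  # bring the final result back to the host
    return copies


pipeline = ["cpu", "gpu", "gpu", "gpu", "cpu", "gpu", "gpu"]  # hypothetical
naive = count_copies_naive(pipeline)
resident = count_copies_resident(pipeline)

print("copies, naive strategy:", naive)
print("copies, data kept resident:", resident)
```

In this model the saving grows with the length of each run of consecutive GPU steps, which is why grouping GPU work into contiguous segments matters.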
Call Optimized Libraries
Code generated with GPU Coder calls optimized NVIDIA CUDA libraries, including TensorRT, cuDNN, cuSolver, cuFFT, cuBLAS, and Thrust. Code generated from MATLAB toolbox functions is mapped to optimized libraries whenever possible.
Prototype on NVIDIA Jetson and DRIVE Platforms
Automate cross-compilation and deployment of generated code onto NVIDIA Jetson and DRIVE platforms using GPU Coder Support Package for NVIDIA GPUs.
Access Peripherals and Sensors from MATLAB and Generated Code
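Communicate remotely with the NVIDIA target from MATLAB to acquire data from webcams and other supported peripherals for early prototyping. Deploy your algorithm along with peripheral interface code to the board for standalone execution.
From Prototyping to Production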
Use GPU Coder with Embedded Coder to interactively trace your MATLAB code side-by-side with the generated CUDA code. Verify the numerical behavior of the generated code running on the hardware using software-in-the-loop (SIL) and processor-in-the-loop (PIL) testing.
Accelerate Algorithms Using GPUs in MATLAB
Call generated CUDA code as a MEX function from your MATLAB code to speed execution, though performance will vary depending on the nature of your MATLAB code. Profile generated MEX functions to identify bottlenecks and focus your optimization efforts.
Accelerate Simulink Simulations Using NVIDIA GPUs
When used with Simulink Coder, GPU Coder accelerates compute-intensive portions of MATLAB Function blocks in your Simulink models on NVIDIA GPUs.
Simulink Support
Generate, build, and deploy Simulink models to NVIDIA GPUs
Deep Learning Simulink Support
Generate, build, and deploy deep learning networks in Simulink models to NVIDIA GPUs
Persistent Variables
Create persistent memory on the GPU
Wavelet Toolbox Code Generation
Generate code for FFT-based FIR filtering and the short-time Fourier transform using dwt, dwt2, modwt, and modwtmra
Deep Learning
Generate code for custom layers
Multi-Input Networks
Generate code for networks that have multiple inputs
Long Short-Term Memory (LSTM) Networks
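Generate code for convolutional LSTM and network activations
I/O Block Library for NVIDIA Hardware
Access NVIDIA hardware peripherals using the GPU Coder Support Package for NVIDIA GPUs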
See the release notes for details on any of these features and corresponding functions.