Profile MEX Functions by Using MATLAB Profiler

You can profile execution times for MEX functions generated by MATLAB® Coder™ by using the MATLAB Profiler. The profile for the generated code shows the number of calls and the time spent for each line of the corresponding MATLAB function. Use the Profiler to identify the lines of MATLAB code whose generated code takes the most time. This information can help you identify and correct performance issues early in the development cycle. For more information on the MATLAB Profiler, see profile and Profile Your Code to Improve Performance.

The graphical interface to the Profiler is not supported in MATLAB Online™.

MEX Profile Generation

You can use the MATLAB Profiler with a generated MEX function. Alternatively, if you have a test file that calls your MATLAB function, you can generate the MEX function and profile it in one step. You can perform these operations at the command line or in the MATLAB Coder app.

To use the Profiler with a generated MEX function:

  1. Enable MEX profiling by setting the configuration object property EnableMexProfiling to true.

    Alternatively, you can use codegen with the -profile option.

    The equivalent setting in the MATLAB Coder app is Enable execution profiling in the Generate step.

  2. Generate the MEX file MyFunction_mex.

  3. Run the MATLAB Profiler and view the Profile Summary Report, which opens in a separate window.

    profile on; MyFunction_mex; profile viewer;

    Make sure that you have not changed or moved the original MATLAB file MyFunction.m. Otherwise, the Profiler does not consider MyFunction_mex for profiling.
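
The three steps above can be sketched with a configuration object instead of the -profile option. This is a minimal sketch, assuming MyFunction.m is on the path and takes no inputs (as in the command shown in step 3); the variable name cfg is only illustrative.

```matlab
% Sketch only: assumes MyFunction.m is on the path and takes no inputs.
cfg = coder.config('mex');        % MEX configuration object
cfg.EnableMexProfiling = true;    % step 1: enable MEX profiling
codegen -config cfg MyFunction    % step 2: generates MyFunction_mex

% Step 3: run the Profiler and open the Profile Summary Report.
profile on;
MyFunction_mex;
profile viewer;
```

If the function takes inputs, add the -args option to the codegen command, as in the example later on this page.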

If you have a test file MyFunctionTest.m that calls your MATLAB function, you can:

  • Generate the MEX function and profile it in one step by using codegen with the -test and -profile options. If you turned on the MATLAB Profiler before, turn it off before you use these two options together.

    codegen MyFunction -test MyFunctionTest -profile

  • Profile the MEX function by selecting Enable execution profiling in the Verify step of the app. If you turned on the MATLAB Profiler before, turn it off before you perform this action.
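
One way to honor the "turn it off first" requirement at the command line is to check the Profiler status before generating code. This is a hedged sketch, assuming MyFunction.m and MyFunctionTest.m are on the path; the variable name s is only illustrative.

```matlab
% Sketch only: stop a running Profiler session before combining
% -test and -profile.
s = profile('status');                 % struct describing the Profiler state
if strcmp(s.ProfilerStatus,'on')
    profile off;
end

% Generate the MEX function, run the test file against it, and profile.
codegen MyFunction -test MyFunctionTest -profile
```

Calling profile off unconditionally would also work; the status check only makes the intent explicit.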

Example

Use the Profiler to identify the functions or the lines of MATLAB code whose generated code takes the most time. The following example is a MATLAB function that converts the representation of its input matrices A and B from row-major to column-major layout in one of its lines. Such a conversion has a long execution time for large matrices. Avoiding the conversion by modifying that particular line makes the function more efficient.
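
To make the two layouts concrete, the following sketch compares where one element sits in a column-major buffer and in a row-major buffer. It is an illustration only: it emulates a row-major buffer with a transposed copy in plain MATLAB and does not show what the code generator does internally. All variable names are hypothetical.

```matlab
% Illustration only: emulates a row-major buffer with a transposed copy.
n = 3;
A = magic(n);
row = 2; col = 3;

% Column-major (native MATLAB): elements of one column are adjacent.
colMajorIdx = row + (col-1)*n;
fromColMajor = A(colMajorIdx);

% Row-major: elements of one row are adjacent. The columns of A.' are
% the rows of A, so A.' holds the elements of A in row-major order.
rowMajorBuffer = A.';
rowMajorIdx = col + (row-1)*n;
fromRowMajor = rowMajorBuffer(rowMajorIdx);

% Both lookups reach the same element, A(row,col).
sameElement = isequal(fromColMajor, fromRowMajor, A(row,col))
```

Building rowMajorBuffer touches every element of A. A layout conversion has to do the same amount of work, which is why it becomes expensive for large matrices.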

Consider the MATLAB function:

function [y] = MyFunction(A,B) %#codegen
% Generated code uses row-major representation of matrices A and B
coder.rowMajor;
length = size(A,1);
% Summing absolute values of all elements of A and B by traversing over the
% matrices row by row
sum_abs = 0;
for row = 1:length
    for col = 1:length
        sum_abs = sum_abs + abs(A(row,col)) + abs(B(row,col));
    end
end
% Calling external C function 'foo.c' that returns the sum of all elements
% of A and B
sum = 0;
sum = coder.ceval('foo',coder.ref(A),coder.ref(B),length);
% Returning the difference of sum_abs and sum
y = sum_abs - sum;
end

The generated code for this function uses a row-major representation of the square matrices A and B. The code first computes sum_abs (the sum of absolute values of all elements of A and B) by traversing over the matrices row by row. This algorithm is optimized for matrices that are represented in a row-major layout. The code then uses coder.ceval to call the external C function foo.c:

#include <stdio.h>
#include <stdlib.h>
#include "foo.h"

double foo(double *A, double *B, double length)
{
    int i,j,s;
    double sum = 0;
    s = (int)length;

    /*Summing all the elements of A and B*/
    for(i=0;i<s;i++)
    {
        for(j=0;j<s;j++)
        {
            sum += A[i*s+j] + B[i*s+j];
        }
    }
    return sum;
}

The corresponding C header file foo.h is:

#include "rtwtypes.h"

double foo(double *A, double *B, double length);

foo.c returns the variable sum, which is the sum of all elements of A and B. The performance of the function foo.c is independent of whether the matrices A and B are represented in row-major or column-major layouts. MyFunction returns the difference of sum_abs and sum.

You can measure the performance of MyFunction for large input matrices A and B, and then optimize it further:

  1. Enable MEX profiling and generate MEX code for MyFunction. Run MyFunction_mex for two large random matrices A and B. View the Profile Summary Report.

    A = rand(20000);
    B = rand(20000);
    codegen MyFunction -args {A,B} foo.c foo.h -profile
    profile on;
    MyFunction_mex(A,B);
    profile viewer;

    A separate window opens showing the Profile Summary Report.

    Figure: Profile Summary Report. A table lists Function Name, Calls, Total Time (s), and Self Time (s) with a total time plot, and a flame graph shows the same data.

    The Profile Summary Report shows the total time and the self time for the MEX file and its child, which is the generated code for the original MATLAB function.

  2. Under Function Name, click the first link to view the Profile Detail Report for the generated code for MyFunction. You can see the lines where the most time was spent:

    Figure: Profile Detail Report. A table lists Line Number, Code, Calls, Total Time (s), % Time, and Time Plot for the lines of the example code. The total time for the coder.ceval line is relatively high.

  3. The line calling coder.ceval takes a lot of time (16.914 s). This line has considerable execution time because coder.ceval converts the representation of the matrices A and B from row-major layout to column-major layout before passing them to the external C function. You can avoid this conversion by using the additional argument -layout:rowMajor in coder.ceval:

    sum = coder.ceval('-layout:rowMajor','foo',coder.ref(A),coder.ref(B),length);

  4. Generate the MEX function and profile again using the modified MyFunction.

    A = rand(20000);
    B = rand(20000);
    codegen MyFunction -args {A,B} foo.c foo.h -profile
    profile on;
    MyFunction_mex(A,B);
    profile viewer;

    The Profile Detail Report for MyFunction shows that the line calling coder.ceval now takes only 0.653 s:

    Figure: the same Profile Detail Report, in which the coder.ceval line now shows a total time of 0.653 s.
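
The totals shown in the Profile Summary Report can also be read at the command line, which makes it easier to compare the runs before and after the change. This is a minimal sketch, assuming MyFunction_mex and the matrices A and B from the steps above already exist, and assuming that the structure returned by profile('info') lists the MEX function and its child in the same way the report does. The variable names stats and ft are only illustrative.

```matlab
% Sketch only: collect profile data without opening the viewer.
profile on;
MyFunction_mex(A,B);
profile off;

stats = profile('info');      % structure holding the collected data
ft = stats.FunctionTable;     % one entry per profiled function

% Print the number of calls and the total time for each entry.
for k = 1:numel(ft)
    fprintf('%-40s calls=%d total=%.3f s\n', ...
        ft(k).FunctionName, ft(k).NumCalls, ft(k).TotalTime);
end
```

Running this once with the original MyFunction and once with the modified one gives two listings to compare side by side.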

Effect of Folding Expressions on MEX Code Coverage

When you use coder.const to fold expressions into constants, the code coverage of the MEX function differs from that of the MATLAB function. For example, consider the function:

function y = MyFoldFunction %#codegen
a = 1;
b = 2;
c = a + b;
y = 5 + coder.const(c);
end

Profiling the MATLAB function MyFoldFunction shows this code coverage in the Profile Detail Report:

However, profiling the MEX function MyFoldFunction_mex shows a different code coverage:

Lines 2, 3, and 4 are not executed in the generated code because you have folded the expression c = a + b into a constant for code generation.

This example uses user-defined expression folding. The code generator sometimes folds certain expressions automatically to optimize the performance of the generated code. Such optimizations also cause the coverage of the MEX function to differ from that of the MATLAB function.
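
A hypothetical variant of MyFoldFunction without coder.const shows the kind of code in which automatic folding could occur. The function name MyAutoFoldFunction is an assumption made for illustration. Whether the code generator actually folds these lines depends on the code generator and its settings, so treat this as a sketch to experiment with rather than a guaranteed result.

```matlab
function y = MyAutoFoldFunction %#codegen
% Hypothetical variant of MyFoldFunction that does not call coder.const.
a = 1;
b = 2;
c = a + b;   % every operand is known at code generation time
y = 5 + c;   % the code generator may reduce this to a constant
end
```

To check the effect, generate the MEX function with the -profile option, profile both MyAutoFoldFunction and MyAutoFoldFunction_mex, and compare the code coverage in the two Profile Detail Reports.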
