Compute the value of X in the equation A'AX = B for complex-valued matrices with infinite number of rows using Q-less QR decomposition - Simulink

Examples

Implement Hardware-Efficient Complex Partial-Systolic Matrix Solve Using Q-less QR Decomposition with Forgetting Factor

How to use the Complex Partial-Systolic Matrix Solve Using Q-less QR Decomposition with Forgetting Factor block.

Algorithms to Determine Fixed-Point Types for Complex Q-less QR Matrix Solve A'AX=B

Derivation of algorithms for determining fixed-point types for complex Q-less QR matrix solve.

Determine Fixed-Point Types for Complex Q-less QR Matrix Solve A'AX=B

Use fixed.complexQlessQRFixedpointTypes to determine fixed-point types for computation of the complex least-squares matrix equation.

Fixed-Point HDL-Optimized Minimum-Variance Distortionless-Response (MVDR) Beamformer

Implement a hardware-efficient MVDR beamformer.

Compute Forgetting Factor Required for Streaming Input Data

Use fixed.forgettingFactor and fixed.forgettingFactorInverse to compute the forgetting factor.

Ports

Input

A(i,:) — Rows of matrix A
vector

Rows of matrix A, specified as a vector. A is an infinitely tall matrix of streaming data. If B is single or double, A must be the same data type as B. If A is a fixed-point data type, A must be signed, use binary-point scaling, and have the same word length as B. Slope-bias representation is not supported for fixed-point data types.

Data Types: single | double | fixed point
Complex Number Support: Yes

B — Matrix B
matrix | vector

Matrix B, specified as a vector or a matrix. B is an n-by-p matrix where n ≥ 2. If A is single or double, B must be the same data type as A. If B is a fixed-point data type, B must be signed, use binary-point scaling, and have the same word length as A. Slope-bias representation is not supported for fixed-point data types.

Data Types: single | double | fixed point

validInA — Whether A input is valid
`Boolean` scalar

Whether the A(i,:) input is valid, specified as a Boolean scalar. This control signal indicates when the data from the A(i,:) input port is valid. When this value is 1 (true) and the readyA value is 1 (true), the block captures the values at the A(i,:) input port. When this value is 0 (false), the block ignores the input samples.

After sending a true validInA signal, there may be some delay before readyA is set to false. To ensure all data is processed, you must wait until readyA is set to false before sending another true validInA signal.

Data Types: Boolean

validInB — Whether input B is valid
`Boolean` scalar

Whether input B is valid, specified as a Boolean scalar. This control signal indicates when the data from the B input port is valid. When this value is 1 (true) and the readyB value is 1 (true), the block captures the values at the B input port. When this value is 0 (false), the block ignores the input samples.

After sending a true validInB signal, there may be some delay before readyB is set to false. To ensure all data is processed, you must wait until readyB is set to false before sending another true validInB signal.

Data Types: Boolean

restart — Whether to clear internal states
`Boolean` scalar

Whether to clear internal states, specified as a Boolean scalar. When this value is 1 (true), the block stops the current calculation and clears all internal states. When this value is 0 (false) and the validInA and validInB values are 1 (true), the block begins a new subframe.

Data Types: Boolean

Output

X — Matrix X
matrix | vector

Matrix X, returned as a matrix or vector.

Data Types: single | double | fixed point

validOut — Whether output data is valid
`Boolean` scalar

Whether the output data is valid, returned as a Boolean scalar. This control signal indicates when the data at the output port X is valid. When this value is 1 (true), the block has successfully computed a row of X. When this value is 0 (false), the output data is not valid.

Data Types: Boolean

readyA — Whether block is ready for input A
`Boolean` scalar

Whether the block is ready for input A, returned as a Boolean scalar. This control signal indicates when the block is ready for new input data. When this value is 1 (true) and the validInA value is 1 (true), the block accepts input data in the next time step. When this value is 0 (false), the block ignores input data in the next time step.

After sending a true validInA signal, there may be some delay before readyA is set to false. To ensure all data is processed, you must wait until readyA is set to false before sending another true validInA signal.

Data Types: Boolean

readyB — Whether block is ready for input B
`Boolean` scalar

Whether the block is ready for input B, returned as a Boolean scalar. This control signal indicates when the block is ready for new input data. When this value is 1 (true) and the validInB value is 1 (true), the block accepts input data in the next time step. When this value is 0 (false), the block ignores input data in the next time step.

After sending a true validInB signal, there may be some delay before readyB is set to false. To ensure all data is processed, you must wait until readyB is set to false before sending another true validInB signal.

Data Types: Boolean

Parameters

Number of columns in matrix A and rows in matrix B — Number of columns in matrix A and rows in matrix B
`4` (default) | positive integer-valued scalar

Number of columns in matrix A and rows in matrix B, specified as a positive integer-valued scalar.

Programmatic Use

Block Parameter: n

Type: character vector

Values: positive integer-valued scalar

Default: 4

Number of columns in matrix B — Number of columns in matrix B
`1` (default) | positive integer-valued scalar

Number of columns in matrix B, specified as a positive integer-valued scalar.

Programmatic Use

Block Parameter: p

Type: character vector

Values: positive integer-valued scalar

Default: 1

Forgetting factor — Forgetting factor applied after each row of the matrix is factored
0.99 (default) | real positive scalar

Forgetting factor applied after each row of the matrix is factored, specified as a real positive scalar. The output is updated as each row of A is input indefinitely.

Programmatic Use

Block Parameter: forgettingFactor

Type: character vector

Values: real positive scalar

Default: 0.99

Regularization parameter — Regularization parameter
0 (default) | real nonnegative scalar

Regularization parameter, specified as a real nonnegative scalar. Small, positive values of the regularization parameter can improve the conditioning of the problem and reduce the variance of the estimates. Although the regularized estimate is biased, its reduced variance often results in a smaller mean squared error than that of the least-squares estimate.

Programmatic Use

Block Parameter: regularizationParameter

Type: character vector

Values: real nonnegative scalar

Default: 0

Output datatype — Data type of output matrix X
`fixdt(1,18,14)` (default) | `double` | `single` | `fixdt(1,16,0)` | <data type expression>

Data type of the output matrix X, specified as fixdt(1,18,14), double, single, fixdt(1,16,0), or as a user-specified data type expression. The type can be specified directly, or expressed as a data type object such as Simulink.NumericType.

Programmatic Use

Block Parameter: OutputType

Type: character vector

Values: 'fixdt(1,18,14)' | 'double' | 'single' | 'fixdt(1,16,0)' | '<data type expression>'

Default: 'fixdt(1,18,14)'

Tips

Use fixed.forgettingFactor to compute the forgetting factor, α, for an infinite number of rows with the equivalent gain of a matrix with m rows.
Use fixed.forgettingFactorInverse to compute the number of rows, m, of a matrix with equivalent gain corresponding to forgetting factor α.
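
As a rough illustration of what an "equivalent gain" can mean, the sketch below sums the squared weights α², α⁴, α⁶, … that a forgetting factor applies to past rows in the product A'A. This geometric-series measure and the helper names `effective_rows` and `truncated_sum` are assumptions made for illustration only. fixed.forgettingFactor and fixed.forgettingFactorInverse may define the equivalence differently, so use those functions for actual designs.

```python
def effective_rows(alpha: float) -> float:
    """Closed form of sum(alpha**(2*j) for j = 1, 2, ...), valid for 0 < alpha < 1."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must satisfy 0 < alpha < 1")
    return alpha**2 / (1.0 - alpha**2)


def truncated_sum(alpha: float, terms: int) -> float:
    """Direct summation of the first `terms` squared weights, for comparison."""
    return sum(alpha ** (2 * j) for j in range(1, terms + 1))


if __name__ == "__main__":
    # With the block default of 0.99, the weights add up to roughly 49 rows.
    print(effective_rows(0.99))
    print(truncated_sum(0.99, 5000))
```

Under this measure, a forgetting factor closer to 1 averages over more rows, while a smaller value tracks changes in the input faster.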

Algorithms

Q-less QR Decomposition with Forgetting Factor

The Complex Partial-Systolic Matrix Solve Using Q-less QR Decomposition with Forgetting Factor block implements the following recursion to compute the upper-triangular factor R of continuously streaming 1-by-n row vectors A(k,:) using forgetting factor α. It is as if matrix A is infinitely tall. A forgetting factor in the range 0 < α < 1 prevents the recursion from integrating without bound.

$\begin{array}{l} R_{0} = \mathrm{zeros}(n, n) \\ [\sim, R_{1}] = \mathrm{qr}\left(\begin{bmatrix} R_{0} \\ A(1,:) \end{bmatrix}, 0\right) \\ R_{1} = \alpha R_{1} \\ [\sim, R_{2}] = \mathrm{qr}\left(\begin{bmatrix} R_{1} \\ A(2,:) \end{bmatrix}, 0\right) \\ R_{2} = \alpha R_{2} \\ \vdots \\ [\sim, R_{k}] = \mathrm{qr}\left(\begin{bmatrix} R_{k-1} \\ A(k,:) \end{bmatrix}, 0\right) \\ R_{k} = \alpha R_{k} \\ \vdots \end{array}$

Q-less QR Decomposition with Forgetting Factor and Tikhonov Regularization

The output X_k after processing the k-th input A(k,:) is computed using the following iteration.

$\begin{array}{l} R_{0} = \lambda I_{n} \\ [\sim, R_{1}] = \mathrm{qr}\left(\begin{bmatrix} R_{0} \\ A(1,:) \end{bmatrix}, 0\right) \\ R_{1} = \alpha R_{1} \\ X_{1} = R_{1} \backslash (R_{1}' \backslash B) \\ [\sim, R_{2}] = \mathrm{qr}\left(\begin{bmatrix} R_{1} \\ A(2,:) \end{bmatrix}, 0\right) \\ R_{2} = \alpha R_{2} \\ X_{2} = R_{2} \backslash (R_{2}' \backslash B) \\ \vdots \\ [\sim, R_{k}] = \mathrm{qr}\left(\begin{bmatrix} R_{k-1} \\ A(k,:) \end{bmatrix}, 0\right) \\ R_{k} = \alpha R_{k} \\ X_{k} = R_{k} \backslash (R_{k}' \backslash B) \\ \vdots \end{array}$

This is mathematically equivalent to computing A_k' A_k X = B, where A_k is defined as follows, though the block never actually creates A_k.

$A_{k} = \begin{bmatrix} \alpha^{k} \lambda I_{n} \\ \begin{bmatrix} \alpha^{k} & & & \\ & \alpha^{k-1} & & \\ & & \ddots & \\ & & & \alpha \end{bmatrix} A(1:k,:) \end{bmatrix}$
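
The equivalence above can be checked numerically without performing any QR factorization, because each QR step preserves the Gram matrix: R_new' R_new = R' R + A(k,:)' A(k,:), and scaling R by α scales the Gram matrix by α². The following sketch tracks R_k' R_k through the recursion and compares it with A_k' A_k formed from the weighted definition. The function names and test data are hypothetical, and this is not the block's hardware algorithm.

```python
def outer_conj(a):
    """Return a^H a for a row vector a, as an n-by-n nested list."""
    return [[ai.conjugate() * aj for aj in a] for ai in a]


def recursive_gram(rows, alpha, lam):
    """Track G_k = R_k^H R_k through the recursion without forming R_k."""
    n = len(rows[0])
    g = [[(lam**2 if i == j else 0.0) + 0j for j in range(n)] for i in range(n)]
    for a in rows:
        upd = outer_conj(a)
        # QR of [R; a] keeps the Gram matrix: R_new^H R_new = R^H R + a^H a.
        # Multiplying R_new by alpha multiplies the Gram matrix by alpha**2.
        g = [[alpha**2 * (g[i][j] + upd[i][j]) for j in range(n)] for i in range(n)]
    return g


def explicit_gram(rows, alpha, lam):
    """Form A_k^H A_k directly from the weighted definition of A_k."""
    n, k = len(rows[0]), len(rows)
    scale = alpha ** (2 * k) * lam**2
    g = [[(scale if i == j else 0.0) + 0j for j in range(n)] for i in range(n)]
    for idx, a in enumerate(rows, start=1):
        w2 = alpha ** (2 * (k - idx + 1))  # squared weight applied to row idx
        upd = outer_conj(a)
        for i in range(n):
            for j in range(n):
                g[i][j] += w2 * upd[i][j]
    return g


def max_abs_diff(x, y):
    """Largest elementwise absolute difference between two square matrices."""
    return max(abs(x[i][j] - y[i][j]) for i in range(len(x)) for j in range(len(x)))


if __name__ == "__main__":
    rows = [[1 + 2j, 0.5 - 1j], [-1j, 2 + 0j], [0.25 + 0.25j, -1 + 1j]]
    print(max_abs_diff(recursive_gram(rows, 0.9, 0.1), explicit_gram(rows, 0.9, 0.1)))
```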

Forward and Backward Substitution

When an upper triangular factor is ready, then forward and backward substitution are computed with the current inputBto produce outputX.

$X = R_{k} \backslash (R_{k}' \backslash B)$
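
The two triangular solves can be sketched in floating point as follows: a forward substitution with the lower-triangular R' followed by a backward substitution with the upper-triangular R. This is a minimal illustration for a single column of B, with hypothetical function names, and it does not reflect the block's fixed-point implementation.

```python
def forward_sub(lower, b):
    """Solve lower * y = b, where `lower` is lower triangular."""
    n = len(b)
    y = [0j] * n
    for i in range(n):
        acc = sum(lower[i][j] * y[j] for j in range(i))
        y[i] = (b[i] - acc) / lower[i][i]
    return y


def back_sub(upper, y):
    """Solve upper * x = y, where `upper` is upper triangular."""
    n = len(y)
    x = [0j] * n
    for i in reversed(range(n)):
        acc = sum(upper[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (y[i] - acc) / upper[i][i]
    return x


def solve_with_r(r, b):
    """Solve (R^H R) x = b using only the triangular factor R."""
    n = len(r)
    r_h = [[r[j][i].conjugate() for j in range(n)] for i in range(n)]
    return back_sub(r, forward_sub(r_h, b))


if __name__ == "__main__":
    r = [[2 + 0j, 1j], [0j, 1 + 0j]]
    print(solve_with_r(r, [1 + 0j, 2 - 1j]))
```

Because only R is needed, the Q factor never has to be stored, which is the point of the Q-less formulation.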

Choosing the Implementation Method

Partial-systolic implementations prioritize speed of computations over space constraints, while burst implementations prioritize space constraints at the expense of speed of the operations. The following table illustrates the tradeoffs between the implementations available for matrix decompositions and solving systems of linear equations.

Implementation	Ready	Latency	Area
Systolic	C	O(n)	O(mn²)
Partial-Systolic	C	O(m)	O(n²)
Partial-Systolic with Forgetting Factor	C	O(n)	O(n²)
Burst	O(n)	O(mn²)	O(n)

Where C is a constant proportional to the word length of the data, m is the number of rows in matrix A, and n is the number of columns in matrix A.

For additional considerations in selecting a block for your application, see Choose a Block for HDL-Optimized Fixed-Point Matrix Operations.

AMBA AXI Handshake Process

This block uses the AMBA AXI handshake protocol [1]. The valid/ready handshake process is used to transfer data and control information. This two-way control mechanism allows both the manager and subordinate to control the rate at which information moves between manager and subordinate. A valid signal indicates when data is available. The ready signal indicates that the block can accept the data. Transfer of data occurs only when both the valid and ready signals are high.
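
The transfer rule can be illustrated with a small cycle-by-cycle model. In this sketch, a source asserts valid whenever it has data pending and holds each item until the sink is ready. The function name `simulate_handshake` and the ready pattern are hypothetical, and the model does not reproduce this block's actual latencies.

```python
def simulate_handshake(items, ready_pattern):
    """Return (cycle, item) pairs for every cycle where valid and ready are both high."""
    pending = list(items)
    transfers = []
    for cycle, ready in enumerate(ready_pattern):
        valid = bool(pending)  # the source has data to offer
        if valid and ready:
            # Both signals are high, so the item is transferred in this cycle.
            transfers.append((cycle, pending.pop(0)))
        # Otherwise the source keeps holding the same item.
    return transfers


if __name__ == "__main__":
    # The sink is busy in cycles 1 and 2, so the second row waits until cycle 3.
    print(simulate_handshake(["row1", "row2", "row3"], [True, False, False, True, True]))
```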

Block Timing

The Burst Matrix Solve Using Q-less QR Decomposition with Forgetting Factor blocks accept matrix A row by row and matrix B as a single vector. After accepting the first valid pair of A and B matrices, the block outputs the X matrices row by row continuously. The matrix is output from the first row to the last row.

For example, assume that the input A matrix is 3-by-3. Additionally assume that validIn asserts before ready, meaning that the upstream data source is faster than the QR decomposition.

In the figure,

A1r1 is the first row of the first A matrix, A1r2 is the second row of the first A matrix, and so on.
validIn to ready — From a successful A row input to the block being ready to accept the next row.
validOut to validOut — Because the Forward Backward Substitution block runs continuously, it generates output at a constant rate. This is the delay between two adjacent valid outputs.
nth row validIn to validOut — From the nth row input to the block starting to output the first solution.
This block is always ready to accept B matrices, so readyB is always asserted.

The Partial-Systolic Matrix Solve Using Q-less QR Decomposition with Forgetting Factor blocks accept matrix A row by row and matrix B as a single vector. After accepting the first valid pair of A and B matrices, the block outputs the X matrices row by row continuously.

For example, assume that the input A matrix is 3-by-3. Additionally assume that validIn asserts before ready, meaning that the upstream data source is faster than the QR decomposition.

In the figure,

A1r1 is the first row of the first A matrix, A1r2 is the second row of the first A matrix, and so on.
validIn to ready — From a successful A row input to the block being ready to accept the next row.
validOut to validOut — Because the Forward Backward Substitution block runs continuously, it generates output at a constant rate. This is the delay between two adjacent valid outputs.
Last row validIn to validOut — From the last (mth) row input to the block starting to output the solution.
This block is always ready to accept B matrices, so readyB is always asserted.

The following table provides details of the timing for the Burst Matrix Solve Using Q-less QR Decomposition with Forgetting Factor and Partial-Systolic Matrix Solve Using Q-less QR Decomposition with Forgetting Factor blocks.

Block	Operation	`validIn` to `ready` (cycles)	`validOut` to `validOut` (cycles)	nth row `validIn` to `validOut` (cycles)
Real Burst Matrix Solve Using Q-less QR Decomposition with Forgetting Factor	Asynchronous	(wl + 5)*n + 2 + n	4*n² + 25*n + 5 + 2*n*wl + 2*n*nextpow2(wl)	4*n² + 25*n + 5 + 2*n*wl + 2*n*nextpow2(wl) + (wl + 5)*n + n
Complex Burst Matrix Solve Using Q-less QR Decomposition with Forgetting Factor	Asynchronous	(2*wl + 11)*n + 2 + n	4*n² + 25*n + 5 + 2*n*wl + 2*n*nextpow2(wl)	4*n² + 25*n + 5 + 2*n*wl + 2*n*nextpow2(wl) + (2*wl + 11)*n + n
Real Partial-Systolic Matrix Solve Using Q-less QR Decomposition with Forgetting Factor	Asynchronous	wl + 7	4*n² + 25*n + 5 + 2*n*wl + 2*n*nextpow2(wl)	4*n² + 25*n + 5 + 2*n*wl + 2*n*nextpow2(wl) + (wl + 6)*n + 2
Complex Partial-Systolic Matrix Solve Using Q-less QR Decomposition with Forgetting Factor	Asynchronous	wl + 9	4*n² + 25*n + 5 + 2*n*wl + 2*n*nextpow2(wl)	4*n² + 25*n + 5 + 2*n*wl + 2*n*nextpow2(wl) + (wl + 7.5)*2*n + 2

In the table, m represents the number of rows in matrix A, and n is the number of columns in matrix A. wl represents the word length of A.

If the data type of A is fixed point, then wl is the word length.
If the data type of A is double, then wl is 53.
If the data type of A is single, then wl is 24.
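
As a worked example, the sketch below evaluates the table row for the Complex Partial-Systolic block. It reads the juxtaposed factors in the table as products, so (wl + 7.5)2n is taken to mean (wl + 7.5)·2·n, which is an assumption about the intended formula. The helper names are hypothetical, and `nextpow2` here mimics the MATLAB function of the same name for positive integers.

```python
def nextpow2(x: int) -> int:
    """Smallest integer p such that 2**p >= x, for a positive integer x."""
    return (x - 1).bit_length()


def complex_partial_systolic_timing(n: int, wl: int):
    """Return the three cycle counts from the table for the complex partial-systolic block."""
    valid_in_to_ready = wl + 9
    valid_out_to_valid_out = 4 * n**2 + 25 * n + 5 + 2 * n * wl + 2 * n * nextpow2(wl)
    # Assumed reading of the last column: (wl + 7.5) * 2 * n + 2 added to the output period.
    nth_row_to_valid_out = valid_out_to_valid_out + (wl + 7.5) * 2 * n + 2
    return valid_in_to_ready, valid_out_to_valid_out, nth_row_to_valid_out


if __name__ == "__main__":
    # n = 16 columns with 16-bit fixed-point input (wl = 16).
    print(complex_partial_systolic_timing(16, 16))
```

For n = 16 and wl = 16, this evaluates to 25 cycles from validIn to ready, 2069 cycles between valid outputs, and 2823 cycles from the nth row to the first valid output.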

Hardware Resource Utilization

This block supports HDL code generation using the Simulink® HDL Workflow Advisor. For an example, see HDL Code Generation and FPGA Synthesis from Simulink Model (HDL Coder) and Implement Digital Downconverter for FPGA (DSP HDL Toolbox).

In R2022b: The following tables show the post place-and-route resource utilization results and timing summary, respectively.

This example data was generated by synthesizing the block on a Xilinx® Zynq® UltraScale+™ RFSoC ZCU111 evaluation board. The synthesis tool was Vivado® v.2020.2 (win64).

The following parameters were used for synthesis.

Block parameters:
- n = 16
- p = 1
- Matrix A dimension: inf-by-16
- Matrix B dimension: 16-by-1
Input data type: sfix16_En14
Target frequency: 250 MHz

Resource	Usage	Available	Utilization (%)
CLB LUTs	334280	425280	78.60
CLB Registers	261319	850560	30.72
DSPs	12	4272	0.28
Block RAM Tile	0	1080	0.00
URAM	0	80	0.00

	Value
Requirement	4 ns
Data Path Delay	3.892 ns
Slack	0.088 ns
Clock Frequency	255.62 MHz
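
The reported clock frequency is consistent with taking the reciprocal of the timing requirement minus the slack. The short sketch below reproduces that arithmetic. The relationship is an assumption inferred from the numbers in the table, and the function name is hypothetical.

```python
def achieved_frequency_mhz(requirement_ns: float, slack_ns: float) -> float:
    """Convert the shortest passing clock period (requirement - slack) to MHz."""
    return 1000.0 / (requirement_ns - slack_ns)


if __name__ == "__main__":
    # A 4 ns requirement with 0.088 ns of slack gives a 3.912 ns period.
    print(round(achieved_frequency_mhz(4.0, 0.088), 2))
```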

Extended Capabilities

C/C++ Code Generation
Generate C and C++ code using Simulink® Coder™.

Slope-bias representation is not supported for fixed-point data types.

HDL Code Generation
Generate Verilog and VHDL code for FPGA and ASIC designs using HDL Coder™.

HDL Coder™ provides additional configuration options that affect HDL implementation and synthesized logic.

HDL Architecture

This block has one default HDL architecture.

HDL Block Properties

General
ConstrainedOutputPipeline	Number of registers to place at the outputs by moving existing delays within your design. Distributed pipelining does not redistribute these registers. The default is `0`. For more details, see ConstrainedOutputPipeline (HDL Coder).
InputPipeline	Number of input pipeline stages to insert in the generated code. Distributed pipelining and constrained output pipelining can move these registers. The default is `0`. For more details, see InputPipeline (HDL Coder).
OutputPipeline	Number of output pipeline stages to insert in the generated code. Distributed pipelining and constrained output pipelining can move these registers. The default is `0`. For more details, see OutputPipeline (HDL Coder).

Restrictions

Supports fixed-point data types only.

Version History

Introduced in R2020b

R2023a: Smart unrolling for improved resource utilization

This block depends on a partial-systolic QR decomposition block. Since R2023a, when you update the diagram, the loop that composes the partial-systolic pipeline in the QR decomposition block is unrolled. This updated internal architecture removes dead operations in simulation and generated code, thus requiring fewer hardware resources. This block simulates with clock and bit-true fidelity with respect to library versions of these blocks in previous releases.

R2022a: Support for Tikhonov regularization parameter

The Complex Partial-Systolic Matrix Solve Using Q-less QR Decomposition with Forgetting Factor block now supports the Tikhonov Regularization parameter.

R2021a: Reduced HDL resource utilization

This block now has an improved algorithm to reduce resource utilization on hardware-constrained target platforms.
