HDL Filter Architectures

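The HDL Coder™ software provides architecture options that extend your control over speed versus area tradeoffs in the realization of filter designs. To achieve the desired tradeoff for generated HDL code, you can either specify a fully parallel architecture or choose one of several serial architectures. Configure a serial architecture using the SerialPartition (HDL Coder) and ReuseAccum (HDL Coder) parameters. You can also choose a frame-based filter for increased throughput.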

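Use pipelining parameters to improve the speed performance of your filter designs. Add pipelines to the adder logic of your filter using AddPipelineRegisters (HDL Coder) for scalar input filters, and AdderTreePipeline (HDL Coder) for frame-based filters. Specify pipeline stages before and after each multiplier with MultiplierInputPipeline (HDL Coder) and MultiplierOutputPipeline (HDL Coder). Set the number of pipeline stages before and after the filter using InputPipeline (HDL Coder) and OutputPipeline (HDL Coder). The architecture diagrams show the locations of the various configurable pipeline stages.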

Fully Parallel Architecture

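This option is the default architecture. A fully parallel architecture uses a dedicated multiplier and adder for each filter tap. The taps execute in parallel. A fully parallel architecture is optimal for speed. However, it requires more multipliers and adders than a serial architecture, and therefore consumes more chip area. The diagrams show the architectures for direct form and transposed filter structures with fully parallel implementations, and the locations of the configurable pipeline stages.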

Direct Form

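By default, the block implements linear adder logic. When you enable AddPipelineRegisters, the adder logic is implemented as a pipelined adder tree. The adder tree uses full-precision data types. If you generate a validation model, you must use full precision in the original model to avoid validation mismatches.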
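To make the difference between linear adder logic and an adder tree concrete, here is a minimal behavioral sketch in Python. It is not the HDL that the coder generates; the function names `linear_sum` and `tree_sum` are hypothetical and exist only for illustration. Both functions add the same tap products, but the linear chain places every adder in series, while the tree adds pairs level by level, which is what allows a register to be placed between levels.

```python
def linear_sum(products):
    """Add products one after another, as a linear adder chain would."""
    total = 0
    for p in products:
        total += p
    # The running sum passes through one adder per product after the first.
    adders_in_series = max(len(products) - 1, 0)
    return total, adders_in_series


def tree_sum(products):
    """Add products pairwise, level by level, as an adder tree would."""
    level = list(products)
    levels = 0
    while len(level) > 1:
        # Add neighbouring pairs; an unpaired last element is carried forward.
        nxt = [level[i] + level[i + 1] for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:
            nxt.append(level[-1])
        level = nxt
        levels += 1
    return (level[0] if level else 0), levels


# Hypothetical tap products for an eight-tap filter.
products = [3, -1, 4, 1, -5, 9, 2, 6]
chain_total, chain_depth = linear_sum(products)
tree_total, tree_depth = tree_sum(products)
print(chain_total, chain_depth)
print(tree_total, tree_depth)
```

For eight products, the chain puts seven adders in series while the tree needs three levels. The two structures produce the same sum only when no intermediate rounding or saturation occurs, which is consistent with the full-precision requirement described above.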

Transposed

The AddPipelineRegisters parameter has no effect on a transposed filter implementation.

Serial Architectures

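Serial architectures reuse hardware resources in time, saving chip area. Configure a serial architecture using the SerialPartition (HDL Coder) and ReuseAccum (HDL Coder) parameters. The available serial architecture options are fully serial, partly serial, and cascade serial.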

Fully Serial

A fully serial architecture conserves area by reusing multiplier and adder resources sequentially. For example, a four-tap filter design uses a single multiplier and adder, executing a multiply-accumulate operation once for each tap. The multiply-accumulate section of the design runs at four times the filter's input/output sample rate. This design saves area at the cost of some speed loss and higher power consumption.

In a fully serial architecture, the system clock runs at a much higher rate than the sample rate of the filter. Thus, for a given filter design, the maximum speed achievable by a fully serial architecture is less than that of a parallel architecture.
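The resource reuse in a fully serial filter can be sketched as a behavioral model in Python. This is an illustration only, not generated HDL; `fir_fully_serial` is a hypothetical name, and samples before the start of the input are assumed to be zero. A single accumulator is updated once per tap, so the inner loop runs as many times per input sample as the filter has taps.

```python
def fir_fully_serial(coeffs, x):
    """Model one shared multiplier and accumulator, reused once per tap."""
    delay_line = [0] * len(coeffs)  # tap delay line, newest sample first
    y = []
    mac_cycles = 0  # counts multiply-accumulate operations
    for sample in x:
        delay_line = [sample] + delay_line[:-1]
        acc = 0
        for c, d in zip(coeffs, delay_line):
            acc += c * d  # the single multiplier and adder are reused here
            mac_cycles += 1
        y.append(acc)
    return y, mac_cycles


# Hypothetical four-tap filter and input sequence.
coeffs = [1, 2, 3, 4]
x = [1, 0, 2, -1, 3]
y, mac_cycles = fir_fully_serial(coeffs, x)
print(y)
print(mac_cycles // len(x))  # multiply-accumulate operations per input sample
```

With four taps, the model performs four multiply-accumulate operations per input sample, which mirrors the multiply-accumulate section running at four times the sample rate.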

Partly Serial

Partly serial architectures cover the full range of speed vs. area tradeoffs that lie between fully parallel and fully serial architectures.

In a partly serial architecture, the filter taps are grouped into a number of serial partitions. The taps within each partition execute serially, but the partitions execute in parallel with respect to one another. The outputs of the partitions are summed at the final output.

When you select a partly serial architecture, you specify the number of partitions and the length (number of taps) of each partition. Suppose you specify a four-tap filter with two partitions, each having two taps. The system clock runs at twice the filter's sample rate.
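The partitioning can be sketched as a behavioral model in Python. The function name `fir_partly_serial` and its `partition_lengths` argument are hypothetical and are not HDL Coder APIs. The model also assumes that, when partitions have unequal lengths, the clock ratio is set by the longest partition; the text above only states the ratio for two equal partitions of two taps.

```python
def fir_partly_serial(coeffs, x, partition_lengths):
    """Model taps grouped into serial partitions that run side by side."""
    if sum(partition_lengths) != len(coeffs):
        raise ValueError("partition lengths must add up to the number of taps")

    # Work out which tap indices belong to each partition.
    bounds = []
    start = 0
    for length in partition_lengths:
        bounds.append((start, start + length))
        start += length

    delay_line = [0] * len(coeffs)  # newest sample first
    y = []
    for sample in x:
        delay_line = [sample] + delay_line[:-1]
        partials = []
        for lo, hi in bounds:
            acc = 0
            for k in range(lo, hi):  # serial within a partition
                acc += coeffs[k] * delay_line[k]
            partials.append(acc)
        y.append(sum(partials))  # final adder sums the partition outputs

    # Assumption: the slowest (longest) partition sets the clock ratio.
    clock_ratio = max(partition_lengths)
    return y, clock_ratio


# Four taps split into two partitions of two taps each.
y, clock_ratio = fir_partly_serial([1, 2, 3, 4], [1, 0, 2, -1, 3], [2, 2])
print(y, clock_ratio)
```

For the four-tap, two-partition case the model reports a clock ratio of 2, matching the example in the text, and the output equals that of the unpartitioned filter.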

Cascade Serial

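A cascade-serial architecture closely resembles a partly serial architecture. As in a partly serial architecture, the filter taps are grouped into a number of serial partitions that execute in parallel with respect to one another. However, the accumulated output of each partition is cascaded to the accumulator of the previous partition. The output of all partitions is therefore computed at the accumulator of the first partition. This technique is termed accumulator reuse. A final adder is not required, which saves area.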

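The cascade-serial architecture requires an extra cycle of the system clock to complete the final summation to the output. Therefore, the frequency of the system clock must be increased slightly with respect to the clock used in a noncascade partly serial architecture.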

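To generate a cascade-serial architecture, specify a partly serial architecture with accumulator reuse enabled. If you do not specify the serial partitions, HDL Coder automatically selects an optimal partitioning.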

Latency in Serial Architectures

Serialization of a filter increases the total latency of the design by one clock cycle. The serial architectures use an accumulator (an adder with a register) to add the products sequentially. An additional final register is used to store the summed result of all the serial partitions, requiring an extra clock cycle for the operation. To model this latency, HDL Coder inserts a Delay block into the generated model after the filter block.

Full Precision for Serial Architectures

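When you choose a serial architecture, the code generator uses full precision in the HDL code. HDL Coder therefore forces full precision in the generated model. If you generate a validation model, you must use full precision in the original model to avoid validation mismatches.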

Frame-Based Architecture

When you select a frame-based architecture and provide an M-sample input frame, the coder implements a fully parallel filter architecture. The filter includes M parallel subfilters for each input sample.

Each subfilter includes every Mth coefficient. The subfilter results are added so that each output sample is the sum of each of the coefficients multiplied with one input sample.

The diagram shows the filter architecture for a frame size of two samples (M = 2), and a filter length of six coefficients. The input is a vector with two values representing samples in time. The input samples, x[2n] and x[2n+1], represent the nth input pair. Every second sample from each stream is fed to two parallel subfilters. The four subfilter results are added together to create two output samples. In this way, each output sample is the sum of each of the coefficients multiplied with one of the input samples.
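The M = 2 decomposition can be checked with a short behavioral sketch in Python. The names `fir` and `fir_frame2` are hypothetical, samples before the start of the input are assumed to be zero, and the exact placement of delays inside the generated hardware may differ from this model. The sketch splits the coefficients into even-indexed and odd-indexed sets, runs four subfilters on the even and odd sample streams, and combines their results into two outputs per frame.

```python
def fir(coeffs, x):
    """Reference FIR: y[n] = sum over k of coeffs[k] * x[n - k]."""
    return [
        sum(c * x[n - k] for k, c in enumerate(coeffs) if n - k >= 0)
        for n in range(len(x))
    ]


def fir_frame2(coeffs, x):
    """Process two samples per frame using four subfilters."""
    if len(x) % 2:
        raise ValueError("input length must be a multiple of the frame size 2")
    h_even, h_odd = coeffs[0::2], coeffs[1::2]  # every second coefficient
    x_even, x_odd = x[0::2], x[1::2]            # x[2n] and x[2n+1] streams

    ee = fir(h_even, x_even)  # even coefficients on even samples
    eo = fir(h_even, x_odd)   # even coefficients on odd samples
    oe = fir(h_odd, x_even)   # odd coefficients on even samples
    oo = fir(h_odd, x_odd)    # odd coefficients on odd samples

    y = []
    for n in range(len(x_even)):
        # y[2n] uses the odd-stream result from the previous frame.
        oo_prev = oo[n - 1] if n > 0 else 0
        y.append(ee[n] + oo_prev)  # y[2n]
        y.append(eo[n] + oe[n])    # y[2n+1]
    return y


# Hypothetical six-coefficient filter and eight-sample input (four frames).
coeffs = [1, 2, 3, 4, 5, 6]
x = [1, -2, 3, 0, 4, -1, 2, 5]
print(fir_frame2(coeffs, x) == fir(coeffs, x))
```

The frame-based result matches the sample-by-sample reference, which shows that the four subfilter outputs cover every coefficient exactly once per output sample.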

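The sums are implemented as a pipelined adder tree. Set AdderTreePipeline (HDL Coder) to specify the number of pipeline stages between levels of the adder tree. To improve clock speed, it is recommended that you set this parameter to 2. To fit the multipliers into DSP blocks on your FPGA, add pipeline stages before and after the multipliers using MultiplierInputPipeline (HDL Coder) and MultiplierOutputPipeline (HDL Coder).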

For symmetric or antisymmetric coefficients, the filter architecture reuses the coefficient multipliers and adds design delay between the multiplier and summation stages as required.
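One common way to share a multiplier between taps that have equal coefficients is to add the two matching delayed samples first and multiply the sum once. The Python sketch below illustrates that idea for the symmetric case only. Whether the generated HDL uses exactly this arrangement is an assumption, and the names `fir_direct` and `fir_symmetric` are hypothetical.

```python
def fir_direct(coeffs, x):
    """Reference FIR with one multiply per tap."""
    return [
        sum(c * x[n - k] for k, c in enumerate(coeffs) if n - k >= 0)
        for n in range(len(x))
    ]


def fir_symmetric(coeffs, x):
    """Share one multiply between each pair of equal coefficients."""
    n_taps = len(coeffs)
    if any(coeffs[k] != coeffs[n_taps - 1 - k] for k in range(n_taps)):
        raise ValueError("coefficients are not symmetric")
    half = n_taps // 2
    delay_line = [0] * n_taps  # newest sample first
    y = []
    multiplies = 0
    for sample in x:
        delay_line = [sample] + delay_line[:-1]
        acc = 0
        for k in range(half):
            # Pre-add the two samples that share coefficient k, then multiply once.
            acc += coeffs[k] * (delay_line[k] + delay_line[n_taps - 1 - k])
            multiplies += 1
        if n_taps % 2:
            # The middle tap of an odd-length filter has no partner.
            acc += coeffs[half] * delay_line[half]
            multiplies += 1
        y.append(acc)
    return y, multiplies


# Hypothetical symmetric six-coefficient filter.
coeffs = [1, 2, 3, 3, 2, 1]
x = [1, 0, -1, 2, 4, 0, 1]
y, multiplies = fir_symmetric(coeffs, x)
print(y == fir_direct(coeffs, x), multiplies // len(x))
```

For six symmetric coefficients the model uses three multiplies per sample instead of six. An antisymmetric filter would subtract rather than add the paired samples; that case is not modeled here.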
