CyWrk.CaseV01.Zip
1.2 MB
I have invested a considerable amount of time in creating a DFB program intended to preprocess input streams. Because the default DFB simulator (v1.4) is next to useless, developing this code required downgrading PSoC Creator from version 4.1 to 2.2, the last version able to run Chris Keeser's extended simulator, which at least shows you something. Cypress support didn't make my task easier either: all four of my recent bug reports were marked as "Cancelled", despite the attached snippets showing direct violations of the DFB specification. They replied with their default "go away" message, that is, they told me to go to the community forum, as if the community members had access to the simulator sources or could fix the misleading/strikingly wrong documentation. I don't know the reason for this hostility, but if it's intended to repel customers, it surely works like a charm.
But to the point: the attached project has a "DSP" page, which contains a DFB instance together with its program. It is fed by two DMA channels, and the results are collected by another two. The exact input values are irrelevant; the problem is at the control-flow level, not with incorrectly computed results. All four DMA channels work correctly; I checked that with a few stage->hold forwarding snippets. That is all main.c does: it configures the testing environment and lets me see the debug signals on the scope. The code simulates well on both simulators (i.e. the DFB assembler 1.4's and Keeser's), and the results obtained are in full agreement with the C++ reference implementation.
On a real chip it is an epic disaster. The DFB program is composed of two independent calculation engines, but both boil down to the same task: compute a sequence of three 4th-order CIC filters, each decimating by 4, so the combined decimation factor is exactly 64. To cut the difficulty at least in half, I bypassed the 'monitoring' part, but even the much simpler SDR part is broken. The obvious sign of correctness would be the frequency of the output DMA transfers: for 310,000 input samples per second, the output should be 64 times lower, i.e. 4843.75 samples per second. The scope shows 40..50 kHz with no obvious pattern.

Despite its name, the DFB ALU lacks any logical instructions, so the combined "to 64" counter is implemented as a packed array of three 2-bit counters updated in a complex delta/compensator way. I've reused one of the semaphores to check how often the c***_cic_comb_integrate state is visited. Far too often. The desired scenario is as follows: enter csa_sdr_process_data 310e3 times per second, then go to c***_cic_comb_integrate every fourth cycle on each of the I/Q paths, which translates into two subsequent visits after every 8 input samples, because each path has a dedicated CIC filter. Then move to the higher CIC level after every fourth of the already-filtered fourth cycles, i.e. once per 16 cycles, then once per 64, and finally store the result in holdb. It was designed this way, and this is what happens on the simulator.
I was trying to figure out what is going wrong this time with the Cypress tools, but I ran out of steam. I'm on the verge of throwing the entire PSoC adventure into the dustbin and switching to the much better specified Xilinx Zynq family, but I regret all the time and money spent, so could you please have a look at the attached project and try to guess where the physical implementation of the DFB diverges from its specification so strikingly that my code becomes useless?