In my research work on massively parallel processor arrays in Xilinx FPGAs (e.g. 100s of 32b RISCs and routers in one 6VLX240T), my designs tile (and fill) the device with hierarchical RPMs. These in turn are built up from primitive elements and (often) hand-technology-mapped LUTs. For the very best quality of results I seek out ways to minimize datapath primitives, for example by folding muxes and ALUs into LUTs using lut_map'd LUTs and carry prims that the synthesis tools still don't find. Besides manual technology mapping, I also manage placement of these modules using hierarchical RLOCs to get fast and deterministic PAR runs and to shave many tens of per cent from my critical paths. In my designs often >50% of the primitives are hand technology mapped and/or hand placed. I have used this methodology since 1995 and, despite some ups and downs over the years, it's been great.

In preparing to move to Vivado for 7 series devices and beyond, I've been reviewing the Vivado implementation docs. Note: I haven't used these new Vivado tools yet. In UG901, I can find no support for lut_map and rloc attributes in HDL. In UG903, I also don't see support for RLOC constraints or similar concepts from previous ISE Constraints guides.

Are these mere shortcomings of the brand new documentation, or are lut_map and rloc gone? Is Xilinx ending support for RPMs in 7 Series and later devices?

(If RPMs are history: perhaps one can overcome the loss of RLOCs using LOC constraints in the XCF file (100,000s of them), but that won't work well for IP vendors who don't own the top level design. And I don't see how one can work around the loss of lut_map to take over technology mapping of critical design elements.)

Thank you very much for any guidance on this concern.
27 replies
Have any of you actually managed to get Vivado to give reliable RPM builds across hierarchy with -flatten_hierarchy none? If so, do you have any other tricks to go with it?

I got a few small tests to work with that option, which was a lot better than what I had before the suggestion. But now, with bigger designs, I am not getting anything good. Vivado is again breaking up the RPMs into tiny bits at hierarchy boundaries.

If I use -flatten_hierarchy full I get what I expect in tests of even quite complex RPMs, but I do not know of any way to select full flattening below a given level in HDL. I only know how to flatten the whole design.

It looks like I can probably get something reliable by going through and placing DONT_TOUCH on all the places where I want to keep the hierarchy. That is the reverse of what we used in XST (select where to flatten, rather than where not to flatten), and I suspect getting everything worked out will take quite a bit of testing, though so far I am getting reliable results.

Anyhow, before putting in lots of DONT_TOUCH attributes I am curious whether anyone else has managed to reliably build complex cross-hierarchy RPMs in Vivado with -flatten_hierarchy none.

Ian Lewis
www.mstarlabs.com
Ray wrote: "A possible work-around might be a called function that converts a logic equation to a LUT init string. Someone had a VHDL code snippet that did that somewhere on the internet"

I posted a snippet and a link to that Rockylogic equation parser on this forum thread last year:
http://forums.xilinx.com/t5/Synthesis/Does-Vivado-synthesis-continue-support-for-XST-s-lut-map/m-p/407969#M10037

But the last time I tried string functions in Vivado [2013.something], I couldn't get them to work in synthesis.

I also agree that LUT_MAP should be implemented in Vivado to enable this sort of structural code.

-Brian
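For readers who want to see what such an equation-to-INIT conversion involves, here is a minimal sketch in Python rather than VHDL. It is not the Rockylogic parser or the snippet from the linked thread; the names `lut_init` and `init_literal` are made up for illustration. It relies only on the usual Xilinx LUT convention that bit i of INIT is the output for the input combination whose binary value is i, with I0 as the least significant bit. Computing the value offline and pasting the literal into an instantiated LUT sidesteps the need for string functions in synthesis.

```python
from typing import Callable


def lut_init(func: Callable[..., int], n_inputs: int) -> int:
    """Return the INIT value of an n-input LUT that implements func.

    func receives the input bits in the order I0, I1, ... and returns a
    truthy value when the LUT output should be 1.
    """
    if not 2 <= n_inputs <= 6:
        raise ValueError("this sketch handles LUT2 through LUT6 only")
    init = 0
    for addr in range(1 << n_inputs):
        # bits[0] is I0, the least significant bit of the LUT address.
        bits = [(addr >> k) & 1 for k in range(n_inputs)]
        if func(*bits):
            init |= 1 << addr
    return init


def init_literal(init: int, n_inputs: int, hdl: str = "verilog") -> str:
    """Format an INIT value as a Verilog or VHDL hex literal."""
    width = 1 << n_inputs          # number of INIT bits
    digits = width // 4            # hex digits needed (width >= 4 here)
    if hdl == "vhdl":
        return f'X"{init:0{digits}X}"'
    return f"{width}'h{init:0{digits}X}"


if __name__ == "__main__":
    # 2:1 mux folded into a LUT3: output is I1 when I2 (select) is 1, else I0.
    mux = lut_init(lambda i0, i1, sel: i1 if sel else i0, 3)
    print(init_literal(mux, 3))            # 8'hCA
    print(init_literal(mux, 3, "vhdl"))    # X"CA"
```

The resulting literal can be used as the INIT parameter or generic of a hand-instantiated LUT primitive, which is the structural style the posts above describe.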
For whatever it is worth, Vivado seems to ignore "-flatten_hierarchy none" when it finds a tri-state in a lower-level module, even if that module connects directly to a top-level port with no logic in between:
http://www.xilinx.com/support/answers/60092.html

This looks to have the potential to cause mysterious problems when trying to maintain RPMs. You change something related to a top-level port and break the structure of a completely unrelated RPM because your entire synthesis process changes.

I just noticed this by accident in another thread unrelated to RPMs. I am posting it here only to point out this potential side effect, which could clobber RPMs based on changes in unrelated code.

Ian Lewis
www.mstarlabs.com
Hi Greg,

Another RPM feature I have found I am missing is a way to specify an RLOC_ORIGIN from the XDC file.

In XST you can instantiate an RPM, or a hundred RPMs, in RTL, then use the UCF file to place them at specific sites. I usually write a Python program that is parametric in various device topology things (like where the voids are in the logic fabric) to emit my UCF file. The nice thing about this flow is that you don't have to reach into the middle of an RTL design hierarchy and commingle application logic and structure with placement considerations.

It seems the Vivado suite provides no such mechanism. You can implement hierarchical RPMs, but then you have to pin them down with RLOC_ORIGIN constraints in the RTL itself. Now your RTL is mixed up with all manner of specific device/board topology cruft. Conversely, if you instantiate an RPM in RTL but don't hang an RLOC or RLOC_ORIGIN constraint on it, it will float to wherever the placer wishes, and you can't take hold of it later.

Am I missing a trick? Can I use an XDC/TCL place_cell function to nail down (say) a LUT in the bottom left corner of an RPM to make the rest of the RPM snap to that location? Or can I somehow impose or mix in some kind of top level RTL that reaches way down into the design hierarchy to pin down its internal RLOCs? I suppose I can put each top level hierarchical RPM in a tight PBLOCK, but that seems wrong somehow.

Thank you for any advice.

Jan.
I haven't tried this myself, but how about:

set_property RLOC_ORIGIN XxYy [get_cells instance_which_instantiates_cels]

where instance_which_instantiates_cels has cells with RLOC properties on them.

I have an RTL block where I instantiate FDRE cells with RLOCs, with RLOC_ORIGIN on the parent instances. After I run synthesis I can get all these properties. If RLOC_ORIGIN were missing, I would imagine you can add it with the set_property call.
Hello Jan,

According to UG912 p.284 (2015.1) you can use a LOC constraint in XDC to get RLOC_ORIGIN-like behavior. From the manual:

===========
XDC Syntax
The RLOC_ORIGIN property translates to the LOC property in the synthesized design. You can specify the LOC property of RPMs by placing one of the elements of the RPM onto the target device. The other elements of the RPM will be placed relative to that location, and assigned to LOC property.
===========

I have not yet tested this, but the manual sounds fairly clear on the point.

Are you having success getting hierarchical RPMs to work under Vivado? So far, I can only get small experiments to work. With larger designs, it keeps clobbering them.

Ian Lewis
www.mstarlabs.com
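The flow Jan describes (a script that knows the device topology and emits placement constraints, keeping them out of the RTL) could be adapted to this LOC-based approach roughly as follows. This is only a sketch under several assumptions: the tile pitch, the void coordinates, and the cell path template `tile_{i}/anchor_lut` are hypothetical, and it assumes each RPM contains one cell that can serve as the anchor the manual describes. Only the `set_property LOC ... [get_cells ...]` form itself is standard XDC.

```python
from typing import Iterable, Iterator, List, Set, Tuple

Site = Tuple[int, int]


def anchor_sites(cols: int, rows: int, x_step: int, y_step: int,
                 voids: Set[Site] = frozenset()) -> Iterator[Site]:
    """Yield slice coordinates for a cols x rows grid of RPM tiles.

    voids lists anchor coordinates to skip, standing in for holes in the
    logic fabric (hypothetical values, to be taken from the real device).
    """
    for cx in range(cols):
        for cy in range(rows):
            site = (cx * x_step, cy * y_step)
            if site not in voids:
                yield site


def emit_xdc(cell_template: str, sites: Iterable[Site]) -> List[str]:
    """Return one XDC LOC constraint per RPM anchor cell.

    cell_template is a hypothetical hierarchical path with an {i}
    placeholder for the tile index.
    """
    lines = []
    for index, (x, y) in enumerate(sites):
        cell = cell_template.format(i=index)
        # Braces keep Tcl from interpreting special characters in the path.
        lines.append(
            f"set_property LOC SLICE_X{x}Y{y} [get_cells {{{cell}}}]"
        )
    return lines


if __name__ == "__main__":
    sites = anchor_sites(cols=2, rows=2, x_step=4, y_step=10,
                         voids={(4, 10)})
    for line in emit_xdc("tile_{i}/anchor_lut", sites):
        print(line)
```

Whether the anchor cell must be a particular element of the RPM, and how this behaves for large hierarchical RPMs, is not established in this thread and would need to be checked in the tools.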
Tested using LOC in XDC to place an RPM, and, at least at the level of an experiment, it works as expected.

Ian
I too have been having mixed experiences with RLOCs in Vivado. The latest is a design where the customer had me rewrite the DSP core of the design, but wants all his infrastructure left alone. His infrastructure breaks if flatten_hierarchy is not set to rebuild, and my placement breaks when it is set to rebuild. Explicit HU_SETs seemed to mostly fix the issue until the latest changes in his code. The current iteration is complaining about B5LUT conflicting nets on slices where I have a register but no logic, and it only started being a problem after the customer made a major update to his unrelated interface design.

The DSP design compiles perfectly in ISE 14.7 and meets timing for 601 MHz (max toggle of BRAM) in a rather full (87% of DSPs, 40% of LUTs, 41% of BRAMs) 7vx690T-3. Vivado 2016.4 currently is breaking up the RPMs, which of course badly breaks the DSP design, such that it doesn't even come close to making timing at 400 MHz.

What I have learned is:

1) Implied H_SETs are not working for hierarchical RPMs. You need explicit HU_SETs for reliable construction, and particularly to prevent the placer from trying to put RPMs on top of each other at x0y0 even though they come from different H_SETs.

2) Rebuild hierarchy is breaking hierarchical RPMs under some circumstances. I have not determined what those circumstances are yet. I do know that it worked earlier but is no longer working.

3) RLOC'd registers coupled with inferred LUT logic are more often than not getting placed elsewhere because the BEL assignments do not match between the LUT and the FF, mainly when using the x5LUTs and x5FFs. BELs are needed in these instances. I'm also having some difficulty in some cases with pin collisions when using both the x5 and x6 LUTs. The tools don't always seem to rotate the pin assignments to allow the two to share the right pins. I'm still trying to get that to work without also having to do pin-locking (which I have also not gotten to work correctly yet).

4) XC_MAP is not recognized (no surprise there, as it has not been for a while). The difference now is that the tools are not reliably putting single-layer LUT logic in the slice with the register. I ended up using LUT instantiations and writing VHDL functions to generate the init tables, which means more opportunities for errors and more obfuscated code.

5) My legacy code had parallel logic for different parameters (e.g. FDSE or FDRE), only one of which got connected, but both were instantiated with the knowledge that the unused one got optimized out (this was a work-around for problems with generate years ago). The one that used to be optimized out remains if dont_touch attributes are used, so there is lots of updating to do there.

Currently, the pressing issue is to overcome whatever is preventing the RLOC construction when flattening is set to rebuild, or to get the customer's portion of the design to build correctly without flattening.
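To make the x5/x6 sharing in point 3 and the init-table generation in point 4 more concrete, here is a small sketch of how two 5-input functions can be packed into one 64-bit INIT for a dual-output LUT6_2 primitive. It assumes the common arrangement in which I5 is tied high, so that O5 reads INIT[31:0] and O6 reads INIT[63:32], both addressed by the same I0..I4 pins. That shared addressing is why the two functions must agree on pin order. The helper names are hypothetical, and this is Python rather than the VHDL functions the post mentions.

```python
from typing import Callable


def lut5_table(func: Callable[..., int]) -> int:
    """Return the 32-bit truth table of a 5-input function.

    func receives the bits in the order I0, I1, I2, I3, I4.
    """
    table = 0
    for addr in range(32):
        bits = [(addr >> k) & 1 for k in range(5)]  # bits[0] is I0
        if func(*bits):
            table |= 1 << addr
    return table


def lut6_2_init(o6_func: Callable[..., int],
                o5_func: Callable[..., int]) -> int:
    """Pack two 5-input functions into one LUT6_2 INIT value.

    Assumes I5 is tied to 1 in the instantiation, so O6 is driven from
    the upper 32 INIT bits and O5 from the lower 32 bits.
    """
    return (lut5_table(o6_func) << 32) | lut5_table(o5_func)


if __name__ == "__main__":
    # Half adder on I0 and I1: sum on O5, carry on O6 (I2..I4 unused).
    init = lut6_2_init(
        o6_func=lambda a, b, c, d, e: a & b,
        o5_func=lambda a, b, c, d, e: a ^ b,
    )
    print(f"64'h{init:016X}")  # 64'h8888888866666666
```

Generating the table from one place for both outputs at least keeps the pin ordering of the two functions consistent by construction, though it does nothing about the BEL and pin-locking behavior described above.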