Hello,

I am getting the following MAP warning for my Kintex implementation:

PhysDesignRules:2400 - The MMCME2_ADV block pin does not drive the same kind of BUFFER load as the other CLKOUT pins. Routing from the different buffer types will not be phase aligned, and therefore zero hold time at the IO flip-flop(s) may not be met.

That was indeed the case, because I needed to drive my clock to more than one clock region/bank, so I had a BUFMR/BUFR chain (the clock output is tied to a BUFMR, which then drives the two BUFRs that supply the clocks to the two clock regions).

When I got the above warning, I tried to do the same for CLKFBOUT (by putting a BUFMR/BUFR chain between CLKFBOUT and CLKFBIN), but I got a MAP error saying that this is not possible for CLKFBIN.

My input timing is very tight, so I am wondering if this issue is contributing to it. Could anyone suggest a way to make this warning go away, and potentially improve timing?

Thank you in advance.
Rao

What are you trying to do with this clock?

If the plan for this clock is to capture an incoming high-speed input interface, then the clocking you are using won't work. To capture an I/O interface you have two choices:
- use the clock-capable I/O in the middle of the three banks to drive the BUFMR directly, and have the BUFMR drive the BUFIOs and BUFRs in this bank, the bank above, and the bank below. This is known as "chip-sync" clocking
- use the clock-capable I/O to drive an MMCM input, and then use BUFGs to clock both the CLKFB input of the MMCM and the I/O flip-flops (or ISERDES)

If this is a clock-forwarded (source-synchronous) output interface, then you can:
- drive the clock from the CCIO directly to the BUFMR to BUFIO/BUFR (rarely useful)
- drive the clock from the CCIO to the MMCM to a BUFG to the IOBs
  - one normally puts a BUFG on the CLKFBOUT -> CLKFBIN path
  - NOTE: It is perfectly legal to use the buffered CLKFBOUT clock to clock fabric or I/O logic (as long as CLKFBOUT is the right frequency)
- generate the clocks from an MMCM, and use CLK0, CLK1, CLK2, or CLK3 to drive one BUFIO & BUFR; this uses the "high performance clock path" from the MMCM to the BUFIO/BUFR
  - In this case, the phase of the clock is irrelevant, so the clock feedback path (which is used to compensate for clock delay) is unimportant. The easiest thing to do is to tie CLKFBOUT to CLKFBIN with no buffer in between.
  - Note: It appears that there is no "high performance clock path" to the BUFMR, so this clocking can only be used for interfaces that are in one bank

Avrum

Avrum,

Thank you for your prompt and detailed reply. You addressed the exact situation I was in, the high-speed input interface.

I did quite a few experiments (along the lines you suggested), but the one that worked in terms of nearly closing timing has the input clock driving an MMCM, which then drives the BUFMR/two-BUFR chain to clock in the data on two banks. This solution made sense to me, since the MMCM deskews the clock-insertion delay.

For some reason, I couldn't get the solution with the CCIO directly driving BUFMR/BUFR to meet timing. I was dealing with DDR data, and the "eye" was not wide enough, given the clock-insertion delay through the BUFMR/BUFR chain, to predictably make timing. For a short time, I tried adding IDELAY blocks between the IBUFDS and the IDDRs (on the data lines) to match their delay with the clock's, without success.

In the end, the maximum violation I have is about 0.15 ns (setup) on a couple of lines, about a 5% deficit. The design works in the lab. That might be OK for now. I wish I had more time.

Thanks again.
Rao

Rao,

I'm afraid this doesn't really make sense...

CCIO->BUFIO/BUFR->IDDR/ISERDES is the clocking mechanism that works with the smallest data eyes; the clock insertion and PVT dependencies of this path are tightly controlled specifically to be able to capture data at the highest rate. Using the IDELAYs on the clock and data paths (which are PVT compensated) should allow you to capture the data with pretty much any clock/data timing relationship.

The timing requirements of this capture mechanism are documented as Tpscs/Tphcs in the data sheet (DS182 for the Kintex-7; I don't know which device you are using...). For the Kintex-7 in a -1 speed grade, this is -0.36 ns setup (and yes, that is NEGATIVE 0.36 ns, so the required data eye starts 0.36 ns after the rising edge of the clock) and a hold time of 1.70 ns, for a required data eye of 1.34 ns.

Going to and from the BUFMR will change the timing slightly (the datasheet doesn't have information on how much, but I would expect it to widen the required data eye by only a very small amount, maybe 100-200 ps).

The CCIO->MMCM->BUFG path, with the MMCM clock feedback also having a BUFG between CLKFBOUT and CLKFBIN, is also specified in the datasheet: this is Tpsmmcmcc/Tphmmcmcc. For the Kintex-7 325T in a -1 speed grade this is 3.14 ns setup and -0.16 ns hold, for a total window of 2.98 ns, more than twice the required width of the chip-sync clocking.

The one you ended up with, using the BUFIO/BUFMR driven by an MMCM (with something? on the feedback path), would likely have terrible PVT dependence; it would likely be significantly worse than Tpsmmcmcc/Tphmmcmcc.

Now, whether the static timing report says the interface "passes" or "fails" depends entirely on your input constraints. You need to be sure that these are 100% accurate given the device you are talking to, the board skew, the clock jitter, the clock duty cycle (if it is DDR), the signal integrity... If your constraints aren't 100% accurate, then the tool telling you that you pass or fail is pretty much meaningless.

All in all, you need to design the capture mechanism of a high-speed interface. You should start with the baseline timing information in the datasheet (the parameters I quoted above), and then implement a clocking scheme that matches that. If you are using chip-sync, then add IDELAYs to both the clock and data paths and adjust them to get the proper relationship. If you are using the MMCM, then adjust the MMCM PHASE_CLKOUTx value to the correct value. Then make sure that your constraints exactly match what they are going to be on the board, run the tools, and analyze the results.

The values in the datasheets (and the delays of the IDELAYs) are for reference only; the timing analyzer gives the exact analysis. So once you have run the tools, you should tweak the IDELAY/MMCM values to get the most margin, and then you are done.

Avrum

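As a quick cross-check of the datasheet figures quoted in the post above, the required data eye of a capture scheme is simply its setup time plus its hold time. The Python sketch below only redoes that arithmetic; the function name is illustrative, and the numbers are the ones quoted in the thread (Kintex-7, -1 speed grade), not re-verified against DS182.

```python
def required_eye_ns(setup_ns: float, hold_ns: float) -> float:
    """Width of the data-valid window a capture scheme needs (setup + hold)."""
    return setup_ns + hold_ns


# Figures as quoted in the thread; treat them as assumptions, not datasheet truth.
chip_sync_eye = required_eye_ns(setup_ns=-0.36, hold_ns=1.70)  # Tpscs / Tphcs
mmcm_bufg_eye = required_eye_ns(setup_ns=3.14, hold_ns=-0.16)  # Tpsmmcmcc / Tphmmcmcc

print(f"chip-sync required eye: {chip_sync_eye:.2f} ns")
print(f"MMCM + BUFG required eye: {mmcm_bufg_eye:.2f} ns")
print(f"ratio: {mmcm_bufg_eye / chip_sync_eye:.2f}x")
```

A negative setup time just means the window opens after the clock edge; the width is unaffected by where the window sits.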
Avrum,

I appreciate your concern. I am using a Kintex-7 -1 speed-grade part (a fairly large one).

I wish I had those intermediate builds with BUFR/BUFIOs on the CCIOs to share with you. One thing I didn't try with them at that time is IDELAY. But without IDELAY, I could not meet timing for sure (even when the data was limited to one bank). I am fairly certain that my OFFSET IN constraints are correct (even accounting for board trace delays etc.).

I had to use a BUFMR one way or the other, because the clock comes in on only one bank but the data is spread out over two banks. I am fairly certain the timing I see with the MMCM/BUFMR solution I implemented is correct: I reviewed the timing report (even with a colleague) and found it to be doing exactly what I intended, although it missed by a small fraction on a few inputs.

By the way, just curious: what's the maximum DDR speed one could get with the BUFIO/BUFR solution (no IDELAY)?

I truly appreciate your helpful notes. I will follow them on my next design for sure.

Regards,
Rao

Rao,

"what's maximum DDR speed one could get with BUFIO/BUFR solution (no IDELAY)"

There is no real answer to that. As I mentioned in the last post, without the IDELAY there is a fixed data window with respect to the clock: Tpscs/Tphcs. This defines the smallest window of data that can be captured without dynamic calibration. From the previous post, this is 1.34 ns, so, I guess, in theory, this means that you could capture roughly 740 Mbps, or 370 MHz DDR. This would come down when you factor in all the other things I mentioned (jitter, edge rates, signal integrity, duty cycle, etc.), but on a REALLY clean interface you should be able to do 500-600 Mbps.

But, without the IDELAY, the data window would have to be EXACTLY right. Let's take an example of 500 Mbps (250 MHz, DDR). Your 1/2 clock period (Unit Interval, or UI) is 2 ns. Since you only need the data to be stable for 1.34 ns in that UI, there is room for all those factors. But, to capture this interface correctly without IDELAYs, the stable data eye coming in to the FPGA would have to be centered around the required data eye, which starts 0.36 ns after each clock edge and lasts 1.34 ns (I will refer to this as [0.36, 1.70]). If the data eye wasn't exactly in that "perfect spot" then you wouldn't be able to capture the interface.

Let's say that your source device gives you a data eye that starts 0.75 ns before the clock edge and lasts 1.5 ns (so until 0.75 ns after the edge: [-0.75, 0.75]). This window is wide enough to be captured (it's bigger than 1.34 ns), but it's in the wrong place. So, what you would need to do is change the required clock/data relationship to place this 1.5 ns data valid window in the right place.

One way to do this would be to add extra delay on the data path. The window starts at -0.75 ns (0.75 ns before the clock), and we need it to start no later than 0.36 ns after the clock. So, if we add 1.03 ns of delay to the data, the [-0.75, 0.75] window becomes [0.28, 1.78]. This fully covers our required window of [0.36, 1.70], with a margin of 0.08 ns on both sides. We could do this by adding 13 taps of an IDELAY on the data (which would be 1.015 ns, a little off our target of 1.03 ns, and this would have to be confirmed with the timing analyzer). To do this, we would put an IDELAY on both the clock and data paths, set the clock IDELAY to 0 taps and the data IDELAY to 13 taps, run it through trce (with your constraints), and maybe tweak it based on the results (go to 12 or 14 taps).

However, it's actually better to go the other way around. When you use an IDELAY to delay data it adds jitter, +/-5 ps/tap, so 13 taps adds +/-65 ps, which is more than our budget allows. However, if you use an IDELAY to delay a clock, rather than data, then you don't pay this jitter penalty.

So, instead of trying to capture this "first" window with the rising edge of the clock, we could instead capture it with the falling edge before the rising edge. The window for this edge occurs at -2 ns + [0.36, 1.70], so [-1.64, -0.30]. We want to add enough delay on the clock to move this window into the [-0.75, 0.75] window where the data is, so we add 0.97 ns of delay to the clock. This brings the required window to [-0.67, 0.67], which sits entirely inside our available window of [-0.75, 0.75]. To do this, we would still have IDELAYs on both clock and data, but set the data IDELAY to 0 taps and the clock IDELAY to 12 taps (which gives 0.9375 ns; again, not perfect, but close enough).

This is how the interface is designed.

Avrum

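The window-shifting arithmetic in the post above can be replayed with a few lines of Python. This is a sketch, not a timing tool: the helper names are made up for illustration, and the tap size of 78.125 ps is inferred from the post's own figures (12 taps = 0.9375 ns), which corresponds to an assumed 200 MHz IDELAYCTRL reference clock. Real margins must still come from the static timing analyzer.

```python
TAP_NS = 0.078125  # assumed tap size, inferred from "12 taps = 0.9375 ns" in the post


def shift(window, delay_ns):
    """Move a (start, end) window later in time by delay_ns."""
    start, end = window
    return (start + delay_ns, end + delay_ns)


def margins(available, required):
    """(leading, trailing) slack of `required` inside `available`; negative = violation."""
    return (required[0] - available[0], available[1] - required[1])


def nearest_taps(delay_ns, tap_ns=TAP_NS):
    """Closest whole number of IDELAY taps and the delay that tap count gives."""
    taps = round(delay_ns / tap_ns)
    return taps, taps * tap_ns


required = (0.36, 1.70)    # Tpscs/Tphcs window relative to the capturing edge
data_eye = (-0.75, 0.75)   # example eye delivered by the source device
half_period = 2.0          # 500 Mbps DDR -> 2 ns unit interval

# Option 1: delay the data by 1.03 ns so its eye covers the required window.
data_delay_margins = margins(shift(data_eye, 1.03), required)

# Option 2: delay the clock by 0.97 ns and capture with the preceding edge.
prev_edge_required = shift(required, -half_period)            # (-1.64, -0.30)
clock_delay_margins = margins(data_eye, shift(prev_edge_required, 0.97))

print("data-delay margins :", [round(m, 3) for m in data_delay_margins])
print("clock-delay margins:", [round(m, 3) for m in clock_delay_margins])
print("taps for 1.03 ns   :", nearest_taps(1.03))
print("taps for 0.97 ns   :", nearest_taps(0.97))
```

Both options leave the same nominal 0.08 ns on each side; the reason to prefer the clock-delay option is the per-tap jitter on the data path described in the post, which this sketch does not model.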
Avrum,

Thanks again. I would greatly appreciate it if you could review the following timing reports (two small attachments) on setup/hold of a representative data bit input, and let me know if they look OK to you. They are relative to the falling edge of the clock, but the timing for the rising edge is identical. Just to fill in the details beyond what the timing report shows: the clock output phase is shifted by -90 degrees (out of the MMCM, relative to its input clock), and there is a BUFR between CLKFBOUT and CLKFBIN.

Thanks.
Rao

================================================================================
Timing constraint: COMP "ADC_DI_P_i<11>" OFFSET = IN 0.47 ns VALID 0.86 ns BEFORE COMP "ADC_DCLKI_P_i" "RISING";
For more information, see Offset In Analysis in the Timing Closure User Guide (UG612).

 1 path analyzed, 1 endpoint analyzed, 0 failing endpoints
 0 timing errors detected. (0 setup errors, 0 hold errors)
 Minimum allowable offset is 0.463ns.
--------------------------------------------------------------------------------

Paths for end point ADC_Intrfc/IDDR_DI_gen[11].DI_IDDR (ILOGIC_X1Y112.D), 1 path
--------------------------------------------------------------------------------
Slack (setup path):     0.007ns (requirement - (data path - clock path - clock arrival + uncertainty))
  Source:               ADC_DI_P_i<11> (PAD)
  Destination:          ADC_Intrfc/IDDR_DI_gen[11].DI_IDDR (FF)
  Destination Clock:    ADC_Intrfc/clk_450mhz_io_I rising at -0.555ns
  Requirement:          0.470ns
  Data Path Delay:      0.478ns (Levels of Logic = 2)
  Clock Path Delay:     0.700ns (Levels of Logic = 4)
  Clock Uncertainty:    0.130ns

  Clock Uncertainty:          0.130ns  ((TSJ^2 + DJ^2)^1/2) / 2 + PE
    Total System Jitter (TSJ):  0.050ns
    Discrete Jitter (DJ):       0.075ns
    Phase Error (PE):           0.084ns

  Maximum Data Path at Fast Process Corner: ADC_DI_P_i<11> to ADC_Intrfc/IDDR_DI_gen[11].DI_IDDR
    Location                 Delay type         Delay(ns)  Physical Resource
                                                           Logical Resource(s)
    -----------------------------------------------------  -------------------
    AG8.PADOUT               Tiopp                 0.000   ADC_DI_P_i<11>
                                                           ADC_DI_P_i<11>
                                                           ADC_Intrfc/IBUFDS_DI_11/SLAVEBUF.DIFFIN
    AF8.DIFFI_IN             net (fanout=1)        0.000   ADC_Intrfc/IBUFDS_DI_11/SLAVEBUF.DIFFIN
    AF8.I                    Tiodi                 0.476   ADC_DI_N_i<11>
                                                           ADC_Intrfc/IBUFDS_DI_11/IBUFDS
    ILOGIC_X1Y112.D          net (fanout=1)        0.000   ADC_Intrfc/DI_i_inv<11>
    ILOGIC_X1Y112.CLK        Tidock                0.002   ADC_Intrfc/DI_Q1<11>
                                                           ADC_Intrfc/IDDR_DI_gen[11].DI_IDDR
    -----------------------------------------------------  ---------------------------
    Total                                          0.478ns (0.478ns logic, 0.000ns route)
                                                           (100.0% logic, 0.0% route)

  Minimum Clock Path at Fast Process Corner: ADC_DCLKI_P_i to ADC_Intrfc/IDDR_DI_gen[11].DI_IDDR
    Location                 Delay type         Delay(ns)  Physical Resource
                                                           Logical Resource(s)
    -----------------------------------------------------  -------------------
    AH4.I                    Tiopi                 0.409   ADC_DCLKI_P_i
                                                           ADC_DCLKI_P_i
                                                           ADC_Intrfc/ADC_DCLKI_DCM/clkin1_buf/IBUFDS
    MMCME2_ADV_X1Y2.CLKIN1   net (fanout=1)        0.503   ADC_Intrfc/ADC_DCLKI_DCM/clkin1
    MMCME2_ADV_X1Y2.CLKOUT0  Tmmcmcko_CLKOUT      -2.167   ADC_Intrfc/ADC_DCLKI_DCM/mmcm_adv_inst
                                                           ADC_Intrfc/ADC_DCLKI_DCM/mmcm_adv_inst
    BUFMRCE_X1Y4.I           net (fanout=1)        1.012   ADC_Intrfc/clk_450mhz_io_dcm_I
    BUFMRCE_X1Y4.O           Tbmcko_O              0.030   ADC_Intrfc/dclki_bufmr
                                                           ADC_Intrfc/dclki_bufmr
    BUFR_X1Y9.I              net (fanout=2)        0.513   ADC_Intrfc/clk_450mhz_io_bufmrI_out
    BUFR_X1Y9.O              Tbrcko_O              0.090   ADC_Intrfc/dclki_bufr
                                                           ADC_Intrfc/dclki_bufr
    ILOGIC_X1Y112.CLK        net (fanout=32)       0.310   ADC_Intrfc/clk_450mhz_io_I
    -----------------------------------------------------  ---------------------------
    Total                                          0.700ns (-1.638ns logic, 2.338ns route)

--------------------------------------------------------------------------------
Hold Paths: COMP "ADC_DI_P_i<11>" OFFSET = IN 0.47 ns VALID 0.86 ns BEFORE COMP "ADC_DCLKI_P_i" "RISING";
--------------------------------------------------------------------------------

Paths for end point ADC_Intrfc/IDDR_DI_gen[11].DI_IDDR (ILOGIC_X1Y112.D), 1 path
--------------------------------------------------------------------------------
Slack (hold path):      0.153ns (requirement - (clock path + clock arrival + uncertainty - data path))
  Source:               ADC_DI_P_i<11> (PAD)
  Destination:          ADC_Intrfc/IDDR_DI_gen[11].DI_IDDR (FF)
  Destination Clock:    ADC_Intrfc/clk_450mhz_io_I rising at -0.555ns
  Requirement:          0.390ns
  Data Path Delay:      0.320ns (Levels of Logic = 2)
  Clock Path Delay:     0.982ns (Levels of Logic = 4)
  Clock Uncertainty:    0.130ns

  Clock Uncertainty:          0.130ns  ((TSJ^2 + DJ^2)^1/2) / 2 + PE
    Total System Jitter (TSJ):  0.050ns
    Discrete Jitter (DJ):       0.075ns
    Phase Error (PE):           0.084ns

  Minimum Data Path at Fast Process Corner: ADC_DI_P_i<11> to ADC_Intrfc/IDDR_DI_gen[11].DI_IDDR
    Location                 Delay type         Delay(ns)  Physical Resource
                                                           Logical Resource(s)
    -----------------------------------------------------  -------------------
    AG8.PADOUT               Tiopp                 0.000   ADC_DI_P_i<11>
                                                           ADC_DI_P_i<11>
                                                           ADC_Intrfc/IBUFDS_DI_11/SLAVEBUF.DIFFIN
    AF8.DIFFI_IN             net (fanout=1)        0.000   ADC_Intrfc/IBUFDS_DI_11/SLAVEBUF.DIFFIN
    AF8.I                    Tiodi                 0.394   ADC_DI_N_i<11>
                                                           ADC_Intrfc/IBUFDS_DI_11/IBUFDS
    ILOGIC_X1Y112.D          net (fanout=1)        0.000   ADC_Intrfc/DI_i_inv<11>
    ILOGIC_X1Y112.CLK        Tiockd (-Th)          0.074   ADC_Intrfc/DI_Q1<11>
                                                           ADC_Intrfc/IDDR_DI_gen[11].DI_IDDR
    -----------------------------------------------------  ---------------------------
    Total                                          0.320ns (0.320ns logic, 0.000ns route)
                                                           (100.0% logic, 0.0% route)

  Maximum Clock Path at Fast Process Corner: ADC_DCLKI_P_i to ADC_Intrfc/IDDR_DI_gen[11].DI_IDDR
    Location                 Delay type         Delay(ns)  Physical Resource
                                                           Logical Resource(s)
    -----------------------------------------------------  -------------------
    AH4.I                    Tiopi                 0.490   ADC_DCLKI_P_i
                                                           ADC_DCLKI_P_i
                                                           ADC_Intrfc/ADC_DCLKI_DCM/clkin1_buf/IBUFDS
    MMCME2_ADV_X1Y2.CLKIN1   net (fanout=1)        0.553   ADC_Intrfc/ADC_DCLKI_DCM/clkin1
    MMCME2_ADV_X1Y2.CLKOUT0  Tmmcmcko_CLKOUT      -2.407   ADC_Intrfc/ADC_DCLKI_DCM/mmcm_adv_inst
                                                           ADC_Intrfc/ADC_DCLKI_DCM/mmcm_adv_inst
    BUFMRCE_X1Y4.I           net (fanout=1)        1.271   ADC_Intrfc/clk_450mhz_io_dcm_I
    BUFMRCE_X1Y4.O           Tbmcko_O              0.034   ADC_Intrfc/dclki_bufmr
                                                           ADC_Intrfc/dclki_bufmr
    BUFR_X1Y9.I              net (fanout=2)        0.584   ADC_Intrfc/clk_450mhz_io_bufmrI_out
    BUFR_X1Y9.O              Tbrcko_O              0.093   ADC_Intrfc/dclki_bufr
                                                           ADC_Intrfc/dclki_bufr
    ILOGIC_X1Y112.CLK        net (fanout=32)       0.364   ADC_Intrfc/clk_450mhz_io_I
    -----------------------------------------------------  ---------------------------
    Total                                          0.982ns (-1.790ns logic, 2.772ns route)

--------------------------------------------------------------------------------
================================================================================

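The slack figures in the report follow directly from the two formulas trce prints next to them. The short Python sketch below re-derives them from the numbers in the report; the function names are illustrative, and nothing here replaces the tool's own analysis. It only confirms that the report is internally consistent, which is a separate question from whether the clock path it models is trustworthy.

```python
def setup_slack(requirement, data_path, clock_path, clock_arrival, uncertainty):
    # requirement - (data path - clock path - clock arrival + uncertainty)
    return requirement - (data_path - clock_path - clock_arrival + uncertainty)


def hold_slack(requirement, data_path, clock_path, clock_arrival, uncertainty):
    # requirement - (clock path + clock arrival + uncertainty - data path)
    return requirement - (clock_path + clock_arrival + uncertainty - data_path)


# All values in ns, copied from the report above.
setup = setup_slack(0.470, data_path=0.478, clock_path=0.700,
                    clock_arrival=-0.555, uncertainty=0.130)
hold = hold_slack(0.390, data_path=0.320, clock_path=0.982,
                  clock_arrival=-0.555, uncertainty=0.130)

# Per-element delays of the setup-analysis clock path, in report order.
setup_clock_path = [0.409, 0.503, -2.167, 1.012, 0.030, 0.513, 0.090, 0.310]

print(f"setup slack: {setup:.3f} ns")
print(f"hold slack : {hold:.3f} ns")
print(f"clock path : {sum(setup_clock_path):.3f} ns")
```

Note the large negative Tmmcmcko_CLKOUT entry: it is how the tool models MMCM delay compensation, and it is the term whose validity is questioned in the reply that follows.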
This doesn't seem right...

As I said, using the combination of an MMCM with a BUFR and BUFMR should generate terrible timing. It isn't a normal, expected clocking system. The paths to and from the MMCM will take odd routes, the ones for CLK0 and CLKFB are not the same (since one goes through the BUFMR and the other doesn't), and it may even involve fabric routing.

Then we look at the report itself. Your constraint clearly states that your data valid window is 0.86 ns. This is just too small to capture statically with any mechanism in the FPGA. Even the best mechanism, chip-sync, requires a data eye that is larger than 1 ns.

So, I am pretty sure this is a tool bug: you are getting a "false pass" from the timing engine. It doesn't understand the clock structure and is giving you incorrect static timing results. If you have WebCase access, you should file this as a bug.

Now to your real problem. 0.86 ns is just too small to capture statically! The problems you have had with other approaches are real: nothing will be able to statically capture data with this small a window. So, your only way to capture this is to use dynamic calibration of your capture circuit. There are app notes (like XAPP524) that deal with issues like this.

Avrum

Avrum,

I am not fully convinced this timing report is buggy. I am wondering whether the setup/hold info in tables 46 (MMCM) and 48 (BUFIO) of the datasheet really applies to IDDR flops, or only to IFFs.

I have two reasons for saying this. First, table 16 (performance characteristics) notes that the DDR LVDS receiver can run at 1400 Mbps (for a -2 part). That is only about a 0.714 ns eye opening, compared to the 1.3 ns window we were discussing. So surely it can do better than what table 48 suggests. In fact, I am thinking that table 24 might be representative of the IDDR flop's timing (TISDCK_D_DDR/TISCKD_D_DDR). It's listed under ISERDES timing, but there is no reason why the IDDR would perform worse.

Secondly, I wanted to verify the table 46 numbers by changing my clocking scheme to a BUFG instead of BUFMR/BUFR (with a BUFG on the feedback path as well, added automatically). Attached is a setup-violation snippet. Sure, it misses timing, but as it says, the minimum allowable offset to make it is 1.274 ns, unlike what table 46 says for a -2 410T part (2.84 ns).

I forgot to mention that my inputs are LVDS (this might have been obvious to you from the IBUFDS on the data in the report). Intuitively, I expected a 0.86 ns eye window centered around the clock edge to be plenty for the FPGA to clock it in reliably (in DDR mode).

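The comparison between data rates and eye widths in this part of the thread is a unit conversion: one bit occupies one unit interval (UI), and the UI in nanoseconds is 1000 divided by the rate in Mbps. A minimal Python sketch, with illustrative function names, makes the numbers easy to check. It treats the whole UI as usable eye, which is optimistic, since jitter, skew, and duty-cycle distortion all reduce the real opening.

```python
def unit_interval_ns(rate_mbps: float) -> float:
    """Duration of one bit (the ideal eye width) at a given data rate."""
    return 1000.0 / rate_mbps


def max_rate_mbps(window_ns: float) -> float:
    """Highest data rate whose full UI is at least as wide as the required window."""
    return 1000.0 / window_ns


def ddr_clock_mhz(rate_mbps: float) -> float:
    """DDR transfers two bits per clock cycle, so the clock runs at half the bit rate."""
    return rate_mbps / 2.0


print(f"UI at 1400 Mbps: {unit_interval_ns(1400):.3f} ns")
print(f"UI at 500 Mbps : {unit_interval_ns(500):.3f} ns")
print(f"max static rate for a 1.34 ns window: {max_rate_mbps(1.34):.0f} Mbps "
      f"({ddr_clock_mhz(max_rate_mbps(1.34)):.0f} MHz DDR clock)")
```

The last line gives about 746 Mbps, which the thread rounds to 740 Mbps.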
In reality, there is only one set of resources within the IOB: the SDR IOB FF is just one of the two capture FFs in the IDDR.

I am quite confident that the timing in the datasheet covers SDR IOB FFs, IDDRs, and ISERDES. These parameters (Tpsdcm/Tphdcm, Tpsmmcm/Tphmmcm, Tpscs/Tphcs) have been around for a LONG time. I have worked with them since Virtex-II, and I have it on pretty good authority that they are accurate and that they apply in these cases.

However, the thing to remember is that they are "worst case": they apply to any IOB anywhere on the FPGA. An individual pin on the FPGA may be different, but the overlap of all pins in the FPGA will roughly map to this window. Furthermore, there are some additional derating factors that need to be applied to these numbers, since they are nominally specified for the SSTL15 I/O standard. Other standards will be slower or faster (see Table 19).

For all these reasons, the "final" analysis should come from the static timing report. But these numbers are quite meaningful.

As for being able to do 1400 Mbps, there is nothing that says this can be achieved with static capture mechanisms. All the faster interfaces might need to use dynamic calibration. With dynamic calibration you can get below the 1 ns window.

Finally, to correlate the static timing numbers from trce, you would need both the setup slack and the hold slack. The total data eye required for your interface would be the width of your input data eye (the VALID in your OFFSET IN constraint) - setup slack - hold slack. If either or both of the setup and hold slacks are negative, the required window can be wider than your VALID window (0.86 ns in your example). This calculation should line up with the width of Tpsmmcmcc/Tphmmcmcc.

Avrum

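The correlation rule in the last paragraph (required eye = VALID - setup slack - hold slack) is easy to apply to any trce OFFSET IN report. A small Python sketch follows; the function name is illustrative, the first set of numbers comes from the MMCM/BUFMR report posted earlier in the thread, and the second set is purely hypothetical to show the effect of a negative slack.

```python
def required_eye_ns(valid_ns: float, setup_slack_ns: float, hold_slack_ns: float) -> float:
    """Data eye the implemented capture path needs, per the rule in the post above."""
    return valid_ns - setup_slack_ns - hold_slack_ns


# From the earlier report: VALID 0.86 ns, setup slack 0.007 ns, hold slack 0.153 ns.
reported = required_eye_ns(0.86, 0.007, 0.153)

# Hypothetical failing case (values invented for illustration only):
# a -0.15 ns setup slack widens the required eye beyond the VALID window.
hypothetical = required_eye_ns(0.86, -0.15, 0.05)

print(f"required eye per the posted report: {reported:.3f} ns")
print(f"required eye, hypothetical failure: {hypothetical:.3f} ns")
```

For the posted report this gives 0.700 ns, far narrower than the 2.98 ns Tpsmmcmcc/Tphmmcmcc window quoted earlier; that mismatch is the kind of discrepancy this correlation is meant to expose.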