How to reduce loop latency

Hello everyone, I need your opinion about the following issue. Let's say that we have the following code:

int Array[10000] = {0};
for (int i = 0; i < 10000; i++) { // fixed bound loop
    if (...limitation....) {
        ..............
        ..............
    }
}

This loop has 10000 iterations, and I need only one of them to get my data, but I don't know which iteration is the right one. So I have a lot of latency without any reason.

Is there any way to reduce the useless iterations with directives (for example array_partition)?

I have also tried unroll and pipeline, but without any positive results.
Thanks in advance!

Replies (8)

姜雨孜

2018-11-1 09:09:42

Well, that's pretty annoying. The only solution I can see is to convert everything to integer or fixed-point; at least then it'll be significantly faster (should be able to get single-cycle performance).
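
To illustrate what "convert everything to integer or fixed-point" could look like for a point-in-triangle test, here is a minimal sketch. It is not the poster's `inTriangle`, whose signature and internals are not shown in the thread. `PointFx`, `toFx`, `edge` and `inTriangleFx` are hypothetical names, the test is reduced to 2D, and the Q16.16 format and the coordinate range of |raw| < 2^30 are assumptions. It uses only integer subtractions, multiplications and comparisons, which HLS tools can usually schedule with far less latency than floating-point operations.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical 2D point in Q16.16 fixed point (real value = raw / 65536).
struct PointFx { int32_t x; int32_t y; };

// Helper for the examples: converts a small integer to Q16.16.
constexpr int32_t toFx(int32_t v) { return v * 65536; }

// Assumed coordinate range: |raw| < 2^30, so the 64-bit products cannot overflow.
const int32_t LIMIT = 1 << 30;

static bool inRange(const PointFx& p) {
    return p.x > -LIMIT && p.x < LIMIT && p.y > -LIMIT && p.y < LIMIT;
}

// Edge function: > 0 if p lies to the left of the edge a->b,
// < 0 if it lies to the right, 0 if it lies on the edge's line.
static int64_t edge(const PointFx& a, const PointFx& b, const PointFx& p) {
    assert(inRange(a) && inRange(b) && inRange(p));
    int64_t abx = static_cast<int64_t>(b.x) - a.x;
    int64_t aby = static_cast<int64_t>(b.y) - a.y;
    int64_t apx = static_cast<int64_t>(p.x) - a.x;
    int64_t apy = static_cast<int64_t>(p.y) - a.y;
    return abx * apy - aby * apx;
}

// True if p is inside or on the border of triangle (a, b, c).
// Works for either winding order: p is inside when the three edge
// functions do not have mixed signs.
bool inTriangleFx(const PointFx& a, const PointFx& b, const PointFx& c,
                  const PointFx& p) {
    int64_t e0 = edge(a, b, p);
    int64_t e1 = edge(b, c, p);
    int64_t e2 = edge(c, a, p);
    bool hasNeg = (e0 < 0) || (e1 < 0) || (e2 < 0);
    bool hasPos = (e0 > 0) || (e1 > 0) || (e2 > 0);
    return !(hasNeg && hasPos);
}
```

Points exactly on an edge count as inside here. Whether that is the right choice depends on how neighbouring triangles should share their borders.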

姜雨孜

2018-11-1 09:15:00

So the idea is that "(...limitation...)" will only be true for one of the 10,000 iterations? And you can't determine which iteration that will be in advance?

There's not really much you can do there. If you need to read 10,000 array elements to find which one is the "right" one, and you can only read one per cycle (from a single-port RAM, or a RAM where one port is permanently connected elsewhere) then that's going to take (up to) 10,000 cycles.
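
If the array is not tied to a single read port, one way to cut the cycle count is to read several elements per cycle. The sketch below assumes the array can be partitioned into 8 banks, so 8 elements are compared per iteration and the loop runs 1,250 times instead of 10,000. `findIndex`, `matches` and `LANES` are hypothetical names, and `matches` only stands in for the elided "(...limitation...)" condition. The pragma spellings follow Vivado HLS; an ordinary C++ compiler ignores them.

```cpp
#include <cassert>

const int N = 10000;     // array size from the original question
const int LANES = 8;     // assumption: elements compared per loop iteration
static_assert(N % LANES == 0, "N must be a multiple of LANES");

// Hypothetical stand-in for the elided "(...limitation...)" condition.
static bool matches(int value, int key) { return value == key; }

// Returns the index of the first matching element, or -1 if none matches.
// The outer loop always runs N / LANES = 1250 iterations, so latency is fixed.
int findIndex(const int data[N], int key) {
// Split the array over 8 memories so that 8 reads can happen per cycle.
#pragma HLS ARRAY_PARTITION variable=data cyclic factor=8 dim=1
    assert(data != nullptr);
    int found = -1;
    for (int base = 0; base < N; base += LANES) {
#pragma HLS PIPELINE II=1
        for (int lane = 0; lane < LANES; ++lane) {
#pragma HLS UNROLL
            int i = base + lane;
            // Keep only the first hit; later hits are ignored.
            if (found < 0 && matches(data[i], key)) {
                found = i;
            }
        }
    }
    return found;
}
```

The trade-off is area: every lane needs its own copy of the condition logic, which becomes expensive if the condition is as heavy as a geometric test.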

If there's some sort of ordering to the data then there are ways to improve it. For example, if the data is sorted and you're looking for the first element larger than some constant value, then a binary search will guarantee that you find it in about fifteen iterations (although these can't be pipelined, because each one depends on the previous one).
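
A minimal sketch of the binary search mentioned above, assuming the data is sorted in ascending order and the goal is the first element strictly greater than a threshold. `firstGreater` is a hypothetical name. For 10,000 elements the loop runs at most 14 times, which is in line with the "about fifteen" estimate.

```cpp
#include <cassert>

// Returns the index of the first element strictly greater than `threshold`
// in an ascending sorted array of n elements, or n if there is no such element.
int firstGreater(const int sorted[], int n, int threshold) {
    assert(n >= 0);
    int lo = 0;   // every element before lo is <= threshold
    int hi = n;   // every element from hi onward is > threshold
    while (lo < hi) {
        int mid = lo + (hi - lo) / 2;
        if (sorted[mid] > threshold) {
            hi = mid;        // the answer is mid or something before it
        } else {
            lo = mid + 1;    // the answer is after mid
        }
    }
    return lo;
}
```

Each step needs the result of the previous comparison before it can choose the next address, which is why the iterations cannot overlap in a pipeline.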

俞敏东

2018-11-1 09:28:46

I have the following limitation:

for (int i = 0; i < 800; i++) {
    if (inTriangle(x, current, xCoor, yCoor, zCoor)) {
        ..............
        ..............
    }
}

The condition inside the if statement can be true only once in every 800 iterations.

姜雨孜

2018-11-1 09:38:01

Can you explain exactly what the code is doing? From the function names, here's my guess: you've got an array of triangular prisms ("current"), and you're checking which one of these a point (xCoor, yCoor, zCoor) falls into. It can only fall into one because they don't overlap. The question is how to determine which one is relevant.

The obvious question is "are the triangles arranged in any meaningful way?" Take a simplified example where "current" is just an array of equally-sized (S units per side) cubes and they're arranged in a 10*10*8 prism. Finding out which cube a point is in is trivial: the cube at (xCoor/S, yCoor/S, zCoor/S) is the correct one. Then you don't need the loop at all. It's harder for triangles, but not much harder.
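
The simplified cube case can be sketched as a direct index computation with no loop at all. The grid dimensions come from the 10*10*8 example; the side length `S = 4`, the integer coordinates, the row-major ordering and the name `cubeIndex` are assumptions made for illustration.

```cpp
#include <cassert>

const int NX = 10, NY = 10, NZ = 8;  // grid dimensions from the 10*10*8 example
const int S = 4;                     // assumption: cube side length in integer units

// Maps a point to the linear index of the cube that contains it.
// Assumes non-negative integer coordinates inside the grid.
int cubeIndex(int xCoor, int yCoor, int zCoor) {
    assert(xCoor >= 0 && xCoor < NX * S);
    assert(yCoor >= 0 && yCoor < NY * S);
    assert(zCoor >= 0 && zCoor < NZ * S);
    int cx = xCoor / S;
    int cy = yCoor / S;
    int cz = zCoor / S;
    // Row-major order: x varies fastest, then y, then z.
    return (cz * NY + cy) * NX + cx;
}
```

When `S` is a power of two the divisions reduce to shifts, so the lookup costs almost nothing in hardware.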

If the triangles are not all the same size, or not arranged so neatly, you could potentially use something similar. Store a separate map of cubes, and for each cube store every triangular prism that intersects that cube (which is easy if you know the prism's vertices). This takes a bit more memory, but (as above) it's trivial to find out which cube a point is in, and once you know that you only have to check for prisms that intersect that cube.
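
The cube map idea could be sketched as follows. This is a simplified illustration, not the poster's data structure: `Shape` keeps only an axis-aligned bounding box as a stand-in for a triangular prism, `contains` is a placeholder for the exact test (`inTriangle` in the thread), and `S`, `MAX_PER_CELL`, `CellMap`, `buildMap` and `findShape` are hypothetical. Shapes are registered in every cell their bounding box overlaps, which is conservative but safe.

```cpp
#include <cassert>

const int NX = 10, NY = 10, NZ = 8;  // grid size, as in the 10*10*8 example
const int S = 4;                     // assumption: cell side length
const int MAX_PER_CELL = 4;          // assumption: upper bound on shapes per cell

// Stand-in for a triangular prism: only its bounding box (inclusive bounds).
// Assumes all bounds are non-negative and lie inside the grid.
struct Shape { int x0, y0, z0, x1, y1, z1; };

// Placeholder for the exact containment test; here a plain box test.
static bool contains(const Shape& s, int x, int y, int z) {
    return x >= s.x0 && x <= s.x1 && y >= s.y0 && y <= s.y1 &&
           z >= s.z0 && z <= s.z1;
}

struct CellMap {
    int count[NZ][NY][NX];              // number of candidates per cell
    int ids[NZ][NY][NX][MAX_PER_CELL];  // candidate shape indices per cell
};

// Precomputation: register each shape in every cell its box overlaps.
void buildMap(const Shape shapes[], int n, CellMap& map) {
    for (int z = 0; z < NZ; ++z)
        for (int y = 0; y < NY; ++y)
            for (int x = 0; x < NX; ++x)
                map.count[z][y][x] = 0;
    for (int i = 0; i < n; ++i) {
        const Shape& s = shapes[i];
        for (int cz = s.z0 / S; cz <= s.z1 / S; ++cz)
            for (int cy = s.y0 / S; cy <= s.y1 / S; ++cy)
                for (int cx = s.x0 / S; cx <= s.x1 / S; ++cx) {
                    int& c = map.count[cz][cy][cx];
                    assert(c < MAX_PER_CELL);
                    map.ids[cz][cy][cx][c++] = i;
                }
    }
}

// Lookup: only the candidates stored in the point's cell are tested.
// Returns the shape index, or -1 if no candidate contains the point.
int findShape(const Shape shapes[], const CellMap& map, int x, int y, int z) {
    int cx = x / S, cy = y / S, cz = z / S;
    for (int k = 0; k < map.count[cz][cy][cx]; ++k) {
        int id = map.ids[cz][cy][cx][k];
        if (contains(shapes[id], x, y, z)) return id;
    }
    return -1;
}
```

The lookup loop is now bounded by `MAX_PER_CELL` instead of the total number of shapes, at the cost of the extra memory for the map.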

© 2021 bbs.elecfans.com
