OK, so your input format is 64.0 and you want 32.4 format output, correct?
I suspect you won't find exactly what you want on the web, so you'll need to find
some examples and modify them to fit your exact needs.
For example, here is an 8.0 input, 4.4 output sample design that shows one approach:
http://www.engr.usask.ca/classes/EE/431/Verilog%20Files/better_square_root.v
You'd need to extend the design to the width you want and unroll the logic to reach your
latency requirement.
Here's another example for thought. Again, it will take about 1 clock cycle per output
bit (worst case). You may be able to unroll the loop:
http://academic.csuohio.edu/yuc/comp-f08/sqrt-virtex.v
Here's another page that has a sample 36.0 input square-root algorithm that you may be able
to adapt. It is purely combinational. You'd need to modify it to be 64 bits producing a 32.4 output.
So, assuming you can't find any better starting points... One approach would be to treat your
input as a 64.8 input format, i.e., pad it with 8 zeroes. After you perform an integer square root,
you should get a 32.4 output, since a square root halves the number of fractional bits.
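
To make the padding idea concrete, here is a minimal Python reference model (not HDL, and not taken from any of the linked files). It assumes a restoring, digit-by-digit integer square root that resolves one result bit per loop iteration, which roughly corresponds to one bit per clock in a sequential design. The function names `isqrt_bitserial` and `sqrt_64_to_32p4` are made up for this sketch.

```python
def isqrt_bitserial(value: int, in_bits: int) -> int:
    """Return floor(sqrt(value)) using a restoring digit-by-digit method.

    One result bit is resolved per loop iteration, so an in_bits-wide
    input takes in_bits // 2 iterations.
    """
    assert in_bits % 2 == 0 and 0 <= value < (1 << in_bits)
    root = 0  # partial result, grows by one bit per iteration
    rem = 0   # running remainder
    for i in range(in_bits // 2):
        # Bring down the next two input bits, most significant pair first.
        shift = in_bits - 2 * (i + 1)
        rem = (rem << 2) | ((value >> shift) & 0b11)
        trial = (root << 2) | 1  # 4*root + 1, the amount to try subtracting
        root <<= 1
        if rem >= trial:
            rem -= trial
            root |= 1
    return root


def sqrt_64_to_32p4(x: int) -> int:
    """Treat a 64.0 input as 64.8 (append 8 zero bits) and take the
    integer square root. The 36-bit result is in 32.4 format."""
    assert 0 <= x < (1 << 64)
    return isqrt_bitserial(x << 8, 72)


r = sqrt_64_to_32p4(2)
print(r, r / 16)  # 22 1.375, i.e. sqrt(2) truncated to 4 fractional bits
```

A model like this can serve as a golden reference when checking the widened hardware design in simulation.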
Once you get the core algorithm working, look into how you can pipeline it to improve its clock
frequency while meeting your latency goal, i.e., scatter 6 pipeline stages in it.
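
As a rough illustration of the pipelining suggestion, the Python sketch below models one possible partition: the 36 result bits are split evenly so that each of 6 stages contains 6 unrolled iterations, with a register after each stage. The even split, the restoring algorithm, and the names `stage` and `simulate` are assumptions for this sketch; a real design would place the registers wherever timing analysis says they help most.

```python
RESULT_BITS = 36                        # 32 integer + 4 fractional bits
STAGES = 6                              # pipeline depth named in the post
BITS_PER_STAGE = RESULT_BITS // STAGES  # 6 result bits resolved per stage


def stage(state, stage_idx):
    """Combinational logic of one stage: 6 unrolled iterations."""
    value, rem, root = state
    for k in range(BITS_PER_STAGE):
        i = stage_idx * BITS_PER_STAGE + k
        shift = 2 * (RESULT_BITS - 1 - i)   # next input bit pair, MSB first
        rem = (rem << 2) | ((value >> shift) & 0b11)
        trial = (root << 2) | 1
        root <<= 1
        if rem >= trial:
            rem -= trial
            root |= 1
    return (value, rem, root)


def simulate(inputs):
    """Accept one 64.0 input per clock; return 32.4 results in order."""
    regs = [None] * STAGES  # regs[s] is the register after stage s
    outputs = []
    # Extra empty clocks at the end flush the pipeline.
    for x in list(inputs) + [None] * STAGES:
        if regs[-1] is not None:
            outputs.append(regs[-1][2])  # finished root leaves the pipe
        new_regs = [None] * STAGES
        for s in range(STAGES - 1, 0, -1):
            if regs[s - 1] is not None:
                new_regs[s] = stage(regs[s - 1], s)
        if x is not None:
            new_regs[0] = stage((x << 8, 0, 0), 0)  # pad 64.0 to 64.8
        regs = new_regs
    return outputs


print(simulate([2, 16]))  # [22, 64]
```

Each input passes through 6 registers before its result appears, while a new input can still be accepted on every clock.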
Good luck, I hope you can find more suggestions!
John Providenza