I have just set up a Proof of Concept VDI for a customer, with the aim of utilising NVIDIA GRID vGPU, but I have had major application compatibility issues :(

My setup is:
- HP ProLiant DL380 Gen 9, dual 10-core CPUs, 128GB RAM, 4x300GB 15k SAS (approx. 550GB local storage)
- NVIDIA GRID K2
- Citrix XenServer 6.5
- Citrix XenDesktop 7.6 (recommended patches applied to the server components and VDA)
- NVIDIA vGPU Drivers for XenServer 6.5 - Windows Display Driver (341.08) and GRID vGPU Manager (340.57)

I have created one base desktop image configured for vGPU, and created a Machine Catalog. I then modified the base image for passthrough GPU, and created another Machine Catalog. This gave me a side-by-side comparison of vGPU vs. vDGA, with the vGPU configured with GRID K240Q profiles and the vDGA getting one of the GPUs on the card passed through.

With the vDGA machine, basically all of the software worked, which is all fine. However, with the vGPU machine, nearly anything that required OpenGL crashed in NVOGLV64.DLL :(

The list of applications that don't work is:
- 3DEqualizer4
- Adobe After Effects CC 2014
- Adobe Photoshop CC 2014 (it runs, but without hardware acceleration)
- Adobe Premiere Pro CC 2014
- Autodesk AutoCAD 2015
- Autodesk AutoCAD Architecture 2015
- Autodesk Maya 2015
- Hiero
- Mari
- MODO
- Nuke
- Silhouette
- SolidWorks 2010
- Toon Boom Harmony
- Toon Boom Storyboard

I know that 3D acceleration is possible, as the Unigine Heaven benchmark works in both profiles (vGPU and vDGA) and in all rendering modes.

I really need some help to understand whether:
- there is something wrong with my setup
- there is an issue with the NVIDIA VM drivers
- the applications will not work until they are rewritten to support a vGPU environment

Most of the crashes take the following form:

Faulting application name: AEGPUSniffer.exe, version: 0.0.0.0, time stamp: 0x53e05513
Faulting module name: nvoglv64.DLL, version: 9.18.13.4108, time stamp: 0x5452245c
Exception code: 0xc000001d
Fault offset: 0x0000000000d5fb10
Faulting process id: 0x1878
Faulting application start time: 0x01d049daf4c88fde
Faulting application path: C:\Program Files\Adobe\Adobe After Effects CC 2014\Support Files\AEGPUSniffer.exe
Faulting module path: C:\Windows\SYSTEM32\nvoglv64.DLL

Some of the applications created crash dump files, and analysing those showed that the 0xc000001d exception (invalid opcode) was caused by an AVX instruction (I think). My only thought is that the memory pointed to by the instruction wasn't correctly 16-byte aligned, but confirming that would require more debugging than I have access to.

Any help/pointers would be greatly appreciated, otherwise vGPU is pretty much of no use to this customer :(
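For anyone triaging similar event-log entries, here is a minimal sketch of pulling the key fields out of a crash record and decoding the exception code. The helper names and the sample string are made up for illustration; only the NTSTATUS values themselves are standard Windows codes. Note that 0xc000001d is an illegal-instruction status, which usually means the (virtual) CPU rejected the opcode itself; a misaligned access would more typically show up as an access violation (0xc0000005).

```python
import re

# A small subset of well-known NTSTATUS values, for illustration only.
NTSTATUS_NAMES = {
    0xC0000005: "STATUS_ACCESS_VIOLATION",
    0xC000001D: "STATUS_ILLEGAL_INSTRUCTION",
    0xC00000FD: "STATUS_STACK_OVERFLOW",
}

# Field labels as they appear in the Application Error text above.
FIELDS = [
    "Faulting application name",
    "Faulting module name",
    "Exception code",
    "Fault offset",
]


def parse_crash_record(text):
    """Extract the first token after each known field label (hypothetical helper)."""
    record = {}
    for field in FIELDS:
        match = re.search(re.escape(field) + r":\s*([^\s,]+)", text)
        if match:
            record[field] = match.group(1)
    return record


def describe_exception(code_text):
    """Map a hex exception code string to its NTSTATUS name, if known."""
    code = int(code_text, 16)
    return NTSTATUS_NAMES.get(code, "unknown (0x%08x)" % code)


SAMPLE = (
    "Faulting application name: AEGPUSniffer.exe, version: 0.0.0.0, "
    "time stamp: 0x53e05513 "
    "Faulting module name: nvoglv64.DLL, version: 9.18.13.4108, "
    "time stamp: 0x5452245c "
    "Exception code: 0xc000001d "
    "Fault offset: 0x0000000000d5fb10"
)

record = parse_crash_record(SAMPLE)
print(record["Faulting module name"], describe_exception(record["Exception code"]))
```

Running the same parser over every failing application's record is a quick way to confirm that they all fault in the same module with the same exception code.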
10 replies
What CPUs are you using? I'm going to guess they're Haswell-based?
Thanks for the reply, Jason. The CPUs are Intel Xeon E5-2650 v3 @ 2.30GHz, so yes, they are Haswell-EP. Is there a known issue with these processors and vGPU?
I have experienced the same issue, with the following setup:
- NVIDIA GRID K1
- Citrix XenServer 6.5
- Citrix XenApp 6.0 (Windows 2008 R2)
- NVIDIA vGPU Drivers for XenServer 6.5 - Windows Display Driver (341.08) and GRID vGPU Manager (340.57)

I managed to replace the Windows Display Driver with 347.52-quadro-tesla-grid-winserv2008-2008r2-2012-64bit-international-whql.exe. After that, I no longer get faults in nvoglv64.dll.
Similar issues have been reported since XenServer 6.5 was released, though that may be more coincidental with customers buying Haswell systems. We have an updated driver package that should be released this week, which incorporates a workaround for this issue. Check our drivers download page later today for a new vGPU package for XenServer 6.5. Once you've downloaded and tested it, let us know if it resolves the issue.
Jason, you've made me a very happy man :) I will do. Thanks a lot for the update.
Preliminary Update

I have updated the XenServer driver and the Windows driver in the vGPU profile base image [NVIDIA GRID VGPU SOFTWARE RELEASE 340.78/341.44 WHQL], and initial testing has been 100% positive :) The ones I quickly tested (it's quite late here) all ran with 3D acceleration. So it's looking very promising!

I will do some more thorough testing in a couple of days, when I visit the customer's site. Many thanks again for the information, and the heads-up on the new driver release.
Full Update

I dialled back in and tested all the remaining applications on my "crash" list:
- 3DEqualizer4
- Adobe After Effects CC 2014
- Adobe Premiere Pro CC 2014
- Autodesk AutoCAD Architecture 2015
- Hiero
- Mari
- MODO
- Silhouette
- Toon Boom Harmony
- Toon Boom Storyboard

and all of them ran without crashing :) So my issue has certainly been resolved by the driver update.

As an aside: currently CUDA/OpenCL is not supported in vGPU mode. Is this a technical issue (hardware limitation, etc.), or a driver limitation? There are a few applications that my customer is testing that do use CUDA/OpenCL for raytracing, etc., and while the CPU is always a fallback, it would have been interesting to benchmark CPU vs. vGPU to see what performance gains could be had. Is there a roadmap to add support for CUDA/OpenCL to vGPU, or is there not enough perceived demand for it, so that the focus stays on OpenGL/DirectX visuals (rather than compute)?
Excellent, thanks for letting us know it's resolved.

Onto the CUDA question. First, it's important to understand that vGPU shares resources based on a scheduler. It doesn't allocate blocks of CUDA cores; instead, each VM is allocated a "slice" of the clock schedule. This allows us to increase a VM's clock time if the GPU is not fully utilised, giving users a bump in performance when other users aren't using the GPU fully.

Now, when using CUDA you would essentially be sending code directly to the GPU, and it will run until completion. If this exceeds the user's scheduled time, it just keeps running and locks other users out of the GPU resources. Today there's no mechanism to suspend or pre-empt the code before it completes, so it's not a good situation for multiple users sharing resources! This is the reason why CUDA support is currently not available for vGPU, only for passthrough.

Is it being developed for the future? Absolutely. We're keen to ensure that vGPU can offer identical capabilities to a passthrough GPU, including use for GPGPU, and it is a roadmap item, though I can't share timelines at present.
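The lock-out effect described above can be illustrated with a toy simulation. This is only a sketch under simple assumptions (one job per VM, a fixed slice length, strict round-robin order); the function name, VM names, and numbers are invented, and it is not a model of how the real GRID scheduler is implemented.

```python
from collections import deque


def run_schedule(jobs, slice_ms, preemptible):
    """Simulate round-robin GPU time slicing (toy model).

    jobs: list of (vm_name, work_ms) tuples, one job per VM.
    Returns a dict mapping vm_name to its completion time in ms.
    """
    queue = deque(jobs)
    clock = 0
    finished = {}
    while queue:
        vm, remaining = queue.popleft()
        if preemptible:
            # The job can be suspended when its slice expires.
            ran = min(remaining, slice_ms)
        else:
            # Non-preemptible work holds the GPU until it completes,
            # however far that overruns the slice.
            ran = remaining
        clock += ran
        remaining -= ran
        if remaining > 0:
            queue.append((vm, remaining))
        else:
            finished[vm] = clock
    return finished


# One long compute job queued ahead of two short graphics jobs.
jobs = [("vm_compute", 500), ("vm_desktop_a", 10), ("vm_desktop_b", 10)]
with_preemption = run_schedule(jobs, slice_ms=10, preemptible=True)
without_preemption = run_schedule(jobs, slice_ms=10, preemptible=False)
print(with_preemption)
print(without_preemption)
```

With pre-emption the two desktop VMs finish at 20 ms and 30 ms; without it they wait behind the compute job and finish at 510 ms and 520 ms, which is the multi-user problem the post describes.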
Hi Jason,

Thank you for the explanation on CUDA/vGPU. Yes, I can see that a pre-emptive scheduler would be required to handle the correct allocation of resources between vGPUs in the case of CUDA.

I had read up on the "time-slicing" of the compute cores to each vGPU (after I noticed that the vGPU was reporting all cores to the VM, not a subset of them, unlike the RAM allocation), but didn't know how it was actually achieved. I am guessing it is some kind of round-robin queuing in the dom0 driver? As in, it accepts graphics "operations" from each VM and then goes round executing them from each VM queue that has a pending operation, in sequence? Or is it strictly timed via the dom0? If so, what is the size of the time-slices used? Pure curiosity on my part, so if it's "secret sauce" I understand if you don't want to say :)

It would be interesting to know the effect that the VM "sees", as any time-scheduling will cause a "pulsing" in activity: most of the time nothing, then pulses of "full power" on the GPU. It must be fun ensuring that this doesn't cause any adaptive timing issues in the VM :)
The scheduler is actually in the GPU hardware. Dom0 isn't aware of it, because it happens at the hardware level. It works in exactly the same way on vSphere as it does on XenServer, with no hypervisor involvement in the GPU virtualisation at all.

When a VM boots and the vGPU profile is attached to the physical GPU, it's effectively given a minimum guaranteed slice of time, but if more is available it can be utilised. It's all done in hardware, so it's really fast.

There's a lot of clever behaviour in the cards and driver to smooth things out for the application, and we have the Frame Rate Limiter, which prevents users from experiencing wild swings in FPS when they're the only user on a physical GPU.
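The "minimum guaranteed slice, more if available" idea can be sketched as a simple sharing rule. This is an illustration only: it assumes every VM is guaranteed an equal share of a scheduling period and that unused time is split evenly among VMs that still want more. The function name, the 100 ms period, and the demand figures are all hypothetical, and the real hardware scheduler's policy is not described in this thread.

```python
def allocate_gpu_time(demands, period_ms=100.0):
    """Split one scheduling period among VMs (toy model).

    demands: dict mapping vm_name to the GPU time it wants this period (ms).
    Each VM that wants time is guaranteed at least an equal share; time a VM
    does not need is redistributed to VMs that still want more.
    """
    grants = {vm: 0.0 for vm in demands}
    remaining = period_ms
    hungry = {vm for vm, want in demands.items() if want > 0}
    while hungry and remaining > 1e-9:
        share = remaining / len(hungry)
        for vm in sorted(hungry):
            # Never grant more than the VM still wants.
            give = min(share, demands[vm] - grants[vm])
            grants[vm] += give
            remaining -= give
        # Drop VMs whose demand has been fully met.
        hungry = {vm for vm in hungry if demands[vm] - grants[vm] > 1e-9}
    return grants


# Four VMs share a GPU; only vm_a is busy enough to use spare time.
grants = allocate_gpu_time({"vm_a": 100, "vm_b": 5, "vm_c": 0, "vm_d": 20})
print(grants)
```

In this example vm_a ends up with about 75 ms of the period rather than its 25 ms equal share, because the other three VMs leave most of theirs unused.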