Dell R730s with 2 K1 cards each, running XenDesktop 7.6

We have a pool of 3 servers for our XenDesktop environment and use vGPU with the latest drivers. Today, all the users who had a vGPU attached locked up. The users who did not have a vGPU attached were fine. I could not access the console via XenCenter or ssh into the server. We tried to shut down a VM and it would hang. I am not sure if it was related, but we have a couple of XenApp servers using vGPU that I was able to RDP into and shut down. At the same time, my console and ssh started responding and all the users were able to log back on to their sessions.

I checked the logs on the XS server and noticed a few NVIDIA-related errors in kernel.log at the same time we had the issue. Perhaps we have a bad card? BTW, are there any NVIDIA-specific logs generated on XenServer?

Sep 9 08:25:40 xenserv1 kernel: [1531154.743329] NVRM: Xid (PCI:0000:86:00): 38, 003f 0000a097 00000000 00000000 00000000 00000000
Sep 9 08:25:40 xenserv1 kernel: [1531155.252162] NVRM: Xid (PCI:0000:86:00): 43, Ch 00000043, engmask 00000101
Sep 9 08:25:45 xenserv1 kernel: [1531160.223941] NVRM: Xid (PCI:0000:86:00): 43, Ch 0000003f, engmask 00000101
Sep 9 08:30:37 xenserv1 kernel: [1531451.953662] INFO: task vgpu:25831 blocked for more than 120 seconds.
Sep 9 08:30:37 xenserv1 kernel: [1531451.953673] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Sep 9 08:30:37 xenserv1 kernel: [1531451.953677] vgpu D 0000000000000000 0 25831 1 0x00000000
Sep 9 08:30:37 xenserv1 kernel: [1531451.953682] ffff88002eb67b28 0000000000000282 ffff88002eb67a68 ffffffff81007f13
Sep 9 08:30:37 xenserv1 kernel: [1531451.953686] ffff88018880b110 0000000000013fc0 ffff88005c579710 ffffffff81a13420
Sep 9 08:30:37 xenserv1 kernel: [1531451.953689] 0000000000000001 0000000000000001 0000000000000000 0000000000000001
Sep 9 08:30:37 xenserv1 kernel: [1531451.953692] Call Trace:
Sep 9 08:30:37 xenserv1 kernel: [1531451.953702] [] ? xen_flush_tlb_all+0x163/0x170
Sep 9 08:30:37 xenserv1 kernel: [1531451.953705] [] ? __xen_remap_domain_mfn_range+0xcc/0xf0
Sep 9 08:30:37 xenserv1 kernel: [1531451.953710] [] schedule+0x55/0x60
Sep 9 08:30:37 xenserv1 kernel: [1531451.953712] [] schedule_timeout+0x3a/0x200
Sep 9 08:30:37 xenserv1 kernel: [1531451.953715] [] __down+0x76/0xb0
Sep 9 08:30:37 xenserv1 kernel: [1531451.953719] [] down+0x38/0x50
Sep 9 08:30:37 xenserv1 kernel: [1531451.953781] [] os_acquire_mutex+0x37/0x50 [nvidia]
Sep 9 08:30:37 xenserv1 kernel: [1531451.953817] [] _nv010795rm+0x18/0x30 [nvidia]
Sep 9 08:30:37 xenserv1 kernel: [1531451.953867] [] ? _nv000227rm+0xd/0x30 [nvidia]
Sep 9 08:30:37 xenserv1 kernel: [1531451.953911] [] ? _nv012224rm+0x3d/0x120 [nvidia]
Sep 9 08:30:37 xenserv1 kernel: [1531451.953956] [] ? _nv012268rm+0x531/0x620 [nvidia]
Sep 9 08:30:37 xenserv1 kernel: [1531451.953999] [] ? _nv000645rm+0x12/0x20 [nvidia]
Sep 9 08:30:37 xenserv1 kernel: [1531451.954044] [] ? _nv001518rm+0x2415/0x3930 [nvidia]
Sep 9 08:30:37 xenserv1 kernel: [1531451.954081] [] ? _nv000692rm+0x700/0x860 [nvidia]
Sep 9 08:30:37 xenserv1 kernel: [1531451.954115] [] ? rm_ioctl+0x73/0x100 [nvidia]
Sep 9 08:30:37 xenserv1 kernel: [1531451.954120] [] ? __kmalloc+0x20/0x170
Sep 9 08:30:37 xenserv1 kernel: [1531451.954154] [] ? nvidia_ioctl+0x431/0x4c0 [nvidia]
Sep 9 08:30:37 xenserv1 kernel: [1531451.954159] [] ? bad_area_nosemaphore+0x13/0x20
Sep 9 08:30:37 xenserv1 kernel: [1531451.954181] [] ? nvidia_frontend_ioctl+0x39/0x80 [nvidia]
Sep 9 08:30:37 xenserv1 kernel: [1531451.954204] [] ? nvidia_frontend_unlocked_ioctl+0x1d/0x30 [nvidia]
Sep 9 08:30:37 xenserv1 kernel: [1531451.954208] [] ? vfs_ioctl+0x1d/0x40
Sep 9 08:30:37 xenserv1 kernel: [1531451.954210] [] ? do_vfs_ioctl+0x4bb/0x520
Sep 9 08:30:37 xenserv1 kernel: [1531451.954213] [] ? SyS_ioctl+0x6d/0xa0
Sep 9 08:30:37 xenserv1 kernel: [1531451.954218] [] ? system_call_fastpath+0x16/0x1b
Sep 9 08:30:37 xenserv1 kernel: [1531451.954231] INFO: task kworker/3:2:27753 blocked for more than 120 seconds.
Sep 9 08:30:37 xenserv1 kernel: [1531451.954235] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
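For anyone who wants to check whether the Xid messages always come from the same card, the kernel.log excerpt above can be summarized with a short script. This is only a sketch: it assumes the `NVRM: Xid (PCI:...): <code>, <details>` line format quoted in the post, and the names `XID_RE`, `parse_xid_events`, and `count_by_device` are made up for illustration and are not part of any NVIDIA or Citrix tool.

```python
import re
from collections import Counter

# Assumed line format, taken from the excerpt in the post:
#   ... NVRM: Xid (PCI:0000:86:00): 43, Ch 00000043, engmask 00000101
XID_RE = re.compile(
    r"NVRM: Xid \((?P<pci>PCI:[0-9a-fA-F:.]+)\): (?P<xid>\d+),\s*(?P<detail>.*)"
)


def parse_xid_events(lines):
    """Return (pci_address, xid_code, detail) for every Xid line found."""
    events = []
    for line in lines:
        match = XID_RE.search(line)
        if match:
            events.append(
                (match.group("pci"), int(match.group("xid")), match.group("detail").strip())
            )
    return events


def count_by_device(events):
    """Count how often each (pci_address, xid_code) pair occurs."""
    return Counter((pci, xid) for pci, xid, _ in events)


# Sample input copied from the post; the last line is not an Xid line
# and is expected to be skipped.
SAMPLE = [
    "Sep 9 08:25:40 xenserv1 kernel: [1531154.743329] NVRM: Xid (PCI:0000:86:00): 38, 003f 0000a097 00000000 00000000 00000000 00000000",
    "Sep 9 08:25:40 xenserv1 kernel: [1531155.252162] NVRM: Xid (PCI:0000:86:00): 43, Ch 00000043, engmask 00000101",
    "Sep 9 08:25:45 xenserv1 kernel: [1531160.223941] NVRM: Xid (PCI:0000:86:00): 43, Ch 0000003f, engmask 00000101",
    "Sep 9 08:30:37 xenserv1 kernel: [1531451.953662] INFO: task vgpu:25831 blocked for more than 120 seconds.",
]

if __name__ == "__main__":
    for (pci, xid), count in sorted(count_by_device(parse_xid_events(SAMPLE)).items()):
        print(f"{pci} Xid {xid}: {count} event(s)")
```

To run it against a real log, replace `SAMPLE` with the lines of the file, for example `open(path).readlines()`, where `path` is wherever kernel.log lives on your host. If every event reports the same PCI address, that at least narrows the problem down to one physical GPU.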
2 replies
We opened a case with Citrix, and because of the errors listed they want us to apply XS65ESP1005, 109, and 1010.
I have a very similar problem! Did the updates solve your problem?