[Beowulf] How to debug error with Open MPI 3 / Mellanox / Red Hat?
John Hearns
hearnsj at googlemail.com
Tue Apr 30 09:24:30 PDT 2019
Hello Faraz. Please start by running this command: ompi_info
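
If the ompi_info output is long, a small script can pull out just the BTL components, so you can see whether openib was built into this Open MPI install at all. This is only a rough sketch. It assumes the usual human-readable line format ("MCA btl: openib (MCA v..., API v..., Component v...)"), and the helper names list_components and run_ompi_info are made up for illustration; they are not part of Open MPI.

```python
import re
import shutil
import subprocess

# Matches human-readable ompi_info lines such as:
#   MCA btl: openib (MCA v2.1.0, API v3.0.0, Component v3.0.2)
# (assumed format; adjust the pattern if your build prints something different)
_MCA_LINE = re.compile(r"^\s*MCA\s+(\S+):\s+(\S+)\s+\((.*)\)\s*$")


def list_components(ompi_info_text, framework="btl"):
    """Return the component names listed for one MCA framework."""
    names = []
    for line in ompi_info_text.splitlines():
        match = _MCA_LINE.match(line)
        if match and match.group(1) == framework:
            names.append(match.group(2))
    return names


def run_ompi_info():
    """Run ompi_info from PATH and return its stdout, or None if it is not found."""
    exe = shutil.which("ompi_info")
    if exe is None:
        return None
    return subprocess.run([exe], capture_output=True, text=True).stdout


if __name__ == "__main__":
    text = run_ompi_info()
    if text is None:
        print("ompi_info not found on PATH; is the Open MPI bin directory set up?")
    else:
        btls = list_components(text)
        print("BTL components:", ", ".join(btls) or "(none)")
        print("openib built in:", "openib" in btls)
```

Run it on both machines with the same environment you use for mpirun, so that it picks up the same Open MPI install.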
On Tue, 30 Apr 2019 at 15:15, Faraz Hussain <info at feacluster.com> wrote:
> I installed RedHat 7.5 on two machines with the following Mellanox cards:
>
> 87:00.0 Network controller: Mellanox Technologies MT27520 Family [ConnectX-3 Pro]
>
> I followed the steps outlined here to verify RDMA is working:
>
>
> https://community.mellanox.com/s/article/howto-enable-perftest-package-for-upstream-kernel
>
> However, I cannot seem to get Open MPI 3.0.2 to work. When I run it, I
> get this error:
>
> --------------------------------------------------------------------------
> No OpenFabrics connection schemes reported that they were able to be
> used on a specific port. As such, the openib BTL (OpenFabrics
> support) will be disabled for this port.
>
> Local host: lustwzb34
> Local device: mlx4_0
> Local port: 1
> CPCs attempted: rdmacm, udcm
> --------------------------------------------------------------------------
>
> Then it just hangs until I press Ctrl-C.
>
> I understand this may be an issue with RedHat, Open MPI or Mellanox.
> Any ideas on how to work out which of them is at fault?
>
> Thanks!