[Beowulf] Defective Mellanox EDR Switches

Ryan Novosielski novosirj at rutgers.edu
Wed Jun 6 20:12:43 PDT 2018


One slight correction: 100% of our switches with FRU PN 00WE097/PN 00WE096Y manufactured on 2016-11-28 (quantity 3) have failed, as has one switch with the same FRU PN/PN manufactured on 2016-12-15. We have another switch, FRU PN 00WE093/PN 00WE092Y, also manufactured on 2016-11-28, that has so far been OK, but I’m now suspicious of it.

--
____
|| \\UTGERS,  	 |---------------------------*O*---------------------------
||_// the State	 |         Ryan Novosielski - novosirj at rutgers.edu
|| \\ University | Sr. Technologist - 973/972.0922 (2x0922) ~*~ RBHS Campus
||  \\    of NJ	 | Office of Advanced Research Computing - MSB C630, Newark
     `'

> On Jun 6, 2018, at 6:48 PM, Ryan Novosielski <novosirj at rutgers.edu> wrote:
> 
> Something to be aware of, potentially, if you happen to own any of this equipment:
> 
> 100% of our Mellanox Switch-IB 2 SB7890 EDR externally-managed switches that were manufactured on 2016-11-28 have failed. I’ve been told there was a manufacturing defect related to capacitors in the voltage regulators.
> 
> Mellanox apparently didn’t see fit to notify us, even after diagnosing one of our switches, and has been slow to offer specifics about the remedy or about which manufacturing dates are affected, so hopefully this information can be of use to someone else. From what I gathered from Mellanox, it’s possible that a software update fixes this, but I’ve not been able to find anything specific yet.
> 
> The symptom is all switch port lights turning amber, and all connectivity being lost. A power cycle corrects the problem — until the next time it happens.
> 
