sefaucher

MegaRAID with Samsung 860 Pro (megasas35 & MR_MONITOR in Event Viewer)

We recently bought a Cisco C240 M5SX server which came with a "Cisco 12G Modular Raid Controller with 4GB cache (max 26 drives)". It appears to be a Tri-Mode LSI/Avago/Broadcom controller, since it supports NVMe drives and works with the MegaRAID Storage Manager software, though I could not find an exact match on the Broadcom web site. Since we had great luck with Samsung 850 Pro drives on our older C240 M3 servers (which use LSI 9271-8i controllers), we chose to go with the 860 Pro in the new server. We ran the Cisco HUU (Host Upgrade Utility), brought all firmware up to the latest versions, and thought everything was fine. Then we started to experience slowdowns that last for 5 to 10 seconds and noticed that the Event Viewer had Event ID 129 from megasas35 and Event ID 268 from MR_MONITOR. Our initial thought was that the controller does not like the 860 Pro, so we bought a couple of Intel S4500 drives (firmware 0100), since the S4500 is listed as a compatible SSD in the Cisco manual. So far the S4500s are working much better, but we won't know for sure that this fixed it until we replace the remaining 860 Pro drives (all of our SSDs are in JBOD mode, by the way). Just wondering if anyone else has any experience with the 860 Pro on similar controllers, or knows how to troubleshoot megasas35 and MR_MONITOR errors.

MegaRAID Storage Manager 17.05.01.03 shows...
BIOS Version 7.01.05.3_4.17.08.00_0x07010704
Firmware Package Version 50.1.0-1456
Firmware Version 5.010.00-1445
Firmware Build Time Apr 26 2018 12:20:22
SSD Guard Enabled
SSD Disk Cache Setting Disabled
Power savings on unconfigured drives Disabled
Power savings on hot spares Disabled

1TB Samsung 860 Pro (firmware 1B6Q)
Windows Server 2012 R2
Cisco 12G Controller device driver 7.703.7.0


Consumer SSDs still fall into the "it's great if they work" category, but when firmware problems start to show up, the usual answer is to use the appropriate drive for the application. Have you tried the light enterprise version of the 860 Pro?

I wonder if the slowdowns are related to the over-provisioning differences between those two drives. If you hit a write workload where the drive needs to run garbage collection (GC), that could be the slowdown. To rule this out you would need to manually over-provision the SSDs with something like hdparm, then put them back into the RAID group. Those S4500s
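For what it's worth, the arithmetic behind manual over-provisioning can be sketched as below. This is only a rough illustration, not a tested procedure: the helper names (`op_target_sectors`, `hdparm_command`) are made up for this post, the 4 KiB alignment is my own choice, and the command string assumes hdparm's `-N` (set max visible sectors, `p` prefix for permanent) behaves as described in its man page. Shrinking the visible area can destroy data, and it is an open question whether the command passes through the controller in JBOD mode, so check both before running anything.

```python
def op_target_sectors(native_sectors: int, op_percent: int) -> int:
    """Visible sector count that hides op_percent of the native capacity."""
    if native_sectors <= 0:
        raise ValueError("native_sectors must be positive")
    if not 0 <= op_percent < 100:
        raise ValueError("op_percent must be in the range 0..99")
    visible = native_sectors * (100 - op_percent) // 100
    # Round down to a multiple of 8 sectors (4 KiB with 512-byte sectors).
    # The alignment is an assumption of this sketch, not an hdparm requirement.
    return visible - visible % 8


def hdparm_command(device: str, native_sectors: int, op_percent: int) -> str:
    """Build (but do not run) the hdparm call that would shrink the drive."""
    target = op_target_sectors(native_sectors, op_percent)
    # 'p' makes the new limit permanent; hdparm asks for the extra
    # confirmation flag because this operation can cause data loss.
    return f"hdparm -Np{target} --yes-i-know-what-i-am-doing {device}"


if __name__ == "__main__":
    # Hypothetical numbers: read the real native max with `hdparm -N /dev/sdX`.
    print(hdparm_command("/dev/sdX", native_sectors=1000, op_percent=10))
```

The script only prints the command so it can be reviewed first; `/dev/sdX` and the sector count are placeholders.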


None of our SSDs are in a RAID group (not even a single-member RAID-0) ... they are all in JBOD mode. Is there a way to query when GC is done (perhaps with MSM or MegaCLI) so we could cross-reference the Event Viewer logs to see if GC happens at the same time as the megasas35 and/or MR_MONITOR events?
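Whether GC timing can be queried at all is the open question, but the cross-referencing step itself could look something like the sketch below. It assumes the two sets of timestamps have already been collected somehow (for example the Event ID 129 times exported from Event Viewer, and whatever other log is being compared against); the function name `events_near` and the 10-second window are illustrative choices, not anything provided by MSM or MegaCLI.

```python
from bisect import bisect_left, bisect_right
from datetime import datetime, timedelta


def events_near(reference, candidates, window_seconds=10):
    """Map each reference timestamp to the candidate timestamps that fall
    within +/- window_seconds of it."""
    window = timedelta(seconds=window_seconds)
    ordered = sorted(candidates)
    matches = {}
    for ts in reference:
        # Binary search for the slice of candidates inside the window.
        lo = bisect_left(ordered, ts - window)
        hi = bisect_right(ordered, ts + window)
        matches[ts] = ordered[lo:hi]
    return matches


if __name__ == "__main__":
    # Made-up timestamps purely to show the call; not real log data.
    megasas35 = [datetime(2018, 6, 1, 10, 0, 5), datetime(2018, 6, 1, 11, 30, 0)]
    other_log = [datetime(2018, 6, 1, 10, 0, 0), datetime(2018, 6, 1, 12, 0, 0)]
    for ts, hits in events_near(megasas35, other_log).items():
        print(ts, "->", hits)
```

A megasas35 timestamp with an empty list had nothing from the other log close to it; timestamps that keep matching would be worth a closer look.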
