sub.mesa

Member
  • Content Count: 25
Community Reputation: 0 Neutral

About sub.mesa

  • Rank: Member
  1. sub.mesa

    Hardware RAID and AF 4k with XP

    That is a bug in HDtune; it blindly assumes disks have 512-byte sectors. It is shitty software, and that is not the only thing bad about it. It has confused many people into believing something is wrong (i.e. with their performance figures).

    Stripe size is very much misunderstood as well. Writing a file smaller than the stripe size will not cause the entire stripe block to be written; the stripe size is only a mechanism that determines which data ends up on which drive. In most cases, the stripe size can only be too small, not too large. Some users use stripe sizes of up to 16 megabytes. Popular belief has it that this causes very weak performance on 4K random I/O, but the opposite is true: larger stripe sizes favor small or random I/O - i.e. small files - while smaller stripe sizes are better for sustained sequential I/O like large files.
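
    To make that concrete, here is a rough sketch in Python (the disk count and stripe size are made-up values, not from any particular controller): the stripe size only decides which member disk a given offset lands on, so a small write touches one disk rather than the whole stripe.

        # Sketch: how a striped array maps a byte offset to a member disk.
        # disk_count and stripe_size are illustrative values.
        disk_count = 4
        stripe_size = 128 * 1024  # 128 KiB per stripe unit

        def locate(offset):
            stripe_unit = offset // stripe_size   # which stripe unit overall
            disk = stripe_unit % disk_count       # which member disk it lands on
            offset_on_disk = ((stripe_unit // disk_count) * stripe_size
                              + offset % stripe_size)
            return disk, offset_on_disk

        # A 4K write at offset 1 MiB touches exactly one disk:
        print(locate(1024 * 1024))  # -> (0, 262144)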
  2. Can you post the SMART data of each harddrive? In particular, bad sectors and cabling errors are interesting. You have used OCE, or Online Capacity Expansion. This procedure is not without risk; bad sectors in particular pose a high risk during the expansion. You should always do a rebuild before expanding, to minimise the risk of making your array inaccessible due to an aborted expansion attempt.
  3. sub.mesa

    RAID 5 with bad / fixed sectors

    But can you provide at least one controller/firmware combination that actually does this? To date, no one has been able to confirm to me any actual product that uses redundancy in the case of an unreadable sector; only ZFS does that. But you are right, it is at least theoretically possible. However, conventional RAID can never know whether the parity or the mirrored copy is correct or stale. RAID cannot distinguish between corrupt data and valid data; it lacks the facilities required to make that determination, such as checksums. It can only recognise that the data and parity are not in sync. Virtually all RAID controllers will rebuild the parity, blindly assuming that the data is good and the parity is bad (see the sketch at the end of this post).

    By convention, the controller timeout value cannot be less than 10 seconds. TLER is typically set at 7 seconds, to cope with even the strictest controllers that employ 10-second timeouts. If the harddrive does not provide the requested data within 10 seconds, it is detached and marked as failed. This pretty much means such RAIDs are extremely sensitive and virtually incompatible with modern disks, which by design produce bad sectors due to insufficient ECC error correction.

    Mark the sector as bad? You mean marking it as Current Pending Sector in the SMART output? All drives do this; consumer drives simply spend more time on recovery before giving up - typically 120 seconds. Any good technology would be able to cope with this, as it is easy to send a reset command and go on with life. Only primitive firmware RAID systems appear to have problems with this kind of drive. Generally this means you need TLER drives for old-fashioned RAID controllers, while modern software RAID implementations on Linux and BSD platforms, as well as ZFS, do not require special disks with TLER support and work just fine with casual consumer drives.

    In fact, the TLER feature can be dangerous; it is nothing more than an ugly hack. Assume you have a RAID5 where one drive has completely failed. This means you run degraded - basically a RAID0. In this circumstance, where you have lost your redundancy, you are at the mercy of bad sectors, and it is extremely common to encounter these during the rebuild of the RAID5 with a new disk: one or more disk members will encounter bad sectors. If you have TLER disks and lose your redundancy, this pretty much means data corruption or even a failed array, as many controllers kick out disks with bad sectors even when they return quick I/O errors. Without TLER, you leave the recovery methods of the harddrive intact. This means that in degraded conditions you still have a last line of defence, which otherwise would have been killed by TLER.

    It pleases me to read this. I have helped many people with broken RAIDs; hardware RAID like Areca and software RAID like Intel driver RAID. So many people lose their data due to incompetent software engineering. The whole TLER issue is just sad; basically an incompatibility between hardware and software. Even ordinary consumers deserve better protection for their data!
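
    To make the parity point above concrete, here is a minimal Python sketch (single bytes stand in for whole sectors): XOR parity lets the array see that data and parity disagree, but nothing tells it which block is the stale one.

        # RAID5-style XOR parity over three data "blocks" (here, single bytes).
        d = [0b10110010, 0b01101100, 0b11110000]
        parity = d[0] ^ d[1] ^ d[2]

        # Silently corrupt one data block (e.g. a lost or misdirected write):
        d[1] ^= 0b00000100

        # The array can detect that something is out of sync...
        print(parity != (d[0] ^ d[1] ^ d[2]))  # True: mismatch detected

        # ...but without checksums it cannot tell whether the parity or one
        # of the data blocks is stale. A typical controller "repairs" by
        # recomputing parity, silently keeping the corrupt data:
        parity = d[0] ^ d[1] ^ d[2]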
  4. Configure your external harddrive to 'optimize for quick removal' instead of 'optimize for performance'. This disables write-back caching on the external harddrive, which reduces the chance of filesystem metadata corruption in case of a crash, disconnect, or power failure.
  5. sub.mesa

    secure data transfer

    Transferring crucial data should not be done with technology that provides no fundamental protection against corruption. You really should use ZFS for this task, combined with the zfs send/receive functionality, to utilise its end-to-end data integrity feature. Only with this kind of protection can you be reasonably sure that your data was not corrupted during transfer. A legacy solution would be rsync with the checksum option enabled, but this requires a lot of time on both client and server while providing only minimal checksum protection. If you desire protection, ZFS is the way to go; there is no (usable) substitute at this time.
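
    For the legacy route, a rough Python sketch of what checksum verification amounts to (the paths are placeholders): hash the file on both ends and compare. This is roughly what rsync's checksum mode does per file, and what ZFS does continuously per block.

        import hashlib

        def sha256_of(path, chunk=1 << 20):
            """Hash a file in 1 MiB chunks so large files fit in memory."""
            h = hashlib.sha256()
            with open(path, "rb") as f:
                while block := f.read(chunk):
                    h.update(block)
            return h.hexdigest()

        # Placeholder paths: compare the source with the transferred copy.
        if sha256_of("/source/data.img") == sha256_of("/backup/data.img"):
            print("transfer verified")
        else:
            print("corrupted during transfer")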
  6. sub.mesa

    ARE SSDs

    Your motherboard indeed doesn't matter. But you were also asking about more reliable SSDs. Be aware that only two consumer SSDs are inherently safe: the Intel 320 and the Crucial M500. These have sufficient protections, like power-safe capacitors and RAID4-style bit correction on the NAND, both of which are crucial for preventing corruption of your data. SSDs powered by other controllers, like the Crucial M4, Samsung 830/840 and Sandforce SSDs like OCZ, are inherently unsafe; basically, they are designed to fail or become corrupt on sudden power loss. About 90% of all consumer SSD failures are due to software issues, not faulty hardware.
  7. sub.mesa

    Dead drive in Solaris ZFS pool

    I'm pretty sure that if you ask a good data recovery company to create a sector-correct copy of your failed disks, they will understand what that means. If you copy the exact same contents of a disk used by ZFS to another disk, ZFS will identify that disk as being part of your pool. So yes, this should work. I'm puzzled, however: you were running a RAID0 with 16 disks on ZFS? ZFS can of course provide superior protection for your data, but not in RAID0 mode.
  8. sub.mesa

    RAID 5 with bad / fixed sectors

    RAID protects against failed disks. The problem is that it offers no protection against bad sectors: from the RAID engine's view, a drive with a bad sector is a defective drive. That is one of the major shortcomings of most RAID systems; they treat harddrives in a very binary way. Either your harddrive works fine without bad sectors, or your drive gets kicked out after a few bad sectors. Even worse, beyond the RAID layer lies a very old-fashioned filesystem that belongs to a different era. Today's filesystems are very outdated in the sense that they offer no protection at all for your files: no protection for metadata, no protection against corruption, no protection against misordered writes, virtually nothing.

    In your case I'm not sure how you fixed all those bad sectors. If you used SpinRite like you said, you should have the original contents of the bad sectors and thus no damage. But somehow I think that HDD Regenerator simply overwrites bad sectors with zeroes; in that case the data corruption is permanent. You can run a filesystem check (fsck) on your filesystem and see what damage it reveals (a sketch at the end of this post shows a do-it-yourself checksum check). For files not detected as bad, the metadata structure is intact, but it is very possible and even likely that the files themselves are corrupted to varying degrees. You can notice this as corrupt archives or 'bleeps' and 'artifacts' in audio/video files. Corruption is potentially very dangerous.

    If you desire a solution that is highly tolerant of bad sectors and makes it virtually impossible to lose data just because of bad sectors, then ZFS is your man. It offers protection against the dangers you are exposed to at this very moment, and would have prevented the damage caused by bad sectors in your case. If you want to provide good protection for your data, migrating to ZFS is a possibility. You can always keep what you have and use it as a backup solution, or sell it off.
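
    If you stay on your current filesystem for a while, you can approximate this kind of detection yourself. A small Python sketch ("/data" and "manifest.json" are placeholder names): record a checksum per file once, then re-hash later to find files that changed without ever being rewritten.

        import hashlib, json, pathlib

        def manifest(root):
            """Map every file under root to its SHA-256 checksum."""
            return {str(p): hashlib.sha256(p.read_bytes()).hexdigest()
                    for p in pathlib.Path(root).rglob("*") if p.is_file()}

        # First run: record the checksums ("/data" is a placeholder path).
        pathlib.Path("manifest.json").write_text(json.dumps(manifest("/data")))

        # Later run: any file whose hash changed without being deliberately
        # rewritten is silent corruption the filesystem never noticed.
        stored = json.loads(pathlib.Path("manifest.json").read_text())
        current = manifest("/data")
        print([f for f, h in stored.items() if current.get(f) != h])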
  9. The RRER value is not critical; it is only slightly below the threshold, which is why it gives a SMART error. The actual issue is the high number of unreadable sectors, seen in Current Pending Sector. This shows all the signs of media surface problems, which are very different from the mechanical issues that cause a harddrive to stop working suddenly. Instead, media surface problems increase the likelihood of unreadable sectors occurring. This appears to be your issue; this harddrive is indeed no longer reliable and should be replaced under warranty.
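
    Roughly, the logic behind reading such a report, as a Python sketch (the attribute numbers below are invented for illustration, not taken from your drive): a SMART error fires when a normalised value drops to or below its threshold, but the raw Current Pending Sector count is the number that actually tells you sectors are unreadable.

        # Invented example values; (value, threshold, raw) per attribute.
        attrs = {
            "Raw_Read_Error_Rate":    (50, 51, 200131),
            "Current_Pending_Sector": (100, 0, 184),
        }

        for name, (value, threshold, raw) in attrs.items():
            tripped = value <= threshold  # what makes SMART report an error
            print(f"{name}: raw={raw}, SMART error={tripped}")

        # Raw_Read_Error_Rate trips the SMART error (50 <= 51) yet is only
        # barely below threshold; the 184 pending sectors are the real
        # sign of media surface problems.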
  10. sub.mesa

    RAID vs single drive: same speed

    Remember that Intel uses write-back caching and as such can give odd performance results when testing with small test sizes. Try testing with bigger files. A hardware RAID controller would show similar behaviour: thanks to its write-back onboard memory, you will not notice any slowness from the disks until this buffer is full. It simply 'hides' a collection of writes from the host system. I suggest testing with CrystalDiskMark to get more information about filesystem performance.
  11. sub.mesa

    RAID 0 1 5 10

    Brian, RAID5 has more tolerance for failure than RAID1? That's quite an odd statement. I'd recommend RAID10 (RAID1+0) or RAID6, but keep in mind that no matter how much redundancy you have, you still need a proper backup to protect against other risks, including the risk of failure of the RAID layer itself. If you want the best protection for your files, then the ZFS filesystem is the only viable option, since it offers formidable protection against corruption and bad sectors, and a very reliable RAID layer integrated with a solid filesystem. In most cases, the choice for ZFS implies building a NAS server connected to the local network. The best alternative would focus primarily on (incremental) backups, since conventional RAID layers and old-fashioned filesystems are inherently unsafe.
  12. Your choice is one of the lowest quality RAID5 solutions, with a lot of problems that affect performance, stability and data protection. The Rosewill DAS uses a Silicon Image port multiplier and is considered FakeRAID, meaning Silicon Image Windows-only drivers do all the RAID work. These are very low quality drivers which can put your data at risk:

      • low quality RAID5 can harm your data
      • a port multiplier means very poor performance
      • FakeRAID on Windows means you need TLER disks, or your disks will drop out of the RAID when encountering bad sectors
      • there is no method to expand the RAID5

    I strongly urge you to reconsider your current choice, since many people are very displeased with Silicon Image. I also understand you want to use RAID5 without any form of backup. This, combined with the fact that you are going to use the lowest quality RAID5 solution available, means you have a high risk of losing data over time. If you cannot afford a backup, a much better solution is FreeNAS or ZFSguru, which both support the reliable ZFS filesystem. That offers much better protection for your data, is a lot faster, and offers more features than a low-quality FakeRAID solution. It would mean you run a NAS solution instead of a DAS, but that is to circumvent the limitations of the Windows operating system related to safely storing data. It does require a separate system with at least a 64-bit CPU and 2GB+ memory, however. The upside is that you can use cheap Samsung F4EG disks without needing expensive RAID edition drives featuring TLER, or expensive hardware RAID controllers with a BBU.
  13. sub.mesa

    Software Raid5 Write Performance

    Try a reasonable benchmark: not 1GB, but 16 or 32GB. If the scores are significantly lower, that is the actual performance. Are you using properly aligned partitions?
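
    On alignment: a partition is properly aligned for Advanced Format drives when its starting byte offset is a multiple of 4096. A quick Python check (start_sector is whatever your partitioning tool reports):

        # A partition starting at 512-byte sector start_sector is 4K-aligned
        # when its byte offset divides evenly by 4096.
        def aligned_4k(start_sector, logical_sector_size=512):
            return (start_sector * logical_sector_size) % 4096 == 0

        print(aligned_4k(63))    # False: the classic misaligned XP-era start
        print(aligned_4k(2048))  # True: the usual 1 MiB-aligned start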
  14. sub.mesa

    Storage gurus...your assistance please

    You have a write-back mechanism at work. That means that tests with a small size will test the controller's memory, not the actual disks. So try CrystalDiskMark and use the largest test size available (16GiB, I think?). The test size should be at least 8 times the combined size of all write-back caches.
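
    Put concretely, a small Python calculation (the cache sizes are invented for the example): add up every write-back cache in the path and multiply by at least 8.

        # Invented cache sizes for a system with a hardware RAID controller.
        caches_mib = {
            "controller write-back RAM": 512,
            "drive DRAM caches (4 x 64 MiB)": 256,
            "OS write buffer (estimate)": 1024,
        }
        total = sum(caches_mib.values())  # 1792 MiB combined
        print(f"minimum test size: {8 * total} MiB")  # 14336 MiB, ~14 GiB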
  15. Why did you disable disk cache? Any improvement to latency when enabling it?