[TriLUG] 512/4096B cluster size incompatibilities

David Burton via TriLUG trilug at trilug.org
Sat Dec 29 16:48:12 EST 2018


I believe that all 4 TB drives use 4K hardware sectors.

Most of them (maybe all?) use a sector format called "512e". The "e" stands
for "emulated." Under the covers they actually have 4096-byte physical
sectors. So if you ever write a single "emulated" 512-byte sector to the
drive (instead of N×8 sectors), the drive has to do a read-modify-write
operation (with the write delayed until the next rotation!) -- slooooow.

Everybody knows that's going on, so modern partitioning software always
makes sure to align partitions on at least 4 KB (8 sector) boundaries.
(Actually, they usually align on 1 MiB boundaries, because SSDs typically
prefer 256KiB or 512KiB alignment.)

Also, modern file systems default to clusters of 4 KB (or a multiple of 4 KB).

So, in practice, the drives never have to do those read-modify-write
operations, because they always see block-writes which start on LBA numbers
divisible by 8, and which contain an exact multiple of 8 sectors. They
never actually encounter LBA numbers in which the low three bits are
non-zero.  (If you want to ruin the performance of a drive, then hack the
partition table so that the partitions don't start on 4 KB boundaries. It
should still work, but writes will be slow as mud.)
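
To make the "divisible by 8" rule concrete, here is a minimal Python sketch.
It is only my illustration (the function name is made up, and it assumes a
512e drive with 4096-byte physical sectors); it checks whether a write, given
in 512-byte LBA units, would force a read-modify-write:

```python
SECTORS_PER_PHYSICAL = 4096 // 512  # 8 emulated sectors per physical sector


def needs_read_modify_write(start_lba, sector_count):
    """Return True if a 512e write covers part of a 4 KB physical sector."""
    # Aligned start: the low three bits of the LBA are zero.
    start_aligned = (start_lba & (SECTORS_PER_PHYSICAL - 1)) == 0
    # Whole-sector length: an exact multiple of 8 emulated sectors.
    length_aligned = (sector_count % SECTORS_PER_PHYSICAL) == 0
    return not (start_aligned and length_aligned)


# LBA 2048 is the usual 1 MiB partition start (2048 * 512 bytes).
print(needs_read_modify_write(2048, 8))  # False
# LBA 63 is the old DOS-style partition start, which is misaligned.
print(needs_read_modify_write(63, 8))    # True
```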

Seagate also makes some drives which are addressed with "native 4K" ("4Kn")
LBA sector numbers. But I think those are all huge drives (6 TB and up).
Anyhow, they have different model numbers.

So I doubt there's a hardware or firmware difference between your two
drives. It must be a partitioning / formatting issue. What does
fdisk -lu   show?

Or maybe it's a hardware issue. What does   smartctl -a /dev/sdX   show
for each drive?
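
Another way to compare the two drives: on Linux the kernel reports each block
device's logical and physical sector sizes under /sys/block. Here is a rough
Python sketch for reading them. The function names and the sysfs_root
parameter are my own inventions for illustration; only the two sysfs file
names come from the kernel.

```python
from pathlib import Path


def sector_sizes(device, sysfs_root="/sys/block"):
    """Return (logical, physical) sector sizes in bytes for e.g. 'sdb'."""
    queue = Path(sysfs_root) / device / "queue"
    logical = int((queue / "logical_block_size").read_text())
    physical = int((queue / "physical_block_size").read_text())
    return logical, physical


def describe(logical, physical):
    """Map a (logical, physical) pair to the usual shorthand."""
    if (logical, physical) == (512, 512):
        return "512n"
    if (logical, physical) == (512, 4096):
        return "512e"
    if (logical, physical) == (4096, 4096):
        return "4Kn"
    return "unusual"
```

For example, describe(*sector_sizes("sdb")) should give "512e" for a drive
that emulates 512-byte sectors on top of 4096-byte physical sectors. Keep in
mind that this reports what the kernel sees for that device node.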


Note:
2³² sectors of 512 bytes is 2 TiB (about 2.2 TB).
2³² sectors of 4096 bytes is 16 TiB (about 17.6 TB).
That's probably one reason that all 4 TB drives use 4K hardware sectors,
internally. Of course, the main reason is that 4 KB sectors improve storage
efficiency, by incurring the per-sector overhead only 1/8 as often. Even
after expanding the ECC field for improved reliability, going to 4 KB
sectors still saves a huge amount of wasted space. But, also, if a 4 TB
drive had 512-byte physical sectors then the sectors couldn't even be
addressed, internally, within the drive, without using more than 32 bits
for the sector numbers. Modern drives support LBA48, so that's doable, but
I'll bet that inside the drive's firmware they're using 32 bits for sector
numbers.
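
The arithmetic behind that note, as a quick Python sketch. The function names
are mine, and I'm assuming a nominal 4 TB drive of exactly 4×10¹² bytes (real
drives differ slightly):

```python
TIB = 1 << 40  # one tebibyte in bytes


def max_addressable_bytes(sector_bytes, address_bits=32):
    """Largest capacity reachable with address_bits-wide sector numbers."""
    return (1 << address_bits) * sector_bytes


def bits_needed(drive_bytes, sector_bytes):
    """Bits required to number every sector, counting from sector 0."""
    sectors = drive_bytes // sector_bytes
    return (sectors - 1).bit_length()


print(max_addressable_bytes(512) / TIB)   # 2.0
print(max_addressable_bytes(4096) / TIB)  # 16.0

drive_bytes = 4 * 10**12  # nominal 4 TB (assumption, see above)
print(bits_needed(drive_bytes, 512))      # 33, too many for 32-bit numbers
print(bits_needed(drive_bytes, 4096))     # 30, fits comfortably
```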

Dave


On Sat, Dec 29, 2018 at 2:19 PM Joseph Mack NA3T via TriLUG <
trilug at trilug.org> wrote:

> I've just found that the logical cluster size matters to the user. I assumed
> this was a hardware layer problem sufficiently deep that it would not directly
> affect me.
>
> I have two identical model Seagate 4TB drives, bought the same day, from Intrex
> (this was a while ago)
>
> Model Family:     Seagate Desktop HDD.15
> Device Model:     ST4000DM000-1F2168
>
> Both have been functioning perfectly (as data disks) since I bought them.
>
> To avoid any tom-foolery from the manufacturer, I have been buying the same
> submodel disks (M000) from Amazon, ever since, even though I'm not buying the
> current submodel anymore.
>
> As I found out this morning, despite the same model and submodel numbers, the
> two disks aren't identical. One has a 512B logical cluster size and the other
> has a 4096B logical cluster size.
>
> I wanted to copy one disk to another. I had a bunch of external drive
> enclosures. I dd'ed from the source disk (4096B cluster) to the target disk
> (512B cluster), without any errors from dd. However the target disk had a crazy
> single partition, which wouldn't mount. I confirmed that I could write to the
> target disk by erasing its partition table (copying /dev/zero to the beginning
> of the disk).
>
> The same thing (no errors) happened using ddrescue resulting in the same crazy
> partition table.
>
> Thinking the problem was the drive enclosure for the target disk (new and
> supposed to work with 8TB disks), I got out an old enclosure that was known to
> work with 4TB disks and used it on the target disk, again with the same result
> (ddrescue, no errors).
>
> I concluded there was something about the target disk and not the enclosure.
> Rather than dd'ing from the source disk, I partitioned the target disk to have
> the same sized partitions as the source disk (so I could copy files into the
> partitions). I noticed that the sector numbering was quite different. Then I
> noticed that the logical sector sizes were different.
>
> I also noticed that the 512B 4TB disk wasn't recognised by one of my external
> enclosures (not seen in /dev/sd* while the LED on the front of the enclosure
> flashed rapidly), while the 4096B 4TB disk was recognised.
>
> I take it you can't dd or ddrescue between disks of different cluster sizes and
> if you do, you won't get error messages.
>
> I also take it that external enclosures for 4TB disks need to have the right
> cluster size for the disk, in order for the disk to be recognised.
>
> Joe
>
> --
> Joseph Mack NA3T EME(B,D), FM05lw North Carolina
> jmack (at) wm7d (dot) net - azimuthal equidistant
> map generator at http://www.wm7d.net/azproj.shtml
> Homepage http://www.austintek.com/ It's GNU/Linux!

