"> 2) The cacheline fetch would get more data faster. The data would
> be transferred in the first 6 beats of the load from RAM (assuming a
> 64-bit data bus) rather than waiting for 7, so you’d finish the copy
> 1 ns sooner or so. Similar 1-cycle win on a 128-bit Ln->L(n-1) cache
> transfer."
— Re: [PATCH 1/8] drivers/random: Cache align ip_random better [LWN.net]
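The beat arithmetic in the quote can be sketched roughly as follows. This is only an illustration under assumptions the quoted message does not state: a 64-byte cacheline, a burst delivered in address order starting at offset 0 (no critical-word-first reordering), and hypothetical names `LINE_BYTES`, `BUS_BYTES`, and `beats_needed`. The object size and offsets in the usage note are likewise made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed parameters, not taken from the quoted message. */
enum {
    LINE_BYTES = 64, /* assumed cacheline size in bytes */
    BUS_BYTES  = 8   /* 64-bit data bus: 8 bytes arrive per beat */
};

/*
 * Number of beats that must complete before bytes [off, off + len) of a
 * cacheline have all arrived, assuming the line is bursted in address
 * order starting at offset 0.
 */
static size_t beats_needed(size_t off, size_t len)
{
    /* The range must be non-empty and fit inside one cacheline. */
    assert(len > 0 && off + len <= LINE_BYTES);

    /* Index of the beat carrying the last byte, plus one. */
    return (off + len - 1) / BUS_BYTES + 1;
}
```

With these assumptions, a hypothetical 48-byte object placed at offset 0 is complete after `beats_needed(0, 48)` = 6 beats, while the same object at offset 8 needs `beats_needed(8, 48)` = 7 beats, which is the kind of one-beat difference the quote describes.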