Memory barrier for fat initialization

Tue Jan 13 15:58:14 UTC 2015

tg at gmplib.org (Torbjörn Granlund) writes:

> My understanding is that the AMD64 as well as older x86 architectures do
> not allow store/store reordering, except when explicitly told otherwise.

I'm a bit confused about this. If so, when are memory barriers (mfence
and friends) ever needed? I have only a vague idea of how memory models
work in theory and in practice. I'm thinking about something like:

  cpu0 for some reason has parts of cpuvec cached in a local L1 cache.

  cpu1 writes cpuvec and __gmpn_cpuvec_initialized = 1.

  cpu0 reads __gmpn_cpuvec_initialized, from shared L2 cache,
  gets the updated value, 1.

  cpu0 reads a cpuvec entry and gets the old value from its local L1
  cache. Do the architecture specs rule out this possibility?
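
To make the question concrete, here is a minimal sketch of the same
publish-then-read pattern written with C11 atomics, where the ordering is
requested explicitly instead of relying on what the hardware happens to
guarantee. The names demo_cpuvec, demo_cpuvec_initialized and the two
functions are hypothetical stand-ins, not the actual GMP code, and it
assumes a compiler with <stdatomic.h>.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins for cpuvec and __gmpn_cpuvec_initialized. */
#define DEMO_CPUVEC_ENTRIES 4
static int demo_cpuvec[DEMO_CPUVEC_ENTRIES];
static atomic_int demo_cpuvec_initialized;

/* Writer (cpu1 in the scenario): fill the table, then publish the flag.
   The release store keeps the table writes from being reordered after
   the flag write. Assumed to be called from one thread only. */
static void
demo_cpuvec_init (void)
{
  for (size_t i = 0; i < DEMO_CPUVEC_ENTRIES; i++)
    demo_cpuvec[i] = 100 + (int) i;
  atomic_store_explicit (&demo_cpuvec_initialized, 1, memory_order_release);
}

/* Reader (cpu0 in the scenario): returns 1 and stores the entry in *out
   if the flag is set, 0 otherwise. The acquire load pairs with the
   release store, so a reader that sees the flag as 1 also sees the
   table writes that preceded it. */
static int
demo_cpuvec_read (size_t i, int *out)
{
  if (!atomic_load_explicit (&demo_cpuvec_initialized, memory_order_acquire))
    return 0;
  *out = demo_cpuvec[i];
  return 1;
}
```

With this pairing, the stale read in the last step is ruled out by the
language rules, whatever the hardware does underneath.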

> There should be some portable mechanism for this, either as part of some
> POSIX standard or within gcc.  I haven't looked into that.

I had a quick look at what Linux does. It seems to use arch-specific
inline assembly to implement the barriers it needs.
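
As a rough illustration of that style (not copied from the kernel), a
barrier macro built from GCC-style inline assembly might look like the
sketch below. The demo_* names are made up for this example; it assumes
GCC or Clang, uses mfence only on x86-64, and falls back to the
__sync_synchronize builtin elsewhere. The plain and volatile variables
mimic the kernel idiom and are not a data-race-free C11 program.

```c
/* Compiler-only barrier: stops the compiler from moving memory accesses
   across this point, emits no instruction. */
#define demo_compiler_barrier() __asm__ __volatile__ ("" ::: "memory")

#if defined (__x86_64__)
/* Full hardware fence on x86-64. */
#define demo_full_barrier() __asm__ __volatile__ ("mfence" ::: "memory")
#else
/* Fallback: GCC/Clang builtin that issues a full barrier. */
#define demo_full_barrier() __sync_synchronize ()
#endif

static int demo_data;
static volatile int demo_ready;

/* Writer: store the data, fence, then set the flag. */
static void
demo_publish (int value)
{
  demo_data = value;
  demo_full_barrier ();
  demo_ready = 1;
}

/* Reader: returns 1 and stores the data in *out once the flag is set,
   0 otherwise. The fence keeps the data read after the flag read. */
static int
demo_consume (int *out)
{
  if (!demo_ready)
    return 0;
  demo_full_barrier ();
  *out = demo_data;
  return 1;
}
```

A full fence is stronger than this pattern needs. On x86, if stores are
not reordered with other stores and loads are not reordered with other
loads, the compiler-only barrier would be enough in both functions.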

Regards,
/Niels

-- 
Niels Möller. PGP-encrypted email is preferred. Keyid C0B98E26.
Internet email is subject to wholesale government surveillance.