AMD bulldozer and GMP
Rick C. Hodgin
foxmuldrster at yahoo.com
Sun Feb 19 14:58:02 CET 2012
The thing AMD needs to add to make it work properly, and be a completely
killer implementation of an extended ARM ISA, is the ability to control
extremely fine-grained threads at the application level (without OS
service calls).
I have an outline for how this could be done. I wrote a paper about it
back in 2007/2008 and gave a copy to Wolfgang Gruener, who forwarded it on
to a third party... but nothing was ever mentioned about it again.
With such an ability, and a simple hardware-level feature to allow
threads to be turned on and off for particular intra-single-thread
workloads, the threshold would be crossed from
difficult-to-implement-parallelism to
extremely-beneficial-parallelism-for-nearly-all-apps.
Best regards,
Rick C. Hodgin
On Sun, 2012-02-19 at 08:48 -0500, Rick C. Hodgin wrote:
> Vincent,
>
> I believe AMD is working on a CPU design that will eventually allow them
> to switch from an x86-based ISA to an ARM-based one.
>
> I could be wrong. Probably am. But there are many signs.
>
> Best regards,
> Rick C. Hodgin
>
> On Sat, 2012-02-18 at 11:08 +0100, Vincent Diepeveen wrote:
> > Rick,
> >
> > you can prove that the throughput of Bulldozer is no higher than that
> > of the quad-core Intels.
> > In the end Bulldozer decodes 4 instructions per cycle per module and
> > Intel decodes 4 instructions per cycle per core,
> > and Bulldozer is a tad slower than the Intels as its caches are
> > a lot slower.
> >
> > That is why, for parallel applications that scale well, Bulldozer
> > will always lose to the quad-core Intels.
> >
> > Besides - they really overclocked bulldozer a lot to get where they
> > are now.
> >
> > GMP is such a highly optimized software product that integer
> > multiplication dominates, and I bet that in Bulldozer
> > they added huge latencies there in order to overclock Bulldozer a lot
> > and at least get close to the Intel quad-cores.
> >
> > On Feb 15, 2012, at 12:54 AM, Rick Hodgin wrote:
> >
> > >> It is totally incomprehensible what AMD is doing.
> > >> The new processor runs hot, slowly, and hardly
> > >> outperforms a 5W processor for integer number
> > >> crunching. OK, they do, thanks to a 2x clock and
> > >> more cores. But clock-for-clock they are equal.
> > >
> > > There was a lot of surprise in the CPU community when AMD released
> > > its early Bulldozers for internal benchmarking. The additional
> > > cores provided far greater throughput overall, but so much was lost
> > > for lesser-parallelized applications that everyone was left
> > > scratching their heads and wondering what was going on (as you are
> > > doing now).
> > >
> > > AMD was actually surprised by its performance as well, indicating to
> > > many that it was an unexpected condition. In the end some fixes were
> > > made, but nothing that brought it up to par with what
> > > everyone (outside of core development??) was expecting.
> > >
> > > Best regards,
> > > Rick C. Hodgin