On 2013-02-23 06:06, Niels Möller wrote:
> Not sure what the bottlenecks of your loop are though; instruction
> decoding, load/store, or the recurrency chain (but at least it shouldn't
> be multiplier throughput, right?).

Yeah, neither am I. I can't find any info on what the latency of NEON insns
should be.

r~