=> Thread: Batched BLAS "Optimized Batched Linear Algebra for Modern Architectures", Jack Dongarra, et al, Euro-Par 2017 --- "Fast Batched Matrix Multiplication for Small Sizes using Half Precision Arithmetic on GPUs", .., Jack Dongarra, IPDPS 2019 htt
=> [Webinar] Speed Up Small-Matrix Multiplication using New Intel Math Kernel Library Capabilities, Oct 18 2017 https://t.co/IadenlkB3v Compact/Batch DGEMM & SGEMM MKL Performance Benchmarks https://t.co/ouNR108ZYq https://t.co/TEPbwX3nr5 Batched BL
RT @ogawa_tter: "Optimized Batched Linear Algebra for Modern Architectures", Jack Dongarra, et al, Euro-Par 2017 (Aug 1 2017) https://t.co/…