[AMD] mir-glas is slower than OpenBLAS for DGEMM

I successfully compiled the `gemm_report.d` benchmark provided by mir-glas and ran it twice:
once comparing against OpenBLAS and once comparing against ACML-5.3.1.
As the benchmarks show, mir-glas does not reach full performance for large matrices.
Peak performance for my machine is about 23 GFLOPS in double precision.
However, ACML does not reach full performance in this benchmark either.
So I also ran the `dgemm.goto` and `dgemm.acml` benchmark programs provided in
`OpenBLAS/benchmark`. There, ACML reaches peak performance as well. Is there any overhead in calling
ACML from D?
![dgemm_bench](https://cloud.githubusercontent.com/assets/22576608/24573080/0a24418a-167f-11e7-8089-70ab2b76e481.png)
![print](https://cloud.githubusercontent.com/assets/22576608/24573089/246a589a-167f-11e7-97be-f84c3f57444e.png)
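
For anyone comparing these numbers, here is a minimal sketch of how a DGEMM timing is usually turned into GFLOPS and into a fraction of peak. It assumes the common convention of counting 2·m·n·k floating-point operations per multiplication and uses the 23 GFLOPS peak quoted above as a default. The function names are hypothetical and are not taken from mir-glas, OpenBLAS, or ACML.

```python
import time


def gemm_gflops(m, n, k, seconds):
    """GFLOP/s for one C = A * B with A (m x k) and B (k x n).

    Assumes the usual 2*m*n*k flop count (one multiply and one add
    per inner-product term).
    """
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return 2.0 * m * n * k / seconds / 1e9


def fraction_of_peak(gflops, peak_gflops=23.0):
    """Share of the machine's peak; 23.0 is the figure quoted in this issue."""
    return gflops / peak_gflops


def best_time(fn, repeats=5):
    """Smallest wall-clock time over several calls of fn().

    Taking the minimum reduces the influence of warm-up and OS noise.
    """
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - start)
    return best
```

As an illustration, a 2000 x 2000 x 2000 multiplication that takes 1 second corresponds to 16 GFLOPS, roughly 70% of the 23 GFLOPS peak. Timing the same call from D and from the C benchmark with the same repeat strategy would help show whether the gap comes from call overhead or from how each benchmark measures time.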

