Tesla C1060

matrixvecmult executes kernels MatrixVectorMul1n which compute c = A * b
where A is stored in row major form and c and b are vectors

Using device Tesla C1060
CPU took 0.168037 s

Testing MatrixVectorMul1
WorkGroupSize = 64 GlobalSize 100032
Finished kernel execution
Average kernel execution time 0.137472
Found 34947 different entries
0 entries more than 0.001%

Testing MatrixVectorMul2
WorkGroupSize = 64 GlobalSize 3840
Finished kernel execution
Average kernel execution time 0.132226
Found 34947 different entries
0 entries more than 0.001%

Testing MatrixVectorMul3
WorkGroupSize = 64 GlobalSize 3840
Finished kernel execution
Average kernel execution time 0.0278443
Found 90948 different entries
0 entries more than 0.001%

Testing MatrixVectorMul4
WorkGroupSize = 64 GlobalSize 3840
Finished kernel execution
Average kernel execution time 0.0219479
Found 90505 different entries
0 entries more than 0.001%

Testing MatrixVectorMul5
WorkGroupSize = 64 GlobalSize 3840
Finished kernel execution
Average kernel execution time 0.0206547
Found 90459 different entries
0 entries more than 0.001%
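
The log describes the computation only in one sentence (c = A * b with A in row major form) and reports two counts per kernel. As a rough illustration, the sketch below shows what a CPU reference and such a check might look like. It is not the benchmark's actual source. The function names are hypothetical, and two readings of the log are assumptions: that "different entries" counts results not exactly equal to the CPU values, and that "more than 0.001%" refers to relative error above 1e-5. The helper round_up_global_size reflects a further guess, namely that a GlobalSize of 100032 comes from rounding a work-item count up to a multiple of the work-group size of 64. The log does not state the matrix dimensions.

```python
def matvec_row_major(a, b, rows, cols):
    """CPU reference for c = A * b, with A given as a flat row-major list."""
    c = []
    for i in range(rows):
        row_start = i * cols  # element (i, j) is stored at a[i * cols + j]
        acc = 0.0
        for j in range(cols):
            acc += a[row_start + j] * b[j]
        c.append(acc)
    return c


def compare_results(reference, result, rel_tol=1e-5):
    """Return (count of entries that differ at all,
               count of entries whose relative error exceeds rel_tol).

    rel_tol=1e-5 corresponds to 0.001%. This interpretation of the log
    lines is an assumption, not something the log itself confirms.
    """
    different = 0
    over_tol = 0
    for ref, val in zip(reference, result):
        if ref != val:
            different += 1
            # Fall back to absolute error when the reference is zero.
            denom = abs(ref) if ref != 0.0 else 1.0
            if abs(ref - val) / denom > rel_tol:
                over_tol += 1
    return different, over_tol


def round_up_global_size(work_items, work_group_size):
    """Smallest multiple of work_group_size that is >= work_items."""
    groups = (work_items + work_group_size - 1) // work_group_size
    return groups * work_group_size
```

Under these assumptions, a large "different entries" count alongside zero entries over tolerance would be consistent with kernels that accumulate the dot product in a different order than the CPU loop, so that results differ only in the last bits. Calling round_up_global_size(100000, 64) gives 100032, which matches the GlobalSize reported for MatrixVectorMul1 if the kernel uses one work-item per row of a 100000-row matrix; that row count is a guess.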