Evidently, a working example

June 15, 2013, 10:34 am

Evidently, a working example would go a long way to make our responses more productive than the wild speculation you are calling for. Why not compare a single path vectorized code against a single...

Is that a VML library function?

June 16, 2013, 2:42 am

Is that a VML library function? Is this issue consistent with every argument being passed to the vectorized sincos() function? How do you call the functions? Do you have some variable interdependencies? I...

>>..."__svml_sincos4_e9"

June 16, 2013, 11:57 am

>>..."__svml_sincos4_e9" function which is apparently the vectorized version of trigonometric functions...
Yes, you're correct, and SVML stands for Short Vector Math Library. There is a description...

And one more thing.

June 16, 2013, 12:05 pm

And one more thing.
>>..."__svml_sincos4_e9" function
Since the function __svml_sincos4_e9 is optimized for processors with the Intel AVX instruction set on 64-bit platforms (the function with the e9 code is...

Thanks for your replies,

June 17, 2013, 2:21 pm

Thanks for your replies. Basically, the reason I'm using "#pragma omp simd" is portability: in the future we may move to an AMD platform, to other co-processors, or even to other compilers, so using...

Another point of slowness in

June 17, 2013, 2:37 pm

Another point of slowness in the code is the call to __svml_exp4_e9, which comes from the "exp" function I'm using in another part of my code. According to VTune analysis, in the non-vectorized code the exp function...

>>...Do I need to do some

June 17, 2013, 8:49 pm

>>...Do I need to do some tuning before the call to math functions?..
It looks like no, because the code is portable (you've mentioned that) and implemented without any intrinsic functions (is that...

As it was already mentioned

June 17, 2013, 11:16 pm

As was already mentioned, your code could have AVX-to-SSE transition penalties. Your program is single-threaded, so there are no execution port stalls. But I am thinking about the possibility that...
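
To make the transition-penalty point concrete, here is a hedged sketch of the usual precaution in hand-written AVX code: calling `_mm256_zeroupper()` before control returns to code that may use legacy (non-VEX) SSE instructions. The function `sum_avx` is a hypothetical example and is not taken from the poster's program. Compilers typically insert the equivalent vzeroupper instruction automatically in code they vectorize themselves.

```c
#include <assert.h>
#include <stddef.h>
#ifdef __AVX__
#include <immintrin.h>
#endif

/* Hypothetical helper (not from the thread): sums n doubles, using
 * 256-bit AVX when the file is compiled with AVX enabled (e.g. -mavx). */
double sum_avx(const double *x, size_t n)
{
    assert(x != NULL || n == 0);

    double total = 0.0;
    size_t i = 0;
#ifdef __AVX__
    __m256d acc = _mm256_setzero_pd();
    for (; i + 4 <= n; i += 4)
        acc = _mm256_add_pd(acc, _mm256_loadu_pd(x + i));

    double lanes[4];
    _mm256_storeu_pd(lanes, acc);
    total = lanes[0] + lanes[1] + lanes[2] + lanes[3];

    /* Zero the upper halves of the YMM registers so that any legacy
     * SSE code executed afterwards does not pay a transition penalty. */
    _mm256_zeroupper();
#endif
    /* Scalar tail; this is also the whole loop when AVX is not enabled. */
    for (; i < n; ++i)
        total += x[i];
    return total;
}
```

If the disassembly of the hot path shows no mixing of VEX-encoded and legacy SSE instructions, this precaution changes nothing and the slowdown has to come from somewhere else.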

Well, as you can see in the

June 18, 2013, 9:12 am

Well, as you can see in the attachment, there are no AVX-to-SSE or SSE-to-AVX transitions in the __svml_ functions, so I'm still wondering what the reason for the __svml_ slowness is, as the non-vec...

>>...I'm still wondering what

June 18, 2013, 4:23 pm

>>...I'm still wondering what the reason of __svml_ slowness is, as the non-vec functions are fast.
A test case is needed in order to understand what is going on and to answer your questions.
