Evidently, a working example

Evidently, a working example would go a long way to make our responses more productive than the wild speculation you are calling for. Why not compare a single-path vectorized code against a single...

Is that a VML library function?

Is that a VML library function? Is this issue consistent with every argument being passed to the vectorized sincos() function? How do you call the functions? Do you have some variable interdependencies? I...

>>..."__svml_sincos4_e9"

>>..."__svml_sincos4_e9" function which is apparently the vectorized version of trigonometric functions...
Yes, you're correct, and SVML stands for Short Vector Math Library. There is a description...

And one more thing.

And one more thing.
>>..."__svml_sincos4_e9" function
Since the function __svml_sincos4_e9 is optimized for processors with the Intel AVX instruction set on 64-bit platforms (a function with the e9 code is...

Thanks for your replies,

Thanks for your replies. Basically, the reason I'm using "#pragma omp simd" is portability: in the future we may move to an AMD platform, to other co-processors, or even to other compilers, so using...

Another point of slowness in

Another point of slowness in the code is the call to __svml_exp4_e9, which comes from the "exp" function I use in another part of my code. According to VTune analysis, in the non-vectorized code the exp function...

>>...Do I need to do some

>>...Do I need to do some tuning before calling math functions?..
It looks like no, because the code is portable (you've mentioned that) and implemented without any intrinsic functions (is that...

As it was already mentioned

As it was already mentioned, your code could have AVX-to-SSE transition penalties. Your program is single-threaded, so there are no execution port stalls. But I am thinking about the possibility that...

Well, as you can see in the

Well, as you can see in the attachment, there are no AVX-to-SSE or SSE-to-AVX transitions in the __svml_ functions, so I'm still wondering what the reason for the __svml_ slowness is, as the non-vec...

>>...I'm still wondering what

>>...I'm still wondering what the reason of __svml_ slowness is, as the non-vec functions are fast.
A test case is needed in order to understand what is going on and to answer your questions.
