Alan McGovern (Mono team) helped me optimize the Gradient function. The Mono.SIMD version is now 3x faster than the Mono 4DFloats version! It is still 2x slower than the new C++ version, but the good news is that the Mono.SIMD version is now faster than the 4DFloat C++ code.