Convolutional Neural Networks (CNNs) are widely used in Artificial Intelligence (AI) applications, especially image recognition. However, the growth in CNN-based image recognition applications has raised the challenge of executing the millions of Multiply-and-Accumulate (MAC) operations in state-of-the-art CNNs. Therefore, GPUs, FPGAs, and ASICs are the feasible solutions for