The extreme learning machine (ELM) is a learning algorithm for single-hidden-layer feedforward neural networks, used for classification problems because of its accuracy and speed. It is robust, free of local minima, and well suited to high-speed computation. This paper discusses the implementation of the ELM algorithm in both hardware and software, and presents a low-cost FPGA implementation of 16-bit H-matrix generation. The hardware implementation is carried out on a Nexys-4 board using MATLAB and a hardware description language (HDL). The H-matrix is generated using two activation functions, piecewise log-sigmoid and piecewise tan-sigmoid. The paper aims to optimize the hardware implementation of the ELM algorithm by minimizing the FPGA resources it uses. Finally, the accuracy of the ELM algorithm and the hardware utilization are compared for the two activation functions. © Springer Nature Singapore Pte Ltd 2020.
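To make the H-matrix step concrete, the following is a minimal software sketch, not the design described in the paper. It assumes the usual ELM definition H[i][j] = g(w_j · x_i + b_j) with randomly chosen hidden weights w_j and biases b_j. The breakpoints and slopes of the piecewise log-sigmoid, the Q8.8 split of the 16-bit word, and all function names (`to_fixed16`, `pw_logsig`, `pw_tansig`, `h_matrix`, `random_hidden_layer`) are illustrative assumptions; the abstract does not specify any of them. The piecewise tan-sigmoid is derived from the log-sigmoid through the identity tanh(x) = 2·logsig(2x) − 1.

```python
import random

FRAC_BITS = 8  # assumption: 16-bit word split as Q8.8 (not stated in the abstract)


def to_fixed16(x, frac_bits=FRAC_BITS):
    """Round x to a signed 16-bit fixed-point value, saturate, and return it as a float."""
    scale = 1 << frac_bits
    raw = int(round(x * scale))
    raw = max(-32768, min(32767, raw))  # saturate to the 16-bit signed range
    return raw / scale


def pw_logsig(x):
    """Piecewise-linear approximation of 1 / (1 + exp(-x)).

    Assumed segments with power-of-two slopes, which map to shifts in hardware.
    """
    ax = abs(x)
    if ax >= 5.0:
        y = 1.0
    elif ax >= 2.375:
        y = 0.03125 * ax + 0.84375
    elif ax >= 1.0:
        y = 0.125 * ax + 0.625
    else:
        y = 0.25 * ax + 0.5
    # Use the symmetry logsig(-x) = 1 - logsig(x) for negative inputs.
    return y if x >= 0 else 1.0 - y


def pw_tansig(x):
    """Piecewise tan-sigmoid built from pw_logsig via tanh(x) = 2*logsig(2x) - 1."""
    return 2.0 * pw_logsig(2.0 * x) - 1.0


def random_hidden_layer(n_inputs, n_hidden, seed=0):
    """Draw hidden weights and biases uniformly from [-1, 1] (an assumed range)."""
    rng = random.Random(seed)
    W = [[rng.uniform(-1, 1) for _ in range(n_inputs)] for _ in range(n_hidden)]
    b = [rng.uniform(-1, 1) for _ in range(n_hidden)]
    return W, b


def h_matrix(X, W, b, activation):
    """Return H with H[i][j] = activation(W[j] . X[i] + b[j]), quantized to 16 bits."""
    H = []
    for x in X:
        row = []
        for w_j, b_j in zip(W, b):
            z = sum(wi * xi for wi, xi in zip(w_j, x)) + b_j
            row.append(to_fixed16(activation(to_fixed16(z))))
        H.append(row)
    return H


if __name__ == "__main__":
    X = [[0.2, -0.4], [1.0, 0.5], [-0.7, 0.9]]  # three toy samples, two features
    W, b = random_hidden_layer(n_inputs=2, n_hidden=4)
    for name, g in (("pw_logsig", pw_logsig), ("pw_tansig", pw_tansig)):
        print(name)
        for row in h_matrix(X, W, b, g):
            print("  ", row)
```

The sketch covers only H-matrix generation. In a full ELM the output weights would then be obtained from H and the training targets by a least-squares solve, which is omitted here.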