Energy-based generative neural networks[1][2] are a class of generative models that learn explicit probability distributions of data in the form of energy-based models whose energy functions are parameterized by deep neural networks. The name reflects the fact that the model can be derived from discriminative neural networks. The parameters of the neural network are trained in a generative manner by maximum likelihood estimation based on Markov chain Monte Carlo (MCMC) [3]. Learning follows an analysis-by-synthesis scheme: in each learning iteration, the algorithm samples synthesized examples from the current model with a gradient-based MCMC method, e.g., Langevin dynamics, and then updates the model parameters based on the difference between the training examples and the synthesized ones. This process can be interpreted as alternating mode seeking and mode shifting, and it also has an adversarial interpretation [4]. The first energy-based generative neural network was the generative ConvNet, proposed in 2016 for image patterns, in which the neural network is a convolutional neural network [5][6]. The model has since been generalized to other domains to learn distributions of videos [7] and 3D voxels [8]. Several variants make the models more effective [9][10][11][12][13][14]. These models have proven useful for data generation (e.g., image synthesis, video synthesis, 3D shape synthesis), data recovery (e.g., recovering videos with missing pixels or image frames, 3D super-resolution), and data reconstruction.
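The analysis-by-synthesis loop described above can be illustrated with a minimal sketch. It is a toy under stated assumptions, not the generative ConvNet itself: the deep network is replaced by a hypothetical two-parameter function f_theta(x) = theta[0]*x + theta[1]*x^2 on scalar data, so that p_theta(x) is proportional to exp(f_theta(x)). The function names, step size, learning rate, persistent chains, and the clipping of theta[1] are all illustrative choices. The parameter update uses the standard maximum likelihood gradient for such models: the average of the gradient of f_theta with respect to theta over the training examples, minus the same average over the synthesized examples.

```python
import numpy as np

def score(x, theta):
    """Gradient of f_theta(x) = theta[0]*x + theta[1]*x**2 with respect to x."""
    return theta[0] + 2.0 * theta[1] * x

def features(x):
    """Gradient of f_theta(x) with respect to theta, one row per example."""
    return np.stack([x, x ** 2], axis=1)

def langevin_sample(x, theta, rng, n_steps=50, step=0.3):
    """Synthesis step: a few Langevin updates that move x toward p_theta."""
    for _ in range(n_steps):
        noise = rng.standard_normal(x.shape)
        x = x + 0.5 * step ** 2 * score(x, theta) + step * noise
    return x

def train(data, n_iters=500, lr=0.05, n_chains=1000, seed=0):
    """Alternate Langevin synthesis and a maximum likelihood parameter update."""
    rng = np.random.default_rng(seed)
    theta = np.array([0.0, -0.5])          # initial model: standard normal
    synth = rng.standard_normal(n_chains)  # chains are kept across iterations
    data_stats = features(data).mean(axis=0)
    for _ in range(n_iters):
        # Synthesis: sample from the current model with Langevin dynamics.
        synth = langevin_sample(synth, theta, rng)
        # Analysis: update theta from the gap between data and synthesized examples.
        grad = data_stats - features(synth).mean(axis=0)
        theta = theta + lr * grad
        # Assumption of this toy: keep the quadratic coefficient negative and
        # bounded so the density stays normalizable and the sampler stable.
        theta[1] = np.clip(theta[1], -5.0, -0.05)
    return theta, synth

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    data = rng.normal(1.0, 1.0, size=2000)
    theta, synth = train(data)
    print("learned theta:", theta)
    print("data mean/std:", data.mean(), data.std())
    print("synthesized mean/std:", synth.mean(), synth.std())
```

In this sketch the update stops changing the parameters once the synthesized examples match the training examples in the statistics x and x^2, which is the sense in which learning is driven by the difference between the two sets. A real energy-based generative neural network would replace `score` and `features` with gradients of a deep network, typically obtained by automatic differentiation.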