I am trying to understand VAEs and how training and testing happen in these networks.
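Training: The difficulty in training is that we cannot backpropagate through a sampling operation that depends on the mean and variance of the latent space representation. So instead we use the reparameterization trick: the noise is drawn from a fixed distribution that is independent of the input data and of the network parameters, which moves the sampling step out of the AE network. The latent code then becomes a deterministic, differentiable function of the mean, the variance, and the noise, so we can use backpropagation to train the AE network.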
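To make the training step concrete, here is a minimal sketch of the reparameterization trick as I understand it. It uses only the Python standard library. The function name `reparameterize`, the choice to parameterize the variance as a log-variance, and the example values are my own assumptions for illustration, not taken from any particular library.

```python
import math
import random

def reparameterize(mu, log_var, rng):
    """Return z = mu + sigma * eps with eps ~ N(0, 1), elementwise.

    mu and log_var are plain lists standing in for the encoder outputs
    (hypothetical values; a real encoder would compute them from x).
    """
    # The noise is drawn independently of the input and of the weights.
    eps = [rng.gauss(0.0, 1.0) for _ in mu]
    # Convert log-variance to standard deviation.
    sigma = [math.exp(0.5 * lv) for lv in log_var]
    # z is a deterministic function of (mu, sigma) once eps is fixed.
    z = [m + s * e for m, s, e in zip(mu, sigma, eps)]
    return z, eps

rng = random.Random(0)
mu = [0.5, -1.0]        # assumed encoder mean for one input
log_var = [0.0, -2.0]   # assumed encoder log-variance for one input
z, eps = reparameterize(mu, log_var, rng)
```

With `eps` held fixed, the derivative of `z` with respect to `mu` is 1 and with respect to `sigma` is `eps`, which is why gradients can flow back to the encoder outputs.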
Loss: Now we get to the loss (considering only the reconstruction loss): we use an expectation over the entire batch to compute the loss and thus the gradients. My understanding of why an expectation (over all samples in the batch) is used: it is needed because of the sampling process in the latent space, and the expectation of the variable $\epsilon$ will be 0. Am I correct?
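To pin down which expectation I mean, here is one way the reconstruction term could be written. This is a sketch under assumptions that are not stated above: a squared-error reconstruction loss, encoder outputs $\mu(x)$ and $\sigma(x)$, a decoder $g$, and a batch of $N$ images $x_1, \dots, x_N$ with one noise draw $\epsilon_i \sim \mathcal{N}(0, I)$ per image.

$$\mathcal{L}_{\text{rec}}(x) = \mathbb{E}_{\epsilon \sim \mathcal{N}(0, I)}\left[ \lVert x - g(\mu(x) + \sigma(x) \odot \epsilon) \rVert^2 \right]$$

$$\hat{\mathcal{L}}_{\text{rec}} = \frac{1}{N} \sum_{i=1}^{N} \lVert x_i - g(\mu(x_i) + \sigma(x_i) \odot \epsilon_i) \rVert^2$$

Written this way there are two different averages: the expectation over $\epsilon$ for a fixed image $x$ in the first line, and the average over the images in the batch in the second line, where each expectation is approximated by a single noise sample.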
Testing: If I am correct about the loss section, then I do not understand how we test on a single image. I feed a trained VAE an image $x$. It computes a mean and variance in latent space for this image. We then use a stochastic function to generate the input to the decoder, so the output is stochastic (it varies within some bounds around the actual output). The output of the decoder could be anything, because the VAE's forward pass at test time is not deterministic.
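The behaviour I am describing can be reproduced with a toy example. This is only a sketch: `encode` and `decode` are hypothetical scalar stand-ins that I made up for a trained encoder and decoder, not real networks.

```python
import math
import random

def encode(x):
    """Hypothetical toy encoder: returns (mu, log_var) for a scalar input."""
    return 2.0 * x, -1.0

def decode(z):
    """Hypothetical toy decoder: maps a scalar latent back to input space."""
    return 0.5 * z

def reconstruct(x, rng):
    """One stochastic forward pass: encode, sample z, decode."""
    mu, log_var = encode(x)
    eps = rng.gauss(0.0, 1.0)                # fresh noise on every call
    z = mu + math.exp(0.5 * log_var) * eps   # sampled latent code
    return decode(z)

rng = random.Random(0)
x = 1.0
# Feed the same input five times and collect the outputs.
outputs = [reconstruct(x, rng) for _ in range(5)]
spread = max(outputs) - min(outputs)
```

Here `outputs` contains five different reconstructions of the same `x`, and `spread` measures how far apart they are, which is exactly what makes me unsure how to score a single test image.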
Can you help me understand how the testing of VAEs works for a single image?
TL;DR: I think VAEs are stochastic networks. How do I test them for accuracy?