2023 Feb 11
Observations
In this figure, the step count converges to a lower value and then jumps back up to a higher one. One possible cause lies in the replay buffer: the number of actual inputs in a sampled entry may shrink so far that the network has very little real input to process, while the padded portion becomes larger than the input itself.
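One way to check this hypothesis is to measure how much of a padded batch is padding. The sketch below is only an illustration: the helper names `pad_batch` and `padding_fraction`, the pad value, and the toy batch are all assumptions, not taken from the actual experiment.

```python
PAD = 0.0  # hypothetical padding value, chosen only for this illustration


def pad_batch(sequences, max_len, pad_value=PAD):
    """Right-pad each sequence to max_len and return (padded rows, 0/1 mask)."""
    padded, mask = [], []
    for seq in sequences:
        n_pad = max_len - len(seq)
        if n_pad < 0:
            raise ValueError("sequence is longer than max_len")
        padded.append(list(seq) + [pad_value] * n_pad)
        mask.append([1] * len(seq) + [0] * n_pad)  # 1 = real input, 0 = padding
    return padded, mask


def padding_fraction(mask):
    """Fraction of all positions in the batch that are padding."""
    total = sum(len(row) for row in mask)
    real = sum(sum(row) for row in mask)
    return 1.0 - real / total


# Toy batch standing in for entries sampled from the replay buffer.
batch = [[0.3, 0.7], [0.5], [0.2, 0.4, 0.9]]
padded, mask = pad_batch(batch, max_len=8)
frac = padding_fraction(mask)
print(f"padding fraction: {frac:.2f}")
```

If this fraction rises around the point where the step count jumps, that would be consistent with the padding explanation, though it would not prove it.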
A pointer network structure could be used to reduce this error, since it operates over variable-length inputs.
Simple explanation of pointer networks
Here you can find a simple explanation on implementing a pointer network.
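As a rough illustration of why this might help with padding, here is a minimal sketch of a single pointer attention step. The scoring rule (v · tanh(W1 e_j + W2 d), followed by a softmax over input positions) is the additive attention used in pointer networks. The mask that excludes padded positions, the function name `pointer_distribution`, and the hand-picked weights are assumptions made for this example; a real model would learn W1, W2 and v.

```python
import math


def pointer_distribution(encoder_states, decoder_state, mask, W1, W2, v):
    """Return a probability over input positions; padded positions get 0."""

    def matvec(W, x):
        return [sum(w * xi for w, xi in zip(row, x)) for row in W]

    proj_d = matvec(W2, decoder_state)
    scores = []
    for e_j, m in zip(encoder_states, mask):
        if m == 0:
            scores.append(-math.inf)  # padded position: never selectable
            continue
        proj_e = matvec(W1, e_j)
        hidden = [math.tanh(a + b) for a, b in zip(proj_e, proj_d)]
        scores.append(sum(vi * hi for vi, hi in zip(v, hidden)))

    finite = [s for s in scores if s != -math.inf]
    if not finite:
        raise ValueError("mask excludes every position")
    top = max(finite)  # subtract the max for a numerically stable softmax
    exps = [0.0 if s == -math.inf else math.exp(s - top) for s in scores]
    total = sum(exps)
    return [x / total for x in exps]


# Two real inputs followed by two padded slots (toy values, not trained weights).
encoder_states = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0], [0.0, 0.0]]
mask = [1, 1, 0, 0]
decoder_state = [0.5, -0.5]
identity = [[1.0, 0.0], [0.0, 1.0]]
probs = pointer_distribution(encoder_states, decoder_state, mask,
                             W1=identity, W2=identity, v=[1.0, 1.0])
print(probs)
```

Because the output is a distribution over the real input positions only, the amount of padding does not change which positions the network can choose from.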