The scaler works by low-pass filtering and subsampling. In the vertical dimension, this means that some of the horizontal scan lines are dropped, but the data contained in the dropped lines is averaged into the remaining lines. This averaging minimizes the dropped-line effect, but it is only an approximation of perfect scaling (you may see perfectly scaled images on television, but the equipment that does this costs thousands more than the entire computer).
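As a rough illustration of filtering before subsampling, here is a minimal Python sketch. It is not the scaler's actual algorithm: the [1, 2, 1] / 4 kernel, the fixed step of 2, and the function names are all assumptions chosen only to show how data from dropped lines can be blended into the lines that remain.

```python
def lowpass_vertical(rows):
    """Blend each row with its vertical neighbours using a [1, 2, 1] / 4 kernel.

    The kernel is an assumption for illustration; the real filter is not
    described in the text above.
    """
    out = []
    last = len(rows) - 1
    for i, row in enumerate(rows):
        above = rows[max(i - 1, 0)]      # repeat the edge row at the top
        below = rows[min(i + 1, last)]   # repeat the edge row at the bottom
        out.append([(a + 2 * c + b) / 4 for a, c, b in zip(above, row, below)])
    return out


def subsample(rows, step=2):
    """Keep every step-th row and drop the rest."""
    return rows[::step]


def scale_vertical(rows, step=2):
    """Low-pass filter first, then drop rows."""
    return subsample(lowpass_vertical(rows), step)


# Four one-pixel rows alternating dark (0) and bright (100).
image = [[0], [100], [0], [100]]

naive = subsample(image)          # bright rows vanish entirely
filtered = scale_vertical(image)  # bright rows contribute to what is kept
```

With plain dropping, the two bright rows disappear completely. With filtering first, part of their brightness survives in the kept rows, which is the averaging effect described above.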
When an image is scaled vertically to less than full size but more than half size, the image displayed on the screen consists of two video fields interlaced onto the monitor. The scaler processes fields, and it takes two fields to make a frame. When a line is dropped in one field, the corresponding line in the opposite field is also dropped. When the two fields are recombined on the display, the effect is that two vertically adjacent lines are dropped, and this can be visible in some cases.
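The following Python sketch shows why dropping the same line in both fields removes two adjacent lines from the frame. It tracks line positions only and ignores the averaging step. The 12-line frame, the drop-every-third-line pattern (a 2/3 scale), and the function names are hypothetical, picked just to make the effect easy to see.

```python
def split_fields(frame_lines):
    """Split a frame into its even-line field and odd-line field."""
    return frame_lines[0::2], frame_lines[1::2]


def drop_every(field, n):
    """Drop every n-th line of a field (hypothetical drop pattern)."""
    return [line for i, line in enumerate(field) if (i + 1) % n != 0]


def interlace(even, odd):
    """Weave two fields back together into one frame."""
    frame = []
    for e, o in zip(even, odd):
        frame += [e, o]
    return frame


frame = list(range(12))             # frame lines numbered 0..11
even, odd = split_fields(frame)     # [0, 2, 4, ...] and [1, 3, 5, ...]

# The same line positions are dropped in each field.
shown = interlace(drop_every(even, 3), drop_every(odd, 3))
missing = [line for line in frame if line not in shown]

print(shown)    # frame lines that reach the display
print(missing)  # frame lines that were dropped
```

In this example the missing frame lines are 4, 5, 10, and 11, that is, pairs of vertically adjacent lines rather than isolated single lines.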
The most likely reason the artifact is less visible when using a VCR is that the image from a VCR has lower resolution because of the filtering done in the recording process. The artifact is still there, but the lower resolution hides it.