Mach takes a look at your GCode and 'pre-plans' the moves your machine needs to make. When using the parallel port driver, anything else Windows may be doing is interrupted 25,000 or more times a second (the exact rate depends on the kernel speed setting) so the parallel port driver can run. The driver takes the moves Mach has pre-planned and converts them into step pulses. Because it has to run so many times a second, the parallel port driver takes up a lot of processor time, and its output will always have some jitter, i.e. the timing is never exactly perfect because the signal is generated in software. On some PCs this timing instability can be severe and on others it works well, but there will always be some jitter.
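To put some rough numbers on that, here is a small Python sketch of one source of the jitter. It is only a simplified model, not Mach's actual driver code: it assumes a step pulse can only be emitted on a kernel interrupt (tick), so each step slips to the first tick at or after its ideal time. The function name and the 7,000 steps per second rate are made up for illustration. Real jitter from Windows scheduling comes on top of this.

```python
def tick_aligned_intervals(kernel_hz, step_rate_hz, n_steps):
    """Return the time between consecutive steps, in microseconds,
    when every step has to land on a kernel tick (assumed model)."""
    tick_us = 1_000_000 / kernel_hz  # 25,000 Hz gives a 40 us tick

    # Step k is ideally due at k / step_rate_hz seconds. Round that up
    # to a whole number of ticks using integer ceiling division.
    ticks = [-((-k * kernel_hz) // step_rate_hz) for k in range(n_steps + 1)]

    # Convert the tick gaps between neighbouring steps to microseconds.
    return [(b - a) * tick_us for a, b in zip(ticks, ticks[1:])]


if __name__ == "__main__":
    # Hypothetical request: 7,000 steps per second on a 25,000 Hz kernel.
    print(tick_aligned_intervals(25_000, 7_000, 7))
```

In this model the ideal spacing at 7,000 steps per second is about 142.9 microseconds, but the steps actually come out either 120 or 160 microseconds apart (3 or 4 ticks). A rate that divides evenly into the kernel frequency, such as 5,000 steps per second, comes out at a steady 200 microseconds.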
The SmoothStepper plug-in takes the same moves that Mach has pre-planned, converts them into a format that is easy for the SmoothStepper to work with, and feeds them out to the SS board. The SS board uses an FPGA (programmable hardware logic) to generate all the timing pulses. The SS board buffers a limited number of moves and constantly tells the SS plug-in what it has done and where the machine is (so the machine and Mach stay in sync). Since the SmoothStepper does all of the time-critical pulse generation in hardware, there is almost no load on the PC (the processing the SS plug-in does is minimal compared to what the parallel port driver has to do).
All of this means that the pulses the SS outputs are rock steady, with no jitter, and there is less load on the PC.
The toolpath question is a tricky one. Since the SS frees up processor time on the PC, it can help with the problem of Mach displaying the toolpath for large files. The real problem at this point is Mach 3. It does all of its work in a single thread, so it can really get bogged down trying to generate the toolpath on large files. While Mach is bogged down it cannot feed moves to the plug-in, and this can leave the SmoothStepper 'starving' for data.
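The 'starving' idea can be sketched with a toy simulation in Python. Everything in it is an assumption for illustration only: the board is modelled as consuming one buffered move per tick, the PC as topping the buffer back up on every tick it is not busy, and the buffer sizes and stall lengths are invented numbers, not real SmoothStepper figures.

```python
def count_starved_ticks(buffer_capacity, busy_ticks, total_ticks, busy_start=10):
    """Count the ticks on which the board found its buffer empty.

    Hypothetical model: the single PC thread is tied up (for example,
    drawing the toolpath) for `busy_ticks` ticks starting at `busy_start`,
    and cannot send any moves during that time.
    """
    buffered = buffer_capacity
    starved = 0
    for t in range(total_ticks):
        pc_busy = busy_start <= t < busy_start + busy_ticks
        if not pc_busy:
            # The thread is free, so it refills the board's buffer.
            buffered = buffer_capacity
        if buffered > 0:
            buffered -= 1   # the board executes one buffered move
        else:
            starved += 1    # nothing left to execute: the board starves
    return starved


if __name__ == "__main__":
    for stall in (3, 8):
        print(stall, count_starved_ticks(buffer_capacity=5, busy_ticks=stall, total_ticks=30))
```

With a 5-move buffer, a 3-tick stall is absorbed completely, while an 8-tick stall leaves the board with nothing to do for 4 ticks. A deeper buffer or a shorter stall both help, which is why moving the toolpath drawing off the main thread matters.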
The next major revision of Mach (V4) will have the graphics processing done in a separate thread, so the main body of Mach will not get bogged down. This should put an end to the large toolpath issue. To be fair, a lot depends on your PC hardware. I have run files of around 600,000 lines on a cheap ($350) Shuttle all-in-one touch screen PC using the SmoothStepper and not had any issues with the toolpath display. Your PC may produce different results.