Hi crchisholm,
You are correct: a Mach4 licence and a Darwin Parallel Port licence and you are ready to go.
The pulse streams required to drive your stepper drivers (in the 540) are generated by your PC when using a parallel
port. The computer does this by running a timer: a counter is reset and counts up to a set number, at which point it generates
an interrupt. The program then resets the counter, issues a pulse to the parallel port, and starts again. The timers available
in a PC CPU chip are pretty limited by comparison to microcontrollers suitable for controlling machinery. Microcontrollers have hardware
timers that count up, reset and issue a pulse without CPU intervention. A PC timer requires considerable CPU processing to issue a
pulse stream, and such timers are generally called 'interrupt driven' timers.
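To make the counter idea concrete, here is a rough sketch in Python. It is only a simulation of the count/interrupt/reset cycle described above, not how Mach4 or the Darwin plugin is actually written. The function name and the numbers (a reload value of 40 ticks, 200 ticks in total) are made up for illustration.

```python
def simulate_timer_pulses(reload_value, total_ticks):
    """Return the tick numbers at which a step pulse would be issued.

    Hypothetical model: the counter counts up once per tick and, when it
    reaches reload_value, an 'interrupt' fires.
    """
    counter = 0
    pulse_ticks = []
    for tick in range(1, total_ticks + 1):
        counter += 1                   # the timer counts up once per tick
        if counter == reload_value:    # set number reached: interrupt fires
            counter = 0                # the handler resets the counter...
            pulse_ticks.append(tick)   # ...and issues a pulse to the port
    return pulse_ticks


pulses = simulate_timer_pulses(reload_value=40, total_ticks=200)
print(pulses)  # [40, 80, 120, 160, 200]
```

In this idealised model every pulse lands exactly 40 ticks after the last one. On a real PC the "handler" part is software competing with everything else the CPU is doing.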
The downside is that PC CPUs use interrupts for lots of different purposes. A simple one is the realtime clock, which generates an interrupt every
100ms or so just so it can keep the clock on your screen updated. The PC also uses interrupts to switch between the different software threads which we expect
to run at the same time. If two or more interrupts happen at once, one will have to be delayed while the other is processed. As far as our
pulse stream goes, it means one pulse is 'late'. This variance in pulse timing is a fact of life with a parallel port and is called 'timing jitter'. There
is a test program which tries to measure jitter and so establish whether, and how well, a parallel port will operate.
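As a rough illustration of what 'jitter' means in numbers, the sketch below takes a list of pulse timestamps and reports how far each pulse-to-pulse interval strays from the intended period. This is not the actual test program mentioned above, and the timestamps (in microseconds, with a 40 us nominal period) are invented for the example.

```python
def interval_jitter(timestamps_us, nominal_period_us):
    """Return (intervals, deviations, worst) for a list of pulse timestamps.

    deviations are interval minus nominal period; worst is the largest
    absolute deviation. All names and units here are illustrative.
    """
    intervals = [b - a for a, b in zip(timestamps_us, timestamps_us[1:])]
    deviations = [i - nominal_period_us for i in intervals]
    worst = max(abs(d) for d in deviations)
    return intervals, deviations, worst


# Made-up data: the pulse due at 120 us arrives 5 us late, at 125 us.
timestamps = [0, 40, 80, 125, 160, 200]
intervals, deviations, worst = interval_jitter(timestamps, nominal_period_us=40)
print(intervals)   # [40, 40, 45, 35, 40]
print(deviations)  # [0, 0, 5, -5, 0]
print(worst)       # 5
```

Note that one late pulse shows up twice: the interval before it is too long and the one after it is too short, which is what the stepper driver sees as an uneven pulse stream.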
Some PCs do a good job of running a parallel port with low jitter and some do not. Very powerful and capable PCs are not necessarily any better
than an old XP clunker. Laptops generally do a poor job. Some PCs can be improved by turning certain things off and disabling certain automatic
features. In most cases you can't have other programs running alongside Mach, as another program will interfere with Mach and the regular pulse stream
is broken. The good news is that many PCs do work OK and tens of thousands of CNCers around the world still use the parallel port.
For PCs that won't run a parallel port, or if you require very stable pulse streams, that's where external motion controllers come into play. They have onboard
an FPGA, a microcontroller, or both, which generate very high quality pulse streams, and very much faster pulse streams than can be generated
by a PC alone. The result is smoother motion, and it also means the pulse stream can 'keep up' with highly precise servo encoders, which a PC can't.
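To give a feel for why pulse rate matters with servos, here is a back-of-envelope calculation in Python. The figures are assumptions picked purely for illustration (a servo with 10,000 encoder counts per revolution turning at 3000 rpm, one step pulse per encoder count); they are not taken from your machine or from any particular controller's specification.

```python
def required_pulse_rate_hz(counts_per_rev, rpm):
    """Pulses per second needed to command one pulse per encoder count.

    Hypothetical helper: revolutions per second times counts per revolution.
    """
    return counts_per_rev * rpm / 60


# Assumed example values, not measurements.
rate = required_pulse_rate_hz(counts_per_rev=10_000, rpm=3000)
print(rate)  # 500000.0
```

So under those assumptions the controller would need to put out about 500,000 pulses per second, which is the sort of rate dedicated hardware is built for.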
I suggest you use a parallel port; it's a good fit with your existing machine, and at a later date you can add an external controller if you think it desirable.
Craig