There really is no big mystery to this - it's standard inter-thread communication - as practiced in a gazillion systems from enterprise databases to games. There is no need to "sprinkle" waits here and there, just put them where they're needed and not where they aren't. This isn't alchemy.

When CB does (for example) a setOEMDRO, CB is not setting anything, it's REQUESTING Mach to set the DRO.
More generally, the client thread (CB) REQUESTS that the SERVER thread (Mach) do something. That request sits in a buffer until the SERVING thread (Mach) has time to action it.
IF the client thread in no way depends on that request being actioned then it can just carry on. HOWEVER - If the client thread depends on that REQUEST being actioned before it can successfully continue according to the logic of the program, it MUST wait for the server thread to let it know it's done it.
There is, however, a second circumstance when the client thread needs to wait on Mach signaling it's completed the request. In some multi-threading apps, it's appropriate that the buffer used is a FIFO queue. This means that several requests can be sent to the SERVER without waiting in between. However, AFAIK, the buffer used in the Mach/CB context is a single, one-shot buffer. This means that if we want to send multiple sequential requests we MUST wait until each one has been completed before we send the next request and so on.
The mechanism that performs all this is that Mach sets a semaphore when it's completed the request and CB can monitor that semaphore until it sees it set - that's what isMoving does. Shame it's called isMoving because it has nothing to do with movement. A better name would perhaps have been MachIsBusyDoingOurRequest. As in...
while MachIsBusyDoingOurRequest() 'or perhaps something shorter maybe!!!
wend
So, using the above code as an example...
Code "G92 Z0.000" ' Set the Z value +/- to ensure the pierce height comes out correct
Code "G0 Z" & PierceHeight
Two things MAY happen here. One is that the G0 request actually corrupts the G92 request. The other is that the G0 request may happen BEFORE the G92 request. It's perhaps unlikely if you have a good fast system, but relying on that is BAD programming.
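Going by the reasoning above, one way to make that pair safe is to wait for Mach to finish each request before sending the next. This is only a sketch, not tested code: it assumes PierceHeight has already been set earlier in the macro, and the Sleep(10) call (and its 10 ms value) is my own optional addition to stop the loop spinning flat out. An empty While/Wend body does the same job.

```vb
Code "G92 Z0.000"              ' request 1: set the Z offset
While IsMoving()               ' wait until Mach signals request 1 is done
    Sleep(10)                  ' assumption: short pause, optional
Wend
Code "G0 Z" & PierceHeight     ' request 2: only sent once request 1 has completed
While IsMoving()               ' wait again before the macro carries on
    Sleep(10)
Wend
```

The second wait only matters if whatever follows depends on the move having finished, which in a pierce sequence it normally does.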
The other example is even more obvious perhaps.
Code "G4 P" &PierceTime 'Pierce Delay
Code "G0 Z" & GetUserDRO(1001)
Here, we're asking Mach to perform a dwell. You can virtually guarantee that the G0 will be executed BEFORE the dwell is over. You're also risking buffer corruption again. It might - it might not. Who knows.
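The same wait applied to the dwell might look like this. Again a hedged sketch: PierceTime and user DRO 1001 are taken from the snippet above, and the Sleep(10) call is an optional addition of mine, not something the original macro had.

```vb
Code "G4 P" & PierceTime       ' request the pierce delay (dwell)
While IsMoving()               ' wait here until Mach reports the dwell is over
    Sleep(10)                  ' assumption: short pause, optional
Wend
Code "G0 Z" & GetUserDRO(1001) ' now safe to request the Z move
While IsMoving()               ' wait for the move before continuing
    Sleep(10)
Wend
```

With the wait in place, the G0 cannot be sent until the dwell request has completed, so neither the ordering problem nor the overwritten-buffer problem can arise.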
If you have a fast system and if the request that you're making is a quick one - then you MIGHT get away with it - then again you MIGHT get away with it MOST of the time. But sooner or later it's going to bite you in the **s*.
Sorry - lecture over - hope it helps.
Ian