Welcome to the wonderful world of herding electrons - I'll skip the EE discussion and keep things empirical.
The cap is acting in conjunction with a resistor in the BoB to create a simple filter circuit.
The higher the cap value, the more filtering it adds, and the longer the input takes to respond to a change.
Adding the cap is an electronic way of doing what the Mach software debounce setting does.
So the effects of the cap and the debounce setting will be additive - and this matches what you are seeing - with the cap you need less software filtering.
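To put rough numbers on "more cap, more filtering": the filter's time constant is simply the resistor value times the cap value. Here is a quick sketch in Python. The 10k resistor and the cap values are just assumptions I made up for illustration, so check your own BoB for the real resistor value.

```python
# Rough RC time constant calculator (tau = R * C).
# R_OHMS is an ASSUMED value for illustration only - the actual
# resistor on your BoB may be different.
R_OHMS = 10_000  # assumed 10k resistor on the BoB input


def rc_time_constant(r_ohms, c_farads):
    """Return the RC time constant in seconds."""
    return r_ohms * c_farads


# Hypothetical cap values, in microfarads.
for c_uf in (0.01, 0.1, 1.0):
    tau_ms = rc_time_constant(R_OHMS, c_uf * 1e-6) * 1000
    print(f"{c_uf} uF -> time constant of about {tau_ms:.2f} ms")
```

With the assumed 10k resistor, a 0.1 uF cap works out to about 1 ms. Noise spikes much shorter than the time constant get smoothed out, which is the same job the software debounce is doing.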
If you are curious to learn more, you can google "RC time constant".
Here is a reasonable tutorial I spotted in the Google results:
http://www.electronics-tutorials.ws/rc/rc_1.html

Dave