What's the best way to create an accurate audio timing system (bars, beats)?

Hey. I'm trying to make a small game in which the player turns prepared audio samples on and off. The task is to synchronize all the audio clips so that the player doesn't have to worry about “getting on the beat”.

Right now I use:
Event Tick → Delay (1/4 note duration) → MultiGate (4 actions, where the first action is always the main beat of a 4/4 metronome). But I think this depends far too much on hardware performance and FPS, and can easily create gaps between bars.
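
To illustrate the concern: a chained delay can only fire on a frame boundary, so every beat may be slightly late, and the next delay starts from that late moment, which lets the error add up. One common alternative is to compute each beat from an absolute clock. Below is a minimal, engine-agnostic sketch in Python, not an Unreal API; `BeatClock`, `update` and the `now` parameter are hypothetical names, and it assumes you can read a monotonic time in seconds once per frame.

```python
import math


class BeatClock:
    """Derives bar/beat positions from an absolute clock instead of chained delays."""

    def __init__(self, bpm, beats_per_bar=4):
        self.beat_duration = 60.0 / bpm  # seconds per quarter note
        self.beats_per_bar = beats_per_bar
        self.start_time = None           # set on the first update() call
        self.beats_emitted = 0           # number of beats already reported

    def update(self, now):
        """Call once per frame with the current time in seconds.

        Returns a list of (bar, beat_in_bar, scheduled_time) tuples, one for
        each beat whose scheduled time has passed since the previous call.
        """
        if self.start_time is None:
            self.start_time = now
        elapsed = now - self.start_time
        # Beat 0 is due at elapsed == 0, beat n at n * beat_duration.
        beats_due = math.floor(elapsed / self.beat_duration) + 1
        events = []
        while self.beats_emitted < beats_due:
            n = self.beats_emitted
            # Scheduled time comes from the beat index, so lateness never accumulates.
            scheduled = self.start_time + n * self.beat_duration
            events.append((n // self.beats_per_bar, n % self.beats_per_bar, scheduled))
            self.beats_emitted += 1
        return events
```

A `beat_in_bar` of 0 marks the main beat of the bar. Because each beat's time is derived from its index rather than from the previous beat, a slow frame only delays the report of that beat; it does not shift the beats after it. The `scheduled_time` value also tells you how late the frame was, which could be used to offset sample playback.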