A tool for self-learning musicians that gives feedback about the rhythm accuracy of a performance.
The student compares a recording of their performance with a reference (e.g. a teacher’s recording or a video from the internet), and Rhythm checker highlights the parts where the rhythm was off and by how much.
Project for the AI&Music festival hackathon (EduHack). You can also check the Devpost submission.
How does it work
At the core of Rhythm checker is a Dynamic Time Warping (DTW) algorithm that efficiently computes an alignment between the two recordings, which we then parse to obtain the displayed charts. The DTW implementation is our own. Before running DTW, we extract the Mel-frequency cepstral coefficients (MFCCs) with the Librosa module; the final charts are displayed with Altair.
The program only deals with the recordings at a very low level, analyzing them in the waveform and frequency domains. It has no notion of musical structure. It could also be used to compare speech or other signals, though we have not tested this.
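The alignment step described above can be sketched as follows. This is a minimal textbook DTW written for illustration, not the project's own implementation; the function name `dtw_path` and the Euclidean frame distance are assumptions. It takes two feature sequences with one row per frame (for example MFCCs obtained with `librosa.feature.mfcc(y=y, sr=sr).T`) and returns the list of matched frame pairs.

```python
import numpy as np

def dtw_path(ref, perf):
    """Align two feature sequences with plain O(n*m) DTW.

    ref, perf: arrays of shape (n_frames, n_features).
    Returns a list of (ref_index, perf_index) pairs from start to end.
    """
    ref = np.asarray(ref, dtype=float)
    perf = np.asarray(perf, dtype=float)
    n, m = len(ref), len(perf)

    # Pairwise Euclidean distance between every reference and performance frame.
    cost = np.linalg.norm(ref[:, None, :] - perf[None, :, :], axis=2)

    # acc[i, j] = cheapest cost of aligning ref[:i] with perf[:j].
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            acc[i, j] = cost[i - 1, j - 1] + min(
                acc[i - 1, j - 1],  # both recordings advance
                acc[i - 1, j],      # only the reference advances
                acc[i, j - 1],      # only the performance advances
            )

    # Walk back from the end, always taking the cheapest predecessor.
    i, j = n, m
    path = [(i - 1, j - 1)]
    while (i, j) != (1, 1):
        _, i, j = min(
            (acc[i - 1, j - 1], i - 1, j - 1),
            (acc[i - 1, j], i - 1, j),
            (acc[i, j - 1], i, j - 1),
        )
        path.append((i - 1, j - 1))
    path.reverse()
    return path
```

For instance, if the performance holds the first frame of the reference for twice as long, the path matches reference frame 0 to performance frames 0 and 1, then advances both recordings together.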
How to interpret the results
The resulting line chart displays the rhythm accuracy of the given performance compared to the reference audio. Positive values (in blue) mean that the corresponding section was too fast, while negative values (in orange) mean that it was too slow. An ideal performance keeps the line as close to 0 as possible.
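To make the sign convention concrete, here is one possible way to turn an alignment path into such a curve. How Rhythm checker actually parses the alignment is not described here, so the function `tempo_deviation`, the `window` parameter and the formula `1 - slope` are all assumptions made for illustration. The idea is to measure how many performance frames are consumed per reference frame: fewer than one means the student played faster (positive value), more than one means slower (negative value).

```python
import numpy as np

def tempo_deviation(path, window=20):
    """Estimate a per-reference-frame speed deviation from a DTW path.

    path: list of (ref_index, perf_index) pairs covering every reference frame.
    window: half-width, in reference frames, of the slope estimate (hypothetical).
    Returns an array with 0 for equal speed, > 0 for faster, < 0 for slower.
    """
    ref_idx = np.array([p[0] for p in path])
    perf_idx = np.array([p[1] for p in path], dtype=float)

    # Mean performance frame matched to each reference frame.
    counts = np.bincount(ref_idx)
    perf_at_ref = np.bincount(ref_idx, weights=perf_idx) / counts

    # Local slope: performance frames consumed per reference frame.
    n_ref = len(perf_at_ref)
    r = np.arange(n_ref)
    lo = np.maximum(r - window, 0)
    hi = np.minimum(r + window, n_ref - 1)
    slope = (perf_at_ref[hi] - perf_at_ref[lo]) / np.maximum(hi - lo, 1)

    # slope == 1 means same speed, slope < 1 faster, slope > 1 slower.
    return 1.0 - slope
```

With this convention a performance identical to the reference yields a flat line at 0, and one played at half speed yields a flat line at -1.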
Note that the results are produced by an objective comparison with the reference; a different rhythm is not necessarily bad (e.g. it can be a conscious decision by the musician for artistic purposes). This tool should only be used as a guide for the student to help identify some mistakes that would have gone unnoticed otherwise.
The accuracy of the results is not perfect either. Some things to consider are:
- There can be spikes at the beginning or end of the recording. This is due to silence before and after the performance (in both the reference and the student’s versions).
- A constant line with a small value usually indicates that the corresponding section of the performance had a uniform speed difference compared with the reference. A low-amplitude line that oscillates quickly between positive and negative values could be a sign of imperfections in the program.
- High spikes in the middle of the chart usually indicate that the program has been unable to align the reference and the student’s versions. This is typically caused by some difference other than rhythm, such as a wrong note; values just before and after that spot might be unreliable. An overly long pause might also create such an effect, but in that case the program would be giving the correct results.
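The spikes caused by leading and trailing silence could be reduced by trimming both recordings before comparing them. The sketch below is not part of the tool as described; the function `trim_silence` and its default parameters are hypothetical, and it uses a simple frame-energy threshold relative to the loudest frame (Librosa offers a similar utility, `librosa.effects.trim`).

```python
import numpy as np

def trim_silence(y, frame_length=2048, hop_length=512, top_db=40.0):
    """Cut leading and trailing frames that are quieter than the loudest
    frame by more than top_db decibels.

    y: mono waveform as a 1-D array. Returns the trimmed waveform.
    """
    y = np.asarray(y, dtype=float)
    if len(y) == 0:
        return y

    # Root-mean-square energy of each (overlapping) frame.
    n_frames = 1 + max(0, len(y) - frame_length) // hop_length
    rms = np.array([
        np.sqrt(np.mean(y[k * hop_length:k * hop_length + frame_length] ** 2))
        for k in range(n_frames)
    ])

    # A frame counts as sound if it is within top_db of the loudest frame.
    threshold = rms.max() * 10 ** (-top_db / 20)
    loud = np.flatnonzero(rms > threshold)
    if loud.size == 0:
        return y[:0]  # the whole signal is silent

    start = loud[0] * hop_length
    end = min(len(y), loud[-1] * hop_length + frame_length)
    return y[start:end]
```

Applying the same trimming to the reference and to the student's recording keeps the two starting points comparable; the threshold would likely need tuning for noisy recordings.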
How to use it
Check the official repository for the code and usage instructions.