The script slows down over time, possibly because the frequency table is a dict that is updated constantly. We should be able to pre-allocate a list sized to the length of the reference sequence, which is available from the SAM header (the LN field of the @SQ line). Implement this and compare processing times against the dict version on the same data.
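
A rough sketch of the pre-allocation idea, not the script's actual code. It assumes the header is read as plain text and that counts are kept per nucleotide; `reference_lengths`, `allocate_table`, and `NUCLEOTIDES` are hypothetical names made up for illustration.

```python
import io

# Hypothetical alphabet; the real script may also need to track N, gaps, etc.
NUCLEOTIDES = "ACGT"


def reference_lengths(sam_handle):
    """Return {reference name: length} parsed from the @SQ header lines."""
    lengths = {}
    for line in sam_handle:
        if not line.startswith("@"):
            break  # header lines always precede the alignment records
        if line.startswith("@SQ"):
            # Fields after the record type are tab-separated TAG:VALUE pairs.
            tags = dict(f.split(":", 1) for f in line.rstrip("\n").split("\t")[1:])
            lengths[tags["SN"]] = int(tags["LN"])
    return lengths


def allocate_table(ref_length):
    """Build one row of counters per reference position, once, up front."""
    return [[0] * len(NUCLEOTIDES) for _ in range(ref_length)]


# Tiny made-up header, used only to exercise the functions above.
header = io.StringIO("@HD\tVN:1.6\tSO:coordinate\n@SQ\tSN:ref1\tLN:10\n")
lengths = reference_lengths(header)
table = allocate_table(lengths["ref1"])

# Record a G at 0-based position 3; no keys are created while processing reads.
table[3][NUCLEOTIDES.index("G")] += 1
```

Note that `reference_lengths` consumes the first alignment line when it detects the end of the header, so the file would need to be reopened or rewound before processing reads. For the timing comparison, wrapping the main loop of both versions in `time.perf_counter()` calls on the same input should be enough to see whether the dict is the bottleneck.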