Text-to-speech (TTS) models are often accessed via APIs. Those APIs typically impose a number of limits and rules so that the provider's system can process requests in a timely fashion.

One of the most common limits is a character limit. You typically can't submit an entire document's worth of text to generate audio. Instead, you have to break the text up into bite-sized chunks.

This tool lets you break your text up into chunks of various sizes.
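
To make the idea of chunking concrete, here is a minimal Python sketch of one way to split text so that no chunk exceeds a character limit. It is not necessarily how this tool works internally. The function name chunk_text, the max_chars parameter, and the limit of 30 in the usage line are all assumptions chosen for illustration; check your provider's documentation for the real limit. The sketch prefers sentence boundaries, falls back to word boundaries, and hard-splits a single word only when it is longer than the limit.

```python
import re


def chunk_text(text: str, max_chars: int) -> list[str]:
    """Split text into chunks of at most max_chars characters.

    Hypothetical helper for illustration: prefers sentence boundaries,
    falls back to word boundaries, and hard-splits any single word
    longer than max_chars.
    """
    if max_chars < 1:
        raise ValueError("max_chars must be at least 1")

    # Split after sentence-ending punctuation that is followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())

    # Sentences that are too long on their own are broken into words,
    # and words that are too long are cut into fixed-size slices.
    pieces = []
    for sentence in sentences:
        if len(sentence) <= max_chars:
            pieces.append(sentence)
            continue
        for word in sentence.split():
            for i in range(0, len(word), max_chars):
                pieces.append(word[i:i + max_chars])

    # Greedily pack pieces into chunks, joining them with single spaces.
    chunks = []
    current = ""
    for piece in pieces:
        if not piece:
            continue
        candidate = f"{current} {piece}" if current else piece
        if len(candidate) <= max_chars:
            current = candidate
        else:
            chunks.append(current)
            current = piece
    if current:
        chunks.append(current)
    return chunks


# Example usage with an assumed limit of 30 characters per chunk.
example_chunks = chunk_text(
    "Hello world. This is a test. Another sentence here!", 30
)
```

Splitting at sentence boundaries where possible tends to matter for TTS, since a chunk that ends mid-sentence can produce unnatural pauses or intonation when the audio pieces are joined. Note that this sketch collapses runs of whitespace between pieces into single spaces.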

