@alexjsteffen alexjsteffen released this 11 Sep 19:59

Full Changelog: 0.1.1...0.2


Release Notes

Improved Command-Line Interface and API Key Handling

The command-line interface is now built with the clap crate and adds a new --apikey option, which lets users pass their OpenAI API key directly on the command line. The program resolves the key by checking the --apikey argument first and then falling back to the OPENAI_API_KEY environment variable. If neither is set, it reports a clear error message explaining how to provide the key.
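The lookup order described above can be sketched with the standard library alone. This is a minimal illustration, not the project's actual code: the function names `resolve_api_key` and `api_key_from`, the error wording, and the choice to treat an empty string as "not provided" are all assumptions. In the real tool the command-line value would come from clap.

```rust
use std::env;

/// Hypothetical helper: prefer the value passed via --apikey, then fall back
/// to the value read from OPENAI_API_KEY.
/// Assumption: blank strings count as "not provided".
fn resolve_api_key(
    cli_arg: Option<String>,
    env_value: Option<String>,
) -> Result<String, String> {
    cli_arg
        .filter(|key| !key.trim().is_empty())
        .or_else(|| env_value.filter(|key| !key.trim().is_empty()))
        .ok_or_else(|| {
            // Illustrative message; the tool's real wording may differ.
            "No API key found: pass --apikey <KEY> or set OPENAI_API_KEY".to_string()
        })
}

/// Hypothetical wrapper that reads the real process environment.
fn api_key_from(cli_arg: Option<String>) -> Result<String, String> {
    resolve_api_key(cli_arg, env::var("OPENAI_API_KEY").ok())
}
```

Keeping the environment lookup outside `resolve_api_key` makes the precedence rule testable without touching process-wide state.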

Enhanced Default Settings and User Experience

Key parameters now have default values: the TTS model defaults to "tts-1-hd" and the voice defaults to "fable". Users can get started without setting either one and can still override both as needed.
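As a rough sketch of how such defaults might be applied, the snippet below falls back to the two values named above when no override is given. The `TtsSettings` struct and `from_options` constructor are hypothetical names for illustration. The actual tool presumably declares these defaults through its clap argument definitions.

```rust
/// Defaults named in these release notes.
const DEFAULT_MODEL: &str = "tts-1-hd";
const DEFAULT_VOICE: &str = "fable";

/// Hypothetical settings container, not the project's actual type.
#[derive(Debug, PartialEq)]
struct TtsSettings {
    model: String,
    voice: String,
}

impl TtsSettings {
    /// Use the caller's value when present, otherwise the default.
    fn from_options(model: Option<&str>, voice: Option<&str>) -> Self {
        TtsSettings {
            model: model.unwrap_or(DEFAULT_MODEL).to_string(),
            voice: voice.unwrap_or(DEFAULT_VOICE).to_string(),
        }
    }
}
```

Calling `TtsSettings::from_options(None, None)` yields the defaults for both fields, while passing `Some(...)` for either field overrides only that one.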

Refactored Code Structure and Improved Readability

The main function has been restructured to improve clarity and maintainability. Section comments now mark each stage of the process: input file handling, text chunking, audio generation, and file combination. As a result, the code flow is more logical and easier to follow.

Optimized Audio File Handling

The audio file combination process has been refined. The sorting logic now ensures that audio chunks are combined in the correct order, which is essential for the integrity of the final output. The output file has also been renamed from "combined.flac" to the more intuitive "output.flac".
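The notes do not say what the ordering problem was. One common cause, assumed here purely for illustration, is that plain string sorting places "chunk_10" before "chunk_2". The sketch below sorts by a numeric index parsed from the file name. The "chunk_<n>.flac" naming scheme and both function names are hypothetical.

```rust
use std::path::{Path, PathBuf};

/// Parse the trailing number from a name such as "chunk_12.flac".
/// Assumption: chunk files end in "_<index>" before the extension.
fn chunk_index(path: &Path) -> Option<u32> {
    path.file_stem()?
        .to_str()?
        .rsplit('_')
        .next()?
        .parse()
        .ok()
}

/// Sort chunk paths by numeric index rather than by string comparison.
/// Paths without a parsable index are placed last.
fn sort_chunks(mut paths: Vec<PathBuf>) -> Vec<PathBuf> {
    paths.sort_by_key(|p| chunk_index(p).unwrap_or(u32::MAX));
    paths
}
```

Sorting on the parsed number keeps the order stable however many chunks there are, without requiring zero-padded file names.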

Progress Tracking and Error Handling

A placeholder for a progress bar has been added to the audio generation step, laying the groundwork for better progress tracking. Once implemented, the progress bar will give users real-time feedback on the conversion process. Error handling has also been improved, particularly around API requests and responses, so that users receive clear feedback if the text-to-speech conversion fails.

Documentation and Code Cleanup

To streamline the codebase, redundant function-level documentation has been removed from several internal functions. This reduces verbosity while keeping essential documentation where it is needed, leaving a cleaner codebase that remains documented at critical points.

Together, these updates improve the usability, flexibility, and maintainability of the text-to-speech conversion tool for both new and experienced users.