And the people gave a shout, saying, It is the voice of a god, and not of a man.
Acts 12:22, The Bible
I've recently been looking for a way to get the computer to read my text. I always like to hear my text read aloud, so that I can make changes if the flow isn't right.
I figured, "Why do it myself when I could get the computer to do it?" In this blog post, I would like to discuss how I got Google Speech to work.
The search for human audio
My first hunch was to search for a speech engine that could just read my text. I didn't want to have to use the browser (like I've done plenty of times before); I wanted to do everything entirely from the terminal.
I stumbled upon my heart's desire when I saw a post on Ask Ubuntu. I tried say, festival, spd-say and espeak as text-to-speech engines. They all worked, but they all sucked; every one of them read my text like a robot.
It was truly unbearable. So I scrolled. I scrolled lower and lower on the page, looking at answer after answer.
Until I stumbled upon what I wanted to see:
pip install google_speech
google_speech "Test the hello world"
One command to install the Python library for Google Speech and another to say the words "Test the hello world". I heard the voice of a digital angel. She sounded a lot more natural than any of the engines I had tried before.
I rushed back to my terminal from my browser to try it out. Unfortunately, I had to set up a virtual environment first. (That's the most annoying part of Python. I understand its use; I don't understand why it is mandated.)
Once I took care of the virtual environment issue, it worked! And there was no need to sign up or use API keys or anything. A double win.
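For anyone stuck at the same step, here is a minimal sketch of the virtual environment setup. The directory ~/.venvs/speech is just a hypothetical location, and the note about SoX is an assumption based on how google_speech plays audio; adjust both to your system.

```shell
# Hypothetical location for the virtual environment; any writable path works.
VENV_DIR="${VENV_DIR:-$HOME/.venvs/speech}"

# Create an isolated environment so pip does not touch the system packages.
python3 -m venv "$VENV_DIR"

# Activate it for the current shell session.
. "$VENV_DIR/bin/activate"

# Install the library and its command-line tool inside the environment.
# Assumption: playback goes through SoX with MP3 support, which on Ubuntu
# may require something like: sudo apt install sox libsox-fmt-mp3
pip install google_speech
```

The google_speech command is only on the PATH while the environment is active, so a new terminal needs the activate line again.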
Hear it for yourself:
google_speech "Hello world"
google_speech "The quick brown fox jumps over the lazy dog"
Audible freedom
With this addition, I became super-powerful. I felt like Thanos with all 5 rings, finding a sixth hidden ring just to top it off. I knew that from this point on, it would be easy to use my terminal to read my text.
To read text from my website, I run:
$ curl https://mmhq.me/posts/how-i-read-my-webpages-with-google-speech | xmllint --html --xpath '//p/text()' 2> /dev/null - | google_speech -
Simple. Let's break it down.
I use curl to download the webpage I would like to read:
$ curl https://mmhq.me/posts/how-i-read-my-webpages-with-google-speech
I use xmllint to filter the HTML page downloaded by curl. I use XPath (the //p/text() part) to tell xmllint to print the text of all paragraphs on the webpage:
$ xmllint --html --xpath '//p/text()' 2> /dev/null -
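To see what the XPath filter keeps and drops, it can be tried on a tiny inline page without touching the network. The HTML below is made-up sample content. One thing the sketch shows: //p/text() only selects text that sits directly inside a paragraph, so words wrapped in nested tags such as em or a are skipped, while //p//text() picks those up too.

```shell
# A tiny stand-in page (hypothetical content) so the filter can be tried offline.
html='<html><body><h1>Title</h1><p>Plain text.</p><p>Text with <em>emphasis</em> inside.</p></body></html>'

# //p/text() selects only text nodes that are direct children of <p>,
# so the heading and the word inside <em> are left out.
direct=$(printf '%s' "$html" | xmllint --html --xpath '//p/text()' - 2>/dev/null)

# //p//text() also descends into nested elements such as <em> or <a>.
nested=$(printf '%s' "$html" | xmllint --html --xpath '//p//text()' - 2>/dev/null)

printf 'direct children only:\n%s\n\n' "$direct"
printf 'including nested text:\n%s\n' "$nested"
```

If the reading skips linked or emphasized words on a real page, swapping in the second expression is probably the first thing to try.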
Last but not least, I tell Google Speech to read the text aloud:
$ google_speech -
I can also save the speech to an MP3 file if I want:
$ google_speech -o output.mp3 -
The final command is all these commands piped together (with |) to save me the extra Enter keystrokes.
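To avoid retyping the pipeline, it could be wrapped in a small shell function. This is only a sketch: read_page is a hypothetical name of my choosing, not something google_speech provides, and the -s and -L flags for curl are additions that hide the progress meter and follow redirects.

```shell
# read_page is a hypothetical helper name, not part of google_speech.
# Usage: read_page URL [extra google_speech options, e.g. -o page.mp3]
read_page() {
    if [ "$#" -lt 1 ]; then
        echo "usage: read_page URL [google_speech options]" >&2
        return 1
    fi
    url=$1
    shift
    # Download the page, keep only the paragraph text, then hand it to
    # google_speech on standard input (the trailing "-").
    curl -sL "$url" |
        xmllint --html --xpath '//p/text()' - 2>/dev/null |
        google_speech "$@" -
}
```

With the function placed in a shell startup file such as ~/.bashrc, reading a post becomes read_page followed by the URL, and adding -o output.mp3 after the URL saves the audio instead.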
Bottom line
Terminal is king. Also, shout out to Google for making the text-to-speech service free for all. Never thought I'd be thanking them again. Maybe there still is hope in this universe.