Thursday, April 24, 2014

Talking C-3PO Project


Video




Background

My name is Luc Gallant, and I live in Northern Alberta, Canada. I am a Computer Engineer working as an Electrical Engineer in the power generation industry. I've done my fair share of programming, but usually in C++/Java/C#. I had not ventured into scripting languages before this, so please don't judge too harshly...

I have put this post together to showcase what I have created and to explain how I went about it, so that anyone who wants to can make one of their own. The main catch is that, because of copyright, I cannot distribute the sounds themselves. But if you have a legal copy of the movies, you can strip the soundtrack and then use my exported Audacity label files to extract your own sound files. Unfortunately, given the time I spent flagging sound files and building the whole thing, I can't afford to explain every single step in detail. Use the internet...

My desire to do this project came from having a life-size C-3PO bust that was constantly looking at me and saying nothing. I just needed it to talk. Plus, I liked that such a project would involve both software and hardware.

Project Objective

Allow visitors to make C-3PO talk, both by playing his quotes from all six movies and by speaking custom sentences assembled from words extracted from those quotes. The user would use either their smartphone or our media center tablet to visit a website hosted on a computer of some sort, and when the user selects a quote or types in his/her own sentence, it would play out at the C-3PO bust itself. Note that the software and methodology I used could be applied to any bust or talker project you desire, should you have the stubbornness to proceed...

Hardware

First thing to do was to select a computer platform. I'm sure I could have used a full size PC but that would be a tad excessive for the task at hand. From the constant stream of Maker Shed e-mails I get, I decided to select between the Raspberry Pi and the Arduino. I posed a question to Adafruit's site and quickly got direction towards the Raspberry Pi.

http://forums.adafruit.com/viewtopic.php?f=25&t=51241

I proceeded to order the Raspberry Pi Starter Kit from Adafruit, along with other material (see the bare-bones Bill of Materials for this project at the end of the post). The other material consisted mainly of an audio amplifier and speakers because, of course, the sound has to play out somewhere.

Operating System

I received the Raspberry Pi Starter Kit not long after ordering it and followed one of the million walkthroughs to get Wheezy Raspbian installed. I used the image straight from the Raspberry Pi website along with the SD card formatting tool and Win32DiskImager. Once the SD card was plugged in and I knew that the unit had an IP address, I logged in via Remote Desktop Connection and played around. Basically it's a small computer hosting Linux, so not much more to add there. This was by far the easiest part of the task. You also could elect to buy an SD card with it pre-installed. Although that's what I list in the BOM, that is not the one I got.

Software

The next part was to get the software written so that the sound would play out of the RasPi's speakers even though the user was accessing the website from another device. Typically, when you access a website, sound plays on your own computer, not on the remote server. My biggest concern was how to get that to work, so I posted a question on the Raspberry Pi forum to see how to get over that hurdle.

http://www.raspberrypi.org/forums/viewtopic.php?t=72995

I only got one response, but luckily it was an excellent one, and it got me going in exactly the direction I needed to go.

I read that Python was the official programming language on the RasPi, and so decided to learn a little bit about the language. I had not written in any form of interpreted language except maybe Prolog (brutal). I googled around and ended up on a Python class by Google:

https://developers.google.com/edu/python/

I read all the course items and decided I knew enough. Luckily Python has tons, and I mean tons, of forum posts where people are asking questions, so you can look up how to do pretty much anything just by googling it. That's how I hacked and slashed my way through the software part of the project. I'm not proud...

I actually set Python up in Windows and did most of the file processing and database work before even receiving the Raspberry Pi, but for the purposes of this post I'll present it in a different order.

Once I received the hardware I used the suggestion that was given to me in the RasPi forum post and made a simple Hello World program using Flask tools. This by the way is an excellent tutorial.

http://www.mattrichardson.com/Raspberry-Pi-Flask/

Once I got the Hello World program to work, I knew I at least had a web server set up on the Raspberry Pi. All that remained was a proof of concept of sound actually playing out.

I followed the suggestion online of using Pygame to play sounds. I played around with all kinds of ways to get sounds to play out, but in the end just opted for Pygame.

Playing a simple sound through Pygame is not very difficult and many sites explain how.
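For reference, here is a minimal sketch of what playing one file through Pygame can look like. It is not the code from the zip file. The `/mnt/Files` directory is taken from the sox example further down, while the names `sound_path` and `play_sound` are made up for illustration.

```python
import time

SOUND_DIR = "/mnt/Files"  # assumed location of the exported .ogg files


def sound_path(sound_id, sound_dir=SOUND_DIR):
    """Map a label such as 'Q2' or 'W21' to its .ogg file path."""
    return "{}/{}.ogg".format(sound_dir, sound_id)


def play_sound(path):
    """Play one sound file on the local audio output and wait until it ends."""
    import pygame  # imported here so sound_path() works without Pygame

    if not pygame.mixer.get_init():
        pygame.mixer.init()
    channel = pygame.mixer.Sound(path).play()
    # play() returns a Channel (or None if no channel was free)
    while channel is not None and channel.get_busy():
        time.sleep(0.05)  # poll until playback has finished
```

On the RasPi you would call something like `play_sound(sound_path("Q2"))`, and the sound comes out of whatever is plugged into the Pi, not the visitor's phone.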

Sound Work

Of course, one of the biggest tasks is to get all the quotes/words extracted from the movie files.

As a point of mention, I do own all six movies.

First things first: identify every quote/word to be obtained from the movies. I did this by getting the script of each movie and extracting all C-3PO quotes using various macros and Excel work. It was a ton of work, but I eventually got a working file for each movie, with a list of quotes and words. Here's an example:



I labelled every quote, sentence and word so that it would be easy to reference as the project went on (Q2, S5, W21, W22, etc...). In the same file, I put in the time in hours, minutes and seconds, where this sentence and words are said in the movie. I did this initially by watching the movies at 400% speed but eventually smartened up and used the subtitle files available online.

Then, I needed the audio files. I extracted the soundtracks from the six movies using VLC, into mp4 containers.

With that extracted, as well as the timestamp for each quote, I used Audacity and put labels on every quote, and also on every intelligible word. You have to make sure to select the quotes/words exactly so that they don't contain any unwanted sound, and so that they don't cut C-3PO off.



I did this exercise for every single movie. By the way, that's ~354 quotes and ~4293 words. Of these, I obtained all the quotes and about 1000 words, since many of the words are unintelligible because there's a ship or blasters going off in the background.

I installed the ffmpeg plugin in Audacity and exported all the labels into .ogg files, to be used in Pygame as sounds down the road.

I also exported the labels themselves, to be used as a database to feed back into Excel, telling me whether I had captured a specific word or not (I endeavoured to do every quote, but not every word, due to intelligibility and duplicates).
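If you would rather check the exported labels with a script than with Excel, the sketch below shows one way. It assumes the usual Audacity label export layout of one label per line as start time, end time and label name separated by tabs. The function names and the sample times are made up.

```python
def parse_labels(text):
    """Parse an Audacity label export: start<TAB>end<TAB>label per line."""
    labels = []
    for line in text.splitlines():
        parts = line.split("\t")
        if len(parts) < 3 or line.startswith("\\"):
            continue  # skip blank lines and anything that is not a label row
        start, end, name = float(parts[0]), float(parts[1]), parts[2].strip()
        labels.append((name, start, end))
    return labels


def captured_ids(text):
    """Return the set of label names (Q2, W21, ...) present in an export."""
    return {name for name, _start, _end in parse_labels(text)}


# Made-up sample: one quote and one word from the same stretch of audio.
sample = "12.500000\t14.250000\tQ2\n12.500000\t12.900000\tW21\n"
print(sorted(captured_ids(sample)))
```

Comparing that set against the full list of quote and word numbers tells you what still has to be labelled.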

I also created a Master File which incorporated all the movies' words, to tell me whether a word had already been done (taking into account duplicates). For instance, the word "your" might be said 23 times, but it only needs to be captured once. It also does not need to be captured from every movie, just once overall. I used this Master File to make sure I had the 100 most common English words:
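I did this bookkeeping in Excel, but the same idea fits in a few lines of Python. This is only a sketch: the input is assumed to be a list of (word number, word) pairs, and the function names are invented.

```python
def first_occurrences(words):
    """Keep the first label seen for each distinct word, ignoring case."""
    master = {}
    for word_id, word in words:
        master.setdefault(word.lower(), word_id)
    return master


def missing_common_words(master, common_words):
    """List the common words that have not been captured yet."""
    return [word for word in common_words if word.lower() not in master]


# Made-up sample data: "your" appears twice but only W1 is kept.
spoken = [("W1", "Your"), ("W2", "the"), ("W3", "your")]
master = first_occurrences(spoken)
print(missing_common_words(master, ["the", "be", "your"]))
```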

http://en.wikipedia.org/wiki/Most_common_words_in_English

I then created a comma-separated values (CSV) database of every quote number and quote, and another of every word number and word, covering everything regardless of whether the sound file had been exported from Audacity. The program I would write would then use these as the starting point for its database.
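Loading such a file into Python is straightforward with the standard csv module. The sketch below assumes two columns (number, then text) and no header row, which may not match the layout of the files in the zip. `load_database` is a made-up name.

```python
import csv
import io


def load_database(csv_file):
    """Read rows of `id,text` into a dict such as {'Q1': 'Hello there'}."""
    database = {}
    for row in csv.reader(csv_file):
        if len(row) >= 2:  # ignore blank or malformed rows
            database[row[0].strip()] = row[1].strip()
    return database


# An in-memory stand-in for a quotes CSV file.
sample = io.StringIO('Q1,Hello there\nQ2,"Goodbye, you"\n')
quotes = load_database(sample)
```

With a real file you would pass `open("quotes.csv", newline="")` instead of the `io.StringIO` stand-in (the file name is hypothetical).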

Website for Quotes

I needed to elaborate on the Hello World website and get it to, as a start, play quotes. Unlike what I described above, I didn't capture all the quotes at once; I did them bit by bit, so the website had to be generated dynamically based on which quotes existed.

The Flask tutorial linked above shows how to create a dynamic website; all I needed to know was how to loop through a Python list to create the HTML dynamically. I googled that and found a great example here (see "Loop in templates"):

http://blog.miguelgrinberg.com/post/the-flask-mega-tutorial-part-ii-templates

I then passed to the html renderer a Python Dictionary of my quotes, say:

{ "Q1" : "Hello there", "Q2" : "Goodbye you" }

and then the dynamic HTML renderer iterated through this dictionary to list out all the quotes. When a user clicked a quote, the browser would go to an address such as:

http://192.168.5.210/quotes/Q2

The Python program would handle this address, parse out "Q2", and play that quote file.
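A stripped-down sketch of that idea in Flask is below. It is an illustration, not the code from the zip file: the inline template, the `create_app` factory, and the `play` callback are all assumptions, and the real site uses proper template files.

```python
QUOTES = {"Q1": "Hello there", "Q2": "Goodbye you"}

# Inline template with the same kind of loop the tutorial describes.
PAGE = (
    "<ul>{% for url, text in links %}"
    '<li><a href="{{ url }}">{{ text }}</a></li>'
    "{% endfor %}</ul>"
)


def quote_links(quotes):
    """Build one (url, text) pair per quote, ordered by quote number."""
    ordered = sorted(quotes.items(), key=lambda item: int(item[0][1:]))
    return [("/quotes/" + quote_id, text) for quote_id, text in ordered]


def create_app(quotes, play):
    """Create the site; `play` is any function that plays a quote id."""
    from flask import Flask, abort, render_template_string

    app = Flask(__name__)

    @app.route("/")
    def index():
        return render_template_string(PAGE, links=quote_links(quotes))

    @app.route("/quotes/<quote_id>")
    def play_quote(quote_id):
        if quote_id not in quotes:
            abort(404)  # unknown quote number
        play(quote_id)  # the sound plays on the server, i.e. at the bust
        return render_template_string(PAGE, links=quote_links(quotes))

    return app
```

Something like `create_app(QUOTES, play=print).run(host="0.0.0.0", port=80)` would serve it on the network; port 80 is an assumption based on the address above, and it is one reason the program would need sudo.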

That piece came together fairly quickly actually.

Website for Custom Sentences

Then came the challenging part, the custom sentence creation piece.

The custom sentences would work similarly to the quotes, except that the user would type the sentence into a text box. Then, when the user pressed a button, the browser would go to a web link built from that sentence. For example, if the user typed "Hello there Luke", the link would be:

http://192.168.5.210/sentence/Hello%20there%20Luke

As in the quotes, this text portion "Hello%20there%20Luke" would get sent into the Python program.

The Python program would then parse out the words, look them up in a word dictionary, and play them consecutively if they are available.
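Roughly, the parsing and lookup could look like the sketch below (Python 3). The word-to-number mapping shown is invented, and so are the function names. Note that Flask normally hands the route the already-decoded text, so the `unquote` call is only there to show what happens to the `%20`s.

```python
from urllib.parse import unquote


def sentence_words(url_text):
    """Decode the URL portion and split it into lower-case words."""
    return unquote(url_text).lower().split()


def look_up(words, word_ids):
    """Split the request into sound ids that exist and words that do not."""
    found = [word_ids[word] for word in words if word in word_ids]
    missing = [word for word in words if word not in word_ids]
    return found, missing


# Invented mapping for illustration; the real one comes from the word CSV.
word_ids = {"hello": "W272", "there": "W159"}
words = sentence_words("Hello%20there%20Luke")
found, missing = look_up(words, word_ids)
```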

At the end of my project I added an autocomplete feature based on the jQuery tools. It tells you which words in the database contain the text you've typed in, whether that text is at the beginning, in the middle, or at the end of the word.
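The matching rule itself is just a "contains" test. The actual feature runs through jQuery in the browser, so the Python below is only a sketch of the rule, with a made-up function name and word list.

```python
def suggest(fragment, words, limit=10):
    """Return the known words that contain the typed fragment anywhere."""
    fragment = fragment.lower()
    if not fragment:
        return []  # nothing typed yet, so nothing to suggest
    matches = sorted(word for word in words if fragment in word.lower())
    return matches[:limit]


known = ["hello", "there", "the", "luke"]
print(suggest("he", known))
```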

Final website look is as shown here in this PDF File.

Quote/Word Engine

The first thing the Python program does on startup is inventory the files that Audacity has output, to confirm which ones exist. The program takes the Quote and Word CSV databases and cross-checks them against the directory listing of the files. The file directory simply contains all the files: Q1.ogg, W1.ogg, etc...

Once that is done, the program knows what is available to play versus not available.
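In Python terms this inventory boils down to intersecting two sets. A minimal sketch, assuming the database is a dict keyed by quote/word number (`available_ids` is a made-up name):

```python
import os


def available_ids(database, filenames):
    """Return the database ids that also have an .ogg file on disk."""
    on_disk = {
        os.path.splitext(name)[0]
        for name in filenames
        if name.lower().endswith(".ogg")
    }
    return set(database) & on_disk


# Made-up example: Q2 is in the database but was never exported.
database = {"Q1": "Hello there", "Q2": "Goodbye you", "W1": "hello"}
listing = ["Q1.ogg", "W1.ogg", "notes.txt"]
playable = available_ids(database, listing)
```

On the RasPi the listing would come from something like `os.listdir("/mnt/Files")`.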

If a quote request comes in as above, the sound will play if that file is available.

If a custom sentence request comes in, the availability of each word will be determined. If any of the words is unavailable, C-3PO responds with a programmable sentence, currently set to "I don't understand what you want me to say."

If all the words are available, the Python program will run a shell command using "sox" to combine all the files into a single file. I spent a considerable amount of time trying to get the files to play one after the other, using events and using timing, and none of it worked. The files need to be combined beforehand. It can't be done in real time, because the cycle time varies slightly and for speech the timing needs to be exact. Using "sox" was the easiest, quickest way to get this done. Example:

"sox /mnt/Files/W272.ogg /mnt/Files/W159.ogg /mnt/Files/W2856.ogg /mnt/Files/W248.ogg /mnt/Files/tmp.ogg"

I also programmed in punctuation. So for instance, if a user types in a period, comma, colon, or semi-colon, a delay is introduced. Period is 1.0 second whereas the others are 0.1 seconds. Other punctuation is ignored.
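Putting the punctuation rule and the sox call together could look like the sketch below. It rests on a few assumptions: that the pauses are the short and long delay files kept alongside the word files, that those files are named `delay.ogg` and `longdelay.ogg` (hypothetical names), and that sox is called with an argument list instead of a shell string. The word-to-number mapping is invented.

```python
import re
import subprocess

SOUND_DIR = "/mnt/Files"
# Hypothetical names for the two silence files; the real names may differ.
PAUSE_FILES = {".": "longdelay", ",": "delay", ":": "delay", ";": "delay"}
# A word (optionally with apostrophes inside), or one of the four pause marks.
TOKEN = re.compile(r"[A-Za-z0-9]+(?:'[A-Za-z0-9]+)*|[.,:;]")


def sentence_to_ids(sentence, word_ids):
    """Turn typed text into sound ids; return None if any word is missing."""
    ids = []
    for piece in TOKEN.findall(sentence):
        if piece in PAUSE_FILES:
            ids.append(PAUSE_FILES[piece])
        elif piece.lower() in word_ids:
            ids.append(word_ids[piece.lower()])
        else:
            return None  # caller should play the fallback sentence instead
    return ids


def sox_command(sound_ids, output=SOUND_DIR + "/tmp.ogg"):
    """Build the sox argument list that concatenates the files in order."""
    inputs = ["{}/{}.ogg".format(SOUND_DIR, sid) for sid in sound_ids]
    return ["sox"] + inputs + [output]


def combine(sound_ids):
    """Run sox and return the path of the combined file."""
    command = sox_command(sound_ids)
    subprocess.check_call(command)  # raises if sox reports an error
    return command[-1]


# Invented mapping for illustration only.
word_ids = {"hello": "W272", "there": "W159", "luke": "W248"}
print(sox_command(sentence_to_ids("Hello there, Luke!", word_ids)))
```

The exclamation mark is simply dropped, matching the rule that other punctuation is ignored.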

I also put in a small feature such that if a word is typed in that isn't available, it will add it to a "futureWords.txt" file, so that after a few months in operation, I could target those words that don't exist, to make the program more complete.
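A sketch of that logging idea, with made-up function names. The tally at the end is an extra that is not part of the feature as described; it just shows how the file could be put to use later.

```python
from collections import Counter


def remember_missing(words, handle):
    """Append each unavailable word to an open file, one word per line."""
    for word in words:
        handle.write(word + "\n")


def most_requested(lines):
    """Count how often each missing word was asked for, most common first."""
    return Counter(line.strip() for line in lines if line.strip()).most_common()
```

In use, that would be `with open("futureWords.txt", "a") as handle: remember_missing(missing, handle)`, and a few months later `most_requested(open("futureWords.txt"))` shows which words to go hunting for first.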

Currently the program is designed for one person to access it at a time. Only one instance of the temporary file is created, and I currently don't have plans to expand this. The point is for it to be used at my life-size bust, not by the general public.

Bill of Materials

I bought more than what is listed, but barebones this is all that is required:

Raspberry Pi: 40$ - https://www.adafruit.com/products/998
Power Adapter & Cable: 10$ - https://www.adafruit.com/products/501 and https://www.adafruit.com/products/592
4gb SD Card: 5$ - https://www.adafruit.com/products/1562
Mini Wifi Adapter (if required): 12$ - https://www.adafruit.com/products/814

Total: 67$

You also need a pre-made speaker with a built-in amplifier, or you can build your own (it is cheaper to get a pre-made one). By the way, I initially bought these parts hoping to implant the speaker in C-3PO's head, but this became impossible because the head does not disassemble and I did not want to break the bust. The parts for building your own:

2 speakers, 1 W, 8 Ohm: 4$ - https://www.adafruit.com/products/1313
3.7 W Audio Amp: 9$ - https://www.adafruit.com/products/987
Power Supply for Amp: 10$ - https://www.adafruit.com/product/276
Audio Jack: 1$ - https://www.adafruit.com/products/1699
Stereo Cable: 3$ - https://www.adafruit.com/products/876
Quarter Size Perma-Proto Board: 3$ - https://www.adafruit.com/products/1608
Female Headers: 3$ - https://www.adafruit.com/products/598
Panel Mount Barrel Jack: 3$ - https://www.adafruit.com/products/610
Box for speakers: 12$ - http://cgi.ebay.ca/ws/eBayISAPI.dll?ViewItem&item=150967092945

Total: 48$ (no doubt in my mind this could be bought "pre-made" for 1/4 the price)

You will also need soldering supplies and some wire for jumpers on the Perma-Proto board. At the time of writing, my speaker is not working because the amp is defective, and I am waiting for a new one.

Software Used

Excel for "database" creation and getting the scripts into a format that could be manipulated
Audacity for all audio work with ffmpeg plugin
Notepad++ for HTML and Python editing
IDLE for Python editing
Flask for hosting the dynamic website through Python
Wheezy Raspbian on the RasPi
Putty to connect via SSH
Windows Remote Desktop to remote into the RasPi
Win32DiskImager to create the image of Raspbian onto the SD card
jQuery to perform the Auto Complete
Sox for custom sentence creation
Pygame for playing out sounds

Source Code and Project Files

I thought about publishing the source to GitHub, but for now I'll just post it here in a zip file. Before anyone judges the quality of this source: as I already stated above, I hacked and slashed this together. It is not as efficient, clean, or proper as it should/could be.

This zip file contains three folders - Code, Scripts, and Sounds:

  • The Code folder is exactly as it is on my computer. This contains the python scripts and folder layout to host the website properly. To start the program, call the main file as such:
    • "sudo python c3po-brain.py"
    • This will continue to run until Control-C is pressed
  • The Scripts folder contains all the movie scripts downloaded from the internet and then worked on in Excel. The second-to-last tab shows where I have been working from. It will give you all the parsing, numbering, etc...
  • The Sounds folder is where all the sound work took place. It is as it is on my computer, minus the actual sound files in the "Files" subfolder (I've left the delay and long delay separators in there). The "Files" subfolder is where all the audio files need to be placed before you run the Python program. To help you along, I've included another subfolder for distribution purposes called "Label Tracks", which is an Audacity export of all the labels used to create the individual sound files. This is the time-consuming portion. To create all the sound files, you will need to source the movie audio from your own copy of the movies and import it into Audacity. Then, import the respective label file into Audacity to mark the sounds out. Note that you may need to adjust the time offset depending on where and how you source your audio. Also, make sure the project is at 44.1 kHz; otherwise you may get sound issues on the RasPi (I got slow sound).
The project will work without too much effort, but creating the audio files is what will slow you down initially. Again, a copyright thing.
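On the time offset mentioned above: if your audio starts a bit earlier or later than mine, every label will be off by the same amount. One way to deal with that is to shift the label file before importing it. This is a sketch that assumes the usual Audacity label layout (start, end and name separated by tabs); `shift_labels` is a made-up helper, and the offset value is something you have to find by ear.

```python
def shift_labels(text, offset_seconds):
    """Add a constant offset to every start/end time in a label export."""
    shifted = []
    for line in text.splitlines():
        parts = line.split("\t")
        if len(parts) < 3 or line.startswith("\\"):
            shifted.append(line)  # leave anything that is not a label row
            continue
        start = max(0.0, float(parts[0]) + offset_seconds)
        end = max(0.0, float(parts[1]) + offset_seconds)
        name = "\t".join(parts[2:])
        shifted.append("{:.6f}\t{:.6f}\t{}".format(start, end, name))
    return "\n".join(shifted) + "\n"


# Made-up label row, moved 1.5 seconds later.
print(shift_labels("12.500000\t14.250000\tQ2\n", 1.5))
```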

Here is the zip file - have fun!: File