Play and Speak Development: August 2008

Tuesday, August 26, 2008

Project plan submitted

After a week of research, testing and making up a lot of stuff, the project plan is completed and handed in. Liz and I aren't too sure about the SWOT-related tasks part of the project plan, but whatever... it's submitted now.

Much of our work over the past week or two has been solely on this document so reading through it will give updates on testing, research and other information.

To view the document, please click here.

Tuesday, August 19, 2008

On to the Project Plan

After getting the digital mock up done and presented, I completely forgot about writing another blog post. Anyway, click here to view the digital mock up. It should be processed by the time people read this post. It's basically how I think the application would work and how users would interact with it.

So the results of the presentation... I haven't checked yet :p The tutors did give us a lot of feedback since everyone had some sort of experience with either voice recognition software or early child education. One notable suggestion that could potentially be molded into our project is moving the entire application to another platform... PC based -> Nintendo DS.

It turns out that the DS already has something similar to what we're proposing. I think it was called Brain Training, which is made up of a lot of mini games, one of which is a voice training application. If DS coding isn't too complex and there are existing voice recognition libraries that are importable, the scope of the project could completely change.

After researching the development of DS games, I was informed that the general name for DS coding is "homebrew". I later came across a site called NDS HomeBrew which, as the name suggests, has a lot of support for anything DS... especially development. I've posted a thread on there asking about voice recognition with the DS in the hope that someone can help with it. If not, I may have to go back to using Dragon and Flash. Developing on the DS would be much, much nicer :p

The thread can be found here.

Getting back to Dragon, I did some testing with it. Once an account has been set up and trained to your voice, it works to an extent. Of course it's not perfect, so there are still a lot of limitations with this type of technology. Stuttering certainly doesn't help. It was a lot more rigid than I thought it would be... to cope with stuttering, it needs to be a little more flexible. Though I do understand that making it more flexible will probably make it less accurate.

After the initial setup, I ran through a few test cases speaking with various accents, pitch changes and tones to see how much the results would be affected. The variations I tried were: an Indian accent, an Asian accent (generalised and really obvious), intentional stuttering, talking in a high/low voice and asking someone else to talk.

The results... well, some things changed. A lot of problems came up with recognising even slightly complex words when accents were concerned. Stuttering sometimes gave null results. Asking someone else to talk gave similar results to accent changes. As it stands now, training will be crucial to accuracy.

However, all of these results could actually be void: when I played back what I said just to test the mic, I found that the mic itself was slightly damaged and was letting in annoying clicks and pops, which may have affected the voice recognition. Until I purchase a better mic, I won't be testing much more since the results may not be accurate.

I'm hoping to get a reply on the NDSHB forums ASAP so I have a little more focus and can gauge whether this will be possible or not. If the coding is too intense, I don't think I'll have the abilities to really do this with the DS :/

Thursday, August 7, 2008

Presenting Tuesday

So after talking to Bonnii, we've been able to confirm a presentation this Tuesday. Everything should be good to go by then... just the voice recordings to go, then I can finish the animation!

I'll upload a compressed version somewhere so everyone can get the gist of our project.

Wednesday, August 6, 2008

... NOT getting started

So we're starting off on the wrong foot already and did NOT present yesterday. It wasn't anyone's fault but for the moment, I can probably guess that Bonnii is a little disappointed with me (I'm not surprised nor can I blame her :p).

Due to Liz's flu, any group work for the past week has been rather difficult. Most of the SWOT is done but the finishing touches that were to be made have yet to be completed. Also, the digital mock up needs some voice recordings that have yet to be recorded. The sketches, however, have been done for a while now... just waiting on those recordings.

Liz did email Ralf but we haven't gotten any feedback as of yet. Hopefully we'll get a chance to present this soon without any penalties.

Getting started

This is a blog for the development of the Play and Speak speech training application for children. It is an application to help children learn how to read with the help of voice recognition technology. Along with Elizabeth Duong, I, Bill Giang, will aim to complete a working prototype by the end of the semester.

So far our concept utilises voice recognition software (e.g. Dragon Naturally Speaking) along with our own application (most likely developed in Flash) to record speech then compare it to a word that will display on screen. I conceptualised this process the following way:

01. Application begins
02. Flash randomly picks a word from a database and displays it on screen.
03. Voice input captured with Dragon
04. Voice is converted to text and saved as a text file (speech to text)
05. Flash then reads the text file and compares the converted speech to the word displaying on screen
06. If it is correct, Flash will display the result as correct. Else, Flash will ask user to try again
07. 02-06 repeated until the end of the words (maximum of 20 or so)
08. Overall result displayed on screen (e.g. 17 out of 20)
09. Choice to start the game again.
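
The steps above can be sketched roughly as follows. This is only a minimal illustration in Python, not the Flash/Dragon build: the names `play_round`, `normalise` and `recognise` are made up for this sketch, and `recognise` is a stand-in for whatever ends up handing Dragon's converted text to the application (for example, reading the saved text file). The limit of two attempts per word is also an assumption, since the plan only says the user is asked to try again.

```python
import random


def normalise(text):
    """Trim and lower-case so that 'Cat ' and 'cat' count as the same word."""
    return text.strip().lower()


def play_round(words, recognise, max_words=20, attempts=2, rng=None):
    """Run one game and return (score, total).

    `recognise` is a hypothetical callable standing in for steps 03-04
    (Dragon capturing speech and converting it to text). It receives the
    word on screen and returns the text that was heard.
    """
    rng = rng or random.Random()
    # Step 02: pick words at random, without repeats, up to the maximum.
    chosen = rng.sample(words, min(max_words, len(words)))
    score = 0
    for word in chosen:
        for _ in range(attempts):
            heard = recognise(word)  # steps 03-04 (stand-in)
            # Step 05: compare the converted speech to the displayed word.
            if normalise(heard) == normalise(word):
                score += 1  # step 06: correct
                break
            # Step 06 (else): loop round so the user can try again.
    # Step 08: the caller displays the overall result.
    return score, len(chosen)


if __name__ == "__main__":
    # A fake recogniser that always "hears" the right word.
    score, total = play_round(["cat", "dog", "sun"], lambda word: word)
    print(f"{score} out of {total}")
```

Passing the recogniser in as a function keeps the comparison and scoring separate from the speech side, so the same loop would work whether the text arrives through a file or is typed straight into the application.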

After some research on Dragon, I learnt that Dragon can convert speech to text then directly input that text to specific applications. If it is able to write directly into Flash as text input, some unnecessary steps can be cut out.