3PO-LABS: ALEXA, ECHO AND VOICE INTERFACE


From Zero to Submission in a day

5/2/2016


 
There's been a lot of talk in the Alexa dev community lately about all of the tutorial- or template-based skills that are flooding the market (and, of course, the related discoverability concerns). All these "build a skill in under an hour" walkthroughs are great for bringing new devs into the fold, but they got us thinking about what it really takes for an experienced Alexa developer to build something well. The question we came to was: could one of us take a skill from nothing to submission in just one day? To answer it, I decided to try it out, cataloging the journey along the way. Read on for more...


So, the answer, for those of you who are impatient, is "Yes, but...", with big caveats on how "good" it has to be to be considered "done". I built DiceBot to a certifiable state over the course of this one day, but didn't actually submit it that day (moar featurez!). This article is all about the process, though, not the end goal, so let's get to it...

11:20 AM - Initial Idea

So, the skill I intend to create today is pretty simple: a random number generator, disguised as dice. Alexa actually already does this, but my skill comes with a twist. Based on subtly different speech patterns, I'm going to allow the user to "weight the dice" to skew the randomness one way or another.

1:05 PM - Basic setup complete

Off to a slow start - things took longer than expected. Setup included the following:
  • New git project (and github remote project)
  • Copy an existing project, pare down to basics, rename things
  • Set up Maven for project
  • Get project running in Dropwizard
  • Eat lunch (important!)
  • Set up basic Postman call to test running service
  • Set up a Lambda proxy to avoid dealing with SSL
  • Create new skill configuration
    • Post on forums about misleading wording for new skills
  • Test out basic request
  • Fight with port forwarding on home router until tests start working

One annoying thing I found out is that you can't define an intent schema containing only the Amazon built-in intents.  To get up and running, I wanted to work with only the launch intent and built-ins, but that fails to save.  So, I added a garbage intent that I won't implement server-side, and now things are up and running, so I can start building out the skill.
[Picture: "Yes, let's!"]

2:50 PM - Crawling along

Man, this is really moving slowly. I started by stubbing out the dice rolling utility (aka java.util.Random) and all of its convenience methods (roll x dice y times, etc.), and then setting up some unit tests to get a bit of TDD going. The hope is that the upfront thought will minimize refactoring later.
Once I had some positive and negative unit tests, and had implemented the methods to make them pass, I started connecting things up to do a happy-path flow through Alexa. I went super simple to start, and will expand things outward starting now. At this point, my schema is just this:
[Picture: the initial intent schema]
And the output of a call is pretty barebones too:
[Picture: the card output of a call]
As you can see, the SSML is being reused for the card text, and the formatting is janky. Believe me, the audio output is at least as awkward sounding as this text looks. There will be time (I hope) for fit and finish later, though. Next up, let's fill out the schema and see if things work the way the unit tests say they should.

4:05 PM - That was easy

That was actually fairly painless - test-driven development for the win, I guess. While I was testing it out I actually noticed that I'm doing both bounds inclusive, which means rolling a six-sided die can return 7 possible values (0 is the extra number). Whoops!
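For anyone who hasn't hit this one before, here's a minimal sketch of how that kind of off-by-one tends to show up with java.util.Random. The FairDice class and its roll method are made up for illustration; this is not the actual DiceUtil code.

```java
import java.util.Random;

/** Hypothetical helper for illustration; not the real DiceUtil from the skill. */
final class FairDice {
    private final Random random;

    FairDice(Random random) {
        this.random = random;
    }

    /** Returns a value from 1 to sides, inclusive on both ends. */
    int roll(int sides) {
        if (sides < 1) {
            throw new IllegalArgumentException("sides must be at least 1");
        }
        // nextInt(sides) yields 0..sides-1, so shifting by one gives 1..sides.
        // A variant like nextInt(sides + 1) yields 0..sides instead, which is
        // sides + 1 possible values (7 outcomes for a six-sided die).
        return random.nextInt(sides) + 1;
    }
}
```

Passing the Random in through the constructor is just a convenience for unit tests, since a seeded instance makes the rolls repeatable.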
The other thing I noticed is the strange way unused slots are mapped.  My expanded utterances look something like this: 
[Picture: the expanded sample utterances]
As you can see, not all slots are used for each utterance of that intent. We can use default values if the user doesn't specify one of those values.  The weird thing, though, is that instead of completely omitting the slot when they send us the intent, they send us this poorly-built half-slot object.  Check it out:
[Embedded snippet: Intent Request with half-baked slot]

    
What is the deal with this, Amazon? Why would I want you to pass me a slot object with a name but no value? It's kind of a tease... "hey, I could've given you this value. But I didn't, though. Deal with it." I ended up running into a couple null pointers as a result of checking whether the parent object was populated, but not the value itself.
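A rough sketch of the defensive check that avoids those null pointers is below. The Slot class here is a hypothetical stand-in with just a name and a nullable value (it is not the SDK's own type), and SlotReader and intValueOrDefault are names invented for this example.

```java
import java.util.Map;

/** Hypothetical stand-in for a slot: a name plus a value that may be null. */
final class Slot {
    private final String name;
    private final String value;

    Slot(String name, String value) {
        this.name = name;
        this.value = value;
    }

    String getName() {
        return name;
    }

    String getValue() {
        return value;
    }
}

final class SlotReader {
    private SlotReader() {}

    /** Returns the slot's integer value, or fallback if the slot or its value is missing or not a number. */
    static int intValueOrDefault(Map<String, Slot> slots, String name, int fallback) {
        Slot slot = slots.get(name);
        // Checking only "slot != null" is the trap: the slot object can be
        // present while its value is still null.
        if (slot == null || slot.getValue() == null || slot.getValue().trim().isEmpty()) {
            return fallback;
        }
        try {
            return Integer.parseInt(slot.getValue().trim());
        } catch (NumberFormatException e) {
            return fallback;
        }
    }
}
```

With something like this, an utterance that leaves out the number of dice falls back to whatever default the caller passes in.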

Anyhow, to this point everything has been fair dice. Now to make things cheater friendly.

6:45 PM - Cheater code completed

This one took a while because of all the testing around it, but it's good to go now. What I did inside of my DiceUtil class was to have it accept an instance of DiceSkewStrategy, which is an interface I just made up with a single method named skew(int value, int maxValue). I then built a bunch of tests and an implementation of this strategy called CoefficientDiceSkewStrategy.
When building out my implementation, I wanted a few things to be true:
  • I wanted the level of skew to be configurable/tunable
  • I wanted it to skew fairly evenly, instead of clustering on specific numbers
  • I wanted the entirety of the range to stay in play (i.e. don't want to eliminate rolling a 1 on a d20).
Had I conceded one or more of these requirements, I certainly could've come up with a quicker implementation, but all things considered 2.5 hours isn't so bad (especially when factoring in time spent screwing around on Reddit and the Alexa developer forums) for this implementation. So here's how it works:
  1. A new instance of the CoefficientDiceSkewStrategy is created, including a float parameter between -1 and 1, inclusive. This value is referred to as the coefficientModifier. A value of 0 means don't skew, -1 is skew heavily downward, and 1 is skew heavily upward.
  2. A random number is generated within the range of the die I'm rolling.
  3. The number generated, and its max possible value, are passed to the skew strategy.
  4. The skew strategy checks to see if the value even has room to skew.  If not, it returns early.
  5. A random float is generated and multiplied by the coefficientModifier and by the maxValue, and then rounded to an int. This gives us the value to add to our roll. Depending on the modifier used, it can be any integer between maxValue and negative maxValue, inclusive.
  6. This new value is added to our original roll.
  7. If this new sum ends up out of bounds, we keep the original roll but throw away the coefficient's random value and generate another.
  8. Step 7 is repeated up to a set number of times, currently set to 100, at which point we just use the original input from the roll.
  9. The newly skewed roll is returned. Success!
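The steps above can be sketched roughly as follows. DiceSkewStrategy, skew(int value, int maxValue), CoefficientDiceSkewStrategy, and coefficientModifier are the names from this post; everything else (the constructor shape, the MAX_ATTEMPTS constant, treating 1 as the lowest roll, and the exact "room to skew" check) is an assumption made for the sake of the example, not the shipped code.

```java
import java.util.Random;

interface DiceSkewStrategy {
    int skew(int value, int maxValue);
}

final class CoefficientDiceSkewStrategy implements DiceSkewStrategy {
    // Assumed name for the retry cap described in step 8.
    private static final int MAX_ATTEMPTS = 100;

    private final float coefficientModifier;
    private final Random random;

    CoefficientDiceSkewStrategy(float coefficientModifier, Random random) {
        // Written this way so that NaN is rejected too.
        if (!(coefficientModifier >= -1f && coefficientModifier <= 1f)) {
            throw new IllegalArgumentException("coefficientModifier must be in [-1, 1]");
        }
        this.coefficientModifier = coefficientModifier;
        this.random = random;
    }

    @Override
    public int skew(int value, int maxValue) {
        // Step 4: return early when there is no room to move in the skew direction.
        // Assumes rolls run from 1 to maxValue.
        boolean noRoom = coefficientModifier == 0f
                || (coefficientModifier > 0f && value >= maxValue)
                || (coefficientModifier < 0f && value <= 1);
        if (noRoom) {
            return value;
        }
        for (int attempt = 0; attempt < MAX_ATTEMPTS; attempt++) {
            // Step 5: random float times coefficient times maxValue, rounded to an int.
            int offset = Math.round(random.nextFloat() * coefficientModifier * maxValue);
            // Steps 6 and 7: keep the original roll, and retry only the offset if out of bounds.
            int skewed = value + offset;
            if (skewed >= 1 && skewed <= maxValue) {
                return skewed;
            }
        }
        // Step 8: give up after MAX_ATTEMPTS and fall back to the original roll.
        return value;
    }
}
```

Note that in this sketch an offset of 0 counts as a valid result, so a skewed roll can come back unchanged even when there was room to move.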
To add context onto a few of these decisions, here are a few additional notes:
  • While I picked a strategy with some very strict rules, the point of the interface was to make these things swappable or configurable in the future - maybe even by the user. I could see a possible strategy where it's fairly hands off, except to occasionally trigger additional 20s (or 1s) on a d20, for example.
  • We could've avoided the retry loop by only allowing the coefficient to go from 0 to (boundary - roll), but that would've made it so that in some cases skew was impossible.  For example, with a coefficient of .09, and a roll of 19 on a d20, there's no way to make that jump up to 20.
  • Alternately, we could have avoided rerolls by just rounding to the boundary whenever we escaped it. That would've resulted in artificially stacking on min and max values, though.
  • The reason we had to limit the retries is that there's an asymptotic situation that happens when you have a roll that is very close to the boundary on a very large max value. Take for example a roll of 65534 with a max of 65535. Even with a small coefficient, the odds of the skewed roll being out of bounds are incredibly high, and so it's possible to get in a near-endless loop trying to capture that last point of skew.
So, with all that done, it's time to actually implement the voice interface to take advantage of it.

8:00 PM - Easy peasy

Adding the voice model bits for the skewed rolls was super easy. I was surprised at how well Alexa was able to disambiguate between the extremely similar intents. Here's what the sample utterances look like now:
[Embedded snippet: Sample Utterances]

    
ROLL_DIE is a fair roll, ROLL_ME_A_DIE is weighted upwards, and ROLL_A_DIE_FOR_ME is a downward weighted roll. With that, the basic functionality of this thing is really complete. Now on to the boring testing, cleanup, etc.

9:30 PM - Listening, testing, and breaks (Oh my!)

So much trivial minutiae to wade through. Like listening to every piece of audio output several times to make sure it has the correct cadence, and adding breaks where it doesn't.

That's nothing compared to the testing process, though. Right now I have 25 utterances that should work, and 22 of them should work across three different request patterns (active session, "ask dice bot", "tell dice bot"). That's 66 manual test cases that have to be done any time the utterances or schema change (and they will change, when I tack on some branding stuff). We've harped on this before, but I can't stress enough how much (lack of) testability is a hindrance right now.

10:30 PM - Finishing up

One thing that was left to implement was the repeat functionality. If a user asked to have 10 dice rolled, they might have a hard time remembering all 10, and want to hear the numbers again (as opposed to just getting 10 new dice rolls). The way we handle things like this is that within a given session we maintain a conversation history that allows us to refer back to previous requests.
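As a rough idea of what that looks like, here is a minimal sketch of a per-session history that a repeat intent could read from. The ConversationHistory class and its methods are hypothetical names for illustration, and a real skill would also need to keep this somewhere that survives between requests in the same session (session attributes, for example).

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Optional;

/** Hypothetical per-session history; not the actual implementation. */
final class ConversationHistory {
    private final Deque<String> responses = new ArrayDeque<>();
    private final int capacity;

    ConversationHistory(int capacity) {
        if (capacity < 1) {
            throw new IllegalArgumentException("capacity must be at least 1");
        }
        this.capacity = capacity;
    }

    /** Stores the text of a response, dropping the oldest entry once full. */
    void record(String response) {
        if (responses.size() == capacity) {
            responses.removeLast();
        }
        responses.addFirst(response);
    }

    /** Returns the most recent response, if any, so it can be spoken again. */
    Optional<String> mostRecent() {
        return Optional.ofNullable(responses.peekFirst());
    }
}
```

A repeat handler would then call mostRecent() and either replay that text or fall back to a prompt when the history is empty.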

There were also a bunch of fit-and-finish items to work through. For example, when a user says roll 1d10, we want to use the words "die" and "value" instead of "dice" and "values".
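The singular/plural fix is about as simple as it sounds. A tiny sketch, with the RollPhrasing class and the exact output wording invented for the example:

```java
/** Hypothetical phrasing helper; the wording is illustrative only. */
final class RollPhrasing {
    private RollPhrasing() {}

    static String diceNoun(int count) {
        return count == 1 ? "die" : "dice";
    }

    static String valueNoun(int count) {
        return count == 1 ? "value" : "values";
    }

    /** Builds the spoken lead-in for a roll of count dice with the given number of sides. */
    static String announce(int count, int sides) {
        return "Rolling " + count + " " + sides + "-sided " + diceNoun(count) + ".";
    }
}
```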
With the cleanup largely complete, the skill is effectively in a state now where it could be submitted for certification (although, I'd still need to make icons for it). That being said, I'm not going to do that tonight, as there are a few things related to DERP branding (building DiceBot's home page) and a couple minor features (like a dynamic text option) that I'd like to tackle first. Plus, we consider user testing to be very important, so the fact that only I've used it thus far is a non-starter.

We may also choose to tackle a few additional features, such as:
  • Storing user data and letting users configure their coefficient modifier
  • The ability to append roll sets (think: "Roll 2d10 and 4d6 for me")
  • Additional DiceSkewStrategies (like the crit/crit-fail one talked about above)
Welp, thanks for sticking with us on this adventure! By the time you guys read this, DiceBot will be live, but we'd love to hear ideas about additional features we should add. Let us know in the comments or on our contact page. Or come join us on Alexa slack!

    Author

    We're 3PO-Labs.  We build things for fun and profit.  Right now we're super bullish on the rise of voice interfaces, and we hope to get you onboard.


