3PO-LABS: ALEXA, ECHO AND VOICE INTERFACE
  • Blog
  • Bots
  • CharacterGenerator
  • Giants and Halflings
  • The Pirate's Map
  • Responder
  • Neverwinter City Guide
  • About
  • Contact

A proof-of-concept and its roadmap

9/4/2017

Hi everyone. We at 3PO-Labs have made no secret of our frustrations around testing the Alexa voice/intent model. Around this time last year I started playing around with a sloppy hack to address the problem. I've fiddled with it off and on during that time, but a recent renewal of interest in the problem has pushed things to a point where I can start to peel back the covers on what I've built.


Something like a problem description

In building this tool, I've bounced thoughts off a bunch of people (shout-out to Jo, Nick, and Travis for being solid sounding boards), but one recurring theme was that it always took a really long time to explain what I was even hoping to accomplish. To that end, I figured I'd start by describing what the point of this whole thing is.

Our tool looks to solve the problem of doing automated testing of the Alexa voice/intent model. There are a lot of reasons why a person might want to do this sort of testing, but that's an altogether different post for the future. For now, just go with the assumption that as a skill developer you do want to test the intent model.

Going a little deeper into what I mean by "test the intent model", I like to think of the Alexa end-to-end scenario in three discretely testable phases or stages:
  1. Given some audio input from a user, what kind of Alexa skill request does Alexa render?
  2. Given an Alexa skill request, how does a given Lambda/service act & respond?
  3. Given a response from a Lambda/service, how is that output rendered by Alexa?

Now, stage #2 has been addressed by a ton of different tools that are much better than anything I'd have the patience to put together. The official service simulator and Bespoken's tools are both really good for helping you delve into #2.

Stage #3 is actually what we were addressing way back in the day when we built ASKResponder. Since then Amazon has released a couple tools (like the voice simulator) that make ASKResponder a bit less useful than it was on day 1, but it's still a handy little utility.

That leaves the first stage, which has been my white whale for some time. The tool that I'm showing you (in limited, proof-of-concept form) today allows you to finally address this aspect of the Alexa lifecycle.

A bit of level-setting

Before we jump into it, though, I want to make perfectly clear that this proof-of-concept is suuuper flaky at this point (partly for reasons outside our control), and it's purposely limited in scope. My general mental roadmap for this tool looks something like this:
  1. Open up the PoC for people to play around with (Check. Whoo!)
  2. Figure out how to meaningfully document what this process is all about, why it's worth doing, and how it all works.
  3. Meanwhile, address some of the major technical shortcomings described below, hopefully with some suggestions from the dev community.
  4. Open up the generic version of this UI to people to try on their own skills.
  5. Open up the REST API (and maybe provide a Java client) to let people actually do test automation on their own skills.

So, considered linearly (and knowing that it's taken me a year to get to this point), that seems like a long timeline. Luckily, most of this work has happened in parallel (I already have a 50-case test suite for CompliBot via the REST API, for example), so the generic UI and REST API are mostly done and are just waiting on solutions to the shortcomings mentioned in item 3.

So...PoC?

Alright, before I unleash you on the PoC, here's what you should know. The page you'll see has a bunch of buttons. Clicking one of those buttons will grab an audio file from S3, which will then be sent through AVS and trigger the Alexa skill (Neverwinter City Guide) I created at the Dev Days event in Seattle in July. The response will be compared against an expected response and marked as "passed" or "failed" accordingly. You can also provide your own audio file URL, or listen to our audio files, if you want.
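
To make the pass/fail step a bit more concrete, here's a minimal Python sketch of what such a comparison might look like. Everything in it is an assumption made for illustration: the `VoiceTestCase` class, the simplified `intent`/`slots` shape, the intent and slot names, and the placeholder audio URL are all hypothetical, and the real tool may compare something quite different.

```python
from dataclasses import dataclass, field


@dataclass
class VoiceTestCase:
    """One audio clip plus what we expect Alexa to make of it (hypothetical shape)."""
    audio_url: str
    expected_intent: str
    expected_slots: dict = field(default_factory=dict)


def evaluate(case, actual_request):
    """Return "passed" if the captured request matches the expectation, else "failed".

    `actual_request` is assumed to be a simplified dict such as
    {"intent": "...", "slots": {...}}, not the full Alexa request JSON.
    """
    if actual_request is None:  # e.g. a timeout, so nothing was captured
        return "failed"
    same_intent = actual_request.get("intent") == case.expected_intent
    # Assumption: slot values are compared case-insensitively.
    actual_slots = {k: str(v).lower() for k, v in actual_request.get("slots", {}).items()}
    wanted_slots = {k: str(v).lower() for k, v in case.expected_slots.items()}
    return "passed" if same_intent and actual_slots == wanted_slots else "failed"


case = VoiceTestCase(
    audio_url="https://example.com/audio/find-tavern.wav",  # placeholder URL
    expected_intent="FindPlaceIntent",                      # hypothetical intent name
    expected_slots={"placeType": "tavern"},
)
print(evaluate(case, {"intent": "FindPlaceIntent", "slots": {"placeType": "Tavern"}}))
print(evaluate(case, {"intent": "AMAZON.HelpIntent", "slots": {}}))
```

The first call matches on both intent and slot value, and the second fails on the intent. Treating a missing request as a failure is a design choice here, so that a timeout shows up as a red mark instead of silently passing.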

There are some serious shortcomings which I'll describe in a moment, but if you're antsy to start pushing buttons, here's the link:

http://utility.3po-labs.com:12443/NeverwinterCityGuide.html

Caveats on caveats on caveats

So, as I mentioned, the tool is a blatant hack, and it has a lot of shortcomings that we're trying to find solutions for.
  • Right now, this only works for brand new sessions. We're not messing with session management. We'll get to that eventually, but one-shot invocations are the ones that cause the most trouble today, so that's what we're addressing first.
  • There is currently a strongly recommended 30-second wait between requests. This has a lot to do with the aforementioned sessions. Basically, Amazon provides us no way to kill an AVS session programmatically, so if you don't wait long enough, back-to-back requests will be treated as part of the same session (and you'll get wonky results, as you might expect).
  • Multi-threading is a huge headache. For any given Alexa skill request there's no correlation id that lets you tie it back to the AVS request that generated it. As such, we're basically doing our best to guess which skill request matches which AVS request using a composite key of application id and skill user id (which you see in the fields at the top of the page). One additional thing to note is that for the PoC specifically, if more than one of you is using the page at the same time, you'll clobber each other's requests. We think the odds of this happening are pretty low, though.
  • Sometimes timeouts happen. There are a lot of moving parts here, and the 6-second window Alexa skills have to return is occasionally not long enough.
  • As a result of the above 4, there are sometimes race conditions that make the page start doing funny things like returning previous requests. IF THIS HAPPENS TO YOU, PLEASE CONTACT ERIC.
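
For the curious, the composite-key guessing and the 30-second wait described above could be sketched roughly like this in Python. This is not the actual service code: the `RequestCorrelator` class, its method names, and the ids in the usage example are hypothetical, and the clock is injectable only so the example can run without really waiting.

```python
import time


class RequestCorrelator:
    """Guess which skill request belongs to which AVS request.

    There is no shared correlation id, so the only handle is the composite
    key (application id, user id). All names here are hypothetical.
    """

    def __init__(self, cooldown_seconds=30.0, clock=time.monotonic):
        self.cooldown_seconds = cooldown_seconds
        self.clock = clock
        self.pending = {}    # composite key -> label of the AVS request we're waiting on
        self.last_sent = {}  # composite key -> time the last AVS request went out

    def seconds_until_ready(self, app_id, user_id):
        """How much of the recommended wait is left for this key."""
        last = self.last_sent.get((app_id, user_id))
        if last is None:
            return 0.0
        return max(0.0, self.cooldown_seconds - (self.clock() - last))

    def sent_to_avs(self, app_id, user_id, label):
        """Record an outgoing AVS request; returns the label it clobbered, if any."""
        key = (app_id, user_id)
        clobbered = self.pending.get(key)
        self.pending[key] = label
        self.last_sent[key] = self.clock()
        return clobbered

    def skill_request_arrived(self, app_id, user_id):
        """Match an incoming skill request to the pending AVS request for its key."""
        return self.pending.pop((app_id, user_id), None)


# Usage with a fake clock: two people hit the same key ten seconds apart.
now = [0.0]
correlator = RequestCorrelator(clock=lambda: now[0])
correlator.sent_to_avs("app-1", "user-1", "clip-A")
now[0] = 10.0
print(correlator.seconds_until_ready("app-1", "user-1"))    # 20.0
print(correlator.sent_to_avs("app-1", "user-1", "clip-B"))  # clip-A
print(correlator.skill_request_arrived("app-1", "user-1"))  # clip-B
```

Because only one pending request can exist per key, the second send overwrites the first. That is the "clobbering" from the bullet above, and it's why a real correlation id from AVS would be so much nicer than guessing.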

Problem solving

So, hopefully this is an interesting PoC that gets you thinking generically about how you might use a tool along these lines, and specifically about what you need from us for this to be useful. Beyond that, there are specific things we need help with:
  • If you can figure out a good way to programmatically kill an AVS session so that subsequent requests start from a tabula rasa, that would be great.
  • If you can figure out any sort of way to pass some sort of beacon from AVS to ASK, such that we can deterministically tie the two requests together, we'd be forever in your debt.
  • The other thing that we'd like is additional data for the PoC. The male and female voices are both from people in Seattle, which isn't known for speaking English with any sort of accent. If you have an interesting regional English accent (Southern US, for example), or if English is not your primary language, we'd love to have you record a bit of audio for us to use as additional inputs for testing our model.
  • A name! Right now the executable for this service is literally called "newtool-service.jar". If you have ideas for what we can call this thing, we'd love to hear them.

Looking forward to feedback on this one, and to rolling out the more useful features to you guys.
    Author

    We're 3PO-Labs.  We build things for fun and profit.  Right now we're super bullish on the rise of voice interfaces, and we hope to get you onboard.


