3PO-LABS: ALEXA, ECHO AND VOICE INTERFACE

Deprecating the LITERAL - I literally can't even...

9/21/2016

Our last two posts - about our screwup and about Holocron - have been fairly fun and lighthearted. Aside from the wordplay in the title, this is not one of those posts. Amazon has started down a path, and we need to talk about why this is a really, really bad move for the platform and for us as developers...


tl;dr: Amazon has deprecated the AMAZON.LITERAL slot type without providing feature parity for its many uses. Soon, they won't allow any new skills to use it. This will severely handcuff the scope of what developers can do with the platform. It concedes defeat (on one point) to some of their competitors. And it sends a terrible message to the dedicated development community about the value of their work.

The end of an era for literals

When Alexa was first opened up to the public, its model for describing possible voice interactions was much simpler. The only way to parameterize values was via the AMAZON.LITERAL slot, which basically allowed you to define a sample utterance with a generic placeholder in the middle that could be filled in with whatever words the user said. It wasn't perfect (speech-to-text never is), and sometimes you had to provide an absurd number of samples before it would start to get things right. In general, though, it was pretty damn good.
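For anyone who never used it, here's a sketch of what a LITERAL-based interaction model looked like (the intent and slot names here are our own invented examples, not anything from Amazon's docs). In the old syntax, each sample utterance paired example text with the slot name inside curly braces:

```
Intent schema:

{
  "intents": [
    {
      "intent": "EchoPhrase",
      "slots": [
        { "name": "Phrase", "type": "AMAZON.LITERAL" }
      ]
    }
  ]
}

Sample utterances:

EchoPhrase repeat after me {hello world|Phrase}
EchoPhrase repeat after me {meet me at the north entrance|Phrase}
```

At runtime, whatever the user actually said in that gap was passed through to your skill as the value of the Phrase slot.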

If we've already lost you, a good analogy might be to think about it like Cards Against Humanity (or Apples to Apples, for you less-than-horrible people). The sample utterances can be thought of like the black template cards, which have a gap for you to fill in.
That gap is what we'd call in Alexa parlance a "slot". Any of the white cards - representing possible values for that slot - that you have in your hand, taken from the thousands of cards printed so far, are guaranteed to work linguistically in that slot. Semantically, some make more sense than others, but they are all syntactically sound.

Now, this was all well and good for a while, but developers started to realize there were times when exchanging freedom for more accurate matching was preferable. Sometimes, you just wanted to pick from something formatted in a very specific way, or from an enumerated list of values. Amazon saw this and responded with two features - built-in slots (like dates), and custom slots (where a developer could define a set of values, and Alexa would prioritize matching to those values if possible).

Cool. Everybody was happy at this point. We had the ability to do some utterances with custom slots, and some with literals. We could even mix them together in the same utterance, if we wanted!
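To make that concrete, here's a rough sketch of a custom slot and a literal mixed in the same utterance (the slot and intent names are our own, purely for illustration):

```
Custom slot type DRINK_TYPE (one value per line):

coffee
tea
hot chocolate

Sample utterance mixing a custom slot ({Drink}, typed DRINK_TYPE) with
a literal ({...|Recipient}, typed AMAZON.LITERAL):

OrderDrink get a {Drink} for {my friend bob|Recipient}
```

The {Drink} slot would bias matching toward the enumerated values, while the Recipient slot would pass through whatever the user said.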

A couple months went by, and Amazon dropped a "deprecated" notice on the literal slot type, saying they'd basically prefer you to use other things. This happens all the time in software, and deprecated software lives on for ages, so nobody thought much of it until recently.
We were browsing the documentation on the interaction model the other day, when we noticed this sneaky little disclaimer:
Important: English (US) skills using the AMAZON.LITERAL slot type should be updated to use custom slots. Starting November 30, 2016, any English (US) skill using AMAZON.LITERAL will no longer pass certification.

English (UK) and German skills do not support AMAZON.LITERAL and cannot use the AMAZON.LITERAL slot type.
Suffice it to say, a lot of people in the dev community were not super pumped. It's one thing to tell new developers that they're better off using only the new features. It's another thing entirely to tell experienced users of the platform - who know what they're doing - that their approach is wrong and will no longer be allowed.

So no more literals to fill the gaps that custom and built-in slots cannot. Coming back to the CAH example, this would be kind of like the makers of the game coming out and saying, "Alright, from now on, you're only allowed to answer black cards with white cards from the corresponding expansion," or, maybe more accurately, "You can use cards from any expansion to answer a black card, but only for your own use. You're not allowed to share the hilarious mad-lib with anyone else, unless you follow our rules about constraining your range of possible answers."

The effects of this are many-fold.

(Quick aside, speaking of CAH expansions - if you like the game, they have some really good mini-expansions with proceeds going to great causes. These are totally referral links, but even if you remove our affiliate ID you should check them out.)

Two quick caveats

Before I jump into the reasons why I think Amazon is making the wrong choice here, I should put some safety language up.

We don't know exactly why Amazon is doing this (the announcement happened silently and unceremoniously without explanation), as they are frustratingly secretive about the inner workings and roadmap of Alexa. This approach is exceedingly problematic for developers, but that's a whole other blog post for the future. The point here, though, is that all of our arguments are based on what information we do have, which is naturally incomplete.

There are two specific possibilities that come to mind that might invalidate the arguments below:
  • Amazon may be planning to provide feature parity before deprecating AMAZON.LITERAL, but due to their secrecy we just haven't heard about the new feature that'll fill the gap yet.
  • They may understand the degree to which they are weakening the platform, but literals may be so much more resource-intensive or technologically difficult to maintain that they're choosing the cheaper route.
We personally doubt that either of these is accurate, but we felt that it was important (given how critical we are being) to make a good-faith effort to play Devil's advocate.
Assuming the first bullet point is untrue, there are a couple of lenses through which we can assess the damage being done here - that of the individual developer, and that of the product as a whole.

A major pain for developers

I'll start out by saying that there are a lot of skill developers who are not going to be affected by this. The vast majority of people who just stick around long enough to write a single simple skill - they probably don't care in the least. This is specifically a blow to the dedicated developers of the platform - those who have been around the block and taken the time to learn the nuances of the Alexa platform; those who are trying to break out of normal patterns and make innovative and interesting new ideas; those who are trying to push both Alexa's boundaries and the boundaries of voice user interface as a whole.
Let's jump into some specifics:
  1. So long, dictation. Now, Alexa is not a dictation platform. That's not what it was made for, and it's less effective at doing it than it is at mapping to more constrained models. That said, it was still pretty damn good at it, and a lot of people have made (and are continuing to make) some really interesting skills along those lines, where they took the 80% matching that Alexa would give, and did some post processing of their own to clean things up for their specific use cases. We have one skill that is close-to-complete that is taking this exact approach. We're now forced into a position of likely having to abandon it, after putting 100+ hours of effort into it.
  2. This also signals the fall of the fallback intent. To be perfectly honest, this was never a good solution to begin with, but because Alexa lacks very important information that developers need (when an utterance misses or misfires, what was the user actually saying?), this was the only way to gather that context. Now we're back to flying blind.
  3. Forcing dynamic data into static slots. There is no way to update a slot's values on the fly. There's not even a way to do it programmatically (without using something like Selenium to drive the Alexa Developer Console UI). This means that if you need new values for your slot, you have to rebuild the entire model, and have it sent through certification again. Drawing once again from the Cards Against Humanity analogy, this would be the equivalent of needing to buy a complete new copy of the game with the thousands of original cards, every time you wanted to add a new 30 card expansion to the game. You can't just plug the new slot values in and be done. It should be noted that this is not an abstract concern in the least. One of the most commonly recurring questions we hear in the dev community is about people who have a dynamic product database that they want a slot to match, so a user can ask for information about individual products.
  4. The recertification nightmare. One really scary thing about this change is what's going to happen with recertification. Recertification passes happen randomly (or on a cadence close enough to random that we haven't discerned it yet), and if you fail, you're given 72 hours to fix your skill or else it gets pulled. That means there are a ton of live skills right now at risk of getting a notification on a Friday afternoon that they're no longer in compliance - and if they don't fix it by Monday, too bad. And this isn't going to happen all at once on November 30th (let's call it "L-Day" from now on); rather, it'll hit skill owners individually and without warning.
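To illustrate point #1, here's a minimal sketch (our own hypothetical code, not anyone's production skill) of the kind of server-side post-processing a LITERAL made possible: take the imperfect raw transcription Alexa hands you and fuzzy-match it against dynamic data - the very data that, per point #3, can't be pushed into a custom slot without rebuilding and recertifying the model:

```python
import difflib

# Hypothetical product catalog - dynamic data that a static custom slot
# can't track without a full model rebuild and recertification pass.
CATALOG = ["echo dot", "fire tablet", "kindle paperwhite"]

def resolve_product(raw_slot_text):
    """Fuzzy-match an imperfect LITERAL transcription against the catalog.

    Returns the best catalog entry, or None if nothing is close enough.
    """
    matches = difflib.get_close_matches(
        raw_slot_text.lower(), CATALOG, n=1, cutoff=0.6)
    return matches[0] if matches else None

print(resolve_product("kindle paper white"))  # -> kindle paperwhite
```

Without the raw text from a literal slot, there's nothing to feed into this kind of cleanup step in the first place.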

As mentioned, this is a red flag for one of the skills we have in development, but it's not the only one of our projects that depended on the literal. We've been getting some really great feedback on our article from the other day about the Wookieepedia skill we experimented with. The thing is, for reasons #1 and #3, there's absolutely no way we could've done what we did without AMAZON.LITERAL.

A black mark on the product

Even assuming Amazon isn't too concerned about the potential skills that this will filter out, they still have to worry about the image of their brand. Killing literals is a step in the wrong direction, in that regard.

First, this is a clear and unconditional concession to the competing platforms. Now, credit where credit is due - Amazon was first to market by a long shot and essentially invented this product space; their choice to open the platform up to everyone for free from the start was a masterstroke, and the evangelists and developer marketing teams have done an extraordinary job getting people onto the platform.

But you can't rest on your laurels when the barbarian hordes are bearing down. SiriKit and Cortana are limited in who can use them and how, but they're perfectly happy to give users the data necessary to do post-processing. Third-party tools like Mycroft are trying to compete in this space, and let's not forget that Google's big announcement next week is gonna tell us what Google Home is all about.

Beyond all this, there is a cornucopia of strong natural language platforms popping up to sit behind voice services. Microsoft has LUIS, Google has its Cloud Natural Language API, IBM has Watson, Wolfram made available the system behind Alpha, and Stanford CoreNLP has been chugging along, solid as ever.

By suppressing these more advanced use cases, Amazon is openly conceding that their platform is no longer an appropriate sandbox for pushing the state of the art in NLP. As mentioned above, this may very well be a calculated business decision - they may have decided that the two core competencies that are best for their bottom line are skills tied to preexisting products, and quickie skills built by the masses.

Which leads to my second point, which is that this sends a really poor message to a broad swath of Alexa's base of developers. There has been a growing sense of disenfranchisement among many of Alexa's most ardent supporters of late (this topic probably deserves its own post too...). To say that we can no longer get certification for the projects we're working on if they use AMAZON.LITERAL is tantamount to saying that our projects are not worthy of being seen by the Alexa user community. It implies that the quality of our work is inferior to that of the flood of tutorial-based skills continuing to be released. And it outright tells us that our ideas are untenable if they don't fit a small set of patterns.

It might be a bit melodramatic, but the implications herein, combined with the way the change was announced (or not announced, as it were), make it feel like a slap in the face to those of us who have been the biggest champions of the platform.

The good news is that it's not too late for Amazon to grant the venerable literal a well-earned pardon.
23 Comments
Matt Buck link
9/26/2016 12:53:58 pm

I absolutely agree. The most distressing part of this entire affair was the complete lack of announcement from Amazon. Developing for this platform already comes with its own collection of challenges and pitfalls, and this new one just feels pretty demoralizing. Amazon's goal seems to be merely to drive up the number of template skills so that tech journalists will keep repeating the "Alexa knows how to do thousands of things already" line.

Reply
Eric
9/27/2016 10:32:37 pm

That is extremely well stated, Matt.

Reply
JasonBSteele
9/27/2016 01:33:57 pm

This is terrible news. I have just bought an Echo in the UK so that I can develop an app which sends the raw spoken text to Microsoft Bot Framework and LUIS for processing.

Is there no way around this?

Reply
Eric
9/27/2016 10:28:04 pm

Well, the answer to that question is "yes and no".

I'm assuming since you're in the UK that you're planning to build skills using Alexa's UK speech model, rather than the US model. If that's so, then you don't have many options, because the literal was not included there.

If you are planning on building for the US model like many people in the UK have been doing up to this point, then you technically have some options, but they're not great.

That said, I'm not giving up this fight yet. I know we have some people on the Amazon side who are sympathetic, and we need to give them the ammo to make their case internally.

Email Dave Isbitski and explain your use case. Bring it up on the forums so that the developer advocates can pass it along. And generally just keep talking about it.

Reply
JasonBSteele
9/28/2016 07:59:58 am

Hi Eric,

I tried emailing Dave and David dot Isbitski at amazon.com but both bounced. Do you know his email address?

EricF
9/27/2016 02:40:39 pm

Arrrrrrgh. I just bought an Echo to build an app that needs the LITERAL. It's a shame. If you do hear about feature parity, it would be appreciated if you would update us!

Reply
Eric
9/27/2016 10:31:56 pm

Well, if you get it certified before Nov. 30, you can still get it out there (although you'll have the specter of recert hanging over your head, and they haven't said yet how they're going to handle that).

As I mentioned to the commenter above, I'm gonna fight this one for a while. We have some ideas on how to make our case even stronger, but the best thing would be if we were just one voice in a chorus crying out about this.

We'll definitely keep people updated about any developments on this front, though.

Reply
ben92898
9/28/2016 11:27:42 pm

So there's now no way to collect input that is not rigidly defined in a custom slot? For example, no ability to ask them for the name of a book (since there are more than 50k books and you can't have more than that across all custom slots)? Surely I must be missing something. I just can't believe it. This is horrible. I just got hired for a project that requires this. Clearly they will be letting all the big players do free form input (unless this is a way to kill off competitors, prevent Spotify from being useful while letting their own music service continue as before).

Reply
Eric
9/29/2016 08:37:33 am

As I mentioned to the other commenters, this isn't set in stone yet. Make your use case known, and make it clear that people need this functionality. Emailing Dave Isbitski, who is their chief evangelist, would be a good start (isbitski at amazon dot com).

Reply
ben928989
9/29/2016 07:32:01 pm

I'll do as you suggest, but realistically it seems pretty set in stone to me... I mean, it had not been made available for the newer, non-US locales. And since it's slated to be banned in 60 days, I have to assume the decision was not made lightly or frivolously, that they're pretty committed to it, and that their reasons are strong, if lousy. I'm sure the feature will be back, in some modified form, when their competitors' apps prove to be more versatile and useful, but it may take a couple of years. Ugh, I hate it when a company intentionally cripples its platform's developers.

Mith
10/5/2016 09:13:12 am

Doesn't the custom slot type allow generic input too, though? It's not a pure enumeration; it just weights towards whatever you give in that list.

Reply
Eric Olson
10/5/2016 09:43:41 am

Great question/idea! The best answer I can give is "yes and maybe". You are correct in your description of how custom slots work - it's a matter of weighting, not pure enumeration.

The implicit question there is whether you can achieve the same functionality as you could through the literal, which is much harder to answer. We have no real way to do any sort of test along these lines without a bunch of confounding variables. Anecdotal evidence suggests that a custom slot with the same sample values as a literal will result in fairly significant holes in the output range.

To that end, David and I have been working on an approach to try to get a less anecdotal and more numbers-driven answer to that question. We also have some ideas about how we might "influence" the model to get something closer to parity with the literal. We'll definitely post our findings.

Reply
JasonBSteele
10/7/2016 04:49:56 am

Hi, I found that just dumping the output of http://randomtextgenerator.com/ into the values allowed my Custom slot to behave like a literal one. Seems a bit crazy that we have to do this... but it works :)

Will Cooke link
10/7/2016 02:11:34 pm

Well said! I just spent a good few hours getting my infrastructure set up so I could receive requests containing both constrained and unconstrained input for a project I'm working on, only to find out when I came to try it that you can't do that any more. The only way I can see to move forward is to dump thousands and thousands of lines of text from the database into the web form and hope they upload before the form times out.

Reply
Eric Olson
10/7/2016 07:21:32 pm

Yeah, the "workarounds" we've been starting to experiment are similar to what you and JasonBSteele (commenter above) are talking about - basically trying to squeeze the system to get as close as we can to feature parity. We shouldn't have to be going down this path, but I have to admit it's been kind of fun trying to reverse engineer things to optimize for slots. Hopefully we'll have some good data on that in the next week or two that we can blog about.

Also, FWIW, I have it on good authority that all of the conversations about this (here, on Slack, on Reddit, and on the forums) have gotten the attention of some people in Amazon. It may not amount to anything - they may still continue down the exact same path - but at least we can say we tried. I know the comments you guys have been leaving have been a big part of it. So thanks for that!

Reply
Andrea Bianco link
10/25/2016 06:45:57 pm

Hello Eric,
Your name and expertise have come to my attention through the Amazon Alexa Office calls.
I want to reach out, network, and build relationships between Alexa enthusiasts. I would like to bring together some community devs as well.

Do you have a Twitter account, Facebook, or an email where I can contact you?

Reply
Eric
10/25/2016 06:54:55 pm

Hi Andrea! I tweet under the name Galactoise. The contact info for me and my counterpart (David) can both be found on our contact page: http://www.3po-labs.com/contact.html

We really appreciate your reaching out. Out of curiosity, are you on Alexa Slack? Many of (what I'd consider) the top minds in Alexa development hang out there - it's a great place to network, bounce ideas around, or get help with problems. Self-registration page is here: http://www.alexaslack.com/ (I'm Galactoise on here, as well)

Reply
BobbyD
11/1/2016 10:01:15 am

As of this morning (11/1/2016), use of AMAZON.LITERAL is no longer allowed. What a shame.

Reply
Sam Turner
11/23/2016 08:40:58 am

Sorry, but this article is really misinformed. Yes, the literal slot is being replaced, but the literal slot was never a free-for-all; you still had to give examples of the type of content. The same is true with custom slots. They are a list of examples, not a definitive list of what can be said. I can promise you that it is still possible to use slots to collect whatever is heard.

Reply
Ed
12/27/2016 08:16:54 pm

Amazon's decision is beyond belief. I just hope somebody responsible for it also made a good assessment of the damage it can do to the brand. It is practically helping competitors win over all the frustrated Alexa developers.

I feel like I'm breaking up with a girlfriend who, I just discovered, has very unreasonable mental limitations. Alexa, we could do a lot together, if only you allowed for it. It's breaking my heart, but I am leaving you. I have a new, open-minded girlfriend. Sorry.

Reply
Chris Becker
1/18/2017 01:50:51 am

Here is a solution which works for me (via custom slot):

Intent schema:

{
  "intents": [
    {
      "intent": "RawText",
      "slots": [
        {
          "name": "Text",
          "type": "CATCH_ALL"
        }
      ]
    }
  ]
}

Custom Slot Types:

CATCH_ALL:
a | a a | a a a | a a a a | a a a a a | a a a a a a | a a a a a a a | a a a a a a a a | a a a a a a a a a | a a a a a a a a a a
(one value per utterance length you want to recognize - here, 1 to 10 words)

Sample Utterances:

RawText {Text}

Reply
Eric link
1/18/2017 01:13:52 pm

We actually took a very similar approach, filling up as much as we could with just the letter X. It didn't work as well as we had hoped. We started writing a blog post about it, but got distracted by other topics - I should go back and finish that.

Reply
Chris Becker
1/19/2017 12:18:11 am

I found this article on amazon:

https://developer.amazon.com/blogs/post/Tx3IHSFQSUF3RQP/Why-a-Custom-Slot-is-the-Literal-Solution





    Author

    We're 3PO-Labs.  We build things for fun and profit.  Right now we're super bullish on the rise of voice interfaces, and we hope to get you onboard.


