Tag Archives: java

Hacking Amazon Alexa with Java

For the recent AT&T IoT Hackathon in Dallas, we decided to try something new and make an Amazon Echo Dot a central part of our project. Our project used a Raspberry Pi with a camera to detect when the lever on a coffee airpot is pushed down, and capture a picture. We then fed the picture through IBM Watson for facial recognition, and wrote the name and the image to an S3 bucket.

This is where Alexa took over. I wrote an Amazon Lambda function in Java which read the S3 bucket and exposed two intents. The first was to ask “who took the last cup?” The function would respond with the name, which came from a text file in the S3 bucket. The second intent was more fun. You could then tell Alexa to “shame them”. This posted a Tweet with the image of the person and a caption saying they took the last cup of coffee.

We actually got this all working in a day. I handled the Alexa side of the project, while my teammate handled the Pi and Watson. The biggest challenge was figuring out how to actually get Lambda and Alexa playing together nicely using Java.

Amazon produces a lot of documentation about Alexa, and about Lambda, but very little of it deals with using the two together with Java. Most of the examples are for NodeJS, and the many NodeJS tutorials out there are of mixed quality. In the interest of improving the situation for us Java developers, I’ll share my lessons learned and walk through how to get this set up.

For the TL;DR crowd, you can grab the project source off my GitHub project and be sure to look at the examples in the Alexa Skills Kit Java SDK.

Creating your Project

First off, ignore all the “Using Lambda with the Eclipse SDK” tutorials. You do not want to do this as you’ll just be wasting your time. You need to be using the Java Alexa Skills Kit SDK. The jar is available in Maven Central, and all the source is in the GitHub repository. More importantly, the SDK includes numerous examples for how to use the SDK. For working with Alexa and Java, reading the source is the only reliable option.

Ultimately, Alexa cares about JSON payloads. The Skills Kit SDK is essentially a bunch of wrapper classes around the JSON exchange between Lambda and Alexa. This is the reason the other tutorials you’ll find don’t work with Alexa. You can’t have a Lambda that simply takes a String and returns a String. You need to implement a Speechlet, which takes a SpeechletRequestEnvelope and returns a SpeechletResponse.
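To make the JSON point concrete, here is a minimal sketch in plain Java, with no SDK involved, of the kind of response body a skill sends back to Alexa. The class and method names (ResponseJsonSketch, build, escape) are made up for illustration, and the field names reflect the Alexa response format as I understand it. In the real project the SDK's SpeechletResponse produces this for you, so treat this as a picture of what the wrappers hide rather than something to deploy.

```java
// Illustration only: hand-builds the kind of JSON that the Skills Kit SDK
// normally generates from a SpeechletResponse. Names here are hypothetical.
final class ResponseJsonSketch {

    // Escape backslashes and double quotes so the text stays a valid JSON string.
    // (A real serializer also handles control characters; this sketch does not.)
    static String escape(String text) {
        return text.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    // Build a plain-text speech response; endSession controls whether
    // Alexa closes the session after speaking.
    static String build(String speechText, boolean endSession) {
        return "{\"version\":\"1.0\","
             + "\"response\":{"
             + "\"outputSpeech\":{\"type\":\"PlainText\",\"text\":\"" + escape(speechText) + "\"},"
             + "\"shouldEndSession\":" + endSession
             + "}}";
    }

    public static void main(String[] args) {
        System.out.println(build("Bob took the last cup", true));
    }
}
```

A Lambda that returns a bare String never produces this envelope, which is why the generic Lambda tutorials fall over when Alexa calls them.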

For the initial project structure, I used Gradle. Since I’m talking to S3 and Twitter, I also have dependencies for those. You can trim them out if you’re not using them for your own project.

group 'org.sporcic'
version '1.0'

apply plugin: 'java'

sourceCompatibility = 1.8

repositories {
    mavenCentral()
}

dependencies {
    compile 'com.amazon.alexa:alexa-skills-kit:1.2'
    compile 'com.amazonaws:aws-lambda-java-core:1.1.0'
    compile 'com.amazonaws:aws-lambda-java-events:1.3.0'
    compile 'com.amazonaws:aws-lambda-java-log4j:1.0.0'

    compile 'com.amazonaws:aws-java-sdk-s3:1.11.56'
    compile 'org.twitter4j:twitter4j-core:4.0.5'

    compile 'log4j:log4j:1.2.17'
    compile 'org.slf4j:slf4j-api:1.7.0'
    compile 'org.slf4j:slf4j-log4j12:1.7.0'
}

task buildZip(type: Zip) {
    baseName = 'coffeeStatus'
    from compileJava
    from processResources
    into('lib') {
        from configurations.runtime
    }
}

build.dependsOn buildZip

line 6 : make sure you set your sourceCompatibility to 1.8, as Amazon Lambda uses Java 8
lines 13-16 : these are the core Amazon Lambda and Alexa SDK libraries. You need them.
lines 18-19 : I need these since I’m talking to S3 and Twitter. Remove them if you aren’t
lines 21-23 : the logging libraries you’ll need for S3
lines 26-35 : to deploy Java to Amazon Lambda, it has to be packaged as a zip file, with all the dependencies inside a directory called lib inside the zip file. This Gradle task takes care of that for you, and adds the task onto the normal build task.

This is all the Gradle file you need to write a function for Amazon Lambda. You can add additional dependencies depending on what you’re trying to do. You will upload the resulting zip file via the Amazon Lambda management console.

Now you need to create your SpeechletRequestStreamHandler implementation. This is a pretty simple class:

package org.sporcic;

import java.util.HashSet;
import java.util.Set;
import com.amazon.speech.speechlet.lambda.SpeechletRequestStreamHandler;

public class CoffeeStatusSpeechletRequestStreamHandler extends SpeechletRequestStreamHandler {

    private static final Set<String> supportedApplicationIds = new HashSet<String>();

    static {
        String appId = System.getenv("APP_ID");
        supportedApplicationIds.add(appId);
    }

    public CoffeeStatusSpeechletRequestStreamHandler() {
        super(new CoffeeStatusSpeechlet(), supportedApplicationIds);
    }
}

line 7 : name the class what you want, but you’ll use the fully qualified name of this class as the handler name in the Lambda configuration
lines 12-13 : the Skills SDK has logic to verify the application ID of the caller to the Lambda function. Rather than hard-coding the application ID of the Alexa Skill in code, I read it from an environment variable configured in the Lambda Management console.
line 17 : you need to implement a no-arg constructor which calls super() with an instance of your Speechlet and the Set of your authorized application IDs
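One thing the static block above does not guard against is the APP_ID variable being missing, in which case a null lands in the set. Here is a hedged sketch of a slightly more defensive version in plain Java. The AppIdConfig class, the comma-separated format and the example IDs are my own assumptions for illustration and are not part of the SDK.

```java
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical helper: turns the raw APP_ID value into a validated set of IDs.
final class AppIdConfig {

    // Accepts one ID or several separated by commas; fails fast when nothing usable is set.
    static Set<String> parse(String raw) {
        Set<String> ids = new LinkedHashSet<>();
        if (raw != null) {
            for (String part : raw.split(",")) {
                String id = part.trim();
                if (!id.isEmpty()) {
                    ids.add(id);
                }
            }
        }
        if (ids.isEmpty()) {
            throw new IllegalStateException("APP_ID is missing or empty");
        }
        return Collections.unmodifiableSet(ids);
    }

    public static void main(String[] args) {
        // In the Lambda this would be parse(System.getenv("APP_ID")).
        // The IDs below are placeholders, not real skill IDs.
        System.out.println(parse("amzn1.ask.skill.example-one, amzn1.ask.skill.example-two"));
    }
}
```

Failing at class-load time with a clear message is easier to spot in the logs than a skill that silently rejects every request.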

One final piece of setup is to create a log4j.properties file in the src/main/resources of your project. This is necessary to use logging inside of your Lambda function. The file needs to contain this configuration:

log = .
log4j.rootLogger = DEBUG, LAMBDA

#Define the LAMBDA appender
log4j.appender.LAMBDA=com.amazonaws.services.lambda.runtime.log4j.LambdaAppender
log4j.appender.LAMBDA.layout=org.apache.log4j.PatternLayout
log4j.appender.LAMBDA.layout.conversionPattern=%d{yyyy-MM-dd HH:mm:ss} <%X{AWSRequestId}> %-5p %c{1}:%L - %m%n

NOTE: Be sure to change the level of the rootLogger before you go to production!

Now comes the fun of implementing your Speechlet. Like a Servlet, the Speechlet interface defines the lifecycle methods for handling requests from Alexa. I based my code on the HelloWorld Speechlet in the Skills SDK. The primary difference is I used the new SpeechletV2 interface.

The SpeechletV2 interface defines four lifecycle methods Alexa will use to interact with your Lambda function:

public interface SpeechletV2 {

    void onSessionStarted(SpeechletRequestEnvelope<SessionStartedRequest> requestEnvelope);

    SpeechletResponse onLaunch(SpeechletRequestEnvelope<LaunchRequest> requestEnvelope);

    SpeechletResponse onIntent(SpeechletRequestEnvelope<IntentRequest> requestEnvelope);

    void onSessionEnded(SpeechletRequestEnvelope<SessionEndedRequest> requestEnvelope);
}

The primary method you’ll interact with is the onIntent() method. Here’s my implementation for a Skill with two intents:

    @Override
    public SpeechletResponse onIntent(SpeechletRequestEnvelope<IntentRequest> requestEnvelope) {
        log.info("onIntent requestId={}, sessionId={}",
                requestEnvelope.getRequest().getRequestId(),
                requestEnvelope.getSession().getSessionId());

        Intent intent = requestEnvelope.getRequest().getIntent();
        String intentName = (intent != null) ? intent.getName() : null;

        if ("CoffeeStatusIntent".equals(intentName)) {
            return getCoffeeStatusResponse();
        } else if("ShameUserIntent".equals(intentName)) {
            return tweetTheShame();
        } else {
            return getUnknownCommandResponse();
        }
    }

lines 3-5 : shows that logging works the same as in just about every other application, along with how to get the request and session IDs
lines 7-8 : you get the Intent off the request, and can get the actual name by calling getName() to decide what you’re going to do. These are the same intent names defined in the interaction model in the Alexa Skills Kit configuration.
lines 10-16 : I evaluate the String value for the Intent and call another function for each intent. I also have a fall through function which returns a generic unknown command response.
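With only two intents the if/else chain is fine. If the list grows, one alternative is a lookup table from intent name to handler. This is a sketch of that idea in plain Java, not what my project does: the IntentDispatchSketch class is hypothetical, and plain Strings stand in for the SpeechletResponse objects the real methods return.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical alternative to the if/else chain: a table of intent handlers.
final class IntentDispatchSketch {

    private final Map<String, Supplier<String>> handlers = new HashMap<>();

    IntentDispatchSketch() {
        // Strings stand in for the SpeechletResponse each real handler would build.
        handlers.put("CoffeeStatusIntent", () -> "status");
        handlers.put("ShameUserIntent", () -> "shame");
    }

    // Look up the handler by name; anything unrecognized (including null)
    // falls through to the generic response.
    String dispatch(String intentName) {
        Supplier<String> handler = handlers.get(intentName);
        return (handler != null) ? handler.get() : "unknown";
    }

    public static void main(String[] args) {
        IntentDispatchSketch dispatcher = new IntentDispatchSketch();
        System.out.println(dispatcher.dispatch("CoffeeStatusIntent"));
        System.out.println(dispatcher.dispatch("SomethingElse"));
    }
}
```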

Now let’s walk through one of the functions that builds the SpeechletResponse:

    private SpeechletResponse getWelcomeResponse() {
        String speechText = "Welcome to Coffee Status";

        SimpleCard card = new SimpleCard();
        card.setTitle("Coffee Pot");
        card.setContent(speechText);

        PlainTextOutputSpeech speech = new PlainTextOutputSpeech();
        speech.setText(speechText);

        return SpeechletResponse.newTellResponse(speech, card);
    }

lines 4-6 : while the Echos are voice devices, Alexa also has a mobile application. The cards (SimpleCard and StandardCard) define what shows up in the Alexa application as a result of the voice interaction. The SimpleCard only displays text, while the StandardCard provides the ability to include an image.
lines 8-9 : this is where we define what gets said back to the user via Alexa
line 11 : now that we have the Card and the OutputSpeech, we use a static factory method on the SpeechletResponse to build the response. The response can either be a “Tell” response, which simply states the OutputSpeech text, or an “Ask” response, which says the OutputSpeech and then prompts the user to provide additional information which can continue the user’s session.

The Intent provides access to the Slots data, which were defined in the Alexa Skill interaction model. The History Buff example in the Alexa Skills SDK is an excellent example of how to get data from the slots and have an interaction with the user.

Once all the code is ready, do a standard ./gradlew build to generate the zip file for upload to the Lambda Management console. The zip is placed in the build/distributions directory of your Java project.

One final note: the SDK lays down a pattern for adding the configuration of your Intents and Sample Utterances to the code repository. The pattern is to create a speechAssets folder under the directory your Speechlet is in. The two files you’ll create are IntentSchema.json and SampleUtterances.txt. Here are examples of mine:

{
  "intents": [
    {
      "intent": "CoffeeStatusIntent"
    },
    {
      "intent" : "ShameUserIntent"
    }
  ]
}

CoffeeStatusIntent who took the last cup of coffee
CoffeeStatusIntent who took the last cup
CoffeeStatusIntent who was the last person to get coffee
CoffeeStatusIntent what jerk took the last cup
CoffeeStatusIntent what jerk took the last cup of coffee
ShameUserIntent to shame them
ShameUserIntent shame them

Having these in your source code makes them easier to edit, since you can just copy/paste them into the correct fields in the Alexa Skill configuration. And having them close at hand also helps as a reference when developing your intents.
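Another small benefit of keeping both files in the repository is that you can sanity-check them against each other. The sketch below is an assumption of mine, not something the SDK provides: a hypothetical UtteranceCheck helper that takes the lines of SampleUtterances.txt plus the intent names you declared, and reports any utterance that starts with an undeclared intent name (a typo, usually).

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper: finds intent names used in utterances but never declared.
final class UtteranceCheck {

    // Each utterance line is "<IntentName> <spoken phrase>"; blank lines are skipped.
    static Set<String> undeclaredIntents(List<String> utteranceLines, Set<String> declared) {
        Set<String> missing = new LinkedHashSet<>();
        for (String line : utteranceLines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue;
            }
            String intent = trimmed.split("\\s+", 2)[0]; // first token is the intent name
            if (!declared.contains(intent)) {
                missing.add(intent);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        Set<String> declared = new HashSet<>(Arrays.asList("CoffeeStatusIntent", "ShameUserIntent"));
        List<String> lines = Arrays.asList(
                "CoffeeStatusIntent who took the last cup",
                "ShameUserIntent shame them");
        System.out.println(undeclaredIntents(lines, declared));
    }
}
```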

This takes care of the code. In my next post, I’ll cover how to deploy this to Amazon Lambda, and how to configure and test the Alexa skill.

Custom Error Pages with Spring Boot

I’ve been a big fan of the Spring Framework. Yes, it is now even more bloated than the JEE world it set out to replace, but for enterprise software development it provides a consistent solution to common problems, including ones you might not have realized you are going to have.

My biggest gripe with Spring is how painfully slow and complicated it has been to get a Spring Framework project started. Getting a basic MVC application set up with JPA and a good view technology is a royal pain in the butt. The new Spring Boot project was created to change that.

Spring Boot has turned setting up a Spring Framework project into a breeze. It’s not perfect, but after using it on a small project, I definitely plan on using it as a baseline going forward.

One of the issues with Spring Boot is that while it does a tremendous job with 90% of the problems, there is still 10% you need to dig in and figure out. Custom error pages were one of those problems for me.

Spring Boot uses embedded Tomcat by default, which means your 404 (and other) error page is the lovely, standard Tomcat page. I don’t want my error pages showing internal application state, especially for 500 errors, so I wanted to configure custom error pages.

It turns out this is a pretty simple task with the org.springframework.boot.context.embedded.EmbeddedServletContainerCustomizer interface.

Add the following Bean definition to whichever class you’re using for your main method to start up Spring Boot:

@Bean
public EmbeddedServletContainerCustomizer containerCustomizer() {

    return (container -> {
        ErrorPage error401Page = new ErrorPage(HttpStatus.UNAUTHORIZED, "/401.html");
        ErrorPage error404Page = new ErrorPage(HttpStatus.NOT_FOUND, "/404.html");
        ErrorPage error500Page = new ErrorPage(HttpStatus.INTERNAL_SERVER_ERROR, "/500.html");

        container.addErrorPages(error401Page, error404Page, error500Page);
    });
}

This is the Java 8 version using a lambda expression to simplify things. It creates three ErrorPage instances for three common HTTP Status Codes and then adds them to the container. The ErrorPage class is an abstraction for setting up error pages which will work with both Jetty and Tomcat.

The equivalent code for Java 7 using an inner class would be this:

@Bean
public EmbeddedServletContainerCustomizer containerCustomizer() {

    return new EmbeddedServletContainerCustomizer() {
        @Override
        public void customize(ConfigurableEmbeddedServletContainer container) {

            ErrorPage error401Page = new ErrorPage(HttpStatus.UNAUTHORIZED, "/401.html");
            ErrorPage error404Page = new ErrorPage(HttpStatus.NOT_FOUND, "/404.html");
            ErrorPage error500Page = new ErrorPage(HttpStatus.INTERNAL_SERVER_ERROR, "/500.html");

            container.addErrorPages(error401Page, error404Page, error500Page);
        }
    };
}

The actual error pages need to be placed in the static content directory of the Spring Boot web application. The default location is src/main/resources/static.

For the actual files, this archive contains versions inspired by the error page included in the awesome HTML5 Boilerplate.

With files in place, you will now see a simplified version of the core error pages which don’t expose the internal state of your application. For development, you would typically want to keep your regular 500 page so you can see what blew up without chasing the log files.

The Case for Strength

The pendulum has swung once again and I have escaped the world of architecture to get back to delivering software. I left Bank of America (Countrywide) for the second time last week, but not the mortgage industry. I am now the Application Manager for the origination system at Nationstar Mortgage.

The challenge for my first week has been getting my head around the codebase. The application is a combination of a vendor solution along with a lot of custom code built up along the side. Like a lot of in-house applications which started small, this one has grown into what I call the “big ball of mud” pattern, where there was not the long-term vision and architectural rigor for structuring a scalable, maintainable application. The fun part is it is my job to fix it.

After a week of looking at code, I have acquired a newfound love for strong typing. In a mad-genius, army-of-one development mode, the freedom of dynamic languages is both liberating and powerful. But as you start to add more cooks to the kitchen, things start to go downhill rapidly.

Case in point, I was attempting to troubleshoot a sporadic NullPointerException that pops up in production. The culprit method is getting an object out of a list, and calling a method on a nested object it contains. I’m trying to determine if the object itself is null, or if the nested object is missing, so I’m tracing back to where that object comes from.

The problem is that the instance of the object comes from an ArrayList, which is created from an object in a HashMap, and neither makes use of generics. I know what my final object is supposed to be, but I have to trace back through other code to see what the HashMap is being populated with.
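As an aside, one cheap way to tell the two null cases apart is to make each check carry its own message. This is only a sketch under assumptions: the Loan and Address classes are stand-ins I invented (the real codebase is not shown here), and Objects.requireNonNull needs Java 7 or later.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

// Hypothetical stand-ins for the real domain objects, for illustration only.
final class NullDiagnosisSketch {

    static final class Address {
        final String city;
        Address(String city) { this.city = city; }
    }

    static final class Loan {
        final Address property; // may legitimately be missing
        Loan(Address property) { this.property = property; }
    }

    // Each check names what was null, so the stack trace answers the question directly.
    static String cityOf(List<Loan> loans, int index) {
        Loan loan = Objects.requireNonNull(loans.get(index),
                "loan at index " + index + " is null");
        Address address = Objects.requireNonNull(loan.property,
                "loan at index " + index + " has no address");
        return address.city;
    }

    public static void main(String[] args) {
        List<Loan> loans = new ArrayList<Loan>();
        loans.add(new Loan(new Address("Dallas")));
        System.out.println(cityOf(loans, 0));
    }
}
```

Both failures are still NullPointerExceptions, but the message now says which link in the chain was missing.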

This whole codebase has left me begging for generics. Generics in Java are not just about type safety, they are about documenting your code. For example, instead of:

ArrayList myLoans = new ArrayList();
HashMap properties = new HashMap();

You should make use of generics so that other people aren’t left guessing on the contents:

List<Loan> myLoans = new ArrayList<Loan>();
Map<String, Address> properties = new HashMap<String, Address>();

In the second case, I know I have a list of Loan objects, and I know I have a map where the key is a String and the entry is an Address object. I don’t have to jump around in the code to see how they are used to understand exactly what they contain. Note another difference: I use the collection interfaces (List, Map) in my declaration rather than the implementation classes (ArrayList, HashMap).
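To show the type-safety half of the argument, here is a small self-contained sketch. The Loan class is a placeholder of mine. The raw list happily accepts a String and only blows up at the cast, while the generic list would reject the same line at compile time.

```java
import java.util.ArrayList;
import java.util.List;

// Illustration with a placeholder Loan class; not code from the real application.
final class RawVersusGeneric {

    static final class Loan {
        final String id;
        Loan(String id) { this.id = id; }
    }

    // Raw list: the compiler accepts any element, so the mistake surfaces only at the cast.
    @SuppressWarnings({"rawtypes", "unchecked"})
    static boolean rawListFailsAtRuntime() {
        List loans = new ArrayList();
        loans.add("not a loan"); // compiles without complaint
        try {
            Loan loan = (Loan) loans.get(0); // throws ClassCastException here
            return loan == null;             // never reached
        } catch (ClassCastException e) {
            return true;
        }
    }

    // Generic list: the declaration documents the contents and no cast is needed.
    static String firstLoanId() {
        List<Loan> loans = new ArrayList<Loan>();
        loans.add(new Loan("L-1"));
        // loans.add("not a loan"); // would not compile
        return loans.get(0).id;
    }

    public static void main(String[] args) {
        System.out.println(rawListFailsAtRuntime());
        System.out.println(firstLoanId());
    }
}
```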

From an object-oriented programming perspective, you should always strive to hide the implementation details of objects. And by using the interface, you buy yourself the flexibility of being able to swap in a new implementation class without breaking the code. For example, you could change the ArrayList to a Vector if you needed the synchronization, without breaking downstream code.
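Here is a minimal sketch of that swap. The InterfaceSwapSketch class and its totalLength method are invented for the example. The point is only that the downstream method is written against List and never notices which implementation it is handed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Vector;

// Illustration only: downstream code that depends on the List interface.
final class InterfaceSwapSketch {

    // Works with any List implementation; it never mentions ArrayList or Vector.
    static int totalLength(List<String> names) {
        int total = 0;
        for (String name : names) {
            total += name.length();
        }
        return total;
    }

    public static void main(String[] args) {
        List<String> names = new ArrayList<String>();
        names.add("Abel");
        names.add("Ready");
        System.out.println(totalLength(names));

        // Swap the implementation; the caller changes one line, the callee none.
        List<String> synced = new Vector<String>(names);
        System.out.println(totalLength(synced));
    }
}
```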

So with my dive back in to the bowels of enterprise software development, I’ve regained a new appreciation for my old friend Java. There is great strength in strong typing which permits you to build much more maintainable applications than the alternatives. Use that strength and make your team happy.

Abel and Ready

This is the story of Abel, a hard-working, dedicated Java developer. One night, after a hair-pulling coding session with Spring Security, Abel was visited by the Ghost of Java. “Abel, you have been extremely dedicated to me,” said the Ghost, “and for that, you will be rewarded.” “You will be visited by three ladies of the night as a reward for your years of sweat and turmoil hacking me,” the Ghost continued.

Two thoughts quickly raced through Abel’s mind. First, being a Java developer, it meant he would actually have to shave and shower for the next three days. Second, being a dedicated Java developer, three ladies in three nights might be a bigger task than he could handle. But throwing caution to the wind, he told the Ghost of Java he was up to the task, although he mentioned he would have preferred Java 7.

On the first night, a rather mature woman, slightly past her prime, showed up at his door. She was perfectly dressed in a revealing blue dress, but her exquisite makeup was not up to the task of concealing her sags and wrinkles. She said her name was IBM and that she would show Abel a good time. What she lacked in looks she made up for in conversation. She treated Abel like the center of the universe and was always ready with a supportive comment or another glass of expensive champagne. Abel was somewhat satisfied with the happy ending to the evening. It wasn’t spectacular but IBM made sure he knew it was all about him.

Abel’s second night took an abrupt turn for the worse when the next lady walked in. Looking like a crack whore in a thousand-dollar mini skirt, Abel’s new date introduced herself as Oracle. Upon seeing her, Abel’s first reaction was to put his hand on his wallet. Unlike his prior date, Oracle made clear the world revolved around her. She talked all night about the great many tricks she could do for Abel, but all for a price. Abel was thoroughly distracted from the happy ending between trying to keep an eye on his wallet and worrying he might catch something from his new date.

On the third night, Abel hid in his closet, fearing a repeat of the second night. Instead, he was pleasantly surprised when a curvaceous, girl-next-door redhead came in to his room and introduced herself as Microsoft. Although not much for conversation, Microsoft had curves in all the right places and not a wrinkle to be found. Abel got a chuckle over how she liked spinning in the flowers and talking about her MySpace page. The happy ending was very easy; so easy in fact that he wondered how he would ever be satisfied with anything more complicated again.

Abel’s plight is one that many Java developers have run through their heads this past week as news made it out that Oracle had purchased Sun Microsystems. IBM, the aging monstrosity it is, would have killed Java through neglect. Oracle, the crack whore of software, will possibly kill Java through trying to monetize (read “nickel and dime”) the development community to death. And then we have Microsoft, with C# and .NET 3.5, whispering sweet nothings in developers’ ears.

So what is a Java developer to do? I fretted over this most of the week so that I could be more rational when I put the pen to it. IBM owning Java would not be pleasant due to IBM always managing to be five years behind the technology curve. While that might be popular with large enterprises, it does nothing to please the alpha geeks that made Java what it is today. We can joke about Java being the new Cobol. IBM would have guaranteed it.

About the only worse possibility would be for Oracle to own Java, which is where we find ourselves today. Whereas Sun was an engineering company, Oracle is a sales company, and software developers strongly prefer dealing with the former. I have no doubt that Oracle will try to find a way to monetize Java, to the detriment of the community.

Finally, we have the wildcard Microsoft. C# is not much of a leap for a Java developer. And having sampled their forbidden fruit, there is definitely something to be appreciated. Microsoft would be more than happy to offer disenfranchised Java developers a new home.

But ultimately, there is another option, as Rod Johnson pointed out. Through the open source community, Java has grown beyond any one vendor. Yes, Oracle may try and stifle this community through a future Java release under a restrictive license which breaks compatibility with open source Java, but that would only cement their irrelevance.

Oracle’s ownership of Java means that the fate of Java is now in the community’s hands and not a vendor’s. There will not be much love for Oracle among Java developers, so I expect the next version of Java anyone cares about to be called JDK 7 and not Java 7. And it will be a true, open effort of dedicated developers and the Java ecosystem they have grown. So as tempting as it is to cave in to C#, I’m going to stick with Java for now and see where we can take this.

No Surprise

I didn’t have to wait long to prove out my first prediction for 2009. Big Blue is apparently in talks to buy Sun.

I have pretty mixed feelings about this. It means Java would be safe, but it would also be dead. IBM is a technology dinosaur, usually running several years behind the pack in the Java space. This would mean the end of innovation at the JVM level and the chances for a Java 7 any time this decade probably go down to zero.

IBM might also try to start monetizing the JVM. While it would be an incredibly stupid thing to do, never underestimate the stupidity of IBM.

The only bright spot will be what gets built on top of the JVM. The JDK 1.6.0_12 is a high-performance, stable beast of a platform. Things like Groovy and other JVM-based dynamic languages are the future of Java, so at least they’ll have a stable core to build on.