Supervised Learning Tutorial


In this example we will create a restaurant search bot by training a neural net on example conversations.

A user can contact the bot with something close to “I want a mexican restaurant!” and the bot will ask for more details until it is ready to suggest a restaurant.

This assumes you already know what the Domain, Policy, and Action classes do. If you don’t, it’s a good idea to read the basic tutorial first.

The Dataset

The training conversations come from the bAbI dialog task. However, the messages in these dialogues are machine-generated, so we will augment this dataset with real user messages from the DSTC dataset. Luckily for us, this dataset is also in the restaurant domain.


The bAbI dataset contains a lot of dialogues: there are 1000 stories in the training set, but you don’t need that many to build a useful bot. How much data you need depends on the number of actions you define and the number of edge cases you want to support, but a few dozen stories is a good place to start.

Here’s an example conversation snippet:

## story_07715946
* _greet[]
 - action_ask_howcanhelp
* _inform[location=rome,price=cheap]
 - action_on_it
 - action_ask_cuisine
* _inform[cuisine=spanish]
 - action_ask_numpeople
* _inform[people=six]
 - action_ack_dosearch

You can read about the Rasa data format in Training Data Format. It may be worth browsing through data/ to get a sense of how these stories work.
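To get a feel for the story format, here is a minimal sketch that reads the snippet above into a list of turns. It is not Rasa's own story reader: parse_story and the dictionary layout are made up for this illustration, and it only handles the simple intent[entity=value] syntax shown above.

```python
import re

def parse_story(text):
    """Split a story into turns of intent, entities and bot actions."""
    turns = []
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("*"):
            # user turn, e.g. "* _inform[location=rome,price=cheap]"
            match = re.match(r"\*\s*_?(\w+)\[(.*)\]", line)
            if match is None:
                continue  # skip lines this sketch does not understand
            intent, raw_entities = match.group(1), match.group(2)
            entities = dict(
                pair.split("=", 1) for pair in raw_entities.split(",") if pair
            )
            turns.append({"intent": intent, "entities": entities, "actions": []})
        elif line.startswith("-") and turns:
            # bot action that follows the latest user turn
            turns[-1]["actions"].append(line.lstrip("- ").strip())
    return turns

STORY = """\
## story_07715946
* _greet[]
 - action_ask_howcanhelp
* _inform[location=rome,price=cheap]
 - action_on_it
 - action_ask_cuisine
* _inform[cuisine=spanish]
 - action_ask_numpeople
* _inform[people=six]
 - action_ack_dosearch
"""

turns = parse_story(STORY)
for turn in turns:
    print(turn["intent"], turn["entities"], turn["actions"])
```

Each user message starts a new turn, and every action listed under it is what the bot should do before the next user message arrives.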

We can also visualize that training data to generate a graph which is similar to a flow chart:


The chart shows the incoming user intents and entities and the action the bot is supposed to execute based on the stories from the training data. As you can see, flow charts get complicated quite quickly. Nevertheless, they can be a helpful tool in debugging a bot. More information can be found in Visualization of story training data.
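The idea behind such a chart can be sketched in a few lines: treat every story as a path and count how often one step follows another. This is only an illustration of the concept, not how Rasa's visualization is implemented; story_edges and the START node are names invented here.

```python
from collections import Counter

def story_edges(stories):
    """Count transitions between consecutive steps across all stories."""
    edges = Counter()
    for steps in stories:
        path = ["START"] + list(steps)
        for source, target in zip(path, path[1:]):
            edges[(source, target)] += 1
    return edges

# two short stories that share a beginning and then diverge
stories = [
    ["greet", "action_ask_howcanhelp", "inform", "action_on_it"],
    ["greet", "action_ask_howcanhelp", "inform", "action_ask_cuisine"],
]
edges = story_edges(stories)
for (source, target), count in sorted(edges.items()):
    print(source, "->", target, "x", count)
```

Every point where stories diverge adds a new branch, which is why the chart grows so quickly as you add training data.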

Training your bot

We can go directly from data to bot with only a few steps:

  1. train a Rasa NLU model to extract intents and entities. Read more about that in the NLU docs.
  2. train a dialogue policy which will learn to choose the correct actions
  3. set up an agent which has both model 1 and model 2 working together to go directly from user input to action

We will go through these steps one by one.

1. Train NLU model

Our program looks like this:

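from rasa_nlu.converters import load_data
from rasa_nlu.config import RasaNLUConfig
from rasa_nlu.model import Trainer

def train_babi_nlu():
    training_data = load_data('examples/babi/data/franken_data.json')
    trainer = Trainer(RasaNLUConfig("examples/babi/data/config_nlu.json"))
    trainer.train(training_data)  # fit the pipeline before persisting it
    model_directory = trainer.persist('examples/babi/models/nlu/')
    return model_directory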

You can learn all about Rasa NLU starting from the GitHub repository. What you need to know, though, is that interpreter.parse(user_message) returns a dictionary with the intent and entities from a user message.
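As a rough illustration of what you can do with such a dictionary, the sketch below works on a hand-written sample rather than real interpreter output. The key layout (an intent with a name and confidence, plus a list of entities) and the confidence value are assumptions for this example; the exact keys may differ between Rasa NLU versions. The summarize helper is hypothetical.

```python
def summarize(parsed):
    """Reduce a parse result to (intent name, {entity: value})."""
    intent = parsed.get("intent") or {}
    entities = {e["entity"]: e["value"] for e in parsed.get("entities", [])}
    return intent.get("name"), entities

# hand-written sample, standing in for interpreter.parse(user_message)
sample = {
    "text": "I want a mexican restaurant!",
    "intent": {"name": "inform", "confidence": 0.93},
    "entities": [
        {"entity": "cuisine", "value": "mexican", "start": 9, "end": 16},
    ],
}

intent_name, slots = summarize(sample)
print(intent_name, slots)
```

The intent name and the entity values are exactly the pieces of information that appear in the stories, for example _inform[cuisine=spanish].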

This step takes approximately 18 seconds on a 2014 MacBook Pro.

2. Train Dialogue Policy

Now our bot needs to learn what to do in response to these messages. We do this by training the Rasa Core model:

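from rasa_core.agent import Agent
from rasa_core.policies.memoization import MemoizationPolicy

def train_babi_dm():
    training_data_file = 'examples/babi/data/'
    model_path = 'examples/babi/models/policy/current'

    agent = Agent("examples/restaurant_domain.yml",
                  policies=[MemoizationPolicy(), RestaurantPolicy()])

    agent.train(training_data_file)  # learn from the training stories
    agent.persist(model_path)        # save the trained policies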

This creates a policy object. What you need to know is that policy.next_action chooses which action the bot should take next.
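To make the idea of "choosing the next action" concrete, here is a toy policy that simply memorises which action followed which recent events in the training stories. It is a sketch under our own assumptions and not Rasa Core's MemoizationPolicy: the class name, the next_action signature, and the action_default fallback are all hypothetical.

```python
class ToyMemoPolicy:
    """Looks up the next action from the last `max_history` events."""

    def __init__(self, max_history=2):
        self.max_history = max_history
        self.lookup = {}

    def train(self, stories):
        for events in stories:
            for i, event in enumerate(events):
                if event.startswith("action_"):
                    # key: the events seen right before this action
                    key = tuple(events[max(0, i - self.max_history):i])
                    self.lookup[key] = event

    def next_action(self, history, default="action_default"):
        key = tuple(history[-self.max_history:])
        return self.lookup.get(key, default)

story = ["greet", "action_ask_howcanhelp", "inform",
         "action_on_it", "action_ask_cuisine"]
policy = ToyMemoPolicy(max_history=2)
policy.train([story])

print(policy.next_action(["greet"]))
print(policy.next_action(["greet", "action_ask_howcanhelp", "inform"]))
print(policy.next_action(["deny"]))
```

A lookup like this only works for histories it has seen before, which is why the tutorial also trains a neural network policy that can generalise to unseen conversations.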

Here we’ll quickly explain the Domain and Policy objects. Feel free to skip this if you don’t care, or read Plumbing - How it all fits together for more info.


Let’s start with Domain. From restaurant_domain.yml:

slots:
  cuisine:
    type: text
  people:
    type: text
  location:
    type: text
  price:
    type: text
  info:
    type: text
  matches:
    type: list

intents:
 - greet
 - affirm
 - deny
 - inform
 - thankyou
 - request_info

entities:
 - location
 - info
 - people
 - price
 - cuisine

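templates:
  utter_greet:
    - "hey there!"
  utter_goodbye:
    - "goodbye :("
  utter_default:
    - "default message"
  utter_ack_dosearch:
    - "ok let me see what I can find"
  utter_ack_findalternatives:
    - "ok let me see what else there is"
  utter_ack_makereservation:
    - "ok making a reservation"
  utter_ask_cuisine:
    - "what kind of cuisine would you like?"
  utter_ask_helpmore:
    - "is there anything more that I can help with?"
  utter_ask_howcanhelp:
    - "how can I help you?"
  utter_ask_location:
    - "in which city?"
  utter_ask_moreupdates:
    - "anything else you'd like to modify?"
  utter_ask_numpeople:
    - "for how many people?"
  utter_ask_price:
    - "in which price range?"
  utter_on_it:
    - "I'm on it"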

actions:
  - utter_greet
  - utter_goodbye
  - utter_default
  - utter_ack_dosearch
  - utter_ack_findalternatives
  - utter_ack_makereservation
  - utter_ask_cuisine
  - utter_ask_helpmore
  - utter_ask_howcanhelp
  - utter_ask_location
  - utter_ask_moreupdates
  - utter_ask_numpeople
  - utter_ask_price
  - utter_on_it
  - examples.restaurant_example.ActionSearchRestaurants
  - examples.restaurant_example.ActionSuggest

Our Domain has clearly defined slots (in our case, the criteria for the target restaurant) and intents (what the user can send). It also requires templates, which hold the text the bot uses to respond for a given action.

Each of these actions must either be named after an utterance (dropping the utter_ prefix) or must be a module path to an action. Here is the code for one of the two custom actions:

from rasa_core.actions import Action

class ActionSearchRestaurants(Action):
    def name(self):
        return 'search_restaurants'

    def run(self, dispatcher, tracker, domain):
        dispatcher.utter_message("here's what I found")
        return []

The name method is used to match actions to utterances, and the run method is executed whenever the action is called. This may involve API calls or internal bot dynamics.
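The two kinds of action names in the domain can be told apart with a few lines of standard-library Python. This is a sketch of the general idea only, not Rasa Core's loading code: resolve_action is a hypothetical helper, it assumes utterance actions are recognisable by the utter_ prefix as in the domain listing above, and a standard-library class stands in for a real custom action.

```python
import importlib

def resolve_action(name):
    """Return ("utterance", template_key) or ("custom", class_object)."""
    if name.startswith("utter_"):
        # utterance actions are answered from the templates section
        return "utterance", name
    # anything else is treated as a dotted module path to a class
    module_path, _, class_name = name.rpartition(".")
    module = importlib.import_module(module_path)
    return "custom", getattr(module, class_name)

kind, target = resolve_action("utter_ask_cuisine")
print(kind, target)

# a standard-library class stands in for a custom action module path
kind2, target2 = resolve_action("collections.OrderedDict")
print(kind2, target2.__name__)
```

In the real domain, a path such as examples.restaurant_example.ActionSearchRestaurants would be resolved in the same spirit, provided the module is importable.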


From examples/ again:

from rasa_core.policies.keras_policy import KerasPolicy

class RestaurantPolicy(KerasPolicy):
    def _build_model(self, num_features, num_actions, max_history_len):
        """Build a keras model and return a compiled model.
        :param max_history_len: The maximum number of historical turns used to
                                decide on next action"""
        from keras.layers import LSTM, Activation, Masking, Dense
        from keras.models import Sequential

        n_hidden = 32  # size of hidden layer in LSTM
        # Build Model
        batch_shape = (None, max_history_len, num_features)

        model = Sequential()
        model.add(Masking(-1, batch_input_shape=batch_shape))
        model.add(LSTM(n_hidden, batch_input_shape=batch_shape))
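        model.add(Dense(input_dim=n_hidden, output_dim=num_actions))
        model.add(Activation('softmax'))

        # softmax over the actions, trained with cross-entropy
        model.compile(loss='categorical_crossentropy',
                      optimizer='adam',
                      metrics=['accuracy'])
        return model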

This policy builds an LSTM in Keras, which the trainer then takes and trains. The parameters max_history_len and n_hidden can be adjusted depending on the complexity of the task and the amount of data available. max_history_len is important because it is the number of story steps the network can see when it chooses the next action.
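To see what max_history_len means for the network input, consider the following sketch. It turns a conversation of any length into a fixed window of shape (max_history_len, num_features), filling missing turns with -1 so that a Masking(-1) layer can ignore them. The pad_history helper is hypothetical, and padding on the left is an assumption made for this example, not necessarily how Rasa Core builds its features.

```python
def pad_history(turn_features, max_history_len, pad_value=-1):
    """Keep the last `max_history_len` turns, left-padding short histories."""
    num_features = len(turn_features[0]) if turn_features else 0
    recent = turn_features[-max_history_len:]
    missing = max_history_len - len(recent)
    padding = [[pad_value] * num_features for _ in range(missing)]
    return padding + recent

# two turns so far, each described by three features
history = [[1, 0, 0], [0, 1, 0]]
window = pad_history(history, max_history_len=3)
print(window)

# a longer conversation: only the three most recent turns survive
long_history = history + [[0, 0, 1], [1, 1, 0]]
long_window = pad_history(long_history, max_history_len=3)
print(long_window)
```

Anything that happened more than max_history_len steps ago is invisible to the network, so a larger value lets the bot use more context at the cost of needing more training data.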


Now we can simply run the training script with python to get our trained policy.

This step takes roughly 12 minutes on a 2014 MacBook Pro.

Using your bot

Now we have a trained NLU model and a trained dialogue model (DM), which can be combined to make a bot. This is done using an Agent object:

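def run_babi(serve_forever=True):
    # nlu_model_path: the model directory returned by train_babi_nlu()
    agent = Agent.load("examples/babi/models/policy/current",
                       interpreter=RasaNLUInterpreter(nlu_model_path))

    if serve_forever:
        agent.handle_channel(ConsoleInputChannel())  # chat on the console
    return agent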

We put the NLU model into an Interpreter and then put that into an Agent.
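The way the pieces fit together can be sketched without any Rasa code at all. Everything below is a toy stand-in invented for this illustration (ToyAgent, keyword_interpreter and rule_policy are not part of Rasa Core): the interpreter turns text into an intent, the policy picks an action from the history, and the agent passes data between the two.

```python
class ToyAgent:
    """Wires an interpreter and a policy together (illustrative only)."""

    def __init__(self, interpreter, policy):
        self.interpreter = interpreter
        self.policy = policy
        self.history = []

    def handle_message(self, text):
        parsed = self.interpreter(text)
        self.history.append(parsed["intent"])
        action = self.policy(self.history)
        self.history.append(action)
        return action

def keyword_interpreter(text):
    # crude stand-in for an NLU model
    intent = "greet" if "hello" in text.lower() else "inform"
    return {"intent": intent, "entities": {}}

def rule_policy(history):
    # crude stand-in for a trained dialogue policy
    return "utter_ask_howcanhelp" if history[-1] == "greet" else "utter_on_it"

agent = ToyAgent(keyword_interpreter, rule_policy)
first = agent.handle_message("Hello!")
second = agent.handle_message("I want a mexican restaurant!")
print(first, second)
```

Swapping either stand-in for a trained model does not change the wiring, which is the point of keeping the interpreter and the policy as separate components.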

You now have a working bot! It will recommend the same place (papi’s pizza place) no matter what preferences you give it, but at least it’s trying!