Tuesday, June 21, 2011

The conversation loop: replying to a statement

The actual AI XML web service part of the Talking Owl Project (as opposed to all of the fancy-schmancy front-end UI application stuff that lets you search for Owls to talk to, favorite the conversations you like the most, and all of that "user experience" stuff) really only needs to perform one very basic function: a single arc (or iteration) of the "conversation loop."

In other words: it has to reply to a statement from a user.

As simple as this sounds, I feel like I really need to wrap my head around the best approach to the top-level, abstract steps involved in this kind of operation. When I think about it conceptually (that is, verbally; that is, not in terms of programming or code), the basic steps seem to be these:

  1. I have some knowledge of the conversation so far in my working memory.
  2. I get the sentence from the user.
  3. I "parse" the sentence, or break it down into a useful representation of its parts.
  4. I "understand" the sentence, or come up with some kind of mental representation of what it means.
  5. I add the mental representation of the meaning of the sentence to the knowledge already in my working memory.
  6. I make inferences based on the newly combined mental models: this can include inferring new knowledge, answering questions by filling in gaps, or coming up with questions based on gaps or conflicts that appear.
  7. I add any interesting knowledge that I've gained into long-term memory.
  8. I generate a verbal sentence response based on the contents of working memory after all of that has taken place.

So as you can see, there is a lot going on for what seems like a "simple step."

In reality, once I look at it in terms of pseudo-code, some additional bookkeeping steps need to be in there as well: validating input, logging the user's input in the database, logging the owl's response, and so on.
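To give a feel for that bookkeeping, here is a minimal sketch of a conversation log that validates each statement before recording it. Everything in it is an assumption for illustration: the ConversationLog class, the SPEAKER_USER and SPEAKER_OWL constants, and the in-memory array standing in for the database are hypothetical, and it assumes PHP 7.4 or later.

```php
<?php
// Hypothetical sketch: an in-memory stand-in for the conversation log.
// The real service would write these entries to the database instead.
const SPEAKER_USER = 'user';
const SPEAKER_OWL  = 'owl';

class ConversationLog
{
    /** @var array[] one entry per statement, in the order spoken */
    private array $entries = [];

    // Validate a statement, then record who said it and when.
    public function logEntry(string $speaker, string $text): void
    {
        $text = trim($text);
        if ($text === '') {
            throw new InvalidArgumentException('Cannot log an empty statement.');
        }
        if ($speaker !== SPEAKER_USER && $speaker !== SPEAKER_OWL) {
            throw new InvalidArgumentException('Unknown speaker.');
        }
        $this->entries[] = ['speaker' => $speaker, 'text' => $text, 'time' => time()];
    }

    public function entries(): array
    {
        return $this->entries;
    }
}

$log = new ConversationLog();
$log->logEntry(SPEAKER_USER, 'Do owls sleep during the day?');
$log->logEntry(SPEAKER_OWL, 'Most of us do.');
```

Rejecting bad input at the point of logging keeps the later steps (parsing, understanding, inferring) from ever seeing an empty or unattributed statement.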

Additionally, it's important to keep in mind which pieces of the puzzle are involved in which operation. For example, when I refer to "parsing the sentence" I can't simply pseudo-code it as:

inputstructure = inputquery->parse();

or

inputstructure = TalkingOwl::parse(inputquery);

because sentences don't parse themselves, and parsing isn't an objective activity. When a sentence is parsed, it is parsed by a particular brain, and different brains may parse the same sentence in different ways. Thus, the pseudo-code for that step has to look something like,

inputstructure = CurrentOwl->parse(inputquery);
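As a rough illustration of why the parse has to belong to the owl, here is a small sketch in which two owls break the same sentence down differently. The ParsingOwl interface, the LiteralOwl and TerseOwl classes, and their word-splitting rules are all made up for this example (real parsing would be far richer), and it assumes PHP 7.1 or later.

```php
<?php
// Hypothetical sketch: parsing is done BY an owl, not by the sentence.
interface ParsingOwl
{
    /** @return string[] this owl's own breakdown of the sentence */
    public function parse(string $sentence): array;
}

// This owl splits on whitespace and keeps every word as written.
class LiteralOwl implements ParsingOwl
{
    public function parse(string $sentence): array
    {
        return preg_split('/\s+/', trim($sentence), -1, PREG_SPLIT_NO_EMPTY);
    }
}

// This owl lowercases the sentence and ignores a few filler words.
class TerseOwl implements ParsingOwl
{
    private const FILLER = ['the', 'a', 'an'];

    public function parse(string $sentence): array
    {
        $words = preg_split('/\s+/', strtolower(trim($sentence)), -1, PREG_SPLIT_NO_EMPTY);
        return array_values(array_diff($words, self::FILLER));
    }
}

$sentence = 'The owl ate a mouse';
$literal  = (new LiteralOwl())->parse($sentence); // ['The', 'owl', 'ate', 'a', 'mouse']
$terse    = (new TerseOwl())->parse($sentence);   // ['owl', 'ate', 'mouse']
```

The same input produces two different structures, which is why the call has to hang off the current owl rather than off the query or a static helper.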

And so, my first stab at the pseudo-code for the top-level of abstraction on this service is something like this:

// Create the OwlQuery object from the input xml parameter
//   * Parses the string input to find the XML components
//   * Assigns the XML component values to the query object
//   * Catches invalid formats or missing data to stop execution
//
$owlquery = new OwlQuery( $_GET['xml'] );


// Present the query to the Parliament of Owls
//   * Finds or initializes the active conversation
//   * Finds the active owl
//   * Validates that the query is valid for the active user
//
Parliament::presentQuery( $owlquery );


// Retrieve the current active owl from the Parliament
//
$owl = Parliament::getOwl();


// Retrieve the current active conversation from the Parliament
//
$conversation = Parliament::getConversation();


// Initialize the Owl's working memory for the current
// conversation.  (Note: an owl can carry on multiple
// conversations at once, therefore its "current memory
// state" is stored in the database with respect to
// each conversation it has.  This step is required
// so that the Owl object is initialized with the
// correct working memory state for THIS conversation.)
//
$owl->loadMemoryXML( $conversation->getMemoryXML() );


// Use the owl's brain to parse the query sentence into parts
//
$parsedquery = $owl->parseQuery( $owlquery );


// Use the owl's brain to convert the parsed sentence object 
// into OWL/RDF triples that represent the meaning of the 
// sentence.
//
$querymodel = $owl->understand( $parsedquery );


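// Add the mental model of the meaning of the sentence
// to the existing working memory of the owl.  (This
// assumes loadMemoryXML() merges new triples into working
// memory rather than replacing what is already there.)
//
$owl->loadMemoryXML( $querymodel->getXML() );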


// We now have the owl's mental representation of the
// conversation WITH the user's input but WITHOUT any
// further processing.  Log this entry into the 
// conversation object.
//
$conversation->logEntry(_USER, $owlquery->text, $owl->getMemoryXML() );


// Run some simple inference and OWL/RDF triple resolution
// based on the combination of the two models, and the contents
// of long-term memory.  This will fill in response nodes to 
// unbound RDF nodes representing queries, and will generate
// query nodes where appropriate.
//
$owl->infer();


// Store triples in long-term memory, and update weights and parameters
//
$owl->learn();


// Generate a response from the Owl based on the current
// contents of working memory.
//
$responsetext = $owl->speak();


// Log the owl's contribution to the conversation, along
// with the Owl's working memory contents after speaking.
// 
$conversation->logEntry(_OWL, $responsetext, $owl->getMemoryXML() );


// Generate an OwlResponse object based on the most recent entry into
// the conversation log.
//
$owlresponse = new OwlResponse( $conversation );


// Print the response as XML
//
print( $owlresponse->getXML() );

This will yield XML output in response to the XML string input that it received as the query. Exactly what I want for a back-end REST service. It will be the job of the front-end application UI to take that printed XML and display it as something pretty on the screen.
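To make the XML-in, XML-out idea a little more concrete, here is a minimal sketch of reading a query and writing a response. I haven't pinned down the real schema yet, so the owlquery, owlresponse, and text element names are placeholders, and parseQueryXml() and buildResponseXml() are hypothetical helpers rather than the actual OwlQuery and OwlResponse classes. It assumes PHP 7 or later with the standard SimpleXML, DOM, and libxml extensions enabled.

```php
<?php
// Hypothetical XML shapes: the element names here are placeholders.

// Pull the user's statement out of the query XML, rejecting bad input.
function parseQueryXml(string $xml): array
{
    // Collect libxml errors quietly instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);
    $doc = simplexml_load_string($xml);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    if ($doc === false || !isset($doc->text) || trim((string) $doc->text) === '') {
        throw new InvalidArgumentException('Query XML is malformed or has no text.');
    }
    return ['text' => trim((string) $doc->text)];
}

// Wrap the owl's reply in a response document.
function buildResponseXml(string $responseText): string
{
    $dom  = new DOMDocument('1.0', 'UTF-8');
    $root = $dom->appendChild($dom->createElement('owlresponse'));
    $text = $root->appendChild($dom->createElement('text'));
    // Text nodes are escaped on output, so "&" and "<" are safe here.
    $text->appendChild($dom->createTextNode($responseText));
    return $dom->saveXML();
}

$query    = parseQueryXml('<owlquery><text> What do owls eat? </text></owlquery>');
$response = buildResponseXml('Fish & mice, mostly.');
```

Building the response through DOM text nodes, rather than by concatenating strings, means whatever the owl says gets escaped properly before the front end sees it.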
