All of us talk.
Some of us talk more, some of us talk less.
Our talk contains words (and some numbers) organized in some grammatically sensible arrangement.
So, what does it take to understand the triggers of our talk, and what exactly would be required to build, let's say, a perfect chatbot that can talk on its own?
Let us lower the bar from a perfect chatbot to a semi-perfect chatbot.
By semi-perfect, I mean a chatbot that can actually talk without looking completely foolish, even though it may not be factually correct.
So, if I ask the robot:
Me: What's up?
The robot may reply:
Robot: I am swimming
Obviously, the chatbot is lying. Actually, it is just uttering anything.
The main point is: what does it take to build a bot that, in spite of all its deficiencies, can still make some sense?
I am partly inspired by the movie Her.
Let us analyze 🙂
Chat transcript (fake)
P1: How are you doing?
P2: I am doing fine.
P1: So, where are you these days?
P2: I am in Bangalore.
P1: Which company are you working in?
P2: I am working in Cisco.
P1: How is work?
P2: Work is good. I am looking for a change though. Let me forward my resume to you. You are working in Microsoft, right? Btw, I am now working in Sales.
The above is a pretty common kind of chat. But the question is how it begins.
How does P1 know how to start the talk?
He could have said so many things, like:
- Good morning
- Hey dude
- Long time
So, what made him choose the sentence “How are you doing?”
So, there is an implicit context here.
Probably, P1 met P2 after a long time, so he wanted to catch up with P2, and so he asked him the above question.
Also, P1 wanted to start the talk with a correct statement for which his mind is “trained”. The correct statements to start a talk are generally greetings like “Good morning” or questions like “What's up?” and “How are you doing?”
Also, it is quite easy to see that different kinds of people will stick with different kinds of statements to begin a talk: something they are comfortable with and find useful.
The important point to note here is that P1 did not manually sort through all the words that he knew and combine them to create the above statement.
He just knew that statement upfront, because he has seen many people use it and he has practised it so many times.
So, essentially, P1 has a lot of ready-made statements in his mind which he could have used. What he did was find the correct template based on the context and use it. We generally never try to come up with something completely new in conversations. In writing we may, but in conversations (where time to think is limited) we generally tend to choose the correct template and fill it. This, in my mind, is the major difference between writing and talking.
It can be argued that writing is also heavily template based, but I hope you see the point here.
The contexts which P1 would have used to come up with the above template would be things like:
- Time-gap between P1 and P2
- State of conversation (beginning, middle, end)
- Closeness between P1 and P2 (level of formality that they need to maintain)
So, this is what we are doing, while initiating a talk:
- Decide which template to use to begin talking based on contexts
- Use the template and fill it appropriately with correct data (comprised of facts and contexts)
Let me detail the 2nd point a bit more.
Once P1 decides on the template he wants to use, he needs to fill it appropriately, with correct grammar (conversational grammar), facts, and contexts.
Template 1: Hey <Person>, <Status Question>
Filling : Hey John, How are you doing?
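Here is a minimal sketch of that filling step in Python. The angle-bracket slot syntax comes from the template above; the function name fill_template and the dictionary of values are my own assumptions for illustration, not any library's API.

```python
import re

# Assumed slot syntax: a name in angle brackets, e.g. <Person>.
SLOT = re.compile(r"<([^<>]+)>")


def fill_template(template: str, values: dict) -> str:
    """Replace each <Slot> with its value; unknown slots are left as-is."""
    return SLOT.sub(lambda m: values.get(m.group(1), m.group(0)), template)


print(fill_template(
    "Hey <Person>, <Status Question>",
    {"Person": "John", "Status Question": "How are you doing?"},
))
# Hey John, How are you doing?
```

Leaving unknown slots untouched (instead of raising an error) makes it easy to see which blanks still need a fact or a context.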
The mind can dynamically form templates based on past experiences of statements. The complete mechanism by which the mind converts a statement into a template and then back into another statement is out of scope for this blog, so let's keep it simple.
Our brain has a database of past sentences which have contexts tagged with them.
Our brain finds the appropriate sentence based on correct context, practices (frequency of usage of those templates and their success) and fills them.
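A rough sketch of such a "database" and lookup could look like this. Everything here is an assumption made for illustration: the context tags, the usage counts, and the ranking rule (most matching context tags first, then the smoothed success rate) are just one simple way to do it.

```python
from dataclasses import dataclass


@dataclass
class StoredTemplate:
    text: str
    contexts: frozenset   # tags such as "beginning", "long_gap", "informal"
    uses: int = 0         # how often this template has been used
    successes: int = 0    # how often it led to a good response

    def success_rate(self) -> float:
        # Add-one smoothing so that rarely used templates are not scored 0 or 1.
        return (self.successes + 1) / (self.uses + 2)


def pick_template(memory: list, current_context: frozenset) -> StoredTemplate:
    """Prefer templates matching more context tags; break ties by success rate."""
    return max(
        memory,
        key=lambda t: (len(t.contexts & current_context), t.success_rate()),
    )


memory = [
    StoredTemplate("Good morning", frozenset({"beginning", "formal"}), 10, 9),
    StoredTemplate("Hey dude", frozenset({"beginning", "informal"}), 20, 12),
    StoredTemplate("How are you doing?",
                   frozenset({"beginning", "informal", "long_gap"}), 30, 27),
]

now = frozenset({"beginning", "informal", "long_gap"})
print(pick_template(memory, now).text)  # How are you doing?
```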
Anatomy of a template:
The usable template basically consists of a lot of “fill in the blanks”.
These blanks are dynamically filled.
These blanks can be filled with:
- Names (person,location,thing)
- Facts (working in Cisco)
- Contexts (Gender,Location,Interaction History)
The template itself can be of various types like:
- Greeting (Hello Sir)
- Assertion (Our company is doing fantastic in Machine Learning)
- Interrogation (Would you like to buy the software for NLP? )
1) Interrogation Template:
Hi <GenderBasedAddress>, <ContextualQuestion>
Hi Sir, <Context=ProductEnquiry,Question>
Hi Sir, <Context=Product:iphone6,about:usage; Question: Have you used <product>>
Hi Sir, Have you used iPhone 6?
2) Assertion Template:
I <VerbTemplate Context:Work><Company>
I work in <Company>
I work in Cisco
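The step-by-step expansion in the two examples above, where filling one blank can reveal another blank, can be sketched as a loop that keeps substituting until nothing changes. The slot names, the context dictionary and the max_passes limit are hypothetical choices of mine.

```python
import re

SLOT = re.compile(r"<([^<>]+)>")


def expand(template: str, context: dict, max_passes: int = 5) -> str:
    """Fill slots repeatedly, because a filled-in value may contain new slots."""
    for _ in range(max_passes):
        expanded = SLOT.sub(
            lambda m: context.get(m.group(1), m.group(0)), template
        )
        if expanded == template:  # nothing left that we know how to fill
            break
        template = expanded
    return template


context = {
    "GenderBasedAddress": "Sir",
    "ContextualQuestion": "have you used <Product>?",
    "Product": "iPhone 6",
}
print(expand("Hi <GenderBasedAddress>, <ContextualQuestion>", context))
# Hi Sir, have you used iPhone 6?
```

The max_passes limit is only there so that a template which refers to itself cannot loop forever.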
Needless to say, generalizing contexts is a very heavy exercise, and it can require an excessive amount of memory and training to come up with all possible grammatical constructs, the correct templates out of them, etc.
But these days, when memory and speed are enormous, the above problem can be solved to a good extent.
But the most difficult issue would be: Identification of all applicable contexts.
This is where the human brain, with all its learning powers and its sensory and cognitive perception, can never be paralleled by a machine, which is why no artificial intelligence would be able to create the perfect chatbot.
But for fun, let's say we want to create an intelligent chatbot. What can we do? Given the ruthlessness of human beings :D, if we want to spend some time talking to a pretty chatbot, what will it take to create one which may not be perfect, but which can at least surprise us by saying things we didn’t expect?
How to create a good friendly chatbot:
I think we can do the following:
- Create the above templatisation (fill-in-the-blank) model
- Train a computer with lots and lots and lots of conversations, so that it can create an enormous number of templates. (This is again limited by the fact that public dumps of real conversations are hard to find on the internet 🙁 because of the privacy issues involved. The best that can be done is probably to train the bot using all the chats that you have had yourself.)
- Use randomness effectively.
- Reduce Randomness
On the 3rd point (use randomness): if the computer finds multiple templates, program it to select one at random.
If it does not find a fact, program it to fill in a random fact.
E.g., the template is: I love <Noun>
And the program does not know which noun to fill into the above template.
Find a random noun (from a list of places and persons) and fill the above template.
So, it may fill it like “I love Jamaica”!
So, the chat may go like:
P 1 : Which food do you love?
Bot : I love Jamaica
Weird but fun!!
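A tiny sketch of this "use randomness" rule, assuming made-up lists of templates and nouns. The names random_reply, TEMPLATES and RANDOM_NOUNS are mine; only Python's standard random module is used.

```python
import random

TEMPLATES = ["I love <Noun>", "I really like <Noun>"]     # made-up templates
RANDOM_NOUNS = ["Jamaica", "Bangalore", "John", "Cisco"]  # places, persons, ...


def random_reply(known_facts: dict, rng: random.Random) -> str:
    template = rng.choice(TEMPLATES)   # several templates fit: pick any of them
    noun = known_facts.get("Noun")
    if noun is None:                   # fact unknown: fill in a random one
        noun = rng.choice(RANDOM_NOUNS)
    return template.replace("<Noun>", noun)


# Pass a seed, e.g. random.Random(7), if you want repeatable output.
rng = random.Random()
print(random_reply({}, rng))
```

Since both choices are random, the bot can answer a food question with “I love Jamaica” or “I really like Cisco”, which is exactly the weirdness described above.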
On the fourth point (reduce randomness), the following trick can be applied.
Program the computer to link various facts, phrases and contexts.
Now, this step can be extremely memory-hungry.
But if done correctly, you may be able to do something like:
Get all nouns which are linked with Food.
And the program may return a list of food items.
And the chat all of a sudden may look more interesting..!!
P 1 : Which food do you love?
Bot : I love Chinese
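To make the linking idea concrete, here is a small sketch in which nouns are linked to topics in an in-memory dictionary. The sample links and the function names are assumptions; a real bot would need far more links, and detecting the topic of a question is assumed to be solved elsewhere.

```python
import random
from collections import defaultdict

# topic -> nouns linked with it; the entries are made-up sample data
LINKS = defaultdict(set)
for noun, topic in [("Chinese", "Food"), ("Pizza", "Food"), ("Banana", "Food"),
                    ("Jamaica", "Place"), ("Bangalore", "Place")]:
    LINKS[topic].add(noun)

ALL_NOUNS = sorted(n for nouns in LINKS.values() for n in nouns)


def nouns_linked_with(topic: str) -> list:
    """Return the nouns linked with a topic (empty list if the topic is unknown)."""
    return sorted(LINKS.get(topic, ()))


def reply(topic: str, rng: random.Random) -> str:
    # Less random: choose only among linked nouns, fall back to any noun.
    candidates = nouns_linked_with(topic) or ALL_NOUNS
    return "I love " + rng.choice(candidates)


print(nouns_linked_with("Food"))  # ['Banana', 'Chinese', 'Pizza']
print(reply("Food", random.Random()))
```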
Well, the last step (reduce randomness) is much easier said than done, but nonetheless possible.
All the while, while analyzing P1, we forgot to notice P2, who also contributed a lot.
What I am saying is that a conversation (generally) has 2 people in it, which is important.
Conversations generally go like this ->
P1: Statement 1
P2: Statement 2 (Response to Statement 1)
P1: Statement 3 = Change the topic (If uncomfortable with Statement 2 😀 )
P2: Statement 4 (Response to Statement 3 or Change the topic again)
So, in a conversation, all statements are essentially a response to an earlier statement, or they start something new (or partially new).
This concept is essential to understand in order to create a good chatbot.
The computer needs to understand which kind of Response-Template is correct for a given Statement-Template.
Change-The-Topic or Start-A-New-Topic is again something which computers would find hard to do. But we can again use semi-randomness here. Essentially, we can program the computer to just select a random topic (from a list of topics), find a random template, and fill the template using words/phrases related to the topic.
Wow ! It will be weird but exceptional fun !!
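Both ideas, picking a response for a known kind of statement and changing the topic semi-randomly otherwise, can be sketched together. The statement types, topics and templates below are invented sample data, and I assume that something else has already classified the incoming statement into a type.

```python
import random

# Statement type -> possible responses (made-up sample data)
RESPONSES = {
    "status_question": ["I am doing fine.", "Work is good."],
    "greeting": ["Hey!", "Good morning."],
}

# Topic -> related words, plus templates for opening a new topic
TOPICS = {
    "food": ["Chinese", "pizza"],
    "travel": ["Jamaica", "Bangalore"],
}
TOPIC_TEMPLATES = ["By the way, have you tried <Word>?", "Let us talk about <Word>."]


def next_statement(statement_type: str, rng: random.Random) -> str:
    options = RESPONSES.get(statement_type)
    if options:  # a known statement type: answer it
        return rng.choice(options)
    # Otherwise change the topic: random topic, random template, related word.
    topic = rng.choice(sorted(TOPICS))
    word = rng.choice(TOPICS[topic])
    return rng.choice(TOPIC_TEMPLATES).replace("<Word>", word)


rng = random.Random()
print(next_statement("status_question", rng))
print(next_statement("something_unexpected", rng))
```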
Tools: The main tools and resources that we can use here are:
- NLP tools like OpenNLP to derive the templates
- Statistical Regression Models to validate which templates get more success
- Dictionary to autofill phrases with correct words
- A database for storing linked facts and the contexts in which they are linked
For simplification, a chat context may be defined to consist of:
- Person Attributes (Gender,Name,Height etc)
- Environment (Time, Location, Weather)
- Facts in Triples : (Subj Pred Obj -> John eats Banana, iphone related-to phone)
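As a sketch, the chat context described above maps quite naturally onto a small data structure. The class and field names here are my own choices, and the sample values are taken from the examples in this post.

```python
from dataclasses import dataclass, field
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


@dataclass
class ChatContext:
    person: dict = field(default_factory=dict)       # gender, name, height, ...
    environment: dict = field(default_factory=dict)  # time, location, weather
    facts: list = field(default_factory=list)        # list of Triple

    def objects_of(self, subject: str, predicate: str) -> list:
        """Look up everything linked to a subject through a given predicate."""
        return [t.obj for t in self.facts
                if t.subject == subject and t.predicate == predicate]


ctx = ChatContext(
    person={"name": "John", "gender": "male"},
    environment={"location": "Bangalore"},
    facts=[Triple("John", "eats", "Banana"),
           Triple("iphone", "related-to", "phone")],
)
print(ctx.objects_of("John", "eats"))  # ['Banana']
```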
P.S. – I am planning to make a chatbot sometime in the future, for fun.
P.S.2 – All views mentioned in the above post are my own, and if someone wants to use them, please contact me!