ChatBot (साइलि)
The multilingual bot: your future personal assistant
ChatBot Data Collection and Management
The bot is trained on a JSON file per language. Each file contains intents; every intent has a tag, and these tags are the labels (classes) for our classifier.
Each tag has its respective patterns: an array of example sentences a user might type to trigger that intent.
For example, the tag greeting could have patterns like the list below:
Hi, hey, hello, how are you, good morning, good day, etc. Similarly, each tag has responses, in the same format as the patterns:
Hey, hello, thanks for visiting, etc.
Likewise, we have multiple tags, each with its own patterns and responses. Examples of other tags are:
goodbye, thank you, and [in the case of our course sector] ⇒ courses, durations, fees, locations, teachers, timings, and so on.
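A language's intents file might then look like the sketch below. The tag names mirror the examples above; the exact field names and schema are an assumption.

```json
{
  "intents": [
    {
      "tag": "greeting",
      "patterns": ["Hi", "Hey", "Hello", "How are you", "Good morning", "Good day"],
      "responses": ["Hey!", "Hello!", "Thanks for visiting!"]
    },
    {
      "tag": "courses",
      "patterns": ["Which courses do you have?", "What can I study here?"],
      "responses": ["We offer several courses. Which subject interests you?"]
    }
  ]
}
```

One such file exists per language, so adding a language means adding one more JSON file in the same shape.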
So whenever a new question or sentence comes to our bot, it tries to recognize the tag the sentence belongs to and picks a random response from that class.
An exact match is not required: the bot does not need the exact sentences we gave it during training.
It categorizes the sentence from the words and phrases it contains and proceeds to the appropriate answer.
This is the way it works, and the way we train it.
So how is it made?
Stemming is the process of reducing a word to its root.
It chops off the word's ending, leaving an approximation of the root.
Eg. “organize”, “organizes”, “organizing”
[“organ”, “organ”, “organ”]
The exact output depends on the stemmer we choose; several stemmers are available, so it is up to us which one to pick.
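As a sketch, NLTK's PorterStemmer (one of the available choices) reproduces the example above:

```python
from nltk.stem.porter import PorterStemmer

stemmer = PorterStemmer()
words = ["organize", "organizes", "organizing"]
stems = [stemmer.stem(w) for w in words]
print(stems)  # -> ['organ', 'organ', 'organ']
```

Other stemmers, such as SnowballStemmer or LancasterStemmer, chop more or less aggressively, which changes the vocabulary the model ends up with.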
Tokenization
(splitting a string into meaningful units, e.g. words, punctuation characters, numbers)
Eg. “Where can I take the 2 courses?”
[“Where”, “can”, “I”, “take”, “the”, “2”, “courses”, “?”]
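A minimal sketch using NLTK's regex-based wordpunct_tokenize, which needs no extra model download (word_tokenize would also work after downloading the punkt model):

```python
from nltk.tokenize import wordpunct_tokenize

# Words, the number, and the punctuation mark each become separate tokens
tokens = wordpunct_tokenize("Where can I take the 2 courses?")
print(tokens)  # -> ['Where', 'can', 'I', 'take', 'the', '2', 'courses', '?']
```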
Bag of Words
Converting a string into a vector of numbers, one entry per word in the vocabulary.
This representation was chosen for its feasibility across multiple languages.
Our NLP preprocessing Pipeline
“Which course do you have?”
tokenize:
[“Which”, “course”, “do”, “you”, “have”, “?”]
lowercase:
[“which”, “course”, “do”, “you”, “have”, “?”]
exclude punctuation characters:
[“which”, “course”, “do”, “you”, “have”]
Bag of words (from NLTK)
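Putting the pipeline together, here is a bag-of-words sketch; the regex tokenizer and the small stemmed vocabulary are simplifications I chose to keep the example self-contained:

```python
import re
import numpy as np
from nltk.stem.porter import PorterStemmer

stemmer = PorterStemmer()

def preprocess(sentence):
    """Tokenize on word characters (this also drops punctuation), lowercase, stem."""
    tokens = re.findall(r"\w+", sentence.lower())
    return [stemmer.stem(t) for t in tokens]

def bag_of_words(sentence, vocabulary):
    """Return a vector with 1.0 at every vocabulary position whose word occurs."""
    words = set(preprocess(sentence))
    return np.array([1.0 if w in words else 0.0 for w in vocabulary],
                    dtype=np.float32)

vocab = ["which", "cours", "do", "you", "have", "hi", "bye"]  # stemmed vocabulary
vec = bag_of_words("Which course do you have?", vocab)
print(vec)  # -> [1. 1. 1. 1. 1. 0. 0.]
```

This fixed-length vector is what the neural network receives as input, one position per vocabulary word, regardless of which language the vocabulary came from.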
Training the data of multiple languages together
We need to provide our data first:
French intents file
English intents file
Nepali intents file
PyTorch module for each training module
Here again, I am doing feature engineering to build the model.
Looping over the folder that contains the different language files, I read the entire set of intents from each file. To those intents I apply stop-word removal, stemming, and tokenization.
Now the words are ready for training and testing after vectorization, and I define my neural network with its batch size, learning rate, epochs, and layers.
After this, training starts with the optimizer, and based on the hardware the model chooses the GPU or CPU build of PyTorch.
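A minimal sketch of such a setup; the layer sizes and hyperparameter values here are illustrative, not the project's exact ones:

```python
import torch
import torch.nn as nn

# Illustrative hyperparameters
batch_size = 8
learning_rate = 0.001
num_epochs = 1000
input_size = 54    # vocabulary size (length of the bag-of-words vector)
hidden_size = 8
num_classes = 7    # number of tags

class NeuralNet(nn.Module):
    """Feed-forward classifier: bag-of-words vector in, tag scores out."""
    def __init__(self, input_size, hidden_size, num_classes):
        super().__init__()
        self.layers = nn.Sequential(
            nn.Linear(input_size, hidden_size),
            nn.ReLU(),
            nn.Linear(hidden_size, hidden_size),
            nn.ReLU(),
            nn.Linear(hidden_size, num_classes),
        )

    def forward(self, x):
        return self.layers(x)

# Choose the GPU when available, otherwise fall back to the CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = NeuralNet(input_size, hidden_size, num_classes).to(device)
criterion = nn.CrossEntropyLoss()                                   # loss over tags
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)  # the optimizer
```

The training loop then repeatedly feeds batches of bag-of-words vectors through `model`, computes `criterion` against the true tag indices, and steps the optimizer.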
4) Saving/logging the model and implementing the chat for each language module and class
Now I save my model to its respective path so I can load it back later.
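Saving a checkpoint that bundles the weights with the vocabulary and tag names makes reloading self-describing. A sketch with a stand-in model; the file name and dictionary keys are my assumptions:

```python
import torch
import torch.nn as nn

model = nn.Linear(4, 2)  # stand-in for the trained chatbot network

checkpoint = {
    "model_state": model.state_dict(),
    "input_size": 4,
    "output_size": 2,
    "all_words": ["hi", "hello", "bye", "thank"],  # stemmed vocabulary
    "tags": ["greeting", "goodbye"],               # class labels
}
torch.save(checkpoint, "saili.pth")  # one checkpoint per language module

# Later, in the chat code, load it back:
loaded = torch.load("saili.pth")
restored = nn.Linear(loaded["input_size"], loaded["output_size"])
restored.load_state_dict(loaded["model_state"])
restored.eval()  # switch to inference mode
```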
5) Serving it with a web framework like Django or Flask
To serve the chatbot, we have multiple options: a web framework, a mobile application, or even a plain API.
Here I am using Django to hold the conversation on a chat page.
I have created a view, a template, and a URL route.
Django Model, View, and Template
The HTML page sends the user's input string to a view function through the URL route. In the view, a specific function first checks which language the message is in, using the available models and the original blueprint JSON files. It then measures the message's resemblance to known patterns with the neural network and obtains a score for the prediction. If the prediction succeeds, it returns an answer; otherwise it returns a predefined “not understanding” message.
Automatic Speech Recognition (ASR)
In the audio section, I am using the Python library gTTS for text-to-speech and SpeechRecognition for speech-to-text.
The returned message is pronounced with a voice matching the intent's language, so it gives the actual accent of each language.
For voice-to-text, I manually specify which language model to use.
Integration of the Model
The module can easily be integrated with Facebook and WhatsApp.
Saili (साइलि) can be a personal assistant like Siri, Alexa, or Cortana.
Thanks to Epita and the team for this wonderful platform.