The question here is how to design a mind that enables robots to learn languages by experiencing the real world in the same way a child does. In particular, what is meant by “experiencing the real world”? Can the real world be modelled? How does such modelling help robots to autonomously learn, analyze and synthesize human languages through interaction with the physical world (i.e. the environment)? What is the definition of meanings? What is the relationship between physical meanings and languages? What is the principle behind the process of learning human languages? These are the issues that must be addressed when designing a mind that allows robots to learn and to understand human languages through interaction with the real world ([1], [2]).
Let's first examine the issue of “what is the definition of meanings?”.
An understanding of the definition of meanings will help us find the appropriate principle behind the design of a robot's mind. In our opinion, the real world should be divided into the physical world (i.e. the environment) and the conceptual worlds (i.e. texts in various languages). Accordingly, the meanings of a word in a human language consist of two parts: a) the physical meanings of the entity referenced by the word in the physical world, and b) the conceptual meanings of the word itself in various human languages. Here, we consider an entity's properties (e.g. geometrical, mechanical, chemical, electrical) as well as its constraints (e.g. kinematic and dynamic constraints) to be the physical meanings of the entity. Due to the nature of constraints, when multiple entities co-exist in a common space of the physical world, interactions among these entities will occur, and these interactions give rise to concepts such as actions, behaviours, events, episodes and stories. Therefore, throughout the history of mankind, the process of encoding the meanings in the physical world has given rise to the invention of human languages. On the basis of this inventive nature of human languages, we advocate that the use of human languages creates the so-called conceptual worlds. That is to say, a conceptual world is the set of texts in one human language that describes the meanings of the physical world. Hence, multiple human languages produce multiple conceptual worlds. Most importantly, the properties and constraints of a word in a particular human language are simply the conceptual meanings of the word itself. For example, the categories noun, verb, adjective and adverb are properties of words, while noun phrases and verb phrases are constraints on words. In summary, properties and constraints in the physical world as well as in the conceptual worlds define what we call meanings.
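To make this two-part view of meaning concrete, consider the following minimal Python sketch. It is our own illustration, not a component of the cited framework; all class names, field names and example data are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class PhysicalMeaning:
    """Physical meanings of an entity: its properties and its constraints."""
    properties: dict = field(default_factory=dict)    # e.g. geometrical, mechanical
    constraints: list = field(default_factory=list)   # e.g. kinematic, dynamic

@dataclass
class ConceptualMeaning:
    """Conceptual meanings of a word within one human language."""
    language: str
    properties: list = field(default_factory=list)    # e.g. part of speech: "noun"
    constraints: list = field(default_factory=list)   # e.g. phrase patterns it enters

@dataclass
class WordMeaning:
    """A word's meaning = physical meanings of its referent + conceptual meanings."""
    word: str
    physical: PhysicalMeaning
    conceptual: dict  # maps a language name to a ConceptualMeaning

# Example: one word grounded in one physical referent and one conceptual world.
cup = WordMeaning(
    word="cup",
    physical=PhysicalMeaning(
        properties={"shape": "cylindrical", "material": "ceramic"},
        constraints=["must rest on a supporting surface"],
    ),
    conceptual={
        "English": ConceptualMeaning(
            language="English",
            properties=["noun"],
            constraints=["head of a noun phrase"],
        ),
    },
)
```

Under this sketch, each additional human language simply adds another entry to the `conceptual` map, mirroring the claim that multiple languages produce multiple conceptual worlds over one shared physical world.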
Next, let's examine the issue of “what is the relationship between physical meanings and languages?”.
We all know that languages are inventions of human beings for the purpose of encoding the physical meanings of entities in the physical world. Interestingly, the relationship between physical meanings and languages is similar to the relationship between scenes and cameras. Cameras are inventions of human beings for the purpose of projecting the appearances of scenes onto images; in a similar way, languages are inventions of human beings for the purpose of projecting the physical meanings of entities into texts. In robot vision, one of the tasks is reconstruction, or photo interpretation, which aims at reconstructing scenes from given videos or images. Similarly, in robot hearing, one of the tasks is reconstruction, or text understanding, which aims at reconstructing the physical meanings from given texts or speech.
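The projection/reconstruction analogy can be stated as a pair of inverse maps. The toy Python sketch below is purely illustrative: the lexicon, the function names and the one-entry vocabulary are hypothetical, and a real system would have to recover far richer physical meanings than a label.

```python
# Toy lexicon pairing physical-meaning labels with English words (hypothetical data).
LEXICON = {"cylindrical ceramic drinking container": "cup"}
INVERSE = {word: meaning for meaning, word in LEXICON.items()}

def project(physical_meaning: str) -> str:
    """Language as a 'camera': project a physical meaning onto a text token."""
    return LEXICON[physical_meaning]

def reconstruct(text: str) -> str:
    """Text understanding: the inverse map, recovering physical meaning from text."""
    return INVERSE[text]

# Round trip: projecting a meaning into text and reconstructing it recovers the meaning.
assert reconstruct(project("cylindrical ceramic drinking container")) \
       == "cylindrical ceramic drinking container"
```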
Now we come to the important question: “Can robots learn language the way children do?”
As mentioned above, properties and constraints are the contents of knowledge, or meanings, and the best way of representing such knowledge is the use of human languages. Therefore, the mastery of human languages is crucial to the development of the robots of tomorrow, which must be capable of interacting and communicating with human beings. Human children have the innate capability of mastering any human language. This capability depends on two important factors. The first is the built-in blueprint of the mind, which is the foundation of learning human languages. The second is the opportunity to interact with the physical world during the process of learning human languages. As a result, if we can give the robots of tomorrow these same two conditions (i.e. a built-in blueprint of the mind similar to a human being's, and the ability to interact with the physical world), then robots will be able to learn language the way children do.
We have initiated a project aimed at developing such a robotic mind under the name of KnowNet, a software system with functionalities such as teacher-assisted learning of human languages, vision-guided learning of human languages, visualization of physical meanings in 3D virtual space, text understanding, text synthesis, speech recognition, speech synthesis, conversational dialogue and multiple-language translation.
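To picture how these functionalities might sit together, the Python sketch below frames them as methods behind a single facade. This is purely our illustration: the class, method names and signatures are hypothetical and do not describe the actual KnowNet software's API.

```python
class KnowNetFacade:
    """Hypothetical facade over the functionalities listed above (illustrative only)."""

    def learn_from_teacher(self, utterance: str, demonstration: object) -> None:
        """Teacher-assisted learning: pair an utterance with a demonstrated meaning."""
        raise NotImplementedError

    def learn_from_vision(self, video: object) -> None:
        """Vision-guided learning: ground words in observed scenes."""
        raise NotImplementedError

    def understand(self, text: str) -> dict:
        """Text understanding: reconstruct physical meanings from text."""
        raise NotImplementedError

    def visualize(self, meanings: dict) -> None:
        """Render recovered physical meanings in a 3D virtual space."""
        raise NotImplementedError

    def translate(self, text: str, source_lang: str, target_lang: str) -> str:
        """Translate by routing through the shared meaning representation."""
        raise NotImplementedError
```

Notably, translation in such a design would pass through the shared meaning representation rather than mapping text to text directly, consistent with the view that each language is a separate projection of one physical world.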
References
- [1] Xie, Jayakumar S. Kandhasamy and H.F. Chia, “Meaning-centric Framework for Natural Text/Scene Understanding by Robots,” International Journal of Humanoid Robotics, 1(2), June 2004.
- [2] Jayakumar S. Kandhasamy, “Organized Memory for Natural Text Understanding and Its Meaning Visualization by Machine,” PhD thesis (under review), School of Mechanical and Aerospace Engineering, Nanyang Technological University, Singapore, 2005.