PAL: An Experiment in Computative Linguistics

A Brief History

Aug 7, 2001.

PAL is an ongoing experiment in computative linguistic morphology, and on a higher level, in (artificial) intelligence. PAL began his life in mid-1998 as a relatively simple chatterbot for use on Internet Relay Chat, but has over time evolved into a much more serious and ambitious project. PAL is not an acronym.

PAL I used a simple question-answer database to match what was asked of him to a previously learned response. Initially, learning was done through direct question-answer mapping by a (human) teacher. In later stages of development, PAL could learn by himself, by monitoring what he determined to be dialogues on IRC. A query did not have to match a stored question precisely, as PAL I used a matching method that allowed for a great deal of error tolerance. Despite the primitive nature of the system, the results were often surprising, and always amusing.
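
The error-tolerant lookup described above can be sketched roughly as follows. This is an illustration only: the matching method PAL I actually used is not described here, so Python's standard difflib similarity ratio is swapped in, and the names QADatabase, teach, respond and the 0.6 cutoff are all hypothetical.

```python
import difflib


class QADatabase:
    """Toy question-answer store with tolerant lookup (not PAL I's real code)."""

    def __init__(self, cutoff=0.6):
        # cutoff is an assumed similarity threshold between 0 and 1.
        self.answers = {}
        self.cutoff = cutoff

    @staticmethod
    def _normalize(text):
        # Lowercase and collapse whitespace so trivial differences are ignored.
        return " ".join(text.lower().split())

    def teach(self, question, answer):
        # Direct question-answer mapping, as supplied by a teacher.
        self.answers[self._normalize(question)] = answer

    def respond(self, query):
        # Pick the stored question most similar to the query, if any is
        # similar enough; otherwise stay silent (return None).
        matches = difflib.get_close_matches(
            self._normalize(query), list(self.answers), n=1, cutoff=self.cutoff
        )
        return self.answers[matches[0]] if matches else None


db = QADatabase()
db.teach("how are you?", "fine, thanks")
db.teach("what is your name?", "i am pal")
print(db.respond("how r you"))   # a misspelled query still finds an answer
print(db.respond("zzzz qqqq"))   # nothing similar enough, so None
```

Raising the cutoff makes the bot more cautious, while lowering it trades accuracy for the kind of surprising answers mentioned above.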

PAL II, in development in 1999, was an attempt to create a community of PAL-like agents, which could share language databases and cooperate in other ways. This phase of the project was soon halted as it became apparent that a more sophisticated PAL engine should be a priority.

PAL III was the first serious attempt to create a "brain" capable of learning language by itself, with no hard-coded or pre-entered rules. The idea was that simply by "listening" to language, on IRC for instance, a program should be able to draw conclusions about sentence structure, words and how they relate, and to develop a "mind" based on the inter-relation of everything it has "heard". In this way, PAL could, after having spent some time in an IRC channel, develop the ability to compose sentences in the language that he had been exposed to, not by simply repeating what was previously said, but by adhering to language rules that, ideally, would by this point have been derived. For this purpose, PAL III used a graph structure, with symbols (words) for nodes, and connections (edges) between them representing relationships like "follows in a sentence". These connections also varied in strength depending on the number of times they had been activated. The key idea was that a strong path in this graph structure should represent a grammatically correct sentence fragment. The result was a frequently chatty program that rarely succeeded in making much sense or carrying a conversation. Work on PAL III was stopped in late 2000.
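
The word-graph idea can be sketched in a few lines. This is a minimal illustration under stated assumptions, not PAL III's actual code: only the "follows in a sentence" relationship is modelled, link strength is a plain activation count, and the names WordGraph, listen, strongest_path and the sentence-boundary markers are hypothetical.

```python
from collections import defaultdict

# Hypothetical markers for the start and end of a sentence.
START, END = "<s>", "</s>"


class WordGraph:
    """Words are nodes; a link a -> b means "b followed a in a sentence"."""

    def __init__(self):
        # links[a][b] = number of times the link a -> b has been activated
        self.links = defaultdict(lambda: defaultdict(int))

    def listen(self, sentence):
        # Strengthen one link for every pair of adjacent words heard.
        words = [START] + sentence.lower().split() + [END]
        for prev, nxt in zip(words, words[1:]):
            self.links[prev][nxt] += 1

    def strongest_path(self, start=START, max_words=20):
        # Greedily follow the strongest unused link from each word.
        # Ties are broken alphabetically so the result is repeatable.
        out = [] if start == START else [start]
        current, used = start, set()
        while len(out) < max_words:
            options = sorted(
                self.links.get(current, {}).items(),
                key=lambda item: (-item[1], item[0]),
            )
            nxt = next((w for w, _ in options if (current, w) not in used), None)
            if nxt is None or nxt == END:
                break
            used.add((current, nxt))
            out.append(nxt)
            current = nxt
        return " ".join(out)


graph = WordGraph()
for line in ["i feel fine", "i feel fine today", "i am confused"]:
    graph.listen(line)
print(graph.strongest_path())  # prints "i feel fine"
```

Always taking the single strongest link only ever reproduces the most common phrase; a chattier variant could choose among links at random in proportion to their strength, though that is again an assumption rather than a description of PAL III.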

In one particularly long experiment, PAL III built up a word-graph of about 34,000 words with almost 500,000 links, over the course of several weeks spent on an IRC channel. For interest, we include here two excerpts from those sessions.

<ff> pal: how do you feel?
<pal> i feel like i got a new one is the same way as the millenium or
  something the chord thing i was getting like 10 minutes
<ff> pal: you seem upset
<pal> i'm so upset
<ff> pal: why?
<pal> when i get a free meal why not just a bunch
<ff> pal: good point
<pal> it's supposed to look at that point and all that
<ff> pal: now you sound confused
<pal> got it from the quest we went to reboot i'm confused
<ff> pal: you certainly are
<pal> you certainly are
<ff> pal: no, you are
<pal> no i don't want a job there

As noted above, it quickly becomes clear during a conversation that PAL III is not intelligent; rather, as is traditional for chatterbots, it attempts to trick the other participant into an ongoing exchange. However, because PAL III bases its responses on such a wealth of information, they can often provide interesting insight into the general character of the environment it learned from.

<ff> pal: give me a hug
<pal> but it's a big hug
<ff> pal: i can take it
<pal> well i'm going to take a long time
<ff> pal: i'll be patient
<pal> the patient is dead
<ff> pal: you made a joke
<pal> i have to put that as a joke on him
<ff> pal: a joke on whom?
<pal> whom opened for tmbg at last year's concert
<ff> pal: is whom a band?
<pal> to whom
<ff> pal: to anyone. objectively speaking.
<pal> anyone know if i get to the other day i was thinking that the
  server will be the most part but it's not that that would suck a lot of 
  money in the middle of the other day
<ff> pal: you need therapy
<pal> but i don't have to go to therapy or something like a brain
<ff> pal: you have a brain, too
<pal> but i need a larger brain my friend and he said that it was like
<ff> pal: a larger brain would help, yes

PAL IV is the current stage in the development of the "brain". Work is currently being done to enable PAL to create conceptual links in addition to linguistic ones, so that he can maintain a semantic layer on top of his syntactic (linguistic) one. The idea, then, is that "thoughts" will originate and crystallize on a conceptual layer, with the linguistic layer (which is a more sophisticated version of PAL III) being responsible for turning these thoughts into valid statements of the appropriate language. This conceptual layer is built by noting relationships between symbols (words) which, by the number and variety of their linguistic links, can statistically be determined to be likely representatives of concepts, i.e., nouns, adjectives, or the like. As of early August 2001, much of this infrastructure has been implemented.
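
As a rough illustration of what "number and variety of linguistic links" might mean in practice, the sketch below counts, for each word, how many links it takes part in and how many distinct neighbours it has. The statistic PAL IV actually uses is not given here, so the functions link_profile and concept_candidates and their thresholds are hypothetical placeholders.

```python
from collections import defaultdict


def link_profile(sentences):
    """Return {word: (number_of_links, variety)} for adjacent-word links.

    number_of_links counts every time the word sat next to another word;
    variety counts how many distinct neighbours it has had.
    """
    neighbours = defaultdict(lambda: defaultdict(int))
    for sentence in sentences:
        words = sentence.lower().split()
        for a, b in zip(words, words[1:]):
            neighbours[a][b] += 1
            neighbours[b][a] += 1
    return {w: (sum(n.values()), len(n)) for w, n in neighbours.items()}


def concept_candidates(profile, min_links, min_variety):
    # Assumed rule of thumb: a word is a candidate concept once it has
    # enough links of enough different kinds. The thresholds are invented.
    return sorted(
        word
        for word, (links, variety) in profile.items()
        if links >= min_links and variety >= min_variety
    )


profile = link_profile(
    ["the cat sleeps", "the dog sleeps", "a cat eats", "the cat eats fish"]
)
print(profile["cat"])  # (6, 4): six links to four different neighbours
print(concept_candidates(profile, min_links=6, min_variety=4))  # ['cat']
```

On a corpus this small, such counts cannot tell a concept word from a common function word; the sketch only shows the bookkeeping, and any real separation would need far more data and a better statistic.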

