Preface
This monograph is a reference manual for a machine translation interlingua. It is still in the draft stage, and will be undergoing continuous revision as the software based on it is developed and tested. If you have any helpful comments or suggestions, please feel free to contact me. If you do contact me, please quote sparingly from the monograph.
The latest version of this document can be found at http://www.rickmor.x10.mx/lexical_semantics.html. A tutorial (including audio wav files) and dictionaries for the interlingua can be found at http://www.rickmor.x10.mx/Latejami/index.html. These files are also works in progress. As time goes by, the dictionaries will be expanded and more self-study lessons will be added.
There is an email discussion group at yahoo.com that you may subscribe to if you wish to discuss Latejami with other people. To subscribe, send an email message to: Latejami-subscribe@yahoogroups.com
In this monograph, I would like to discuss word design for an artificial language designed specifically for use as an interlingua in machine translation. Such a language must be designed to meet two primary goals: first, it must be easier to accurately translate from the source natural language into the interlingua than into another natural language; and, second, it must be almost trivially easy (i.e., requiring simple computer programming) to accurately translate from the interlingua into the target language. In other words, mapping between natural languages and the interlingua must be both accurate and made as easy as possible.
The interlingua achieves these goals by means of its simple but powerful derivational morphology, which makes word design rigorous yet straightforward, while at the same time greatly reducing the number of basic morphemes (i.e. primitives) required by the language.
Initially, I will not try to describe this method in abstract terms, since this discussion is intended for the non-linguist. Instead, I will present the reader with many examples of various kinds of linguistic constructions, discuss the semantics of these constructions, introduce linguistic terminology where and as needed, and finally, try to derive some useful generalizations.
I'll start this exposition by looking first at verbs. Specifically, I will look at two of the most important criteria that go into defining a verb: its valency (i.e. the number of basic arguments that it requires) and its case requirements (i.e. the semantic roles played by the basic arguments). When combined, the valency and case requirements of a verb are usually referred to as the argument structure of the verb.
Before proceeding, though, let me give you a quick review of valency and case. Consider the following English sentence:
The chimpanzee broke the window with a coconut.
In this example, the verb "break" has a valency of two, since it requires two arguments: the subject "the chimpanzee" and the object "the window". The arguments are required because, if either were missing, the resulting sentence would be ungrammatical (or, in the case of some verbs, would have a different meaning):
*The chimpanzee broke.
*Broke the window.
[Please note that I am using the standard linguistic convention of indicating an unacceptable item by preceding it with an asterisk.]
But the following is okay:
The chimpanzee broke the window.
For the verb "break", the case role of the subject is agent, and indicates the entity responsible for the event. The case role of the object is patient, and indicates the entity which experiences the state or change of state described by the verb. In other words, the argument structure of the English verb "break" requires two arguments: the first argument (i.e. the subject) must be a semantic agent, and the second argument (i.e. the object) must be a semantic patient.
Arguments required by a verb are called core arguments.
The phrase "with a coconut" is what is called an oblique argument since it is not essential for the sentence to be grammatical. It simply provides additional peripheral information about what happened. In this sentence, it indicates the instrument of the event. In other words, "a coconut" is the instrument used in carrying out the act indicated by the verb. If the sentence had been:
The chimpanzee broke a thousand windows in Boston on Tuesday.
then "in Boston" would be a locative oblique argument, and "on Tuesday" would be a temporal oblique argument.
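The distinction between core and oblique arguments can be sketched as a small data structure. The following Python is only an illustration: the names `ArgumentStructure` and `is_complete` are hypothetical and are not part of the interlingua or its software, and the check covers only the transitive use of "break" discussed above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ArgumentStructure:
    """Valency and case requirements of a verb (illustrative only)."""
    verb: str
    core_roles: tuple  # case roles of the required (core) arguments, in order

    @property
    def valency(self) -> int:
        # The valency is simply the number of core arguments.
        return len(self.core_roles)

    def is_complete(self, supplied_roles) -> bool:
        """True if every core role is supplied; oblique roles are optional."""
        return all(role in supplied_roles for role in self.core_roles)


# Hypothetical entry for the transitive verb "break".
BREAK = ArgumentStructure("break", ("agent", "patient"))

# "The chimpanzee broke the window with a coconut."
full = {"agent": "the chimpanzee", "patient": "the window", "instrument": "a coconut"}
# "*Broke the window."
no_agent = {"patient": "the window"}

print(BREAK.valency)                # 2
print(BREAK.is_complete(full))      # True
print(BREAK.is_complete(no_agent))  # False
```

The instrument in the first example is accepted but never required, which is what makes it oblique rather than core.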
[The case terminology that I am using here is fairly common, but not universal. Linguists who work with case grammar and thematic relations have yet to agree on the number and nature of case roles needed to adequately describe natural language. As it turns out, this lack of agreement is irrelevant to what we are trying to accomplish here. We will, in effect, create our own internally consistent, semantically precise, and easily expandable implementation of a case system.]
In English, oblique arguments are usually marked by preceding them with a preposition. Thus, the preposition is the marker which tells us the case role of whatever follows it. Agent and patient are usually unmarked. The most common exception to this in English is in passive constructions, where the original subject is preceded by the preposition "by", as in "the window was broken BY the chimpanzee" or "the thieves were seen BY the children". Some verbs, such as English "put", have a third, required argument (i.e., it is part of the valency of the verb), which is marked by a preposition. For example:
*He put the book.
He put the book on the table.
Here, the preposition "on" marks a destination case role.
Incidentally, natural languages often allow a speaker to omit a core argument if it is obvious from context. For example, a Japanese speaker often omits the agent of a verb as a sign of politeness. This usage, however, performs a discourse function - not a grammatical function - and the omitted argument is still assumed to be present.
An additional case role that occurs within the valency of many verbs is what I will call focus. Linguists often call this case role theme, object, or topic, but there is no consensus, and their definitions often overlap other roles, especially patient. In all of the following examples, the direct object is the focus:
The children saw the thief. The team needs a new coach. The woman remembered her father. The boys are playing baseball. The woman owns a beach house. The tarp covered the boxes. The fans enjoyed the game. The employees learned discretion. The man ignored his wife. The choir is singing a requiem. The boy loves his mother. The class is studying French. The fence surrounds three buildings. The old man told a story.
Note that in each of the above sentences, the direct object provides a reference point or focus for the event, without causing or being changed by the event. It does this by pinpointing, narrowing down, or providing a reference for (i.e. 'focusing') the state or change of state indicated by the verb. Note that a focus does not play an active role in the event described by the verb, and is not obviously changed by the event. Thus, a focus can be best described as one of the following:
1. The entity on which the patient's attention or mental state is 'targeted' or 'focused'; e.g. to see, to play, to learn, to love, to tell, etc.
2. The referent of a relationship with the patient (i.e. the patient's state relative to the focus); e.g. to own, to surround, to include, to need, etc.
3. An elaboration of the event itself; e.g. to play, to sing, to tell, etc.
Note that the concepts can overlap, as in "to need", "to avoid", "to know", and "to hate", since the object of such verbs can be considered the focus of a relationship or of a mental state. In fact, without stretching the second definition too much, one could say that it applies to all focused events, even those involving perception or elaboration. For example, the sentence "John sees the forest" describes a relationship between "John" and "the forest", and the sentence "Louise sang a little ditty" describes a relationship between "Louise" and "a little ditty".
Thus, we can say that the patient experiences a relationship whose referent is the focus. If the verb has an agent, then the agent is responsible for the relationship. The nature of the relationship is indicated by the meaning of the verb. It is important to keep in mind that the focus does not directly modify or interact with the patient. Perhaps the best and most useful generalization we can make is that the focus is the referent of a relationship with the patient, it is not affected by the event, and it is not responsible for the event. However, the precise meaning of the focus will ultimately depend on the meaning of the verb itself.
Thus, it would appear that focus is not really a pure case role. Both agent and patient can be defined with semantic precision, while focus seems somewhat vague or even 'out-of-focus'. The vagueness arises because focus covers several senses that could, in principle, be differentiated; e.g. the perceived entity ("to see"), the missing/lacking entity ("to need"), the locative reference point ("to surround"), an elaboration of the event itself ("to sing"), etc. But these senses never overlap for a particular verbal concept, so marking them separately would mean making distinctions that are never made in natural languages. Thus, focus is a vague and general-purpose case role, but it is an essential one.
In summary, the three major case roles that are capable of being included within the valency of a verb are:
agent - the entity responsible for the event described by the verb
patient - the entity which experiences the state or change of state described by the verb
focus - the entity which acts as the referent of a relationship with the patient
Thus, the agent is responsible for the event, the patient experiences the event, and the focus provides the referent for the state or change of state indicated by the event. [We will discuss the semantics of focus in more detail later on. First, though, we need to acquire a more substantial background in the semantics of verbs.]
Note that an argument does not have to be a physical entity. It can also be an event. In the following examples, the direct object is the patient:
We lengthened our trip. The police halted the procession. Bill chaired the seminar. Joe postponed the finance committee meeting. The army prevented the destruction of the village. The station repeated the broadcast.
There are other case roles in addition to the ones I just mentioned, but they are all oblique (i.e., they are never required by a verb). I will discuss them as the need arises. For now, though, we have enough background to proceed with the discussion.
In the following sections, I will discuss and classify a large number of English verbs, based on their semantics and their argument structures. While doing so, I will also introduce some of the terminology and the formal descriptive notation that I will be using throughout the remainder of this monograph.
Probably the largest group of verbs in English (or any language, for that matter) consists of what are called state verbs, since they describe either an unchanging state of affairs or a change of state. Verbs which describe an unchanging or static situation are often called stative verbs (do not confuse "stative" verbs with "state" verbs). Verbs which describe a changing or dynamic situation are often called either process or accomplishment verbs. Because linguists do not agree on the precise meanings of these terms, I will immediately abandon them and use the more generic expressions "static state verbs" and "dynamic state verbs".
Let's start by looking at some static state verbs; i.e. verbs which describe a steady or ongoing state:
The patients suffered. The boy sweated. The building shook. The baby slept. The fish stank. The stars twinkled.
These verbs are all intransitive; i.e. they have a subject but no object. Also, each one describes the steady, ongoing state of the subject. Thus, the subject is the patient. From now on, I will refer to verbs of this type as "P-s", where "P" represents "patient" and "-s" indicates that the verb is a static verb.
Here are some more static state verbs with the form P-s:
The trees were tall -> P-s verb = "to be tall"
The door was closed -> P-s verb = "to be closed"
The stew was salty -> P-s verb = "to be salty"
The walls were blue -> P-s verb = "to be blue"
The mouse was dead -> P-s verb = "to be dead"
English speakers may be surprised to see adjectives and past participles being treated as descriptive verbs. However, words which describe steady states have just as much of a verbal nature as words which describe changes of state. The English verbs "to sleep", "to stink", "to twinkle", etc. illustrate this very well. In fact, many natural languages (e.g. Japanese, Korean, several Sino-Tibetan languages such as Mandarin Chinese, some Siouan languages, several Austronesian languages, and many native languages of Africa, Central America and South America) do not have true adjectives. Instead, these languages use words that are essentially intransitive verbs, and which can be inflected or otherwise used in the same way as any other intransitive verbs.
Now, the above examples represent intransitive static state verbs. Here are some examples of intransitive dynamic state verbs:
The window broke. The ice melted. The plants grew. The baby fell asleep. The mouse died. The stew cooled. The patient recuperated.
The only difference between these and the previous examples is that the patient experiences a change of state rather than a steady state. Thus, these verbs are the dynamic counterparts of the intransitive static state verbs.
From now on, I will refer to these verbs as "P-d", where "-d" indicates that the verb is a dynamic verb.
Next, let's look at some verbs which describe events in which the subject causes something to happen to the object. These verbs are all transitive; i.e. they have both a subject and an object. Here are a few examples:
He cured the patient. He broke the window. He killed the mouse. He closed the door. He salted the stew. He captured the thief.
In all of the above, the subject "He" is responsible for the event described by the verb. Also, in all cases, the event causes a change of state to occur in the object. Thus, the subject is the agent and the object is the patient. In other words, these verbs are transitive dynamic state verbs.
For verbs like these, I will use the notation "A/P-d", where "A" represents "agent", "P" represents "patient", a slash "/" separates subject from object, and "-d" indicates that the verb is a dynamic verb.
Note that English, unlike almost all other languages, uses exactly the same word for some of its P-d and A/P-d verbs:
P-d: The window broke.
A/P-d: John broke the window.
P-d: The patient healed.
A/P-d: The doctor healed the patient.
Note though, that this usage is highly idiosyncratic, and many words that you would expect to follow the pattern do not:
A/P-d: The doctor cured the patient.
P-d: *The patient cured.
P-d: The patient recuperated.
A/P-d: *The doctor recuperated the patient.
A/P-d: The cat killed the mouse.
P-d: *The mouse killed.
P-d: The mouse died.
A/P-d: *The cat died the mouse.
So far, we've seen P-s, P-d, and A/P-d verbs. Thus, an obvious question is: are there such things as A/P-s verbs?
Yes. And as the designation implies, these verbs always indicate that the agent maintains the patient in some kind of steady state. Thus, all of these verbs imply that the agent somehow "controls" the patient. Here are some examples:
He is operating the lathe. He rules the country. He conducted the orchestra. He chaired the symposium. He holds the knife. He used the hammer. He will prevent the accident. He manages the company. He is bringing the children.
Note that, although these verbs may imply both an entry into and an exit from the event or situation, the major emphasis is on the process BETWEEN the endpoints. For this reason, these verbs are static rather than dynamic.
Now, for states that are normally rendered using adjectives, English uses the particle "keep" to distinguish between A/P-s and A/P-d verbs. Here are some examples:
He kept the door open -> A/P-s verb = "to keep open"
He kept the girl alive -> A/P-s verb = "to keep alive"
He kept the thief captive -> A/P-s verb = "to keep captive"
He kept his mother happy -> A/P-s verb = "to keep happy"
All of the above are effectively A/P-s verbs. English simply uses the particle "keep" to achieve the desired effect. A good paraphrase of these 'verbs' is "agent causes patient to remain in a steady state".
Next, let's look at some verbs that use the focus case role that we discussed earlier. Here are some examples:
The student needs money. The boy misses his father. The company owns the yacht. The child has the coloring book. The report lacks a cover. The kids enjoy the game. The man loves his wife. The policeman sees the thief. The girls hear the music.
In all of the above, the subject experiences a steady state relative to the object. Thus, the subject is a patient, the object is a focus, and the verb is a static state verb. For these verbs, I will use the notation "P/F-s", where "F" represents the focus.
It is also possible to have verbs like these which also have an agent. Here are some examples:
The boy imitated the teacher.
The lady looked at the house. (Think of "to look at" as a single complex verb.)
The men obeyed the rules.
The girls listened to the music. (Think of "to listen to" as a single complex verb.)
The children followed their parents.
The priest thought about his sins. (Think of "to think about" as a single complex verb.)
In the above examples, the subject not only experiences the steady state indicated by the verb, but is also responsible for the state; i.e., the subject is also in control. Thus, the subject is both the agent and the patient, and the object is the focus. I will refer to these verbs as AP/F-s.
Incidentally, notice how some of the above complex verbs become simple verbs when they are defocused:
The lady is looking. The lady is looking at the house.
The girls are listening. The girls are listening to the music.
The priest is thinking. The priest is thinking about his sins.
Thus, the unfocused verbs would be described as AP-s.
It is also possible for AP/F verbs to indicate a change of state. Here are some examples:
Louise befriended her classmate. Mike joined the party. John memorized the poem. He entered the room. The teacher took the book. Mary divorced him two years ago. The man disowned his oldest son. Bill left the building.
These verbs describe a situation in which the agent causes himself to undergo a change of state relative to the focus. Thus, they are all AP/F-d.
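Since the notation introduced so far is regular, it can be read mechanically. The sketch below is a hypothetical illustration in Python, not part of the interlingua software: the names `VerbClass` and `parse_verb_class` are my own, and the parser handles only the labels seen so far (a subject made of role letters, an optional object after a slash, and a "-s" or "-d" suffix).

```python
from dataclasses import dataclass

# Role letters used in the notation.
ROLE_NAMES = {"A": "agent", "P": "patient", "F": "focus"}


@dataclass(frozen=True)
class VerbClass:
    subject: tuple  # roles played by the subject, e.g. ("agent", "patient")
    obj: tuple      # roles played by the object; empty if there is no object
    dynamic: bool   # True for "-d", False for "-s"


def _to_roles(letters: str) -> tuple:
    """Expand role letters such as "AP" into ("agent", "patient")."""
    return tuple(ROLE_NAMES[ch] for ch in letters)


def parse_verb_class(label: str) -> VerbClass:
    """Parse a label such as "P-s", "A/P-d", or "AP/F-s"."""
    roles, sep, suffix = label.rpartition("-")
    if not sep or suffix not in ("s", "d"):
        raise ValueError(f"expected a '-s' or '-d' suffix in {label!r}")
    # The slash separates subject from object; the object may be absent.
    subject, _, obj = roles.partition("/")
    if not subject or any(ch not in ROLE_NAMES for ch in subject + obj):
        raise ValueError(f"unrecognized role letters in {label!r}")
    return VerbClass(_to_roles(subject), _to_roles(obj), suffix == "d")


for label in ("P-s", "P-d", "A/P-d", "A/P-s", "P/F-s", "AP/F-s", "AP-s", "AP/F-d"):
    print(label, parse_verb_class(label))
```

Representing the subject as a tuple of roles keeps "AP" (one argument playing two roles) distinct from "A/P" (two separate arguments).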
Since all of this may be confusing, let me paraphrase the relationships in a way that illustrates the states and how they are focused:
P/F-s: John saw the mouse. = John experienced a visually perceptive state focused on the mouse.
AP/F-s: John looked at the mouse. = John maintained himself in a visually perceptive state focused on the mouse.
P/F-d: John noticed the mouse. = John entered a visually perceptive state focused on the mouse.
AP/F-d: John glanced at the mouse. = John caused himself to enter a visually perceptive state focused on the mouse.
P/F-s: The platoon heard the music. = The platoon experienced an aurally perceptive state focused on the music.
AP/F-s: The platoon listened to the music. = The platoon maintained itself in an aurally perceptive state focused on the music.
P/F-d: John remembered the party. = John entered a state of remembrance focused on the party.
AP/F-d: The platoon surrounded the village. = The platoon caused itself to be in a state of 'around' focused on the village.
P/F-s: He loved his father. = He experienced a state of loving focused on his father.
P/F-d: She learned discretion. = She entered a state of knowledge focused on discretion.
Overall then, verbs in this group can be generalized as follows:
P/F-s: Static, subject = patient only (to hear, to love)
    X experienced a steady state focused on Y
AP/F-s: Static, subject = agent & patient (to look at, to listen to)
    X maintained himself in a steady state focused on Y
P/F-d: Dynamic, subject = patient only (to remember, to learn)
    X underwent a change of state focused on Y
AP/F-d: Dynamic, subject = agent & patient (to glance at, to surround)
    X caused himself to undergo a change of state focused on Y
Note that in all of the above paraphrases, the words "focused on" could be replaced by the words "relative to", emphasizing that the focus is the referent of a relationship with the patient.
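The four generalized paraphrases can be treated as templates keyed by verb class. The following Python sketch is purely illustrative: the table `PARAPHRASES` and the function `paraphrase` are hypothetical names, and the template wording is copied from the generalizations above.

```python
# Generalized paraphrase for each focused verb class.
PARAPHRASES = {
    "P/F-s": "{x} experienced a steady state focused on {y}",
    "AP/F-s": "{x} maintained himself in a steady state focused on {y}",
    "P/F-d": "{x} underwent a change of state focused on {y}",
    "AP/F-d": "{x} caused himself to undergo a change of state focused on {y}",
}


def paraphrase(verb_class: str, x: str, y: str, relative: bool = False) -> str:
    """Fill in the template for verb_class with subject x and focus y.

    If relative is True, "focused on" is replaced by "relative to",
    which stresses that the focus is the referent of a relationship.
    """
    template = PARAPHRASES[verb_class]
    if relative:
        template = template.replace("focused on", "relative to")
    return template.format(x=x, y=y)


print(paraphrase("P/F-s", "John", "the mouse"))
print(paraphrase("AP/F-d", "John", "the mouse", relative=True))
```

The substitution is applied to the template before the arguments are filled in, so it cannot accidentally alter the names supplied for X and Y.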
Now, some verbs involve the exchange of one item for another, usually between two people. Here are some examples:
John swapped an apple for an orange with Bill. John sold Bill a book for $10. Bill bought a book from John for $10. John loaned Bill his tiller for $10. Bill rented a tiller from John for $10.
In each case, two transfers of possession take place. John loses possession of one item while gaining possession of another, and the reverse change of possession occurs for Bill. Thus, we have, in effect, two patients and two foci, where the foci are the items being exchanged.
We can also regard these verbs as composites; i.e. useful abbreviated versions of two distinct verbs, as in "John gave me his apple and I gave him my orange".
Since both patients are equally responsible for the exchange, each one functions as both agent and patient. However, the subject in the above exchanges plays a more important or 'primary' role as agent than the other patient, and the first item plays a more important or 'primary' role as focus. Thus, for example, in the case of "sell", the seller is the primary agent-patient, while the buyer is the 'secondary' agent-patient. The object sold is the primary focus, and the amount paid is the 'secondary' focus.
[This is not the only possible analysis, but I feel that it is the most practical. It also eliminates the need for any special treatment of exchange verbs that do not need a secondary focus, such as "to lend/borrow".]
Finally, there are some cases where the subject is the only agent-patient, as in "John swapped his brown tie for a blue one". Here, John causes himself to undergo a change of relationship with two different items, without the involvement of anyone else. In this example, "a blue one" is the secondary focus.
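The two-party analysis of exchange verbs, in which one composite verb stands for two transfers of possession, can be sketched as follows. This Python is a hypothetical illustration only: `Transfer` and `decompose_exchange` are invented names, and the sketch assumes that the primary agent-patient is the one who gives up the primary focus (as with "sell", "swap", and "loan"). Verbs such as "buy" and "rent", where the subject receives the primary focus, would need the direction reversed and are not handled here.

```python
from typing import NamedTuple


class Transfer(NamedTuple):
    """One transfer of possession: giver hands item to receiver."""
    giver: str
    item: str
    receiver: str


def decompose_exchange(primary, primary_focus, secondary, secondary_focus):
    """Split a "sell"-type exchange into its two transfers of possession.

    Assumption (not stated in the text): the primary agent-patient gives
    the primary focus and receives the secondary focus in return.
    """
    return (
        Transfer(primary, primary_focus, secondary),
        Transfer(secondary, secondary_focus, primary),
    )


# "John sold Bill a book for $10."
first, second = decompose_exchange("John", "a book", "Bill", "$10")
print(first)
print(second)
```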
There are also state verbs which are used to describe the weather and other environmental phenomena. Here are some examples:
It's raining. It stinks in here. It's windy. It's cold outside. It's snowing. It's scary in there. It's humid today. It's dark in there. It's getting hot = it's heating up. It's getting cloudy = it's clouding up. It's quiet when the kids are at school.
In this group of verbs, the subject is the null place holder "it". English verbs always require a subject in the indicative, but this is not true of most languages.
Note that verbs in this class can be either static or dynamic. Also note that, since these verbs describe states or changes of state, they have an implied patient which is obvious from the context (i.e. the local environment or current situation). In effect, English uses the pronoun "it" to represent the implied patient.
I will not describe the argument structure of these verbs right now, because we do not yet have a sufficient background to treat them properly. Instead, I will postpone their discussion until after we discuss grammatical voice changes.
So far, all of the verbs we have discussed are state verbs. That is, the basic concept represented by such a verb is some kind of state, and this state applies only to the patient. The states can be focused or unfocused, and they can be brought about or maintained with or without an agent.
Also, the states themselves can be categorized by their dynamism; i.e. a state can be "energetic" (e.g. 'alive', 'twinkling', 'sleeping', 'smelly', etc.) or "non-energetic" (e.g. 'dead', 'green', 'tall', etc.). In general, an energetic state can be described using an English present participle, and a non-energetic state can be described using an English adjective or past participle, but there are many exceptions.
Verbs may also be categorized according to their telicity. Dynamic verbs that have a built-in endpoint are called telic, as in "The violinist played a dirge". Dynamic verbs that do not have a built-in endpoint are called atelic, as in "The violinist played with the local orchestra".
Unfortunately, distinctions in dynamism and telicity are not very useful, and I know of no natural languages that mark these distinctions. Whether a concept is energetic or not is a basic part of the nature of the concept and has nothing to do with how the concept is applied. In other words, it is an inherent part of the meaning of the verb root, and there is no need to mark it or express it externally.
Also, the telicity of a verb often depends on the meaning of its arguments rather than on the form of the verb. Thus, in a derivational system such as I am presenting here, telic distinctions are useless.
[Incidentally, this entire section is 'for your information only'. I felt that it was important to mention dynamism and telicity only because linguists attribute so much importance to these concepts in their theoretical discussions about verbs. In my opinion, distinctions in dynamism and telicity are interesting but useless for our purposes. And, as I will illustrate below, there is a much more important and useful distinction: the distinction between agent-oriented concepts and patient-oriented concepts.]
State verbs are not the only kind of verbs that languages employ. There is one other class of verbs, which I will refer to as action verbs, that differs significantly from state verbs. Let's look at a few examples and then see if we can deduce some useful generalizations:
Louise told Bill a joke. Louise kicked Bill. Louise teased Bill. Louise betrayed Bill. Louise pushed Bill. Louise punished Bill.
In each of the above examples, the subject "Louise" is clearly the agent. Also, in the first example, the second object is clearly the focus. But what is the object "Bill"?
In each case, Louise is trying to have some kind of effect on Bill, but the final result is not clear. For example, when Louise kicks Bill, we know that something happens to Bill, but Bill's final state depends on many things that are left unstated, such as how hard she kicked, what kind of shoes she was wearing, where she kicked Bill, and so on. This is quite different from state verbs, where the final state is always clearly indicated by the meaning of the verb. For example, the sentence "He broke the window" makes it very clear what the final state of the window is; i.e. 'broken'. It doesn't tell us anything about the act itself or how it was accomplished. Now, we could say that Bill's final state is 'kicked', but this does not tell us about his condition - it simply tells us how it was accomplished.
The final outcome of the above examples is not clear because these verbs tell us about the act itself rather than the outcome of the act. In other words, these verbs emphasize what the agent is doing rather than emphasizing what is happening to the patient. Another way of putting it is that an action verb tells us how a patient was affected, but does not tell us what the resulting state is. A state verb is exactly the opposite - it tells us the state of the patient without telling us how the state was achieved.
Thus, state verbs are patient-oriented, since they highlight what the patient experiences. Action verbs are agent-oriented, since they emphasize what the agent is doing.
If a root concept is patient-oriented, then the verb will indicate what the patient experiences. Patient-oriented verbs may or may not have agents. If the root concept is agent-oriented, then the verb will indicate what the agent is doing. An agent-oriented verb must have an agent. All patient-oriented verbs are state verbs. All agent-oriented verbs are action verbs.
The most common action verbs are speech acts. Here are some examples:
He advised his clients. He blessed the crowd. He told me a joke. He mocked them. He answered the teacher. He called me an idiot. He blamed John for the accident. He dared me to try it. He promised me that he would come early.
In all of the above the first object is the patient, since it is the entity which the agent is trying to affect. For the verbs which have two objects, the second object is the focus. Thus, in the sentence "He told me a joke", "He" is the agent, "me" is the patient, and "a joke" is the focus.
Verbs which have two objects are called ditransitive.
Finally, we mentioned earlier that the focus of a verb can be one of the following:
1. The entity on which the patient's attention or mental state is 'targeted' or 'focused'; e.g. to see, to play, to learn, to love, to tell, etc.
2. The referent of a relationship with the patient (i.e. the patient's state relative to the focus); e.g. to own, to surround, to include, to need, etc.
3. An elaboration of the event itself; e.g. to play, to sing, to tell, etc.
There is another group of action verbs that are typically referred to as activities. Here are some examples:
The children played (hide and seek). The athletes ran (the marathon). The guests danced (the polka). The old hag smoked (a pipe). The boy read (a good book). The prisoners ate (their suppers). The hawk flew (in circles).
These verbs describe situations in which the agent maintains itself in an ongoing, energetic state. As a result, these verbs are all static AP/F-s verbs, and can be paraphrased as "Agent does something to maintain itself in a steady, active state". In effect, since the agent and the patient are the same, and since an action verb tells us what the agent is doing, it also tells us the state of the patient. In other words, the action and the state are essentially the same.
Now, many activity verbs can take an explicit patient that is not also the agent. Here are some examples:
John played Bill three games of chess. The athletes ran their sneakers threadbare. His wife danced him into a stupor. She smoked us out of the house (i.e., her smoking caused us to leave). The boy read his sister a story. The hawk flew the mouse in circles.
In these examples, we are still saying what the agent is doing while placing more emphasis on what is being done to someone/something else. Thus, these verbs are the A/P versions of the basic activities. And in all of them, the patient takes a direct part in the activity.
[Incidentally, the word "threadbare" in the "run" example, and the expressions "into a stupor" in the "dance" example and "out of the house" in the "smoke" example are called resultatives, since they indicate the final or 'result' state of the patient. Also, the first example using "play" could also be analyzed as a reciprocal construction. We'll have more to say about resultatives and reciprocals later.]
It's important to emphasize that, when dealing with action concepts, we cannot treat AP derivations as we did with state verbs. In an AP state derivation, the agent is causing itself to experience the state that normally applies only to the patient. In an AP action derivation, the agent is causing the patient to perform the action that is normally performed only by the agent.
In other words, in an AP state derivation, the agent experiences the same thing (i.e. state) as the patient. In an AP action derivation, the patient does the same thing (i.e. action) as the agent.
Thus, an AP-s version of a verb such as "to kick" does not mean that the agent kicks himself. Instead, it means that the agent is simply "kicking"; i.e., he is involved in the activity of "kicking" with no specified or discernible target. This is a subtle distinction, but it is an extremely important one.
[Incidentally, this distinction could also be handled by designating the above verb as simply A-s rather than AP-s. However, I have chosen to keep the AP notation because of the inherent symmetry of the distinction, and because it emphasizes that the agent is causing itself to experience what is essentially an energetic "state".]
Now, let's look at some of the distinctions that exist among these categories, and see if we can make some generalizations about verbs. In looking over the above groupings, we can draw the following conclusions:
1. All verb concepts are either:
   a. Patient-oriented -> the root describes the ongoing or final state of the patient.
   b. Agent-oriented -> the root describes what the agent is doing.
2. All verbs are either:
   a. Static verbs -> these indicate that the patient experiences a steady state.
   b. Dynamic verbs -> these indicate that the patient experiences a change of state.
3. The subject of a verb can be any of the following:
   a. Agent
   b. Patient
   c. Both agent and patient
   d. Nothing
4. The object of a verb can be any of the following:
   a. Patient
   b. Focus
   c. Nothing
5. Some verbs take three arguments. In these cases, the subject is the agent, the first object is the patient, and the second object is the focus.
6. All verbs have a patient, whether stated or implied.
As mentioned earlier, there are a few odd-balls which have unusual argument structures, but these are rare and tend to be irregular or idiosyncratic. For the time being, we will limit our discussion to the larger, more regular categories. [Actually, as we will see throughout this monograph, the so-called 'odd-balls' can always be derived from more regular verbs via some form of grammatical voice change or derivational modification.]
From the above list, we might be tempted to create a matrix of 2x2x4x3x2 = 96 elements. However, most combinations never appear. Note, for example, that the orientation of the verb is an inherent part of the meaning of the root, and we will never find two verbs that differ only in this characteristic. Also, a patient can be the subject OR the object - not both - which, of course, makes sense. And if the first argument is both agent and patient, then the second argument cannot be a patient. Also, it serves no useful purpose to have a verb with an object but with no subject. And so on.
With all of the above in mind, we can construct a chart of the possible forms that verbs can take:
ARGUMENTS   STATIC          DYNAMIC
-------------------------------------------
A/P/F       to conduct      to tell
A/P         to manage       to cure
AP/F        to ignore       to memorize
AP          to behave       to escape
P/F         to see          to recall
P           to stink        to recuperate
none        to be cloudy    to cloud up
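The labels in this chart can be unpacked mechanically. The following is a minimal sketch in Python, assuming only what has been stated so far: the letters before the first slash name the subject, each later slash-separated slot is an object, and the trailing "-s" or "-d" marks a static or dynamic verb. The names `ArgStructure` and `parse_arg_structure` are hypothetical and exist only for this illustration. The "none" row is not handled, since no label has been given for it.

```python
from dataclasses import dataclass

# Role letters as used in the notation (A = agent, P = patient, F = focus).
ROLES = {"A": "agent", "P": "patient", "F": "focus"}


@dataclass(frozen=True)
class ArgStructure:
    subject: tuple   # roles played by the subject, e.g. ("agent", "patient")
    objects: tuple   # roles of the objects, in order
    dynamic: bool    # True for "-d" (change of state), False for "-s" (steady state)


def parse_arg_structure(label):
    """Unpack a label such as 'AP/F-d' into subject roles, object roles, and type."""
    args, _, kind = label.partition("-")
    if kind not in ("s", "d"):
        raise ValueError(f"label must end in -s or -d: {label!r}")
    slots = args.split("/")
    subject = tuple(ROLES[letter] for letter in slots[0])
    objects = tuple(ROLES[letter] for slot in slots[1:] for letter in slot)
    return ArgStructure(subject, objects, kind == "d")


print(parse_arg_structure("A/P/F-s"))
print(parse_arg_structure("AP-d"))
```

Keeping the subject as a tuple lets "AP" be read as a single argument that plays two roles at once, which is how the notation is used in this monograph.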
Note that I have excluded verbs that take instrumental subjects (e.g. "The hammer broke the window"). English is one of the very few languages that allows constructions like this. And those few that do allow this generally mark the verb to indicate that the subject is instrumental (e.g. Malagasy, many Bantu languages, many Philippine languages, etc.).
So, how do we apply these generalizations to the practical problem of verb design? Answer: we do it by classifying and marking our verbs (in some way or other) to indicate their valency, case requirements, and whether or not they reflect a steady state or change of state. The easiest way to do this is to design the morphology of the language to reflect these differences. For example, the following English verbs will all be derived from the same root but will have different markers to indicate their different argument structures:
AP-d      to escape = Agent causes self to become free
AP/F-d    to escape from = Agent causes self to become free relative to focus
A/P-d     to release, to free, to liberate = Agent causes patient to become free
A/P/F-d   to release from, to free from = Agent causes patient to become free relative to focus
P-d       to get loose, to become free = Patient becomes free
P/F-d     to get loose from, to become free of = Patient becomes free relative to focus
AP-s      to stay free, to remain free = Agent keeps self free
P-s       to be free = Patient is free
And so on. For all of the above, we can use a state root with the meaning 'free/unrestrained', and can apply a different marker to indicate whether the result is AP-s, A/P-d, etc.
[If you have difficulty understanding the formal description that follows, I suggest that you read my separate essay entitled "Morphology". The essay provides a brief and simple tutorial on how to describe the shapes of words and morphemes. However, it is not necessary to understand how words are shaped in order to understand the lexical semantic system discussed in this monograph.]
Here is a formal description of the morphology of the interlingua:
Definitions:
()   indicates that the enclosed item is optional
{}   indicates that the enclosed item may appear zero or more times
[]   indicates that the enclosed item must appear one or more times
|    ::= logical or
V    ::= any vowel ::= a | e | i | o | u
S    ::= any semivowel ::= y | w
C    ::= any consonant ::= b | c | d | f | g | j | k | l | m | n | p | q | r | s | t | v | x | z
     [The letter 'h' is reserved for anaphora, which will be discussed later.]
C1   ::= modifier starter ::= b | c | d | f | j | k | q | r | t | x | z
     [q and r not used in native words]
C2   ::= classifier terminator ::= g | l | m | p | s | v
C3   ::= suffix terminator ::= g | m | n | p | s | v

[Note that C3 is any classifier terminator except l, which is reserved for prefixes and classifier terminators. C3 also includes n, which can never start a modifier (but can terminate one).]
A vocalic nucleus N has the following form:
N ::= vocalic-nucleus ::= [V]
More precisely, a vocalic nucleus can consist of one or more vowels, and, if there is more than one vowel, then 'i' or 'u' is converted to the corresponding semi-vowel 'y' or 'w'. For example, "eua" becomes "ewa". I'll have more to say about this later.
A prefix has the form:
prefix ::= l N (n)
examples: la, loy, lawe, len, loyn
A suffix has the form:
suffix ::= N C3 | N m C | N n C
examples: on, int, ayn, ev, umb, wav
A suffix changes the syntax and semantics of a word in a precise (i.e., totally predictable) way. For example, if we add the A/P-d suffix "-ap" and the final verb marker "-a" to the root "bodam" (meaning 'duck'), the result "bodamapa" means 'to turn P into a duck', which is a dynamic state verb. In other words, we have changed both the syntax and meaning from a 'duck' noun to a 'change-of-state' verb.
In summary, a prefix modifies the meaning of the entire word that follows it without changing its syntax. A suffix changes both meaning and syntax of the root plus any intervening suffixes. In other words, we start with the root, add the suffixes, and then add the prefixes to obtain the final meaning.
There are two kinds of root morphemes: modifiers and classifiers.
A classifier has the form:
classifier ::= C1 N C2
examples: cop, del, tus, bam, fig, zav

[Only CVC will be used for everyday vocabulary. Unused CNC, especially CSVC and CVSC, can be used for scientific and technical classifiers.]
A modifier has the form:
modifier ::= C1 N (n)
examples: bu, co, day, kwi, zen, tayn
Thus, a root morpheme and a root are defined as follows:
root-morpheme ::= modifier | classifier
root ::= {modifier} classifier
Note that a classifier may be preceded by zero or more modifiers but may not be followed by one. Thus it automatically terminates a root.
Finally, a word has the following form:
POS ::= part-of-speech marker ::= a, e, aw, yu, etc.
word = {prefix} + root + {suffix} + POS
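The morpheme shapes defined above translate directly into regular expressions. The following is a minimal sketch in Python, not a definitive implementation. It assumes that a written vocalic nucleus may contain the semivowels 'y' and 'w' (since 'i' and 'u' are written that way next to another vowel, as in "loy" and "kwi"), and it checks shape only. The names `SHAPES` and `has_shape` are hypothetical.

```python
import re

# Character classes taken from the definitions above.
N = "[aeiouyw]+"                # written vocalic nucleus (assumption: y/w included)
C = "[bcdfgjklmnpqrstvxz]"      # any consonant
C1 = "[bcdfjkqrtxz]"            # modifier starter
C2 = "[glmpsv]"                 # classifier terminator
C3 = "[gmnpsv]"                 # suffix terminator

SHAPES = {
    "prefix": re.compile(f"l{N}n?"),              # l N (n)
    "suffix": re.compile(f"{N}(?:{C3}|[mn]{C})"),  # N C3 | N m C | N n C
    "classifier": re.compile(f"{C1}{N}{C2}"),      # C1 N C2
    "modifier": re.compile(f"{C1}{N}n?"),          # C1 N (n)
}


def has_shape(kind, morpheme):
    """Return True if the whole morpheme matches the shape named by kind."""
    return SHAPES[kind].fullmatch(morpheme) is not None


# The examples listed in the definitions all pass their own shape test.
for kind, examples in {
    "prefix": ["la", "loy", "lawe", "len", "loyn"],
    "suffix": ["on", "int", "ayn", "ev", "umb", "wav"],
    "classifier": ["cop", "del", "tus", "bam", "fig", "zav"],
    "modifier": ["bu", "co", "day", "kwi", "zen", "tayn"],
}.items():
    print(kind, all(has_shape(kind, m) for m in examples))
```

Because a classifier must end in a C2 consonant and a modifier may end only in a nucleus or 'n', no string can match both shapes.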
As for pronunciation, vowels are cardinal, although laxer versions are acceptable (i.e., pronounce vowels as in Italian or Swahili). Pronounce /w/ as in "awake", /y/ as in "soybean", /c/ like "ch" in "chin", /j/ as in "judge", /x/ like "sh" in "ship", /q/ like "s" in "measure", and /r/ as any rhotic (flap, trill, retroflex, uvular, etc). The consonant /h/ may be pronounced like 'h' in "house", as a glottal stop (i.e., like "tt" in "button"), or as [x] (i.e., like "ch" in German "acht"). [More generally, /h/ may be pronounced as a glottal stop or as any unvoiced velar, uvular, pharyngeal, or glottal fricative.]
Geminates (i.e., two or more consecutive, identical vowels, semivowels, or consonants) are not allowed. For example, "kk", "bb", "uu", and "yy" are not allowed. The sequences /uw/, /wu/, /iy/, /yi/, /ou/, /ow/, /ei/, /ey/, /ao/, /ae/, /wy/, and /yw/ are also not allowed. However, it is always legal to pronounce /e/ as either [e] or [ey], and /o/ as either [o] or [ow]. For example, /ea/ may be pronounced [ea] or [eya], and /oa/ may be pronounced [oa] or [owa].
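These restrictions are easy to check by machine. The following is a minimal sketch in Python, assuming that the rules apply to adjacent pairs of written letters. The function name `sequence_problems` and the test strings "bokka" and "kowa" are hypothetical and are not words of the interlingua.

```python
# Disallowed two-letter sequences, as listed above.
FORBIDDEN = ("uw", "wu", "iy", "yi", "ou", "ow", "ei", "ey", "ao", "ae", "wy", "yw")


def sequence_problems(word):
    """Return a list describing every geminate or forbidden sequence in word."""
    problems = []
    for first, second in zip(word, word[1:]):
        pair = first + second
        if first == second:
            problems.append(f"geminate '{pair}'")
        elif pair in FORBIDDEN:
            problems.append(f"forbidden sequence '{pair}'")
    return problems


print(sequence_problems("bodamapa"))   # a well-formed word: nothing to report
print(sequence_problems("bokka"))      # hypothetical string containing "kk"
print(sequence_problems("kowa"))       # hypothetical string containing "ow"
```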
The vowels 'i' and 'u' may never appear adjacent to another vowel - use 'y' or 'w' instead. For example, the roots "foidam" and "kuentis" are illegal, but "foydam" and "kwentis" are legal. If 'i' and 'u' are adjacent, convert the first to a semi-vowel. Thus, "ui" becomes "wi" and "iu" becomes "yu".
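The conversion rule can be sketched as a single left-to-right pass. The following Python sketch assumes that each 'i' or 'u' is examined in order and converted if either neighbor is still a vowel at that moment, which reproduces every example given in this section. How longer vowel runs should behave is not specified here, so the sketch should not be trusted beyond these cases. The name `to_semivowels` is hypothetical.

```python
VOWELS = set("aeiou")
SEMIVOWEL = {"i": "y", "u": "w"}


def to_semivowels(text):
    """Convert 'i'/'u' to 'y'/'w' wherever they touch another vowel."""
    chars = list(text)
    for k, ch in enumerate(chars):
        if ch not in SEMIVOWEL:
            continue
        # chars[k - 1] has already been processed, so a converted neighbor
        # no longer counts as a vowel; this makes "ui" -> "wi", not "wy".
        prev_is_vowel = k > 0 and chars[k - 1] in VOWELS
        next_is_vowel = k + 1 < len(chars) and chars[k + 1] in VOWELS
        if prev_is_vowel or next_is_vowel:
            chars[k] = SEMIVOWEL[ch]
    return "".join(chars)


for raw in ["eua", "foidam", "kuentis", "ui", "iu"]:
    print(raw, "->", to_semivowels(raw))
```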
Although stress is not necessary, we will adopt the following convention for the sake of consistency:
If a root contains at least one modifier, then the first vowel of the first modifier should be stressed.
[Examples: BA-kav-o, li-JO-zip-i, TWA-cu-zum-i, li-la-KO-ke-tov-i]

If a word contains at least one suffix, then the final vowel of the final suffix should be stressed.
[Examples: fag-AP-a, bim-IMB-a, KE-dap-OG-e, BI-jeg-unz-ANG-yu]

If a word contains at least one modifier and one suffix, the suffix should be given primary (i.e., heavier) stress, and the modifier should be given secondary (i.e., lighter) stress.

If a word contains neither a modifier nor a suffix, then the final vowel of the classifier should be stressed.
[Examples: CAL-a, FOM-o, li-KIG-i, li-law-BEG-i]
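The stress convention can be sketched as a small function. The following Python sketch assumes that a word has already been divided into (kind, morpheme) pairs, where kind is one of "prefix", "modifier", "classifier", "suffix", or "pos"; this representation and the name `mark_stress` are hypothetical. Like the examples above, it capitalizes the whole stressed morpheme and does not distinguish primary from secondary stress.

```python
def mark_stress(morphemes):
    """Return the word with stressed morphemes capitalized and hyphens between morphemes."""
    kinds = [kind for kind, _ in morphemes]
    stressed = set()
    if "modifier" in kinds:
        # First modifier of the root.
        stressed.add(kinds.index("modifier"))
    if "suffix" in kinds:
        # Final suffix of the word.
        stressed.add(len(kinds) - 1 - kinds[::-1].index("suffix"))
    if not stressed:
        # Neither modifier nor suffix: stress falls on the classifier.
        stressed.add(kinds.index("classifier"))
    return "-".join(
        text.upper() if position in stressed else text
        for position, (_, text) in enumerate(morphemes)
    )


print(mark_stress([("modifier", "ke"), ("classifier", "dap"),
                   ("suffix", "og"), ("pos", "e")]))
print(mark_stress([("prefix", "li"), ("prefix", "law"),
                   ("classifier", "beg"), ("pos", "i")]))
```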
The above provides almost all of the morphotactic system that I will be using throughout this monograph. (One additional feature will be introduced later in the chapter on Anaphora.) The appendices contain a complete description of the morphology and a list of all of the morphemes that will be created and used in this monograph.
Note that with these word-formation rules, every morpheme and every word is unambiguously started and terminated. Thus, any word with this morphology can always be parsed unambiguously into its component morphemes, and a stream of words can always be divided unambiguously into individual words even if there are no spaces or pauses between words. In fact, even spaces or pauses within a word cannot confuse the parser. Thus, the boundaries between morphemes and words are never in doubt.
This feature of word morphology is usually called either self-segregation or auto-isolation.
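To make the claim concrete, here is a minimal sketch in Python of a segmenter that divides an unspaced stream into words and morphemes using only the shape rules given above. It is an illustration under stated assumptions, not the parser used by the actual software: written nuclei are assumed to contain 'y' and 'w', the letter 'h' (reserved for anaphora) is not handled, and the names `read_nucleus` and `segment` are hypothetical. The sketch works because the suffix terminators (C3) share no letter with 'l' or the modifier starters (C1), so the letter after a nucleus always shows whether the word continues or a new word begins.

```python
VOCALIC = set("aeiouyw")              # assumption: written nuclei may contain y/w
C1 = set("bcdfjkqrtxz")               # modifier/classifier starters
C2 = set("glmpsv")                    # classifier terminators
C3 = set("gmnpsv")                    # suffix terminators
CONSONANTS = set("bcdfgjklmnpqrstvxz")


def read_nucleus(s, i):
    """Return the index just past the vocalic nucleus that starts at s[i]."""
    j = i
    while j < len(s) and s[j] in VOCALIC:
        j += 1
    if j == i:
        raise ValueError(f"expected a vocalic nucleus at position {i}")
    return j


def segment(stream):
    """Split a stream into words; each word is a list of (kind, morpheme) pairs."""
    s = "".join(stream.lower().split())   # spaces and pauses carry no information
    words, i = [], 0
    while i < len(s):
        word = []
        # Prefixes: l N (n)
        while i < len(s) and s[i] == "l":
            j = read_nucleus(s, i + 1)
            if j < len(s) and s[j] == "n":
                j += 1
            word.append(("prefix", s[i:j]))
            i = j
        # Root: zero or more modifiers, closed by exactly one classifier
        while True:
            if i >= len(s) or s[i] not in C1:
                raise ValueError(f"expected a root morpheme at position {i}")
            j = read_nucleus(s, i + 1)
            if j < len(s) and s[j] in C2:
                word.append(("classifier", s[i:j + 1]))
                i = j + 1
                break
            if j < len(s) and s[j] == "n":
                j += 1
            word.append(("modifier", s[i:j]))
            i = j
        # Suffixes (N C3 | N m C | N n C), closed by the part-of-speech marker
        while True:
            j = read_nucleus(s, i)
            if j < len(s) and s[j] in C3:
                if s[j] in "mn" and j + 1 < len(s) and s[j + 1] in CONSONANTS:
                    j += 1
                word.append(("suffix", s[i:j + 1]))
                i = j + 1
            else:
                word.append(("pos", s[i:j]))
                i = j
                break
        words.append(word)
    return words


print(segment("bodamapa"))
print(segment("kopakopambi") == segment("ko pa kop ambi"))
```

The second call shows that moving or removing spaces does not change the result, since the segmenter never consults them.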
As we will see later in Appendix E, the syntax of the interlingua will also ensure self-segregation at the constituent and sentence levels.
In the interlingua described in this monograph, each root will have a default argument structure associated with its classifier. (For a complete list of classifiers, refer to Appendix C.) We can change the default by using a suffix that will indicate the new argument structure.
Here are the suffixes used to change the argument structure of a word:
A/P/F-s: -anz     A/P/F-d: -amb
A/P-s:   -as      A/P-d:   -ap
AP/F-s:  -inz     AP/F-d:  -imb
AP-s:    -is      AP-d:    -ip
P/F-s:   -unz     P/F-d:   -umb
P-s:     -us      P-d:     -up
The above suffixes should only be used if the default argument structure of the root is being changed. To change just the part-of-speech of a root without changing its default argument structure, use an appropriate part-of-speech marker instead (see below).
Now, before proceeding, let's briefly review the semantics behind the notation we are using.
All verbs have a patient, whether stated or implied. If a verb has an agent, then the agent is responsible for the event described by the verb. If a verb has a focus, then the focus is the referent of a relationship with the patient. This referent can be either another entity, as in "John needs a pencil", or an elaboration of the event itself, as in "John told a joke".
A verb is either an agent-oriented action verb or a patient-oriented state verb. An action verb emphasizes what the agent is doing rather than what the patient is experiencing. A state verb emphasizes the ongoing or final state of the patient rather than how it came about or how the agent, if any, brought it about. An action verb must have an agent. A state verb may or may not have an agent.
For these examples, I'm going to start with an English verb, analyze it to determine its argument structure, and create a word for it in the interlingua. I will then try to create as many other verbs as possible from the same root by using different suffixes.
Let's start with the verb "to know", in the sense of 'having knowledge'. Typical sentences using this verb could be:
He knows the answer.
or
He knows that you left early.
Here, the subject is the patient and the object is the focus. The subject experiences a steady state of 'knowledgeable' focused on the object. Thus, this verb is a patient-oriented state verb and its argument structure is P/F-s.
Now, in the interlingua, the root "kop" will represent the state concept that means 'knowing' or 'knowledgeable'. And since 'knowing' is inherently relational, its argument structure will be P/F-s by default. In addition, the final marker "-a" will set the part-of-speech to verb. Thus, the word "kopa" is the P/F-s verb meaning 'to know'.
Note that we are not using the P/F-s suffix "-unz", even though it is technically correct (i.e., it has the correct argument structure). For the sake of consistency, we will only use an argument structure suffix to change an argument structure. And since the default argument structure of "kop" is already P/F-s, we can use it without "-unz".
Next, let's take the same root and see what happens when we apply different argument structure suffixes to it. We will deal first with focused verbs, since the concept of 'knowing' is inherently focused:
A/P/F-s: "kopanza" = 'to keep (someone else) current in (something)'
         Agent maintains patient's knowledge of focus.
         e.g. He keeps them up-to-date on company procedures.

A/P/F-d: "kopamba" = 'to inform/enlighten (someone) about (something)'
         Agent causes patient to gain knowledge of focus.
         e.g. He informed us about the meeting.

AP/F-s:  "kopinza" = 'to keep track of', 'to keep up on', 'to keep informed about', 'to keep oneself current or up-to-date on (something)', 'to monitor'
         Patient maintains his knowledge of focus.
         e.g. He kept track of the student activities.

AP/F-d:  "kopimba" = 'to learn (something)', 'to determine/ascertain/learn that (event)', 'to find out'
         Patient causes himself to gain knowledge of focus.
         e.g. He determined that the butler did it.
              He learned the rules.

P/F-s:   "kopa" = 'to know', 'to understand', 'to realize'
         Patient is knowledgeable about focus.
         e.g. He knows the rules of the game.

P/F-d:   "kopumba" = 'to learn', 'to realize', 'to discover', 'to find out', 'to come to know'
         Patient gains knowledge of focus.
         e.g. He learned the rules by watching the others.
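The derivation process just illustrated is regular enough to be stated as a lookup. The following is a minimal sketch in Python, assuming only the suffix table and the convention that no argument structure suffix is used when the target structure is already the root's default. The names `ARG_SUFFIX` and `derive` are hypothetical.

```python
# Argument structure suffixes, as listed earlier in this section.
ARG_SUFFIX = {
    "A/P/F-s": "anz", "A/P/F-d": "amb",
    "A/P-s": "as",    "A/P-d": "ap",
    "AP/F-s": "inz",  "AP/F-d": "imb",
    "AP-s": "is",     "AP-d": "ip",
    "P/F-s": "unz",   "P/F-d": "umb",
    "P-s": "us",      "P-d": "up",
}


def derive(root, default, target, marker="a"):
    """Build a word from a root, its default structure, and the desired structure.

    The final marker "a" sets the part-of-speech to verb.
    """
    if target not in ARG_SUFFIX:
        raise ValueError(f"unknown argument structure: {target!r}")
    # The suffix is used only when the default structure is being changed.
    suffix = "" if target == default else ARG_SUFFIX[target]
    return root + suffix + marker


for target in ["A/P/F-s", "A/P/F-d", "AP/F-s", "AP/F-d", "P/F-s", "P/F-d"]:
    print(target, derive("kop", "P/F-s", target))
```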
Keep in mind that the above English glosses are approximations, and that the real meaning should be determined from the root plus its argument structure. With the precisely defined semantics used above, there is no doubt. Also, keep in mind that the paraphrases cannot capture the immediacy of the involvement of the participants. This immediacy can only be represented by the single word - not by the paraphrase. For example, a paraphrase of the verb "to kill" is 'to cause to die', even though the two are not synonymous. The paraphrase is simply the closest we can get to the true meaning using multiple words. Please keep this in mind, since we will be using paraphrases throughout this monograph.
Note that all of the above derivations are focused. Focused derivations are the most useful simply because the concept 'knowing' is most often applied this way. But the unfocused derivations are also very useful, as we'll see later when we discuss Grammatical Voice. Before we can discuss these differences, though, we need to acquire a little more background in verbal semantics.
The semantics of a verb that is converted to a noun will be as follows:
When converting a basic verb to a noun, the noun will represent a PROTOTYPICAL GENERIC SUBJECT of an event indicated by the verb.
Now, in the interlingua, we will use the final marker "-i" to change the part-of-speech of a word to 'noun' without changing its argument structure. For example, the noun form of the P/F-s verb "kopa" is simply "kopi". If the argument structure must also be changed, then an argument structure suffix and a part-of-speech suffix will be needed.
Here are some examples:
P/F-s:   "kopi" = 'knower', 'the cognizant one', 'the one in the know'
P/F-d:   "kopumbi" = 'learner'
A/P/F-d: "kopambi" = 'informer'
The semantics of verbs that are converted to adjectives will be as follows:
When converting a basic verb to an adjective, the adjective will represent the prototypical QUALITIES of a generic subject, expressed attributively. This meaning can be best paraphrased as "having the attributes of one who VERBs or of something which VERBs".
In the interlingua, the word-final "-o" will indicate that the part-of-speech of a word is 'adjective'. Here are some sample derivations from the root "kop":
P/F-s:   "kopo" = 'knowing', 'cognizant', 'aware', 'in the know'
A/P/F-d: "kopambo" = 'having the attributes of one who informs or enlightens' = 'informing/enlightening'. (Literally: 'having the attributes of one who causes others to increase in knowledge about something'.)
AP/F-d:  "kopimbo" = 'learning/determining'
It is important to note that the use of present participles (e.g. "informing") to represent the actual meanings is somewhat misleading, because English participles have strong implications of tense and aspect. Non-participial renderings, such as "the man in the know", do not have this problem. Also, for similar reasons, do not confuse adjectives with relative clauses. For example, a "learning geologist" is not quite the same as a "geologist who is learning", since the relative clause definitely specifies tense and aspect, whereas "learning geologist" could also be used if the learning occurred in the past or future.
To continue along the same lines as above, we will use final "-e" to indicate that the part-of-speech of a word is 'adverb'. However, before we can put this to use, we must first digress for a while and discuss the semantics of case tags and adverbs.
In this section, I would like to discuss the semantics of adverbs (especially those that correspond to English adverbs that end in "-ly") and most case tags (such as English prepositions, Japanese post-positions, Hungarian case inflections, etc.), and I will try to show how verbs can be converted to adverbs and case tags. The final result will be a system that can replace many complex, idiosyncratic and periphrastic constructions of natural languages with constructions that are syntactically simple and semantically transparent.
First, let me illustrate how verbs can, in fact, represent the semantics of English prepositions, adverbs, and particles by giving examples from other languages. In these languages, some verbs are actually used in the same way as English prepositions, adverbs, and particles. Consider the following from Vietnamese:
(1) Toi di lai nha bang.
    I   go to  bank
    I'm going to the bank.

(2) Nha bang o  Hanoi...
    bank     in Hanoi
    The bank in Hanoi...
In the first example, the word "lai" is actually the verb 'to come'. When used transitively, it takes a destination as a direct object (like the English verb 'to enter'). In the second example, the word "o" is actually the verb 'to be located at' and takes a location as a direct object. (Thus, the second example could also stand alone as a complete sentence meaning 'The bank is in Hanoi'.) Many other languages, such as Igbo, Ewe, Twi, and Yoruba (Niger-Congo languages of west Africa), Indonesian, Chinese, Cambodian, and many pidgins and creoles have similar constructions. Also, these constructions are not limited to locatives. In Chinese, for example, the word "yung" is the verb meaning 'to use'. It is also the preposition meaning instrumental 'with', as in the sentence "He broke the window with a hammer".
It's also possible to create adverbs, particles, and completely new verbs in this manner. In Hindi, for example, "to run go" means 'to run away', and "to cook take" means 'to cook for oneself'. In Yoruba, "to carry come" means 'to bring', and "to carry go" means 'to take away'.
Linguists have a name for this type of construction, in which two or more verbs are linked without the use of coordinating conjunctions or subordinators. They are called serial verbs.
There are two major types of serial verb constructions: the events indicated by the verbs are either simultaneous or consecutive. In this discussion, we are only interested in the first category, where the two verbs represent events that occur simultaneously.
Other useful serial verb constructions are those in which two or more verbs are linked, all taking the same subject and object. In these cases, the lack of a conjunction or subordinator often implies a certain 'immediacy'; i.e., that the event is a single entity, rather than a combination of unrelated or sequential events. Some languages, such as Chinese and Yoruba, allow any combinations that make semantic sense, and even allow noun phrases to split the verbs, creating an effect similar to relative clauses, but where the events indicated by the verbs are often much more tightly linked. Note that these types of constructions are not idiomatic - they are actually quite productive and their meanings are predictable from syntax and context. What most serial verb constructions have in common is that they are taken by speakers as representing parts of the same event.
English has a few verbs that can be used in this way, such as "to go visit", "to come play", "to let go", "to stir-fry", and "to test-fly" but note that the first two represent consecutive events, which is not what we are interested in here. Most of the time, English uses participles to achieve a simultaneous effect. Here are some examples, where the first sentence of each triplet indicates simultaneity:
The child ran screaming to his mother.
vs. The child who ran to his mother was screaming.
vs. The child who was screaming ran to his mother.

The man woke up shivering.
vs. The man who woke up was shivering.
vs. The man who was shivering woke up.

The boy stumbled, knocking over several chairs.
vs. The boy who stumbled knocked over several chairs.
vs. The boy who knocked over several chairs stumbled.

The girl slept, dreaming of unicorns.
vs. The girl who slept dreamt of unicorns.
vs. The girl who dreamt of unicorns slept.
What is happening here is that the participial phrase is more closely linked to the verb rather than to the noun it ostensibly modifies. As a result, we can create what are essentially compound verbs without subjects, and the results make perfectly good sense:
to run screaming to wake shivering to stumble knocking over several chairs to sleep dreaming of unicorns
In effect, the words "screaming" and "shivering" behave exactly like adverbs, and the words "knocking over" and "dreaming of" behave exactly like case tags (i.e. English prepositions) that introduce phrases that modify the verb.
Thus, we should be able to create adverbs and case tags from verbs by applying the same semantic logic. Here are some examples:
I broke the window using a hammer.
I broke the window with a hammer.
    to use: A/P-s

The kids ran, crossing the road.
The kids ran across the road.
    to cross: AP/F-d

They came, tagging along (i.e. accompanying an unspecified focus).
They came along.
    to tag along: AP-s

The army positioned itself, surrounding/encircling the town.
The army positioned itself around the town.
    to surround: AP/F-d or P/F-s or AP/F-s

The car moved slowly, backing up.
The car moved slowly backwards.
    to back up: P-d

He visited his parents, staying three days.
He visited his parents for three days.
    to stay: P-s or AP-s
Additionally, if English had a verb like Vietnamese "o", Chinese "zai", Cambodian "niw", or Hausa "yana" (all of which mean 'to be located at or in'), we could create the locative senses of the prepositions "in" and "at" from it. For example, if the English word "bain" meant 'to be located in/at', we would have:
The children were playing, baining the backyard.
The children were playing in the backyard.
    to bain: P/F-s
In summary, speakers of languages with serial verb constructions effectively make up new 'prepositions' as they are needed. If a preposition with a desired literal meaning is not available, English speakers will either use existing prepositions metaphorically, or will use participial constructions as illustrated above. In this monograph, we will implement a system that has the flexibility of the serial verb constructions (but which is semantically and morphologically precise), and thus avoid the need for potentially untranslatable metaphor.
As an example of the adverb/case tag creation process, let's continue where we left off when we started this digression, and create a set of adverbs and case tags from the state concept of 'knowledgeable'. As mentioned earlier, we will use the part-of-speech marker "-e" to mark the part-of-speech. Those whose verb forms do not take objects (i.e. intransitive verbs) will become adverbs, and those which do take objects (i.e. transitive verbs) will become case tags (i.e. English prepositions) adding a new oblique argument to the main verb. Thus, in effect, the case tag will link its argument to the verb. In the following examples, I will use English for all words except the new case tag/adverb. I will also use English word order. Here are the results:
A/P/F-s: "kopanze" = 'keeping (someone else) current in'
         e.g. The company spends a lot of money kopanze its employees the latest technology.
         [Note that "kopanze" has two objects. Thus, there is no need for an equivalent to the English preposition "in".]

A/P/F-d: "kopambe" = 'informing (someone) about (something)'
         e.g. The policeman stood in front of the room kopambe us the robbery.

AP/F-s:  "kopinze" = 'reviewing', 'keeping oneself current in'
         e.g. They spent the night at John's house kopinze the lessons for the next day's exam.

AP/F-d:  "kopimbe" = 'learning about (something)'
         e.g. He spent three years kopimbe the conspiracy.

P/F-s:   "kope" = 'knowing (something)'
         e.g. Joe quietly left the room kope he would be called on next.

P/F-d:   "kopumbe" = 'learning (something)', 'coming to know'
         e.g. He watched their activity for three hours kopumbe valuable information.
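Whether an "-e" form behaves as an adverb or as a case tag follows directly from the argument structure of the underlying verb. The following is a minimal sketch in Python, assuming that the number of objects equals the number of slashes in the label (so "A/P/F" has two objects, "P/F" has one, and "AP" has none). The names `object_count` and `e_form_kind` are hypothetical.

```python
def object_count(arg_structure):
    """Count the objects in a label such as 'A/P/F-d' (one per slash)."""
    args, _, _ = arg_structure.partition("-")
    return args.count("/")


def e_form_kind(arg_structure):
    """Intransitive verb forms become adverbs; transitive ones become case tags."""
    return "adverb" if object_count(arg_structure) == 0 else "case tag"


# Labels taken from the English examples earlier in this section.
for verb, label in [("to use", "A/P-s"), ("to cross", "AP/F-d"),
                    ("to tag along", "AP-s"), ("to back up", "P-d")]:
    print(verb, label, e_form_kind(label))

# "kopanze" is A/P/F-s, so it must be followed by two arguments.
print(object_count("A/P/F-s"))
```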
In all cases, note how the derived case tag modifies the whole sentence, just as if it were an oblique argument of the main verb. Note also that, in the above examples, the case tag is tightly bound to the subject of the main verb. For example, in the sentence:
Joe quietly left the room kope (= 'knowing') he would be called on next.
the subject of the case tag "kope" is P and links to the subject of the main verb "to leave" which itself is AP/F-d. Thus, the effective subject of the case tag "kope" is "Joe". And in the sentence:
The policeman stood in front of the room kopambe (= 'informing') us the robbery.
the subject of the case tag "kopambe" links to the subject of the main verb "to stand" which is AP-s. Thus, the effective subject of the case tag "kopambe" is "the policeman".
[Incidentally, note that "kopambe" is A/P/F-d and must be followed by two arguments, "us" and "the robbery". No preposition can appear between them. The English translation, however, requires the preposition "about" or "of" to precede the focus of the verb "inform".]
In this section, we discussed how to convert existing verbs into adverbs and case tags. Later, we will discuss how to systematically create the many case tags required by a language, such as those needed to represent English prepositions.
In the interlingua, we will use the root "xum" to represent a vague but useful relational state, with the meaning 'having an unspecified relationship with', 'having something to do with', and so on.
[Reminder: 'x' is pronounced like "sh" in "show" or "bishop".]
Note that "xum" is the 'other' classifier for the scalar relational state group and that it is P/F-s by default. [See Appendix C for a complete list of classifiers in the interlingua.]
The P/F-s verb form "xuma" will indicate that a relationship exists between patient and focus, but will imply nothing about the nature of the relationship. Thus, its meaning can be paraphrased as 'to have an unspecified relationship with' or 'to have something to do with'.
Here are a few other derivations using "xum":
P/F-s:   xumo - 'associated', 'related', 'corresponding'

A/P/F-s: xumanza - 'to keep P involved with', 'to have P stay involved with'

A/P/F-d: xumamba - 'to cause P to become associated or involved with F'

AP/F-s:  xuminza - 'to keep oneself in an unspecified steady state with respect to', 'to stick with', 'to remain associated with', 'to continue one's relationship with'
         e.g. He stuck with the project.

AP/F-d:  xumimba - 'to get oneself involved with', 'to take on'
         e.g. He took on the project three years ago.

AP-s:    xumisa - 'to keep oneself in an unspecified steady state with respect to something generic or known from context', 'to persevere', 'to remain steadfast'
         e.g. He remained steadfast/persevered until the end.

P/F-d:   xumumba - 'to become involved with', 'to become associated with', 'to come to have something to do with'
         e.g. He got involved with the project against his will.
Now, we can also derive several useful but vague action verbs using the action classifier "bus". Note that "bus" is the 'other' classifier for the action classifier group and that it is A/P-d by default.
Here are some of the more useful derivations using "bus":
A/P-d:  busa - 'to do something to', 'to affect'
        e.g. Billy did something to the cat.

AP/F-s: businza - 'to do/make/perform/carry out (something)'
        e.g. He is doing his homework.
             We made three trips to New York.

AP-s:   busisa - 'to act', 'to take action', 'to take steps'
        e.g. He acted as he did to save lives.

AP/F-d: busimba - 'to accomplish/achieve'
        e.g. We accomplished the task.

A/P-s:  busasa - 'to control/manage/run'
        e.g. John ran/managed the company for three years.
As we will see later, many of the above verbs can undergo additional derivations to produce some very useful words.
Since actions always imply agents, non-agentive derivations will not be very useful.
So far, we've only talked about verbs in the active voice; i.e., where all of the arguments of a verb are present and appear in the proper order. For example, the A/P-d verb "to break" has an agent subject and a patient direct object. However, natural languages have many ways of changing the relative importance or topicality of a verb's arguments. Languages can also remove arguments from the argument structure, while implying that they still exist, and make the missing arguments either obliquely expressible or not expressible at all. Finally, languages can also incorporate normally oblique arguments, making them part of the argument structure of the verb. For example, consider the following:
John broke the window.
    = active voice

The window was broken.
    = passive voice, implied agent

The window was broken by John.
    = passive voice, oblique agent

The window was broken with a hammer.
    = passive voice, oblique instrument, implied agent

A hammer broke the window.
    = incorporated instrument, agent cannot be expressed at all (*by John), new structure is something like I/P-d, where I = instrument.

The window broke accidentally.
    = middle voice, implied agent, agent cannot be expressed at all (*by John).

The window broke.
    = P-d verb. This is sometimes confused with middle voice. In the system described in this monograph, this verb is a basic verb and the example is in the active voice. No agent is expressed or implied.

John broke something.
    = anti-passive (this is an approximation - English does not have a true morphological anti-passive construction). The agent alone is prominent. The patient loses its prominence but may be expressed obliquely. However, even when not expressed obliquely, a patient is always implied.

The window broke John. (poetic license needed here)
or
The window, John broke it.
    = inverse voice (again, these are approximations - English does not have a regular inverse construction). Patient becomes subject, agent becomes object and MUST appear.
Different languages handle these distinctions in different ways. As you can see from the above examples, English uses combinations of syntax, morphology, periphrasis, and even poetic license. Other languages are more regular, some using inflections for some voices, while others may use derivations or a combination of both. In addition, some languages allow the incorporation of other case roles into the argument structure of a verb. In fact, the number of possible voice variations among the world's languages is quite large.
Since grammatical voice has different meanings to different people (with middle voice being the most confused/confusing), let me precisely define the meaning that I am using here. Specifically,
A grammatical voice change starts with a basic verb and rearranges the argument structure by increasing or reducing the topicality of one core argument relative to another, but without changing the basic meaning of the verb. In the process, an existing argument may be deleted. A deleted argument may be expressed obliquely (e.g. passive) or may not be expressible at all (e.g. middle). However, the role of the deleted argument is ALWAYS implied.

Thus, even though the original subject may not be expressed in a middle voice construction, it is still implied. For example, in "Mice kill easily", someone or something is responsible for the killing even though it cannot be expressed. In "Mice die easily", no agent is expressed or implied. Thus, the former is an example of a grammatical voice change, while the latter is not.
An argument that increases in relative topicality is said to be promoted, and an argument that decreases in relative topicality is said to be demoted. Demoted arguments continue to play their original semantic roles, but are somehow less important or less involved. The following examples illustrate this effect:
Active: The enemy bombed the city.
Passive: The city was bombed. <- no agent
    or The city was bombed by the enemy. <- oblique agent

Active: She sewed the dress.
Anti-passive: She sewed something. <- no patient
    or She did the sewing on the dress. <- oblique patient
Although the number of possible voice combinations is large, there are a few that crop up often among the world's languages. Here are the most common ones:
Active - transitive: The subject is slightly more important or topical than the object. Both must be expressed. This is by far the most common form used in almost all languages. [The only exceptions I know of are Fijian and the Salish languages of northwestern North America. In these languages, all transitive verbs are derived by addition of an affix to the intransitive form. Also, in Fijian, the most commonly used verb form is active INTRANSITIVE.]

Passive: The original object becomes the subject and becomes considerably more topical than the original subject. The original subject is no longer part of the verb's argument structure, and does not have to be expressed. However, it is always implied and may be expressed obliquely (in English, typically using the preposition "by").

Middle: The original object is made more topical and becomes the subject. The original subject is deleted from the verb's argument structure and may not be expressed at all even though it is implied.

Anti-passive: The subject is made considerably more salient than the object. The original object is no longer part of the verb's argument structure, and does not have to be expressed. However, it is always implied and may be expressed obliquely.

Inverse: The arguments of the active verb are simply reversed. The original object becomes the subject, gaining in importance; and the original subject becomes the object, losing importance. Unlike passive, the original subject is not oblique and MUST appear.
Keep in mind that the above are generalizations. Individual languages vary both in the ways that the various voices are implemented as well as in their semantics. Also, keep in mind that the list contains just the most common voice systems. Many other combinations are possible, especially those involving normally oblique case roles.
As we saw above, a language like English, which lacks a regular way of marking voice, must resort to complex and idiosyncratic constructions to achieve the same effect. Always keep in mind, though, that a voice change simply re-arranges the topicality of some of the participants in a sentence. Our goal should be to achieve the same results in a consistent and easy-to-understand manner.
Also, English rarely uses the same strategies to handle these needs. For example, an effect similar to the passive and anti-passive can be achieved by using impersonal constructions: "Johnson punched someone" (anti-passive) or "Someone is at the door" (passive). An effect similar to the inverse can often be accomplished by fronting or left dislocation, as in "(As for) the car, John wrecked it". However, true inverse effects can sometimes be obtained by periphrasis, as in:
Active: The cup is full of water.
Inverse: Water fills the cup.
Finally, inverse and middle effects are sometimes achieved in English by using completely different root morphemes, as in "I enjoyed the show" vs. "The show pleased me" (inverse), or by use of metaphor or idiom, as in "He remembered the answer" vs. "The answer came to mind" (middle).
[Incidentally, the inverse voice comes in two varieties. In the first, which is sometimes called a semantic inverse, an inverse operation may be required in order to properly assign case roles to the arguments of a verb. Semantic inverse constructions are especially common in the native languages of North America. For example, in Plains Cree (Algonquian), a more animate argument is inherently more topical than a less animate argument, and neither word order nor case marking of nouns can change the interpretation. Thus, if "man" and "dog" appear as the main arguments of the verb "bite", then it will always be interpreted as "man bites dog", regardless of word order. An inverse marking on the verb simply reverses the relative topicality, making "dog" more topical than "man", and is required to obtain the sense "dog bites man". I do not consider this usage a true voice alteration. It is simply an uncommon way of marking semantic case roles in a sentence. Similarly, some Sino-Tibetan languages have an inverse voice based on the relative topicality of 1st, 2nd, and 3rd person, rather than animacy. Note though, that although an inverse operation may at times be required, it can also be used when it is not required in order to achieve the changes in topicality that we are describing here. In these cases, such an operation is called a pragmatic inverse.
True pragmatic inverses can be found in languages such as Maasai (Nilo-Saharan), Sahaptian languages (Penutian, western North America; e.g. Nez Perce), Caucasian languages (e.g. Georgian), and Chamorro (Austronesian, Guam). (In fact, Maasai and Sahaptian languages have both semantic and pragmatic inverses.) Finally, a combination of word order changes and direct case marking of nouns can sometimes be used to achieve an inverse effect (e.g. Korean). However, other languages which have this ability (e.g. Russian) frequently use it for quite different purposes. As for true inverse systems, recent research indicates that such systems are actually much more common among the world's languages than had been previously supposed.]
Most European languages (including English) use cumbersome rules involving auxiliaries, participles, reflexives, context, word-order, and even complete lexical changes to indicate voice. More heavily inflected languages (Arabic, Latin, Japanese, Ainu, etc.) use the very simple expedient of inflecting the verb for most indications of voice. Many South American lowland languages and some isolating (i.e. uninflected) languages such as Chinese and Vietnamese do not have a formal morphology or syntax to cover voice, although they can achieve similar effects via explicit topicalization and/or periphrasis.
Finally, other languages such as the Bantu languages of Africa (e.g. Swahili) and Austronesian languages (e.g. Indonesian) use derivational morphemes (which is essentially what we are doing here) to achieve most voice effects. In other words, they create a completely different verb from the same root as the active verb, but the new verb has a different topicalization and argument structure.
So, how should an MT interlingua implement grammatical voice? Ideally, we would like to create a system that can handle any voicing needs, while being both simple and consistent.
I do not feel that grammatical voice change should be implemented in syntax - syntax is not nearly as flexible as morphology. Instead, grammatical voice changes can be best implemented using derivational morphology. In other words, we will allocate a single suffix for each voice. The resulting verbs will, of course, have a different argument structure.
For the interlingua, we will allocate the following suffixes for these voice morphemes:
Middle voice: -em
Passive voice: -es
Anti-passive voice: -os
Inverse voice: -ang
Voice suffixes do not change an existing part-of-speech.
For example, if the state root meaning 'open/unshut/unblocked' is "doykav" (default = P-s), then the word for the A/P-d verb 'to open/unshut' is simply "doykavapa". We can implement the other voices as follows:
middle: doykavapema
    e.g. The window doykavapema easily = The window opened easily.
passive: doykavapesa
    e.g. The window doykavapesa (by the thief) = The window was opened (by the thief).
anti-passive: doykavaposa
    e.g. The thief doykavaposa (of the window)
         = The thief did the opening (of the window) or
         = The thief was the opener (of the window) or
         = The thief opened something.
    [The third gloss applies only if the argument is not expressed obliquely.]
inverse: doykavapanga
    e.g. The window doykavapanga the thief = The window - the thief opened it.
where optional oblique arguments are shown in parentheses. [We'll discuss how to implement these oblique arguments later.]
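The derivations above are plain concatenation: root, argument-structure suffix, voice suffix, and a final part-of-speech marker. As a minimal sketch, assuming the final markers "-a" (verb) and "-o" (adjective) seen in this monograph's examples, the word forms can be generated mechanically. The function name `derive` and the two lookup tables are hypothetical names introduced only for illustration; they are not part of the interlingua software.

```python
# Voice suffixes as allocated in the text; None means the plain active form.
VOICE_SUFFIX = {
    None: "",
    "middle": "em",
    "passive": "es",
    "anti-passive": "os",
    "inverse": "ang",
}

# Final markers as they appear in the monograph's examples (assumed here).
FINAL_MARKER = {"verb": "a", "adjective": "o"}


def derive(root, arg_suffix, voice=None, pos="verb"):
    """Concatenate root + argument-structure suffix + voice suffix + final marker."""
    return root + arg_suffix + VOICE_SUFFIX[voice] + FINAL_MARKER[pos]


if __name__ == "__main__":
    # "ap" is the A/P-d suffix used in "doykavapa".
    for voice in (None, "middle", "passive", "anti-passive", "inverse"):
        print(voice or "active", "->", derive("doykav", "ap", voice))
```

Because the voice suffix sits between the argument-structure suffix and the final marker, the part of speech can be chosen independently of the voice.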
In the above examples, the inverse paraphrase is only approximate, and actually increases the topicality of the fronted item more than it should. Here are some better examples of true inverse effects in English:
Active: John owns the book.
Inverse: The book belongs to John.

Active: This bolt is part of the device.
Inverse: The device includes this bolt.

Active: We experienced many strange things.
Inverse: Many strange things happened to us.

Active: This alliance will result in much misery.
Inverse: Much misery will come of this alliance.
A useful notational scheme will be to put an implied case role in square brackets, with a plus "+" or minus "-" sign to indicate whether it can be expressed obliquely. Thus,
middle:       changes A/P-x  to P-x [-A]
                      AP/F-x to F-x [-AP]
                      P/F-x  to F-x [-P]
passive:      changes A/P-x  to P-x [+A]
                      AP/F-x to F-x [+AP]
                      P/F-x  to F-x [+P]
anti-passive: changes A/P-x  to A-x [+P]
                      AP/F-x to AP-x [+F]
                      P/F-x  to P-x [+F]
inverse:      changes A/P-x  to P/A-x
                      AP/F-x to F/AP-x
                      P/F-x  to F/P-x
where "-x" represents either "-s" or "-d".
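The rules in this notation are regular enough to be computed rather than listed. The following is only a sketch under stated assumptions: argument structures are written as strings such as "A/P-d", and the function name `apply_voice` and the table `DEMOTIONS` are hypothetical names introduced for illustration.

```python
# For each demoting voice: (position of the demoted argument, oblique sign).
# "+" means the demoted role may be expressed obliquely, "-" means it may not.
DEMOTIONS = {
    "middle": (0, "-"),
    "passive": (0, "+"),
    "anti-passive": (1, "+"),
}


def apply_voice(structure, voice):
    """Apply one voice operation to a structure string such as 'A/P-d'."""
    roles_part, aspect = structure.rsplit("-", 1)
    roles = roles_part.split("/")
    if len(roles) < 2:
        raise ValueError("this sketch covers verbs with two or more core arguments")
    if voice == "inverse":
        # Inverse simply reverses the first two arguments; nothing is demoted.
        roles[0], roles[1] = roles[1], roles[0]
        return "/".join(roles) + "-" + aspect
    position, sign = DEMOTIONS[voice]
    demoted = roles.pop(position)
    return "/".join(roles) + "-" + aspect + " [" + sign + demoted + "]"


if __name__ == "__main__":
    for structure in ("A/P-d", "AP/F-s", "P/F-s"):
        for voice in ("middle", "passive", "anti-passive", "inverse"):
            print(voice, ":", structure, "->", apply_voice(structure, voice))
```

Keeping the position and the oblique sign in one table makes it plain that middle and passive differ only in whether the demoted role can still be expressed.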
For verbs that take three arguments, we will do the following:
middle: changes A/P/F-x to P/F-x [-A]
    e.g. *The students taught French easily.
    [This is ungrammatical in English with the intended meaning, but grammatical in the interlingua.]
passive: changes A/P/F-x to P/F-x [+A]
    e.g. The students were taught French (by Mr. Johnson).
anti-passive: changes A/P/F-x to A/F-x [+P]
    e.g. He shouted obscenities (at the crowd).
    [Note that the English verb "to shout" is inherently anti-passive. Thus, we must start by creating an A/P/F-d version of this verb, and then perform an anti-passive operation to derive an exact equivalent of the English verb "to shout".]
inverse: changes A/P/F-x to P/A/F-x
    e.g. The student - John taught him geometry.
In addition, some languages, such as Latin, Shona (Bantu), Turkish, Classical Greek, and German allow impersonal passives, in which an intransitive verb is passivized becoming a zero-argument verb. For example, the AP-s activity verb "to run" could undergo a passive or middle transformation into 0-s [+AP] or 0-s [-AP], depending on the language, where "0" is used to indicate that the verb has no arguments. It is interpreted as something like 'running took place' or 'there was running'. A verb like P-d "to grow" could become 0-d [+/-P], and would mean something like 'growing took place' or 'there was growth'. The interlingua allows all of these variations.
Another useful derivation would be to take an A/P/F verb and reduce the topicality of the third argument. (Remember, the anti-passive discussed above reduces the topicality of the second argument.) We will refer to this as an anti-anti-passive operation. However, I know of no natural language that has a distinct way of marking such an operation, so we will not do so in the interlingua. Instead, we can achieve the same effect by simply changing the argument structure of the word using an A/P suffix. We'll see examples of how to do this later.
As we saw with the verb meaning 'to shout (at)', grammatical voice alterations are useful for creating speech act verbs which never take a patient as a direct object, such as the A/F-s [+P] verb "to dictate", as in "He dictated the letter (to his aide)". For verbs like these, we can create a verb that does allow a direct object patient, and promote a focus to first object by means of the anti-passive alteration.
The passive, anti-passive, and inverse voices are easy to understand, and I'll say no more about them. Middle voice, however, is so frequently confused with basic intransitivity that I'd like to say a little more about it.
English does not have a formal morphosyntax for middle constructions, unlike many other languages (Persian, Swahili, Basque, Somali, Hausa, Turkish, and many, many others - middle forms in these languages often go by other names, such as statives or agentless passives, but they often function semantically as middles). English does not even have a reflexive clitic construction, as do several other European languages, which often performs additional duty for middle voice. This is unfortunate, since, as we will see, it can be extremely useful and productive.
English sometimes allows an active verb to be used without modification in a middle construction, as long as the context forbids an active interpretation. Thus, we can say "The joke did not translate well", or "The plane landed ten minutes ago", or "The library closed early". But even when the meaning is clear, English can be quite idiosyncratic as in "*The mountains see in the distance" or "*The boxes are covering in the storeroom". Sometimes, if the verb has an agent, an indefinite construction can be used, as in "They don't make cars like they used to". And in cases where context and semantics do not make it clear, English is often forced to use periphrastic or passive constructions, completely different words, metaphors, or even idioms. Consider the following examples:
Active: I see the mountains.
Middle: *The mountains see.
        The mountains are in view.
Thus, from the verb "to see", P/F-s, we can derive: "to be in view", F-s [-P]

Active: The gang terrorized the neighborhood for three years.
Middle: *The neighborhood terrorized for three years.
        The neighborhood lived in a state of terror for three years.
Thus, from the verb "to terrorize", A/P-s, we can derive: "to live in a state of terror", P-s [-A]

Active: That woman buys caviar only when it's on sale.
Middle: *Caviar buys only when it's on sale.
        Caviar sells only when it's on sale.
Thus, from the verb "to buy", AP/F-d, we can derive: "to sell (intransitive sense only)", F-d [-AP]

Active: He threw the rock at the window.
Middle: *The rock threw at the window.
        The rock went flying at the window.
Thus, from the verb "to throw", A/P-d, we can derive: "to go flying (metaphorically)", P-d [-A]

Active: I remembered her face.
Middle: *Her face remembered.
        Her face came to mind.
Thus, from the verb "to remember", P/F-d, we can derive: "to come to mind", F-d [-P]

Active: He swallowed the pills with difficulty.
Middle: *The pills swallowed with difficulty.
        The pills went down with difficulty.
Thus, from the verb "to swallow", A/P-d, we can derive: "to go down", P-d [-A]
And so forth. The number of possible examples is almost unlimited. Thus, English can deal with middle concepts, although the forms are usually highly irregular, unpredictable, periphrastic, and often either metaphoric or idiomatic.
Some English verbs that can be used both transitively and intransitively, such as "open", "cook", and "fill", have present participle forms (the "-ing" forms used as adjectives) that refer to the state of the object rather than the subject. For example, "the opening door" means 'the door that is being opened', not 'the door that is doing the opening'. In these cases, the English participle is equivalent to the interlingua's middle form. For example, the adjective "doykavapo" means 'doing the opening' while "doykavapemo" means simply 'opening' as in "the opening door". Also, "doykavapemo" implies that someone or something is causing the door to open; i.e., an agent. If no agent is implied, then the P-d form "doykavupo" should be used instead. [Note that the final marker "-o" is needed in all three cases to convert the result to an adjective.]
Middle verbs are often confused with basic P-s or P-d state verbs. The reason is that the patient is the subject of an intransitive verb, and it is often uncertain whether or not a transitive subject is implied. In languages which have a formal middle voice, however, there is never any doubt. Unfortunately, speakers of languages like English will have to be a little more careful. When in doubt, the basic P-s or P-d form should always be used instead of the middle form unless an agent is clearly implied. Middle verbs are also often confused with reciprocals and reflexives because some languages (especially European languages) use the same forms for more than one voice. In the semantic system used by the interlingua being discussed here, middles, reflexives, and reciprocals are completely different. [We will discuss reflexives and reciprocals later.]
It's important to keep in mind the difference in semantics between middle and passive derivations (including anti-middle and anti-passive). A middle derivation is used when the demoted argument cannot be specified, which is always the case in generic situations and when the demoted argument is known from general knowledge (e.g. "Mice kill easily"), as well as when the demoted argument is generic in the current context (e.g. "The mountains finally came into view"). However, a passive derivation without an oblique argument implies that the speaker is intentionally omitting some information that is not known to the listener, most likely because the speaker does not consider the actual argument to be very important, or perhaps because the speaker does not know it himself. A passive derivation with an oblique argument implies that the speaker considers the argument to be less important than the non-oblique arguments.
[For example, compare "The library closed at 6 o'clock" (middle) with "The library was closed at 6 o'clock" (passive). The middle construction gives the impression that the closure was normal, while the passive construction implies that the closure was unusual and that unknown information was omitted, as in "The library was closed at 6 o'clock by the mayor because of the emergency".]
Thus, the middle and passive derivations represent three distinct degrees of relevance:
Middle: Argument cannot be specified because it is too general, is common knowledge, or is generic in the current context. Specifying it would be redundant or excessively verbose.

Passive without oblique: Argument is not specified because the speaker does not consider it important or does not know it.

Passive with oblique: Argument is provided but is less important than would otherwise be implied if it were not oblique.
As usual, though, language is rarely so precise and there will be some overlap. In other words, a speaker can at times use a middle derivation when a passive one would be more technically correct, or vice-versa.
Finally, since the middle voice makes the subject generic, the noun version of a middle voice alteration has the meaning of a prototypical, generic object of the unmodified verb. This allows us to create many new and useful words. Here is an example using a root we already know:
"kop" = 'to know' -> "kopemi" = 'datum', 'fact', 'item of knowledge'
Compare the above with the passive form "kopesi", which would have the meaning 'something which is known'. With the passive form, the original subject (i.e. the "knower") still has a strong presence. In the middle form, however, the original subject is almost completely eliminated.
Some natural languages can make almost any case role a subject or object of the verb (e.g. Malagasy, some Mayan languages, and most Philippine languages). In fact, among the Philippine languages, verbs almost always have an explicit morpheme that indicates the case role of the subject, and almost any case role can be promoted to subject. Many Bantu languages of Africa (e.g. Swahili) and some Australian languages (e.g. Dyirbal) allow an instrumental case role to be promoted to object. Many Bantu languages also allow a locative case role to be promoted to subject. Indonesian allows a beneficiary case role to be promoted to object. And so on.
Obviously, the above system could be easily extended to add normally oblique case roles to the argument structure of a verb. However, we will not be doing this in the interlingua for the following reasons:
1. It is extremely rare among natural languages.
2. The number of possible combinations of argument position and case role is very large, and would require a large number of special morphemes that would rarely be used.
3. Most (all?) languages that allow promotion of normally oblique case roles have special reasons for doing so. For example, many languages allow relativization of only certain core arguments, and thus a voice change is required before other arguments can be relativized.
4. If the syntax of the interlingua is designed properly, then any argument can be promoted or demoted by simply changing its position relative to the other arguments. For example, consider the following, greatly simplified VSO syntax:
    sentence ::= verb { argument }
    argument ::= core_argument | oblique_argument
    core_argument ::= noun_phrase
    oblique_argument ::= case_tag ( noun_phrase )

[A case tag that is not followed by a noun phrase is an adverb.]

The above syntax allows oblique arguments to be placed after, between, or even before the core arguments, which can have the same effect as explicit, morphological promotion or demotion. For example, if we need to promote an instrumental case role, we can do something like this: "Broke with a hammer John the window" or "Broke John with a hammer the window". Note though, that we must modify the verb itself if we want to promote or demote a core argument.
For all of the above reasons, there is no need to implement grammatical voice changes that would promote normally oblique case roles to core positions. Thus, while there must be a way to modify the relative topicalities of core arguments, there is simply no need to create special morphemes to promote normally oblique arguments.
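To see how little machinery the simplified VSO grammar above needs, here is a minimal recognizer sketch. It assumes the input has already been split into (kind, text) pairs, where the kinds "verb", "np", and "case" are hypothetical labels chosen for this illustration. It also assumes that a case tag always takes an immediately following noun phrase, which the simplified grammar itself leaves open.

```python
def parse_sentence(tokens):
    """Parse a list of (kind, text) pairs into (verb, arguments).

    Each argument is ("core", np), ("oblique", case_tag, np), or
    ("adverb", case_tag) for a case tag with no noun phrase after it.
    """
    if not tokens or tokens[0][0] != "verb":
        raise ValueError("a sentence must start with a verb")
    verb = tokens[0][1]
    arguments = []
    i = 1
    while i < len(tokens):
        kind, text = tokens[i]
        if kind == "np":
            arguments.append(("core", text))
            i += 1
        elif kind == "case":
            # Assumption: the tag greedily takes a directly following noun phrase.
            if i + 1 < len(tokens) and tokens[i + 1][0] == "np":
                arguments.append(("oblique", text, tokens[i + 1][1]))
                i += 2
            else:
                arguments.append(("adverb", text))
                i += 1
        else:
            raise ValueError("unexpected token kind: %r" % (kind,))
    return verb, arguments


if __name__ == "__main__":
    promoted = [("verb", "broke"), ("case", "with"), ("np", "a hammer"),
                ("np", "John"), ("np", "the window")]
    print(parse_sentence(promoted))
```

In both word orders from the text, the core arguments come out in the same relative order; only the position of the oblique argument changes, which is how this syntax expresses promotion without a special morpheme.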
Incidentally, core arguments are not limited to noun phrases. They can also be embedded clauses. Here are some examples:
John wanted the book vs. John wanted Bill to leave.
I saw the soldiers vs. I saw the soldiers marching.
They liked her vs. They liked her portrayal of Juliet.
We know the answer vs. We know that he likes her.
A clause which appears as the argument of a verb is called a complement.
Note that the English embedded clauses are idiosyncratic in that they require either infinitives, participles, nominalizations, or complete finite clauses, depending on the particular verb. By using an embedded clause with the same form as a normal sentence (i.e. a complete finite sentence), you can achieve the same effect with a simpler morphology and syntax. Here is how the above examples would look (the complete embedded clause is in parentheses):
John wanted (Bill leave).
I saw (the soldiers were marching).
They liked (she portrayed Juliet).
We know (he wants (she buy the car)).
They seem awkward in English, but they're linguistically sound, syntactically simpler, and totally lacking in idiosyncrasy. Also, this approach is used in many natural languages.
[Incidentally, a complete description of the syntax of the interlingua has been provided in Appendix E of this manual.]
There are two voice changing operations that demote an argument: passive and middle. A passive voice change demotes an argument but allows it to be expressed obliquely. A middle voice change demotes an argument but does not allow it to be expressed obliquely. If the prefix "anti-" is not used, the first argument (i.e., the subject) is demoted. If the prefix "anti-" is used, then the second argument (i.e., the first object) is demoted.
For example, a passive demotes the first argument, and allows it to be expressed obliquely. An anti-middle demotes the second argument and suppresses its salience so much that it cannot be expressed obliquely.
Here is a complete list of middle and passive suffixes:
-es passive
-os anti-passive
-em middle
-om anti-middle
As stated earlier, if it is necessary to demote the third argument of a ditransitive verb, an appropriate argument structure suffix should be used. For example, if we wish to perform a middle operation on the third argument of an A/P/F verb (i.e., an "anti-anti-middle" operation), then we will use either the A/P-s suffix "-as" or the A/P-d suffix "-ap", whichever is appropriate.
Obviously, this implies that the focus of these verbs can never be expressed obliquely and that we cannot make a semantic distinction between anti-anti-passive and anti-anti-middle. However, I do not consider this a disadvantage because I know of no natural language that can do these things.
When using verbs, we must be careful not to confuse case roles. It is sometimes easy to mistake a focal event for a patient. Consider the following example:
It's sad that John died.
It is tempting to treat the embedded sentence "John died" as if it were the patient in a P-s state verb formed from the root meaning 'sad'. However, an event cannot be "sad" in the sense that it can experience sadness. What we are really describing are the feelings of the speaker (and perhaps others) towards the situation. Thus, when we say "it's sad that ...", we are really describing our feelings or beliefs about the situation. In effect, the speaker and those he may be speaking to are the real patients.
Thus, in a sentence like the above, the real patient is implied, and the mental state of the patient is 'focused' on the event indicated by the embedded sentence. Thus, the embedded sentence is the focus of the main state verb meaning 'to be sad about'.
We can easily create a basic P/F-s verb meaning 'to be sad about', as in the sentence "Bill is sad about his parents' divorce". Using this basic verb, we can perform a middle voice alteration to create the F-s [-P] form meaning 'it is sad that'.
It is also possible for an event to be the agent or cause of the sadness. For this, we would need an A/P-s verb, since the event itself causes the patient to be sad. Thus, we really have several possible forms, as illustrated below:
A/P-s       John's death makes (i.e. keeps) me sad.
A/P-d       John's death saddened me.
P/F-s       I am sad that John died.
F-s [+P]    It's sad (for everyone) that John died.
F-s [-P]    It's sad that John died. OR Sadly, John died.
A similar analysis can be done using the state concept 'hoping':
P/F-s       I hope that I'll win.
F-s [-P]    Hopefully, I'll win.
where both "sadly" and "hopefully" are actually verbs that take a complete embedded sentence as an argument - they are not adverbs as in English.
Words and expressions like these are called disjuncts, and many other examples can be derived in the same way: "to presume" -> "presumably", "to be interesting" -> "interestingly", "to be possible" -> "possibly", "to be incidental" -> "incidentally, by the way", "to be necessary" -> "necessarily", "to be fortunate" -> "fortunately", etc.
Finally, the unspecified arguments to many disjuncts are often provided by the speech situation, such as who is speaking, who is listening, where the speech is occurring, and so on. These are called deictic disjuncts, and I'll have more to say about them later.
Here are some examples of derivations using voice suffixes and the root "bus", which we introduced earlier:
Middle P-s [-A] "busasema" - 'to be under control/in hand'
    e.g. The runaway budget is now UNDER CONTROL.
Inverse P/A-s "busasanga" - 'to be under the control of'
    e.g. The project IS now UNDER THE CONTROL OF the engineering department.
Anti-passive A-s [+P] "busasosa" - 'to be in control/charge'
    e.g. John IS IN CHARGE here.
Middle F-s [-AP] noun "businzemi" = 'deed', 'act', 'action'
Anti-passive AP-s [+F] "businzosa" - 'to be doing something'
    e.g. He IS DOING SOMETHING right now.
It is important to emphasize that the basic voice operations (middle, passive, and inverse) are not sequential. They act independently, as if each operation were the only one operating on the original argument structure. For example, if we apply middle and anti-passive to an A/P/F verb, the middle operation converts A to [-A], the anti-passive operation converts P to [+P], and the result is F [-A] [+P]. This combination is legal, and the order of application of the voice suffixes is irrelevant.
The net effect of this rule is that a core argument can only be affected once. For example, it is illegal to apply a passive and an inverse, since, independently, the passive would convert A/P/F to P/F [+A], while the inverse would convert it to P/A/F, and the two results are not compatible. In other words, the agent argument would have been affected twice, and the result would be ambiguous. If our goal is A/F [+P] (i.e., inverse followed by passive), then we should use a simple anti-passive (suffix "-os"). If our goal is F/P [+A] (i.e., passive followed by inverse), then we're out of luck - there is no way of accomplishing this in the interlingua. Fortunately, I have not been able to find any use for it, and I doubt that any natural language has such a capability.
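The independence rule can be checked mechanically: record which core argument positions each operation affects, and reject any combination in which a position is affected twice. The sketch below assumes that representation. The names `OPERATIONS` and `combine` are hypothetical and introduced only for illustration, and "0" is used for a verb left with no core arguments, as in the impersonal passive notation.

```python
# For each basic voice operation: (positions affected, oblique sign or None).
OPERATIONS = {
    "middle": ({0}, "-"),
    "passive": ({0}, "+"),
    "anti-middle": ({1}, "-"),
    "anti-passive": ({1}, "+"),
    "inverse": ({0, 1}, None),
}


def combine(roles, voices):
    """Apply voice operations independently to the ORIGINAL argument list."""
    touched = set()
    demoted = {}  # position -> oblique sign
    for voice in voices:
        positions, sign = OPERATIONS[voice]
        if max(positions) >= len(roles):
            raise ValueError("%s needs more core arguments" % voice)
        if touched & positions:
            raise ValueError("a core argument would be affected twice")
        touched |= positions
        if sign is not None:
            for position in positions:
                demoted[position] = sign
    core = [role for i, role in enumerate(roles) if i not in demoted]
    if "inverse" in voices:
        # Inverse cannot co-occur with a demotion, so core still equals roles.
        core[0], core[1] = core[1], core[0]
    implied = ["[%s%s]" % (demoted[i], roles[i]) for i in sorted(demoted)]
    return " ".join(["/".join(core) or "0"] + implied)


if __name__ == "__main__":
    print(combine(["A", "P", "F"], ["middle", "anti-passive"]))
    print(combine(["A", "P", "F"], ["anti-passive", "middle"]))
```

Because every operation is evaluated against the original argument list, the order in which the suffixes are listed makes no difference to the result.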
Later, we will learn of other voice operations (reflexive and reciprocal) that actually combine two separate core arguments into a single core argument. These voice operations are not considered basic and are not affected by the above rule. For example, it is possible to apply a passive after a reciprocal. In effect, a non-basic voice operation creates a new verb that can undergo normal basic voice operations.
In many of our verb derivations, we used the word "cause" in our paraphrases of the semantics of verbs which have an agent in their argument structure. Unfortunately, these paraphrases are approximate and often imply some distance between the agent and the event. However, I must emphasize that the agent argument of a verb is the entity that is directly responsible for the event indicated by the verb. Thus, there is a definite semantic difference between 'kill' and 'cause to die', even though our paraphrases may imply otherwise.
If we wish to intentionally put distance between an agent and an event, we must design words that are equivalent to English "cause", "make", etc. Consider the following sentences:
He MADE his son wash the dishes.
I HAD Bill deliver the package.
He CAUSED his wife to have a miscarriage.
In the above examples, the patient (if that is what it really is) cannot be expressed directly:
*He made his son.
*I had Bill.
*He caused his wife.
However, the English verb "to cause" can be used without this quasi patient:
John caused the accident.
Thus, these verbs indicate that an indirect agent is responsible for an event which itself may have a direct agent - the quasi patient is not at all a true patient of the verb "cause/make/have" (although it may be the true patient of the embedded sentence). Also, the English distinction between "cause", "have", and "make" is somewhat idiosyncratic. Semantically, there is no significant difference between them. [Actually, "to have" is a more polite version of "to make", but this distinction is not important to us here. We will discuss how to derive more polite forms of words in the section on register variations.]
The most neutral paraphrase of indirect causation is simply the static 'to keep in existence' or the dynamic 'to cause to become real/actual/existent'. In the interlingua, I will use the state root "kav" to represent this concept (default = P-s adjective). Here are some useful derivations:
A/P-d: "kavapa" - 'to cause/make/create/produce', 'to bring into being', 'to cause to come into existence', 'to bring about/on', 'to cause to become real/actual', 'to make a reality'
    e.g. John CAUSED the accident.
         John MADE Billy wash the dishes.
         John MADE some apple cider.

A/F-d [-P]: "kavamboma" - 'to implement/execute/carry out/bring about/put into effect or practice/accomplish, etc.'
    e.g. They CARRIED OUT your orders.
         We have to IMPLEMENT the new plan by Monday.
    [The focus provides additional information about the unspecified patient without itself being affected. Cf. "We made the boat according to these plans" vs. "We implemented these plans".]

A/P-s: "kavasa" - 'to ensure/insure/guarantee', 'to make sure that ...', 'keep/maintain a reality'
    e.g. Skilled teamwork ENSURES high quality results.
         John will MAKE SURE that there's enough food.
    [Incidentally, the opposite of "kavasa" is "juvasa" and means 'to prevent or preclude'; i.e. to ensure that something remains non-existent.]

P-s: "kavo" - 'real', 'actual', 'existent'
    e.g. John said he saw a REAL unicorn.

P-s verb: "kava" - 'to be real/actual', 'there be', 'the reality is that', 'In reality ...', 'Actually, ...'
    e.g. THERE ARE ten people at the party.
         THE REALITY IS THAT they're all gone.

P-d: "kavupa" - 'to come into existence', 'there came to be', 'it came to be that', 'to become a reality', 'to become real', 'to come about', 'to happen', 'it happened that', etc.
    e.g. The new policy CAME INTO BEING after he resigned.
         The accident HAPPENED because of poor visibility.
         THERE CAME TO BE fewer people willing to help.
         IT CAME TO BE THAT fewer people were willing to help.
In the interlingua, an unfocused derivation will have exactly the same semantics as the corresponding anti-middle derivation if the root is focused by default. If it is not focused by default, then the semantics will be different, as we will discuss later. Thus:
    AP-x is equivalent to AP-x [-F]
    P-x is equivalent to P-x [-F]
For example, P-s "kopusa", meaning 'know (intransitive)', is equivalent to P-s [-F] "kopoma".
Note that, since we have not implemented an anti-anti-middle or an anti-anti-passive, A/P-d "kopapa", meaning 'to inform' (transitive - not ditransitive!), is equivalent to either the anti-anti-middle or the anti-anti-passive of "kopamba".
This approach has an important implication that may not be immediately obvious. Since middle derivations indicate that the demoted argument is generic, the lack of a middle voice change indicates that the argument must either be explicitly specified or is intentionally being withheld by the speaker. And if it is being withheld, then it is equivalent to an appropriate passive operation. Here are some examples that should help illustrate this point:
    kopa = P/F-s verb meaning 'to know'
    kopuso = kopomo = P-s [-F] adjective meaning 'knowing', 'cognizant', 'in the know', etc.
    kopo = ???
Since "kopo" is focused but does not have an explicit focal argument, the argument is being explicitly withheld. In other words, it is equivalent to an anti-passive:
kopo = koposo = 'knowing something that the speaker doesn't know or isn't telling'
Note that even though the form "kopo" is effectively anti-passive, it is still more general than the unfocused form "kopuso/kopomo", and will be applicable in all situations. The unfocused adjective should only be used to emphasize that the focus is generic. And since English rarely (if ever) makes this distinction, both forms of such derivations will generally have the same English translation.
By now, it should be obvious that word design can be extremely productive in a language possessing a rich classificational morphology. This kind of morphology allows the language designer to create a large vocabulary with semantic precision, while minimizing the number of root morphemes needed. However, so far we've only used this approach to design basic verbs. We now need to see if a similar approach can be used to design basic nouns.
I began my discussion of verbs by providing a large number of examples that I placed into groups based on their argument structures. I felt that this was necessary because my approach to classifying verbs is unusual (and probably unique).
For nouns, though, I don't think that large numbers of examples will be needed, simply because the classes and their semantics are fairly obvious.
[Incidentally, I am not aware of any other work that classifies verbs as I have done here. Initially, I was tempted to adopt the more widely accepted Vendlerian analysis which classifies all verbs into the four major categories: state (e.g. "to know", "to love"), activity (e.g. "to run", "to sing"), accomplishment (e.g. "to sing a song", "to write a book") and achievement (e.g. "to die", "to find"). However, although I experimented with these four categories, I was very unhappy with the results. The standard categories seemed too vague, and I often had difficulty deciding which category a verb belonged to. An even greater disadvantage is that they provide almost no information about the semantics of the words. In any case, I felt that I needed a more productive system, and eventually ended up with the approach that I am using here.]
Before starting, let's precisely define what we mean by the expression "basic noun". Here is the definition that I will use:
A basic noun will represent an entity that has an actual physical existence (including extinct entities as well as entities from fantasy, mythology, etc.). Thus, such an entity must be composed of matter, energy, a combination of both, or time. Furthermore, characteristics which distinguish it from other entities must be verifiably physical (as opposed to functional, social, cognitive, etc).

Note that my definition is purely semantic and has nothing to do with how a word is actually used in a sentence. Thus, for example, we will derive the word for "window" as a basic noun, while the word for "learner" will be derived from a basic verb (as we illustrated earlier), even though both are used as nouns in a sentence. The word "window" is a basic noun because it can be uniquely described using only its physical properties. The word "learner", however, must be derived from a verb, even though it represents a physical entity, because it does not differ from other related entities (such as "informer" or "knower") in a verifiably physical way. In other words, we cannot distinguish a "knower" from a "learner" or determine their respective natures by examining only their physical traits. Their differences lie in what they do, not in what they are.
I will classify most basic nouns as follows:
1. An entity represented by a basic noun must consist of matter, energy, a combination of both, or time.
2. An entity of matter and/or energy represented by a basic noun must be either living or non-living.
3. A non-living entity represented by a basic noun must be either natural or artificial.
So, using this approach, we can create the following basic noun classes:
matter & energy:
    living, species        -> man, lizard, clam, tree, bacteria
    living, organs         -> hand, leaf, branch, liver, acorn
    living, diseases       -> arthritis, pneumonia, claustrophobia
    non-living, natural    -> storm, tide, geyser, rainbow
    non-living, artificial -> computer, airplane, oven, fountain
matter:
    natural                -> salt, rock, cliff, river, island
    artificial             -> key, statue, ax, book, wharf, house
energy:
    living                 -> ghost, angel, genie, demon, banshee
    non-living             -> heat, thunder, sunshine, photon
time:
                           -> winter, midnight, equinox, childhood
I am not making a distinction between natural and artificial, non-living energy because we would be forced to make useless distinctions. For example, "light" from the sun would require a different classifier than "light" from a light-bulb.
The 'living, organs' category includes all parts of living organisms that themselves contain life. Thus, "acorn" is considered an organ, while "shell" (e.g. clam shell) and "hair" are considered 'matter, natural'.
The 'living energy' category includes anything related to the supernatural, including mythological creatures that are primarily spirit-like (such as banshees and fairies). Mythological creatures that are primarily physical will be placed in an appropriate physical class. For example, the word meaning 'dragon' will be in the lizard class, 'minotaur' will be in the mammal class, and so on.
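The three classification rules and the ten classes in the chart can be read as a small decision procedure. The following Python sketch is only an illustration of that reading; the function name, parameter names, and label strings are hypothetical and are not part of the interlingua or its software.

```python
def basic_noun_class(composition, living=None, natural=None, kind=None):
    """Return a label for one of the ten basic noun classes.

    composition: "matter & energy", "matter", "energy", or "time"
    living:      True or False (used for "matter & energy" and "energy")
    natural:     True or False (used for "matter" and for non-living
                 "matter & energy")
    kind:        "species", "organs", or "diseases" (living
                 "matter & energy" only)
    All names and labels here are illustrative assumptions.
    """
    if composition == "time":
        return "time"
    if composition == "energy":
        if living is None:
            raise ValueError("energy nouns need living=True or living=False")
        # Non-living energy is deliberately not split into natural/artificial.
        return "energy: living" if living else "energy: non-living"
    if composition == "matter":
        if natural is None:
            raise ValueError("matter nouns need natural=True or natural=False")
        return "matter: natural" if natural else "matter: artificial"
    if composition == "matter & energy":
        if living is None:
            raise ValueError("matter & energy nouns need living=True or False")
        if living:
            if kind not in ("species", "organs", "diseases"):
                raise ValueError("living nouns need kind=species/organs/diseases")
            return "matter & energy: living, " + kind
        if natural is None:
            raise ValueError("non-living nouns need natural=True or False")
        origin = "natural" if natural else "artificial"
        return "matter & energy: non-living, " + origin
    raise ValueError("unknown composition: " + repr(composition))


# "acorn" contains life, so it is an organ; a clam shell is natural matter.
acorn = basic_noun_class("matter & energy", living=True, kind="organs")
shell = basic_noun_class("matter", natural=True)
# "light" gets one class whether it comes from the sun or from a light-bulb.
light = basic_noun_class("energy", living=False)
```

The sketch raises an error when a required distinction is missing, since every basic noun must fall into exactly one class.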
I believe that the above classes are fundamental, and that any useful system should contain at least these ten classes. However, we will also provide additional sub-classes for classes that have a large number of members. For example, in the 'matter & energy, living, species' class, it will be useful to distinguish between plants and animals. In fact, we will create even finer distinctions, such as between 'mammal', 'bird', 'fish', 'insect', etc. In the 'matter, artificial' class, it will be useful to distinguish between substances (e.g. "plastic"), locatives (e.g. "wharf") and others (e.g. "hammer"). The same substance/locative/other distinction will also be applied to the 'matter, natural' class to allow us to distinguish between words such as "water" (substance), "cliff" (locative), and "boulder" (other).
If we make these additions, our chart will look like this:
matter & energy:
    living, species
        vertebrates:
            mammals                        -> man, tiger, mouse, deer, dolphin
            birds                          -> hawk, ostrich, canary, penguin
            reptiles                       -> lizard, snake, turtle, crocodile
            other vertebrates (i.e., fish) -> trout, halibut, perch, lamprey
        arthropods                         -> ant, bee, crab, mosquito, grasshopper
        other animals                      -> clam, jellyfish, snail, worm
        plants (including kingdoms Monera, Protoctista, and Fungi):
            trees                          -> tree, oak, shrub, apple, juniper bush
            other plants                   -> grape, morning glory, horsetail, moss
    living, organs                         -> hand, leaf, branch, liver, ear
    living, illnesses                      -> smallpox, rheumatism, cancer, flu
    non-living, natural                    -> tornado, geyser, rainbow, earthquake
    non-living, artificial                 -> lathe, telephone, pump, robot, clock
matter:
    natural,    substance                  -> water, sand, bauxite, ivory, urine, air
                locative                   -> planet, river, island, mountain, bay
                other                      -> boulder, fang, stalagmite, shell
    artificial, substance                  -> plastic, benzene, steel, cloth, glue
                locative                   -> wharf, city, road, school, stadium
                other                      -> window, statue, desk, book, nail
energy:
    living                                 -> ghost, jinni, god, devil, banshee
    non-living                             -> heat, thunder, photon, noise, light
time:
                                           -> winter, sunset, equinox, infancy
The non-living, artificial matter & energy class will represent powered items that typically do not run on only human or animal power; e.g., an electric drill, but not a hand-powered drill.
Note that I use the word "locative" in the following sense: a locative noun represents an entity which typically is built in place or evolves naturally in a single location, which is extremely difficult (if not impossible) to move to a different location, which is relatively permanent, and which is typically considered a place where humans can go to, remain at, or depart from. Again, the choice may seem subjective. For example, "wharf", "staircase", "bleachers", and "gallows" will be artificial locatives, but "beehive", "den/burrow", and "nest" will not be locatives. Instead, they will belong to the 'natural other' class.
In the interlingua, we will create classifiers for all of the above classes and sub-classes. In addition, since there are many more possible classifiers than will be needed, we will sub-categorize the classes even further. For example, the 'natural substance' sub-class will have the following classifiers and associated sub-categories:
    civ   elements and compounds (hydrogen, oxygen, sodium, chlorine, uranium, sodium chloride, potassium sulfate, biochemicals (including drugs), insulin, DNA, nucleotide, amino acid, methane, butanol, polybutadiene, benzoic acid, chlorobenzene, dimethylamine)
    zop   plant/animal substances and mixtures (blubber, frankincense, beeswax, beef, honey, blood, wood, marrow, milk, feces, coral, tears, spit/spittle, urine)
    jav   other natural substances (air, coal, soil, clay, bauxite, dust, sand, ore, ruby, snow, gypsum, poison)
A complete list of all the classifiers is provided in Appendix C.
Each classifier has a default class; i.e., a default semantics and syntax. For example, as we saw earlier, the classifier "kop" (meaning 'know') is a P/F-s mental state by default. The class of a root that has more than one morpheme is determined by the rightmost morpheme, and this morpheme is referred to as the classifier.
A stand-alone classifier (i.e., one that is not preceded by a modifying morpheme) will represent a specific, prototypical member of the class, rather than the entire class. For example, when the 'bird' classifier is used alone, it will actually represent the particular category of birds called 'pigeon/dove' rather than the more general meaning 'bird'. This classifier can then be modified by other modifiers to represent other birds, such as 'eagle', 'gull', 'ostrich', and so on. If we need to create a root representing the entire class, we will modify the classifier with the modifier "bye", meaning 'member'. For example, the 'member' modifier plus the 'bird' classifier means simply 'bird', and can refer to any bird.
The member modifier "bye" will not be applied to a classifier unless the result is useful and has a counterpart in many natural languages. For example, there is a classifier for 'abstract attributes and qualities'. Since I doubt that any natural language has a single word to represent this concept, we will not create a word using "bye" plus this classifier.
Note that a specific member of a class does not have to represent a single species or a single kind or type of entity. For example, there are several species of pigeon.
The classifier morpheme of a root is semantically and syntactically precise. However, the modifiers to the left of a classifier will provide no syntactic information at all and may not necessarily be semantically precise, but will provide semantic clues that will help the student remember the meaning of the complete root. In other words, the modifiers to the left of the classifier will be used for their mnemonic value to modify the classifier. The classifier, however, will always be semantically precise. For example, the root meaning "bicycle" consists of the numeric modifier meaning 'two' plus the 'vehicle' classifier.
Also, some modifiers can have completely different meanings in different contexts. For example, the modifier with the meaning 'six' would be useless with most classifiers except the numeric classifier and certain shapes (such as the hexagon). In cases like this, the modifier will have one or more completely different meanings that will be more useful in other contexts. Even so, however, we will always try to assign multiple meanings that are at least somewhat reminiscent of or related to each other. For example, the modifier meaning 'two' will have the alternate meanings 'divided/opposition'.
In summary, a classifier is used in three ways:
1. as a stand-alone root which represents a specific member or sub-group of its class (e.g. 'pigeon')
2. as a classifier that can be modified by modifying morphemes to represent other specific members of its class (e.g. 'ostrich' of the 'bird' class)
3. as a classifier modified by the 'member' morpheme "bye" to represent any member of the class (e.g. a single root meaning 'bird')
Thus, the approach used here will allow an entire, easily learned vocabulary of roots to be flexibly designed using a relatively small number of root morphemes.
Now, let's design some words. We'll start with the modifying root morpheme "bo", which will have the vague senses 'fish/water/liquid/swim' and apply it to several classifiers (refer to Appendix C for the complete list of classifiers):
matter & energy:
    living, species
        mammal             -> bozovi - otter
        birds              -> bodami - duck
        fish               -> bobomi - puffer/blowfish
        insects            -> bokagi - mosquito
        trees              -> bojigi - tupelo/black gum/sour gum
    living, organs         -> bocesi - bladder (e.g. urinary or gall)
    non-living, natural    -> bofepi - rain(fall)
    non-living, artificial -> botimi - boat
                           -> bobisi - washer/washing machine
matter:
    natural,    substance  -> bocivi - water
                locative   -> botisi - oasis
                other      -> boxami - drop(let)
    artificial, substance  -> bofupi - drink/beverage
                locative   -> bozegi - reservoir
                           -> bodepi - bathroom
                other      -> bozipi - cup
energy:
    living                 -> bodevi - undine, water spirit
    non-living             -> boxogi - hydropower
time:
                           -> bofemi - monsoon, rainy season
The simplest kind of derivation is to change the part-of-speech. In the interlingua, the verb form will have the meaning 'to be X', and the other forms will be interpreted in the usual way. Thus, for example, the word "bodama" would be a P-s verb meaning 'to be a duck'. The adjective form, "bodamo", would be used in expressions such as "Billy the duck", "duck egg", and any other modification that is inalienably 'duck'. The adverb form "bodame" would have the meanings 'being duck', 'since it is (a) duck', 'since they are duck', etc. Note that this approach is perfectly consistent with the rules we adopted for basic verbs.
We can also change the argument structure to something other than P-s. When doing so, the basic noun will represent the state, and the verb suffix will apply to the state in the usual way. For example, P-d "bodamupa" would mean 'to become a duck', A/P-d "bodamapa" would mean 'to change P into a duck', and so on.
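The "duck" derivations above follow a regular pattern that is easy to mechanize. The Python sketch below reproduces only the endings attested in these examples ("bodami", "bodama", "bodamo", "bodame", "bodamupa", "bodamapa"); the table and function names are hypothetical, and the two tables are an inference from those examples rather than the complete suffix system of the interlingua.

```python
# Final vowels inferred from "bodami", "bodama", "bodamo" and "bodame".
PART_OF_SPEECH_ENDINGS = {
    "noun": "i",
    "verb": "a",       # P-s verb: 'to be X'
    "adjective": "o",
    "adverb": "e",
}

# Verb suffixes inferred from "bodamupa" and "bodamapa" only.
ARGUMENT_STRUCTURE_SUFFIXES = {
    "P-d": "upa",      # 'to become X'
    "A/P-d": "apa",    # 'to change P into X'
}


def change_part_of_speech(root, part_of_speech):
    """Attach the part-of-speech ending to a root such as "bodam"."""
    return root + PART_OF_SPEECH_ENDINGS[part_of_speech]


def change_argument_structure(root, structure):
    """Build a verb whose argument structure is something other than P-s."""
    return root + ARGUMENT_STRUCTURE_SUFFIXES[structure]


duck_forms = {
    "noun": change_part_of_speech("bodam", "noun"),
    "be a duck": change_part_of_speech("bodam", "verb"),
    "become a duck": change_argument_structure("bodam", "P-d"),
    "change P into a duck": change_argument_structure("bodam", "A/P-d"),
}
```

Because the basic noun simply supplies the state, the same two functions apply unchanged to any other root of the same shape.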
In the previous section, we used the modifier "bo" to modify several noun classifiers. We should also be able to derive useful words by applying it to verb and adjective classifiers. Here are some examples (refer to Appendix C for a complete list of classifiers):
    bokemo   -> "-kem" = a scalar, non-relational state classifier, P-s adjective: wet
    bokema   -> P-s verb: to be wet
    bokemapa -> A/P-d verb: to wet/make wet
    bokemupa -> P-d verb: to get wet
    bokasa   -> "-kas" = involuntary act classifier, P-d verb: to sweat
    bocala   -> "-cal" = activity classifier, AP-s verb: to swim
Here are some more examples using the modifier "ko-", which is reminiscent of the root "kop", meaning 'to know' or 'to have knowledge of'. The modifier "ko-" will represent the concepts 'knowledge', 'education', 'wisdom', and so on (refer to Appendix D for a complete list of modifiers and their meanings):
    kokigi = school    - "-kig" = building(s) or place of business
    kodepi = classroom - "-dep" = room
    kojisi = desk      - "-jis" = furniture
    kotega = explain   - "-teg" = speech act, A/P/F-d verb
    kobegi = scholar   - "-beg" = person
    kobisi = computer  - "-bis" = powered device
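Since the classifier is always the rightmost morpheme of a root, a basic word can be taken apart mechanically. The Python sketch below assumes, purely on the evidence of the examples above, that a classifier is three letters long and that a basic word ends in a single part-of-speech vowel. The function name and tables are hypothetical, the table holds only the classifiers glossed in these examples, and the sketch does not handle longer derived forms such as "bokemapa".

```python
# Classifier glosses copied from the examples in this section.
CLASSIFIER_GLOSSES = {
    "kem": "scalar, non-relational state",
    "kas": "involuntary act",
    "cal": "activity",
    "kig": "building(s) or place of business",
    "dep": "room",
    "jis": "furniture",
    "teg": "speech act",
    "beg": "person",
    "bis": "powered device",
}

# Final vowels as seen in forms such as "bodami", "bodama", "bodamo", "bodame".
ENDINGS = {"i": "noun", "a": "verb", "o": "adjective", "e": "adverb"}


def analyze(word):
    """Split a basic word into (modifiers, classifier, part of speech).

    Assumes a three-letter classifier followed by one final vowel.
    """
    ending = word[-1]
    classifier = word[-4:-1]
    modifiers = word[:-4]
    if ending not in ENDINGS:
        raise ValueError("unrecognized ending in " + repr(word))
    if classifier not in CLASSIFIER_GLOSSES:
        raise ValueError("unrecognized classifier in " + repr(word))
    return modifiers, classifier, ENDINGS[ending]


school = analyze("kokigi")   # modifier "ko" + building classifier
swim = analyze("bocala")     # modifier "bo" + activity classifier
```

Only the classifier is looked up here, which mirrors the design: the classifier carries the precise semantics and syntax, while the modifiers to its left are mnemonic.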
There are many nouns that are difficult to classify because of their inherent abstractness. Some of these nouns refer to concepts such as language (e.g. French), culture (e.g. Arab), race (e.g. Caucasian), nationality (e.g. Swiss), and religion or ideology (e.g. Christian). These, however, are all proper nouns, and I will postpone discussion of them until later, in the chapter on Proper Names, Borrowed Words, Abbreviations, and Vocatives.
There are also concepts that are more general in nature and which typically describe human activities, the abstract products of such activities, the components of such products, and so on.
The question, though, is: What are these words? Are they nouns? Are they verbs? Or are they something else?
To answer this question, consider the English words "mathematics", "opera", and "adjective". If they are inherently verbs, then why do we never use them as verbs? They are always used as nouns. And if they are inherently stative, then why can we never use them as adjectives? In fact, if they were inherently stative, we would not need to derive such words as "operatic", "mathematical", and "adjectival".
The only conclusion that makes any sense is that these words are inherently nouns.
So, if they are indeed nouns, then how do we classify them?
Consider the word "opera". We might be tempted to classify it as non-living, artificial matter & energy. However, this would put it into the same category as "jacuzzi", "computer", and "automobile". For some reason or other, my mind rejects the idea that "computer" and "opera" are in the same class.
And what about "mathematics", "adjective", and "poem"? Should they be placed in the non-living energy class? If so, they would be classified along with "electricity", "light", and "thunder". Again, my mind rejects this categorization.
One thing that should be fairly obvious by now is that noun classification is inherently arbitrary, and that there is no way to avoid this arbitrariness. We can see logic and structure in the design of verbs, but nouns resist any truly logical classification. The reason for this is simply that nouns represent the products of an essentially random universe. For example, if you look at a diagram that classifies the animal kingdom, you'll find that some main branches have very few sub-branches, while others have numerous sub-branches with sub-sub-branches, and so on. You will also find that some entities resist accurate categorization into any single class.
We can only expect that this inherent arbitrariness will be even more prevalent when dealing with more complex, abstract concepts, especially when we add concepts that represent human activities.
Thus, the only recourse in dealing with these words is to create whatever classes are needed, in the same way as we did for the non-abstract noun classes. Fortunately, we won't need many classes to achieve our goal. In fact, we need very few. [For a complete list with English examples, refer to Appendix C, sections "Groups/organizations" through "Performances".]
It's important to emphasize that abstract classes apply to what sentient creatures do, not to what non-sentient creatures do or to what nature does. We must make a distinction between the phenomenon being studied or applied and what the student or practitioner actually does. For example, the word meaning 'climate' is a member of the non-living, natural, matter & energy class, whereas 'climatology' is a field of study. In other words, two classes will be needed: the natural phenomenon to represent what is being studied, and the field of study to represent what the student or practitioner actually does.
Many performances have basic verbal activities associated with them. In these cases, we will use the same modifier(s) with the performance classifier, the activity classifier, and any other classifiers that may apply. For example, the words meaning 'teach', 'teacher', 'faculty', and 'the field of teaching' will all use the same modifier with different classifiers.
As with other nouns, all abstract nouns can undergo further derivation. For example, if the word "bobegi" means 'plumber', the AP-s form "bobegisa" means 'to keep oneself a plumber' or 'to remain a plumber'.
There is an important distinction between the group member class (classifier = "-beg") and the agentive noun derivation of the activity (classifier = "-cal"). For example, both "kobaybegi" and "kobaycali" (where "kobaycala" is the activity verb meaning 'to teach') can be translated as 'teacher'. However, "kobaybegi" will be the most commonly used word because it applies to teaching in general, whereas "kobaycali" applies only to a specific instance of teaching. English generally does not make this distinction, although there are exceptions; cf. "studier" vs. "student". In the interlingua, we will not use the agentive noun derivation unless we wish to explicitly refer to specific instances of the activity.
[Implementation note: if the member or performance translations can be regularly derived from the activity derivation, then only the activity derivation will appear in the MT dictionary for the target language. For example, "kobaycala" will appear in the English dictionary, but "kobayjepi" and "kobaybegi" will not, because "teaching" and "teacher" can be easily derived from "teach".]
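The implementation note can be illustrated with a small lookup that stores only the activity verb and derives the other translations on demand. This Python sketch is an assumption about how such a dictionary might behave, not a description of the actual MT software. The names are hypothetical, the reading of "-jep" as the performance classifier is inferred from the note, and the English rules (adding "er" and "ing") are deliberately naive.

```python
# Only the activity derivation is stored in the English dictionary.
ENGLISH_ACTIVITY_DICTIONARY = {"kobaycala": "teach"}

# Naive English derivations, keyed by interlingua classifier.
# "beg" = group member; "jep" is assumed to be the performance classifier.
REGULAR_ENGLISH_DERIVATIONS = {
    "beg": lambda verb: verb + "er",
    "jep": lambda verb: verb + "ing",
}


def translate(word):
    """Translate a word directly, or derive it from its activity verb.

    Returns None when neither route succeeds.
    """
    if word in ENGLISH_ACTIVITY_DICTIONARY:
        return ENGLISH_ACTIVITY_DICTIONARY[word]
    modifiers, classifier = word[:-4], word[-4:-1]
    activity_verb = modifiers + "cala"   # same modifiers, activity classifier
    derive = REGULAR_ENGLISH_DERIVATIONS.get(classifier)
    if derive is not None and activity_verb in ENGLISH_ACTIVITY_DICTIONARY:
        return derive(ENGLISH_ACTIVITY_DICTIONARY[activity_verb])
    return None


teacher = translate("kobaybegi")
teaching = translate("kobayjepi")
```

An irregular English counterpart would need its own dictionary entry, which is exactly the case the note excludes by requiring that the translation be regularly derivable.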
When deriving rank and title words, there will often be cases where a particular profession has a hierarchy of many ranks; e.g., the military, the nobility, corporate management, religious organizations, etc. In these cases, we will create a compositional derivation consisting of an appropriate root for the profession prefixed by a number modifier (meaning 'one', 'two', 'three', etc). The number 'one' will be used for the highest possible rank (emperor, pope, CEO), 'two' for the next highest (king, cardinal, vice-president), and so on. [We'll discuss numeric derivations later in the chapter on Counts and Measures.]
Along with ranks/titles and professions, we will also need to represent the actual jobs or positions associated with them. English sometimes has special words to represent these jobs, such as "professorship" and "bishopric", but most of the time, it simply modifies one of the words "job", "position", or "office" (e.g. "The professor's position is still open" or "Bill wants the new engineering job").
In the interlingua, we will not allocate separate classifiers for this concept. Instead, we will create a word meaning 'position/rank/office' and modify it as needed.
Many nouns have separate forms that differentiate between homogeneous entities, individuals, and groups of individuals. These are referred to, respectively, as mass nouns, count nouns, and group nouns. Here are some English examples:
    Mass         Count                   Group
    ---------    -----                   -----
    mutton       sheep                   flock
    grass        blade of grass          lawn
                 ship                    fleet
    foliage      leaf
    beef         steer                   herd/cattle
    hair         hair, strand of hair    wig
    rice         grain of rice
    guts/flesh   organ                   body
    wood         tree                    grove, wood
                 map                     atlas
    water        drop                    shower
Note that English mass nouns are never used in the plural (*muttons, *beefs), while count and group nouns have both singular and plural versions. (However, some English nouns can have more than one sense; e.g. "hair" and "wood".)
Incidentally, do not confuse group nouns discussed in this section with the abstract noun group class discussed in the previous section. Here, we are referring to natural groupings of any basic noun. The separate group class, however, refers to groups of diverse, sentient elements (typically human, although they could also include or consist of members of intelligent alien species) linked by one or more activities specifically associated with the group. The groups discussed in this section do not imply any specific type of activity. These are physical groupings that describe what a group is - not what it does.
In the noun derivation scheme discussed earlier, some classes contained only count nouns while others contained only mass nouns. Specifically, in the 'matter, non-living' classes, 'substances' are inherently mass nouns, while 'locatives' and 'others' are inherently count nouns.
Note, though, that in our derivations, the 'other' counterparts of 'substance' nouns are not necessarily their count equivalents. For example, "bofupi" = 'drink/beverage' does not have a counterpart meaning 'beverage drop' or 'unit of beverage', whereas "bocivi" = 'water', does have the approximate counterpart "boxami" = 'drop/droplet'. If we really need to emphasize that the drop is water, oil, or some other substance, then we will use a compound or periphrasis as we do in natural languages; e.g. "waterdrop" or "drop of oil". For example, for 'waterdrop', we can use "bocivo boxami".
In other words, we will take advantage of the count/mass distinctions provided by our classificational system only when it is useful.
Note also that, even though "boxami" is classed as a natural 'thing', it still can be applied to non-natural substances. I doubt if any natural language makes a distinction between 'natural substance drop' and 'artificial substance drop', and requiring that kind of distinction in the interlingua will only increase the difficulty of both learning and translation.
Now, the classificational system provides a mass/count distinction, but it does not provide a group concept. In the interlingua, we will use the modifier morpheme "ku" for this purpose. Also, when creating a group sense from a basic noun concept, "ku" should always be applied to the count derivation, not the mass derivation, so that its class is correctly provided by the classifier.
[Incidentally, the basis for "ku" is the classifier "kum", which is the classifier for numbers, and "ku" is also used in numeric derivations with the sense of 'numeric group'. We'll have more to say about numeric derivations later.]
Here are some examples of group nouns:
    kutigi = herd or flock (of mammals such as horses and sheep, but not of birds; "-tig" = 'grazing mammal' classifier)
    kudami = flock (of birds; "-dam" = 'bird' classifier)
    kukagi = swarm (of insects; "-kag" = 'insect' classifier)
    kutimi = fleet ("-tim" = 'vehicle' classifier)
    kubomi = school (of fish; "-bom" = 'fish' classifier)
Note that, unlike English "flock", "kudami" can only be used with birds. For example, we cannot use "kudami" in 'flock of sheep'. Instead, we must use "kutigi".
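The group derivation, and the way it splits English "flock", can be sketched in a few lines of Python. The sketch covers only the five group nouns listed above; the function names and the English-side labels are hypothetical.

```python
GROUP_MODIFIER = "ku"

# Classifiers quoted in the group noun examples.
MEMBER_CLASSIFIERS = {
    "grazing mammal": "tig",
    "bird": "dam",
    "insect": "kag",
    "vehicle": "tim",
    "fish": "bom",
}


def group_noun(member_class):
    """Build the group noun: "ku" + classifier + noun ending "i"."""
    return GROUP_MODIFIER + MEMBER_CLASSIFIERS[member_class] + "i"


def translate_flock(member_class):
    """Pick the interlingua word for English "flock of X".

    English uses "flock" for both birds and sheep; the interlingua word
    depends on the class of the members.
    """
    if member_class not in ("bird", "grazing mammal"):
        raise ValueError("'flock' is not used for " + member_class)
    return group_noun(member_class)


flock_of_sheep = translate_flock("grazing mammal")
flock_of_geese = translate_flock("bird")
```

The point of the second function is that the choice between "kutigi" and "kudami" is made from the class of the members, not from the English word.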
Group derivations are not semantically precise, even though they may appear so in the above examples. For example, the group noun "atlas" will be formed from "ku" plus the word meaning 'map', even though an atlas can contain much more than just maps.
We'll have more to say about the count/mass distinction in the chapter on Counts and Measures.
As we mentioned earlier, we can create non-specific words that can represent any or all members of a class using the modifying morpheme "bye-" plus an appropriate noun classifier. Here are a few examples:
matter & energy:
    living, species
        animal             -> byefasi 'animal'
        mammal             -> byekupi 'mammal'
        bird               -> byedami 'bird'
        reptile            -> byefegi 'reptile'
        fish               -> byebomi 'fish'
        plant              -> byebosi 'plant'
    living, illnesses      -> byecimi 'illness'
    non-living, natural    -> byefepi 'phenomenon'
    non-living, artificial -> byebisi 'device', 'mechanism', 'appliance', 'apparatus'
matter:
    natural,    substance  -> byejavi 'stuff', 'substance' (natural)
                locative   -> byetisi '(natural) location/place/spot'
                other      -> byetevi '(natural) thing'
    artificial, substance  -> byecapi 'stuff', 'material' (artificial)
                locative   -> byezegi 'site', (artificial or man-made location/place/spot)
                other      -> byedagi '(artificial) thing', 'item', 'object'
energy:
    non-living             -> byexogi 'energy'
time:
                           -> byefemi 'time (period)', 'duration'
                              byedusi 'moment', 'instant', 'point in time'
abstract nouns:
                           -> byetovi 'unit (of measure)'
                           -> byejagi 'group (of people)'
We will adopt the convention of using the "natural" derivations for general use, unless the nature of the substance/location/item is known or must be emphasized. For example, we will use "byetisi" rather than "byezegi" for the generic term meaning 'location', 'place', 'spot', etc.
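This convention amounts to a default argument. The following Python sketch uses only the six 'matter' words from the list above; the function and table names are hypothetical.

```python
# Generic 'matter' terms, keyed by (sub-class, origin).
GENERIC_MATTER_TERMS = {
    ("substance", "natural"): "byejavi",
    ("substance", "artificial"): "byecapi",
    ("locative", "natural"): "byetisi",
    ("locative", "artificial"): "byezegi",
    ("other", "natural"): "byetevi",
    ("other", "artificial"): "byedagi",
}


def generic_term(sub_class, origin=None):
    """Return the generic word for a substance, location, or thing.

    When the origin is unknown or need not be emphasized, the "natural"
    derivation is used.
    """
    return GENERIC_MATTER_TERMS[(sub_class, origin or "natural")]


place = generic_term("locative")               # general-purpose 'place'
site = generic_term("locative", "artificial")  # explicitly man-made
```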
So far, we've discussed the major case roles of agent, patient, agent-patient, and focus, and mentioned in passing a few oblique roles, such as instrument and manner. We also spent a considerable amount of time showing how to convert verbs to case tags and adverbs. In this section, I'd like to discuss how to create oblique case tags for any case role, especially the more traditional ones that, in English, are called "prepositions".
A sentence consists of a main verb and its arguments, and each argument has a case relation associated with it. For example, a sentence like:
On Tuesday, John moved the crate to the storeroom with a forklift.
can be analyzed as follows:
    move:
        agent       -> John
        patient     -> the crate
        destination -> the storeroom
        instrument  -> a forklift
        time        -> Tuesday
In effect, the prepositions "on", "to" and "with" are labels; i.e., they name or 'tag' the roles played by their arguments. In English, the core roles of agent and patient are not explicitly labeled, but are indicated by the meaning of the verb and the relative positions of subject and object.
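The analysis of the example sentence can be written as a simple data structure in which each argument is stored under its case role. The Python sketch below is illustrative only. The preposition-to-role table is valid for this one sentence, since English prepositions are generally ambiguous ("with", for instance, can also mark accompaniment), and all names are hypothetical.

```python
# Role labels for the prepositions in this one example sentence only.
PREPOSITION_ROLES = {"on": "time", "to": "destination", "with": "instrument"}


def build_case_frame(verb, subject, direct_object, prepositional_phrases):
    """Return a dict mapping case roles to arguments.

    The core roles are not labeled in English, so they are taken from
    the subject and object positions; the oblique roles are taken from
    the prepositions.
    """
    frame = {"verb": verb, "agent": subject, "patient": direct_object}
    for preposition, noun_phrase in prepositional_phrases:
        role = PREPOSITION_ROLES[preposition.lower()]
        frame[role] = noun_phrase
    return frame


# "On Tuesday, John moved the crate to the storeroom with a forklift."
frame = build_case_frame(
    "move",
    "John",
    "the crate",
    [("On", "Tuesday"), ("to", "the storeroom"), ("with", "a forklift")],
)
```

In the interlingua, each oblique role would carry its own explicit case tag, so no table of ambiguous prepositions would be needed on the output side.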
Also, keep in mind that an oblique case role modifies the entire event described by the verb, and that the implied subject of the case tag is usually also an argument of the verb. Adverbs can be derived from intransitive verbs and also link to an argument of the main verb, but do not themselves have arguments.
Deriving case tags from verbs can lead to problems, however, since the case role of a generic object can be the same for different verbs derived from the same root. For example, the AP/F version of a verb differs from its P/F counterpart ONLY IN THE SEMANTICS OF THE SUBJECT - NOT IN THE SEMANTICS OF THE OBJECT. A case tag, however, must capture the semantics of the object. Consider the following examples:
"zendapa" = P/F-s verb = 'to be with', 'to accompany'
The potato chips are with the sandwiches in the basket.
    'to be with/to accompany' = P/F-s verb "zendapa"
The boy accompanied his father to the barber shop.
    'to accompany' = P/F-s "zendapa" or AP/F-s "zendapimba"
Now, if we want to create one possible equivalent of the English preposition "with" from the verb "zendapa" (as in "The boy went to the barber shop with his father"), which version of the verb should we use? The case role of the object is the same for both verbs. They differ only in the case role of the subject - one implies agency while the other does not.
Since we have no way of telling whether the boy willingly accompanied his dad or whether he was dragged along, we must use the most general form, P/F. Also, the only real function of the case tag is to indicate the state of the patient while linking it to the main verb. How it got to that state is not important. Thus, only the P/F form is needed.
This does not mean that a speaker should never use the AP/F version of "zendapa". For example, we can use it if we want to emphasize that the implied subject was a willing participant.
[Incidentally, this implementation of the English preposition "with" is just one possible solution. We'll discuss a more generally useful implementation later.]
The static/dynamic distinction is also important. Although it does not appear in the English preposition "with", it does appear in other prepositions; e.g., "They jogged IN the park" versus "They jogged INTO the park". Since the distinction is semantically valid, the interlingua should make it.
Finally, there are many locative concepts that are typically used as case tags. As we'll see later, these concepts will have their own classifiers, and will be able to undergo the full range of derivation possible in the interlingua. For example, the case tag meaning "in/inside" can be changed to the AP/F-d verb meaning 'to enter', the static P/F-s verb meaning 'to be inside', the dynamic P/F-d case tag meaning 'into', the noun meaning 'inside/inner part', and so on.
There will be times when a case tag is needed that is as semantically imprecise as its counterpart in a natural language. For example, during the analysis phase of machine translation from a natural language to the interlingua, it may not be possible to determine the more precise version of the case tag.
In the interlingua, we will accomplish this by allocating three special suffixes. They will effectively remove any link with the subject of the main clause, and will also eliminate the static/dynamic distinction. Here are the three suffixes that we will use:
0/P "-um" - case tag
    The argument which follows this case tag is an argument of the verb and is somehow affected or potentially affected by the event. There is no indication of whether the argument is affected statically or dynamically.
0/F "-eg" - case tag
    The argument which follows this case tag is an argument of the verb and is somehow a focus of the event. There is no indication of whether the result is static or dynamic.
"0" "-og" - adverb
    This suffix creates an adverb which modifies the verb and which is not explicitly linked to any of the other arguments of the verb. There is no indication of whether the result is static or dynamic.
Note that distinct suffixes must be created, since all classifiers and other suffixes clearly indicate whether they are static or dynamic. Thus, these new suffixes are intentionally vague.
The last form (i.e. "0") should be used to create equivalents of many English adverbs that end in "-ly". For most of these adverbs, it is often unclear which core argument is being linked to, if any. Consider the following example:
John quickly opened the door.
Does this imply that John acted quickly, that the door experienced quick movement, or that the entire event took very little time? If we use the P-s adverbial form, there is a strong implication of a link to either "John" or "the door" or both. However, the "0" form does not have this implication. Thus, a P-s form has implications that a "0" does not have, making the "0" form more general. And since the "0" form is more general, it will generally be more useful.
Later, we'll see an example of how to use the 0/P suffix when we discuss the beneficiary case tag.
Thus, by using specific forms (AP-s, P-s, etc), we can modify the verb while indicating a link to an argument of the verb (typically the subject). This is often necessary when we create one-time or nonce adverbs and case tags as we did earlier with the root "kop" (meaning 'to know'). By using the "0" and 0/F forms, however, we directly modify the verb. Another way of putting it is that the specific forms modify an argument while the non-linking forms modify an entire event.
So far, we have limited ourselves to using the descriptive terms "core" and "oblique" when referring to case roles. We also mentioned in passing the terms "primary" and "secondary" when we discussed exchange verbs. At this point, I would like to review what these terms mean, since a good understanding of the distinctions between them will be useful in the upcoming discussions.
When referring to case roles, the term "core" refers to roles that are part of the valency of a verb and that have not been demoted. Thus, they always refer to the four major roles: agent, agent-patient, patient, and focus.
An "oblique" case role is not part of the valency of the verb, and must be marked in some way to indicate its function. In English, oblique case roles are introduced by prepositions. In the interlingua, they are introduced by case tags.
A core argument can be made oblique by means of a passive or anti-passive grammatical voice change.
A "primary" case role is a role that occurs naturally in the valency of an unchanged verb. Thus, a primary case role must always be either an agent, agent-patient, patient, or focus. Furthermore, an argument remains primary even if it is made oblique by means of a grammatical voice change. For example, the agent of the verb "kill" is a primary case role, whether it appears as the subject of the verb, or as the argument of the preposition "by" in a passive voice operation.
A "secondary" case role is a role that occurs naturally as an oblique argument of an unchanged verb. Thus, a secondary case role can never be the primary agent, agent-patient, patient, or focus of the verb.
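The two distinctions are independent of each other, which a short Python sketch may make easier to see. The class `Argument` and the helper functions are hypothetical names introduced only for this illustration; the definitions they encode are the ones given above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Argument:
    role: str              # e.g. "agent", "patient", "instrument"
    in_valency: bool       # occurs naturally in the valency of the unchanged verb
    demoted: bool = False  # made oblique by a passive or anti-passive voice change


def is_primary(arg):
    """Primary roles belong to the valency of the unchanged verb, demoted or not."""
    return arg.in_valency


def is_core(arg):
    """Core roles are part of the valency and have not been demoted."""
    return arg.in_valency and not arg.demoted


def describe(arg):
    """Return the (primary/secondary, core/oblique) labels for an argument."""
    return (
        "primary" if is_primary(arg) else "secondary",
        "core" if is_core(arg) else "oblique",
    )


# The agent of "kill" as subject, and again after a passive voice change.
active_agent = Argument("agent", in_valency=True)
passive_agent = Argument("agent", in_valency=True, demoted=True)
# An instrument is never part of the valency in this sketch.
instrument = Argument("instrument", in_valency=False)
```

Here `describe(passive_agent)` yields `("primary", "oblique")`: the agent introduced by "by" is still a primary role even though it is no longer core.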
Starting with the next section of this monograph, I will discuss in detail how to derive many of the traditional case tags that appear in natural languages. However, before doing so, I would like to briefly digress and comment on the philosophy of case tag design.
The derivational approach that we are using here is especially advantageous because there is no need to design a linguistically complete and correct case system. Linguists have yet to agree on such a system, and I sincerely doubt that it's even possible. However, using the approach described here, any verb can be converted to a case tag, as long as the result makes sense and performs the desired function.
[Incidentally, it is also possible that the system I am presenting here has real theoretical validity. In other words, it's possible that there really are only four basic case roles, and that all of the other case relations are derivable from them. However, I am not making this claim for the simple reason that I don't know if it is true, although I suspect it is.]
Let's start by creating oblique case tags for the four primary case roles. These can be used to specify oblique A, AP, P, and F arguments in passive and anti-passive voice-changing operations.
As we discussed earlier, passive constructions remove an argument from the argument structure of a verb and make it optional. To specify the optional argument, we could use case tags derived specifically for agent, patient, agent-patient, and focus. However, most (and perhaps all) natural languages are not so semantically precise. For example, ANY passive construction in English allows the original subject to be specified obliquely using the preposition "by", regardless of the actual case role:
The window was broken by the neighbors' son.
    - where "by" introduces an agent.
The poem was memorized by all the children.
    - where "by" introduces an agent-patient.
The thief was heard by an off-duty policeman.
    - where "by" introduces a patient.
Now, in the interlingua, we will allocate the special, true generic root "tom". This root will have no semantics of its own, but will instead take its semantics (including its class and part-of-speech) from the suffix that immediately follows it. When followed by one of the passive suffixes, it will be a 0/F case tag by default. Thus, the two passive case tags are:
passive: tomese
    For oblique expression of original subject. The English equivalent is "by".
anti-passive: tomose
    For oblique expression of original first object. English does not have a formal anti-passive, so there is no standard English equivalent, although "of" is probably most common; cf. "John is the breaker OF the window".
where "-es" and "-os" are the suffixes we defined earlier to perform the two passive voice changes.
Note that the voice suffixes do not normally change the part-of-speech of a word. However, when "tom" is the root, the result will depend on the suffix. In the case of the passive suffixes, the resulting word will be a case tag. As always, this can be changed by adding an appropriate part-of-speech suffix.
Now, if we really need to express the roles of agent, patient, etc. precisely, we can start with generic versions of the A/P-s, P/F-s and AP/F-s verbs and invert them if necessary. When converted to case tags, these verbs will take on the 'label' meanings of 'semantic agent', 'semantic patient', etc. However, these derivations are too precise, since they clearly state whether they are static or dynamic. Thus, we now face the same problem we discussed earlier when we had to deal with case tags and adverbs that were too precise.
We solved that problem by allocating three suffixes for the "0/P", "0/F" and "0" non-linking argument structures. We now need to complete the set with two additional suffixes:
0/A "-am" - case tag
    The argument which follows this case tag is an argument of the verb and is somehow responsible for the event. There is no indication of who the patient is. There is also no indication of whether the argument is affected statically or dynamically.
0/AP "-im" - case tag
    The argument which follows this case tag is an argument of the verb and is somehow both responsible for the event and affected by the event. There is no indication of whether the argument is affected statically or dynamically.
However, in order to form the needed case tags, we must attach the suffixes to a root because the morphology of the interlingua does not allow words without roots. We will solve this problem by using the true generic root "tom" which we introduced earlier. Thus, the four non-linking case tags are:
Agent         -> tomame
Agent-patient -> tomime
Patient       -> tomume
Focus         -> tomege
As case tags, we could paraphrase them as "an agent being", "a patient being", etc.
It is important to emphasize that these case roles do not represent the same roles as the corresponding core arguments. They are secondary case roles. In the discussions that follow, we'll see how this distinction can be very useful.
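The forms listed in this section all appear to be spelled as root + suffix + a final "-e". The Python sketch below assumes that spelling rule, which is inferred from the examples rather than quoted from a formal statement; the names `NON_LINKING`, `VOICE`, and `case_tag` are hypothetical.

```python
# Non-linking suffixes defined in the text, keyed by argument structure.
NON_LINKING = {"0/A": "am", "0/AP": "im", "0/P": "um", "0/F": "eg"}

# Voice suffixes used with the generic root "tom" to form oblique case tags.
VOICE = {"passive": "es", "anti-passive": "os"}


def case_tag(root, suffix):
    """Spell a case tag as root + suffix + "e" (assumed pattern, see lead-in)."""
    return root + suffix + "e"


# The four non-linking case tags built on the generic root "tom".
generic_tags = {
    structure: case_tag("tom", suffix)
    for structure, suffix in NON_LINKING.items()
}

# The two passive case tags built on the same root.
passive_tags = {voice: case_tag("tom", suffix) for voice, suffix in VOICE.items()}
```

Under that assumption, `generic_tags` reproduces the four tags in the list above, and `passive_tags` reproduces "tomese" and "tomose".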
By now, I assume that the semantics of case roles is reasonably clear, and that creating case tags for any role should not be too difficult. A little practice, however, never hurts. So, in this section, I will describe how to create case tags for some of the most common, traditional case roles.
In most of the following derivations, I will paraphrase the function of the case role with a standard template that will allow us to clearly and consistently capture the semantics of the case role. The template will have the form: "In the event in which X occurred, sub-event Y occurred". Here are some examples:
He broke the window with a hammer.
    = In the event in which he broke the window, he used a hammer.
He ran into the house.
    = In the event in which he ran, he 'became in' the house.
He drove the car like a madman.
    = In the event in which he drove the car, he acted/behaved like a madman.
I bought the car after we got married.
    = In the event in which I bought the car, the 'time locus' was after we got married.
And so on. By using a standard template, we can avoid ad hoc solutions that will just have to be redone later.
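As a minimal sketch, the template can be written as a single Python function so that every paraphrase is produced the same way. The function name `paraphrase` and its parameter names are hypothetical; splitting a sentence into its main event and sub-event is still done by hand here.

```python
def paraphrase(main_event, sub_event):
    """Fill the standard template: 'In the event in which X, Y.'"""
    return f"In the event in which {main_event}, {sub_event}."


examples = [
    paraphrase("he broke the window", "he used a hammer"),
    paraphrase("he ran", "he 'became in' the house"),
    paraphrase("he drove the car", "he acted/behaved like a madman"),
]
```

Each entry in `examples` matches the corresponding paraphrase given above.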
The means case role elaborates how the agent accomplished the event. In English, this case role is normally marked by the prepositions "with", "via", "by", or "by means of", depending on the nature of the argument. Here are some examples:
He cooled the stew BY blowing on it.
We solved the problem BY asking for help.
He knocked the chair over BY kicking it.
She explained BY MEANS OF a story.
They isolated the virus VIA a new technique.
He broke the window WITH a hammer.
He called me ON his new cellular phone.
I learned a lot about dolphins FROM an encyclopedia.
Note in the last five examples that "by means of", "via", and "with" can all be interchanged freely, while "on" and "from" are more limited. Thus, English uses "by" for this case role when the argument is a clause. Otherwise, it uses "via", "by means of", "with", "on", or "from".
As we will see later, the focus of an action verb elaborates the action by providing more detail about it. In effect, it provides information about HOW the deed was done. In other words, it indicates that the agent successfully affects the patient BY MEANS OF the focus. The focus of these verbs is therefore the means case role, since it elaborates what the agent actually did.
However, which form of the verb should we use? The A/F-s [-P] form, the A/F-d [-P] form, the AP/F-s form, or the AP/F-d form? In order to achieve the desired generality, we will have to use the 0/F suffix "-eg". This time, however, we must use it with the general action root "bus". Thus, the final result is the generic 0/F action case tag, "busege".
The instrument/means/method case role is often used with resultative semantics. Consider the following:
(1) John painted the door green.
(2) John "greened" the door by painting (it).
In (1), "green" is represented by a P-d adverb. In (2), "greened" is an A/P-d verb and is followed by the means/method case role. The sentence actually used will depend on whether the speaker wishes to emphasize the "painting" or the "greening".
We can also create another case tag with a similar meaning from a verb that we discussed earlier. Here is the verb again:
A/P-s: busasa - 'to control/manage/run'
    e.g. John RAN/MANAGED the company for three years.
In this case, the case tag will be "busase". However, "busase" can only be used when its argument is a physical entity. It can never be used with a clausal argument. Also, "busase" is less general than "busege" because it has a strong implication of control and thus adds emphasis to what's happening to the tool being used, which is probably not desirable in most cases.
A beneficiary is the entity which may be indirectly affected by an event. Here are some examples:
John washed the dishes FOR his wife.
They took up a collection FOR John's widow.
Bill bought some flowers FOR his girlfriend.
She built the doghouse FOR the new puppy.
He cooked supper FOR the children.
A possible paraphrase for the English preposition "for" in the above examples would be something like 'on behalf of', 'for the sake of', or 'in the interest of'. A more comprehensive and accurate paraphrase would be 'to (possibly) have an unspecified or generic positive effect on'. The concept of 'possibly' makes this last paraphrase more accurate because there is no indication that the beneficiary actually experiences a change of state - only that it may occur.
The label "beneficiary", however, is something of a misnomer, as can be seen in the following examples:
He set the trap FOR the raccoon.
I bought the itching powder FOR my roommate.
The first example is sometimes called a maleficiary, since the intended effect is clearly detrimental. The second example is ambiguous, since it is not clear whether the itching powder was purchased to be used BY the roommate or ON the roommate. Thus, a more appropriate name for this case role is secondary patient, since it is not always obvious if the intended effect is good or bad.
Here is an example of a standard paraphrase of this case role:
He cooked supper for his wife.
    = In the event in which he cooked supper, he (possibly) had an unspecified effect on his wife.
Thus, the semantics of the secondary patient are simple: the agent of the main verb is responsible for the main event which may have an unspecified effect on a secondary patient. The context determines whether the intended effect is positive or negative, and the outcome is uncertain. Thus, the semantics of this case role are perfectly matched by the 0/P case tag "tomume", which we derived earlier.
Now, consider the following:
The play was boring TO/FOR me but not TO/FOR Bill.
The trip was wonderful FOR all of us.
The box was too heavy FOR Dennis.
In the above examples, the secondary patient was clearly affected. In addition, the agent of the resulting state was not just the subject of the main verb - it was the entire clause to the left of "to/for". For example, it wasn't just the box that affected Dennis, it was the fact that the box was heavy that affected him (i.e., that prevented him from lifting it). Thus, if we want to more precisely indicate that the secondary patient was actually affected, we must use the more precise A/P-d generic derivation "tomape".
The comitative case role introduces additional participants in an event which are equal in function and importance to the subject of the verb. The English prepositions "with" or "along with" are normally used to mark the comitative case role. Here are some examples:
He weeded the garden WITH his wife.
They went to Boston WITH the children.
She died in a plane crash ALONG WITH three other passengers.
I ate supper WITH my family.
[Do not confuse this usage with the instrumental sense of the word "with", as in "I ate supper WITH a fork", or with the manner sense as in "I washed the crystal WITH care". Natural languages have a bad habit of overloading their case tags. To add to the confusion, they rarely do it in the same way.]
This case role is an unusual one, because it is actually an alternative to coordination, which is normally handled syntactically. Thus, the first example can be paraphrased as "He and his wife weeded the garden".
Some readers may argue, however, that use of the case tag "with" implies a certain degree of subordination which is not implied as strongly when using coordination. This apparent subordination, however, is a pragmatic effect - not a semantic one - and is implied by the context. In different contexts, the subordinating effect can be reversed:
Billy went to the movies with his parents.
    (Billy accompanied his parents, and he was somehow subordinate or less in-control than his parents.)
The Simpsons traveled to Boston with the children.
    (The children accompanied their parents and were somehow subordinate or less in-control than their parents.)
Thus, the implication of subordination, if any, can work both ways. Note, though, that the comitative argument is certainly less topical than the subject, which is to be expected since it is oblique.
In other words, the comitative case tag introduces an argument that performs exactly the same semantic role as the subject. The only difference is that the argument is reduced in topicality compared to the subject. Consider the following:
1. Dad went to a movie WITH the kids.
2. The kids went to a movie WITH dad.
In (1), "dad" is more topical than "the kids", while in (2) "the kids" is more topical than "dad". In both sentences, though, "dad" and "the kids" play exactly the same semantic role.
Now, if "Dad" and "the kids" were equally topical, we would instead say something like this:
3. Dad AND the kids went to a movie.
In other words, a coordinating conjunction does not imply any significant difference in topicality.
However, we can reduce the topicality of a part of the subject by using the comitative.
This should certainly sound familiar, since reducing the topicality of an argument is exactly what a grammatical voice change does. The only difference is that the comitative reduces the topicality of only a part of the subject. In spite of this, it is still a grammatical voice change.
Now that we've discussed the semantics of the comitative case role, let's see how we can implement it. At first sight, it seems that we have several options:
Option 1: We can create A/P, AP/F, and P/F verbs with the general meaning 'to do with/be with/accompany' and derive case tags from them as we did when we discussed the verb "zendapa".
The problem with this approach is that it is far too precise, since these case tags imply strong links to an argument of the main verb, and they precisely state whether they are static or dynamic. Thus, using this approach, we would have to create several "-s" and "-d" versions, even though natural language case markers are rarely, if ever, so precise.
Option 2: We can create the 0/F "zendapege". This case tag has the same range of role coverage as the comitative function of most natural languages, including English. However, it fails to capture the semantics correctly. Consider the following:
He weeded the garden with his wife.
    = He weeded the garden 'being with' his wife.
The 0/F case tag simply states that his wife was present - it does not indicate that she also did some of the weeding. Note that this objection also applies to option 1.
Option 3: We can insist that coordination be used instead of a case tag. Thus, the language would not allow a sentence like "He weeded the garden with his wife". Instead, it would have to be stated as "He and his wife weeded the garden". We could also create a conjunction that intentionally implies a certain degree of subordination, as in "He and-to-a-lesser-degree his wife weeded the garden". However, a conjunction does not reduce the topicality of an argument, and I know of no natural language that does this.
Option 4: We could use the secondary agent, agent-patient, patient, and focus case tags which we derived earlier. However, this solution is not correct because these are secondary case roles, and the roles they indicate may not be the same as the primary case roles. For example, as we saw when we discussed the beneficiary case role, a secondary patient may be somehow affected by the event, but not in the same way as the primary patient.
Option 5: Thus, what we really need is a primary case role; i.e., one that indicates the same role as the subject of the verb. Consider the following sentence:
She died in the plane crash with three other passengers.
Here, the comitative entities "three other passengers" experienced exactly the same fate as the subject. Compare this with the beneficiary case role discussed earlier, where the secondary patient does not experience the same effect as the primary patient.
Since the comitative is actually a grammatical voice change, the only correct way to implement it is by creating a new suffix that demotes a part of the subject and makes it oblique. We will call this voice change the 'co-subject' voice:
co-subject    -ev    demotes part of the subject and makes it obliquely expressible
Thus, the comitative case tag will be "tomeve". [Incidentally, the anti-middle form "tomevome" means 'along', as in "We went along for the fun".]
We will also need a voice-changing morpheme to indicate that an entity is specifically being excluded as a possible subject. The corresponding case tag will have the meaning 'without':
non-subject    -ov    an entity is specifically excluded from being subject
Thus, the case tag meaning 'without' or 'except (for)' is "tomove".
Now, when we apply a passive voice change to a verb, the appropriate passive suffix must appear on the verb even if the demoted argument is expressed obliquely. For example:
The door doykavapesa = The door was opened.
The door doykavapesa tomese John = The door was opened by John.
For the co-subject and non-subject case tags, however, the suffix must not be used on the verb if it is also expressed obliquely because the argument structure of the verb has not changed. (When a passive suffix is added to a verb, one of its arguments is always demoted. When "-ev" or "-ov" are used obliquely, the subject has not been demoted, and (as we will discuss below) when they are applied directly to the verb, an implied argument must be automatically added by the translator.)
Also, since the verb is not marked, it's not clear which verb (actual or implied) is relevant. This allows us to use "tomeve" or "tomove" even if its argument is not linked to the actual subject of the verb, as in "Mother sent the children away without(=tomove) candy". In this example, "tomove" links "candy" to "children", not to the subject "mother". In other words, "the children" were without "candy", not necessarily "mother".
Now, marking the verb with "-ev" or "-ov" will be useful when the demoted entity is not being expressed obliquely. When using "-ev", the marked verb would imply that it has a co-subject even though the co-subject is not being expressed. In effect, if "-ev" is suffixed to the verb, it indicates that the subject was not by himself or on his own; i.e., there is a co-subject but it is not being specified. On the other hand, if "-ov" is used this way, it implies that the subject was by himself or on his own; i.e., it emphasizes the fact that there is no co-subject. Here are some examples using the verb "doykavapa" meaning 'to open':
John doykavapa the door.
    = John opened the door.
John doykavapeva the door.
    = John opened the door with one or more unspecified others.
    = John did NOT open the door alone/by himself/on his own.
John doykavapova the door.
    = John opened the door without one or more unspecified others.
    = John opened the door alone/by himself/on his own.
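The marking rules for the four voice suffixes can be summarized in a short Python sketch. It assumes, from the examples above, that a voice suffix is inserted just before the verb's final "-a"; the function names `mark_voice` and `verb_form` are hypothetical.

```python
VOICE_SUFFIXES = {
    "passive": "es",
    "anti-passive": "os",
    "co-subject": "ev",
    "non-subject": "ov",
}


def mark_voice(verb, voice):
    """Insert the voice suffix before the final "-a" of the verb (assumed pattern)."""
    if not verb.endswith("a"):
        raise ValueError(f"expected a verb ending in 'a', got {verb!r}")
    return verb[:-1] + VOICE_SUFFIXES[voice] + "a"


def verb_form(verb, voice, expressed_obliquely):
    """Apply the marking rule described in the text.

    Passive and anti-passive suffixes always appear on the verb.
    Co-subject and non-subject suffixes appear on the verb only when the
    affected entity is NOT also expressed obliquely with a case tag.
    """
    if voice in ("co-subject", "non-subject") and expressed_obliquely:
        return verb
    return mark_voice(verb, voice)
```

For example, `verb_form("doykavapa", "passive", True)` keeps the suffix and gives "doykavapesa", while `verb_form("doykavapa", "co-subject", True)` leaves the verb as "doykavapa" because the co-subject would be introduced by "tomeve" instead.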
Finally, some languages also have comitative case tags that link to the object of the verb rather than to the subject (the only language I specifically know of that can do this is Mayali, Gunwinyguan family, Australia). English can do this occasionally, but only when the semantics and context make it impossible to interpret a link with the subject, as in "Bob sent her some flowers yesterday with a get-well card". Since this usage is quite rare, we will not create a unique suffix for it. (In fact, in my dialect of English, this usage sounds distinctly "odd", and can be just as easily implemented as "flowers AND a get-well card".)
Most languages, including English, have several verbs that are inherently locative in nature, such as "to enclose", "to enter", "to arrive", "to exit", "to put", "to lower", etc. All of these words can be derived from roots that will also be useful in the derivation of locative case tags and many other useful verbs, adverbs, and adjectives. For example, "to raise" is the A/P-d verb formed from the root meaning 'up (unfocused)' or 'above (focused)'; i.e., 'agent causes patient to become above an unspecified focus'. This root concept of 'up/above' can also be used to create the words meaning 'to rise', 'above', 'up', 'upwards', and so on.
To illustrate this process, let's start with the basic state concept meaning 'located at/in' and try to derive as many useful words as possible from it. For this illustration, we will use the root "zog". Here is an example of a standard paraphrase of the English preposition "at":
John studied law at Harvard.
    = In the event in which John studied law, he was at Harvard.
And here are some of the words we can create using this concept (roots made with "zog" will be P/F-s by default):
zoge = P/F-s case tag = 'at/in'
    e.g. John works zoge Boston = John works in Boston.
zoga = P/F-s verb = 'to be located at/in'
    e.g. John zoga the reservoir = John is at the reservoir.
zogangi = inverse F/P-s noun = 'location/position'
    e.g. Its zogangi is a secret = Its location is a secret.
zogose = P-s [+F] adverb = 'someplace', 'somewhere', 'at/in some unspecified place'
    e.g. He lost it zogose = He lost it somewhere.
zogumba = P/F-d verb = 'to become located at', 'to get to/into'
    e.g. How did the table zogumba the other room? = How did the table get into the other room?
zogumbe = P/F-d case tag = 'to', 'on the way to'
    e.g. He ran zogumbe the house = He ran to the house.
         He sang zogumbe work = He sang on the way to work.
zogumbose = P-d [+F] adverb = 'someplace', 'somewhere', '(in)to some unspecified place'
    e.g. They moved it zogumbose = They moved it somewhere.
It's important to emphasize that "zog" is very general and that the focus of "zog" must be an actual location. Also, many true locatives imply their immediate surroundings. Because of this, "John zoga the lake" will be translated as "John is AT the lake" simply because a "lake" is normally associated with its immediate surroundings. It is certainly possible that John is also IN or ON the lake, but these interpretations inherently exclude the surroundings. Thus, for true locatives, we must always use the English preposition that is the most general for the location, even if a more specific preposition is more prototypical. This means that, for English, we will always use "at" unless the result is awkward or ungrammatical. Here are some examples:
AT the cave/river/reservoir/dam/school/commune/swamp/house/shopping mall/island/planet/wharf/beach/farm
IN the forest/garden/city/desert/room/suburb
ON the patio/campus/road/continent/balcony/stage
Now, the above rule applies only to foci that are true locatives. For a focus that is not a true locative, "zog" will always mean that the patient and the focus are in the same general location. Here are some examples:
NEAR the refrigerator/chair/flagpole/car/tent/door
WITH the books/hooks/choir/scissors/salt/dog/boy
Note that "near" is used for all large, non-living items, while "with" is used for everything else.
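The choice of English preposition for "zog" described above can be sketched as a small Python function. This is only an illustration under stated assumptions: the feature flags `true_locative`, `large`, and `living` are hypothetical inputs that a real lexicon would have to supply, and the two exception sets contain just the nouns listed in the examples.

```python
# True locatives for which "at" is awkward, taken from the examples in the text.
IN_NOUNS = {"forest", "garden", "city", "desert", "room", "suburb"}
ON_NOUNS = {"patio", "campus", "road", "continent", "balcony", "stage"}


def english_for_zog(noun, true_locative, large=False, living=False):
    """Pick the English preposition used to translate the case tag "zoge"."""
    if true_locative:
        # Use the most general preposition, "at", unless the result is awkward.
        if noun in IN_NOUNS:
            return "in"
        if noun in ON_NOUNS:
            return "on"
        return "at"
    # Not a true locative: patient and focus share the same general location.
    if large and not living:
        return "near"
    return "with"
```

So `english_for_zog("lake", true_locative=True)` returns "at", while `english_for_zog("refrigerator", true_locative=False, large=True)` returns "near" and `english_for_zog("dog", true_locative=False, living=True)` returns "with".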
Now, if we had a way to negate the meaning of the root, we could create words with meanings such as 'to be not-at = to be away from', 'to become not-at = to get away from', and so on. To accomplish this, the interlingua has a corresponding antonymic locative classifier "-fag" which will be used to create contrasting pairs such as "at/away from", "above/below", "north of/south of", "to the right of/to the left of", and so on. The classifier "-zog" will be used for the more basic, positive, or highest magnitude sense, while "-fag" will be used for the non-basic, more negative, or lowest magnitude sense. With this new morpheme, we can now derive several more useful words:
fage = P/F-s case tag = 'not at/in/etc', 'away from'
    e.g. John attends school fage his home town = John attends school away from his home town.
faga = P/F-s verb = 'to be not located at/in/etc', 'to be away from'
    e.g. John faga Boston = John is away from Boston.
fagose = P-s [+F] adverb = 'away', '(at) elsewhere', '(at) somewhere else', '(at) someplace else'
    e.g. They found it fagose = They found it somewhere else.
fagumba = P/F-d verb = 'to become located away from', 'to get away from'
    e.g. The boat fagumba the wharf = The boat got away from the wharf.
fagumbe = P/F-d case tag = 'from', 'away from' (source location)
    e.g. I sent it fagumbe Boston = I sent it from Boston.
         He ran fagumbe the house = He ran (away) from the house.
fagumbose = P-d [+F] adverb = 'away', '(to) elsewhere', '(to) somewhere else', '(to) someplace else'
    e.g. They chased the dog fagumbose = They chased the dog away.
         He moved the papers fagumbose = He moved the papers somewhere else.
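The "zog" and "fag" lists above follow the same spelling pattern, so both can be generated from one table. The Python sketch below encodes only the endings that can be read off those two lists; the names `PATTERNS` and `derive` are hypothetical, and the noun "zogangi" is left out because only one example of it is given.

```python
# (part-of-speech label, ending attached to the root), read off the lists above.
PATTERNS = [
    ("P/F-s case tag", "e"),
    ("P/F-s verb", "a"),
    ("P-s [+F] adverb", "ose"),
    ("P/F-d verb", "umba"),
    ("P/F-d case tag", "umbe"),
    ("P-d [+F] adverb", "umbose"),
]


def derive(root):
    """Return the six P/F-based derivations of a locative root."""
    return {label: root + ending for label, ending in PATTERNS}


at_forms = derive("zog")    # 'at/in' family
away_forms = derive("fag")  # 'away from' family
```

With these assumptions, `at_forms["P/F-d case tag"]` is "zogumbe" ('to') and `away_forms["P/F-d case tag"]` is "fagumbe" ('from').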
The following useful words can also be derived from the roots "zog" and "fag", even though they are not needed to form case tags that represent English prepositions:
A/P/F-d:
    zogamba = 'to move to', 'to put/place at/in'
        e.g. We zogamba the barrels the backyard. = We moved the barrels to the backyard.
    fagamba = 'to move away from', 'to remove from'
        e.g. We fagamba the books the shelves. = We removed the books from the shelves.
A/P-d:
    zogapa = 'to move', 'to place', 'to position', 'to deposit'
        e.g. Who zogapa the desk? = Who moved the desk?
    fagapa = 'to remove', 'to move away'
        e.g. Joey fagapa the old TV yesterday. = Joey removed the old TV yesterday.
AP/F-d:
    zogimba = 'to arrive at/in', 'to reach'
        e.g. We zogimba Atlanta yesterday. = We arrived in Atlanta yesterday.
        inverse "zogimbangi" = 'destination', 'goal'
    fagimba = 'to leave/depart (transitive)', 'to go (away) from'
        e.g. She just fagimba the meeting. = She just left the meeting.
AP-d:
    zogipa = 'to arrive' (intransitive)
        e.g. They zogipa yesterday. = They arrived yesterday.
    fagipa = 'to leave', 'to go away', 'to depart', 'to go out/off', 'to head out', 'to take one's leave' (intransitive)
        e.g. We fagipa every day at noon. = We headed out every day at noon.
A/P/F-s:
    zoganza = 'to keep at/in'
        e.g. She zoganza the stallion the pasture. = She keeps the stallion in the pasture.
    faganza = 'to keep away from'
        e.g. I faganza the dogs the chicken coop. = I keep the dogs away from the chicken coop.
A/P-s:
    zogasa = 'to constrain', 'to keep in (place)', 'to limit/restrict movement of'
        e.g. We zogasa the larger dog. = We restrict the movement of the larger dog.
    fagasa = 'to keep away/out', 'to hold at bay'
        e.g. He fagasa the mosquitos with a net. = He keeps the mosquitos away with a net.
AP/F-s:
    zoginza = 'to attend', 'to stay/remain at/in'
        e.g. I zoginza the conference for three days. = I'm attending the conference for three days.
    faginza = 'to stay or remain away from', 'to avoid'
        e.g. Bill faginza school = Bill is staying away from school.
AP-s:
    zogisa = 'to stay put', 'to stay/remain in place', 'to stay behind', 'to abide'
        e.g. I told the children to zogisa. = I told the children to stay put.
    fagisa = 'to stay away'
        e.g. He fagisa because of you. = He's staying away because of you.
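The regularity of the derivations above can be illustrated with a small Python sketch. It is only a rough illustration, not part of the interlingua software. The segmentation of each word into root + argument-structure suffix + part-of-speech marker (e.g. "zog" + "amb" + "a") is inferred from the examples above, and the names ARG_SUFFIX, POS_MARKER, and derive are hypothetical.

```python
# Argument-structure suffixes, as segmented from the example words above.
# ASSUMPTION: this segmentation is inferred from the examples, not quoted
# from a suffix table. The empty suffix applies only to roots such as
# "zog" and "fag", whose default structure is P/F-s.
ARG_SUFFIX = {
    "P/F-s": "",
    "P/F-d": "umb",
    "A/P/F-d": "amb",
    "A/P-d": "ap",
    "AP/F-d": "imb",
    "AP-d": "ip",
    "A/P/F-s": "anz",
    "A/P-s": "as",
    "AP/F-s": "inz",
    "AP-s": "is",
}

# Final part-of-speech markers used in the examples above.
POS_MARKER = {"verb": "a", "case tag": "e"}


def derive(root, structure, pos):
    """Concatenate root, argument-structure suffix, and part-of-speech marker."""
    return root + ARG_SUFFIX[structure] + POS_MARKER[pos]


if __name__ == "__main__":
    # Print the full verb paradigm for each of the two locative roots.
    for root in ("zog", "fag"):
        print(root, [derive(root, s, "verb") for s in ARG_SUFFIX])
```

Because every word is a plain concatenation, the antonymic pair "zog"/"fag" yields two parallel paradigms from a single table of suffixes.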
Note that in many of the English versions of the focused verbs, the focus is oblique (e.g. "to stay AT"). Thus, if we want to precisely emulate English, we will need to create and use the more verbose [+F] versions. However, this is really not necessary because English makes little (if any) distinction between the topicality of objects and the topicality of obliques.
Finally, the above derivations are just examples using a single locative state concept. A language will need many other locative case tags. These tags will describe all of the possible states and relationships that are dealt with by English prepositions and adverbs, and will have such meanings as 'to be above', 'to be behind', 'to be inside', etc. In turn, these roots can be used to create many other useful words. For example, the root used to form the locative verb meaning 'to be inside' can also be used to create the oblique case tag 'inside of', the adverb 'inside', and other useful words such as English "to enter", "to insert", "inwards", "interior", "contain", and so on, along with all of their opposites.
Temporal case tags indicate the locus of an event in time. Consider the following examples:
    John bought the book IN March.
    We visited them WHEN we were in New York.
    They built the doghouse OVER the weekend.
    He lost weight SINCE the accident.
    She won't leave UNTIL she sees the boss.
    We met Janice DURING/ON our last visit.
    He plans to leave AT noon.
    I'll take a shower BEFORE I leave.
Note that some English temporal case tags (e.g. "in", "over", and "at") also have locative meanings, while others (e.g. "when", "since", and "until") have only temporal meanings. There are also locative case tags in English that are never used with temporal meanings (e.g. "along", "beneath", "against", "via").
One possible solution to the problem of creating temporal case tags would be to simply use locative case tags with temporal arguments. It is important to keep in mind, though, that different languages assign temporal meanings to locative case tags in different ways, if at all.
However, overloading locative case tags is semantically incorrect for the simple reason that locative and temporal case tags have very different meanings.
Thus, we will have to create verbs with meanings such as 'to happen at', 'to happen after', 'to happen during', etc. Note that we can also state these verbs in terms of a position on a timeline, such as 'to be at a time locus during/after/etc'. Here is an example using our standard form of paraphrasing:
John bought the car before he got married. = In the event in which John bought the car, the time locus was before he got married.
Now, let's do a few sample derivations using the concept of 'before'. For these derivations, we will use the root "cip", which, as we will see later, is actually a marker for past tense. It is P/F-s by default. Here are some of the more useful derivations:
cipa = P/F-s verb = 'to be (at a point in time) before', 'to precede', 'to happen/occur before'
    e.g. The accident cipa the election. = The accident preceded the election.
cipe = P/F-s case tag = 'before', 'by the time (of)', 'prior to'
    e.g. John got drunk cipe the party started. = John got drunk before the party started.
cipome = P-s [-F] adverb = 'earlier', 'previously'
cipomo = P-s [-F] adjective = 'earlier', 'previous', 'preceding', 'prior'
[Note that we cannot use "cipo" in place of "cipomo", since it would imply that the referent time is not known from context. Because of this, I doubt that "cipo" will be useful.]
We'll see many more derivations in the chapter on Tense and Aspect.
The above derivations are just examples using a single temporal state concept. A language will need several other temporal case tags. These tags will describe all of the possible states and relationships that are dealt with by English prepositions and adverbs, and will represent such concepts as 'time after', 'time at', 'duration', 'repetition', etc. However, since time is one-dimensional, we won't need nearly as many temporal case tags as locative ones. Later, when we discuss Tense and Aspect, we will see how the roots for all temporal case tags can be effectively "derived".
Many things happen as a result of earlier events, conditions, or situations. In English, these events are normally introduced by expressions such as "because", "because of", "in that", "as a consequence of", "(out) of", "from", "for", etc. Here are some examples:
    John left early BECAUSE he had a headache.
    They guarded it carefully BECAUSE OF its great value.
    The book provides a useful resource, IN THAT it lists every restaurant a tourist should avoid.
    He was not allowed to participate AS A CONSEQUENCE OF his past behavior.
    She died OF/FROM a broken heart.
    He got a stomach ache FROM eating too much.
    They agreed to the terms OUT OF fear of retaliation.
    I was angry at Bill FOR what he said to Mary.
Note that some English forms are used exclusively with embedded clauses (i.e. "because" and "in that"), while the others require noun phrase arguments (i.e. "because of", "as a consequence of", "(out) of", and "from").
Since this case role represents the most basic form of secondary causation, we can paraphrase it as follows:
John left early because of a headache. = In the event in which John left early, the secondary cause was a headache.
Since this is the most generic kind of secondary causation, the obvious solution is to use the true generic root "tom" plus the 0/A suffix "-am". Thus, the reason case tag is simply "tomame". In other words, the argument of "tomame" is the secondary agent of the event.
The manner case tag describes how something happens. It can be paraphrased as "in the manner of" or "in an X manner", and answers the question "how did such-and-such occur". English implements the manner case using prepositions and adverbs. Here are some examples:
He drove the car LIKE a madman. (preposition) He QUIETLY closed the door. (adverb)
We've already seen how to convert verbs to adverbs, some of which function like English manner adverbs. There are times, however, when manner cannot be indicated with a simple adverb, as in the following examples (manner case tags are capitalized):
The army raced through the town LIKE a destructive tidal wave. Their singing sounds LIKE wailing banshees. The preacher berated the congregation AS IF they were naughty children.
Most manner case roles are indicated in English with the preposition "like". However, even here, we have two distinct senses. Consider the following:
He drove the truck LIKE a tank. He drove the truck LIKE a madman.
In the first sentence, the word "like" describes the behavior of the patient. In the second sentence, it describes the behavior of the agent. Also, the first sentence itself has two distinct interpretations:
He drove the truck causing it to be like a tank. He drove the truck as if it were a tank.
The best way to capture these distinctions as accurately as possible is to create case tags from a root morpheme meaning 'like' or 'similar'. In the interlingua, we will assign the scalar relational root "zunxum" (default = P/F-s). The three verb derivations that are most useful here are as follows:
    P/F-s: zunxuma 'to be similar to', 'to resemble', 'to be like', 'to have the appearance of being'
    AP/F-s: zunxuminza 'to act similar to', 'to imitate'
    A/F-s [-P]: zunxumanzoma 'to cause something known from context to be similar to', 'to approximate'
When converted to case tags, these three verbs will provide the needed semantics for the above manner expressions: "like a tank", "like a madman", "like a destructive tidal wave", "like wailing banshees" and "as if they were naughty children". In the last item, "as if they were naughty children" would be expressed as "like naughty children" where "like" would be implemented using the P/F-s form "zunxume".
Now, while the above three derivations are semantically precise, some people may object to having to learn three case tags instead of just one. In a situation like this, it may be advisable to use the non-specific 0/F suffix "-eg", discussed earlier, plus the root meaning 'similar'. Using this approach, the single, all-purpose, manner case tag will be "zunxumege".
In the chapter on verb semantics, we discussed the need for the additional case roles of secondary agent-patient and secondary focus in sentences such as:
John sold the book to Bill for five dollars.
Here, "John" is the primary agent-patient, "the book" is the primary focus, "Bill" is the secondary agent-patient, and "five dollars" is the secondary focus.
These are secondary roles because they do not play the same roles as the corresponding roles for the main verb, but they DO take part in the change of possession.
Thus, both secondary roles can be derived in exactly the same way as the secondary patient (i.e. beneficiary) that we discussed earlier, by applying the appropriate non-linking suffix directly to the true generic root "tom". In other words, the secondary agent-patient (0/AP) meaning 'to/from' is "tomime", and the secondary focus (0/F) meaning 'for' is "tomege".
English uses two separate case tags to indicate the secondary agent-patient, depending on the direction of transfer. These are "to" and "from", and are often referred to as recipient and donor, respectively. However, there is really no need to implement two case tags, since the verb always indicates the direction of transfer. For example, in the following sentences, it's obvious who is the donor and who is the recipient:
John sold the book "to or from" Bill. Bill bought the book "to or from" John.
And for verbs like "swap", we use the more neutral preposition "with" for the secondary agent-patient:
John swapped his book for a magazine WITH Bill.
Thus, there is no need to implement two case tags for the "to/from" roles. Whenever there is a change of possession, the role indicated by the English prepositions "to", "from", and "with" is always specified by the verb, and using the case tag to indicate the direction of transfer is simply redundant.
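To make the redundancy argument concrete, here is a minimal Python sketch of how a generator for English might pick the preposition for the single case tag "tomime". The lexicon entries, the role labels, and the function name render_tomime are all hypothetical and chosen only for illustration. The point is that the direction of transfer is looked up from the verb, not from the case tag.

```python
# HYPOTHETICAL mini-lexicon: for each English verb, the role played by the
# secondary agent-patient (the argument of "tomime") in the transfer.
TRANSFER_ROLE = {
    "sell": "recipient",  # John sold the book TO Bill.
    "buy": "donor",       # Bill bought the book FROM John.
    "swap": "partner",    # John swapped his book for a magazine WITH Bill.
}

# English preposition associated with each role.
PREPOSITION = {"recipient": "to", "donor": "from", "partner": "with"}


def render_tomime(verb):
    """Return the English preposition for "tomime", given the main verb."""
    return PREPOSITION[TRANSFER_ROLE[verb]]


if __name__ == "__main__":
    for verb in ("sell", "buy", "swap"):
        print(verb, "->", render_tomime(verb))
```

In this sketch the interlingua side carries only one tag, and the to/from/with distinction is recovered on the English side from the verb's lexical entry.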
The secondary focus uses the same English preposition "for", regardless of the direction of transfer, as in:
He sold the bike FOR $50. He bought the bike FOR $50. He swapped the book FOR a magazine.
It can also be used when a patient exchanges locations:
    We departed Boston FOR New York.
Note that we cannot use "zogumbe" for "FOR" in the above example because "zogumbe" clearly indicates that the travelers actually arrived in New York.
However, we must not use "tomege" for English "from" in the following example:
We arrived in New York FROM Boston.
"Tomege" would be too vague for the above example because it does not clearly indicate that the travelers actually started out in Boston. Instead, "fagumbe", which we introduced earlier, should be used.
A secondary agent-patient can also be used when a non-physical transfer occurs. Here are some examples:
I found out about the party FROM Bill. He deduced the location FROM the clues that you provided.
While the second example is somewhat metaphorical, it's still a legitimate use of the secondary agent-patient case tag.
Many verbs allow an expression that provides more information about the final state of the patient. Here are some examples:
    He drilled the board FULL OF HOLES.
    He sliced the meat INTO SMALL PIECES.
    The coach turned back INTO A PUMPKIN.
    The crowd shouted itself HOARSE.
    The crowd shouted itself INTO A FRENZY.
Linguists call these constructions resultatives. It's also possible, though, to specify initial states:
    He changed FROM A SOFT-SPOKEN LIBERAL to a religious fanatic.
    He built the doghouse OUT OF SCRAP LUMBER.
    He worked the gold FROM AN INGOT into a flat sheet.
It's also possible (although rare) to specify a steady-state. Compare the following two sentences:
He kicked the door OPEN. (change-of-state) He held the door OPEN. (steady-state)
In English, most steady-states are handled with adverbs, as in the following examples:
    They GLADLY tagged along.
    He QUIETLY ignored his brother.
    She imitated her boss CONVINCINGLY.
    The lights are flashing RAPIDLY.
Colors, though, are typically used in their adjective forms for both steady-states and changes-of-state:
The lights glowed red and blue. (steady-state) He painted the door green. (change-of-state)
Thus, in English, initial states are introduced by the prepositions "from" or "out of". Final states which are represented by noun phrases use the prepositions "in" or "into". States which are represented by adjectives do not use any case role marker. Steady-states use either adjectives (rarely) or adverbs (frequently).
All of these situations can be dealt with quite easily in the current framework. For the manner case role, we introduced the verb "to be similar to" and its derivatives. For the state case role, we will need a verb meaning 'to be equal to', or 'to be the same as', or simply 'to be'. In the interlingua, we will use the root "dap" to represent this concept (default = P/F-s verb). We will also use its antonymic counterpart "fes" with the meaning 'differ from/be unequal to'. Thus, the case tags are:
P/F-d:
    dapumbe = English "to/into" (literally "becoming the same as")
    fesumbe = English "from/out of" (literally "becoming not the same as")
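Since antonymic roots take exactly the same suffixes, the opposite of a derived word can be formed by swapping only the root. The following Python sketch illustrates this for the two pairs introduced so far ("zog"/"fag" and "dap"/"fes"). The table ANTONYM_ROOT and the function antonym are hypothetical names, and the sketch assumes that a word begins with one of the listed roots.

```python
# Antonymic root pairs introduced in the text so far.
ANTONYM_ROOT = {"zog": "fag", "fag": "zog", "dap": "fes", "fes": "dap"}


def antonym(word):
    """Swap the root of a derived word for its antonymic counterpart.

    ASSUMPTION: the word starts with one of the roots in ANTONYM_ROOT;
    everything after the root (suffixes and part-of-speech marker) is kept.
    """
    for root, opposite in ANTONYM_ROOT.items():
        if word.startswith(root):
            return opposite + word[len(root):]
    raise ValueError("no antonymic root known for " + repr(word))


if __name__ == "__main__":
    for word in ("zogumbe", "dapumbe", "fagisa"):
        print(word, "<->", antonym(word))
```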
Also, note that the P/F-s verb "dapa" is equivalent to the English copula "to be". However, since the concept of 'being' is an inherent feature of all of our P-s verbs, adjectives, and basic nouns, this verb will not be needed in the interlingua as often as it is in English. Here are some examples:
    The food bokema = The food is wet.
        where "bokema" is the P-s verb meaning 'to be wet'
    That bug bokaga = That bug is a mosquito.
        where "bokaga" is the P-s verb meaning 'to be a mosquito'
However, the verb "dapa" can still be useful when emphasis is needed or when both nouns need modification or are definite.
[Incidentally, the sentence "The food dapa bokemi" literally means 'The food is the wet one'. In the interlingua, an adjective can never be an argument of a verb. Thus, to get the meaning 'The food is wet', we must use "The food bokema".]
The derivation "dape" is also useful, literally meaning 'being', and is equivalent to the English prepositions "as" or "for" in the following:
    These warriors use broadswords AS/FOR their weapon of choice.
    I prefer Jim AS/FOR the new school principal.
    The voters elected Bill Johnson AS the new mayor.
    AS an astronomer, he knows a lot about stars.
    Your claims do not conform to the laws of physics AS we know them.
Where English uses adjectives or adverbs, we will use the adverb form of the appropriate P-s or P-d verb. For example, in a sentence such as "He painted the door green", we would use the P-d adverb form of the root meaning 'green'. Literally, this would mean something like 'He painted the door, it becoming green'.
Finally, do not confuse state case roles with the focus case role. Consider the following:
Louise ran the marathon. Louise sang an aria.
In both examples, the object is a focus. If it were a state, it would describe the state of Louise. In other words, it would indicate that Louise was a marathon or an aria. However, neither "marathon" nor "aria" describes the state of a patient - instead, they elaborate the events.
In the preceding sections, we derived several case tags. Here is a list that allows us to compare their various forms:
Primary case roles:
    Passive                  -> tomese   passive + true generic
    Anti-passive             -> tomose   anti-passive + true generic
    Comitative ('with')      -> tomeve   co-subject + true generic
    Non-subject ('without')  -> tomove   non-subject + true generic
Secondary generic case roles:
    Secondary Agent          -> tomame   true generic + 0/A   = Reason 'because (of)'
    Secondary Agent-patient  -> tomime   true generic + 0/AP  = Exchange 'to/from/with'
    Secondary Patient        -> tomume   true generic + 0/P   = Beneficiary 'for'
    Secondary Focus          -> tomege   true generic + 0/F   = Exchange 'for'
Secondary non-generic case roles:
    Instrument/Means/Method 'by/with/via/etc'  -> busege      action root + 0/F
    Locative 'at/in'                           -> zoge        locative root (P/F-s)
    Locative 'to'                              -> zogumbe     locative root + P/F-d
    Locative 'from'                            -> fagumbe     locative root + P/F-d
    Temporal 'before'                          -> cipe        tense root (P/F-s)
    Manner 'like'                              -> zunxumege   state root + 0/F
    State 'into'                               -> dapumbe     state root + P/F-d
    State 'from'                               -> fesumbe     state root + P/F-d
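Because every case tag in the list above is built from a root, an optional suffix, and the final marker "-e", the tags can be taken apart mechanically. The Python sketch below shows one way this might be done. The root list, the suffix glosses, and the function segment are assumptions made for this illustration. In particular, splitting "tomese" as "tom" + "es" + "e" is inferred from the list, not stated explicitly in it.

```python
# Roots that appear in the case tags listed above.
ROOTS = ("zunxum", "tom", "zog", "fag", "cip", "bus", "dap", "fes")

# ASSUMPTION: suffix shapes and glosses as inferred from the list above.
SUFFIX_GLOSS = {
    "": "default structure of the root",
    "es": "passive",
    "os": "anti-passive",
    "ev": "co-subject",
    "ov": "non-subject",
    "am": "0/A",
    "im": "0/AP",
    "um": "0/P",
    "eg": "0/F",
    "umb": "P/F-d",
}


def segment(tag):
    """Split a case tag into (root, suffix, gloss of the suffix)."""
    if not tag.endswith("e"):
        raise ValueError(repr(tag) + " does not end in the case-tag marker -e")
    stem = tag[:-1]
    for root in ROOTS:
        if stem.startswith(root):
            suffix = stem[len(root):]
            if suffix in SUFFIX_GLOSS:
                return root, suffix, SUFFIX_GLOSS[suffix]
    raise ValueError("cannot segment " + repr(tag))


if __name__ == "__main__":
    for tag in ("tomese", "tomame", "busege", "zoge", "zogumbe", "zunxumege"):
        print(tag, segment(tag))
```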
For verbs describing relational states, the focus case role indicates the referent of the verb. This referent is always obvious and needs no further explanation. For verbs describing physical states, however, the focus is not always obvious. In fact, many physical verbs do not appear to have a focus at all. As we will see, though, all verbs can have a focus. For many of them, the focus is so strongly implied by the meaning of the verb that expressing it obliquely or as a direct object would be redundant.
Before we try to deal with verbs that seem to be inherently unfocused, let's first re-examine the semantics of focus in more obvious situations. Remember, for a focused state verb, the patient experiences a steady-state or undergoes a change of state in its relationship with the focus. For example:
    1. John needs money.
    2. John owns the house.
    3. John bought the house.
In (1), we are describing a relationship between "John" and "money". The relationship is defined by the state concept "need". In (2) and (3), we are describing a relationship between "John" and "the house". The relationship is defined by the state concept "ownership", where (2) describes a steady-state and (3) describes a change of state (number 3 also implies the use of money as a secondary focus). Thus, there is a relationship between the patient of the verb and the focus. Let's extend this idea to some simple static verbs:
    I'm angry "focus" Louise. = I'm angry at Louise.
    The house is free "focus" termites. = The house is free of/from termites.
    The little girl is afraid "focus" thunder. = The little girl is afraid of thunder.
    John is proud "focus" his father. = John is proud of his father.
    John is happy "focus" Louise. = John is happy for/about Louise.
Note that the above examples can be expressed either as P/F-s verbs where the focus is the direct object, or as P-s [+F] verbs with an oblique focus. Thus, all of the English examples above are inherently anti-passive.
Do not make the mistake of analyzing the above foci as reasons or indirect causes. For example, the sentence "The girl is afraid OF thunder" does not mean the same as "The girl is afraid BECAUSE OF thunder".
The above examples use verbs that are inherently relational. For verbs that represent non-relational scalar states, the focus will represent the actual position on a scale of possibilities:
    John is wealthy "focus" $1,000,000. = John is wealthy to the tune of $1,000,000.
    John is tall "focus" 6 feet. = John is 6 feet tall.
    The box is heavy "focus" 10 kilograms. = The box weighs 10 kilograms.
    The new student is intelligent "focus" 160. = The new student has an intelligence (IQ) of 160.
    The painting is expensive "focus" 100 dollars. = The painting costs 100 dollars.
In other words, the focus of scalar states indicates the precise nature of the state.
We can also create examples where the focus is abstract:
He formatted the document "focus" company standards. = He formatted the document according to company standards.
In other words, the document is in a relationship with a company standard, and the nature of the relationship is indicated by the verb "format". For some verbs, though, the focus is so strongly implied that expressing it separately seems redundant:
    The recession impoverished his family (?of money).
    The cat killed the mouse (?of life).
    The boys cleaned the tables (?of dirt).
    The boys broke the window (?of its structure).
We should be able to apply the same logic to specify a focus for verbs that, on first examination, appear to be inherently unfocusable, even if the result is redundant. For example, what could the focus be in the following sentence:
John managed the company. (A/P-s)
When something is managed, it has operations or other components that can be controlled:
John managed the company (?in its operations).
However, if the focus adds detail that is not implied by the verb, then a specific focus is not only acceptable but very useful:
John managed the company in its overseas operations.
Now, consider the following:
The warrior struck the peasant. (A/P-d)
Again, we can focus the action only if it provides more detail, as in:
The warrior struck the peasant a mighty blow to the head. (A/P/F-d)
In other words, the focus of an action is a more detailed description of the action itself. Note that this is exactly what happens with speech acts, where the focus describes the actual message being conveyed (e.g. "John told the kids A STORY").
Thus, for some verbs, the focus is an inherent part of the meaning of the verb; i.e., it is lexicalized. A specific focus only makes sense if it provides more detailed information.
Finally, there are indeed concepts that are inherently unfocusable. However, these are not true state or action concepts, and will not be derived as basic verbs. We'll have more to say about them later, when we discuss deixis.
Since verbs can be converted to oblique case tags and adverbs, why not apply the same logic to create the equivalent of English noun-modifying prepositional phrases (e.g. "the man WITH THE RED HAT") or adjective phrases (e.g. "countries RICH IN OIL")?
By its very nature, a verb has arguments. When other parts-of-speech are derived from verb forms, the results can also have arguments. For example, an adverb is an oblique argument of a verb but takes no additional arguments of its own. A case tag, however, operates in the same way while adding one or two new arguments to the verb. In effect, a case tag is an open verb argument, since its non-subject arguments are available for use. An adverb, however, is a closed verb argument, since it cannot take any more arguments of its own.
The same distinction can be made with other parts-of-speech that are derived from verbs. For example, the nouns and adjectives that we've seen so far are all closed, since they take no arguments of their own. In this section, we will discuss what happens when we 'open them up'.
In order to do this, though, we first need to summarize what we've done so far, and introduce a few new concepts:
1. The part-of-speech of a word in the interlingua is indicated by the final part-of-speech marker. The following morphemes have been introduced so far:
    -a = verb
    -e = case tag or adverb
    -i = noun
    -o = adjective
    -ay = previous-word modifier (see below; e.g. adverbs that modify adjectives)
2. By definition, verbs and case tags are inherently open. Nouns, adjectives, and adverbs are inherently closed.
3. Three new part-of-speech markers will be assigned that will open the argument structure of a normally closed word:
    -yu = open adjective
    -aw = open noun
    -wa = open previous-word modifier
4. An appropriate grammatical voice operation can be performed to close the argument structure of words that are inherently open.
Note that item (1) introduces a new part-of-speech marker, "-ay", which creates previous-word modifiers. These words will always modify the immediately preceding word, regardless of its part-of-speech. Thus, they can be used to implement English adverbs that modify adjectives (e.g. "RECENTLY married couple", "RAPIDLY flowing stream", etc.).
Previous-word modifiers can also be used to modify adverbs, case tags, and other previous-word modifiers.
Item (2) states nothing new and simply re-iterates what we've been doing all along.
Item (3) can be used to create nouns, adjectives, and previous-word modifiers that take arguments. I will illustrate how to do this below.
Item (4) simply re-iterates something we already know. That is, we can apply grammatical voice operations to remove one or more arguments from a verb, effectively closing it. This will allow us to create adverbs that do not take arguments of their own from verbs that normally take objects. For example, middle forms can be used to create adverbs such as "unexpectedly", "repeatedly", "amusedly", etc. Anti-middle forms can be used to create adverbs with meanings such as "destructively", "lovingly", "oppressively", "knowingly", and so on.
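The four items above amount to a small lookup table from final marker to part-of-speech and open/closed status. Here is a minimal Python sketch of such a lookup. The function name classify is hypothetical. The sketch also assumes that the two-letter markers can safely be tested before the one-letter ones, which would misclassify any verb whose stem happened to end in "w" (I have not verified whether the morphology rules this out).

```python
# (marker, part-of-speech, inherently open?) for each marker listed above.
# Two-letter markers come first so that, e.g., "-wa" is not read as "-a".
# The marker "-e" covers both case tags (open) and adverbs (closed), so its
# open/closed status cannot be decided from the marker alone (None).
MARKERS = [
    ("yu", "open adjective", True),
    ("aw", "open noun", True),
    ("wa", "open previous-word modifier", True),
    ("ay", "previous-word modifier", False),
    ("a", "verb", True),
    ("e", "case tag or adverb", None),
    ("i", "noun", False),
    ("o", "adjective", False),
]


def classify(word):
    """Return (part-of-speech, open?) based on the final marker of a word."""
    for marker, pos, is_open in MARKERS:
        if word.endswith(marker):
            return pos, is_open
    raise ValueError("no part-of-speech marker found in " + repr(word))


if __name__ == "__main__":
    for word in ("zogumba", "zoge", "cipomo", "zogyu"):
        print(word, classify(word))
```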
By opening up the argument structure of adjectives, we can create words that represent the functions of many English prepositions. Consider the following examples:
    the cup ON the table ("on" = 'being located on')
    the circus AT the fairgrounds ("at" = 'being located at')
    the can OF beans ("of" = 'containing')
    the magazine UNDER the box ("under" = 'being under')
    the pile OF junk ("of" = 'consisting of')
    the pound OF beef ("of" = 'consisting of')
    the building ACROSS the street ("across" = 'being across')
    the paper BY Smith ("by" = 'having Smith as agent')
Note that all of the above (except the agentive "by") must use the P/F-s forms of the corresponding verb.
Each open adjective will link the noun it modifies with the argument of the open adjective. Here are a few derivations using morphemes we've already defined:
    agent  -> tomamyu   e.g. the book by Mark Twain
    reason -> tomamyu   e.g. the delay tomamyu Joe = the delay caused by Joe
    with   -> tomevyu   e.g. the boy with those two women
    for    -> tomumyu   e.g. the party for Jill
    at/in  -> zogyu     e.g. the man at the corner
    before -> cipyu     e.g. the day before the party
    method -> busegyu   e.g. death by strangling
    state  -> fesumbyu  e.g. the hut fesumbyu straw = the hut made (out) of straw
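Each open adjective in the derivations above differs from the corresponding case tag only in its final marker ("-yu" in place of "-e"). A minimal Python sketch of that conversion follows. The function name to_open_adjective is hypothetical, and the sketch assumes its input is a case tag ending in "-e".

```python
def to_open_adjective(case_tag):
    """Replace the case-tag marker "-e" with the open-adjective marker "-yu"."""
    if not case_tag.endswith("e"):
        raise ValueError(repr(case_tag) + " does not end in the marker -e")
    return case_tag[:-1] + "yu"


if __name__ == "__main__":
    # Case tags introduced earlier in the text.
    for tag in ("tomame", "tomeve", "tomume", "zoge", "cipe", "busege", "fesumbe"):
        print(tag, "->", to_open_adjective(tag))
```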
And so on. Note that verbs that have had their argument structure changed with a voice-changing morpheme can also be converted to open adjectives. This would allow you to handle distinctions such as (active) "the man owning the house" vs. (inverse) "the house belonging to the man" or (passive) "the house owned by the man".
The general scalar relationship root "xum" can be used with the very general sense 'having an unspecified relationship with' or 'having something to do with'. Keep in mind that interpretations of "xum" can be different depending on context, since the generic root morpheme does not indicate a specific relationship. Out of context, an accurate paraphrase of "the article xumyu the election" would be "the article having something to do with the election". Thus, a likely translation would be "the article ABOUT the election". Other examples are "the recipe FOR cake" and "a big book OF jokes". Note that English also allows these to be expressed as vague noun-noun compounds: "election article", "cake recipe", and "joke book".
In fact, the open adjective "xumyu" can often be used in place of other, more specific, open adjectives, although the result will be vaguer. For example, "the letter xumyu Louise" means literally 'the letter associated with Louise'. Thus, a closer English gloss would be simply "the Louise letter", since it can mean 'the letter to Louise', 'the letter from Louise', 'the letter with Louise', 'the letter about Louise', and so on. Because of this vagueness, it will be useful in machine translation when the software cannot tell the difference between the very general 'associative' sense and a more precise genitive sense (which we will discuss in the next section).
Since the derivation of open adjectives is essentially the same as the derivation of case tags, I won't spend much more time on it here. In general, most case tags will have adjective counterparts, especially the locative ones. Also, keep in mind that different languages implement these functions in different ways. For example, in many languages, they are neither adpositions nor inflections, but are implemented as relative clauses (e.g. "the boy in the kitchen" = "the boy who is in the kitchen").
Also, a few languages, such as English, allow some case tags to be used, unmodified, as open adjectives. However, this is not allowed in the interlingua because case tags and open adjectives are both syntactically and semantically distinct, and because conflating them would often result in attachment ambiguities. For example, in the sentence "I spoke with the lady in the storeroom", is "in the storeroom" an oblique argument of "spoke" or a modifier of "the lady"? Besides, natural languages that use the same word for both roles, including English, often do so idiosyncratically. Consider the following:
    I put the box UNDER the bed.
    The box UNDER the bed is empty.

    The man walked INTO the room.
    *The man INTO the room is my brother.
    The man WHO WALKED/WENT INTO the room is my brother.

    He built the doghouse OUT OF plywood.
    *The doghouse OUT OF plywood is as good as the plastic one.
    The doghouse MADE OF plywood is as good as the plastic one.

    They delayed the operation BECAUSE OF his death.
    ?The delay BECAUSE OF his death was unavoidable.
    The delay CAUSED BY/DUE TO his death was unavoidable.
In other words, sometimes the case tag is the same as the open adjective, while other times it is not. When it is not the same, it is either periphrastic or idiosyncratic. The system used here is totally regular and unambiguous.
English has two ways to implement the genitive (also called the _possessive_): use of apostrophe-s or use of the preposition "of". Here are some examples:
Definite argument:
    the boat of the student = the student's boat
    the boat of the students = the students' boat
Indefinite argument:
    the boat of a student OR a boat of a student = a student's boat
    the boat of some students OR a boat of some students = some students' boat
Note that the English apostrophe-s form is sometimes ambiguous, as in the last two examples above. However, there are times when we cannot use an apostrophe-s form at all, as in "a boat of the student". In other words, the apostrophe-s form can be used only if the headword is definite or if the definiteness of the headword is the same as the definiteness of the noun following "of". If this is not true, then the "of" form must be used in English.
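The condition just stated can be written as a one-line test. The following Python sketch is only an illustration of that rule. The function name and its two boolean parameters are hypothetical, and determining the definiteness of each noun phrase is assumed to be done elsewhere.

```python
def can_use_apostrophe_s(head_definite, possessor_definite):
    """Apply the rule stated above.

    The apostrophe-s form is available if the headword is definite, or if
    the headword and the noun following "of" agree in definiteness.
    """
    return head_definite or (head_definite == possessor_definite)


if __name__ == "__main__":
    examples = [
        ("the boat of the student", True, True),
        ("the boat of a student", True, False),
        ("a boat of a student", False, False),
        ("a boat of the student", False, True),
    ]
    for phrase, head, possessor in examples:
        print(phrase, "->", can_use_apostrophe_s(head, possessor))
```

Only the last combination (indefinite headword, definite possessor) fails the test, which matches the observation that "a boat of the student" has no apostrophe-s equivalent.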
Now, consider the following:
I put the new computer in the room of computers = I put the new computer in the computer room.
In the above example, "computers" is a generic noun and the "of" form is too unnatural for fluent use. Even the apostrophe-s form would sound unnatural in most situations. Instead, to correctly capture the semantics and to sound more natural, we must use a noun-noun compound. [We'll see later how to implement generic nouns and compounds.]
In the interlingua, the genitive concept is represented by the root "xim". By default, "xim" is F/P-s (explained later). Here are some examples:
    the tail ximyu bird = the bird's tail
    the book ximyu John = John's book
    the answers ximyu the students = the students' answers
Note that the default argument structure of "xim" is actually the inverse of an active P/F-s structure, because a genitive sense inverts the normal P/F-s possessive sense. For example, the noun phrase "John's book" implies that John has the book, and the concept 'have' is inherently P/F-s. [We have chosen this somewhat odd default for its usefulness. And, as we will see later, it will have a very important additional use.]
As it turns out, the genitive concept almost completely overlaps the many senses of the English verb "to have". To illustrate this, consider the following examples:
    The project has a new manager => the project's new manager
    The house has a red roof => the house's red roof
    He has a good reputation => his good reputation
    We had problems with the new equipment => our problems
    John has an answer to your question => John's answer
    I had supper at 6 o'clock => my supper
In other words, the semantics of the verb "to have" encompasses much more semantic space than the prototypical sense of 'possession', 'ownership', or 'control'. In fact, it can even imply the exact opposite, as in:
The slaves have a new owner.
Thus, the P/F-s verb "ximunza" is in almost all respects the equivalent of the English verb "to have". [The English verb is different when it is used as an auxiliary as in "He HAS arrived", and when it is used with a causative sense as in "I HAD Joe sweep the garage".] Also, the sense of the word (and its derivatives) usually defaults to 'possession' or 'control' whenever the actual relationship is not clear from context, and this default appears to be universal among natural languages. However, as we saw in the last example above, the exact opposite can also be true.
This automatically leads to the creation of several additional words:
    A/P/F-d: ximamba - 'to give', 'to cause P to come to have F'
    AP/F-s:  ximinza - 'to keep', 'to retain'
    AP/F-d:  ximimba - 'to obtain/get', 'to accept', 'to take'
    P/F-s:   ximunza - 'to have', 'to possess'
    P/F-d:   ximumba - 'to get/receive', 'to come into/by'
Note that we used the P/F-s suffix "-unz" to convert "xim" from F/P-s to P/F-s. We will not allow the use of the inverse suffix "-ang", even though it is technically correct.
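Because these derivations are completely regular, they can be generated mechanically. The Python sketch below rebuilds the five words listed above from the root "xim". Only "-unz" is explicitly named as a suffix in the text; the other suffix spellings are read off the listed word forms, and the table, the function name, and the final "-a" treated as the verb marker are assumptions made for illustration.

```python
# Argument-structure suffixes, inferred from the five "xim" verbs listed above.
CLASS_SUFFIX = {
    "A/P/F-d": "amb",
    "AP/F-s": "inz",
    "AP/F-d": "imb",
    "P/F-s": "unz",
    "P/F-d": "umb",
}

VERB_MARKER = "a"  # assumed part-of-speech ending for verbs


def derive_verb(root: str, structure: str) -> str:
    """Concatenate root + argument-structure suffix + verb marker."""
    return root + CLASS_SUFFIX[structure] + VERB_MARKER


for structure in CLASS_SUFFIX:
    print(structure, derive_verb("xim", structure))
```

The sketch is plain string concatenation. It deliberately has no entry for the inverse suffix "-ang", since that derivation is not allowed for "xim".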
By opening up the argument structure of nouns, we can create more complex noun phrases without having to resort to the use of prepositions, relative clauses or other subordinate constructions. Here are a few English examples:
    They hired a TERMITE EXTERMINATOR.
    BASEBALL PLAYERS get paid too much.
    I am no longer a COFFEE DRINKER.
In all of the above highlighted phrases, the second word is an open noun version of a verb and the first word is its object argument. Since we are using the noun version of a verb, and since such use represents a generic subject, the subject position is automatically filled. Thus, we can say "coffee drinker" where "coffee" is a noun (literally 'drinker of coffee'), but we cannot say "woman drinker" (literally, 'drinker of woman'), where "woman" is also a noun (although we can get the sense 'woman who is a drinker' by using the adjective version of "woman").
If we first invert a verb and then use its open noun form, the original subject position becomes available while the original object position becomes automatically filled. For example, the inverse-noun form of "to study" would correspond to the English words "subject" or "topic". If we then open it up, we can create an expression like "subject John", which would be equivalent to the English expressions "John's subject of study" or "the subject that John is studying".
Later, when we discuss prefixes in more detail, we will be able to derive process nouns from verbs, such as "destruction" from "to destroy". This will allow us to emulate English expressions such as "the destruction of the city by the enemy" without the need for prepositions.
Closed previous-word modifiers can be used to implement English adverbs that modify adjectives. Here are some English examples:
    The POORLY built homes collapsed in the earthquake.
    He emptied the PARTIALLY filled can.
    The EXTENSIVELY mined pit was an eyesore.
    QUICK-frozen vegetables taste better than canned vegetables.
    I really enjoy PROPERLY prepared seafood.
Note that the above adjective-modifying adverbs are the same as verb-modifying adverbs except that the part-of-speech marker must be "-ay" rather than "-e".
The system presented here also allows us to create open previous-word modifiers; i.e. words which modify adjectives, adverbs, etc. and which take an argument and link it to the preceding word. We'll see how this can be useful later.
Some readers may object to the creation of open nouns and open adjective modifiers, claiming that they are simply short cuts for subordinate clauses. This is not really true because they cannot be modified for tense, aspect, or modality. For example, consider the following:
My beer-drinking buddies think ...
versus
    My buddies (who are) drinking beer in the corner over there think ...
    My buddies who shouldn't drink beer so much think ...
    My buddies who drank the beer that had gone bad think ...
    My buddies who may be drinking beer tomorrow night think ...
and so on.
In effect, the phrase "beer-drinking" says nothing about when the event occurred, nor does it provide additional details about where the event occurred, how it occurred, etc. In other words, an open modifier like this is more general, because it does not describe a particular event. [Linguists refer to phrases such as these as non-finite. Phrases which are modified for tense, aspect, and modality are called finite.]
It is important to keep in mind that the intent of open modifiers is to allow the creation of non-finite forms because they exist in many natural languages. These constructions are not intended to be used as shortcuts for subordinate clauses, and, if used in such a way, translations are likely to be inaccurate.
Finally, although the argument structure of the original verb is available for use in open nouns and modifiers, keep in mind that it is no longer a verb. Thus, it cannot be further modified by adverbs or case tags.
We've already discussed morphemes that change the argument structure of a word. In this chapter, I would like to discuss some of the other morphemes that will be needed.
As we stated earlier, we will always implement a morpheme as a prefix if it modifies the meaning of the word in a semantically precise way without changing its syntax. If a morpheme modifies the syntax of a word, it will be implemented as a suffix.
Here are some new prefixes and suffixes:
lo- Negator prefix
There are times when we will need to negate the meaning of a word and indicate that the referent is actually 'not' or 'other than' something. We will use the prefix "lo-" for this purpose. Here are some examples:
    kopeso = known (adjective)    lokopeso = unknown
    bodami = duck (noun)          lobodami = non-duck
    bocali = swimmer              lobocali = non-swimmer
The prefix "lo-" should not be used to negate verbs. Instead, we will use another word for this purpose which we will discuss in the chapter on Modality. For example, "lokopa", meaning 'to not know' is illegal. However, the noun derivation "lokopi", meaning 'non-knower' and the passive adjective derivation "lokopeso", meaning 'unknown', are acceptable.
lay- 'again/repeat'
This prefix represents the concept of the English adverbs 'again' or 'over' or the English prefix "re-", as in "He opened the door again" or "He reopened the door". It indicates that the event is a repetition of a previous event. Here are some English examples:
    to rejoin, to reclassify, to reorganize, to replay
Note that all of the above could just as easily have used the adverb "again", as in "to play again". When modifying noun concepts, it can be rendered in English as "once-again" or "de novo", as in "the once-again teacher".
lyu- 'back/in return'
This prefix will represent the concept behind the English expressions "back" or "in return", as in "He shouted back at me" or "I gave her a kiss in return". It indicates that a previous situation existed in which the current agent and a different agent had reversed roles. These derivations are probably not going to be useful with non-agentive verbs. Here are some English examples:
    to shout back at
    to do (someone) a favor in return
    to hit (someone) back
    etc.
We can also use "lyu-" with the action root "bus". Thus, the AP-s verb "lyubusisa" means 'to reciprocate', the focused version, "lyubusinza" means 'to reciprocate by', and the AP-s adverb "lyubusise" means 'back' or 'in return'.
However, the adverb is never really necessary since "lyu-" can be attached directly to the verb being modified. Do not confuse "lyu-" with the English words "back/again" when they mean 'one more time'.
lwa- 'back/to a former state'
This prefix represents the concept behind the English word 'back' in expressions such as "I put the book back on the table" or "The coach changed back into a pumpkin". It is used to indicate that the patient is returning to a previous state. Here are some examples:
    A/P-d:   "zogapa" = 'to move/place/position'
             "lwazogapa" = 'to put/move back'
    A/P/F-d: "ximamba" = 'to give'
             "lwaximamba" = 'to give back', 'to return to'
English examples:
    to go back to, to return to
    to throw (something) back to
    to turn back into, etc.
We can also use "lwa-" with the root "dap" meaning 'to be'. For example, A/P/F-d "lwadapamba" means 'to restore, reestablish, or return something' to its previous state, P/F-d "lwadapumba" means 'to revert/regress/retrogress/reverse to', or 'to change back to', and so on.
Finally, do not confuse "lyu-" = 'back/in return' with "lwa-". The prefix "lyu-" indicates that the AGENT is repeating something done earlier by a different agent. The prefix "lwa-" indicates that the PATIENT is reverting to a former state. For example, in a sentence such as "John threw the ball back to Bill", we can use either "lyu-" or "lwa-". However, use of "lyu-" emphasizes that John was simply repeating Bill's action, while "lwa-" emphasizes that the ball is being returned to its previous state. In other words, "lyu-" is inherently an action since it emphasizes what the agent does, while "lwa-" is inherently stative since it emphasizes what happens to the patient.
-on Essential quality and ability suffix
Use the suffix "-on" to represent the essential and distinctive quality of a prototypical, generic subject. For derivations from agentive verbs, the meaning will indicate some kind of capability or skill.
For other verbs, it will indicate the abstract quality associated with the subject. In other words, for verbal concepts, it will indicate the attribute that a generic subject "has". (Note that we are using the loose sense of 'have' in the interlingua word "ximunza", not the much stronger sense of the English words "possess" or "own".) For derivations from nouns and adjectives, most English noun equivalents will end in "-ness" or "-ity", while adjective equivalents will be the corresponding English attributive word (e.g. "navy" -> "naval", "marriage" -> "marital", "reptile" -> "reptilian", "circle" -> "circular", and so on). Examples:
    "kopa" = 'to know'
    "koponi" = 'knowledge' (literally: what the subject "has")
    "kopono" = 'knowledgeable'
    "kopemoni" = 'knowability' (literally: what the object "has")
    "kopemono" = 'knowable'
    "kobaycala" = 'to teach' (intransitive)
    "kobaycaloni" = 'teaching ability'
    "kobaybegi" = 'teacher'
    "kobaybegoni" = "teacherness"
    "zunxume" = 'like/similar to'
    "zunxumoni" = 'similarity'
    "bokemo" = 'wet/damp'
    "bokemoni" = 'wetness', 'dampness'
    "kavo" = 'real/existent'
    "kavoni" = 'reality/existence/realness'
    "byedami" = 'bird'
    "byedamoni" = 'birdness'
    "byedamono" = 'avian'
    "byefemi" = 'time (period)'
    "byefemono" = 'temporal'
Both nouns and adjectives can be formed from the active and middle voice derivations. For example, if the P/F-s verb meaning 'to like/enjoy' is "zoykopa", then we can derive the following:
    "zoykoponi" = 'enjoyment', 'pleasure', 'delight'
    "zoykopono" = 'pleased', 'delighted', 'thrilled', 'gratified', 'appreciative'
    [Note that the simple adjective "zoykopo" has essentially the same meaning.]
    "zoykopemoni" = 'likableness', 'likability', 'enjoyability'
    "zoykopemono" = 'likable', 'enjoyable', 'pleasing', 'delightful'
    [Note that the open adjective form "zoykopemonyu" is also useful and means 'pleasing to'.]
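Most of the "-on" derivations listed above follow one concatenation pattern: drop the part-of-speech ending, optionally add the middle-voice "em", add "on", and add the new part-of-speech ending. The Python sketch below reproduces that pattern. The table of endings and the treatment of "em" are inferred from the example words only, the function names are hypothetical, and exceptions such as "kavoni" (from "kavo") are not covered.

```python
# Part-of-speech endings as they appear in the examples (an inferred table).
POS_MARKERS = {
    "noun": "i",
    "adjective": "o",
    "verb": "a",
    "adverb": "e",
    "open_adjective": "yu",
    "open_noun": "aw",
}


def strip_pos(word: str) -> str:
    """Remove the final part-of-speech ending, longest ending first."""
    for marker in sorted(POS_MARKERS.values(), key=len, reverse=True):
        if word.endswith(marker):
            return word[: -len(marker)]
    raise ValueError(f"no known part-of-speech ending on {word!r}")


def add_on(word: str, pos: str = "noun", middle: bool = False) -> str:
    """Derive the '-on' form of a word for the requested part of speech."""
    stem = strip_pos(word)
    if middle:
        stem += "em"  # middle-voice marker, as seen in "kopemoni"
    return stem + "on" + POS_MARKERS[pos]


print(add_on("kopa"))                                  # knowledge
print(add_on("kopa", "adjective", middle=True))        # knowable
print(add_on("byedami", "adjective"))                  # avian
print(add_on("zoykopa", "open_adjective", middle=True))  # pleasing to
```

This is string manipulation only. It does not check whether a derivation is semantically sensible, and it assumes that no phonological adjustment is needed at the morpheme boundaries.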
Applying the same logic, verb forms will mean 'having the quality or ability' and will maintain their original argument structure. Thus, from "kobaycala" meaning 'to teach', we can derive "kobaycalona" meaning 'to have the ability to teach' or 'to know how to teach'. In effect, we are divorcing the act from the ability. Thus, the verb "kobaycalona" means that someone has the ability to teach (but not necessarily that this ability is actually put to use). It is essential to note that this convention is contrary to our usual approach, which would force us to interpret "kobaycalona" as 'to BE the ability to teach'. However, this is not a useful interpretation and can be achieved, if necessary, using the copula verb "dapa". The interpretation 'to HAVE the ability to teach' is much more useful.
In effect, a derivation using "-on" will be a noun with the meaning 'ability to X'. All non-noun derivations will have the sense 'having the ability to X'. In other words, the noun will represent the ability itself, while all other parts-of-speech will represent 'having' the ability.
If we open up the argument structure of a "-on" derivation, we will be able to fill all argument positions, including the subject. In other words, "-on" does not change the argument structure of a verb. For example, using a right-branching, VSO word order, "kobaycalanzona John students French" means 'John has the ability or knows how to teach students French' (where "-anz" = A/P/F-s); "people kobaycalanzonyu students French" ("-yu" = open adjective) means 'people with the ability to teach students French'; and "kobaycalanzonaw John students French" ("-aw" = open noun) means 'John's ability to teach students French'.
The generic AP/F-s action verb "businzona" (from "businza", meaning 'to do') indicates that the subject has the ability to do or perform the deed elaborated by the focus.
Thus, "businzona" is equivalent to the English verb "can" or "to be able", as in "John can swim" or "She knew how to make a lot of money" (but not "She was able to make a lot of money" which actually means that she succeeded in making a lot of money). Note though, that this verb is not likely to be used very much in the interlingua, since it is much more efficient to add "-on" directly to a verb. For example, it is more efficient to say "John kobaycalona" rather than "John businzona to teach", even though both mean 'John knows how to teach'. The adjective form "businzono" means 'able/capable/competent', and the noun form "businzoni" means 'ability/capability/competence (to do something)'. The middle adjective "businzemono" means 'doable', or 'capable of being done', while the noun form "businzemoni" means 'do-ability'.
When used with P/F words, "-on" plus the open noun part-of-speech "-aw" will express the relationship between two entities (i.e., what they "have" in their relationship with each other). Here are some examples using a pure right-branching syntax (i.e. VSO):
    "zunxumonaw John Bill" = 'the similarity between John and Bill'
    "zoykoponaw John swimming" = 'the enjoyment that John has for swimming', 'John's enjoyment of swimming'
    "koponaw John mathematics" = 'John's knowledge of mathematics'
    "xumonaw lightning thunder" = 'the relationship between lightning and thunder'
As seen in the last example, when combined with the generic state root "xum", "-on" will indicate a generic relationship. Thus, the simple noun "xumoni" will mean 'relationship', 'involvement', or 'association'. The unfocused P-s [-F] "xumomoni" will mean 'state' or 'condition'.
-ink & -env Process and event noun suffixes
A language needs to be able to talk about events as if they were objects; i.e., as if they were nouns. By doing so, the result will have the syntax of a noun, including its ability to be modified by adjectives. Here are some English examples:
    a. Rapid measuring will produce bad results.
    b. We have to make two accurate measurements.
    a. She hates bathing in that dirty lake.
    b. She had a long bath before bed.
    a. Forgetting your anniversary can be dangerous.
    b. I had a terrible lapse of memory.
Note that all of the (a) examples ("measuring", "bathing", and "forgetting") refer to the actual process that takes place between the endpoints. In other words, they look at the event "from the inside". All of the (b) examples, however, refer to a discrete event. In effect, we are looking at the events "from the outside".
Another way of looking at the distinction between processes and events is that an event is essentially an entity or an object, while a process is essentially a state. In other words, an event IS a thing while a process DESCRIBES a thing. Note also that the process vs. event distinction for verbs is very similar to the mass vs. count distinction for nouns.
In the interlingua, we will allocate the suffix "-ink" to create process nouns and the suffix "-env" to create event nouns. English process derivations typically end in "-ing". Event derivations end in "-ing", "-ion", or "-ment", use the verb unchanged (e.g. "bath"), or have completely idiosyncratic forms (e.g. "lapse of memory"). Here are some more examples (results are nouns by default):
    "zunxuminza" = 'to imitate'
    "zunxuminzinki" = 'imitating'
    "zunxuminzenvi" = 'imitation'
    "busa" = 'to do something to'
    "businki" = 'affecting', 'doing (something to someone/something)'
    "busenvi" = 'action', 'deed', 'act'
    "busasa" = 'to control/manage/run'
    "busasinki" OR "busasenvi" = 'control(ling)', 'managing/management'
    [Note that English does not seem to make a distinction here between the process and event senses.]
    "businza" = 'to do/perform'
    "businzinki" = 'doing', 'performing'
    "businzenvi" = 'activity', 'goings-on', 'doings'
    "busisa" = 'to act/take action'
    "busisinki" = 'acting'
    "busisenvi" = 'action (taken)', 'steps (taken)'
Here are several English process/event pairs: rehearsing/rehearsal, speaking/speech, flying/flight, crossing/crossing, auditioning/audition, driving/drive, fighting/fight, receiving/receipt, announcing/announcement, etc.
Processes and events maintain their argument structure. Thus, the open noun form (assuming a right-branching word order) "zunxuminzenvaw John teacher" means 'John's imitation of the teacher', while the middle version "zunxuminzemenvaw teacher" means 'imitation of the teacher'.
Finally, the process and event noun suffixes are used to convert a clause to a noun. In other words, the suffix "noun-ifies" a clause, and thus allows it to be treated syntactically like a noun. For example, it can be modified by an adjective. However, these suffixes are (probably) not really needed in the interlingua since any argument of an open word can be either a noun or a clause. They were included in the language because they have counterparts in many other languages, and thus will make translation easier.
-iv Infinitive/Participle suffix
The infinitive/participle suffix is used when the verb is part of an embedded sentence and when its subject is the same as one of the preceding arguments of the outer verb. The English equivalent is the particle "to", as in "John wants to go now" or "He tried to open the door" or "I told the children to sit down". It can also sometimes be represented by English gerunds ending in "-ing", as in "I know eating fried foods is bad for my health" or "He hates getting up early". Be careful not to confuse the infinitive/participle with the purpose case role, as in "Bill opened the window (in order) to cool off the room". [We'll discuss the derivation of the purpose case tag later.]
When used as indicated above, infinitives are inherently non-finite and should NEVER be marked for tense and aspect. [We'll have more to say about tense and aspect later.] [Incidentally, because an infinitive is non-finite, its actual tense and aspect will either be clear from context or will be ambiguous. Because of the potential for ambiguity, I was reluctant to include an infinitive form in the interlingua, and decided to do so only because many languages have an equivalent form.]
Suffix "-iv" should also be used to mark a verb if it has the same subject as a preceding verb that it is linked to by means of a conjunction or case tag. Here are two examples in English:
    I broke the window by kicking it.
    Joe opened the window and left the room.
In the first example, the word "kick" should have the suffix "-iv" to indicate that the subject of "kick" is the same as the subject of "broke". In the second example, the word for "left" should use "-iv" to indicate that its subject is the same as the subject of "opened". When used in this way, a word marked with "-iv" may also be marked for tense and aspect, as in:
    Joe opened the window and will leave the room later.
However, unless overridden, the tense and aspect that applies to the first verb will also apply to the infinitive.
-oys Same arguments as first conjunct suffix
There are times when we want to have more than one verb use the same argument list. Here are some English examples:
    John opened and closed the window.
    The room was cold and wet.
In the second example, the words "cold" and "wet" are actually P-s verbs in the interlingua. For example, the adjective "bokemo" means 'wet' and the P-s verb "bokema" means 'to be wet'.
In the interlingua, syntax follows very strict rules and we can't just link verbs together without modifying them in some way to ensure that the parser will parse them correctly. We could insist on verbosity, as in "John opened the window and he closed it".
However, this will put a significant additional burden on the source translator. Instead, in the interlingua, we will use the suffix "-oys" to mark the second and subsequent verbs in a chain that have the same argument list as the first verb. Verbs marked with "-oys" cannot take additional core arguments, although they may take additional oblique arguments. If so, the additional oblique arguments will apply only to the verb they immediately follow. For example (using VSO word order):
    Doykavapa John kifigi tesye doyjuvapoysa ticuloge.
    opened    John window and   closed(oys)  quickly
    'John opened and quickly closed the window.'
In the above example, "doyjuvapa" inherits the arguments "John" and "kifigi" from "doykavapa". However, "ticuloge" applies only to "doyjuvapa".
Note that it is not possible to apply an argument to the first conjunct and not apply it to the second and subsequent conjuncts. For example, it is not possible to say "John quickly opened and closed the window", where "quickly" modifies only "opened". Here's the example:
    Doykavapa John kifigi ticuloge tesye doyjuvapoysa.
    opened    John window quickly  and   closed(oys)
    'John opened and closed the window quickly.'
In other words, since "quickly" modifies "opened", it also automatically modifies "closed". If we wish to modify only "opened", then we must use periphrasis in both English and the interlingua, as in "John quickly opened the window and closed it".
-av Reflexive suffix
In a reflexive construction, an argument is marked as being identical to the subject of the verb. Most reflexive constructions in English use the morpheme "self" to mark this function. In the lexical semantic system we are discussing here, this function is often performed by deriving a verb whose subject is AP. For example, the verb "to kill" is an A/P-d verb, while the AP-d version means 'to kill oneself or commit suicide'. There are situations, however, when we must reflexivize a focus, creating subjects that are either PF, APF, or AF.
There will also be cases where we want to reflexivize an action, as in "He kicked himself". The reflexive suffix "-av" will allow us to do this. Here are some examples:
    A/P/F-s "faganza" = 'to keep (something) away from (somewhere)'
    A=F/P-s "faganzava" = 'to keep (something) away from oneself', 'to keep away'
    A/P/F-d "fagamba" = 'to move (something) away from (somewhere)'
    A=F/P-d "fagambava" = 'to send/move away', 'to dismiss/dispatch/expel', 'to cause P to become away from oneself'
Note the use of "=" in the above notation. While it could be omitted in the above examples without confusion, it cannot be omitted for action verb derivations such as "to kick oneself", because, for actions, AP has a different meaning than A=P. Here are a few English examples:
    A=F-d [+P] 'self-explanatory'
    P/F-s 'to be with'
    AP/F-s 'to accompany'
    A=F/P-s 'to bring/take along'
    AP=F-s 'self-admirer' and 'self-admiration'
           'self-contempt'
Note that, in all cases, X/Y becomes X=Y. For state derivations, X/Y/Z becomes X=Z/Y (NOT X=Y/Z!). There is never a need to go from X/Y/Z to X=Y/Z since this capability is already available as an AP/F verb derivation. For action derivations, X/Y/Z does become X=Y/Z. There is never a need to go from X/Y/Z to X=Z/Y, since it would never make sense for the focus of an action to be identical to the agent of the action.
We can use the generic noun "tomavi" to represent English words such as "myself", "themselves", etc. when we wish to create a stand-alone reflexive. (This is actually closer to the Japanese "jibun", since it does not indicate person or number.) Here are a few examples:
    He killed tomavi = He killed himself.
    I saw tomavi in the mirror = I saw myself in the mirror.
The adjective form "tomavo" can represent the English word "own", as in the following:
    He killed tomavo mother = He killed his own mother.
    They brought tomavo chairs = They brought their own chairs.
    I wanted tomavo business = I wanted my own business. OR = I wanted a business of my own.
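The notation rules given above for the reflexive suffix (X/Y becomes X=Y; X/Y/Z becomes X=Z/Y for states and X=Y/Z for actions) can be sketched as a small rewriting function. The Python below is illustrative only: the function name and the "state"/"action" parameter are assumptions, and the code operates on the argument-structure notation, not on interlingua words.

```python
def reflexivize(structure: str, kind: str) -> str:
    """Rewrite an argument-structure label as the "-av" suffix would.

    structure: a label such as "A/P/F-s" or "P/F-d"
    kind: "state" or "action" (only matters for three-argument labels)
    """
    roles, sep, dynamism = structure.partition("-")
    parts = roles.split("/")
    if len(parts) == 2:
        x, y = parts
        new_roles = f"{x}={y}"
    elif len(parts) == 3:
        x, y, z = parts
        if kind == "state":
            new_roles = f"{x}={z}/{y}"   # subject merges with the focus
        elif kind == "action":
            new_roles = f"{x}={y}/{z}"   # subject merges with the patient
        else:
            raise ValueError("kind must be 'state' or 'action'")
    else:
        raise ValueError(f"cannot reflexivize {structure!r}")
    return new_roles + sep + dynamism


print(reflexivize("A/P/F-s", "state"))   # the pattern of "faganza" -> "faganzava"
print(reflexivize("A/P/F-d", "action"))
```

Labels that already contain a fused subject, such as AP/F-s, are passed through the same two-slot rule here; whether that is always the intended reading is not settled by the examples, so treat such cases with caution.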
Finally, English often uses "self" in ways that are not truly reflexive. For example, words like "self-discovery" and "self-satisfaction" are essentially idiomatic, and "-av" does not capture these meanings. Others, such as "self-ignition", imply that something happens automatically, with no apparent agent. These can be implemented using the basic P-d version of the verb. Also, expressions such as "he himself" are emphatics - and not true reflexives. [We'll discuss how to derive emphatics later.]
-awn and -ind Reciprocal suffixes
In a reciprocal construction, the subject performs the roles of both subject and object. Most reciprocal constructions in English use a plural or compound subject and the phrase "each other" as the object, as in "They punched each other". Some verbs, however, are inherently reciprocal, and we will use the reciprocal suffix to create them. Thus, this suffix will change the argument structure of a verb from X/Y-x to X+Y-x or from X/Y/Z-x to X+Y/Z-x. (Note the use of "+" in the notation "X+Y-x". This is necessary since the semantics of reciprocal XY is different from the semantics of normal XY.) Here are some examples:
    P/F-s: "xumyu" = 'about', 'for', 'associated with', 'in a relationship with', 'involved with', 'having something to do with'
    P+F-s "xumawno" = 'mutual', 'reciprocal', 'having an unspecified association or relationship with each other',
          "xumawni" = 'correlative' (i.e., things which have an unspecified relationship or association with each other)
    P/F-s "zunxumyu" = 'similar to', 'like'
    P+F-s "zunxumawno" = 'alike/similar', 'like/similar to each other'
Here are a few English examples:
    A/P-d 'to argue/quarrel with'
    A+P-d 'to argue/quarrel'
    A/P/F-d 'to speak to... about...', 'to tell'
    A+P/F-d 'to converse/talk about', 'to discuss', 'to have a conversation about'
Note that the suffix "-awn" is similar to the back/in return prefix "lyu-", which we discussed earlier, but is different in two important ways: first, the prefix "lyu-" does not combine the subject and the object into a single argument, and thus has no effect on the argument structure; and, second, "lyu-" implies a sequential event, whereas the suffix "-awn" implies simultaneity.
We can use the generic "tomawni" to represent the English phrase "each other" or "one another" when we need to apply the concept in a non-verbal form.
Finally, it will also be useful to have a reciprocal suffix that equates the patient and the focus of an A/P/F verb; i.e. A/P/F -> A/P+F. For this purpose, we will use the suffix "-ind". For example, the A/P/F-d locative verb "zogamba", meaning 'to move P to F', becomes A/P+F-d "zogambinda", meaning 'to put P and F together', 'to bring together', 'to gather', 'to round up', 'to muster', 'to assemble', etc. (literally: 'to cause object components to become "at" each other').
-ig Apply/Use suffix
Many languages have ways to derive verbs from nouns with the meaning 'to apply noun to patient' or 'to use noun for/on patient'. In the interlingua, the suffix "-ig" is used for this purpose, and creates an A/P-d action verb from the noun root. Here are some English examples:
    "to brush" from the noun meaning 'brush'
        e.g. "Louise brushed her hair."
    "to hammer" from the noun meaning 'hammer'
        e.g. "I hammered the spike into the crossbeam."
    "to truck" from the noun meaning 'truck'
        e.g. "We trucked the goods into town."
    "to paint" from the noun meaning 'paint'
        e.g. "I need to paint the shed."
    "to radio" from the noun meaning 'radio'
        e.g. "They radioed the soldiers at the river."
Note that, in each case, we can paraphrase the result as "the agent uses the noun for or on the patient, or applies the noun to the patient in a way that is inherent to the noun's nature".
The suffix "-ig" should only be used with noun roots, with one exception: when suffixed to the generic action root "bus", the result "busiga" will be equivalent to English 'to use on/for' or 'to apply to'. However, unlike all other "-ig" derivations, "busiga" will be A/P/F-d. This is necessary because, when "-ig" is added to a noun root, the root itself is the inherent focus. However, when used with "bus", the item or substance being used must be explicit, allowing it to be expanded via modification. For example, we can use "busiga" to say something like "I used two bars of soap on the dogs", where "two bars of soap" is the explicit focus. [IMPORTANT! Note that the explicit focus usually precedes the patient in English. In the interlingua, the patient must precede the focus.]
Do not confuse "busiga" with "busasa". "Busasa" indicates that the agent is simply in control of the patient, while "busiga" indicates that the focus is being used according to its nature to cause the patient to undergo a change-of-state. If in doubt, use "busasa", since it is more general. This is especially true if there is no clear implication that someone or something is undergoing a change-of-state as a result of the usage of the item.
It's important to emphasize that when "-ig" is added to a root, the result is an action verb, not a state verb, and the result emphasizes what the agent is doing rather than what the patient is experiencing. Thus, the final state of the patient may not be obvious. If this very important difference between state and action concepts is not clear to you by now, you may want to refresh your memory by re-reading the sections on state verbs and action verbs.
Finally, "-ig" should not be used to indicate that the noun is added to the patient, as in the English verbs "to salt" or "to water". We'll see how to handle this sense in the next section.
-ent and -unk Add and Remove suffixes
Many languages have ways to derive verbs from nouns with the meaning 'to add noun to patient' or 'to remove noun from patient'. In the interlingua, the suffix "-ent" is used for the 'add' sense and "-unk" is used for the 'remove' sense. In either case, the result is an A/P-d state verb. Here are some English examples:
"to water/hydrate" and "to dry/dehydrate" from the noun meaning 'water'
"to plant" and "to remove plants from" from the noun meaning 'plant'
"to enlarge/expand/make bigger" and "to shrink/compress/make smaller" from the adjective meaning 'big' [For adjectives that have antonymic or opposite counterparts, "-ent" and "-unk" will be applied only to the positive forms. For example, we will not apply these suffixes to the root meaning 'small'.]
"to pressurize" and "to depressurize" from the noun meaning 'pressure'
"to salt" and "to de-salt/desalinate" from the noun meaning 'salt'
These suffixes should only be used with noun and adjective roots, with one exception: they may be suffixed to the true generic root "tom". "Tomenta" will be equivalent to English 'add to', while "tomunka" will be equivalent to English 'extract or remove from'. However, unlike all other "-ent" and "-unk" derivations, these two words will be A/P/F-d. This is necessary because, when the suffix is added to a noun root, the root itself is the inherent focus. When used with "tom", however, the item or substance being added or removed must be explicit, allowing it to be expanded via modification. In other words, "tom" is simply a placeholder for any noun concept - the actual concept will be the focus. For example, we can use "tomenta" to say something like "I added two teaspoons of salt to the soup", where "two teaspoons of salt" is the explicit focus. [IMPORTANT! Note that the explicit focus usually precedes the patient in English. In the interlingua, the patient must precede the focus.]
Do not confuse "-ent" with "-ig". For example, if the interlingua word for 'refrigerator' is "caybisi", then "caybisiga" means 'to refrigerate' (literally, 'to use refrigerators on the patient in the way they are normally used'), while "caybisenta" means 'to add refrigerators to', as in "We need to add refrigerators to the rooms".
-aym Associated position noun suffix
Use this suffix to derive the location or section associated with a state or action. The resulting structure will be P/F-s. This suffix can be used with any word, but is especially useful with the relational locative classifiers "-zog" and "-fag" to form the names of the basic directions and relative positions. Here are some examples:
zizoge = in, inside of, within
zizogaymi = the inside, the interior
zizogaymo = inner, internal, interior
dezoge = on (the surface of), upon
dezogaymi = the surface
dezogaymo = surface (as in "the surface texture or color")
This suffix can also be used with other words to create new words with meanings such as 'the red place' or 'the poetry place' or 'the volleyball place'. It's important to emphasize that this suffix creates a P/F-s relationship in which the focus is the larger location that the patient is a part of. Here are some examples:
zizogaymaw box = the inside of the box
dezogaymaw ocean = the surface of the ocean
Similarly, we can also create expressions with meanings such as 'the red part of the apple' or 'the poetry section of the library' or 'the volleyball section of the field'.
Since this suffix represents the location associated with a person, thing, or event, it may not also be used with a voice suffix. In other words, it applies to the entire concept represented by the word - not just to a single argument of the word.
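The suffixes "-ig", "-ent", "-unk", and "-aym" all seem to attach in the same mechanical way: the derivational suffix goes between the stem and the word ending. The following Python fragment is a minimal sketch of that step, not part of the interlingua software. The function name "derive" is hypothetical, and the sketch assumes that the input word ends in a one-letter part-of-speech marker ("-o", "-i", "-a", or "-e"), as in the examples above. Longer endings on the input word (such as "-aw" or "-ay") are not handled.

```python
# Sketch only: assumes the input word ends in a one-letter part-of-speech
# marker, as in "caybisi" (noun) or "zizoge" (adverb/case tag).
POS_ENDINGS = ("o", "i", "a", "e")


def derive(word: str, suffix: str, ending: str) -> str:
    """Insert a derivational suffix between the stem and a new ending."""
    if not word.endswith(POS_ENDINGS):
        raise ValueError(f"unrecognized part-of-speech ending: {word!r}")
    stem = word[:-1]  # drop the old one-letter ending
    return stem + suffix + ending


print(derive("caybisi", "ig", "a"))   # caybisiga  'to refrigerate'
print(derive("caybisi", "ent", "a"))  # caybisenta 'to add refrigerators to'
print(derive("tomi", "unk", "a"))     # tomunka    'to remove from'
print(derive("zizoge", "aym", "i"))   # zizogaymi  'the inside'
```

The sketch performs no semantic checks. For instance, it will not stop you from applying "-ig" to a non-noun root, so the restrictions described above still have to be enforced elsewhere.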
Several root classifiers have antonymic forms that can be used to create true opposites. We saw an example of this earlier (i.e., "zog" vs. "fag"). We'll see a few more later. Only classes that have many true, semantic opposites will have contrasting pairs of classifiers.
However, it will also be useful to be able to create approximate opposites; i.e., words which are not true antonyms but are highly contrastive. In the interlingua, we will use the modifying root morpheme "ju" for this purpose. This modifier can also be used to create true opposites for roots whose classes do not have contrasting classifiers.
Keep in mind that, since "ju-" is just a modifier, it is not being used with semantic precision. Instead, as with all modifying root morphemes, it is being used only for its mnemonic value. Here are a few examples:
bofemi = monsoon, rainy season
jubofemi = dry season
botisi = oasis
jubotisi = desert
joybusa = to treat, to minister, to give care to
jujoybusa = to wreck, to ruin
twacesi = chest [part of the body]
jutwacesi = back
Note that "ju-" changes the meaning of the root, but does not change the class or argument structure. For example, we cannot create "jubocivi" with the meaning 'soil/earth/dirt' from "bocivi" (meaning 'water'), because the word meaning 'soil' requires a different classifier.
Finally, do not confuse the prefix "lo-" with the root modifier "ju". The prefix modifies the entire word while the root morpheme creates an antonym of the root. For example, "botisi" means 'oasis', "lobotisi" means 'non-oasis', "jubotisi" means 'desert', and "lojubotisi" means 'non-desert'.
The simplest possible generic derivation would consist of just the true generic root "tom" and an appropriate part-of-speech marker. Also, to represent true genericity, the default class of "tom" must be "0" (as also represented by the "0" non-linking suffix "-og", which we discussed earlier).
By their very nature, pure generics like this can encompass any or all possible referents. In other words, they perform the same functions as the impersonal constructions of English and other natural languages.
Here are the derivations:
Generic adjective "tomo" - 'a/an/some (singular)', 'some (plural)' e.g. We need AN empty box. SOME jerk just blocked my car. SOME people are at the door.
Generic noun "tomi" - 'something/anything' e.g. SOMETHING broke the window. Did you see SOMETHING in the lake? Billy didn't break ANYTHING. [Note that "tomi" cannot be translated as 'somebody/anybody', because these words can only be applied to people and are therefore too specific. To get the sense of 'somebody/anybody', we can use the words meaning 'a person' or 'some people'.]
Generic verb "toma" - 'something's going on/happening' e.g. SOMETHING'S GOING ON here. If he persists, SOMETHING's bound TO HAPPEN. [Note that since "toma" does not specify an argument structure, it cannot have ANY core arguments, and may stand alone as a complete sentence. All arguments, if any, must be oblique.]
Generic adverb "tome" - 'you know', '... or something', 'somewhere', 'somehow', 'for some reason or other', 'among other things', etc. ["Tome" simply indicates that the verb can take more arguments; i.e., that there's more that can be said, but that the speaker either can't or won't specify.]
Generic previous-word modifier "tomay" - 'somehow' ["Tomay" can be used to modify adjectives or adverbs.]
If we apply the antonymic morpheme "ju-", the results are also very useful.
"jutomo" - 'no', 'not' e.g. NO man left these footprints. I saw Bill but NOT John. "jutomi" - 'nothing', 'nil', 'naught' "jutoma" - 'nothing's going on', 'nothing happened' "jutome" - 'that's all', 'that's it', 'no more', etc. [Like "tome", there is no close English equivalent to "jutome", which indicates that the verb can take no more arguments.]
Note that it would be semantically meaningless to have additional oblique arguments following "tome" or "jutome", even though the syntax allows it.
In the interlingua (as well as in many natural languages), verbs are marked to explicitly show their argument structure. Thus, for instance, a speaker cannot use a verb that takes a focus unless he plans to provide a focus. If he wishes to omit an argument, he can use an appropriate voice-changing operation. In English, however, many of these voice-changing operations are not available and objects are often omitted, as in the following:
John is eating vs. John is eating a sandwich.
Bill told a joke vs. Bill told the kids a joke.
There will be times, though, when a speaker wishes to emphasize that an argument is being intentionally omitted.
In the interlingua, the basic generics "tome", "jutome", "tomi", and "jutomi" allow us to do this. The generic "tomi/jutomi" fills a single empty slot in the argument structure of the verb, while generic "tome/jutome" does the same for an oblique argument. In effect, they are equivalent to the use of anti-passives without an oblique argument.
However, these words should not be used when the actual argument or modifier is known from context or is irrelevant. If it is known from context or is irrelevant, then an appropriate argument structure suffix or voice operation should be used. For example, in "John is eating", if what John is eating is irrelevant (i.e., the speaker is only interested in saying what John is doing), then the AP-s form of the verb should be used. [Note that this is equivalent to the anti-middle of the AP/F-s word.] In "Bill told a joke", if the audience is assumed from context, then the anti-middle voice should be used. Otherwise, the anti-passive should be used (which gives the speaker the option of specifying the audience obliquely; e.g. "Bill told a joke TO THE CHILDREN").
Many concepts are inherently scalar in nature; i.e., they can be easily modified to indicate their degree, as in "extremely cold", "very cold", "moderately cold", "not so cold", and so on. The degree that applies to a concept is what we will call the polarity of the concept.
We've already had some exposure to the concept of polarity when we used an antonymic classifier or the modifier "ju-". As we've seen, these morphemes can also be used to create true opposites for state concepts that can have only an "either/or" interpretation. I will refer to these as binary states. Here are some English examples:
open -> close = become 'not open'
attach -> detach = become 'not attached'
recall -> forget = become 'not in-memory'
enter -> exit = become 'not inside'
zip -> unzip = become 'not zipped'
same -> different = be 'not the same'
In other words, for binary states, anything that is 'not X' is by definition 'the opposite of X'.
When used with concepts that do not have true opposites, "ju-" can be used only for its mnemonic value. For example, as we saw earlier, "botisi" = 'oasis' vs. "jubotisi" = 'desert'.
Another type of true opposite is called an antonym. Antonyms are fully scalar concepts that are in opposition to each other. Examples of these are 'hot/cold', 'heavy/light', 'happy/sad', 'love/hate', and so on. Note that these are not binary opposites! For example, something that is not hot is not necessarily cold - it can also be neither hot nor cold. In the interlingua, an antonymic classifier exists for each of these classes.
A good test is to ask yourself if "slightly X" means the same as "mostly Y", where X and Y are the opposites. If they mean the same, then they are binary opposites. Otherwise, they are scalar opposites. For example, "slightly drunk" has essentially the same meaning as "mostly sober", and vice versa. Thus, "drunk/sober" are binary opposites. However, "slightly hot" does not mean "mostly cold", since it's possible to be neither hot nor cold. Thus, "hot/cold" are scalar opposites.
In addition to the above oppositions, scalar concepts can often be further broken down into narrower concepts that represent specific positions on the same scale, such as 'torrid/hot/warm/lukewarm/cool/chilly/cold/frigid'. Thus, these concepts are modified scalar concepts. However, natural languages almost never make minor distinctions such as between "cold" and "cool" or between "warm" and "hot" with completely different words. Instead, modifiers are normally used, as in "heavy" vs. "very heavy" vs. "not too heavy", etc. Also, when a language does make such a distinction using unique words, it is rare to find other languages that make the same distinction. For example, the Arabic word "baarid" can mean either 'cool', 'chilly', or 'cold'. Expressions meaning 'very', 'not too' and so on are used to provide greater detail when needed.
So, summarizing the above, there are basically four types of opposites:
1. Binary opposites:
real vs. imaginary
open vs. closed
2. Mnemonic opposites:
oasis vs. desert
monsoon (rainy season) vs. dry season
tupelo/sourgum (swamp trees) vs. cactus
3. Antonyms (i.e., scalar opposites):
fast vs. slow
big vs. small
high vs. low
hot vs. cold
4. More specific scalar concepts:
hot vs. warm vs. cool vs. cold
gigantic vs. large vs. small vs. tiny
bright vs. light vs. dim vs. dark vs. pitch black
There will be some cases in which the distinction between binary and scalar opposition is not clear. This generally occurs when a concept can be binary in one context and scalar in another. For example, the concepts 'wet' and 'dry' are in binary opposition in an example such as "The table is wet/dry", since, if the table is not wet, then it must be dry - there is no middle ground. However, in a case like "The climate here is wet/dry", the opposition is scalar, since 'not wet' does not necessarily imply 'dry', and vice versa. It's also possible for a climate to be 'average' or 'normal'; i.e., neither wet nor dry.
In situations like this, we will always implement the words using a scalar classifier, since it is less limiting. For example, the word "bokemo" will mean 'wet' and "bofomo" will mean 'dry', where "kem" is the classifier for other scalar non-relational states and "fom" is its antonym.
As we've already seen, binary and mnemonic opposites can be created using "ju-", but how do we deal with scalar opposites? In other words, how do we indicate a more specific degree of a scalar concept?
In the interlingua, we will start by creating five new modifying root morphemes that can provide the necessary additional detail. These will hardly ever be needed with scalar concepts to represent unique words from natural languages, since most people will prefer to use external modifiers such as "very", "not too", "hardly", etc. But there will be times when these concepts will be needed in word derivations.
Here are the morphemes that we will use in the interlingua:
bi-   'maximally', 'extremely', 'utmost'
ke-   'very', 'highly'
can-  'average', 'typical', 'midway'
fo-   'not too', 'not very'
zu-   'minimally', 'barely', 'hardly'
We will refer to all of the above morphemes as polarity modifiers.
It's important to emphasize that these five morphemes cannot be used for the creation of one-time, one-shot, or off-the-cuff words (i.e. what linguists refer to as "nonce" words), unless the root being modified is a scalar non-relational state (or one of the few other inherently scalar roots that we will discuss later). For all other roots, these morphemes are non-productive and thus these roots must have valid dictionary entries in all target languages. For scalar non-relational states, the dictionary will contain a polarity derivation only if the target language has a unique word for it. For example, the word meaning "cool" will have an entry in the English dictionary but not in the Arabic dictionary.
When a polarity modifier is not used, the default interpretation for scalar non-relational states will be a normal distribution whose center is "can-". This interpretation appears to be universal among natural languages.
In addition, we will assume that the semantic space of "bi-" is a subset of the semantic space of "ke-", and that the semantic space of "zu-" is a subset of the semantic space of "fo-". We'll see examples of this below.
Using the above, we can start with the word "feculo" meaning 'hot' and create words such as "bifeculo" = 'torrid/scorching', "fofeculo" = 'warm', "zufeculo" = 'lukewarm/tepid', and so on.
However, as I stated above, natural languages hardly ever create distinct words to represent such concepts, depending instead on external modification. To make matters worse, the derivations may only be approximate. For example, we could also gloss "bifeculo" as either 'blistering' or 'scalding', but these all have implications beyond basic 'hotness', since they imply manner as well as degree of heat. Actually, the gloss 'torrid' is also somewhat inappropriate, since it has connotations of both 'dryness' and 'climate'.
Keep in mind, though, that this lack of precise English counterparts is not a problem at all. As long as the semantics of the derivations are precise, there will never be any doubt about their meaning, even though a particular derivation may not have an exact counterpart in a particular natural language. As I mentioned earlier, it is almost always impossible to find exact matches for a word in different languages. Also, the above derivations are actually more useful than the English counterparts, since they are slightly more general and can be used in more contexts. Specific implications such as 'climate' or 'dryness' are either obvious from context, or can be made obvious, if necessary, by further elaboration.
Here are some examples using relational verbs:
P/F-s:
"zunxumyu" = 'like', 'similar to', 'analogous to'
"bizunxumo" = 'identical'
"zunbimyu" = 'dissimilar to/different from' (classifier "bim" is the antonym of "xum")
[The above English words are sometimes used with a reciprocal sense. For example, in "This is a similar problem", we would use the word "zunxumo" for the adjective "similar". However, for "We had two similar problems", we really should use "zunxumawno" (where "-awn" is the reciprocal suffix) if the two problems were similar to each other. However, "zunxumo" is more general since it is less specific and, thus, includes both senses.]
Compare the above derivations with derivations using "dap" and "fes":
P/F-s:
"dapa" = copula 'to be'
"dapo" = 'same', 'equal'
"feso" = 'not the same', 'unequal', 'different', 'other' ("fes" is the binary opposite of "dap")
In other words, two things can be either 'equal' or 'unequal', but not 'more equal' or 'less equal'.
For a binary state, one of the poles is often scalable, even though the opposite pole is not. For example, there are several degrees of 'pregnancy', 'openness', 'fullness', and 'inebriation', but the same does not apply to their opposites 'non-pregnant', 'closed', 'empty', and 'sober'.
Here are some useful examples derived from the binary P-s adjective root "bekepo", meaning 'intelligent', and its opposite "bekolo", meaning 'non-intelligent/mindless':
bekepo = intelligent, having intelligence
bibekepo = brilliant, genius, exceptionally intelligent
kebekepo = smart, bright, sharp, very intelligent
canbekepo = of average/medium intelligence
fobekepo = stupid, dumb, obtuse, doltish
fobekepi = ignoramus, dolt, dunce, dope, dumbbell
zubekepo = moronic, retarded, idiotic, dim-witted, feebleminded, simpleminded
zubekepi = idiot, simpleton, dullard, dimwit, nitwit, moron, imbecile, half-wit
bekolo = non-intelligent, lacking intelligence, unintelligent, mindless
Note that someone who is 'genius' is also 'bright', but someone who is 'bright' is not necessarily 'genius'. Thus, "bi-" derivations are a subset of "ke-" derivations. For the same reasons, "zu-" derivations are a subset of "fo-" derivations, and "can-" derivations are a subset of the unmarked case. In fact, all derivations except those using "ju-" or an antonymic classifier are subsets of the unmarked case, because "very intelligent" people, "barely intelligent" people, and so on are still "intelligent".
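The subset relations just described can be written down as a small lookup. The Python fragment below is only an illustrative sketch: the table "BROADER" and the function "entails" are hypothetical names, and the empty string stands for the unmarked form. The modifier "ju-" and the antonymic classifiers are deliberately left out, since they are not subsets of the unmarked case.

```python
# Each polarity modifier points to the next broader form whose semantic
# space contains it; "" is the unmarked form (e.g. "bekepo" itself).
BROADER = {"bi": "ke", "ke": "", "can": "", "fo": "", "zu": "fo"}


def entails(narrow: str, broad: str) -> bool:
    """True if anything described by narrow+root is also described by broad+root."""
    if narrow != "" and narrow not in BROADER:
        raise ValueError(f"not a polarity modifier: {narrow!r}")
    current = narrow
    while True:
        if current == broad:
            return True
        if current == "":
            return False  # reached the unmarked form without finding `broad`
        current = BROADER[current]


print(entails("bi", "ke"))  # True:  a genius is also bright
print(entails("ke", "bi"))  # False: a bright person need not be a genius
print(entails("zu", ""))    # True:  'barely intelligent' is still 'intelligent'
```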
Finally, a completely different kind of opposite can be derived by means of the inverse grammatical voice change (suffix "-ang"). These words will all be derived from P/F-s state verbs, since they indicate a relationship between two entities. Here are some examples:
Active                 Inverse
------------           --------------
to own                 to belong to
to be inside of        to enclose
to be a part of        to consist of
to produce/lead to     to be the result/outcome of
Opposites of this type are normally referred to as converses.
Counts (also called quantifiers) and measures are inherently stative because they provide more information about the state of an entity. Consider the following:
He saw students.
He saw tall students.
He saw three tall students.
He saw three 6-foot tall students.
Each use of a count or measure reduces the number of possible referents, just as if they were adjectives. Thus, counts and measures are inherently stative - they just happen to be quantitative rather than qualitative.
In the interlingua, a numeric quantity, regardless of magnitude or complexity, will be implemented as a single root whose syllables are compositional, and whose classifier is "kum". Each stand-alone numeric word will have the following format:
faw-   minus sign (default = positive)
--     cardinal (This is the default.)
-bye   ordinal
-ci    previous, minus one-th ordinal
-je    next, plus one-th ordinal
-da    N-ary, Nth in importance, rank, or value
-zen   N-tuple, N-fold, N of a kind, N in one
-ku    N at a time, N per group, in groups of N
Numeric components:
ju-    zero
ba-    one
xe-    two
di-    three
co-    four
tu-    five
za-    six
tay-   seven
fi-    eight
ko-    nine
Numeric linkers:
-boy-  decimal point
-xo-   positive exponent
-twa-  negative exponent
-fu-   real/imaginary separator
-tin-  fraction, X/Y
By default, all numeric words formed with classifier "kum" will be P-s.
Here are some examples:
jukumo = 0, no
bakumo = 1
babyekumo = 1st
badakumo = primary, main, chief
bazenkumo = single
dikumo = 3
dibyekumo = 3rd
didakumo = tertiary
dizenkumo = triple, threefold
xekokumo = 29
xekobyekumo = 29th
xejujukumo = 200
xejuxekumo = 202
xeboytukumo = 2.5
fawxeboytukumo = -2.5
xetindikumo = two-thirds
tutinbaxekumo = five-twelfths
xeboytujukoxokokumo = 2.509 x 10**9
difuxekumo = 3 + i2
xefufawcokumo = 2 - i4
xeboyxefudiboycokumo = 2.2 + i3.4
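Because the syllables of a numeric root are compositional, building one is a simple character-by-character mapping. The Python fragment below is a minimal sketch under stated assumptions, not the author's software. The function name "numeric_word" is hypothetical, the input is an ordinary decimal string, and only the minus sign, the decimal point, and the fraction linker are covered. Exponents and the real/imaginary separator are omitted. The marker (for example "bye" or "zen") is assumed to go directly before the classifier "kum", as in the examples above.

```python
DIGITS = {"0": "ju", "1": "ba", "2": "xe", "3": "di", "4": "co",
          "5": "tu", "6": "za", "7": "tay", "8": "fi", "9": "ko"}
LINKERS = {".": "boy", "/": "tin"}  # decimal point and fraction only


def numeric_word(value: str, marker: str = "", ending: str = "o") -> str:
    """Build a numeric word such as 'xekokumo' from a string such as '29'."""
    syllables = []
    if value.startswith("-"):
        syllables.append("faw")  # minus sign; positive is the default
        value = value[1:]
    for ch in value:
        if ch in DIGITS:
            syllables.append(DIGITS[ch])
        elif ch in LINKERS:
            syllables.append(LINKERS[ch])
        else:
            raise ValueError(f"unsupported character: {ch!r}")
    return "".join(syllables) + marker + "kum" + ending


print(numeric_word("29"))                # xekokumo
print(numeric_word("-2.5"))              # fawxeboytukumo
print(numeric_word("5/12"))              # tutinbaxekumo
print(numeric_word("29", marker="bye"))  # xekobyekumo
```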
If a linker does not have a number to its left, then the default is assumed to be "ju" = 'zero' for decimal point and real/imaginary separator, and "ba" = 'one' for all the other linkers. For example:
boytutaykumo = 0.57
tindikumo = one-third
xofikumo = 1 x 10**8
twafikumo = 1 x 10**-8
fufawxeboycokumo = 0 - i2.4
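The default rule for linkers is easy to apply mechanically when reading a numeric word back. The following Python fragment is a hedged sketch: the function "decode" and its output notation are my own assumptions, chosen only for illustration. It handles plain cardinals ending in "kum" plus a one-letter ending, and rejects words that carry a marker such as "bye". A negative imaginary part is rendered literally (for example "0 + i-2.4") rather than being normalized to "0 - i2.4".

```python
import re

DIGITS = {"ju": "0", "ba": "1", "xe": "2", "di": "3", "co": "4",
          "tu": "5", "za": "6", "tay": "7", "fi": "8", "ko": "9"}
LINKERS = {"boy": ".", "xo": "e+", "twa": "e-", "fu": " + i", "tin": "/"}
# Value assumed when a linker has no number to its left.
DEFAULT_LEFT = {"boy": "0", "fu": "0", "xo": "1", "twa": "1", "tin": "1"}

# Try longer syllables first so that "tay" and "twa" are never split.
_SYLLABLES = sorted([*DIGITS, *LINKERS, "faw"], key=len, reverse=True)
_TOKEN = re.compile("|".join(_SYLLABLES))


def decode(word: str) -> str:
    """Turn a cardinal such as 'boytutaykumo' into a string such as '0.57'."""
    if not re.fullmatch(r".+kum[oiae]", word):
        raise ValueError(f"not a numeric word: {word!r}")
    root = word[:-4]  # strip "kum" plus the one-letter ending
    tokens = _TOKEN.findall(root)
    if "".join(tokens) != root:
        raise ValueError(f"unrecognized syllable in: {root!r}")
    out = []
    prev_is_digit = False
    for tok in tokens:
        if tok in DIGITS:
            out.append(DIGITS[tok])
            prev_is_digit = True
        elif tok == "faw":
            out.append("-")
            prev_is_digit = False
        else:
            if not prev_is_digit:
                out.append(DEFAULT_LEFT[tok])  # fill in the default
            out.append(LINKERS[tok])
            prev_is_digit = False
    return "".join(out)


print(decode("boytutaykumo"))  # 0.57
print(decode("tindikumo"))     # 1/3
print(decode("xofikumo"))      # 1e+8
```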
Now, we also need a way to represent non-specific numeric quantities, such as 'many', 'few', 'all', and so on. Since these are inherently scalar, the ideal approach is to use the scalar polarity modifiers. Here are the results:
bikumo    all, every, the whole amount of, the maximum amount possible of
kekumo    many, much, lots of, a lot of, a large amount of, numerous, plenty of
cankumo   several, some, a number of, a moderate/average/typical amount of
fokumo    a few, a little, a small amount of, not too many, not too much
zukumo    very few, very little, a tiny/minimal amount of, hardly any, almost no
kumo      any, some, an unspecified number/quantity/amount of
[Technically, "kumo" should really be "byekumo". However, if we used "byekumo", then "kumo" would be useless because numeric derivations follow a paradigm. Thus, we will use "kumo" instead of "byekumo".]
Here are some examples:
I saw kekumo bodami = I saw many ducks.
I'd like fokumo bocivi, please. = I'd like a little water, please.
There's zukumo soup in the pot. = There's almost no soup in the pot.
Note how, in the last two examples, less specific numerics can also be used to modify mass nouns. In fact, we will adopt the convention that the less specific numerics will have a mass interpretation when modifying mass nouns and a count interpretation when modifying count nouns. For example, "fokumo" will mean 'a few' when applied to count nouns and 'a little' when applied to mass nouns. Thus, "fokumo bodami" means 'a few ducks' while "fokumo bocivi" means 'a little water'.
Specific numerics, however, must always have a count interpretation, since a mass interpretation would not make sense. For example, "dikumo soup" means 'three units/portions/servings of soup' or simply 'three soups', where the unit/portion/serving size is known from context. In effect, the specific numeric forces a count interpretation.
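For a translation program, this convention amounts to a small lookup keyed on the kind of noun being modified. The Python fragment below is a sketch only. The function names, the "count"/"mass" labels, and the choice of a single English gloss per cell are assumptions made for illustration. The glosses themselves are taken from the list of non-specific numerics above.

```python
NON_SPECIFIC = {"bikumo", "kekumo", "cankumo", "fokumo", "zukumo", "kumo"}

# One illustrative English gloss per interpretation, taken from the list above.
GLOSSES = {
    "kekumo": {"count": "many", "mass": "much"},
    "fokumo": {"count": "a few", "mass": "a little"},
    "zukumo": {"count": "very few", "mass": "very little"},
}


def interpretation(numeric: str, noun_kind: str) -> str:
    """Return 'count' or 'mass' for a numeric modifying a noun of noun_kind."""
    if noun_kind not in ("count", "mass"):
        raise ValueError(f"unknown noun kind: {noun_kind!r}")
    if numeric in NON_SPECIFIC:
        return noun_kind  # non-specific numerics follow the noun
    return "count"        # specific numerics force a count reading


def gloss(numeric: str, noun_kind: str) -> str:
    """Pick an English gloss for a non-specific numeric."""
    return GLOSSES[numeric][interpretation(numeric, noun_kind)]


print(gloss("fokumo", "count"))          # a few     (fokumo bodami)
print(gloss("fokumo", "mass"))           # a little  (fokumo bocivi)
print(interpretation("dikumo", "mass"))  # count     (dikumo soup)
```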
If the fraction linker "-tin-" does not have a string to its right, it will be assumed to be 'all'. When this occurs, only a polarity modifier may precede "-tin-". Here are some examples:
ketinkumo = most, a large fraction of, a majority of
cantinkumo = about half
fotinkumo = a small fraction of, a minority of
zutinkumo = almost none of, a tiny fraction of
There will also be times when we will need to treat a count noun as if it were a mass noun and vice versa. Here are some examples:
Count-to-mass: He ate a lot of duck (i.e., a large quantity of duck meat).
Mass-to-count: He owns a lot of rubies. ('Ruby' is a natural substance and is a mass noun by default.)
In the interlingua, we will allocate two suffixes to change the default count/mass interpretation of a noun root:
-ep   change to mass noun
-op   change to count noun
We will also need a non-specific numeric to indicate plurality:
li- plural, more than one
In the interlingua, count and group nouns will always be assumed to be singular unless "li-" is used:
bodami = the duck
libodami = the ducks
Note that "li-" can also be prefixed directly to the numeric classifier "-kum" to create the separate word "likumo" meaning 'multiple'. Note also that "li-" is specific and will automatically convert a mass noun to a count noun.
Now, there will be times when the number of a noun phrase is not known. This will not happen when a person is speaking the language, but may happen when a computer is translating from a natural language to the interlingua. (Please keep in mind that this monograph is actually the reference document for an interlingua intended primarily for use in machine translation.) When a machine translation program cannot determine the number of a noun phrase in the source language, it should modify the noun with the special particle "kujopo". This word will behave syntactically as an adjective.
[Note that "-jop" is the classifier reserved for words called "particles". Particles have special syntactic and/or semantic properties that place them outside the general classificational system. Because of this, they rarely undergo further derivation. We'll see more examples of this classifier later.]
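In a translation program, the rules for number marking reduce to a three-way choice. The Python fragment below is a minimal sketch, not a description of any existing software. The function "mark_number" is a hypothetical name, "None" stands for the case where the program cannot determine the number, and placing "kujopo" before the noun is an assumption based on the position of adjectives in examples such as "kekumo bodami".

```python
from typing import Optional


def mark_number(noun: str, plural: Optional[bool]) -> str:
    """Mark a count or group noun as singular, plural, or of unknown number."""
    if plural is None:
        # Number could not be determined from the source text.
        return "kujopo " + noun
    if plural:
        return "li" + noun
    return noun  # singular is the default and is left unmarked


print(mark_number("bodami", False))  # bodami
print(mark_number("bodami", True))   # libodami
print(mark_number("bodami", None))   # kujopo bodami
```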
The noun forms of ordinals can be used to represent the specific members of a sequence, as in:
I need number seven and number thirteen.
where "number seven" is "taybyekumi" and "number thirteen" is "badibyekumi". Note that the above can also be paraphrased as "I need the seventh one and the thirteenth one".
The scalar polarity modifiers can also be used with the ordinal marker "bye" to express a non-specific range within the range of possibilities, as follows:
bibyekumo    'last', 'final', 'highest in sequence', 'at the very end of a sequence'
kebyekumo    'later', 'high in a sequence', 'near the end'
canbyekumo   'intermediate', 'middle', 'midway', 'midmost', 'halfway', 'midway in a sequence'
fobyekumo    'early', 'early in a sequence', 'near the beginning'
zubyekumo    'first', 'initial', 'lowest in sequence', 'at the very beginning of a sequence'
byekumo      'sequential', 'ranked', 'graded'
When dealing with sequences, it's often very useful to be able to specify the next or previous element in the sequence. We will accomplish this by allocating two new modifiers:
ci-   previous, minus one-th ordinal
je-   next, plus one-th ordinal
Note that the ordinal marker "bye" is not needed with these, since they are inherently ordinal. Here are some examples:
cikumo       'previous/last'
jekumo       'next'
xecikumo     'previous two', 'last two'
badijekumo   'next thirteen'
If an ordinal is focused, the focus will indicate the sequence of which P is a part, as in the following example:
P-s     dibyekumo      third
P-s     dibyekuma      to be third
P/F-s   dibyekumunza   to be third in F, to be in the third position of sequence F
The last example could be used in a sentence such as "John dibyekumunza the line", meaning 'John is third in the line.'
We can do similar things with cardinal numbers:
P-s:
xekumo 'two'
xekuma 'to be two in number'
dikuma 'to be three in number'
xeboydikuma 'to be 2.3 in quantity'
P-d:
xekumupa 'to become two in number'
A/P-d:
xekumapa 'to split/break up/divide into two'
kumapa 'to divide/partition/separate'
As nouns, cardinal numbers represent the concept "N entities" or "an N-some":
I have kokumo copies left = I have nine copies left.
Please give me zakumi = Please give me six.
I met dikumi yesterday = I met the threesome/trio yesterday.
Earlier, we discussed how the focus of basic scalar state verbs could elaborate the state, as in the following examples:
Saudi Arabia is rich vs. Saudi Arabia is rich in oil.
It's also possible to be even more precise, as in:
John is rich vs. John is rich to the tune of 3 million dollars.
Here, the argument "3 million dollars" is simply the focus of the P/F-s verb meaning 'to be rich'.
In other words, any scalar state that can have different degrees of measurable intensity can be the root of a P/F-s verb that indicates the degree of the state. Here are some more English examples:
P-s: John is tall.
P/F-s: John is tall 6 feet = John is 6 feet tall.
P-s: The book is heavy.
P/F-s: The book is heavy 4 kilograms = The book weighs 4 kilograms.
P-s: The opera is long (temporal).
P/F-s: The opera is long 3 hours = The opera lasts 3 hours.
P-s: The town is far.
P/F-s: The town is far 20 miles = The town is 20 miles away.
Thus, there is no need to create special roots meaning 'to last', 'to weigh', 'to have a volume of' and so on. We simply need to focus the appropriate P-s state verbs and provide a specific measurement as the focus argument.
Note that English has only a few verbs such as "to weigh" or "to last". It does not have similar equivalents for most of its measure words. For example, we say "He is very tall" - not "*He heights very much", or "The rope is too long" - not "*The rope lengths too much". The system presented here allows you to derive verbs for any kind of measurement.
So, let's define a few roots and derive the corresponding measure verbs:
toyculo -> P-s adjective, scalar state 'long (temporal)'
toyculunza -> P/F-s verb 'to last F'
culo -> P-s adjective, scalar state 'heavy'
culunza -> P/F-s verb 'to weigh F'
zeculo -> P-s adjective, scalar state 'long (spatial)'
zeculunza -> P/F-s verb 'to be F in length'
Measurement nouns such as "weight", "age", "length", and so on can be obtained via inverse voice derivations of the corresponding verbs. For example, the English noun "weight/heft" is the inverse F/P-s noun derivation of the verb "to weigh"; i.e. "culunzangi". We can also use the quality suffix "-on" for the more general sense of "having weight"; i.e. "culunzono". Note that the inverse derivation indicates an actual value (e.g. "At that weight, he can expect to have serious health problems"), while the quality derivation indicates the quality possessed by the patient (i.e. 'weighableness'), and is probably not that useful. The unfocused "culoni", meaning 'heaviness', is probably more useful.
The case tag "toyculunze" is also useful. It means 'lasting' or 'for' as in "John was sick FOR three days".
Units of measure use the classifier "-tov". Here are a few examples:
tovi -> 'day'
zetovi -> 'meter'
bawntovi -> 'pound' (English weight measure)
[Note that, since these are basic nouns, the modifier "bawn" is used for its mnemonic value, which means it can also be used for its sound value.]
We will use numeric morphemes to represent Latin and Greek prefixes for powers of ten:
-xo-     positive exponent
-twa-    negative exponent
xe-      two
di-      three
za-      six
xodi-    kilo-
twaxe-   centi-
twadi-   milli-
twaza-   micro-
xodizetovi    kilometer
twaxezetovi   centimeter
twadizetovi   millimeter
twazazetovi   micrometer
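Since each prefix is just an exponent linker followed by the digits of the exponent, scaled units can be generated rather than listed. The Python fragment below is a minimal sketch with hypothetical function names. Spelling a multi-digit exponent digit by digit and returning the bare unit for an exponent of zero are my own assumptions, since the examples above only show single-digit exponents.

```python
DIGITS = {"0": "ju", "1": "ba", "2": "xe", "3": "di", "4": "co",
          "5": "tu", "6": "za", "7": "tay", "8": "fi", "9": "ko"}


def power_prefix(exponent: int) -> str:
    """Return the prefix for 10**exponent, e.g. 3 -> 'xodi' (kilo-)."""
    if exponent == 0:
        return ""  # assumption: no prefix for the base unit
    linker = "xo" if exponent > 0 else "twa"
    return linker + "".join(DIGITS[d] for d in str(abs(exponent)))


def scaled_unit(exponent: int, unit: str) -> str:
    """Prefix a unit word such as 'zetovi' (meter) with a power of ten."""
    return power_prefix(exponent) + unit


print(scaled_unit(3, "zetovi"))   # xodizetovi  kilometer
print(scaled_unit(-2, "zetovi"))  # twaxezetovi centimeter
print(scaled_unit(-6, "zetovi"))  # twazazetovi micrometer
```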
When a verb is modified by a number, it has the meaning "N-times" rather than "N-entities". Here is an example:
    doykavapa = 'to open'

    John doykavapa the door dikumoge
      = John opened the door three times.
It is important to note that we cannot use the P-s adverb form - we must use the "0" form (suffix "-og"). The reason is that the P-s form would imply a link to an argument of the verb, thus indicating the quantity of "Johns", which is meaningless, or the quantity of "doors", which can be expressed more easily and unambiguously using the adjective "dikumo". The "0" adverb form, however, always modifies the verb. Thus, we are, in effect, indicating the 'quantity' of the verb; i.e. the frequency of the event. [This is an important distinction that will come in handy again later, when we discuss comparatives.]
Adverbial "0" forms of the non-specific numerics are also very useful. Here are some examples:
    bikumoge   = 'always', 'all the time', 'at every opportunity'
    kekumoge   = 'often', 'frequently', 'a lot', 'many times'
    cankumoge  = 'sometimes', 'at times', 'a number of times', 'now and then'
    fokumoge   = 'occasionally', 'not too often', 'on occasion', 'a few times',
                 'from time to time', 'once in a while'
    zukumoge   = 'rarely', 'seldom', 'hardly ever', 'almost never', 'infrequently'
Also, from the specific numerics, we get:
    kumoge     = 'sometimes', 'at times', 'ever (in questions)'
    jukumoge   = 'never', 'zero times', 'not ever', 'not ... at all', 'on no occasion'
    bakumoge   = 'once', 'one time', 'on one occasion'
    xekumoge   = 'twice', 'two times'
    dikumoge   = 'thrice', 'three times'
    xebakumoge = 'twenty-one times'
The ordinal derivations are also useful:
    babyekumoge   = '(for) the first time'
    xebyekumoge   = '(for) the second time'
    xebabyekumoge = '(for) the twenty-first time'

    e.g. "Yesterday, he went to Boston dibyekumoge."
           = 'Yesterday, he went to Boston for the third time.'

    cikumoge   = the last/previous time
    jekumoge   = the next time
    dicikumoge = the last three times
    xejekumoge = the next two times

    e.g. "I'll see you when I'm in Boston jekumoge."
           = 'I'll see you the next time I'm in Boston.'
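As a rough sketch of how these adverbs are assembled, the following Python fragment builds the frequency and ordinal forms from a number. It assumes, based on "xebakumoge" = 'twenty-one times', that multi-digit numbers are written as decimal digit morphemes with the most significant digit first. The function names are hypothetical, and only the digits ba- (1), xe- (2), and di- (3) used in the examples above are included.

```python
# Digit morphemes used in the examples above (an intentionally partial table).
DIGITS = {1: "ba", 2: "xe", 3: "di"}


def digit_morphemes(n: int) -> str:
    """Spell a positive integer as a sequence of digit morphemes.

    Assumption: digits are concatenated most significant first,
    as in xeba- for 21.
    """
    if n <= 0:
        raise ValueError("expected a positive integer")
    parts = []
    for ch in str(n):
        digit = int(ch)
        if digit not in DIGITS:
            raise ValueError(f"no digit morpheme listed for {digit}")
        parts.append(DIGITS[digit])
    return "".join(parts)


def frequency_adverb(n: int) -> str:
    """'N times': digits + numeric classifier 'kum' + "0" adverb suffix '-oge'."""
    return digit_morphemes(n) + "kumoge"


def ordinal_adverb(n: int) -> str:
    """'(for) the Nth time': the ordinal morpheme 'bye' precedes 'kum'."""
    return digit_morphemes(n) + "byekumoge"


print(frequency_adverb(21))
print(ordinal_adverb(3))
```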
We can also handle noun phrases that contain both counts and measures:
    Using the open adjective "ximunzyu", meaning 'having/with/of':

        I bought dikumo bawntovi ximunzyu rice.
          or
        I bought rice ximunzyu dikumo bawntovi.
          = I bought three pounds of rice.

    A relative clause:

        I bought rice that culunza dikumo bawntovi.
          = I bought rice that weighs three pounds.

    An open adjective version of the P/F-s measure verb:

        I bought rice culunzyu dikumo bawntovi.
          = I bought rice weighing three pounds.
Other derivations of the measure verbs are also useful. Here are some examples:
    P/F-s:    The pig culunza 25 pounds.
                = The pig weighs 25 pounds.
    P/F-d:    The pig culumba 25 pounds.
                = The pig came to weigh 25 pounds.
    A/P/F-s:  He culanza the pig 25 pounds.
                = He maintains the pig at a weight of 25 pounds.
    A/P/F-d:  He culamba the pig 25 pounds.
                = He changed the pig's weight to 25 pounds.
                  [Literally: He caused the pig's weight to become 25 pounds.]
    AP/F-s:   The pig culinza 25 pounds.
                = The pig keeps itself at 25 pounds weight.
    AP/F-d:   The pig culimba 25 pounds.
                = The pig changed its weight to 25 pounds.
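The regularity of these six derivations can be made explicit with a small lookup table. The sketch below simply records the suffixes that appear in the examples above and attaches them to a measure root. The names `MEASURE_SUFFIXES` and `measure_verb` are hypothetical, and the table makes no claim about how the suffixes decompose internally.

```python
# Suffixes read directly off the six "cul-" examples above.
MEASURE_SUFFIXES = {
    "P/F-s": "unza",    # culunza: weighs
    "P/F-d": "umba",    # culumba: came to weigh
    "A/P/F-s": "anza",  # culanza: maintains P at a weight
    "A/P/F-d": "amba",  # culamba: changed P's weight
    "AP/F-s": "inza",   # culinza: keeps itself at a weight
    "AP/F-d": "imba",   # culimba: changed its own weight
}


def measure_verb(root: str, frame: str) -> str:
    """Attach the verb suffix for the given argument frame to a measure root."""
    try:
        return root + MEASURE_SUFFIXES[frame]
    except KeyError:
        raise ValueError(f"unknown argument frame: {frame}") from None


for frame in MEASURE_SUFFIXES:
    print(f"{frame:8} {measure_verb('cul', frame)}")
```

Applied to the other roots defined earlier, the same table yields "toyculunza" ('to last F') and "zeculunza" ('to be F in length').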
If we want to create versions of the English verbs that actually measure a state, such as "to time" as in "He timed the performance" or "to weigh" as in "He weighed the rice", then we need to augment the basic state. In the interlingua, we will accomplish this with the special suffix "-ayg". When added to a root, it will convert the word to one with the meaning 'to determine or measure the root state'. The result will be an AP/F-d verb by default. Here are some examples:
    toyculayga = 'to time', 'to measure/determine the duration of'
    culayga    = 'to weigh', 'to measure/determine the weight of'
    zeculayga  = 'to measure/determine the (spatial) length of'
    zogayga    = 'to locate', 'to find', 'to determine the location of'
    dapayga    = 'to compare', 'to determine the sameness/equality'

Note that "dapayga" is equivalent to "fesayga" (meaning 'to determine the differentness'). In both cases, we must use the secondary focus case tag "tomege" for the second item that is being compared.
We can also create the general derivation "tomayga", meaning 'to measure'. Thus, for example, the word "toyculayga" is equivalent to "tomayga toyculunzangi", meaning 'to measure the duration'.
Distance and time measures often specify a relative position or direction, as in the sentence "The chair is two meters from the window". To handle this in the interlingua, we simply use previous-word modifier forms (part-of-speech marker "-ay") of the distance or time measure to modify the main locative or temporal relation. Here are some examples (where "zetovi" means 'meter', and "xekumo" means 'two'):
    John was sitting fage the door.
      = John was sitting away from the door.
    John was sitting fage-zetovay-xekumay the door.
      = John was sitting two meters from the door.

    The chair faga the window.
      = The chair is away from the window.
    The chair faga-zetovay-xekumay the window.
      = The chair is two meters from the window.
We'll use the same technique later with temporal deictics.
Other words with meanings such as "to be to the left of", "to be beneath", "to be after", and so on can also be used in place of "fage" or "faga", as in "I left two hours before the meeting ended".
Later, we'll discuss more complex forms of modification, such as in "He arrived two hours AND twenty minutes after me".
Finally, do not confuse measure words with specific entities that have precise measures, such as the named time periods "September", "Tuesday", and "1994". These are proper nouns and we'll discuss how to deal with them later.
It would also be useful to have a separate numeric morpheme to indicate the concept 'N at a time' or 'N per group':
    -ku    N at a time, N per group, in groups of N

    bakukumo    = each, every, a group of one
    xekukumo    = both, a group of two
    ...
    batukukumo  = a group of fifteen
    ...
    bakukumoge  = one at a time, one by one
    xekukumoge  = two at a time, two by two, in groups of two
    bikukumoge  = all at once, all at the same time, all together
When we modify the verb with an adverb formed with the "0" suffix "-og", we are indicating that the event itself is being performed N times simultaneously; i.e., that the event itself is a group of sub-events. Thus, we get the sense of 'N at a time'. In other words, we are linking to the verb itself - not to an argument of the verb.
Note that we also used "ku" earlier when we discussed group nouns. We are simply extending its meaning for use with numbers. Note also that "ku" is reminiscent of the numeric classifier "kum".
Now, in the interlingua, we will use the classifier "-bel" to designate arithmetic functions. These words will be P/F-s by default, where P will represent the result of the function and F will represent the argument. If an argument has more than one component, then they will be linked by the conjunction meaning 'and'. Here are some examples:
    danbel    = addition
    kozaynbel = cosine

    Tukumi danbela dikumi and xekumi
      = The sum of three and two is five.
        (Literally, "five is the sum of three and two".)

    Fawbakumi kozaynbela "pi"
      = The cosine of pi is minus one.
        (Literally, "minus one is the cosine of pi".)
Other parts of speech can be derived in the usual way. For example, the open noun phrase "kozaynbelaw doytukumi" means 'the cosine of zero point five'.
The agentive form is also useful, but the word order is different from that of English. For example:
I danbelamba dikumi bakumi and xekumi = I added two to one to get three. (Literally, "I caused the sum three to be from one and two".)
For scalar states, the focus positions the state within a range of possibilities. For example:
    The man is old    vs.  The man is old (focus =) 90 years.
    The man is tall   vs.  The man is tall (focus =) 2 meters.
    The man is rich   vs.  The man is rich (focus =) $10,000,000.
For numeric states, the number itself is the position within a range of possibilities. In other words, the state is simply 'having quantity' while the numeric value indicates the precise quantity or "position" within the numeric state. In effect, the specific numeric value is the focus of a generic numeric state which has been incorporated into the root to make it more precise. Thus, it makes no sense to use polarity prefixes with specific numbers because numbers are polarity markers themselves, only more precise. However, it does make sense to use the polarity prefixes with the bare numeric root "kum", as we did earlier (e.g. "bikumo" = 'all/every', "kekumo" = 'many/much', etc).
Also, since numbers are, in effect, the focus of a generic numeric state, an explicit focus would make sense only if it provided more detail about the number, which is either semantically impossible or completely useless, depending on how you look at it. In other words, once a specific value is used with "kum", the result is inherently unfocusable because the numeric value itself is the effective focus.
However, numbers are often used to indicate a quantity that is part of a larger group, as in "I need two of those oranges". In a situation like this, we cannot use "ximyu" because it implies a genitive relationship; i.e. "two ximyu the oranges" would mean something like "the oranges' two", which is not the meaning we are interested in here.
Because of this, we will allow numbers to be focused to indicate a partitive relationship. Here are some examples:
    P/F-s:    Here are xekumunzaw the oranges
                = Here are two of the oranges.
    A/P/F-d:  I bakumamba the wood those crates
                = I made the wood into one of those crates.
                  (Literally: "I caused the wood to become one of those crates".)
While the above may not be semantically correct (and I'm not sure myself), we have implemented a very useful form that would otherwise go unused. [Later, we'll discuss the semantics of a more generally applicable partitive relationship.]
A deictic word is one whose referent is determined by the speech context. For example, in the sentence "I ate here yesterday", there are three deictic words:
    1. "I" - The actual referent depends on WHO uttered the sentence.
    2. "here" - The actual location depends on WHERE the sentence was uttered.
    3. "yesterday" - The actual time depends on WHEN the sentence was uttered.
Deictics are inherently unfocusable - not because there is no referent - but because the referent can never be stated explicitly. It is always determined by the speech environment.
What's especially fascinating about deictics is the strong relationship between their forms and their meanings in many natural languages, as well as the strong relationship between the meanings of deictics that, on the surface, appear to be completely unrelated. For example, most natural languages have a three-way distinction between personal pronouns, deictic locatives, and demonstratives:
    1st person:   I/we              here      this/these
    2nd person:   you               there     that/those
    3rd person:   he/she/it/they    yonder    yon

Standard English rarely uses "yon" and "yonder" anymore, but they used to be quite common. Also, languages that make the three-way distinction for locatives and demonstratives generally do it in the following way:

    this or here   -> at or near the speaker
    that or there  -> at or near the addressee
    yon or yonder  -> away from both speaker and addressee
Note that 1st person is the speaker, 2nd person is the addressee, and 3rd person is neither the speaker nor the addressee. For example, Japanese is fairly typical of how many languages use the same forms for both demonstratives and locatives:
                 near speaker     near addressee    far from both
                 ------------     --------------    -------------
    adjective    this - "kono"    that - "sono"     yon - "ano"
    pronoun      this - "kore"    that - "sore"     yon thing - "are"
    locative     here - "koko"    there - "soko"    yonder - "asoko"
While not perfectly regular in the modern language, they all evolved from the same roots. English also has a historical link between "this/here", "that/there", and "yon/yonder", although it is less regular. An even better example, though, is Cambodian where the word "nih" means either 'this' or 'here', and the word "nuh" means either 'that' or 'there'. And in Turkish, the same root is used to derive the third person pronouns meaning 'he/she/it/they', the demonstrative meaning 'that', and the locative meaning 'there'.
As it turns out, this correlation between form and meaning, and the obvious link to 1st, 2nd, and 3rd person referents is quite common among the world's languages.
Another major difference between deictics and other words is that deictics do not indicate, in any way at all, the nature of their referents. For example, on hearing the noun "duck", we immediately know a lot about the referent. However, the pronouns "you" or "that" or the adjectives "my" or "this" or the locatives "here" or "yonder" tell us nothing about their referents. Instead, they simply 'point to' the actual referent.
Deictics are also different from open class words such as nouns and verbs because there are very few of them, and because new ones rarely enter a language. For example, new nouns are adopted by a language quite often, while deictics are the result of slow and gradual language evolution that can take centuries.
Incidentally, since the referents of deictic expressions are effectively 'indexed' by the location of the speaker and the addressee, deictics are also sometimes called indexicals, and deixis (i.e. the phenomenon itself) is sometimes referred to as indexicality. Also, words that are members of small, closed groups, such as pronouns, demonstratives, tense-aspect words, and articles are called closed class words, while words that are members of large, open groups, such as nouns and verbs, are called open class words. It's rare when a closed class word enters or leaves a language, whereas open class words change frequently.
In the next few sections, I will describe a highly regular system that can be used to implement personal pronouns, possessive adjectives, possessive pronouns, demonstratives, and deictic locative and temporal words.
In the interlingua, I will implement deictics by allocating a set of root morphemes that are mnemonically compositional. In other words, deictics will be formed from true, unique root morphemes, but we will design them in a way that will display their inherent compositionality.
For personal pronouns and possessives, the basic components will be as follows:
    1:      ba-
    2:      xe-
    3:      di-                pronoun:   -v
    1+2:    co-      plus
    1+3:    tu-                genitive:  -m
    2+3:    za-
    1+2+3:  tay-
The first three are inherently singular, and the remaining four are inherently plural.
All deictics are P-s by default.
Here are the derivations that correspond to the English personal pronouns and possessive adjectives:
    1:      I = bavi,            my = bamo,              mine = bami
    2:      you = xevi,          your = xemo,            yours = xemi
    3:      he, she, it = divi,  his, her, its = dimo,   his, hers, its = dimi
    1+2+3:  we = tayvi,          our = taymo,            ours = taymi
The 3rd person forms will not be used very much, since it will almost always be more appropriate to use anaphora. I'll have more to say about this later in the chapter on Anaphora.
Note that the 1+2+3 form "tayvi" is being used for English "we". This is because English "we" includes the speaker and any others whether they are present or not.
The second and third person forms can be made plural by using the plural prefix "li-", which we introduced earlier:
    lixevi = you (plural), you all
    lidivi = they/them
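Because the paradigm is fully regular, it can be generated mechanically. The Python sketch below is illustrative only: the function name `personal_form` and the category labels are hypothetical, and the endings are taken from the examples above ("-vi" for pronouns, "-mo" for possessive adjectives, "-mi" for possessive pronouns).

```python
# Person prefixes from the table of basic components.
PERSON = {
    "1": "ba", "2": "xe", "3": "di",
    "1+2": "co", "1+3": "tu", "2+3": "za", "1+2+3": "tay",
}

# Class letter plus part-of-speech vowel, as seen in bavi / bamo / bami.
ENDING = {
    "pronoun": "vi",
    "possessive adjective": "mo",
    "possessive pronoun": "mi",
}


def personal_form(person: str, kind: str, plural: bool = False) -> str:
    """Build a personal pronoun or possessive from its components."""
    word = PERSON[person] + ENDING[kind]
    if plural:
        # The text applies the plural prefix "li-" only to 2nd and 3rd person.
        if person not in ("2", "3"):
            raise ValueError("'li-' is shown only for 2nd and 3rd person forms")
        word = "li" + word
    return word


print(personal_form("1", "pronoun"))
print(personal_form("1+2+3", "possessive adjective"))
print(personal_form("3", "pronoun", plural=True))
```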
[Some languages use the 2nd person plural form with singular referents to indicate politeness. We'll discuss how to do this later.]
If gender must be specified, we will use the following prefixes:
    male:    loy-
    female:  law-
We have also given a special interpretation to the 1+2+3 form. It will be interpreted as either 1+2, 1+3, or 1+2+3. This will make it conform to natural language universals, since a true 1+2+3 form does not seem to exist in any natural language, whereas forms do exist in many languages for the interpretation we are using here (such as English "we").
Quite a large number of languages have two 1st person plural pronouns. For example, in Indonesian, "kita" has the same coverage as English "we". The second pronoun, "kami", however, explicitly excludes the addressee(s):
    kami = tuvi    1+3, speaker plus one or more others who are not present, but not the addressee(s)
Pronouns which include the addressee(s) are called inclusive, while pronouns which exclude the addressee(s) are called exclusive.
Some languages (e.g. Cambodian and several languages of New Guinea) even have versions of 3rd person pronouns that are unspecified for number, as well as 2+3 forms. The system presented here allows us to create any of these pronouns with total regularity and with whatever degree of precision (or lack of precision) that we need.
Some languages have dual (= exactly 2), and a few languages even have trial (= exactly 3), and paucal (= a few) forms of their personal pronouns. We will not create special words for these in the interlingua because they are very rare and because they are almost always used as anaphora, which we will handle differently. If necessary, however, we can modify the pronoun with a numeric word:
    Dual:    xevi xekumo = 'the two of you', 'you two'
    Trial:   xevi dikumo = 'the three of you', 'you three'
    Paucal:  xevi fokumo = 'the few of you'

The P-s verb form will have the meaning 'to be X'. Thus, the verb "bava" means 'to be me' (e.g. "Bava the culprit" = 'The culprit is me') and "bama" means 'to be mine' (e.g. "The pencil bama" = 'The pencil is mine').
Adjectival forms can be used to handle expressions such as "You boys" in "You boys better behave yourselves", where 'You' would be "xevo" and would modify the noun meaning 'boys'.
Adverbial forms will have English translations that use "being", as in "Being yours, the car is probably a piece of junk", where "being yours" would be represented by "xeme".
Open nouns, open adjectives, and open previous-word modifiers are meaningless because deictics are inherently unfocused.
Other verb forms can be used to represent such concepts as P-d "bamupa" = 'to become mine', A/P-d "xemapa" = 'to make P yours', etc.
For demonstratives, we will use the same initial modifying root morphemes as for personal pronouns, plus the letter 's':
    1:      ba-
    2:      xe-
    3:      di-
    1+2:    co-      plus    -s
    1+3:    tu-
    2+3:    za-
    1+2+3:  tay-
Here are the English equivalents:
    this = baso
    that = xeso
    yon  = diso
The plural prefix "li-" should be applied to the head noun, rather than to a demonstrative adjective. For example, "libodami baso" means 'these ducks'. However, "li-" can be applied to a demonstrative noun, as in "I like libasi" = 'I like these'.
Since demonstratives often have strong locative implications, it will not be very useful to interpret the compound forms, such as 2+3, as 'that' and 'yon'. Instead, we will interpret it as 'that' or 'yon'. Thus, if we do not want to make the 'that/yon' distinction, we can use the 2+3 forms:
that = zaso
Some languages have other versions. For example, 1+2 demonstratives are found in Sre (Vietnam) and Chibemba (Africa). I do not know of any language that has a 1+3 demonstrative.
The basic verb forms can represent P-s concepts such as 'is this (one)' and 'are those'. For example, "Your boat basa" would mean 'Your boat is this one'.
Other verb forms can also be useful ('to become this entity', 'to make something into that entity', etc.). For example, the A/P-d version of the 3rd person demonstrative, "disapa", would be used to represent "to turn ... into that" in a sentence such as "I TURNED the scrap lumber INTO THAT".
For locatives, we will use the same modifying root morphemes as for personal pronouns, plus the letter 'l':
    1:      ba-
    2:      xe-
    3:      di-
    1+2:    co-      plus    -l
    1+3:    tu-
    2+3:    za-
    1+2+3:  tay-
Here are the English equivalents:
    bale = here
    xele = there (near listener)
    dile = over there, yonder
If you do not want to make the 'there/yonder' distinction, use the 2+3 forms, as we did for demonstratives:
there = zale
The 1+2+3 form implies 'here or there or yonder', or simply 'somewhere'. The 1+3 form means 'here or over there'.
The basic verb forms can represent such concepts as 'here is', 'there are', etc. For example, the P-s verb "bala" would mean 'here is' or 'to be here' in a sentence such as "Here's Bill" or "The books you want are here". However, English speakers should be careful not to confuse the 2nd + 3rd or 3rd person deictic constructions with the P-s verb "kava", discussed earlier, which does not refer to a particular location. Consider the following:
    dila:  There are the books you wanted.
    kava:  There are people who actually like you.

Adjective forms are also useful. The following examples contrast the adverb "dile" with the adjective "dilo":

    dile:  I saw Sally over there (= I was over there when I saw her).
    dilo:  Do you see Sally over there (= the Sally standing over there right now)?
    dilo:  The man over there married my sister.
In the last two examples, the adjective "dilo" modifies the nouns "Sally" and "man".
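Since the deictic families introduced so far all follow the pattern 'person prefix + class letter + part-of-speech vowel', a simple deictic word can be decoded mechanically. The Python sketch below is only an illustration. The function name `decode_deictic` is hypothetical, the part-of-speech table is inferred from the examples in this section ("bavi", "baso", "bale", "bala"), and longer derived verbs are deliberately not handled.

```python
PERSONS = {
    "ba": "1", "xe": "2", "di": "3",
    "co": "1+2", "tu": "1+3", "za": "2+3", "tay": "1+2+3",
}
FAMILIES = {
    "v": "personal pronoun",
    "m": "genitive",
    "s": "demonstrative",
    "l": "locative",
}
# Inferred from the examples: -i noun, -o adjective, -e adverb, -a verb.
PARTS_OF_SPEECH = {"i": "noun", "o": "adjective", "e": "adverb", "a": "verb"}


def decode_deictic(word: str) -> tuple[str, str, str]:
    """Split a simple deictic into (person, family, part of speech)."""
    for prefix, person in PERSONS.items():
        rest = word[len(prefix):]
        if word.startswith(prefix) and len(rest) == 2:
            family = FAMILIES.get(rest[0])
            part_of_speech = PARTS_OF_SPEECH.get(rest[1])
            if family and part_of_speech:
                return person, family, part_of_speech
    raise ValueError(f"not a simple deictic form: {word!r}")


for word in ("bavi", "xemo", "tayso", "dilo", "zale", "bala"):
    print(word, decode_deictic(word))
```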
Other verb forms can also be useful ('to get here', 'to keep there', 'to put over there' etc.). For example, the A/P-s verb "dilasa" would be used to mean 'to keep over there' in a sentence such as "We keep the plants over there during the winter". Also, the AP-d verb "balipa" means 'to come' (literally: 'to cause oneself to become here'), and the AP-s verb "balisa" means 'to stay here' or 'to tarry'.
Finally, do not confuse deictic locatives with state adverbs such as "near/nearby", "far/far away/far off", etc. The adverb forms often appear to be used deictically, but this is simply because the contextual referent is sometimes the location of the speaker. There are other times, however, when the referent is not the speaker:
    Referent is the speech location:
        John lives nearby. (= near here)

    Referent is not the speech location:
        When I rented that cheap apartment in Boston, John lived nearby.
        (= near the apartment)

    Compare the above with "John lives here" vs. "When I rented that cheap apartment in Boston, John lived here".
In other words, when using an unfocused version of an inherently focused concept, we must supply a default based on context, and sometimes the default referent will be the speaker's location, but not always. It is important to keep in mind that true deictics are inherently unfocusable because the referent is always determined by the speech act.
For temporal deictics, we will use the same modifying root morphemes as for personal pronouns, plus the letter 'p':
    1:      ba-
    2:      xe-
    3:      di-
    1+2:    co-      plus    -p
    1+3:    tu-
    2+3:    za-
    1+2+3:  tay-
I will also adopt the following person/time mappings:
    1:      present
    2:      past
    3:      future
    1+2:    past, same time unit
    1+3:    future, same time unit
    2+3:    (unassigned)
    1+2+3:  (unassigned)
Here are some English equivalents:
    now                  = bape
    earlier, already     = xepe
    later                = dipe
    currently, nowadays  = libape
Note that the above derivations are true deictics. Thus, they cannot be used in a sentence such as "John arrived at 3, but Bill arrived much earlier". Since "earlier" in the example is not relative to the moment of speech, it is not a true deictic. It is simply a temporal state relationship whose referent must be determined from context. (In fact, we derived this word when we discussed temporal case tags. The word is "cipome", meaning 'earlier' or 'previously'.)
We can also use polarity modifiers to refine the meanings:
    bibape  = right now, immediately, at this very moment
    canbape = about now
    foxepe  = a little while ago
    kedipe  = a long time from now
Languages also have deictics that refer to specific time periods, such as 'today', 'tomorrow', and 'yesterday'. For these, we can modify the deictic adverb by a previous-word modifier version of the measure word. Here are some examples:
    tovi - 'day'

    today          = bape-tovay    present + 'day'
    yesterday      = xepe-tovay    past + 'day'
    tomorrow       = dipe-tovay    future + 'day'
    earlier today  = cope-tovay    past, same time unit + 'day'
    later today    = tupe-tovay    future, same time unit + 'day'
If we use numeric multipliers, we can indicate precise temporal distances from the present time. Here are some examples:
    day before yesterday        = xepe-tovay-xekumay   = 'earlier' + '2 days'
    day after tomorrow          = dipe-tovay-xekumay   = 'later' + '2 days'
    three days ago              = xepe-tovay-dikumay   = 'earlier' + '3 days'
    twenty-three days from now  = dipe-tovay-xedikumay = 'later' + '23 days'
    many days ago               = xepe-tovay-kekumay   = 'earlier' + 'many days'
    in a few days               = dipe-tovay-fokumay   = 'later' + 'few days'
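The pattern behind these temporal expressions can be sketched as a small Python function. It is a rough illustration under stated assumptions: the name `temporal_deictic` and the time labels are hypothetical, the digit table holds only morphemes that appear in the text (ba- 1, xe- 2, di- 3, tu- 5, za- 6), and multi-digit counts are assumed to be spelled digit by digit, as in "xedikumay" for 23.

```python
# Person prefixes reused with the temporal class letter "p".
TIME_PREFIX = {
    "present": "ba",
    "past": "xe",
    "future": "di",
    "past, same unit": "co",
    "future, same unit": "tu",
}
# Digit morphemes attested in the text; other digits are omitted.
DIGITS = {1: "ba", 2: "xe", 3: "di", 5: "tu", 6: "za"}


def temporal_deictic(time, unit_root=None, count=None):
    """Build a deictic time adverb, optionally modified by a measure word.

    The measure word and the count both take the previous-word modifier
    marker "-ay" and are joined to the adverb with hyphens.
    """
    word = TIME_PREFIX[time] + "pe"
    if unit_root is None:
        if count is not None:
            raise ValueError("a count needs a unit of measure")
        return word
    word += "-" + unit_root + "ay"
    if count is not None:
        try:
            digits = "".join(DIGITS[int(ch)] for ch in str(count))
        except KeyError:
            raise ValueError(f"no digit morpheme listed in {count}") from None
        word += "-" + digits + "kumay"
    return word


print(temporal_deictic("present", "tov"))     # 'today'
print(temporal_deictic("past", "tov", 2))     # 'day before yesterday'
print(temporal_deictic("future", "tov", 23))  # 'twenty-three days from now'
```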
And so on.
We can extend this approach easily to handle expressions such as "tonight" = present + 'day' + 'night', "tomorrow night" = future + 'day' + 'night'. [Keep in mind that "tovi" = 'day' refers to a 24 hour period. It does not refer to the concept of 'daytime'.]
Prefixes (plural, etc) and polarity modifiers may not be used when the deictic is modified by a measure word.
Articles in English are used to indicate whether a noun phrase refers to an entity which is being newly entered into the discourse or which has already been mentioned or is known from context. A definite article (e.g. English "the") is typically used when the noun phrase refers to a specific entity that is already known to both the speaker and listener. An indefinite article (e.g. English "a/an" or "any/some") is generally used when the entity is being introduced for the first time. Compare "An old man entered the room" with "The old man entered the room".
Articles are not always used in the same way from one language to another. For example, there are many cases where an English definite article will be used where French will use an indefinite article or no article at all. In fact, I doubt very much if there are two languages that always use articles in exactly the same way. The rules involving their use are always language-dependent. Because of this, it is important that we define exactly what we mean by the term "article". For the interlingua, here is the definition that we will use:
A definite article indicates that the corresponding noun phrase refers to an entity that is already known to both speaker and listener; i.e., it is either known from context or has been previously mentioned. An indefinite article indicates that the corresponding noun phrase refers to an entity that is being newly added to the existing context.
Many languages do not have unique words or morphemes to represent articles (e.g. Chinese, Swahili, Turkish, Hindi, Japanese, and many others). Some have only one or the other, but not both (e.g. Arabic has only a definite article, while Persian has only an indefinite article). However, when articles are not available in a language, word order (e.g. Russian) or verb-marking (e.g. Swahili) can sometimes distinguish between definiteness and indefiniteness.
Fortunately, we will not need to allocate new words or affixes for articles, because we already have ideal solutions for both.
The indefinite article is simply "tomo", the adjective form of the true generic root, which we discussed earlier in the chapter on Simple Generics. Here are the relevant examples again:
    Generic adjective "tomo" - 'a/an/some (singular)', 'some (plural)'

    e.g.  We need AN empty box.
          SOME jerk just blocked my car.
          SOME people are at the door.
The semantics of "tomo" perfectly overlap our definition of the indefinite article.
The negative indefinite article "jutomo" will also be useful:

    "jutomo" - 'no', 'not'

    e.g.  NO man left these footprints.
          I saw Bill but NOT John.
A perfect choice for the definite article is the 1+2+3 demonstrative, "tayso". (Note that demonstratives are inherently definite.) Cambodian does something very similar to this. It has a word that can mean any of 'this', 'these', 'that', or 'those', and corresponds exactly to the word "tayso" in the interlingua. It is normally translated into English as 'the'.
There will be times when it will be necessary to use "tayso" plus the indefinite article "tomo" or the negative indefinite article "jutomo". Here's an example:
I saw carpenter but jutomo tayso plumber = I saw the carpenter but not the plumber.
If we did not use "tayso" with "plumber", then we would have had:
I saw carpenter but jutomo plumber = I saw the carpenter but no plumber.
Note that "jutomo" and "tomo" should never be used together.
Now, there is also a third category of definiteness: generic. Here are some examples in English:
    Tigers live in India.
    I don't like ice cream.
In the above examples, "tigers" and "ice cream" are generic, because they do not refer to specific entities, whereas definite and indefinite nouns always refer to specific entities.
In the interlingua, we will mark genericity by using the generic noun prefix "lu-". Thus, we can implement the examples above by prefixing "lu-" to the words meaning 'tiger' and 'ice cream'. English usually achieves the same effect by omitting an article and making the headword plural. Also, in noun-noun compounds, the modifying noun is always assumed to be generic; e.g. "meat eater", "gold mine", "door knob", etc. In the interlingua, these modifying nouns will be prefixed by "lu-". [We'll have more to say about compounding later.]
As we discussed earlier, the word "tomi" means 'something' or 'anything'. Since it always introduces a new referent into the conversation, it is inherently indefinite. If we make it generic, we get the general sense of the English word "things". Here are some examples:
    lutomi = 'things'
        e.g. "Why do you have to make things so complicated?"

    bikumo lutomi = 'all things', 'everything'
        e.g. "Everything must come to an end."

    kekumo lutomi = 'many things', 'a lot of things'
        e.g. "I know many things that you don't know."
In the interlingua, all noun phrases will be definite by default. Use "tomo" or "jutomo" to make a phrase indefinite, and the prefix "lu-" to make a phrase generic. Here are some examples:
    Lubucalinki is fun       = Working is fun.
    Bucalinki was fun        = The work was fun.
    I saw tomo bucalinki     = I saw some work.

    [Note that "bucala" is the verb meaning 'to work/labor' and "-ink" is the process suffix that we discussed earlier.]

    I read book              = I read the book.         (definite by default)
    I read tomo book         = I read a book.           (indefinite)
    I read lu+book           = I read books.            (generic)
    I read dikumo books      = I read the three books.  (definite by default)
    I read tomo dikumo books = I read three books.      (indefinite)
In summary, there is no need to create special words or affixes to represent articles in the interlingua. Instead, we will use the 1+2+3 demonstrative "tayso" to explicitly mark definiteness (on the rare occasion when it is needed), we will use the true generic word "tomo" to explicitly indicate indefiniteness, and we will use the prefix "lu-" to indicate genericity.
Finally, there will be times when the definiteness of a noun phrase is not known. This will never happen when two people are speaking the language, but may happen when a computer is translating from a natural language to the interlingua. When a machine translation program cannot determine the definiteness of a noun phrase in the source language, it should modify the noun with the particle "tojopo".
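For a translation program, the rules above amount to a small decision table. The Python sketch below is illustrative only: the function name `mark_definiteness` and the status labels are hypothetical. It also assumes that "tojopo" is placed before the noun, like "tomo" and "tayso" in the examples; the text says only that the particle modifies the noun.

```python
def mark_definiteness(noun: str, status: str) -> str:
    """Mark a noun for definiteness following the rules described above."""
    if status == "definite":
        return noun                # definite by default, no marking needed
    if status == "explicit definite":
        return "tayso " + noun     # 1+2+3 demonstrative used as 'the'
    if status == "indefinite":
        return "tomo " + noun      # generic adjective used as 'a/an/some'
    if status == "negative indefinite":
        return "jutomo " + noun    # 'no', 'not'
    if status == "generic":
        return "lu" + noun         # generic noun prefix
    if status == "unknown":
        # Assumption: the particle precedes the noun it modifies.
        return "tojopo " + noun
    raise ValueError(f"unknown definiteness status: {status}")


for status in ("definite", "indefinite", "generic", "unknown"):
    print(f"{status:12} {mark_definiteness('bucalinki', status)}")
```

Keeping "unknown" as a separate status lets the translation program record that definiteness could not be determined, instead of silently falling back to the definite default.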
Unlike basic verbs, comparatives do not represent true states or actions. Instead, they indicate the relative magnitudes of two or more states or the relative quantities of two or more entities. In a sense, they are somewhat like deictics, since they do not represent exact states or entities. Unlike deictics, though, they do not index or point to exact states or entities. Instead, they simply position one referent with respect to another along a one-dimensional scale:
      John        John        John        John        John
       is          is          is          is          is
     least        less       happy       more        most
     happy       happy                   happy       happy
       |           |           |           |           |
       V           V           V           V           V
       o-----------o-----------o-----------o-----------o
    Absolute                                        Absolute
    minimum                                         maximum
Now, the interpretation of comparatives will depend on the nature of what is being compared. Earlier, when we discussed Counts and Measures, we made an important distinction between counts which were explicitly linked to an argument of the verb, and those which were not linked, but which modified the verb directly. Thus, a numeric P-s adverb had the meaning 'being N in quantity' when it linked to a noun, while the verb-modifying "0" form had the meaning 'being N in frequency'. Comparatives behave in the same way.
However, count words have specific numeric values, whereas comparatives have the very vague meaning of 'relative magnitude'. Thus, when a count modifies a verb, it can only indicate a frequency; i.e. a number or a count of discrete events. A comparative, however, is more general and can be interpreted as either degree, duration, or frequency. Consider the following examples:
    degree    -> Fish stinks more than beef.
    duration  -> John studied more than Bill.
    frequency -> He complained more than I did.
Note, though, that these are the most likely interpretations in English, and can change depending on context. Also, when necessary, it is possible to explicitly indicate the desired interpretation:
    degree    -> Fish stinks stronger than beef.
    duration  -> Fish stinks longer than beef.
    frequency -> Fish stinks more often than beef.

    degree    -> John studied harder than Bill.
    duration  -> John studied longer than Bill.
    frequency -> John studied more often than Bill.

    degree    -> He complained more vehemently than I did.
    duration  -> He complained longer than I did.
    frequency -> He complained more often than I did.
And yet, when you look more closely, the most likely "more than" default interpretation actually includes all three non-default interpretations. For example, the sentence "John studied more than Bill" could be interpreted as "John studied harder, longer, and/or more frequently than Bill". In other words, when a "more than" comparative is used with verbs, it can indicate any or all of the three concepts of 'degree', 'duration', or 'frequency'. However, the nature of the verb and the context in which it is used may favor one interpretation more than another.
Natural languages implement comparative constructions in several different ways. Here are some examples of the major types:
1. The 'from' comparative (e.g. Classical Arabic, Hindi, Japanese, Eskimo, Quechua, Turkish, Burmese):

       A horse is big FROM a mouse. = A horse is bigger than a mouse.

   (In these constructions, "from" is the same word or affix used in a sentence such as "He drove FROM Boston to New York".)

2. The 'to' comparative (e.g. Breton, Maasai, not very common):

       A horse is big TO a mouse. = A horse is bigger than a mouse.

   (In these constructions, "to" is the same word or affix used in a sentence such as "He drove from Boston TO New York".)

3. The 'more' plus 'on' comparative (e.g. Navaho, Tamil, not very common):

       A horse is MORE big ON a mouse. = A horse is bigger than a mouse.

   (In these constructions, "on" is the same word or affix used in a sentence such as "He put the book ON the table".)

4. Comparatives that use opposites or negatives (e.g. Motu, Dakota, Samoan, Nahuatl. This method is very common, but is limited to relatively obscure languages.):

          A horse is big, a mouse is not big.
       OR A horse is big, a mouse is small.
          = A horse is bigger than a mouse.

5. Comparatives formed regularly (typically from verbs) meaning 'to be more in degree', 'to be equal in degree', and 'to be less in degree' (e.g. Chinese, Hausa, Swahili, Vietnamese, Yoruba, Cambodian):

       A horse is big SURPASSING a mouse. = A horse is bigger than a mouse.

6. Comparatives that use special particles (e.g. Hungarian, Russian, Malagasy, English, Basque, Javanese. A large majority of the languages in this group are European.):

       English:  A horse is bigger THAN a mouse.
       Javanese: A horse is big MORE-THAN a mouse.
The first four methods are essentially metaphoric or idiosyncratic, and I will say no more about them. The fifth method can be very complex because different forms are needed depending on the syntax of the construction.
The sixth method, however, is the simplest, and is the method that we will use in the interlingua. However, the sixth method can at times be ambiguous, which we cannot tolerate in an interlingua intended for use in machine translation. Thus, we must design the system such that there can be no ambiguities.
Before proceeding, however, and in order to get an idea of how to most effectively and unambiguously implement these comparatives, let's look at a few examples that vary only slightly, and see if we can make some generalizations about them (I will use parentheses plus the English particle "more" to show which item is greater in quantity or degree):
    John reads novels more than Bill.
      John (more reads) novels vs. Bill reads novels
      i.e., different verbs, different subjects, same objects

    John reads novels more than short stories.
      John (more reads) novels vs. John reads short stories
      i.e., different verbs, same subjects, different objects

    John reads more novels than Bill.
      John reads (more novels) vs. Bill reads novels
      i.e., same verbs, different subjects, different objects

    John reads more novels than short stories.
      John reads (more novels) vs. John reads short stories
      i.e., same verbs, same subjects, different objects
In other words, there are three constituents (verb, subject, and object) which can have either of two values (same or different). This suggests that there could be up to eight possible combinations. Here is a list of all of the possibilities:
1. same verb, same subject, same object

       This is not a comparison since nothing is different.

2. same verb, same subject, different object

       John reads more novels than short stories.
       John reads (more novels) vs. John reads short stories

3. same verb, different subject, same object

       More women read novels than men.
       (more women) read novels vs. men read novels

4. same verb, different subject, different object

       John reads more novels than Bill.
       John reads (more novels) vs. Bill reads novels

5. different verb, same subject, same object

       John writes novels more than he reads them.
       John (more writes) novels vs. John reads novels

6. different verb, same subject, different object

       John reads novels more than short stories.
       John (more reads) novels vs. John reads short stories

7. different verb, different subject, same object

       John reads novels more than Bill.
       John (more reads) novels vs. Bill reads novels

8. different verb, different subject, different object

       John reads novels more than Bill writes short stories.
       John (more reads) novels vs. Bill writes short stories

   Note that when everything is different, it is really a simple comparison between two clauses.
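The count of eight follows from three constituents with two values each (2 x 2 x 2). As a quick check, the combinations can be enumerated mechanically. The sketch below is illustrative only, and the names `CONSTITUENTS` and `comparison_patterns` are assumptions introduced here.

```python
from itertools import product

CONSTITUENTS = ("verb", "subject", "object")


def comparison_patterns():
    """Yield every same/different assignment for verb, subject and object."""
    for values in product(("same", "different"), repeat=len(CONSTITUENTS)):
        yield dict(zip(CONSTITUENTS, values))


patterns = list(comparison_patterns())

# The all-"same" pattern compares nothing, so it is not a real comparison.
real_comparisons = [p for p in patterns if "different" in p.values()]

for number, pattern in enumerate(patterns, start=1):
    print(number, pattern)
```

The enumeration order happens to match the numbering of the list above, so pattern 1 is the degenerate case and pattern 8 is the clause-to-clause comparison.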
In the interlingua, we will implement these ideas by creating two types of word: several comparative modifiers derived using the polarity morphemes discussed earlier, and a single comparative conjunction. By default, the modifiers will be P-s. Here are the previous-word modifier forms:
    bijopay  = 'most'
    kejopay  = 'more'
    canjopay = 'as much/many'
    fojopay  = 'less/fewer'
    zujopay  = 'least'
    jopenay  = 'how (much)?', 'to what degree?'
We are introducing the 'interrogative' suffix "-en" for the first time. It will convert a word to an interrogative. We'll have more to say about the semantics of "-en" later.
Forms using the negating prefix "lo-" are also useful:
    lokejopay = 'not more than', 'at most'
    lofojopay = 'not less than', 'at least'
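The modifier forms listed above are regular enough to be assembled from their parts. The sketch below assumes a segmentation of polarity morpheme + "jop" + "-ay", which is inferred from the listed forms rather than stated in the text; the function names are likewise hypothetical.

```python
# Glosses are taken from the list above.
POLARITY_GLOSSES = {
    "bi": "most",
    "ke": "more",
    "can": "as much/many",
    "fo": "less/fewer",
    "zu": "least",
}
COMPARATIVE_STEM = "jop"     # assumption: stem inferred from the listed forms
MODIFIER_SUFFIX = "ay"       # previous-word modifier ending
INTERROGATIVE_SUFFIX = "en"
NEGATING_PREFIX = "lo"


def comparative_modifier(polarity: str, negated: bool = False) -> str:
    """Build a comparative previous-word modifier such as 'kejopay'."""
    if polarity not in POLARITY_GLOSSES:
        raise ValueError(f"not a basic polarity morpheme: {polarity!r}")
    word = polarity + COMPARATIVE_STEM + MODIFIER_SUFFIX
    return NEGATING_PREFIX + word if negated else word


def interrogative_modifier() -> str:
    """Build the interrogative form 'jopenay'."""
    return COMPARATIVE_STEM + INTERROGATIVE_SUFFIX + MODIFIER_SUFFIX


print(comparative_modifier("ke"))                # kejopay
print(comparative_modifier("fo", negated=True))  # lofojopay
print(interrogative_modifier())                  # jopenay
```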
And here is the comparative conjunction:
zuntesye = 'than', 'as', 'compared with/to'
The conjunction will be a "true conjunction", meaning that it must link two constituents that have the same part-of-speech and which are inherently comparable. For example, we can compare "apples" with "oranges", but we cannot compare "apples" with "John reads books". [We'll have more to say about true conjunctions later.]
The interrogative modifier "jopenay" can be used in expressions such as "How heavy is the box?" or "How generous is Bill compared to John?".
The superlative modifiers "bijopay" and "zujopay" cannot be used with "zuntesye" since they do not really compare two different constituents, and the use of a conjunction would be incorrect. For example, in the sentence:
John was the tallest student at the party.
we are not really comparing "John" with "the party", since they are inherently incomparable. In order to represent this meaning, we will do as is done in many natural languages, including English:
John was tall bijopay student zoge party.
where "zoge" is the locative case tag we discussed earlier.
[Keep in mind that "bijopay" is a previous-word modifier and must follow the word it modifies, even though we are using English word order for the rest of the example.]
When a comparative modifier modifies a countable entity, it will always have a quantitative interpretation. In all other cases, it will have the vaguer degree/duration/frequency interpretation.
To obtain a more precise interpretation, we can modify an attribute of a non-countable entity:
    This room zuntesye that room fecula kejopay.
    = This room is hotter than that room.
    [Where "fecula" is the word meaning 'to be hot'.]

    This room zuntesye that room fecula often kejopay.
    = This room is hot more often than that room.
It's important to emphasize that, when modifying countable entities, a comparative modifier will have a quantitative interpretation. For example, "culi kejopay" means 'more heavy ones', and not 'the heavier one'.
Now, let's look at some more examples:
 1. John is taller than Bill.
    = John zuntesye Bill is_tall kejopay.

 2. John is as tall as Bill.
    = John zuntesye Bill is_tall canjopay.

 3. John is less tall than Bill.
    = John zuntesye Bill is_tall fojopay.

 4. John is not as tall as Bill.
    = John is less tall than Bill.
    = John zuntesye Bill is_tall fojopay.

 5. John is the tallest.
    = John is_tall bijopay.

 6. John is more quiet than shy.
    = John is_quiet kejopay zuntesye he is_shy.

 7. John helps Bill more than he helps Mike.
    = John helps kejopay Bill zuntesye Mike.

 8. John helps Bill more than Mike does.
    = John zuntesye Mike helps kejopay Bill.

 9. More kids join gangs in Boston than Cowtown.
    = Kids kejopay join gangs in Boston zuntesye Cowtown.

10. Kids join gangs in Boston more than Cowtown.
    = Kids join kejopay gangs in Boston zuntesye Cowtown.
    or
    = Kids join kejopay gangs in Boston zuntesye in Cowtown.

11. John reads novels more than Bill.
    = John zuntesye Bill reads kejopay novels.

12. John reads more novels than Bill.
    = John zuntesye Bill reads novels kejopay.

13. John reads novels more than short stories.
    = John reads kejopay novels zuntesye short stories.

14. John reads more novels than short stories.
    = John reads novels kejopay zuntesye short stories.

15. John most reads novels or John reads novels the most.
    = John reads bijopay novels.

16. John reads the most novels.
    = John reads novels bijopay.

17. John is more of a fighter than Bill.
    = John zuntesye Bill is kejopay a fighter.
    [Note that we must modify "is", not "Bill".]

18. John is more of a whiner than a fighter.
    = John is kejopay a whiner zuntesye a fighter.

19. John likes taller girls than Louise.
    = John likes tall kejopay girls zuntesye Louise.

20. John had more money than Bill thought (he had).
    = John had money kejopay zuntesye what Bill thought (he had).
    [Note that we must use the headless relative "what" because the conjunction "zuntesye" can only link constituents with the same part-of-speech. We'll have more to say about headless relatives later.]

21. John baked more pies than Bill told him to (bake).
    = John baked pies kejopay zuntesye what Bill told him to (bake).

22. More people stayed late than left early.
    = People kejopay stayed late zuntesye left early.
    [Note that "left" here must use the infinitive suffix "-iv" because its subject is the same as "stayed".]

23. John can run faster than Bill.
    = John zuntesye Bill can run fast kejopay.

24. You can buy a less expensive car here than at other places.
    = You can buy an expensive fojopay car here zuntesye at other places.
    [Note that "here" is the adverb "bale" and "at" is the case tag "zoge".]

25. John can kick a football farther than Bill.
    = John zuntesye Bill can kick a football far kejopay.
When the conjunction "zuntesye" is not used, the item it links to will be known from context. For example, "I need kejopay water" = 'I need more water (than I currently have)' or "I need kejopay water this time" = 'I need more water this time (than last time)'.
Noun versions can also be used. For example, "I need kejopi" means 'I need more'.
When "kejopay" or "fojopay" modifies a specific numeric, the result will have the meaning 'more than N' or 'less than N':
I have dikumo kejopay books = I have more than three books. [Again, keep in mind that "kejopay" is a previous-word modifier, and it modifies the adjective "dikumo" - not "books"!]
When a numeric modifies "kejopay" or "fojopay", the result will have the meaning 'N more' or 'N less':
I (zuntesye you) have book kejopo dikumay = I have three more books (than you).
Note above that the adjective "kejopo" modifies "book" and the previous-word modifier "dikumay" modifies "kejopo".
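The two patterns differ only in which word carries the adjective ending and which carries the previous-word modifier ending. The sketch below makes that swap explicit. The stems "dikum" and "kejop" are obtained here by stripping the endings from the attested forms, which is an assumption, and the function names are hypothetical.

```python
def as_adjective(stem: str) -> str:
    """Adjective form, e.g. 'dikum' -> 'dikumo'."""
    return stem + "o"


def as_modifier(stem: str) -> str:
    """Previous-word modifier form, e.g. 'kejop' -> 'kejopay'."""
    return stem + "ay"


def more_or_less_than_n(number_stem: str, comparative_stem: str) -> str:
    """'more/less than N': the comparative modifies the numeric adjective."""
    return f"{as_adjective(number_stem)} {as_modifier(comparative_stem)}"


def n_more_or_less(number_stem: str, comparative_stem: str) -> str:
    """'N more/less': the numeric modifies the comparative adjective."""
    return f"{as_adjective(comparative_stem)} {as_modifier(number_stem)}"


print(more_or_less_than_n("dikum", "kejop"))  # dikumo kejopay
print(n_more_or_less("dikum", "kejop"))       # kejopo dikumay
```

In both cases the modifier follows the word it modifies, so the head of the pair is always the first word.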
Earlier, we discussed what happens when we focus scalar states that are inherently unfocused. Here is an example:
The rope is 6 meters long. = The rope is_long 6 meters.
where "is_long" is the P/F-s verb "zeculunza".
In other words, the focus indicates the degree of the state.
Now, here is an example of a comparative "-d" derivation:
They lengthened the rope by 6 meters. = They made_longer the rope 6 meters.
Now, to create the comparative sense 'longer' from the absolute sense 'long', all we need to do is the following:
    zeculo            = P-s adjective 'long [spatial]'
    zeculamba         = A/P/F-d verb 'to cause P to become length F'
    zeculamba kejopay = A/P/F-d verb 'to cause P to become longer by F'
Note that we cannot use a polarity modifier on the verb to achieve the same thing, because that has a different meaning:
    kezeculamba = A/P/F-d verb 'to cause P to become a very long F'
                  e.g. He made the rope a very long 6 meters.
Note that the above does not imply at all that the rope was shorter before he set its length to a "very long" 6 meters.
Natural languages have several words which indicate degree or quantity relative to an implied referent. In other words, these words have an unspecified focus. Here are some English examples:
    Excessive degree:   He is TOO happy now.
    Maximum degree:     He is SO/TOTALLY/MOST happy now.
    High degree:        He is QUITE/VERY/EXTREMELY happy now.
    Low degree:         He is NOT TOO/SLIGHTLY/SOMEWHAT happy now.
    Minimum degree:     He is HARDLY/BARELY happy now.
    Zero degree:        He is NOT AT ALL happy now.
    Slightly less than unmarked degree:
                        He is ALMOST/NOT QUITE happy now.
    Exclusive degree:   He is JUST/ONLY going to the library.
    Exact degree:       He bought EXACTLY/PRECISELY seventy-two pencils.
    Approximate degree: He bought APPROXIMATELY/ABOUT seventy pencils.
Note that the maximum, high, low, minimum, and zero degrees are already represented by the scalar polarity morphemes. Additional modifiers can be created for the other degrees. Here is the complete list that we will use in the interlingua:
    bi-   maximal polarity
    ke-   high polarity
    can-  average polarity
    fo-   low polarity
    zu-   minimal polarity
    ju-   0% polarity
    kin-  too, excessively, over-
    bon-  insufficient, too little, inadequate, not enough
    xi-   enough, adequately, sufficiently
    dan-  extra, spare, surplus, over and above, above and beyond
    ta-   almost, not quite, nearly, all but, well-nigh
    coy-  just, only, exclusively, simply
    fen-  about, approximately, circa, more or less
    dun-  exactly, precisely, no more and no less
    baw-  especially, particularly, in particular
Now, we can form complete words with these modifying morphemes by using them with the true generic root "tom". The results will be P-s by default. Here are a few examples:
previous-word modifiers:

    He likes bitomay Louise. = He likes Louise SO MUCH.
    He's a happy ketomay person. = He's a VERY happy person.
    This is a heavy cantomay box. = This is a PRETTY/MODERATELY heavy box.
    He studies when coytomay she's here. = He studies ONLY when she's here.
    When she's here, he studies coytomay. = When she's here, he JUST studies.
    He is a poet coytomay (and nothing else). = He's JUST a poet (and nothing else).
    He studies ketomay. = He studies VERY MUCH/A LOT.
    He studies kintomay. = He studies TOO MUCH.
    He likes bawtomay pizza. = He ESPECIALLY likes pizza.

verbs:

    His understanding zutoma. = His understanding IS MINIMAL.
    The music cantoma. = The music IS SO-SO.
    The volume kintoma. = The volume IS EXCESSIVE.
    The measurements fentoma. = The measurements ARE APPROXIMATE.
    The problem ketoma. = The problem IS LARGE/GREAT.
    That bicycle bawtoma. = That bicycle is SPECIAL.
It's important to emphasize that the polarity morphemes indicate degree or general magnitude. They do not indicate quality or quantity. Thus, they are most useful when modifying states or actions. When modifying physical entities, they are likely to be ambiguous.
When "bitomay" modifies an adverb, an adjective, or an adjectival verb (i.e., a verb derived from an English adjective, such as "fecula" = 'to be hot'), it is equivalent to the English expression "as X as possible". For example, if "ticuloge" means 'quickly', then "ticuloge bitomay" means 'maximally quickly' = 'as quickly as possible'.
Here are some adjective derivations:
    bitomo  = maximum, maximal, utmost, greatest, highest, uppermost
    ketomo  = great (e.g. intellect), intense (e.g. color), high (e.g. temperature), strong (e.g. smell), keen (e.g. eyesight), acute (e.g. hearing), superior, considerable, substantial, etc.
    cantomo = so-so, average, typical, common, usual, ordinary
    fotomo  = low, weak, mere, meager, inferior, slight
    zutomo  = minimum, minimal, lowest, weakest, negligible, inconsiderable, inconsequential, trifling
    kintomo = excessive, overrated, overblown, etc.
    bawtomo = special, notable, noteworthy
Some adjective forms are what linguists refer to as predeterminers since they modify an entire noun phrase, including the determiner such as an article or demonstrative. "Coytomo", meaning 'just/only', is an example:
That's only a duck. I need only those books.
The word "predeterminer" is based on English word order. In a right-branching language, such as the interlingua, the proper term is postdeterminer, since it will follow the determiner. For example, in the interlingua, the phrase meaning 'only a duck' is "bodami tomo coytomo".
Finally, keep in mind that scalar non-relational states can use the five basic polarity morphemes productively ("bi-", "ke-", "can-", "fo-", and "zu-"). The remaining polarity morphemes can NOT be used in this way. In other words, the five basic polarity morphemes can be added directly to the root rather than require a separate modifying word. For example, "kefeculo" = "feculo ketomay" = 'very hot'.
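The word-building pattern in this section can be sketched as follows. The morphemes, the root "tom", and the endings are taken from the examples above; the function names, the part-of-speech labels, and the tuple return shape are illustrative assumptions. The sketch does not check that the modified word is a scalar non-relational state, which the direct-prefix form requires.

```python
BASIC_MORPHEMES = ("bi", "ke", "can", "fo", "zu")
OTHER_MORPHEMES = ("ju", "kin", "bon", "xi", "dan", "ta", "coy", "fen", "dun", "baw")
GENERIC_ROOT = "tom"
# Endings as they appear in the examples (e.g. ketomay, ketoma, ketomo).
ENDINGS = {"modifier": "ay", "verb": "a", "adjective": "o"}


def degree_word(morpheme: str, part_of_speech: str = "modifier") -> str:
    """Combine a polarity morpheme with the generic root, e.g. 'ketomay'."""
    if morpheme not in BASIC_MORPHEMES + OTHER_MORPHEMES:
        raise ValueError(f"unknown polarity morpheme: {morpheme!r}")
    return morpheme + GENERIC_ROOT + ENDINGS[part_of_speech]


def intensify(word: str, morpheme: str):
    """Return (prefixed form or None, separate-modifier phrase).

    Only the five basic morphemes may attach directly to the word;
    the others always need the separate modifying word.
    """
    phrase = f"{word} {degree_word(morpheme)}"
    prefixed = morpheme + word if morpheme in BASIC_MORPHEMES else None
    return prefixed, phrase


print(intensify("feculo", "ke"))   # ('kefeculo', 'feculo ketomay')
print(intensify("feculo", "kin"))  # (None, 'feculo kintomay')
```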
We often make comparisons in which we specify the magnitude of the difference between the entities being compared. Consider the following:
    1. The rope is longer than the stick.
    2. The rope is half as long as the stick.
    3. The rope is less than half as long as the stick.
    4. The rope is three meters longer than the stick.
We've already seen how to deal with (1). The others can be most easily implemented by simply focusing the verb:
    2. The rope zeculunza tinxekumo stick.
       = The rope is as long as half the stick.
       = The rope is half as long as the stick.

    3. The rope zeculunza fojopay tinxekumo stick.
       = The rope is less long than half the stick.
       = The rope is less than half as long as the stick.

    4. The rope zeculunza the stick and three meters.
       = The rope is as long as the stick and three meters.
       = The rope is three meters longer than the stick.
where "zeculunza" is the P/F-s verb meaning 'to be as long as', and "tinxekumo" is the numeric adjective meaning 'one-half'.
We can also easily add a comparative to the verb, creating a double comparative:
    The rope zeculunza kejopay the stick and three meters.
    = The rope is longer than the stick and three meters.
    = The rope is more than three meters longer than the stick.
Note that none of the above solutions require the comparative conjunction "zuntesye". However, solutions with "zuntesye" are also possible. Here's the solution for (2):
    The rope zuntesye tinxekumo stick zecula canjopay.
    = The rope compared with half the stick is as long.
    = The rope is as long as half the stick.
    = The rope is half as long as the stick.

where "zecula" is the P-s verb meaning 'to be long'.
Here is the solution to (3):
    The rope zuntesye tinxekumo stick zecula fojopay.
    = The rope compared with half the stick is less long.
    = The rope is less long than half the stick.
    = The rope is less than half as long as the stick.
But how do we handle (4), where the difference is not only additive, but also contains the unit of measure "meters"? Here's the answer:
    The rope zuntesye the stick and three meters zecula canjopay.
    = The rope compared with the stick and three meters is as long.
    = The rope is as long as the stick and three meters.
    = The rope is three meters longer than the stick.
We can also easily add a comparative to the verb, creating a double comparative:
    The rope zuntesye the stick and three meters zecula kejopay.
    = The rope compared with the stick and three meters is longer.
    = The rope is longer than the stick and three meters.
    = The rope is more than three meters longer than the stick.
Thus, in effect, "zuntesye" acts like an oblique focus marker.
We've seen how useful the polarity morphemes can be in deriving many new words. So far, though, we've only applied them to stative (i.e. adjectival) concepts. Fortunately, they can be just as useful and productive when applied to nouns. In doing so, we will be creating words that are commonly known as diminutives and augmentatives. (Diminutives are also sometimes referred to as attenuatives).
In the interlingua, the semantics of diminutives and augmentatives is defined as follows:
When a polarity morpheme is used with a basic noun, it magnifies or reduces the SIZE, INTENSITY, and/or QUALITY of the entity, in proportions that are most natural or typical for the entity. While the nature of the result may be quite different from the root concept, the class will remain the same.
As you can see, even the definition is not semantically precise. Thus, as always, we will use the morphemes for their mnemonic value.
For easy reference, here are the basic polarity morphemes again:
    bi-   maximally, extremely, utmost
    ke-   very, highly, so, so much, such
    can-  midpoint, average, so-so
    fo-   not too, not very
    zu-   minimally, barely, hardly
Now, here are a few examples:
    cinfepi    = 'snowfall'
    bicinfepi  = 'blizzard', 'whiteout'
    kecinfepi  = 'snowstorm'
    zucinfepi  = 'snow flurries'

    bivi       = 'lake'
    bibivi     = 'ocean'
    kebivi     = 'sea'
    canbivi    = 'pond'
    fobivi     = 'pool (natural)', 'water hole'
    zubivi     = 'puddle'

    kokigi     = 'school'
    bikokigi   = 'university'
    kekokigi   = 'college'
    cankokigi  = 'high/secondary/middle school'
    fokokigi   = 'elementary/primary/grade school'
    zukokigi   = 'kindergarten', 'preschool', 'nursery school'
We can also use the polarity morphemes to indicate quality. For example, we can use them to make distinctions such as between "palace", "mansion", "house", and "hovel".
In summary, we will be able to use the complete set of scalar polarity morphemes to create augmentatives and diminutives based on degree or quality. It's important to keep in mind, though, that these derivations are not productive, and cannot be used to create makeshift or ad hoc words. Any such words must have unique dictionary entries.
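Because these derivations are not productive, software should treat them as dictionary lookups rather than compute them. The sketch below illustrates that design choice using a few of the entries listed above; the table layout and the function name `augmentative` are assumptions introduced here.

```python
# A small excerpt of the entries listed above: (root, morpheme) -> (word, gloss).
LEXICON = {
    ("cinfepi", "bi"): ("bicinfepi", "blizzard"),
    ("cinfepi", "ke"): ("kecinfepi", "snowstorm"),
    ("cinfepi", "zu"): ("zucinfepi", "snow flurries"),
    ("bivi", "bi"): ("bibivi", "ocean"),
    ("bivi", "ke"): ("kebivi", "sea"),
    ("bivi", "can"): ("canbivi", "pond"),
    ("bivi", "fo"): ("fobivi", "pool (natural)"),
    ("bivi", "zu"): ("zubivi", "puddle"),
}


def augmentative(root: str, morpheme: str):
    """Look up a listed augmentative or diminutive; never invent one."""
    try:
        return LEXICON[(root, morpheme)]
    except KeyError:
        raise LookupError(
            f"no dictionary entry for {morpheme}- + {root}; "
            "ad hoc derivations are not allowed"
        ) from None


print(augmentative("bivi", "bi"))  # ('bibivi', 'ocean')
```

A request such as `augmentative("cinfepi", "fo")` raises `LookupError`, since no such word appears in the list above.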
Many languages have words or morphemes that indicate the social status of the speaker relative to the listener or to a third party. The most common way of marking these differences is by means of special pronouns. For example, a more polite 2nd person pronoun can be used when speaking to a superior or elder.
However, these distinctions are not only made with pronouns. There are also many words, other than pronouns, that are only used in certain social contexts. For example, most English speakers will use the words "shit", "crap", "feces", "do-do", and "number 2" in entirely different settings, depending on who they are speaking with. In fact, some speakers will completely avoid using certain words, either because they are too formal or too rude. For example, many speakers will not use the 'dirty' word "shit" at all, while others may not use 'big' words like "explicate" or "obfuscation", or 'pretty' words like "lovely" or "marvelous".
Some languages also have words that differ in register that are effectively required in certain contexts. Cambodian is a language that is especially rich in this respect. For example, there are three completely different words that mean 'to sleep'. The first is used when the sleeper is a superior or someone especially deserving of respect; the second is used when the sleeper is the speaker or a person of equal status; and the third is used when the sleeper is of lower status.
Words or morphemes that indicate respect are normally called honorifics, while those which indicate disrespect are called pejoratives.
In the interlingua, we will create special prefixes for honorifics, pejoratives, and other register variations. This approach is similar to the honorific affixes of Korean and Japanese, but is more comprehensive.
To illustrate this in the interlingua, we will create the following prefixes:
    laye-   humble, subservient, inferior, fawning, groveling
    lea-    praising, complimentary, flattering
    lin-    polite, respectful
    ????-   formal, correct
    --      neutral (default)
    lewi-   slang, informal
    lun-    cold, unfriendly, unsociable
    loa-    contemptuous, rude, insulting
    layo-   vulgar, filthy, tasteless
The register prefixes can be applied to any word to indicate its social context. An unmarked expression would be interpreted as 'neutral', and would be used in the vast majority of cases.
Pronouns of varying degrees of politeness can be easily formed. For example, the 2nd person pronoun "xevi", meaning 'you', would become:
    linxevi   = old English "ye", French "vous", German "Sie", etc.
    lewixevi  = old English "thou", French "tu", German "du", etc.
Nouns, verbs, adjectives, etc. can also have their register changed. For example, if the word for 'urine' is "fezopi", then the word for 'piss' will be "loafezopi".
As with most prefixes, a register prefix modifies the entire word that follows. For example, the word "loafezopi" means 'piss'. However, the P-d derivation "loafezopumpa" does not mean 'to become/turn into piss'. Instead, it simply means 'to become/turn into urine' spoken contemptuously. If we need to express the meaning 'to become/turn into piss', we must use the verb meaning 'to become' = "dapumba".
It is important to emphasize that the register prefixes always reflect the attitude of the speaker toward the entity that is being modified by the prefix. For example, if the 'contemptuous' prefix is used with the pronoun meaning 'you', it shows that the speaker feels contempt for the listener. If it is used with the pronoun 'I', it indicates that the speaker feels contempt for himself. If the 'humble' prefix is used with the pronoun 'you', it shows that the speaker feels humble in the presence of the listener. If it is used with the pronoun 'I', it indicates that the speaker feels humble in his own presence, as if he were in awe of himself or something he just did.
When used with a verb, the attitude is towards the patient.
Finally, the use of pejoratives is preferable to using metaphor (e.g. "dog" or "pig") since pejoratives are culturally neutral and will always be understood. [I'll have more to say about the dangers of metaphor later.]
It will also be useful to apply register to complete utterances; i.e., by having a register word modify a complete sentence. In the interlingua, we will accomplish this by simply prefixing the first word of the sentence with an appropriate register prefix. Here are a few examples:
    1. Laye + May I leave now?        <- humble
    2. Lin  + Can I watch TV?         <- polite
    3. Lewi + I'm leaving now.        <- slang
    4. Loa  + Why did you do it?      <- insulting
Example (1) would have the sense of the sentence "I humbly request permission to leave now", (2) is the same as "Can I please watch TV?", (3) would be equivalent to "Hey, I'm splittin' now", and (4) would have the flavor of the English sentence "You louse! Why did you do it?".
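Both uses of the register prefixes, on a single word and on a whole utterance, reduce to prefixing one word. The sketch below illustrates this under some assumptions: the register labels and function names are hypothetical, and the 'formal' register is left out because its prefix is not given in the table above.

```python
# Prefixes from the table above. 'formal' is omitted because its prefix
# is not specified there; 'neutral' is simply the unmarked form.
REGISTER_PREFIXES = {
    "humble": "laye",
    "praising": "lea",
    "polite": "lin",
    "slang": "lewi",
    "cold": "lun",
    "contemptuous": "loa",
    "vulgar": "layo",
}


def apply_register(word: str, register: str = "neutral") -> str:
    """Prefix a single word with the given register (neutral = unchanged)."""
    if register == "neutral":
        return word
    return REGISTER_PREFIXES[register] + word


def apply_register_to_sentence(words, register: str = "neutral"):
    """Mark a whole utterance by prefixing only its first word."""
    words = list(words)
    if not words:
        return words
    return [apply_register(words[0], register)] + words[1:]


print(apply_register("xevi", "polite"))          # linxevi
print(apply_register("fezopi", "contemptuous"))  # loafezopi
```

Note that the same prefix on the first word of a sentence is read as marking the utterance, so a program that generates the interlingua needs to know which reading is intended; this sketch does not address that question.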
She told me that Bill had broken the window.
Here, the time of the main clause is relative to the moment of speech, while the time of the subordinate clause is relative to the time of the main clause.
Tense has three basic values: past, present, and future. However, natural languages often have additional tenses that are variations of the three basic tenses, such as 'immediate past', 'remote future', as well as different forms for relative tenses. Aspect marks the temporal 'shape' of the event, and whether the event is being viewed from the 'inside' or from the 'outside'. There are two general aspects that apply to all events, and several more specific ones. Here are the two general aspects:
Perfect or Perfective: The event is considered to be a single, bounded unit, viewed from the outside; i.e., the event is completed. e.g.

    Past:    In this report, we showed that...
             John sang the song.
    Present: In this report, we show that...
             He catches the ball, swings around, and throws it...
    Future:  In this report, we will show that...
             John will sing the song.

Imperfect or Imperfective: The event is considered to be a range of points in time, viewed from somewhere within the range; i.e., the event is in progress. e.g.

    Past:    John was singing the song.
    Present: John is singing the song.
    Future:  John will be singing the song.

[The aspectual labels that I am using here are very common in the linguistic literature, but actual labels and their definitions vary somewhat from linguist to linguist. Also, since the words "perfect" and "imperfect" have common, unrelated, non-aspectual meanings in English, I will often use their respective synonyms "perfective" and "imperfective" instead to prevent misunderstandings.]
In English, the combination of present and perfect is almost never used except in formal reports and, occasionally, in colloquial narration. Technically, the combination is not usually meaningful. If it were, it would imply that an event can be viewed as both complete and ongoing at the same time, which is almost always self-contradictory. Still, it does occasionally have its uses since it allows the speaker to treat an ongoing event as if it were complete.
Because of this, natural languages will often use the present perfect form for something else. English, for example, almost always uses the present perfect form to represent a present tense generic or habitual meaning (discussed below). For example, the use of "sings" in "He sings very well" means that he habitually sings very well. It does not mean that he is actually singing at the present moment.
A few languages, including English, also take advantage of certain, very common verb-tense-aspect combinations to achieve greater efficiency. For example, some verbs are almost always used with a perfective meaning in the past tense and an imperfective meaning in the present tense. Some languages will take advantage of this by using the perfect form all of the time if the perfect form is less marked (i.e. 'shorter') than the imperfect form. Here are some examples:
Imperfective meaning, perfective form:

    John knows the answer.
    *John is knowing the answer.
    The book weighs 4 pounds.
    *The book is weighing 4 pounds.

Perfective or imperfective meaning (depending on context), perfective form:

    John knew the answer.
    *John was knowing the answer.

Imperfective meaning, perfective form:

    The fish stinks more than I can tolerate.
    *The fish is stinking more than I can tolerate.

Perfective or imperfective meaning (depending on context), perfective form:

    The hat was too big.
    *The hat was being too big.
[Note that all the verbs in this group are non-agentive "-s" verbs derived from state roots.]
Because the English imperfective form using an auxiliary plus "-ing" is longer than the perfective form, and since only one meaning is likely, the more efficient perfective form is used instead without confusion. I suspect that this kind of crossover is only likely to occur in languages whose perfective forms are more efficient than their imperfective forms. However, it is not universal. In Turkish, for example, the less efficient but semantically correct imperfective form is used for verbs such as 'know'.
Keep in mind that use of the imperfective indicates that we are looking at a point in time within a range of points; in other words, we are viewing the event from the inside. Use of the perfective implies that we are looking at the event as if it were bounded; in other words, we are viewing it from the outside, as if it were a single point in time (although it could be a very "large" point).
Now, consider the following:
    Imperfective: John was eating when Bill left.
    Perfective:   John ate when Bill left.
The first example is not bounded and can potentially extend both before and after the tense time. It's even possible that John is still eating when the sentence is uttered. The second example is bounded. John was definitely not eating before Bill left, and was definitely not eating when the sentence was uttered. In other words, a perfective event cannot extend outside of the boundaries imposed by the tense time. An imperfective event can extend beyond those boundaries.
There are several aspects that are more specific than the perfect or imperfect aspects. Here is a list of the most important ones:
Iterative: The event is repeated more than once on a SINGLE occasion.
    Past: John kept singing the song.
    Present: John keeps singing the song.
    Future: John will keep singing the song.

Habitual: The event is repeated more than once on DIFFERENT occasions.
    Past: John used to sing the song.
    Present: John sings the song (e.g. often).
    Future: John will sing the song (e.g. from now on).
    Note that "sing" in the above examples is perfective by default. We can also make it imperfective, as in "John used to be singing the song when ... etc".

Inceptive: Only the start point of the event is under consideration.
    Past: John started singing the song.
    Present: John starts singing the song.
    Future: John will start singing the song.

Terminative: Only a stopping point of the event is under consideration.
    Past: John stopped singing the song.
    Present: John stops singing the song.
    Future: John will stop singing the song.

Resumptive: Only a resumption point of the event is under consideration.
    Past: John resumed singing the song.
    Present: John resumes singing the song.
    Future: John will resume singing the song.
    Do not confuse resumptive with 'continuing'. In English, "resume" is never ambiguous, but "continue" is sometimes used as a synonym for "resume". Consider: "John resumed singing" vs. "John continues to sing even though I told him to stop". We'll see how to deal with this sense of "continue" later.

Completive: The event is done to completion, reaching a natural or obvious endpoint. English generally uses the verb "to finish" or an expression such as "really" or "to completion" to indicate this aspect.
    Past: John finished washing the dishes.
    Present: John finishes washing the dishes.
    Future: John will finish washing the dishes.
    Do not confuse "completeness" with "thoroughness". Something may be finished without having been done thoroughly.
The above definitions are, I believe, the best ones possible for an AL designer because they cover those categories of aspect that appear in most natural languages. Categories that appear in very few languages, such as 'excessive duration', 'limited duration', 'frequentative', 'partial completion', etc. are not true aspects, but are actually modifications of existing aspects, and can be handled by using adverbs.
The inceptive, terminative, resumptive, and completive are all perfect by default. If we force them to be imperfect, we can obtain senses such as "John was starting to sing the song when ...", etc. We'll see how to do this later on in this chapter.
In summary, tense describes the external temporal state of an event, while aspect describes the internal temporal state of an event.
Now, in the above list, I intentionally omitted the aspect usually referred to as generic. Here are some examples:
    Squirrels live in trees.
    Americans produce too much garbage.
    Sapphires cost more than diamonds.
    Dogs bark when the moon is full.
Many (and perhaps most) languages use the same form for both habitual and generic aspects. This is possible because the subject of a habitual is always definite:
    Generic:  Dogs bark when the moon is full.
    Habitual: His dogs bark when the moon is full.
Keep in mind that "genericness" is really a property of a noun - not of a verb.
[Incidentally, English also allows a definite article to appear with an indefinite noun, which can be confusing, as in "The elephant lives in Africa". Here, the context must make it clear whether the speaker means a particular elephant or elephants in general. The interlingua does not allow this ambiguity.]
In English, the habitual is also often rendered with the words "always" or "all the time", as in "John always eats alone". In general, the habitual aspect should be used when referring to a series of events whose actual number is not relevant. The numeric derivation "bikumoge", meaning 'always', (which we derived earlier in the chapter on Counts and Measures) should be used when the actual number of events is relevant. A good test for this is to paraphrase the sentence using the word "habitually". If the result is acceptable, then the habitual aspect should be used. Otherwise, the numeric derivation should be used. The numeric derivation should also be used whenever the speaker wants to emphasize that the event or situation occurred at every possible opportunity.
Tense seems to be morphologically or lexically linked to aspect in most natural languages. In the interlingua, we will accomplish this by allocating roots that are mnemonically compositional, just as we did for deictics. In other words, tense and aspect words will be formed from true, unique root morphemes, but we will design them in a way that will display their inherent compositionality. Here are the details:
    Aspect                    Tense
    --------------------      ---------------------
    Perfective:   fin-        Past:        -cip
    Imperfective: doy-        Present:     -das
    Iterative:    kaw-        Future:      -jev
    Habitual:     xa-         Unspecified: -bul
    Inceptive:    ci-
    Terminative:  ju-
    Resumptive:   da-
    Completive:   je-
    Unspecified:  to-
By default, when not preceded by an aspect marker, "cip" will be past-perfect, "das" will be present-imperfect, and "jev" will be future-perfect. "Bul" is discussed below.
Also, tense-aspect words take an entire clause as an argument, rather than just modifying the verb. In the interlingua, we will use the part-of-speech marker "-u" to indicate this.
Here are a few examples:
    John looked at the house.
        = past perfect
        = John cipu look at the house.
    John will be reading a book when I arrive.
        = future imperfect
        = John doyjevu read a book when I arrive.
    John is replacing the front tire.
        = present imperfect
        = John dasu replace the front tire.
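The composition shown in the table and examples above can be sketched in a few lines of Python. This is only an illustration: the function names `tense_aspect_word` and `effective_aspect` and the dictionary layout are my own assumptions, not part of the interlingua software. The prefixes, roots, defaults, and the "-u" marker are taken directly from the text.

```python
# Aspect prefixes and tense roots, copied from the tense-aspect table.
ASPECT_PREFIX = {
    "perfective": "fin", "imperfective": "doy", "iterative": "kaw",
    "habitual": "xa", "inceptive": "ci", "terminative": "ju",
    "resumptive": "da", "completive": "je", "unspecified": "to",
}
TENSE_ROOT = {"past": "cip", "present": "das", "future": "jev",
              "unspecified": "bul"}

# Default aspect when no aspect marker precedes the tense root.
DEFAULT_ASPECT = {"past": "perfective", "present": "imperfective",
                  "future": "perfective"}


def tense_aspect_word(tense, aspect=None):
    """Build a tense-aspect word: optional aspect prefix + tense root + "-u".

    Hypothetical helper; when `aspect` is None the bare root is used and
    the default aspect for that tense applies.
    """
    prefix = "" if aspect is None else ASPECT_PREFIX[aspect]
    return prefix + TENSE_ROOT[tense] + "u"


def effective_aspect(tense, aspect=None):
    """Return the aspect actually expressed, applying the stated defaults."""
    return aspect if aspect is not None else DEFAULT_ASPECT[tense]


if __name__ == "__main__":
    print(tense_aspect_word("past"))                    # cipu
    print(tense_aspect_word("future", "imperfective"))  # doyjevu
    print(tense_aspect_word("present"))                 # dasu
```

Because every word is a plain concatenation of table entries, a translation program never needs a separate lexicon entry for each tense-aspect combination.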
It is also possible to apply more than one of the more specific aspects at the same time, but the way this is implemented varies considerably from language to language. Some languages have simple and regular rules for doing so, while others must depend on context or periphrasis. English is an example of the latter. Consider the following three English sentences:
    1. Louise started singing the song 5 minutes ago.
    2. Louise started singing the song 5 years ago.
    3. Louise started singing the song 5 years ago and has never stopped, even to eat or sleep.
In most circumstances, (1) would be interpreted as perfective-inceptive. Example (2), though, would normally be interpreted as a combination of perfective-inceptive and perfective-habitual (assuming Louise is a human with normal human limitations). However, context makes it clear that (3) can only be interpreted as perfective-inceptive, while also implying that Louise is some kind of supernormal creature.
Now, let's look at some examples:
    He keeps sneezing.
        = present iterative
        = He kawdasu sneeze.
    John started singing the song.
        = past inceptive
        = John cicipu sing the song.
    John used to sing the song.
        = past habitual
        = John xacipu sing the song.
Now, compare the last one above with:
    John used to be singing when I visited.
        = past habitual + imperfect verb
        = John xacipu doybulu sing when I visited.
Note that when more than one aspect is applied to a verb using more than one word, tense must be applied to the outermost aspect, and the inner aspect(s) must be tenseless. A clause may have more than one aspect, but may not have more than one tense. Note that this is consistent with the use of infinitives, participles, and other equivalent non-finite forms in natural languages.
Here are some more examples that require more than one aspect:
    John was starting to sing the song when...
        = past imperfect + inceptive + verb
        = John doycipu cibulu sing the song when...
    John used to stop smoking as soon as I arrived.
        = past habitual + terminative + verb
        = John xacipu jubulu smoke as soon as I arrived.
    John started to (habitually) smoke when he was 15 years old.
        = past inceptive + habitual + verb
        = John cicipu xabulu smoke when he was 15 years old.
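The rule that only the outermost aspect carries tense, while every inner aspect uses the tenseless root "bul", can be sketched as follows. The function name `stacked_markers` is hypothetical and the sketch assumes the aspects are listed outermost first; the prefixes and roots are those given in the tense-aspect table.

```python
# Aspect prefixes and tense roots from the tense-aspect table.
ASPECT_PREFIX = {
    "perfective": "fin", "imperfective": "doy", "iterative": "kaw",
    "habitual": "xa", "inceptive": "ci", "terminative": "ju",
    "resumptive": "da", "completive": "je",
}
TENSE_ROOT = {"past": "cip", "present": "das", "future": "jev"}


def stacked_markers(tense, aspects):
    """Return the tense-aspect words for several aspects on one clause.

    `aspects` is ordered outermost first. The first word carries the
    tense root; all later words use the tenseless root "bul", so the
    clause never ends up with more than one tense.
    """
    words = []
    for position, aspect in enumerate(aspects):
        root = TENSE_ROOT[tense] if position == 0 else "bul"
        words.append(ASPECT_PREFIX[aspect] + root + "u")
    return " ".join(words)


if __name__ == "__main__":
    print(stacked_markers("past", ["habitual", "imperfective"]))  # xacipu doybulu
    print(stacked_markers("past", ["imperfective", "inceptive"]))  # doycipu cibulu
    print(stacked_markers("past", ["inceptive", "habitual"]))      # cicipu xabulu
```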
And so on.
English past perfect and future perfect tenses are used when the tense of the main clause is relative to the tense of an embedded clause. However, there is no need to implement special "perfect" forms in the interlingua, because they can be dealt with using simpler constructions. Here are some examples:
    John will have left when I arrive.
        = John will leave before I arrive.
    John had been sick when Bill arrived.
        = John was sick before Bill arrived.
    John had gone to a great deal of trouble to convince her.
        = John went to a great deal of trouble to convince her.
The English past perfect and future perfect tenses are almost never used in ordinary speech. Instead, simpler constructions like the ones above are used.
However, there is a way to achieve the same effect, as we will see later.
The English present perfect form is used when a situation occurred in the past, is still occurring in the present, and will presumably continue into the future. In effect, all three tenses apply. For example, the English sentence "John has been angry for a long time" can be expressed as "John was angry for a long time and continues to be so". Now, we could create a special tense-aspect marker for this, but it's really not necessary. We can simply use the tenseless word "bulu". Thus, we have "John bulu be-angry for a long time", where "be-angry" is a P-s verb. Note that "bulu" is inherently imperfective. [Note that we can also use the perfective "finbulu" instead of the imperfective "bulu" = "doybulu", depending on how the event is perceived. However, English does not appear to be able to make this distinction and I'm not sure the distinction is meaningful. (Unless, that is, the actual imperfective translation is "John has been being angry" or "John has been being fixing the car". However, these are not grammatically acceptable).]
For verbs that use the auxiliary "do" for interrogative present imperfect (e.g. "DO you know/have/see/want/etc") rather than "be" (e.g. "ARE you eating/staying/going/etc"), the English translation will sound more natural if the adverb "already" is used with the simple present tense:
    John wants a bicycle, but he bulu have one
        = *John wants a bicycle, but he has been having one.
        = John wants a bicycle, but he ALREADY HAS one.
    Bill can't hide because I bulu see him
        = *Bill can't hide because I have been seeing him.
        = Bill can't hide because I ALREADY SEE him.
But if the verb has a durational case tag, we do not want to use the simple present:
    John wants a bicycle, but he bulu have one for three months
        = John wants a bicycle, but he has ALREADY had one for three months.
There are also other complex English tense-aspect forms that tend to be used in formal writing but which all have simpler counterparts. For example, "You must have answered all of the questions before you can leave" can be stated more simply as "You must answer all of the questions before you can leave".
In the interlingua, we will create a few simple rules that will make the intended tense and aspect obvious - even to a computer - when an explicit tense-aspect marking is missing. Here are the rules that I feel are both natural and efficient:
    a. If a tense-aspect disjunct immediately precedes the verb, then it will set the tense and aspect accordingly.
    b. Otherwise, if the verb is derived from a temporal deictic root, then the tense will be the deictic tense, and the aspect will be perfective for past or future or imperfective for present.
    c. Otherwise, if a verb is modified by a tense-aspect adverb, then the adverb will provide the tense and/or aspect. If the adverb indicates aspect but not tense, then the tense will be past. If the adverb indicates tense but not aspect, then the aspect will be perfective.
    d. Otherwise, if the verb follows a conjunction then it will have the same tense and aspect as the preceding verb that it is linked to.
    e. Otherwise, if an embedded clause is introduced by a case tag derived from a root that specifies tense (i.e., 'before', 'after', 'when', 'until', and 'since'), then the default tense of the embedded clause will be the same as the tense of the main clause, and the default aspect will be perfective.
    f. Otherwise, the tense and aspect will be past-perfective.
The above tests must be carried out in the order shown.
Here are some examples:
    a. John jevu study. = John will study.
       [Here, "jevu" is the future-perfective disjunct.]
    a. John doycipu study. = John was studying.
       [Here, "doycipu" is the past-imperfective disjunct.]
    b. The party dipa. = The party will take place later.
       [Here, "dipa" is the simple temporal deictic verb meaning 'to take place later'.]
    c. He speak to his sister dipe-tovay. = He will speak to his sister tomorrow.
       [The adverb "dipe-tovay" = 'tomorrow' forces the verb to be future perfective.]
    c. He speak to his sister dikumoge. = He spoke to his sister three times.
       [Since "dikumoge" = 'three times' does not indicate tense (just iterative aspect), the default past tense applies.]
    c. He speak to his sister bape. = He is speaking to his sister now.
       [The adverb "bape" = 'now' forces the verb to be present imperfective.]
    d. He go dipe and John go too. = He will go later and John will go too.
       [The adverb "dipe" = 'later' forces the first "go" to be future tense, and the second verb "go" is also future because of the conjunction "and".]
    e. John jevu use the computer jeve you fix it. = John will use the computer after you fix it.
       [Note that "fix" is future tense even though "jevu" = 'will' does not precede it, because "jeve" = 'after' is derived from the future tense root "jev" (discussed below in the next section).]
    f. He break the window. = He broke the window.
    f. He ask me a question. = He asked me a question.
    f. He know geometry. = He knew geometry.
    f. John walk to school. = John walked to school.
    f & e. He arrive after you go. = He arrived after you left.
       ["Arrive" is past tense because of rule f. "Go" is the same tense as "arrive" because of rule e.]
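Since rules (a) through (f) must be tested in a fixed order, they map naturally onto a chain of early returns. The following Python sketch is only one possible encoding: the `VerbContext` record and all of its field names are hypothetical, and it assumes that an earlier parsing stage has already worked out which disjunct, deictic root, adverb, conjunction, or temporal case tag applies to the verb.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

TenseAspect = Tuple[str, str]  # (tense, aspect)


@dataclass
class VerbContext:
    """Hypothetical summary of what surrounds a verb (all fields optional)."""
    disjunct: Optional[TenseAspect] = None     # rule a: e.g. ("future", "perfective")
    deictic_tense: Optional[str] = None        # rule b: verb from a temporal deictic root
    adverb_tense: Optional[str] = None         # rule c: tense supplied by an adverb
    adverb_aspect: Optional[str] = None        # rule c: aspect supplied by an adverb
    linked_verb: Optional[TenseAspect] = None  # rule d: preceding conjoined verb
    main_clause_tense: Optional[str] = None    # rule e: set only for an embedded clause
                                               # introduced by a temporal case tag


def resolve_tense_aspect(ctx: VerbContext) -> TenseAspect:
    """Apply rules a-f in order and return (tense, aspect)."""
    if ctx.disjunct is not None:                        # rule a
        return ctx.disjunct
    if ctx.deictic_tense is not None:                   # rule b
        aspect = "imperfective" if ctx.deictic_tense == "present" else "perfective"
        return (ctx.deictic_tense, aspect)
    if ctx.adverb_tense is not None or ctx.adverb_aspect is not None:  # rule c
        return (ctx.adverb_tense or "past", ctx.adverb_aspect or "perfective")
    if ctx.linked_verb is not None:                     # rule d
        return ctx.linked_verb
    if ctx.main_clause_tense is not None:               # rule e
        return (ctx.main_clause_tense, "perfective")
    return ("past", "perfective")                       # rule f
```

For instance, under these assumptions "He break the window" is resolved by `resolve_tense_aspect(VerbContext())`, which falls through to rule (f), while the embedded verb in "John jevu use the computer jeve you fix it" is resolved by passing `main_clause_tense="future"`.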
The tense-aspect roots represent many useful concepts, and can undergo further derivation to produce many useful words.
Before we can proceed, though, we need to define the semantics of the conversion process. In other words, since we will be using a tense-aspect root as a state root, we need to define the meaning of the resulting state. Here are the meanings that we will use in the interlingua:
1. If a root contains tense information, the corresponding state will represent a point or range of points on the time line. For example, a past tense derivation indicates that the patient occurs before the focus.
2. If a root contains aspectual information, the corresponding state will represent a point in time (perfective) or a point within a range of points (all imperfectives). Furthermore, imperfectives may not be focused, because the referent is implied by the aspect itself. For example, inceptive indicates the start of the patient event, the terminative indicates a stopping point in the patient event, etc.
[Technically, this is not correct. However, the results would represent complex states that, to my knowledge, have no single-word counterparts among natural languages. For example, a past-terminative derivation would represent the concept 'P stopping before F', which can be just as easily rendered with two words. Thus, implementing these concepts would require complex programming with little or no known benefit. Because of this, we will not allow imperfective derivations to be focused in the interlingua.]
3. The patient of a derivation is the entity or event that experiences the temporal/aspectual state, while the focus (if any) is the referent. Tenseless derivations cannot have a focus because they do not have a referent. And since imperfectives cannot be focused, they must all be tenseless.
4. All tense (but not aspect!) derivations must be static ("-s"). Dynamic ("-d") tense derivations do not make sense because they would imply movement on the timeline in directions other than what occurs naturally.
Tense-aspect roots will all be P/F-s by default.
With the above in mind, we can now create the following useful words:
jev (future-perfect):

    jeva - P/F-s verb = 'to be in the future relative to the focus', 'to occur/happen after', 'to postdate'
        e.g. The accident jeva the party. = The accident occurred after the party.
    jeve - P/F-s case tag = 'after', 'since', 'subsequent to', 'once'
        e.g. He left jeve I did. = He left after I did.
        [The open adjective form "jevyu" has the meaning 'subsequent to', 'after (a/the)', or 'since (a/the)'.]
    jevomo - P-s [-F] adjective = 'subsequent', 'following', 'next', 'succeeding'
    jevome - P-s [-F] adverb = 'afterwards', 'after that', 'then', 'later', 'next'
        e.g. We saw him three times since then.
        [Note that both "jevuse" and "jevome" are synonymous, since P-s is semantically equivalent to P-s [-F]. As we mentioned earlier, this only applies when the root is focused by default (all tense-aspect roots are inherently focused).]
    jevanza - A/P/F-s verb = 'to cause event P to occur after event F' = 'to have/hold P after F'
        e.g. I jevanza the meeting Bill arrive = I had the meeting after Bill arrived.
    jevasa - A/P-s verb = 'to cause event P to occur later' = 'to have/hold P later'
        e.g. I jevasa the meeting = I'm having the meeting later.
These should look familiar. In the section on temporal case tags, we used the root "cip" to represent the temporal relationship meaning 'before'. This root, as we can now see, is simply the tense-aspect root representing past tense with perfect aspect.
It's important to emphasize that the agent actually causes the event to occur. It's not just being scheduled. For example, we cannot translate "jevasa" as 'to schedule P for later'. If we did, it would mean that the following two sentences are synonymous:
    We had the meeting yesterday.
    We scheduled the meeting for yesterday.
Obviously, the two sentences are not synonymous.
Now, let's create some more useful words:
cip (past, perfect aspect):

    cipaw - P/F-s open noun = 'predecessor of', 'precursor of', 'forerunner to', 'prelude to'
        e.g. The telegraph was the cipaw all modern communications. = The telegraph was the precursor of all modern communications.
    cipomi - P-s [-F] noun = 'predecessor', 'precursor', 'forerunner', 'vanguard', 'prelude'
        [Note that we cannot use "cipi" because all of the English glosses imply that the focus is known from context. We must use the anti-middle "cipomi".]
    cipanza - A/P/F-s verb = 'to cause event P to occur before event F' = 'to have/hold P before F'
        e.g. We cipanza the meeting Bill arrives = We're having the meeting before Bill arrives.
    cipasa - A/P-s verb = 'to cause event P to occur earlier' = 'to have/hold P earlier'
        e.g. We cipasa the meeting because of the weather = We're having the meeting earlier because of the weather.

doycip (past, imperfect), findas (present, perfect), and doyjev (future, imperfect):

    doycipa - P/F-s = 'to be a range of points that occurs before F'
    findasa - P/F-s = 'to be a point in time that occurs at F'
    doyjeva - P/F-s = 'to be a range of points that occurs after F'

    We will not use these derivations because the fact that P is a single point or a range of points is indicated by P itself and does not need to be repeated in the verb. Thus, for the sake of brevity, we will only use the default forms for "cip", "das", and "jev".
    [TBD - How do we distinguish between imperfect "meanwhile/in the meantime" and perfect "at the time"???]

das (present tense, imperfect aspect):

    dasa - P/F-s = 'to be at the time F', 'to occur when/at/while/during the time of/at the same time as'
        e.g. The accident dasa he fell asleep at the wheel. = The accident occurred when he fell asleep at the wheel.
    dase - P/F-s case tag = 'when/at/while/during/for', 'at the time of', 'at the same time as'
        e.g. Louise laughed dase Bill arrived = Louise laughed when Bill arrived.
        e.g. I arrived dase Bill was eating. = I arrived when/while Bill was eating.
        e.g. I left dase the parade = I left during the parade.
        e.g. I worked there dase three years. = I worked there for three years.
    dasuso - P-s adjective = 'ongoing', 'in progress', 'happening', 'underway', 'current'
    dasanza - A/P/F-s verb = 'to cause event P to occur when/at/while/during F'
        e.g. I dasanza the meeting the conference = I'm having the meeting during the conference.
    dasasa - A/P-s verb = 'to cause event P to occur at the same time'
        e.g. I dasasa the meeting = I'm having the meeting at the same time.
    dasinza - AP/F-s verb = 'to spend/pass', 'to keep oneself at/during time period F'
        e.g. I dasinza three days here studying French = I'm spending three days here studying French.
    dasisa - AP-s verb = 'to spend/pass the time'
        e.g. I dasisa studying French = I'm spending the time studying French.
In a similar vein, we can use the tenseless root "bul" to create the "0" adverb "buloge" with the meaning 'still', since "bul" implies past, present, and future. Here are some other useful derivations:
bul (tenseless = past + present + future):

    bulasa - A/P-s verb = 'to prolong', 'to protract', 'to continue', 'to keep an event ongoing'
        e.g. The children want to bulasa their vacation = The children want to prolong their vacation.
    bulusa - P-s verb = 'to continue (on)'
        e.g. The rain bulusa until very late. = The rain continued until very late.
    buloge - "0" adverb = 'still', 'continue to'
        e.g. John still wants to go to the beach.
        Note, though, that the adverb form is not very useful since we can apply the tense marker "bul" directly to the verb.
It will also be useful to have a past plus present root. For this purpose, we will allocate the classifier "kom". For example, in its underived form, "kom" represents both past and present (but excludes the future). Thus, the sentence "John komu wash dishes" means 'John washed dishes until just now' or 'John has (just) washed the dishes', and the sentence "It komu snow" means 'It was snowing until just now'. Here are some useful derivations:
kom (past+present tense, perfect aspect):

    koma - P/F-s verb = 'to occur both before and at the focal event', 'to last until', 'to take place until', 'to go on until'
        e.g. The party koma midnight. = The party goes on until midnight.
    kome - P/F-s case tag = 'until', 'up to the time of', 'by (the time)', 'not later than'
        e.g. We stay here kome it starts raining. = We're staying here until it starts raining.
        e.g. They should arrive kome noon. = They should arrive by noon.
Note that 'past plus present' ("kom") is not the same as negating the future. When we negate the future, we are simply saying that the event did not occur in the future. This means that it could have occurred in the past or the present or neither or both; i.e., there are four possible interpretations. When we combine past and present, we are saying that the event occurred both before and at the referent time; i.e., there is only one possible interpretation.
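The count of interpretations can be checked mechanically. The short Python sketch below is my own illustration (the variable names are not from the interlingua software): it treats an interpretation as a triple of truth values saying whether the event holds before, at, and after the referent time, then compares "not future" with "kom".

```python
from itertools import product

# Every way an event can hold (past, present, future) relative to the
# referent time: 2 * 2 * 2 = 8 combinations.
all_cases = list(product([False, True], repeat=3))

# Negating the future only fixes the third value; past and present stay
# open (past only, present only, neither, or both).
not_future = [case for case in all_cases if not case[2]]

# "kom" asserts past AND present, and excludes the future.
kom = [case for case in all_cases if case[0] and case[1] and not case[2]]

if __name__ == "__main__":
    print(len(not_future))  # 4
    print(len(kom))         # 1
```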
In the same way, we will allocate the root "xus" to represent both present and future, but exclude the past. For example, "He xusu work there" means 'He will work there from now on'. Here are some useful derivations:
xus (present+future tense, perfect aspect):

    xusa - P/F-s verb = 'to occur both at and after the focal event', 'to have lasted since', 'to have taken place since', 'to have gone on since'
        e.g. The party xusa midnight. = The party has gone on since midnight.
    xuse - P/F-s case tag = 'since', 'effective (as of)', 'starting (when)'
        e.g. He's been sick xuse October. = He's been sick since October.
    xusome - P-s [-F] adverb = 'ever since', 'since then', 'from then on', 'from that moment on'
        e.g. John use that office xusome. = John used that office from that moment on.
Now, let's do some imperfective derivations. Keep in mind that in all imperfective derivations, the patient represents a range of points and the aspect itself indicates the point within the range. Because of this, all imperfective derivations must be unfocused (unless the focus elaborates the aspect??? Is this possible???):
cibul (tenseless inceptive):

    cibulapa - A/P-d verb = 'to start P', 'to begin P', 'to initiate'
        e.g. We cibulapa the new policy in December. = We started the new policy in December.
    cibulupa - P-d verb = 'to start/begin/start up'
        e.g. The rain cibulupa now. = The rain is starting now.

jubul (tenseless terminative):

    jubulapa - A/P-d verb = 'to stop/halt', 'to bring to a stop/halt'
        e.g. The police jubulapa the illegal gambling at the tavern. = The police halted the illegal gambling at the tavern.
    jubulasa - A/P-s verb = 'to suppress/repress/curb/check/arrest/restrain', 'to keep in a halted state'
    jubulupa - P-d verb = 'to stop/halt', 'to come to a stop/halt'
        e.g. The rain jubulupa now. = The rain is stopping now.

dabul (tenseless resumptive):

    dabulapa - A/P-d verb = 'to resume', 'to continue on with'
        e.g. We dabulapa the trip when John was feeling better. = We resumed the trip when John was feeling better.
    dabulupa - P-d verb = 'to resume'
        e.g. The rain dabulupa. = The rain resumed.

jebul (tenseless completive):

    jebulapa - A/P-d verb = 'to finish', 'to end', 'to bring to an end'
        e.g. We jebulapa the trip when John was feeling better. = We finished the trip when John was feeling better.
    jebulupa - P-d verb = 'to end', 'to finish', 'to come to an end'
        e.g. The rain jebulupa at 3 o'clock. = The rain ended at 3 o'clock.
Note that AP forms are also useful and imply that the agent affected himself. For example, we must use AP-d "jubulipa" in a sentence such as "We'll stop when we reach the next milestone". In other words, "jubulipa" literally means 'to stop oneself'.
xabul (tenseless, habitual):

    xabule - P/F-s case tag = "every (time)" or "each (time)" in habitual time elaborations
        e.g. I went to Boston xabule three weeks. = I went to Boston every three weeks.
        e.g. He spends too much money xabule he goes to the market. = He spends too much money each time he goes to the market.
    xabuli - P/F-s noun = 'habit', 'wont', 'custom', 'routine', 'convention'
    xabulo - P/F-s adjective = 'usual', 'habitual', 'regular', 'customary', 'typical', 'routine', 'conventional'
    xabulasa - A/P-s verb = 'to do/perform event P habitually', 'to be in the habit of'
    xabuloge - "0" adverb = 'usually', 'habitually', 'regularly', 'customarily', 'typically', 'routinely'

kawbul (tenseless, iterative):

    kawbulasa - A/P-s verb = 'to do/perform event P iteratively', 'to repeat/iterate', 'to do over and over'
    kawbuloge - "0" adverb = 'repeatedly', 'iteratively', 'over and over', 'time after time'
As we saw earlier, in the chapter on Counts and Measures, we can also achieve an iterative meaning by using both specific and non-specific numeric values with the "0" suffix "-og". For example, "xekumoge" means 'twice', "fokumoge" means 'occasionally/a few times', "jukumoge" means 'never', and so on.
In sum, there is no need to randomly allocate state roots to perform temporal or aspectual functions. In fact, if we should ever discover that we are unable to "derive" an essential temporal or aspectual word using the above approach, then it will imply that our tense-aspect system is incomplete. If this should occur, then an appropriate new entry should be added to the tense-aspect table.
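Because every word in the lists above is a tense-aspect root followed by a regular suffix, a program can recover the root and the argument structure by simple string matching. The Python sketch below is an illustration under stated assumptions: `split_verb` is a hypothetical name, the suffix table covers only the verb suffixes that actually appear in this section, and the sketch freely combines every aspect prefix with every tense root, which is more permissive than the text itself licenses (e.g. for "kom" and "xus").

```python
# Verb suffixes and their argument-structure labels, as used in the
# word lists of this section (not a complete inventory of the language).
VERB_SUFFIXES = {
    "a": "P/F-s", "anza": "A/P/F-s", "asa": "A/P-s", "usa": "P-s",
    "inza": "AP/F-s", "isa": "AP-s",
    "apa": "A/P-d", "upa": "P-d", "ipa": "AP-d",
}
TENSE_ROOTS = ("cip", "das", "jev", "bul", "kom", "xus")
ASPECT_PREFIXES = ("", "fin", "doy", "kaw", "xa", "ci", "ju", "da", "je", "to")


def split_verb(word):
    """Split a derived verb into (root, suffix, label), or return None.

    The root is matched first, so that "dasa" is read as das + a rather
    than as some shorter string followed by "-asa".
    """
    for tense in TENSE_ROOTS:
        for prefix in ASPECT_PREFIXES:
            root = prefix + tense
            if word.startswith(root):
                suffix = word[len(root):]
                if suffix in VERB_SUFFIXES:
                    return root, suffix, VERB_SUFFIXES[suffix]
    return None


if __name__ == "__main__":
    print(split_verb("jevanza"))   # ('jev', 'anza', 'A/P/F-s')
    print(split_verb("dasa"))      # ('das', 'a', 'P/F-s')
    print(split_verb("jubulipa"))  # ('jubul', 'ipa', 'AP-d')
```

A word that cannot be split this way, such as the case tag "jeve", simply returns `None`; handling the non-verb suffixes would require extending the table.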
It's also possible to apply the polarity modifiers to tense-aspect roots and their derivatives. Here is an example with the future-perfect root "jev":
    bijevu     eventually, in a very long time, in the remote future
    kejevu     eventually, a long time from now, far in the future
    canjevu    in a while, in a moderate time from now
    fojevu     soon, in a short time from now, in the near future
    zujevu     very soon, in the very near future
Note that "zujevu" can be used to capture the meaning of the English expression "to be on the verge of". For example, "Zujevu John get a promotion" can be translated 'John is on the verge of getting a promotion'.
A tense-aspect word ending in "-u" should not be confused with the P-s [-F] derivation, which takes an embedded sentence as its only argument. For example, "John jevu leave" meaning 'John will leave' is not the same as "Jevoma John leave" meaning 'It was in the future relative to an unspecified focus that John left' or simply 'John left afterwards', where the unspecified focus is either obvious or provided by the context. In the "-u" form, where tense is deictic, the unspecified focus is always the time of the utterance. When using an unfocused verb derivation, the unspecified focus does not necessarily have to be the time of the utterance.
It's also important to note that the verb form will have a tense of its own. This allows us to create equivalents to English's past and future perfect tenses. For example, "Jevu jevoma John leave" means literally 'It will be in the future that John left', or simply 'John will have left'.
As we discussed earlier, a verb which takes an embedded sentence as its only argument is called a disjunct. A disjunct is deictic when the unspecified arguments are determined by the speech environment (who is speaking, who is listening, where the speech is taking place, when the speech is taking place, etc). When a disjunct is derived by using an unfocused verb or a middle derivation of a focused verb, it is not deictic because the unspecified arguments are obtained from context (i.e., obvious from what has already been said) or from general knowledge.
Whenever we speak, we always provide some indication of our commitment or attitude towards what we are saying. In effect, we make an impersonal judgment of the truth or consequences of the event we are speaking about. This impersonal judgment of a speaker towards what he is saying is called the modality of an utterance, and can vary in kind as well as in degree. Here are some English examples:
    You must go now.               -> 100% obligation
    You should go now.             -> high obligation
    You need to go now.            -> high necessity
    He left.                       -> 100% probability
    He may have left.              -> undefined probability
    He might have left.            -> low probability
    He did not leave.              -> zero probability
    He should be there.            -> high probability
    He's bound to be there.        -> high inevitability
    He'd better be there.          -> high consequentiality
    It seems the storm is over.    -> high evidentiality
[Evidentiality indicates the speaker's judgment about how reliable the information is.]
As you can see from the above examples, there is very little regularity in the English modal system, and this is typical of perhaps all natural languages. Modal systems evolve slowly over time and can be quite idiosyncratic. In a single language, some modals may take the form of inflections, some may use auxiliaries, while some may use verbs, adverbs, or other open class words. In this respect English is typical.
Unfortunately, different languages implement modal concepts in different ways, and a particular modal may be used for more than one type of modality or may cover different degrees. For example, the English modal "should" can express either probability or obligation.
There may also be different ways of expressing the same type and degree of modality. For example, the English expressions "should" and "ought to" are essentially synonymous, as are "must/have to", "does it matter/is it important", and so on.
Finally, modalities often overlap in meaning. For example, both "must" and "have to" can imply obligation, probability, or inevitability.
In fact, the modal systems of natural languages vary so much and are so idiosyncratic that a truly neutral and regular system is unlikely to resemble the system of any natural language. Fortunately, the semantics of modality is highly regular, and can be categorized.
The most basic modal concept is 'probability'. It is the most basic because it provides us with the most common sentential types: positive statements, negative statements, and statements of the likelihood of events. Here is a breakdown of the probability modality:
    probability:
        He left yesterday.                     100% probable
        He must have left yesterday.           high
        He probably left.                      average
        He might/could have left yesterday.    low
        He just might have left yesterday.     very low
        He did not leave yesterday.            0%
        He may have left yesterday.            undefined
        Did he leave yesterday?                interrogative
The 100% probability modality is normally referred to as the indicative, and the 0% probability modality is referred to as the negative. Also, the 100% probability modality is normally unmarked. When it is explicitly marked, it is called the emphatic. (Cf. "He left yesterday" vs. "He did leave yesterday" or "He definitely left yesterday".)
There are also several other modalities. However, in most natural languages, these modalities generally only have unique modal forms for the 100% or high degree, if at all. Other degrees of modality are generally obtained by use of adjectives, adverbs, normal verbs, disjuncts, and other kinds of periphrasis. Here is a listing of a few of the other modalities, illustrating the 100% and the high degrees of each one in English:
    obligation:
        He must (= has to) go now.          100%
        He should (= ought to) go now.      high
    necessity:
        It is essential that he go now.     100%
        He needs to go now.                 high
    evidentiality:
        It's obvious that he left.          100%
        He seems to have left.              high
          or It looks like he left.
    inevitability:
        He must be there by now.            100%
        He's bound to be there by now.      high
The other degrees of modality (and occasionally the 100% and high degrees, as well) are often quite idiosyncratic, and may require adjectives, adverbs, normal verbs, and unusual language-specific forms of prosody and/or periphrasis.
Some linguists consider certain feelings about an event to be modal in nature. Here are some examples:
    fear:       I fear that he left.
    sorrow:     It's sad that he flunked the course.
    curiosity:  It's curious that he left so early.
    revulsion:  It's disgusting that he'd be so crude.
However, these are not true modals because the embedded event causes the state of the speaker. For a true modal, the speaker is judging a situation and must be the source of the judgment. Besides, these feelings are inherently mental and personal, and represent the state of the speaker himself. They do not represent the speaker's impersonal or unbiased judgment of an event. Thus, they should be derived from basic state verbs.
Some people may also be tempted to include other attitudes, such as anger, fondness, hatred, suspicion, desire, optimism, etc. among the modals. However, these again do not indicate the speaker's judgment about what he is saying. In fact, they are true mental states that represent the speaker's feelings towards the event, the listener, or a third party.
In summary, a true modal must be judgmental, but it must also be impersonal, which means that it must not represent the mental or emotional state of the speaker or others he may be speaking for.
Since all modalities express the speaker's judgment towards what he is saying, they are, in effect, a kind of speech act. In fact, all true modals can be paraphrased as something like "I say that there is X degree of modality Y that Z". For example, the sentence "You need to find a job" can be paraphrased as "I say that there is a high degree of necessity that you find a job". And like all speech acts, the 'agent' (i.e. the speaker) attempts to cause a change of state in the 'patient' (i.e. the listener), either by affecting the behavior of the patient or by imparting information to the patient. In other words, the speech act either tries to convince the listener to do or to not do something, or it tries to get the listener to accept, question, reject, or supply information. It's important to keep this in mind if you should ever feel that other concepts may be inherently modal in nature. [Later, we'll discuss a rigorous and comprehensive test for modal concepts.]
All modalities belong to one of two categories:
    1. Epistemic: an impersonal judgment of a REAL situation (e.g. "John may have gone away.")
    2. Deontic: an impersonal judgment of a HYPOTHETICAL situation (e.g. "John should go away.")
As we saw above with epistemic probability, each modal concept can take on a range of values. Here are complete examples for epistemic probability and deontic obligation:
    Epistemic probability:
        100%:           John left.
        high:           John must have left.
        average:        John probably left.
        low:            John might/could have left.
        very low:       John just might have left.
        0%:             John did not leave.
        undefined:      John may have left.
        interrogative:  Did John leave?

    Deontic obligation:
        100%:           John must/has to leave.
        high:           John should leave.
        average:        John probably should leave.
        low:            John may/might have to leave.
        very low:       John just may/might have to leave.
        0%:             John doesn't have to leave.
        undefined:      John can/may leave.
        interrogative:  Should John leave?
Thus, 100% and high versions of deontic modalities imply that the hypothetical event can, should, or will occur. The low and 0% versions simply indicate lower degrees of commitment on the part of the speaker. The undefined deontic is used to indicate that change is optional.
Obviously, there is a great deal of similarity between modals and scalar states. However, there is also an important difference. For scalar states, the unmarked condition is a normal distribution centered about the 50% point, and never includes the 0% point. For example, the word "feculo", meaning 'hot', has the 50% point (polarity morpheme "can-") as its most prototypical interpretation, but it can also include the high and low degrees, and even the maximum and minimum degrees. However, it cannot include the 0% degree (polarity morpheme "ju-").
For modals, the unmarked condition (i.e., the "undefined" degree) is a straight line with zero slope which can include the 0% degree. For example, if I say "John may be in Boston", it's possible that he is in Boston. But it's just as possible that he is not in Boston. Thus, the undefined degree is compatible with the 0% degree. For deontic modalities, we are dealing with hypothetical situations. Thus, instead of a range of "probability", the undefined modal has a range of "obligation", including 0%. And this, of course, simply indicates that the target of the modality has an option.
Also, do not confuse the 0% and undefined deontic modalities. For example, 0% obligation indicates that there is no obligation, which may or may not imply an option. The undefined modal clearly indicates an option.
There are several other modalities. Here's another epistemic one:
    Epistemic evidentiality:
        100%:           It's obvious/clear/evident that John left.
        high:           John seems to have left. OR It looks like John left.
        average:        There's reason to believe that John left.
        low:            There's little reason to believe that John left.
        very low:       There's almost no/hardly any reason to believe that John left.
        0%:             There's no way that John left.
                        John couldn't possibly have left.
        undefined:      There may or may not be reason to believe that John left.
                        John could have left (but I'm not sure).
                        It's unclear/uncertain that John left.
        interrogative:  What reason is there to believe that John left?
Thus, evidentiality indicates what appears to be true - not what actually is true. In effect, it simply comments on how reliable the speaker feels the information is. A good English paraphrase for evidentiality is "As for event X, the evidence is 100%/high/low/etc". Thus, evidentiality does not state that something actually happened or did not happen. It simply states how 'evident' the event is.
Some languages provide even greater detail, such as whether the speaker saw the event with his own eyes or heard it with his own ears. However, these more specific modalities are relatively rare. Also, the means by which information is obtained is basically periphrastic - it is not inherently modal and, technically, should not be part of a system of modality.
Here's another example of an epistemic modality:
    Epistemic inevitability:
        100%:       He can't help being there by now. or He must be there by now.
                    (The implication is that the event is totally predictable.)
        high:       He's bound to be there by now.
                    (The implication is that the event is expected; i.e., very predictable.)
        low:        It wouldn't surprise me if he's there by now (but I don't really expect it).
        0%:         I don't EXPECT him to be there by now (but he may be for all I know).
        undefined:  He could be there by now, but who knows?
                    (The speaker is not sure how predictable the event is.)
Here are brief examples of a few other modalities:
    Epistemic acceptability:
        What he's doing is acceptable/okay. (average)
        At least he remembered to bring the hot dogs. (minimal)
    Deontic necessity:
        He needs to take care of them. (high)
    Epistemic significance:
        It's very significant that he left early. (high)
        Does it matter that John won? Yes, it matters. (interrogative)
    Deontic consequentiality:
        It's critical that he keep his commitment. (100%)
        He'd better keep his commitment. (high)
        It's important for him to keep his commitment. (average)
        [This modality implies that a situation will have negative consequences if the hypothetical event does not take place.]
Later, we'll discuss other potential modalities. We'll also discuss how to test new concepts to determine if they are inherently modal in nature.
So, how should we implement modality in a way that captures its inherent regularity, while avoiding the ubiquitous variability and idiosyncrasy of natural languages?
There are three characteristics of modality that we need to represent:
    1. The modal concept (e.g. probability, evidentiality, etc.)
    2. The degree of modality (e.g. 100%, high, negative, etc.)
    3. The type of modality (i.e. epistemic or deontic)
However, there is no need to explicitly mark whether the type of a modal is epistemic or deontic, because the type is an inherent part of the modal concept.
[Incidentally, if there were some way to derive one type from the other, then we would want to explicitly mark the type. For example, what is the deontic counterpart of epistemic acceptability? What is the epistemic counterpart of deontic obligation? Although I did once think that there was a correlation, I was never able to state the correlation with semantic precision, and so I abandoned the idea.]
So, in order to implement modality in the interlingua, we will allocate a set of root morphemes to represent the modal concepts. The polarity modifiers can then be used to indicate the degrees of modality. If a modal root is not modified by a polarity morpheme, it will represent the undefined degree.
Here are the details:
    Modality                        Morpheme
    --------------------------      ----------
    Probability (epistemic)         tam
    Evidentiality (epistemic)       jeg
    Inevitability (epistemic)       cav
    Acceptability (epistemic)       zim
    Significance (epistemic)        dup
    Obligation (deontic)            dov
    Necessity (deontic)             kes
    Consequentiality (deontic)      zul
And here again are the degree morphemes that we will use:
    ju-     0%
    bi-     100%
    ke-     High
    can-    Average
    fo-     Low
    zu-     Very low
    -en     Interrogative (suffix)
If a degree morpheme is not used, the degree of the modality will be "undefined".
Technically, use of "-en" with a modal does not indicate the speaker's attitude about the event. Instead, the speaker is asking the listener what the listener's attitude is, or what the speaker's attitude should be. In other words, the speaker is asking the listener to provide the correct degree of modality.
Finally, since modals are inherently deictic and take an entire clause as an argument, they will be deictic disjuncts when the part-of-speech marker "-u" is used, just as was the case for tense-aspect words.
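The composition rule just described (optional degree morpheme, modal root, then "-u") can be sketched in a few lines of Python. This is only an illustration, not part of the interlingua software: the function name "modal_disjunct" and the English keys are hypothetical, and the sketch assumes that the interrogative suffix "-en" is never combined with a degree prefix, which is how all the forms in this chapter behave.

```python
# Modal roots and degree prefixes copied from the two tables above.
MODAL_ROOTS = {
    "probability": "tam",
    "evidentiality": "jeg",
    "inevitability": "cav",
    "acceptability": "zim",
    "significance": "dup",
    "obligation": "dov",
    "necessity": "kes",
    "consequentiality": "zul",
}
DEGREE_PREFIXES = {
    "0%": "ju",
    "100%": "bi",
    "high": "ke",
    "average": "can",
    "low": "fo",
    "very low": "zu",
}


def modal_disjunct(concept, degree="undefined"):
    """Build the deictic disjunct for a modal concept and degree.

    Assumption (not stated in the text): "interrogative" is treated as
    mutually exclusive with the degree prefixes.
    """
    root = MODAL_ROOTS[concept]
    if degree == "undefined":
        stem = root                          # no degree morpheme at all
    elif degree == "interrogative":
        stem = root + "en"                   # "-en" is a suffix
    else:
        stem = DEGREE_PREFIXES[degree] + root
    return stem + "u"                        # "-u" marks the disjunct


print(modal_disjunct("probability", "100%"))           # bitamu
print(modal_disjunct("obligation", "interrogative"))   # dovenu
```

Calling the function with no degree gives the unmodified form, for example "tamu" for undefined probability.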
Now, in all natural languages that I am familiar with, the indicative is the default and is unmarked. Thus, it might seem that the 100% epistemic probability word "bitamu" is not really needed. However, a language must have a way of emphasizing the truth of a statement, and "bitamu" is the obvious and natural choice for this function. (Cf. "He went to the house" vs. "He DID go to the house", or "He definitely went to the house", or "He went to the house for sure".)
Here are some examples using English word order:
    Louise bitamu buy it = Louise DID buy it.
    Louise jutamu buy it = Louise didn't buy it.
    Louise ketamu buy it = Louise must have bought it OR Louise almost certainly bought it.
    Louise cantamu buy it = Louise probably bought it.
    Louise tamu buy it = Louise may have bought it.
    Louise zutamu buy it = Louise just might have bought it.

    She bidovu leave now = She has to leave now.
        (Literally, 'She is maximally obligated to leave now'.)
    She dovu leave now = She may/can leave now.
        (Literally, 'She has the option of leaving now'.)
    She judovu leave now = She does not have to leave now.
        (Literally, 'There is no obligation for her to leave now'.)
    She jutamu dovu leave now = She may not/can not leave now.
        (Literally, 'She does not have the option of leaving now'. Note the distinction between this and "judovu".)
    She kedovu leave now = She should leave now.

    He kekesu study harder = He needs to study harder.
    He kecavu cause trouble = He's bound to cause trouble.
    He kejegu leave = It looks like he left. OR He seems to have left.
    He kezulu leave now = He'd better leave now.
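Going in the other direction, a modal disjunct such as "kedovu" can be split back into its degree, concept, and type. The following Python sketch relies on two observations about the tables above that are assumptions of the sketch rather than stated rules: every modal root is exactly three letters long, and "-en" does not co-occur with a degree prefix. The function name "parse_modal" is hypothetical.

```python
# Root -> (concept, type) and prefix -> degree, from the tables above.
ROOTS = {
    "tam": ("probability", "epistemic"),
    "jeg": ("evidentiality", "epistemic"),
    "cav": ("inevitability", "epistemic"),
    "zim": ("acceptability", "epistemic"),
    "dup": ("significance", "epistemic"),
    "dov": ("obligation", "deontic"),
    "kes": ("necessity", "deontic"),
    "zul": ("consequentiality", "deontic"),
}
PREFIXES = {
    "": "undefined", "ju": "0%", "bi": "100%", "ke": "high",
    "can": "average", "fo": "low", "zu": "very low",
}


def parse_modal(word):
    """Return (degree, concept, type) for a modal disjunct like "kedovu"."""
    word = word.lower()
    if not word.endswith("u"):
        raise ValueError(f"not a disjunct (no final -u): {word!r}")
    stem = word[:-1]
    # Strip the interrogative suffix "-en" if a known root precedes it.
    interrogative = stem.endswith("en") and stem[:-2][-3:] in ROOTS
    if interrogative:
        stem = stem[:-2]
    prefix, root = stem[:-3], stem[-3:]      # roots are three letters long
    if root not in ROOTS or prefix not in PREFIXES:
        raise ValueError(f"not a modal disjunct: {word!r}")
    if interrogative and prefix:
        raise ValueError(f"degree prefix combined with -en: {word!r}")
    concept, kind = ROOTS[root]
    degree = "interrogative" if interrogative else PREFIXES[prefix]
    return degree, concept, kind


print(parse_modal("kedovu"))   # ('high', 'obligation', 'deontic')
print(parse_modal("Tamenu"))   # ('interrogative', 'probability', 'epistemic')
```

Because the root is always taken from the last three letters of the stem, the prefix "can-" is never confused with the root "cav".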
In the chapter on Tense and Aspect, we defined a default tense that would apply to verbs when it was not specified. It is also possible for a modal to have tense. It is even possible for the modal and the verb to have different tenses. Here are some examples:
    John seems to have been angry.
        (modal = present, verb = past)
    It will seem that John was angry.
        (modal = future, verb = past)
    It seemed that John will be angry (but it doesn't seem that way any more).
        (modal = past, verb = future)
Some English modals cannot carry tense at all, and if tense is needed, then it must be done using a paraphrase of the modality. Here are some examples:
    (1) John may be in Boston.
    (2) It is possible that John is in Boston.
    (3) It will be possible that John will be in Boston.
    (4) It was possible that John will be in Boston (but it's no longer possible).
Note that (1) can be interpreted as either present or future. To make it explicit, paraphrases (2) or (3) can be used. Example (4) shows that the modality can have past tense while the verb has future tense.
So, since it's possible for the modality to have a tense that is different from that of the verb, the modality must also have a default tense and aspect. In the interlingua, all epistemic modal deictic disjuncts will be present imperfect by default. In other words, the possibility exists, by default, at the time of speech.
For example, the default tense for the verb "to leave" is past. Thus, "John leave" is 'John left'. However, the sentence "John tamu leave" means 'John may have left' (literally: 'it is possible that John left'). In other words, the default tense-aspect of the modal is not inherited by the verb - the tense-aspect of the modal and the verb are determined independently.
All deontic modal disjuncts will have the same tense as the verb that follows. In other words, the obligation exists at the same time as the event.
For example, "John bidovu leave" means 'John had to leave', since "leave" is past-perfect by default. Similarly "John dovenu leave" will mean "Should John leave?". To indicate that the obligation existed in the present or future, an explicit tense marker can be used. Thus, "John dasu bidovu leave" means 'John has to have left'.
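The two default rules can be stated compactly in code. In this Python sketch the tense labels are plain strings chosen for illustration, and the function name "modal_default_tense" is hypothetical. It covers only the defaults and ignores explicit tense markers.

```python
def modal_default_tense(modal_type, verb_tense):
    """Default tense-aspect of a modal disjunct with no explicit tense marker.

    modal_type: "epistemic" or "deontic"
    verb_tense: tense-aspect of the verb that follows (an opaque label here)
    """
    if modal_type == "epistemic":
        # The judgment holds at the time of speech, whatever the verb's tense.
        return "present imperfect"
    if modal_type == "deontic":
        # The obligation, necessity, etc. exists at the same time as the event.
        return verb_tense
    raise ValueError(f"unknown modal type: {modal_type!r}")


# "John tamu leave": the possibility is present, the leaving is past.
print(modal_default_tense("epistemic", "past"))   # present imperfect
# "John bidovu leave": the obligation shares the tense of "leave".
print(modal_default_tense("deontic", "past"))     # past
```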
Now, consider the following:
Should John leave now? Does John have to leave now?
We can use "dovenu" for the first example, because it is effectively asking 'how obligatory is it for John to leave?', which is a good paraphrase as long as "should" is not stressed. However, in the second example, or if "should" is stressed in the first example, we are really asking if it is true that a particular degree of obligation applies. A possible English scenario could be this:
A: Does John HAVE to leave? B: No. He doesn't HAVE to leave but he SHOULD leave.
In other words, what we need is a general purpose interrogative marker. This, of course, is simply "tamenu", since we are asking for the degree of probability that the statement "John has to leave" is true. In effect, we are asking "What is the probability that John has to leave?", and we are expecting an answer such as "yes", "no", "maybe", or whatever degree of probability is appropriate.
With this word, we can now deal with the above examples:
    Should John leave now? = Dovenu John leave now?
    SHOULD John leave now? = Tamenu kedovu John leave now?
    Does John have to leave now? = Tamenu bidovu John leave now?
Thus, in effect, "tamenu" can also be paraphrased as "Is it true that ... ?".
When "tamenu" introduces an embedded clause, it will indicate that the speaker wants to know if the clause is true, and will be equivalent to English non-conditional "if/whether", as in the following examples:
    I asked Bill tamenu you have enough money. = I asked Bill if you have enough money.
    I want to know tamenu Bill needs my help. = I want to know if Bill needs my help.
Note that this is not the same as quoting the embedded clause. For example, there is a difference between "I asked Bill if you have enough money" and "I asked Bill 'Do you have enough money?'". We'll discuss quoting later.
Also, do not confuse this usage of "if" with the conditional conjunction "if" (which we will discuss later). Even though English uses the same word for both purposes, their semantics are quite different.
A clause can contain no more than one deontic modal disjunct, and it must follow any epistemic modals. For example:
Tamenu jutamu bidovu John leave = Didn't John have to leave?
An epistemic modal may never follow a deontic modal because, if it were done, the epistemic modal would not actually be a true modality. Consider this:
It has to be possible to open the window.
In the above sentence, the verbal phrase "be possible" is not a modal disjunct - it's actually a state verb. However, as we'll see below, this state verb will be derived from the epistemic modal root "tam".
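The ordering constraint on modal disjuncts within a clause (at most one deontic modal, and no epistemic modal after it) is easy to check mechanically. The Python sketch below is an illustration under stated assumptions: the helper names are hypothetical, the input is just the sequence of modal disjuncts in the order they appear, and the root is found by stripping "-u" and an optional "-en" and taking the last three letters.

```python
EPISTEMIC_ROOTS = {"tam", "jeg", "cav", "zim", "dup"}
DEONTIC_ROOTS = {"dov", "kes", "zul"}


def modal_type(word):
    """Classify a modal disjunct as "epistemic" or "deontic" by its root."""
    stem = word.lower()
    if not stem.endswith("u"):
        raise ValueError(f"not a disjunct (no final -u): {word!r}")
    stem = stem[:-1]
    if stem.endswith("en"):          # interrogative suffix
        stem = stem[:-2]
    root = stem[-3:]                 # assumption: roots are three letters
    if root in EPISTEMIC_ROOTS:
        return "epistemic"
    if root in DEONTIC_ROOTS:
        return "deontic"
    raise ValueError(f"not a modal disjunct: {word!r}")


def valid_modal_sequence(words):
    """True if the clause has at most one deontic modal and it comes last."""
    seen_deontic = False
    for word in words:
        if modal_type(word) == "deontic":
            if seen_deontic:
                return False         # second deontic modal in the clause
            seen_deontic = True
        elif seen_deontic:
            return False             # epistemic modal after a deontic one
    return True


print(valid_modal_sequence(["Tamenu", "jutamu", "bidovu"]))   # True
print(valid_modal_sequence(["bidovu", "tamu"]))               # False
```

A sentence like "It has to be possible to open the window" would not be rejected by such a check in practice, because "be possible" there is a state verb derived from "tam" rather than a disjunct ending in "-u".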
As was the case with tense-aspect roots, modal roots can be used to derive many additional words. Before we can start, though, we need to define the equivalent 'state' of a modal. In other words, what is the basic or raw state that is associated with a modal?
As I mentioned above, all of the modal derivations are similar to speech acts, since the speaker tries to induce a change of state in the listener using speech. However, unlike a true speech act, a modal always describes the speaker's impersonal judgment of a situation that may be completely unrelated to the speech act itself. In other words, a modal is a combination of a speech act, an additional situation, plus the speaker's judgment of the additional situation. Thus, a modal concept is much more complex than most basic states, and this overly complex concept will not be useful if it undergoes further derivation.
Fortunately, the most useful component of a modality is how the speaker judges the situation. If we can isolate this attitude, it will provide us with a simpler concept that we can then use very productively in further derivations.
Thus, we need a strategy that will eliminate the speaker's contribution such that only the basic modality remains. To that end, we will paraphrase the modal in such a way that it eliminates the 'speech act' component and isolates the modal concept. We will do this by using each modal in a test sentence and paraphrasing it in the form "it is X that Y", where X is the modal concept, and Y is the subject matter. Here are several examples:
    Epistemic probability:
        100%:       He took care of the children.
                    It is true/definite that he took care of them.
        high:       He must have taken care of them.
                    It is highly probable/likely that he took care of them.
        average:    He probably took care of them.
                    It is probable/likely that he took care of them.
        low:        He just may have taken care of them.
                    It is unlikely/improbable that he took care of them.
        very low:   He almost certainly did not take care of them.
                    It's implausible/hardly possible that he took care of them.
        0%:         He did not take care of them.
                    It is false/impossible that he took care of them.
        undefined:  He may have taken care of them.
                    It is possible that he took care of them.

    Deontic obligation:
        100%:       He must take care of them.
                    It is mandatory/obligatory that he take care of them.
        high:       He should take care of them.
                    It is advisable that he take care of them.
        undefined:  He can/may take care of them.
                    It is optional that he take care of them.

    Epistemic evidentiality:
        100%:       It's obvious/evident that he took care of them.
        high:       It seems/appears that he took care of them.
                    It is apparent/almost obvious that he took care of them.

    Epistemic inevitability:
        100%:       He can't help taking care of them.
                    It is inevitable/fated/preordained that he took care of them.
        high:       He's bound to take care of them.
                    It is almost inevitable that he took care of them.

    Epistemic acceptability:
        100%:       It's definitely acceptable/okay that he took care of them.
        high:       It's very acceptable/welcome/gratifying that he took care of them.
        average:    It's acceptable/okay that he took care of them.
        very low:   It's tolerable/barely acceptable that he took care of them.
                    At least he took care of them.
        undefined:  It's possibly acceptable/okay that he took care of them.
        0%:         It's unacceptable/not okay that he took care of them.

    Deontic necessity:
        100%:       It's essential that he take care of them.
        high:       He needs to take care of them.
                    It is necessary that he take care of them.
        0%:         It's unnecessary that he take care of them.
    Epistemic significance:
        100%:       It is crucial/extremely significant that he left early.
        high:       It matters that he left early.
                    It is significant that he left early.

    Deontic consequentiality:
        100%:       It is crucial/pivotal/critical/extremely important that he leave early.
        high:       He'd better leave early.
                    It is very important that he leave early.
        average:    It is important that he leave early.
Note that none of the above are true states. If they were, they would describe the states of entities. Instead, they describe the impersonal judgment of the speaker about a situation. Thus, the situation is actually the focus of the speaker's mental state. For example, if a situation is 'obvious', then the speaker feels that it is obvious. If a situation is 'acceptable' then the speaker feels that it is acceptable. And so on. In other words, the true states can be best captured in the form of P/F-s verbs, since they indicate a relationship between a patient/entity and a focus/situation. The raw concepts themselves (i.e. 'true', 'obvious', 'acceptable', etc.) can be represented by the F-s [-P] middle voice forms.
Note also that all of the epistemic paraphrases use the past tense, while all of the deontic paraphrases use an implicit future tense. I discovered that using this convention is less likely to result in confusion (at least when the paraphrases are in English). It is also consistent with the inherent natures of epistemic and deontic concepts; i.e., epistemic judgments are concerned with actual events, while deontic judgments are concerned with hypothetical (i.e., future) events.
Now, in order to convert the modal concept to a state, we must add a patient. The best way to do this is to paraphrase the P/F-s state verb using an expression such as "Patient feels/thinks that focus". Here are some epistemic examples:
    probability:
        I feel that situation F is true = I believe F
        I feel that situation F is unlikely/improbable = I doubt F
    evidentiality:
        I feel that situation F is obvious = I am confident/sure/certain that F
        [The above implies that the speaker has good reason to believe F; i.e., that the certainty is based on evidence or reason.]
    inevitability:
        I feel that situation F is inevitable = I take F for granted
        I feel that situation F is almost inevitable = I expect F
    acceptability:
        I feel that situation F is acceptable = I am at ease/comfortable/content with/that F
    significance:
        I feel that situation F is significant/matters = I am convinced of the significance of/that F
Note that each modal concept has now become an actual mental state of a patient.
Now, let's see if we can do the same thing with a few deontic modalities:
    obligation:        I feel that event F is mandatory = I ???
    necessity:         I feel that event F is necessary = I ???
    consequentiality:  I feel that event F is important = I ???
What's wrong? It seems that deontic modalities do not really describe the state of the speaker. Instead, the speaker is actually describing the state of someone or something else. Thus, it's necessary to paraphrase deontic derivations in terms of the other entity, as follows:
    obligation:
        Something is mandatory for the patient = The patient is obligated to...
    necessity:
        Something is necessary for the patient = The patient needs/has a need to/for...
    consequentiality:
        Something is important for the patient = The patient is liable/answerable/accountable/responsible for...
        [I am using the word "important" here only in the sense that there may be negative consequences for the patient if the indicated event does not take place.]
Thus, for epistemic modalities, we must paraphrase the state as the equivalent state of the speaker. For deontic modalities, we must paraphrase the state as the equivalent state of the entity that the speaker is talking about.
Also, the epistemic states are true mental states of the patient, while the deontic states are still abstract. This should not be surprising since epistemic modalities apply to real situations, while deontic modalities apply to hypothetical ones.
Finally, since the epistemic mental states are inherently under the control of the speaker ('believe', 'doubt', 'be confident', 'expect', etc), they will be AP/F-s by default when their part-of-speech is changed from disjunct to something else. However, since deontic states reflect an outsider's view of the modal state of a patient ('need', 'be obligated to', 'be liable for', etc), they will be P/F-s by default.
Now, with all of the above in mind, let's create several useful words from modal roots. Here are some of the many possible derivations from the epistemic probability modality:
    100% epistemic probability, "bitamu":

      AP/F-s   bitama = to believe, to take as true that, to be convinced that
               bitamemi = accepted fact, the truth, that which is believed true ("-em" = middle suffix)
               bitamemo = true/veritable
               bitamoge = really, truly, definitely, absolutely, positively, indeed ("-og" = "0" adverb)
                 ["Bitamoge" is more general than "bitamu", since it can imply that others in addition to the speaker are certain of the truth of the statement. Also note that, since it is an adverb, its syntax is different as well.]
               bitamesi = belief, tenet, article of faith, something taken to be true ("-es" = passive suffix)
               bitamoni = faith, conviction ("-on" = quality suffix)
               bitamemoni = truth, veracity, truthfulness
               Bitamemu = Yes, That's true/correct/right, etc.
                 [Literally, 'the speaker agrees that something just said is true'. Note that "Bitamemu" is still a deictic disjunct, and that the middle voice change indicates that its single sentential argument is assumed to be the declarative version of what was just said.]

               English speakers should be careful not to extend the meaning of these derivations to people. For example, in the sentence "You are correct", the speaker really means 'What you are saying is correct'. Thus, it's acceptable to say "THE ANSWER is correct", but not "YOU are correct". In other words, 'truth' or 'correctness' applies to a situation - not to a person.

      AP/F-d   bitamimba = to decide, to make up one's mind, to convince oneself, resolve, conclude
      A/P/F-d  bitamamba = to persuade, to convince, to win over
      P/F-s    bitamunza = to realize, to be aware that, to understand, it is P's understanding that F (e.g. "It's my understanding that Joe won't be here tomorrow".)
      P/F-s    bitamunzo or bitamunzyu = credulous (about)

    High epistemic probability, "ketamu":

      AP/F-s   ketama = to think/feel/reckon, to take as most likely/probable, to be of the opinion
               ketamemo = likely, almost certain, highly probable
               Ketamemu = Almost certainly, In all likelihood, That's almost certainly right, etc. (in answer to a question)
      P/F-s    ketamunza = to presume, to accept as most likely or almost certain

    Average epistemic probability, "cantamu":

      AP/F-s   cantama = to feel that something is probable, to feel that something has a moderate chance of being true.
               cantamemo = probable, moderately likely
               Cantamemu = Probably, That's moderately probable, etc. (in answer to a question)
      AP/F-d   cantamimba = to surmise/conjecture/speculate, to decide that F is probable
                 [English "surmise", "conjecture", and "speculate" sometimes imply that a conclusion is reached with little evidence. "Cantamimba" does not have this implication.]
      P/F-s    cantamunza = to suppose/gather/daresay, to accept as probable

    Low epistemic probability, "fotamu":

      AP/F-s   fotama = to doubt, to be doubtful about, to have doubts about, to question, to be skeptical about, to consider unlikely/improbable, to take as unlikely/improbable
               fotamo = skeptical/doubting/doubtful (i.e. a person)
               fotamemo = dubious, unlikely, improbable, hard to believe (i.e. an event or situation)
               fotamemi = a dubious/unlikely/improbable event or situation
               fotamoni = doubt, skepticism
               Fotamemu = Probably not, Not likely (e.g. in answer to a question)

    0% epistemic probability, "jutamu":

      AP/F-s   jutama = to disbelieve, to take as false
               jutamo = unbelieving, disbelieving, incredulous
               jutami = unbeliever, non-believer
               jutamemo = false/untrue/incorrect/wrong/impossible
               Jutamemu = No, That's wrong/incorrect, It's not true (e.g. in answer to a question)

    Undefined epistemic probability, unmodified "tamu":

      AP/F-s   tama = to accept as possible, to admit the possibility of
               tamemo = possible
               tamoge = possibly (adverb)
               tamemoni = possibility, likelihood, probability, potential
               Tamemu = Maybe/Perhaps (e.g. in answer to a question)
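Many of the forms listed above are built by simple concatenation: degree prefix, modal root, zero or more derivational suffixes, and a final part-of-speech vowel. The Python sketch below reproduces a few of them. The suffix labels "-em", "-es", "-on", and "-og" come from the list itself, and "-u" is the disjunct marker. The other endings ("-a" verb, "-i" noun, "-o" adjective, "-e" adverb) are inferred here from the glosses of the listed words and should be treated as an assumption of this sketch, as should the function name "derive". Forms using "-imb", "-amb", or "-unz" are not covered.

```python
# Derivational suffixes named in the list above.
DERIVATIONAL_SUFFIXES = {
    "middle": "em",
    "passive": "es",
    "quality": "on",
    "adverb": "og",
}
# Final part-of-speech vowels. Only "-u" (disjunct) is stated explicitly in
# this chapter; the rest are inferred from the glosses (assumption).
POS_ENDINGS = {
    "verb": "a",
    "noun": "i",
    "adjective": "o",
    "adverb": "e",
    "disjunct": "u",
}


def derive(degree_prefix, root, suffixes, pos):
    """Concatenate prefix + root + derivational suffixes + POS ending."""
    middle = "".join(DERIVATIONAL_SUFFIXES[name] for name in suffixes)
    return degree_prefix + root + middle + POS_ENDINGS[pos]


print(derive("bi", "tam", [], "verb"))                      # bitama
print(derive("bi", "tam", ["middle"], "adjective"))         # bitamemo
print(derive("bi", "tam", ["middle", "quality"], "noun"))   # bitamemoni
print(derive("", "tam", ["adverb"], "adverb"))              # tamoge
```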
Someone once said that all truth is relative. The above derivations certainly seem to reflect this attitude, since they imply that the truth of a situation is more perceived than real; i.e. it is true only if it is true to a patient. However, keep in mind that when the patient is demoted by means of a middle voice operation, the result implies nothing about the nature of the unmentioned perceiver. It could just as well be the universe, your cat, or a supreme being. In spite of this, it is important to remember that 'truth' as derived above does not mean 'absolute truth' or 'reality'. Thus, we cannot use the modal to derive concepts such as 'to exist = to be real' or 'to create = to make real'. The modals do not imply reality - only the perception of reality. Another difference is that the 'truth' described here is inherently scalar, while the concept we derived earlier meaning 'real/actual' (state root = "kav") is inherently binary.
Although I listed a large number of useful derivations for the epistemic probability modality, there are many more. The modal concepts are so basic that it shouldn't be surprising that they can be the source of so many useful words. However, for the sake of brevity, I will only list a few derivations for the remaining modalities:
    100% deontic obligation (= obligation), "bidovu":

      P/F-s    bidova = it is mandatory/compulsory/obligatory for P to F, to be obligated to
               bidovanga = to be mandatory/compulsory/obligatory for ("-ang" = inverse voice suffix)
               bidovemo = mandatory, compulsory, obligatory
               bidovemi = duty, obligation
      AP/F-s   bidovinza = to feel obligated to...
      AP/F-d   bidovimba = to take on the obligation to...
      A/P/F-s  bidovanza = to require, to oblige

    High deontic obligation, "kedovu":

      P/F-s    kedova = it is advisable/desirable for P to F, it behooves P to F, it's a good idea for P to F, P is highly obligated to F
               kedovemo = advisable, desirable, called for

    Average deontic obligation, "candovu":

      P/F-s    candova = to be supposed to (e.g. John is supposed to bring the coffee.), P is moderately obligated to F

    Undefined deontic obligation, unmodified "dovu":

      P/F-s    dova = it is optional for P to F, P has the freedom/choice/option to
               dovo = having a choice, free to choose
               dovanga = F is optional for P
               dovemo = optional
               dovemi = option
                 [The inverse form "dovanga" would be used for English sentences such as "Picking up the litter is optional for the guests".]
      A/P/F-s  dovanza = to let, to allow, to permit (literally: to cause someone to have the option to...)
100% epistemic acceptability, "bizimu":
    AP/F-s bizima = to feel that F is maximally acceptable, to feel that F is perfect or ideal, to totally approve of

Average epistemic acceptability, "canzimu":
    AP/F-s canzima = to feel that F is acceptable, to accept, to countenance, to sanction
    canzimo = comfortable with, satisfied, accepting
    canzimemo = acceptable, okay, admissible

Low epistemic acceptability, "fozimu":
    AP/F-s fozima = to feel that F is not very acceptable, to disfavor, to frown upon, to deprecate

0% epistemic acceptability, "juzimu":
    AP/F-s juzima = to feel that F is unacceptable/inadmissible/not okay, to disapprove of, to oppose
    juzimo = disapproving, opposed, hostile
    juzimemo = unacceptable, inadmissible, intolerable
    juzimenvi = disapproval, opposition ("-env" = event noun suffix)

100% deontic necessity, "bikesu":
    P/F-s bikeso = destitute, in extreme need
    bikesemo = essential, indispensable, vital, exigent

High deontic necessity, "kekesu":
    P/F-s kekesa = to need, to require, to have a need for (note that this is a VERB, not a disjunct!)
    kekeso = needy, in need
    kekesemo = necessary, needed, requisite, required
    kekesemi = need, requirement (i.e. what is needed)
    kekesoni = necessity, need, requirement (the need itself)
    [Keep in mind that some of the English "-on" equivalents are ambiguous. For example, "kekesoni" literally means 'high degree of necessity' while "kesoni" means 'degree of necessity' with no indication of the actual degree (it could even be zero). Thus, it doesn't make sense to ask "What is the kekesoni?", but it does make sense to ask "What is the kesoni?". On the other hand, it doesn't make sense to say "The kesoni for more money is obvious". Instead, "kekesoni" must be used.]

0% deontic necessity, "jukesu":
    F-s [-P] jukesemo = unnecessary, unessential, inessential
    A/F-d [-P] jukesamboma = to make unnecessary, to obviate
The following derivations of epistemic evidentiality are very similar to the derivations for epistemic probability, and the English glosses are often the same. However, it's important to keep in mind that evidentiality implies the presence or absence of evidence, whereas probability has no such implication. For example, both the derivations "bitamamba" and "bijegamba" can be glossed as "to convince", but "bijegamba" implies that evidence (reason, logic, data, etc.) was used to convince the patient, while "bitamamba" says nothing at all about how the patient became convinced. Thus, the probability derivations are more general than the evidentiality derivations and can be used in all situations.
100% epistemic evidentiality, "bijegu":
    AP/F-s bijega = to be confident/certain/sure (that), to be convinced that, to take for granted that, to know for a fact (that), to know as a matter of fact that
    bijego = certain, sure, confident
    bijegema = to be evident, obvious, etc that ...
    bijegemo = evident, obvious, manifest, patent, clear, overt, certain, sure
    bijegoni = certainty, confidence, certitude
    P/F-s bijegunza = it is evident/obvious to P that F
    F/P-s bijegunzanga = to be obvious/clear to (E.g. "His anger was obvious to everyone".)
    AP/F-d bijegimba = to conclude (that), to come to the conclusion (that), to come to be certain/confident/sure (that)
    bijegimbemi = conclusion
    bijegimbemo = conclusive
    A/P/F-d bijegamba = to convince, to show, to persuade

High epistemic evidentiality, "kejegu":
    AP/F-s kejega = to suppose, to gather, to surmise, to guess, to figure, to reckon, it is apparent/almost obvious to AP that F
    kejegemo = seeming, apparent, ostensible
    kejegema = "It is apparent that..."
    kejegeme = supposedly, apparently

Average epistemic evidentiality, "canjegu":
    AP/F-s canjega = to suspect, to feel that there is reasonable evidence for, imagine

Low epistemic evidentiality, "fojegu":
    AP/F-s fojega = to doubt, to question, to be skeptical about, to consider unlikely or improbable (due to the paucity of evidence or reason)
    fojego = skeptical
    fojegemo = dubious, problematic

Undefined epistemic evidentiality, unmodified "jegu":
    AP/F-s jega = to accept as possible because there may be evidence in support
    jegemo = plausible, conceivable, reasonable, justifiable

100% epistemic inevitability, "bicavu":
    AP/F-s bicava = to take for granted, to assume/presume, to feel that F is inevitable
    bicavema = to be inevitable
    bicavemo = inevitable, unavoidable, pre-ordained, ineluctable
    bicavemi = destiny, fate, fortune, lot
    bicavemoni = inevitability

High epistemic inevitability, "kecavu":
    AP/F-s kecava = to expect, to anticipate, to feel that F is almost inevitable

100% deontic consequentiality, "bizulu":
    P/F-s bizula = it is imperative for P to F
    bizulemo = imperative, critical, crucial, vital

High deontic consequentiality, "kezulu":
    P/F-s kezula = it is urgent for P to F, P is responsible/accountable/liable/answerable for F
    kezulo = responsible, accountable, liable, answerable
    kezulemo = urgent, exigent, compelling, pressing, important, highly consequential
Finally, when an unmodified modal is used (i.e., a deictic disjunct), it is very similar to the non-deictic F-s [-AP] middle derivation. For example, "John bitamu leave" meaning 'John DID leave' is similar to "Bitamema John leave" meaning 'It is accepted as true by some unspecified agent-patient that John left'. However, there is an important difference between the two. When the deictic disjunct is used, the agent-patient must be the speaker. When the middle verb form is used, the patient is unspecified and does not necessarily have to be the speaker. In other words, the argument structure of the deictic modality is actually F-s, since the agent-patient is deictic.
Since modalities represent a speaker's judgment about a situation, and since it's possible for people to pass judgment in many different ways, the obvious question is whether we can implement other concepts as modalities, rather than as basic states.
I am convinced that the answer is a resounding "YES", even though I've only discussed those modalities that I've read about or which seemed obvious to me. There is no doubt in my mind that other modalities exist, and that these modalities are likely to have formal representation (as inflections, auxiliaries, particles, etc.) within some natural languages.
When trying to decide whether a concept is inherently modal in nature, we must keep in mind that a modality represents the speaker's impersonal judgment of a situation. The concept must not represent the speaker's feelings towards the listener or a third party, nor can it represent an attitude that is caused by a situation, the listener, or a third party; i.e. the speaker must be the source of the judgment. Also, the concept must not represent the state or behavior of an actual entity or process - it must represent a judgment of a situation. Upon further derivation, of course, the concept may represent the mental state of an actual entity (e.g. "to believe" from the modal concept 'true').
Modal concepts are inherently abstract. Normal states are not. Consider the following:
yellow vs. optional
open vs. true
heavy vs. necessary
good vs. important
In effect, a modality is not an inherent quality of a situation. Instead, it is externally imposed.
Fortunately, it is definitely possible to test a concept to determine if it is modal in nature. In English, we can test a concept for modality by substituting it for E or D in one of the following two sentences:
(1) It's E that he left early.
(2) It's D that he leave early.
If (1) makes sense and is grammatically correct, then E may be an epistemic modality. If only (2) makes sense and is grammatically correct, then D may be a deontic modality.
Do not use descriptions of mental states or measurable, objective attributes that describe the actual nature of events. For example "sad", "shocking", "odd", "legal", and "ironic" all pass the above test even though "sad" and "shocking" describe mental states (classifier = "-kop" or "-dum"), while "ironic", "legal", and "odd" describe objectively determinable attributes of events (classifier = "-xas" or "-bes").
Also, do not use passive forms of verbs for any of the above tests. For example, "known" and "said" pass the above test even though they are obviously not modalities.
Finally, we must also make sure that the concepts are potentially impersonal. They must not inherently represent the bias of the speaker! To test this, we can perform the following test:
(3) I approve/disapprove of what occurred.        <- epistemic
    I hope that the event will/will not occur.    <- deontic
If either of the above statements is automatically implied when the concept is used, then it is not a modal concept. In other words, a true modal statement must not indicate how the speaker feels about an event.
If this is not clear, consider the following two sentences:
(A) It's good that Mike won the game.
(B) It's a good thing that Mike won the game.
In (A), the speaker is clearly glad about what happened. In (B), however, the approval, if any, is clearly secondary to the implied importance of the event. In fact, in (B), it's quite possible that the speaker does not approve of the event at all, but is merely commenting on its importance. Thus, (A) represents a normal state while (B) represents a modal one.
There may be cases when the dividing line between personal and impersonal is not clear. When this occurs, we will treat the concept as an attribute (classifier = "-xas" or "-bes") rather than as a modal. An example of this is the concept 'fair/just' or 'opportune'.
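The tests above can be summarized as a small decision procedure. The following Python sketch is only an illustration: the class name, field names, and result labels are hypothetical, and every field is a judgment that a human must supply after trying the test sentences. Nothing here is computed from the concept itself.

```python
from dataclasses import dataclass


@dataclass
class Judgments:
    """Human-supplied answers for one candidate concept (all names hypothetical)."""
    fits_sentence_1: bool               # "It's E that he left early." is acceptable
    fits_sentence_2: bool               # "It's D that he leave early." is acceptable
    is_mental_state: bool = False       # e.g. "sad", "shocking"
    is_objective_attribute: bool = False  # e.g. "legal", "ironic", "odd"
    is_passive_verb: bool = False       # e.g. "known", "said"
    implies_speaker_bias: bool = False  # statement (3) is automatically implied
    bias_unclear: bool = False          # personal/impersonal line is not clear


def classify(j: Judgments) -> str:
    # Exclusions come first: these concepts may pass (1) or (2) but are not modal.
    if j.is_mental_state or j.is_objective_attribute or j.is_passive_verb:
        return "not modal"
    if j.implies_speaker_bias:
        return "not modal"
    # Unclear cases are treated as attributes rather than modals.
    if j.bias_unclear:
        return "attribute"
    if j.fits_sentence_1:
        return "possibly epistemic"
    if j.fits_sentence_2:  # only (2) fits at this point
        return "possibly deontic"
    return "not modal"


# "sad" fits sentence (1) but describes a mental state.
print(classify(Judgments(True, False, is_mental_state=True)))
# 'fair/just' is a borderline case and is treated as an attribute.
print(classify(Judgments(True, False, bias_unclear=True)))
```

The ordering of the checks mirrors the text: the exclusions and the bias test can veto a concept even when it fits one of the two test sentences.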
Now, here's a list of other concepts that I believe are inherently modal in nature. Each of them passes the above tests:
I say/believe that something is: reasonable/sensible others???
In any case, it seems to me that the above concepts (and certainly others I've missed) are indeed modal in nature and should be treated as such.
There are times when we need to modify the modality of an utterance, implying that the situation is true in spite of reasons to believe otherwise. Linguists refer to this process as "hedging". Here are some English examples:
a. STRICTLY SPEAKING, his answer was correct.
b. LOOSELY SPEAKING, a dolphin is a fish.
c. TECHNICALLY, a penguin is a bird.
d. Bill joined the SO-CALLED Society for Universal Tolerance.
In each case, the capitalized expression either affirms or denies the truth of a sentence or the accuracy of a label while implying that there is reason to think otherwise. Thus, example (a) can be paraphrased as "His answer was correct even though there are reasons to feel that it was really incorrect". Example (b) indicates that a dolphin is not a fish, even though there are reasons to think otherwise. Example (c) is similar to (a) but implies that there is actual data or proof to support the claim. And example (d) implies that there may be reasons to believe that the name of the society is invalid.
In the interlingua, we will implement hedging with the epistemic modal root "fug". Here are English paraphrases of the available forms:
fugu      It is true that ... even though there MAY be reason to think otherwise.
bifugu    It is true that ... even though there is every reason to think otherwise.
kefugu    It is true that ... even though there is good reason to think otherwise.
canfugu   It is true that ... even though there is reason to think otherwise.
fofugu    It is true that ... even though there is a small amount of evidence to think otherwise.
zufugu    It is true that ... even though there is a tiny amount of evidence to think otherwise.
jufugu    It is NOT true that ... even though there may be reason to think otherwise.
fugenu    It is true that ..., but what reason is there to think so?
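Because the hedging forms are built regularly from a degree prefix plus "fugu", the English paraphrases can be produced by a simple lookup. The Python sketch below is a minimal illustration that covers only the eight forms listed above; the function and table names are hypothetical and are not part of the interlingua software.

```python
# Degree prefixes and the English wording used in the paraphrases above.
DEGREE_WORDING = {
    "bi": "every reason",
    "ke": "good reason",
    "can": "reason",
    "fo": "a small amount of evidence",
    "zu": "a tiny amount of evidence",
}


def hedge_paraphrase(form: str) -> str:
    """Return the English paraphrase of one of the listed hedging disjuncts."""
    if form == "fugenu":  # the question form is handled separately
        return "It is true that ..., but what reason is there to think so?"
    if not form.endswith("fugu"):
        raise ValueError(f"not a hedging disjunct: {form!r}")
    prefix = form[: -len("fugu")]
    if prefix == "":  # unmodified modal
        return "It is true that ... even though there MAY be reason to think otherwise."
    if prefix == "ju":  # 0% degree negates the embedded clause
        return "It is NOT true that ... even though there may be reason to think otherwise."
    if prefix in DEGREE_WORDING:
        return (
            "It is true that ... even though there is "
            f"{DEGREE_WORDING[prefix]} to think otherwise."
        )
    raise ValueError(f"unknown degree prefix: {prefix!r}")


print(hedge_paraphrase("kefugu"))
```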
Here are some examples:
Kefugu a dolphin is a mammal.
= Even though there is much reason to think otherwise, a dolphin is a mammal.
= Strictly speaking, a dolphin is a mammal.

Jufugu a dolphin is a fish.
= A dolphin is not a fish, even though there may be reason to think otherwise.
= Loosely speaking, a dolphin is a fish.
OR In a sense, a dolphin is a fish.
OR In a manner of speaking, a dolphin is a fish.
OR In some/certain respects, a dolphin is a fish.
OR Actually, a dolphin is NOT a fish.

Bill joined the jufugemo "Society for Universal Tolerance".
= Bill joined the so-called/self-styled/soi-disant "Society for Universal Tolerance".
[Here we are saying that the society has nothing to do with universal tolerance, even though the name may imply it.]
At first glance, hedging appears to be the direct opposite of evidentiality, since hedging is used to indicate that there is evidence against something. However, there is an important difference: evidentiality does not imply the truth of a situation - it simply indicates the degree of evidence in its favor. Hedging, however, does imply the truth of a situation, while at the same time indicating the degree of evidence against it. If we need to indicate the true opposite of evidentiality, we simply negate the embedded clause. Consider the following:
High evidentiality: It seems that he left early.
High hedging: He did leave early even though it seems otherwise.
Opposite of high evidentiality: It seems that he did not leave early.
Thus, hedging and evidentiality are distinct modalities.
When more than one disjunct is used in a clause, their order will depend on their scope; i.e., the amount of information that is completely encompassed by the meaning of the disjunct. This order is quite strict and cannot be violated without generating gibberish.
The syntax of the interlingua is purely right-branching and the examples below will reflect this. (For a left-branching language, the order would have to be reversed.)
Modal disjuncts indicate the speaker's attitude about the entire embedded event. However, the argument of the modality can occur at different times, and the modality itself can also occur at different times. This means that modal and tense-aspect disjuncts can appear in any order, as long as they make sense. Here are some examples:
Bidovu cicipu you study lesson.
["Bidovu" = maximum obligation, "cicipu" = past-inceptive.]
You had to start studying the lesson.

Cicipu bidovu you study lesson.
You started to have to study the lesson.

Bidovu doycipu you study lesson.
["Doycipu" = past-imperfective.]
You have to have been studying the lesson.
Note that the last example is deontic, not epistemic (both interpretations are possible in English). In other words, it can be paraphrased "It was obligatory for you to have been studying the lesson".
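One way to picture why the order of disjuncts matters is to treat each disjunct as taking scope over everything to its right. That reading of the right-branching rule is my assumption here, and the function name below is hypothetical. The Python sketch nests the disjuncts accordingly, so that swapping two disjuncts produces a different structure.

```python
def nest_disjuncts(disjuncts, clause):
    """Nest clause-initial disjuncts so each one scopes over what follows it."""
    result = clause
    # Work from the innermost (rightmost) disjunct outward.
    for disjunct in reversed(disjuncts):
        result = (disjunct, result)
    return result


# "Bidovu cicipu ...": obligation scopes over the past-inceptive event.
print(nest_disjuncts(["Bidovu", "cicipu"], "you study lesson"))
# "Cicipu bidovu ...": the past-inceptive scopes over the obligation.
print(nest_disjuncts(["Cicipu", "bidovu"], "you study lesson"))
```

The two outputs differ only in which disjunct is outermost, which corresponds to the difference between "had to start studying" and "started to have to study".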
There is a special type of deontic modality that represents an imperative from the speaker that a hypothetical situation should or should not be brought about. The best example is a simple command, such as "Go away!".
An imperative is like deontic obligation in that it indicates that something should be done. However, it goes beyond deontic obligation by actually commanding the listener to do something.
In the interlingua, we will allocate the special part-of-speech marker "-oy" to convert a verb to an imperative. Specifically, "-oy" will be a short form for the normal verb suffix "-a" plus the 2nd person deictic "xevi" or "lixevi" meaning 'you' as the implied subject. Here is an example:
Jutamu doykavapoy kifigi
not open the window
Don't open the window!
Note that an imperative is always directed at the listener, even if the speaker is demanding action by a third party, as in the following example:
Kavapoy fagipa likobaybegi bape
cause leave teachers now
Have the teachers leave now!
In other words, it is always the listener that is being given responsibility for the action.
Also, an imperative, by its very nature, implies future tense. Thus, explicit marking of the tense of the embedded clause is not needed unless the aspect is not perfective, as in "Start singing!" (future-inceptive), or the event is a continuation of an event that started in the past, as in "Keep dancing!" (past+present+future-imperfective).
Now, some languages that have a distinct morphology for imperatives can also apply them directly to first and third persons. The ones that I am familiar with generally have the meaning 'let ...' as in "Let them leave if they really want to". However, these are not true imperatives. They are either permissives (i.e. the undefined deontic obligation derivation "dovanza"), or non-deictic disjuncts derived from appropriate mental state verbs expressing a sense of frustration or resignation. For imperatives that include the speaker (e.g. "Let's all leave now!"), we can use "kavapoy" and an embedded sentence whose subject is "covi", meaning inclusive (1+2) 'we/us'.
[This chapter is a compilation of a few articles I posted to the 'conlang' email discussion list in October 1993. Rather than spend a lot of time re-writing it to make it conform to the general style of this monograph, I decided to be lazy, and am inserting the original material with only some minor editing.]
An anaphor is a word that refers back to another word, phrase, or clause that preceded it. For example, in "John doesn't like apples. He prefers pears.", "he" is an anaphor for "John".
One of the problems that many people have is that they tend to think of anaphora as belonging to a special, closed class of words. In English, we think of third person pronouns ("he", "she", "it", etc.), auxiliaries ("be", "have", and "do") and a handful of oddballs ("herself", "each other", "so", "such", etc.) as most of the available anaphora. Here are some examples:
I love anchovy ice cream. Do you?
(Anaphor: "do")

William Shakespeare lived in a small town with his pet rock and his wife Fifi Yokohama. He would not eat veggies, she would not eat vegemite, and IT didn't eat at all.
(Anaphora: "his", "he", "she", and "IT")

John said he'll definitely attend the class on Creative Suffering. Louise will too.
(Anaphor: "will")
However, these 'closed class anaphora' are not the only ones. Consider the following:
1. Ten theoretical physicists and eight sanitary engineers attended the seminar. They were constantly heckling them.
Obviously, we can't use the anaphora "they" and "them" in the second sentence of (1). Instead, we need something like:
2. The engineers were constantly heckling the physicists.
The point, though, is that the expressions "the engineers" and "the physicists" in (2) are anaphora, and they can continue to be used as such throughout the remainder of the dialog. Thus, the headword of a phrase is used as a referent for the entire phrase. Since these anaphora are actually nouns, which are open class words, I'll call them open class anaphora.
Sometimes, especially when writing, we define new open class anaphora explicitly, as in:
3. This contract is between Steven Speedemon (henceforth the first party) and Wendall Whiplash (henceforth the second party)...
In (3) the anaphora are explicitly defined as "the first party" and "the second party". But we can also do it in informal writing and speech:
4. Ten computational linguists and ten theoretical linguists attended the seminar. The comps were constantly heckling the theos. Finally, the theos got so angry that they mooned the comps and left.
Another common (and much more formal) way to create open class anaphora is to use single letters or abbreviations:
5. In discussing the "Best Artificial Language Linguists Ever Designed" (BALLED), the designers forgot that there were many other lingwackos out there, who were out to get BALLED and who would ridicule it at every opportunity.
Of course, once an abbreviation becomes recognizable without introduction, it will no longer be an anaphor - it will be a proper noun (like USA, IBM, etc).
The major difference between the open (O) and closed (C) classes of anaphora is that the Os tend to keep their referents throughout the discourse, while the referents of the Cs are constantly changing. Thus, the anaphor "BALLED" in (5) will refer to the same thing throughout the dialog, while anaphora such as "he", "do", or "each other" will continually take on new meanings.
One other thing should be mentioned. Most anaphora are "backward-referring"; that is, the anaphor refers to something that was mentioned earlier. In English, it is also possible to have "forward-referring" anaphora, as in:
6. After ordering a pint of his favorite ale, Robert was perplexed when the barmaid replied that the fishmonger was next door. The Great English Vowel Shift had begun. [Thanks to Jim McCawley for this one!]
In (6) "his" precedes its referent "Robert". Forward-referring anaphora are sometimes called cataphora.
So, how do you handle anaphora in an interlingua intended for machine translation? In my opinion, the simplest, most natural, and most flexible solution is to use a form of contraction. The result would always be immediately recognizable as an anaphor by its form. The contraction could then be used as an anaphor for the entire phrase (or other constituent) from that point on. We will modify this rule to allow the contraction to take on a new meaning if its pattern matches a new and different phrase. Here's how something like this might sound in English:
The Sheboygan Bandits and the Milwaukee Dragoons faced off at Lovemud Stadium on Sunday. The Mil-goons beat the She-its out of their expected title.
In the interlingua, an anaphor will have three parts: the first CN(n) of the root of the headword of the expression it refers to plus "h" plus the part-of-speech vowel. If the headword of a clause is not a verb (i.e., it is a tense/aspect word, modal word, or any other deictic disjunct), then the anaphor will be formed from the clause's verb, even though, technically, the disjunct is the actual headword.
Consider the following examples (assume a right-branching word order):
libodepi feculo
bathrooms hot
"the hot bathrooms"
anaphor = "bohi"

timi ticulo
car fast
"the fast car"
anaphor = "tihi"

Doycipu bocala Lajonbegi zogumbe botimi
past-imperfect swim John to boat
"John was swimming to the boat."
anaphor of verb "bocala" = "boha"
anaphor of noun "Lajonbegi" = "jonhi"
In the first example, note that we used "bo" for the anaphor, because the anaphor must be formed from the root, not the prefix. In the second example, we used "ti", not "tim", because "ti" is the first CN of the root; i.e., if an anaphor is formed from a classifier, the final consonant of the classifier must be dropped. In the third example, we formed the anaphor from the verb "bocala" rather than from the disjunct "doycipu". Note also, that it is not possible for a verb anaphor to refer to a noun. Thus, there can be no confusion because "boha" cannot refer to the noun "botimi". Finally, note that the anaphor of "Lajonbegi" is "jonhi"; i.e., it includes the letter 'n'.
Anaphora may be formed from verbs (ending = "-ha"), nouns (ending = "-hi"), and adverbs/case tags (ending = "-he"). It is not legal to form anaphora of words that appear with other parts-of-speech, such as adjectives, deictic disjuncts, previous-word-modifiers, and so on. [However, it is possible to form an anaphor from a conjunction, as we will see below.]
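The formation rule is mechanical once the first CN(n) of the headword's root is known. The Python sketch below assumes the caller supplies that string directly, since separating prefixes from roots would need a lexicon that is not shown here. The function and table names are hypothetical.

```python
# Part-of-speech endings for anaphora, as listed above.
ANAPHOR_VOWEL = {
    "verb": "a",
    "noun": "i",
    "adverb": "e",    # adverbs and case tags share this ending
    "case tag": "e",
}


def form_anaphor(first_cn: str, part_of_speech: str) -> str:
    """Build an anaphor from the first CN(n) of a root.

    `first_cn` is assumed to be supplied already stripped of prefixes,
    e.g. "bo" for "libodepi" and "jon" for "Lajonbegi".
    """
    if part_of_speech not in ANAPHOR_VOWEL:
        # Adjectives, deictic disjuncts, etc. cannot form anaphora.
        raise ValueError(f"no anaphor for part of speech {part_of_speech!r}")
    return first_cn + "h" + ANAPHOR_VOWEL[part_of_speech]


print(form_anaphor("bo", "noun"))   # bohi
print(form_anaphor("ti", "noun"))   # tihi
print(form_anaphor("bo", "verb"))   # boha
print(form_anaphor("jon", "noun"))  # jonhi
```

Because the ending encodes the part of speech, "boha" and "bohi" stay distinct even though both start from "bo".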
An anaphor of a noun may change its part-of-speech to adjective to obtain a genitive meaning. For example, if the anaphor "bohi" refers to 'hot bathrooms', we could say something like "But windows boho were broken", meaning 'But their windows were broken'.
An anaphor cannot undergo further derivation by adding prefixes or suffixes. Thus, if we need to create a genitive anaphoric noun, we must use "ximyu", as in the following example:
A: Kobaybegi (= the teacher) sent me to get lidazipi (= the buckets).
B: Dahi ximyu Bill are here and dahi koho are over there.
   = Bill's are here and the teacher's are over there.
In the above, "dahi" is the anaphor for "the buckets". Thus, "dahi ximyu Bill" is a genitive anaphoric noun phrase equivalent to "those of Bill" or simply "Bill's". Note that "dahi koho" can also be rendered "dahi ximyu kohi".
Anaphora of verbs and adverbs/case tags may not change their part-of-speech.
An anaphor of an adverb will most often correspond to the English expressions "thusly" or "in that way/manner". [However, the translator will generate "then" for temporal referents, "there" for locative referents, and so on.] If an anaphor of a case tag is used, its meaning will include the case tag plus its argument(s).
Similarly, an anaphor of a verb will refer to the verb and all of its arguments. But the anaphor itself can never be a stand-alone sentence or take any arguments of its own (either core or oblique). The same applies to anaphora of case tags. For example, the sentence:
John kobaycalamba his son to swim = John taught his son to swim.
could be immediately followed by:
I know koha because his wife told me.
where "koha" is the anaphor for the complete first sentence. Thus, "koha" would be translated as 'it', 'this', 'that', or even 'it happened'.
The anaphor of an open noun must end in "-hi", not "-haw", since the anaphor itself is not open and cannot take any arguments.
Since abbreviations (i.e., "open" anaphora) are essentially proper nouns, we'll deal with them later in the chapter on Proper Names, Borrowed Words, Abbreviations, and Vocatives.
There will be times when an anaphor of a coordinated structure will be needed. Here are two examples:
a. The engineer and his assistant just left. THEY had to go to work.
b. The windows broke and a wall fell in. IT was a terrible experience.
An anaphor of a coordinated structure will be formed from the first root morpheme of the first conjunction plus 'h' plus the appropriate part-of-speech. For example, if the word meaning 'and' is "tesye", then the anaphor meaning 'THEY' in (a) will be "tehi", and the anaphor for 'IT' in (b) will be "teha". [We'll have more to say about conjunctions later.]
It is illegal to form an anaphor of a 1st or 2nd person deictic pronoun or of a coordinated structure that contains one. Here is an example:
The engineer and I worked in the computer room. WE finished the job in less than two hours.
Here, the correct anaphor for "we" is the 1+3 deictic pronoun "tuvi". Use of the anaphor "tehi" would either be illegal or would refer to someone else. In other words, anaphora can only refer to purely 3rd person entities (including events and oblique arguments).
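The rule for coordinated structures, together with the restriction to purely 3rd person referents, can be sketched as follows. This Python fragment is an illustration only. It assumes the caller supplies the first root morpheme of the conjunction ("te" for "tesye") and the grammatical person of each conjunct, and the function name is hypothetical.

```python
from typing import Sequence


def coordinated_anaphor(conj_morpheme: str, part_of_speech: str,
                        persons: Sequence[int]) -> str:
    """Form the anaphor of a coordinated structure.

    `persons` gives the grammatical person (1, 2, or 3) of each conjunct.
    Events and other non-pronoun conjuncts count as 3rd person.
    """
    if any(person in (1, 2) for person in persons):
        # A 1st or 2nd person conjunct requires a deictic pronoun instead.
        raise ValueError("structure contains a 1st or 2nd person deictic")
    vowel = {"noun": "i", "verb": "a"}[part_of_speech]
    return conj_morpheme + "h" + vowel


# (a) "The engineer and his assistant" -> THEY
print(coordinated_anaphor("te", "noun", [3, 3]))  # tehi
# (b) "The windows broke and a wall fell in" -> IT
print(coordinated_anaphor("te", "verb", [3, 3]))  # teha
```

Calling the function for "The engineer and I" (persons 3 and 1) raises an error, which corresponds to the requirement that the deictic pronoun "tuvi" be used in that case.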
Forward-referring anaphora (i.e. cataphora) are not really necessary in a language and are illegal in the interlingua.
It is also illegal to use an anaphor instead of a reflexive construction. Here is an example:
Samantha looked at herself in the mirror.
*Samantha looked at sahi in the mirror.
where "sahi" is an appropriate anaphor of "Samantha". [We'll discuss how to create proper nouns later.] Reflexive constructions must use the suffix "-av", an appropriate reflexive word (such as "tomavi"), or an AP verbal derivation.
It's important not to confuse anaphora with deictics. Deictics, as we discussed earlier, are pointers to entities external to the discourse (e.g. this book, there, yesterday, you, then, etc.). Anaphora, however, are pointers to entities internal to the discourse (e.g. I saw Louise before SHE left, THAT is why she was so upset, IT caused all kinds of problems, etc.). Natural languages often use third person deictics for both functions (e.g. deictic: "Please hand me THAT book" vs. anaphoric: "I knew THAT").
In the system presented here, deictics and anaphora are completely different, intentionally, because their semantics are different. This implies that the speaker should be careful to use deictics only where appropriate. Deictics are essentially pointers. For example, the 3rd person plural personal pronoun "lidivi" literally means 'those entities over there'. Thus, the word "they" is a deictic in "(Speaker points to some people nearby) who are THEY?", while "they" is an anaphor in "I saw Bill and Mary yesterday. THEY just bought a new house." With the system presented here, third person personal pronouns will hardly ever be necessary. Instead, anaphora will almost always be used in their place. Some people may find this distinction a difficult one to master, especially if their native language allows third person deictics to be used as anaphora.
However, the problem is not quite as severe as it may seem. Keep in mind that third person deictics refer to entities other than the speaker or listener. Thus, their meaning automatically includes any anaphoric referent. It is for this reason that many natural languages use third person deictics as anaphora. In other words, third person referents are usually both internal and external to the discourse. Thus, either an anaphor or a deictic can be used. However, in the system presented here, an anaphor is never ambiguous, whereas a third person deictic can definitely be ambiguous. Consider the following:
Bill visited John yesterday. He was totally drunk.
If you use an anaphor, "he" will have only one possible referent. If you use a deictic, "he" can refer to either "Bill" or "John". It could even refer to someone other than Bill or John.
Thus, use of deictics in place of anaphora for third person referents is semantically correct, but may be ambiguous. However, even in cases where ambiguity is unlikely, I feel that use of deictics in place of anaphora should be discouraged.
Note that the above comments apply only to deictic pronouns and their genitive forms. Locative and temporal deictics are never ambiguous, and demonstratives simply imply an association, even though the association often ends up being locative. Consider the following:
Bill: I used the red car.
Mary: I don't like that car because it's a gas guzzler.
In the above, the word "that" should be the demonstrative "diso", even though it is not locative. In other words, a first person demonstrative implies an unspecified association (not necessarily a location!) with the speaker, second person implies an association with the listener, and third person implies an association with someone or something else.
In my earlier essay on syntax, I discussed two kinds of relative clause that are most common among natural languages. The first kind, which is found in a large minority of natural languages (including English), uses a single relative conjunction (i.e. "that", "who", or "which") plus a gap, as in the following example:
John saw the book THAT Bill bought (gap).
Note that "the book" is the object of the verb "saw", as well as the implied object of the verb "bought". The gap is required by English syntax.
The second kind of relative clause, which is found in a slight majority of natural languages, uses a relative conjunction plus a resumptive pronoun, as in the following example:
John saw the book THAT Bill bought IT.
Here, the gap is filled by the resumptive pronoun "IT" that refers back to "the book".
The use of resumptive pronouns has one disadvantage compared to the use of gaps, but has four advantages. The single disadvantage is that an extra word is needed; i.e. the resumptive pronoun (RP) itself. The advantages are as follows:
1. ANY noun can be relativized, regardless of the function it performs in the embedded sentence, or of the number of functions it performs:
   Gap: *I saw the car WHOSE driver got thrown from.
   RP: I saw the car WHICH ITS driver got thrown from IT.
   Here, "IT" is the resumptive pronoun and has the morphological form of a noun. "ITS" is the possessive form of the resumptive pronoun (using "ximyu").
2. ANY noun can be relativized, regardless of how deeply the gap or resumptive pronoun is embedded:
   Gap: *This is the man WHO Louise bought a car from the same dealer that sold a Cadillac to.
   RP: This is the man WHO Louise bought a car from the same dealer that sold a Cadillac to HIM.
   Here, "HIM" is the resumptive pronoun and unambiguously links to "the man".
3. Use of a resumptive pronoun allows it to be combined with other nouns in coordinated structures:
   RP: I just met this real tall guy WHO my sister dated both HIM and HIS real short brother.
4. Computer parsing of relative clauses using resumptive pronouns is much easier. Parsing gaps can be extremely complicated, and can often fail completely without even more complicated semantic/contextual processing.
In order to deal with the above examples, languages like English must split them up into two or more sentences. For example, the third example would have to be something like this:
My sister dated this real tall guy and his real short brother. I just met the tall one.
Since the advantages of resumptive pronouns significantly outweigh the single disadvantage, the interlingua will implement relative clauses with resumptive pronouns.
We need to create only three basic words to completely implement relative clauses that use resumptive pronouns: a relative conjunction, a resumptive pronoun, and a genitive/adjective form of the resumptive pronoun.
A relative conjunction simply provides a genitive link between a noun and the relative clause that modifies it. Thus, it performs exactly the same function as the genitive linker "ximyu" which we've used extensively so far. The difference, though, is that we've always used it to link two noun phrases. However, there is no reason why the argument of "ximyu" cannot be an embedded clause. In other words, the genitive linker "ximyu" performs the function of the English genitive preposition "of" when followed by a noun phrase, and performs the function of the English relative conjunctions "that/who/which" when followed by an embedded clause. This is more easily understood if we paraphrase these functions as in the following examples:
Billy's toys = the toys 'of-the-entity' Billy
the boy "who" broke the window = the boy 'of-the-event' he broke the window
Note that, while this approach may seem odd to speakers of English, it is semantically correct. In fact, many natural languages (most notably Mandarin Chinese) use exactly the same approach.
Thus, in the interlingua, we will use the following:
Relative conjunction: ximyu
For the resumptive pronoun and its genitive form, the obvious choice is to use an anaphor. (And since I will be using English for my examples, I will simply use the corresponding English anaphor, capitalized; i.e., "HE", "HER", "THEIR", etc).
Here are a few examples:
The shirt ximyu you want IT is on the bed. = The shirt that you want is on the bed.
The police caught the man ximyu HE robbed the bank. = The police caught the man who robbed the bank.
Here's the hammer ximyu he broke the window with IT. = Here's the hammer that he broke the window with.
They examined the room ximyu the fire started in IT. = They examined the room that the fire started in.
Note that "ximyu" can be glossed in English as either "who", "which", or "that", depending on its referent.
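To illustrate how mechanical the mapping from the resumptive-pronoun form back to an English gap can be, here is a minimal Python sketch. It is not part of the interlingua software. The function name, the pronoun list, and the fixed gloss "that" are assumptions made for illustration, and genitive forms such as "HIS" (which need "whose" in English) are deliberately not handled.

```python
# Illustrative sketch only. The pronoun list and the fixed gloss "that"
# are assumptions; genitive forms (HIS, ITS, ...) are not handled.
RESUMPTIVE_PRONOUNS = {"IT", "HE", "SHE", "HIM", "THEY", "THEM"}


def to_english_gap(sentence: str) -> str:
    """Replace "ximyu" with "that" and drop the first resumptive pronoun after it."""
    out = []
    in_clause = False   # True once "ximyu" has been seen
    dropped = False     # True once the resumptive pronoun has been removed
    for token in sentence.split():
        core = token.rstrip(".,;?!")
        trailing = token[len(core):]
        if core == "ximyu":
            out.append("that" + trailing)
            in_clause = True
        elif in_clause and not dropped and core in RESUMPTIVE_PRONOUNS:
            dropped = True
            # Keep punctuation that was attached to the dropped pronoun.
            if trailing and out:
                out[-1] += trailing
        else:
            out.append(token)
    return " ".join(out)


print(to_english_gap("The shirt ximyu you want IT is on the bed."))
print(to_english_gap("They examined the room ximyu the fire started in IT."))
```

Going in the other direction, from an English gap to the resumptive pronoun, is the hard part, because the parser first has to find the gap. That asymmetry is the fourth advantage listed earlier.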
Here are some examples using genitive forms:
That's the man ximyu the police just arrested HIS wife. = That's the man whose wife the police just arrested.
That's the man ximyu HIS wife was just arrested by the police. = That's the man whose wife was just arrested by the police.
In summary, there is no need to create special words, morphemes, or syntax to deal with relative clauses. Features already available in the language are more than capable of handling the task.
Earlier, we discussed how to modify a locative relationship with a relative distance or time. To refresh your memory, here are the relevant examples:
John was sitting fage the door. = John was sitting away from the door.
John was sitting fage-zetovay-xekumay the door. = John was sitting two meters from the door.
The chair faga the window. = The chair is away from the window.
The chair faga-zetovay-xekumay the window. = The chair is two meters from the window.
where "fage" means 'away from', "zetovi" means 'meter', and "xekumo" means 'two'. In other words, the use of previous-word modifiers allows us to effectively modify a case tag or verb. [Keep in mind that case tags and adverbs are not modifiers of verbs - they are arguments of verbs.]
However, there is no way using the above technique to deal with more complex expressions, such as a modifier that is a compound expression. For example, how does one handle "I arrived two hours and ten minutes before you."?
Since a relative clause allows a complex expression (i.e., a clause) to modify a noun, we can extend the concept to allow complex modification of case tags and adverbs. However, we cannot use the open adjective form "ximyu", since an open adjective cannot modify a case tag or an adverb. Instead, we will have to use the open previous-word modifier form "ximwa" ["-wa" is the part-of-speech marker for open previous-word modifiers]. Here is an example:
I arrived cipe ximwa cantovi xekumo tomo tesye fotovi bajukumo tomo you. = I arrived two hours and ten minutes before you.
where "cipe" is the case tag meaning 'before', "cantovi" means 'hour', and "fotovi" means 'minute'. Note that we had to use "tomo" to make the phrases indefinite. Otherwise the translation would have been "I arrived THE two hours and THE ten minutes before you". Note also that since "ximwa" is a previous-word modifier, it must immediately follow "cipe". In other words, it must appear between "cipe" and its argument "you".
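The word order of such a compound modifier can be sketched in a few lines of Python. This is only an illustration built from the single example above: the function name is hypothetical, and the pattern (unit, then number, then "tomo", with phrases joined by "tesye") is assumed to generalize.

```python
def build_measure_modifier(case_tag, measures):
    """Return '<case_tag> ximwa <unit> <number> tomo tesye ...'.

    measures is a list of (unit_word, number_word) pairs,
    e.g. ("cantovi", "xekumo") for 'two hours'.
    The pattern is inferred from one example and is an assumption.
    """
    if not measures:
        raise ValueError("at least one measure phrase is needed")
    # Each phrase is made indefinite with "tomo"; phrases are joined by "tesye".
    phrases = [f"{unit} {number} tomo" for unit, number in measures]
    # "ximwa" must immediately follow the case tag that it modifies.
    return f"{case_tag} ximwa " + " tesye ".join(phrases)


modifier = build_measure_modifier(
    "cipe", [("cantovi", "xekumo"), ("fotovi", "bajukumo")]
)
print(f"I arrived {modifier} you.")
```

The argument of the case tag ("you" in the example) is appended after the whole modifier, which keeps "ximwa" adjacent to "cipe".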
Relative clauses can either modify nouns or noun phrases, or act as noun phrases. Those that act as noun phrases are usually called nominal or headless relative clauses. In the interlingua, these can be easily implemented by using the open noun form of the relative conjunction rather than the open adjective form. In other words, we can use "ximaw" instead of "ximyu". In addition, since "ximaw" contains both the relationship and the referent, the anaphor will always be "xihi" or "xiho" (genitive). Here are a few examples:
I know WHAT broke the window. = I know XIMAW XIHI broke the window.
They saw WHAT John brought. = They saw XIMAW John brought XIHI.
She showed me WHERE the boys went. = She showed me XIMAW the boys went zogumbe XIHI. [Here, "zogumbe" is the 'destination' case tag that we derived earlier. Literally, the sentence can be glossed as 'She showed me what the boys went to it'.]
He told me WHO he bought the book for. = He told me XIMAW he bought the book tomume XIHI. [Here, "tomume" is the 'beneficiary' case tag.]
You told me WHY you sold it. = You told me XIMAW you sold it tomame XIHI. [Here, "tomame" is the 'reason' case tag.]
Bill told me HOW he did it. = Bill told me XIMAW he did it busege XIHI. [Here, "busege" is the 'instrument/means/method' case tag.]
I don't like THE WAY you behaved yesterday. = I don't like XIMAW you behaved zunxumege XIHI yesterday. [Here, "zunxumege" is the 'manner' case tag.]
Note that "ximaw" can be paraphrased as "the person/place/time/thing which" or simply "that which". Thus, for nominal relative clauses, the open noun form of the relative conjunction acts as both the relative conjunction and the argument of the preceding verb.
It's also possible to use derivations of the case tags directly, without a relative conjunction. In order to do this, however, we must invert the case tag, convert it to a noun, and then open up its argument structure. For example, the P/F-s locative case tag "zog" can be paraphrased as 'being at/in'. Thus, the open F/P-s inverse noun form "zogangaw" means simply 'the location where'. In other words, the argument of the open noun (i.e. the embedded sentence or the patient of the embedded sentence) will be the patient of the inverted locative:
She showed me WHERE the boys bought the magazine. = She showed me ZOGANGAW the boys bought the magazine. [Literally, this can be glossed as 'She showed me the location where the boys bought the magazine'.]
Let's do the same for the other examples that used case tags:
He told me WHO/WHAT he bought it FOR. = He told me TOMUMANGAW he bought it. [In English, this can be closely rendered as 'He told me the beneficiary for which he bought it'.]
You told me WHY you sold it. = You told me TOMAMANGAW you sold it. [This sentence can be glossed as 'You told me the reason for your selling it'.]
Bill told me HOW he did it. = Bill told me BUSEGANGAW he did it. [This sentence can be glossed as 'Bill told me the method of his doing it'.]
I don't like THE WAY you behaved yesterday. = I don't like ZUNXUMEGANGAW you behaved yesterday. [This sentence can be glossed as 'I don't like the manner in which you behaved yesterday'.]
The astute reader may now be wondering why there is any need at all for a relative conjunction, since we can use an appropriate open adjective in its place. Here is an example:
I saw the building that he was walking to.
= I saw the building ximyu he was walking zogumbe IT.
OR = I saw the building ZOGUMBANGYU he was walking. [Literally, 'I saw the building to which he was walking'.]
In other words, we can take advantage of the perfect symmetry inherent in the way we are designing case tags. If a case tag can link an argument of a main verb or the entire clause to its own argument, the inverse form can perform the exact reverse operation. This is exactly what we did in the last example. Thus, the inverse open adjective form can be paraphrased as 'X-which', where "X" is a case tag. Here's another example:
There's the girl that he bought the flowers for.
= There's the girl ximyu he bought the flowers tomume HER.
OR = There's the girl TOMUMANGYU he bought the flowers.
Here, "tomumangyu" is exactly equivalent to English "for whom".
However, the above approach cannot be used with the special case tags "tomese" (passive), "tomose" (anti-passive), "tomeve" (co-subject), and "tomove" (non-subject), because they simply provide an oblique version of a primary argument of their head and do not have real argument structures.
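The derivations in this section are regular enough to sketch in code. The Python fragment below is only an illustration, not the author's software. It assumes, from the spellings shown above, that a case tag ends in "-e", that the segment "ang" marks inversion, and that "-aw" and "-yu" mark open nouns and open adjectives. The function and table names are hypothetical.

```python
# Part-of-speech endings as they appear in the examples above.
ENDINGS = {"open_noun": "aw", "open_adjective": "yu"}

# Special case tags without real argument structures; they cannot be inverted.
NOT_INVERTIBLE = {"tomese", "tomose", "tomeve", "tomove"}


def inverse_form(case_tag, part_of_speech):
    """Derive the inverse open noun or open adjective from a case tag.

    Assumption (inferred from the examples, not a stated rule):
    root = case tag minus its final "-e"; result = root + "ang" + ending.
    """
    if case_tag in NOT_INVERTIBLE:
        raise ValueError(f"{case_tag!r} has no real argument structure")
    if not case_tag.endswith("e"):
        raise ValueError(f"{case_tag!r} does not look like a case tag")
    root = case_tag[:-1]
    return root + "ang" + ENDINGS[part_of_speech]


print(inverse_form("zoge", "open_noun"))          # 'the location where'
print(inverse_form("tomume", "open_adjective"))   # 'for whom'
```

Rejecting the four special case tags up front mirrors the restriction stated above: they only provide an oblique version of a primary argument, so there is nothing to invert.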
All of the relative clauses we've discussed so far are typically referred to as restrictive relative clauses, since they 'restrict' or 'reduce' the number of possible referents of the head noun. Some languages, such as English, allow the same form to be used with a non-restrictive sense (but with a noticeable difference in timing and intonation). These clauses simply provide additional information about the head noun. Here are a few examples:
Restrictive: The man who robbed the bank...
Non-restrictive: The elephant, which is a large animal, ...
Restrictive: The mower that is in the garage is broken.
Non-restrictive: The mower, which is in the garage, is broken.
Since a non-restrictive relative clause is the same as any other kind of parenthetical structure, it should be treated as such. It should not be treated in the same way as a restrictive relative clause, for the simple reason that the two are semantically quite different. [I will discuss how to deal with parenthetical structures later.]
In the chapters on comparatives and modality, we used the interrogative suffix "-en". We now need to address how to implement other interrogatives, in which the listener is being asked, in effect, to "fill in a blank". Here are some English examples:
WHO closed the window?
WHY did he close the window?
HOW did he close the window?
WHERE did he close the window?
We also need interrogative modifiers, as in the following:
WHICH boy closed the window?
WHAT kind of people live here?
HOW many people live here?
HOW heavy was the box?
In order to create interrogative sentences, we used the very general interrogative deictic disjunct "tamenu". We can naturally extend the use of "-en" by suffixing it to the generic root "tom". "Tomeno" will be a "0" structure adjective by default, and is equivalent to the interrogative English adjective "which". The noun form, "tomeni", means 'what'. Note that this is exactly what we did when we derived the impersonal forms "tomo", "tomi", "tome", "jutomo", "jutomi", etc.
When referring to people, we can suffix "-en" to the root meaning 'person' = "beg". Thus, "begeni" corresponds to the interrogative English pronoun "who". Other parts-of-speech will also be useful, as we'll see below.
We will also apply the following rule:
When "-en" is suffixed to a stem whose argument structure is open (i.e. verbs, case tags, open nouns, etc), then it will be equivalent to "tomeni" appearing in the rightmost unfilled argument slot.
Here are some examples:
Who opened the window?
= Begeni doykavapa the window? [Here, "doykavapa" is the A/P-d verb meaning 'to open'.]
What did Billy open?
= Billy doykavapena?
= Billy doykavapa tomeni?
Who opened what?
= Begeni doykavapena?
= Begeni doykavapa tomeni?
Why did he open the window?
= He doykavapa the window tomamene?
= He doykavapa the window tomame tomeni? [Here, "tomame" is the reason case tag.]
How did he open the window? OR With what did he open the window?
= He doykavapa the window busegene?
= He doykavapa the window busege tomeni? [Here, "busege" is the instrument/method case tag.]
Where did he open the window?
= He doykavapa the window zogene?
= He doykavapa the window zoge tomeni? [Here, "zoge" is the locative 'at/in' case tag.]
How heavy is the box? or What does the box weigh?
= The box culunzene?
= The box culunza tomeni? [Here, we are using the P/F-s verb "culunza", meaning 'to weigh'. As we discussed earlier in the chapter on Counts and Measures, it is the P/F-s verb form of the P-s adjective "culo" meaning 'heavy'.]
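The rule above is easy to apply mechanically. The following Python sketch expands a word carrying "-en" into the plain word followed by "tomeni". It rests on assumptions drawn only from the spellings in the examples: that "-en" sits directly before a final "-a" (verb) or "-e" (case tag), and that no other words end this way. The function names are hypothetical, and the sketch would misfire on English words such as "scene" or "arena".

```python
def expand_interrogative(word):
    """Split e.g. 'doykavapena' into ('doykavapa', 'tomeni').

    Returns None when the word does not end in "-en" plus "a" or "e".
    This ending test is an assumption inferred from the examples.
    """
    if len(word) > 3 and word[-3:-1] == "en" and word[-1] in "ae":
        return word[:-3] + word[-1], "tomeni"
    return None


def expand_sentence(sentence):
    """Rewrite every '-en' verb or case tag as '<word> tomeni'."""
    out = []
    for token in sentence.split():
        core = token.rstrip("?.,")
        trailing = token[len(core):]
        expanded = expand_interrogative(core)
        if expanded is None:
            out.append(token)
        else:
            base, filler = expanded
            # Sentence punctuation moves onto the inserted "tomeni".
            out.extend([base, filler + trailing])
    return " ".join(out)


print(expand_sentence("Billy doykavapena?"))
print(expand_sentence("He doykavapa the window tomamene?"))
```

Words such as "begeni" and "tomeno" are left alone, since their stems are not open and the rule does not apply to them.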
For the English expression 'how many' or 'how much', we need to use "-en" with the numeric root "kum" in exactly the same way we derived the non-specific numeric words. Here are a few interrogative examples:
How many boxes are there (= the boxes number how many)? = There are kumeno boxes?
How many people live here? = Kumeno people live here?
In other words, when "-en" is used as a modifying concept, we are asking the listener to indicate the actual "position" among the various possibilities. Thus, for example, the simple adjective "tomeno" corresponds exactly to the English word "which/what", and "bodameni" corresponds to 'which duck(s)'. Here are some examples:
Which duck opened the window?
= Tomeno bodami doykavapa the window?
OR = Bodameni doykavapa the window?
Who was a duck?
= Begeni dapa bodami? ["dapa" = verb 'to be'.]
= Begeni bodama? [Literally: "Which person was a duck", where "beg" is the root meaning 'person'.]
Whose duck is that?
= Bodami ximyu tomeni is that?
OR = Bodami ximyu begeni is that?
How is it that the duck lives here?
= Bodami live here tomene? [In essence, "tomene" asks "what other oblique arguments can be added to this verb?". Do not confuse this with the more explicit instrument/means/method construction "busegene" = 'with what', 'how', or 'by what means/method'.]
In summary, "tomeni" (or a derivative) occupies the position of a missing word or expression that would have provided more detailed information, while indicating that it should be replaced by something more specific.
English speakers should be careful not to confuse the use of "-en" derivations with English equivalents that are not truly interrogative. Consider the following from the previous chapter:
Bill told me HOW he did it. = Bill told me BUSEGANGAW he did it. [This sentence can be glossed as 'Bill told me the method of his doing it'.]
Note that "how" is not a true interrogative.
Now consider the following:
How much money do you have? I know how much money you have.
The first sentence uses the true interrogative "kumeno". In the second, however, "how much" literally means "the amount/quantity of", which is not an interrogative even though it may appear to be in English. The equivalent of "the amount/quantity of" in the interlingua is the P/F-s open noun derivation "kumunzonaw", where "-on" is the quality/ability suffix.
Now, here's another one:
How much do you like the teacher? I know how much you like the teacher.
The first example uses the particle "jopenay" to modify the verb "like" because it is asking for the degree of "liking". The second example, however, cannot use "-en" because it's not a true interrogative. Instead, we will use an appropriate derivative of non-interrogative "jop":
jopunze = 'to the degree/extent of/that' (P/F-s case tag)
jopunzaw = 'degree or extent of/that' (open noun)
jopunzi = 'degree or extent' (noun)
Thus, the second example in the interlingua would look like this:
I know jopunzaw you like the teacher.
Finally, let's do one more:
Whose book are you reading? I know whose book you are reading.
Here, we have to restate the sentence as "I know the person who you are reading his book", where "his" is an anaphor for "person". Note also that, in this example, the syntactic object of "know" is "book", but the semantic object is not "book" - it's actually the person associated with the book.
Similar kinds of periphrasis will be needed for other non-interrogatives that use interrogative words in English.
There are several abstract relationships that are often discussed in the technical literature on semantics. Their simplest and most basic forms are all P/F-s verbs. I will simply list them and provide examples of their use. By now, potential derivations using these words should be obvious.
Here is a partial list:
Association: P/F-s -> 'to have an unspecified relationship with', 'to be involved with', 'to have something to do with' [This, of course, is the verb "xuma".]
Equality: P/F-s -> 'to be', 'to be equal to', 'to be the same as' E.g. John is the new president of the company. [This is the verb "dapa", which we derived earlier when we discussed the 'state' case role.]
Similarity: P/F-s -> 'to be like', 'to be similar to', 'to resemble', 'to share/have something in common with' E.g. John is like his father. [This is the verb "zunxuma", which we derived earlier when we discussed the Manner case role. We also derived some other useful words from the same root in the section on polarity.]
Equivalence: P/F-s -> 'to be equivalent to', 'to amount to', 'to be comparable to' E.g. The cross-border raid was equivalent to an act of war.
Analogy: P/F-s -> 'to be analogous to', 'to be equivalent to' E.g. A dog's relationship to a puppy is analogous to a cat's relationship to a kitten. (i.e. A dog is to a puppy as a cat is to a kitten.) Execution for murder is analogous to fines for petty theft. (i.e. Execution is to murder as a fine is to petty theft.)
Proportionality: P/F-s -> 'to be proportional to' E.g. Volume is proportional to the radius cubed.
Paronymy: P/F-s -> 'to be the source of', 'to provide/supply' E.g. This mine provides gold and platinum. Inverse F/P-s -> 'to derive/come from', 'to be a derivative of' E.g. Kerosene is a derivative of crude oil. [Incidentally, P is referred to as the base, while F is referred to as the paronym.]
Hyponymy: P/F-s -> 'to be a kind/type/variety/subtype/example of' E.g. A horse is a kind of mammal. A dialect is a variety of a language. Inverse F/P-s -> 'to subsume', 'to include' E.g. Mammals include horses, dogs, and cats. [Incidentally, P is referred to as a hyponym of F, and F is referred to as a superordinate of P. Thus, 'horse' is a hyponym of 'mammal', and 'mammal' is a superordinate of 'horse'.]
Relatedness: P/F-s -> 'to be related to', 'to be in the same class as' E.g. Cats are related to dogs, both being mammals. Magpies are related to crows.
Compatibility: P/F-s -> 'to be compatible/consistent/go together with' E.g. My views are compatible with yours. His approach is consistent with his earlier work.
Constituency or Partitive relationship: P/F-s -> 'to be part/element/component/member/constituent of' E.g. A finger is part of the hand. Inverse F/P-s -> 'to include', 'to have (as a component or part)', 'to contain', 'to comprise' E.g. The trip will include a stop in Rome. [Incidentally, P is referred to as the meronym of F, while F is referred to as the holonym of P. Thus, 'finger' is a meronym of 'hand', and 'hand' is a holonym of 'finger'.]
Purpose: P/F-s -> 'to be the purpose/objective/goal/aim/point of', 'to be intended to/for' E.g. The purpose of the catalyst is to increase the reaction rate. The root for this concept is the P/F-s "cadap". From it, we can also derive the very useful purpose case tag "cadape" with the meaning of English "(in order) to", "so that", "in order that", "for (the sake/purpose of)", and so on.
Readiness: P/F-s -> 'to be ready/fit/prepared/adapted to/for' E.g. The new classrooms are ready for the students. The children are ready to leave now. The root for this concept is the P/F-s "joyxum". From it, we can also derive the very useful case tag "joyxume", meaning 'in case (of)', as in the following sentences: I brought a book IN CASE the flight is delayed. We should buckle our seatbelts IN CASE OF accident. A flashlight was on the table IN CASE OF a power outage. We can also derive words such as A/P/F-d "joyxumamba", meaning 'to prepare/adapt to/for', as in "I prepared the children for school", and AP/F-d "joyxumimba", meaning 'to get ready or prepare/ready oneself to/for'.
Supplementation: P/F-s -> 'to be in addition to', 'to be an adjunct or supplement to', 'to be an augmentation of' E.g. The money is a supplement to the normal wage.
Alternativity: P/F-s -> 'to be an alternative to/for' E.g. Compromise is the only alternative to war. [Alternativity implies that there is a choice among options.]
Alternation: P/F-s -> 'to alternate with', 'to take turns with' E.g. The girls take turns with the boys at the swimming pool. Red flags alternate with blue flags in the row of flagpoles. [Do not confuse 'alternativity' with 'alternation'. An alternative is an option while an alternate precedes or follows in temporal or locative sequence.]
Substitutivity: P/F-s -> 'to be a substitute or replacement for' E.g. John is a replacement for the former teacher. [Note that this relationship can be used to derive the case tag meaning 'instead of', 'rather than', or 'in place of'.]
Enablement: P/F-s -> 'to enable or make possible', 'to be a prerequisite for' E.g. The new policy will enable us to hire better engineers.
Result: P/F-s -> 'to result in, produce, lead to, yield, bring forth, have as a result/outcome/product' E.g. Your stupidity resulted in lower profits. Inverse F/P-s -> 'to be the result/outcome/product of' E.g. The high dropout rate is the result of overcrowded classes. This is the root "jedap". The inverse case tag form "jedapange" has the meaning of the word 'that' in a sentence such as "He's so rich THAT he can afford a yacht". Literally, it means 'He is so rich, the result being that he can afford a yacht'.
Contingency: P/F-s -> 'to be contingent/conditional on', 'to hinge on', 'to depend on' E.g. The success of the project depends on complete cooperation. Inverse F/P-s -> 'to entail/imply' E.g. 'He shouted again' entails 'He shouted earlier'. Lightning implies thunder. [Important: do not confuse 'contingency/implication' with 'causation'. See the section on Conditional Clauses for important applications of this concept.]
Inherentness: P/F-s -> 'P has the qualities/nature/characteristics of F', 'P is inherently F', 'P is F by nature' E.g. A cat is a meat-eater by nature. He has the qualities of a good teacher. Inverse F/P-s -> 'to be inherent to', 'to be an inherent quality of' [This relationship is represented by the root "kaxum". We can also derive the interrogative adjective "kaxumeno" with the meaning 'what kind of' (literally 'being by nature what?'). The interrogative verb "kaxumena" means 'What is a', as in "What is a duck?" (literally, 'A duck is by nature what?' or 'What is the nature of a duck?').]
Meaning: P/F-s -> 'to mean', 'to signify', 'to stand for', 'to denote', 'to indicate', 'to represent' E.g. The French word "maison" means 'house'. His behavior signifies that he is very angry.
And there are many others.
Note that all of the technical labels that we introduced above, such as "paronym", "meronym", and "superordinate" can be easily derived from the active and inverse forms of the corresponding verbs.
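As a small illustration of how these labels fall out of the argument structure, the Python sketch below maps each relation to the labels of its P and F arguments and swaps them for the inverse F/P-s form. Only the three label pairs named in the list above are included. The table and function names are hypothetical, and the English relation names stand in for interlingua roots that the list does not give.

```python
# (label of P, label of F) for the relations whose labels are named above.
ROLE_LABELS = {
    "paronymy": ("base", "paronym"),
    "hyponymy": ("hyponym", "superordinate"),
    "constituency": ("meronym", "holonym"),
}


def label_arguments(relation, first, second, inverse=False):
    """Label the two arguments of a P/F-s verb, or of its F/P-s inverse.

    In the active form the first argument is P and the second is F;
    in the inverse form the order is reversed.
    """
    p_label, f_label = ROLE_LABELS[relation]
    if inverse:
        return {first: f_label, second: p_label}
    return {first: p_label, second: f_label}


# 'A horse is a kind of mammal.'
print(label_arguments("hyponymy", "horse", "mammal"))
# 'Mammals include horses.' (inverse form)
print(label_arguments("hyponymy", "mammal", "horse", inverse=True))
```

Both calls assign the same label to each word, which is the symmetry between active and inverse forms that the derivation of these labels relies on.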
A conjunction links two constituents, and always provides additional information about the relationship between the items being linked. Also, some conjunctions can be concatenated to link more than two items. Here are a few examples:
Louise AND Bill just left.
Louise OR Bill OR Mike will give the talk.
Bill will go shopping IF Louise wants him to.
John just went shopping, BUT he forgot to buy coffee.
He bought the book EVEN THOUGH it was very expensive.
He was the only one who was sober, SO he had to drive.
He finished his homework at 7 PM, AND THEN he went outside to play.
Bill missed the target; IN OTHER WORDS, he lost the match.
Conjunctions always link two expressions of the same syntactic type. For example, if a noun phrase immediately follows a conjunction, the conjunction links it to one or more preceding noun phrases. If a complete clause immediately follows a conjunction, the conjunction links it to one or more preceding clauses. And so on.
Conjunctions can be grouped into the following general categories:
Additive: and, also, in addition, besides, furthermore, moreover, similarly, likewise, in the same way, in other words, in conclusion, in summary, etc.
Causal: if, then, unless, even if, so, consequently, thus, it follows, because, under the circumstances, for this reason, therefore, etc.
Concessive/Adversative: but, and even, in spite of, however, although, albeit, notwithstanding, anyway, nevertheless, even though, regardless, even so, despite, just the same, even now, for all that, still, all the same, yet, whether or not, whatever, no matter what, in fact, as a matter of fact, despite that, on the other hand, etc.
Substitutive: or, instead of, rather than, in place of, etc.
Temporal: then, next, after that, finally, afterwards, before that, at last, at the same time, subsequently, etc.
Continuatives/Cohesives: uh, now, well, anyway, okay, at any rate, in any case, etc.
[Incidentally, the above categories reflect linguistic/discourse distinctions based on actual usage in natural language, as opposed to logical distinctions. Logicians categorize conjunctions quite differently, and, in the process, end up excluding words and expressions that are truly conjunctive in nature, or end up restricting their meanings more than natural languages do. For example, most logicians and formal semanticists would not consider expressions such as "in other words", "afterwards", "on the other hand", and "anyway" as actual conjunctions, because they do not perform basic logical operations on truth conditions. In natural language, however, these are conjunctions and they perform important conjunctive discourse functions.]
Conjunctions are interesting because of their large numbers and because of the great variety of relationships that they represent. Also, the vast majority of them are derived from basic, open class words. Thus, while conjunctions do perform a function that is quite different from verbs, nouns, adjectives, etc., their meanings include the concepts of many of these words.
Conjunctions fall into three general categories depending on how they are used:
1. True conjunctions. These always link a constituent which follows the conjunction with the closest preceding constituent of the same type (i.e., clause with clause, noun phrase with noun phrase, etc). The linkage is thus syntactically precise. Examples: and, or, but, unless, if.
2. Normal Disjuncts. These only loosely link a sentence which follows the conjunction with one or more of the preceding sentences. The syntactic linkage is often vague. Examples: however, on the other hand, also, in other words, despite that, etc. [Incidentally, these are normal disjuncts derived using middle voice operations. They can never be deictic.]
3. Case tags. These always link their arguments with one or more arguments in the main clause, or with the entire event represented by the main clause. Any transitive verb can be converted to a case tag.
To illustrate the difference between true conjunctions and normal disjuncts, consider the following:
The project was over-budget and under-staffed. The project manager was a political hack and his choice for a tech lead was a bureaucrat who could barely spell his name. Three of the engineers and four of the secretaries were sick most of the time. To make matters worse, the technicians had to spend most of their time on another project that had higher priority and more adequate funding. But the project was a great success.
Notice how "and" precisely links its arguments, creating new constituents of the same syntactic type. The syntax of the linkage is not in doubt.
But there is doubt about the linkage of the word "but" as it is used above. Does it link to the immediately preceding sentence, to the preceding two sentences, or to the entire preceding paragraph? If the above "but" were a true conjunction, there would be no doubt about which items were being linked. In effect, the semantics of "but" in the above example is not compatible with the syntax of a true conjunction since the linkage is not clear. The actual linkage can only be determined through context. [Compare it with "John stayed but Jill left", in which "but" is a true conjunction.]
Now, since true conjunctions and disjuncts are syntactically distinct, we must treat them as distinct syntactic entities; i.e. true conjunctions must define a unique part-of-speech. (Normal disjuncts, of course, are verbs.)
However, before discussing true conjunctions, let's first look at how to derive conjunctions as normal disjuncts.
Earlier, we looked at normal disjuncts with which a speaker could express feelings or attitudes about an event by using a verb (i.e., the disjunct) that takes an entire sentence as its single argument. For these cases, the unspoken arguments are demoted via a grammatical voice change and cannot be precisely determined from the speech environment as would be the case with a deictic disjunct. Instead, they can only be guessed at based on the context, if at all. Here's an example:
P/F-s: They hope that he wins.
F-s [-P]: Hopefully he'll win.
where the normal disjunct "hopefully" is actually a verb that takes a complete embedded sentence as an argument - it is not an adverb as in English.
As stated earlier, many other disjuncts of this type can be derived in the same way: "to presume" -> "presumably", "to be interesting" -> "interestingly", "to be possible" -> "possibly", "to be incidental" -> "incidentally, by the way", "to be necessary" -> "necessarily", "to be fortunate" -> "fortunately", and so on.
In these constructions, the attitude being expressed is typically (but not always!) the attitude of the speaker. Also, these constructions almost always imply that the attitude is shared by other, unmentioned people. Thus, this type of disjunct is not truly deictic, but is vaguer and more general.
Thus, normal disjuncts can be used as conjunctions whose scope is not precise. For these, however, we must demote the second argument of the verb rather than the first argument using an anti-middle construction (suffix = "-om"). Here is an example in the interlingua:
P/F-s: The new project is similar to the previous one.
where "zunxuma" = 'to be similar to'
P-s [-F]: "Zunxumoma" = 'Similarly, ...', 'Likewise...', 'In like manner', etc.
Note that "Zunxumoma" is a verb that takes a single core argument, even though the English translation requires a fronted adverb.
Note also that we must use an anti-middle, rather than an anti-passive, since the unmentioned argument is determinable from the context that preceded the disjunct. If we used an anti-passive instead, we would be able to specify the argument obliquely, which does not have the required semantics. In other words, when we say "Similarly, ...", we know that what follows is similar to what has already been said - not to something else that is optionally expressible.
Here are some more English examples:
P/F-s: The bazaar was in addition to the car wash.
P-s [-F]: Additionally, ...
P/F-s: The land swap was an alternative to continued violence.
P-s [-F]: Alternatively, ...
P/F-s: The accident occurred after the party.
P-s [-F]: Afterwards, ...
P/F-s: His odd behavior meant that he was angry.
P-s [-F]: In other words, ...
P/F-s: Red flags alternated with white ones.
P-s [-F]: On the other hand, ...
Note the important differences between a verbal disjunct and a case tag. A disjunct must undergo a grammatical voice change to demote an argument, while the case tag keeps both arguments. Also, the demoted argument of a verbal disjunct is not as precisely known as the first argument of a case tag. For a case tag, we know that the first argument is either the entire main clause that precedes it or one of the primary arguments of the main clause.
In sum, when a deictic disjunct is used, the unmentioned argument(s) are determinable from the speech environment. When a verbal (i.e., middle or anti-middle) disjunct is used, they are determinable from the speech context; i.e., what has already been spoken. And for a case tag, the first argument is either the entire main clause that precedes it or one of the primary arguments of the main clause.
A true conjunction should be used only when its linkage is determinable using only the rules of syntax. This will only occur when the items being linked are part of the same sentence and have the same part-of-speech. A disjunct should be used to introduce a sentence that is only loosely linked to the preceding one(s). A case tag should be used when its argument links to something in the same sentence which cannot be determined using only the rules of syntax.
For true conjunctions, we will use the classifier "tes" and the part-of-speech marker "-ye". Here are a few true conjunctions:
tesye - 'and', 'plus'
zuntesye - 'than', 'as'
daytesye - 'or', 'or else', 'either ... or'
kuntesye - 'but', 'but also'
Note that zuntesye was discussed earlier in the chapter on comparatives.
The conjunction "but" has semantics very similar to "and". However, unlike "and", "but" has the further implication that the items it links are somehow in contrast to each other without itself providing any indication of the nature of the contrast (such as opposition, oddness, incompatibility, disagreement, distinction, counterbalance, surprise, differentiation, and so on).
Note that there is no special construction in the interlingua for the English expression "neither ... nor", since this is just an alternative for "and" with a negated verb. For example, "Neither John nor Michelle left early" is the same as "John and Michelle didn't leave early". [TBD: Why not use "jutesye"???]
We will see additional conjunctions later.
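The four conjunctions listed above share the classifier "tes" and the part-of-speech marker "-ye", and differ only in what precedes the classifier. Here is a minimal Python sketch of that composition. The function name and the gloss table are illustrative assumptions, not part of any existing software for the interlingua, and treating the leading syllable as a separable modifier is an inference from the four forms given here.

```python
# Sketch only: composes true conjunctions from the pieces named in the text.
CLASSIFIER = "tes"   # classifier used for true conjunctions
POS_MARKER = "ye"    # part-of-speech marker for true conjunctions

# Glosses copied from the list in the text; "" means "no modifier".
GLOSSES = {
    "": ["and", "plus"],
    "zun": ["than", "as"],
    "day": ["or", "or else", "either ... or"],
    "kun": ["but", "but also"],
}


def make_conjunction(modifier: str = "") -> str:
    """Return modifier + classifier + part-of-speech marker."""
    return modifier + CLASSIFIER + POS_MARKER


if __name__ == "__main__":
    for modifier, glosses in GLOSSES.items():
        print(make_conjunction(modifier), "-", ", ".join(glosses))
```

Keeping the classifier and the marker in one place means that any conjunction added later only needs a new modifier entry.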
There are many different disjuncts that have essentially the same meanings, but which are used in different settings. Natural languages differ widely in the number and nature of these expressions.
Fortunately, we can capture these distinctions without having to arbitrarily create words that will have few close counterparts in other languages. We can do this by simply changing the speech register of the more basic disjuncts by using the register prefixes we discussed earlier. Here are a few examples:
from 'also/too'
    informal -> 'besides'
    formal -> 'in addition', 'additionally', 'furthermore', 'moreover'
from 'still/yet'
    informal -> 'whatever', 'even so', 'for all that'
    formal -> 'though', 'although', 'however', 'nevertheless', 'regardless', 'notwithstanding'
from 'even though'
    formal -> 'despite that', 'in spite of the fact that'
from 'well/so'
    informal -> 'okay', 'so anyway', 'so anyhow', 'anyway', 'anyhow', 'okay then'
    formal -> 'now', 'in any case', 'at any rate', 'in any event'
from 'then (= thus)'
    informal -> 'because of this', 'for this reason'
    formal -> 'thus', 'therefore', 'it follows therefore that', 'consequently', 'hence'
And so on. The actual distinctions between informal, formal, etc. will vary somewhat from one person to another, and the above examples reflect my own (subjective) conclusions. (Actually, I doubt if it's possible to precisely define the semantics of these register differences.)
Conjunctions can be used to solve problems that sometimes show up if the syntax of an interlingua is strict and unambiguous. For example, if the syntax requires a relative clause to always attach to the closest preceding noun, you would not be able to render the following as a single sentence:
I told him about the chicken that we had for supper that was killed by a coyote.
If the syntax is strict (as it is in the interlingua), then the relative clause "that was killed by a coyote" would modify the noun "supper", which is nonsense. With a conjunction, however, the problem disappears:
I told him about the chicken that we had for supper AND that was killed by a coyote.
Here, "AND" links the two "that" clauses so that both modify "chicken".
If a relative clause modifies a noun phrase that is part of a coordinated pair, the linkage may be ambiguous. Consider the following:
1. The boy and (the girl who ran away)...
2. (The boy and the girl) who ran away...
In the interlingua, relative clauses modify only the single, closest, preceding noun by default, and conjunctions link the following item with the closest preceding item of the same type. Thus, without further information, (1) is the only possible interpretation.
If we want the relative clause to apply to the compound phrase, we could modify the relative conjunction with a modifier meaning 'both' or 'all', or something similar. However, this is not a very good solution, since parsing success would now depend on the meaning of the words in addition to the syntactic relationships between the various parts-of-speech. If a language is to be computer-tractable, parsing must depend only on morphosyntax.
Sometimes, periphrasis or parenthetical expressions can be used to eliminate the ambiguity. Here's an example:
The boy and the girl, both of whom ran away, ...
Jim, Bob, and Joe, all three of whom were in the accident, ...
In effect, the expressions "both of" and "all three of" terminate the coordinated structure and allow further modification.
However, this option is rarely used even when it is available.
Now, consider the following two sentences, and note how the parentheses indicate how the constituents are grouped based on their most likely interpretations:
(The boy with the red hat) and (the girl with the puppy)...
The boy with ((the lunchbox) and (the book with the missing cover))...
The two examples seem to be syntactically identical, but a human listener would group the constituents differently. In the interlingua, the adjectival phrase "with the missing cover" modifies the noun "book", and the conjunction "and" links the noun phrases "the lunchbox" and "the book with the missing cover". Thus, under the interlingua's default rules, the grouping shown in the second example is correct, while the grouping shown in the first example is wrong.
The first example is not ambiguous in English because the grouping shown is the only one that makes sense. However, it is possible for the same structure to be ambiguous, as in the following example:
I just looked at the room with the new computer and the modem with the bad ICs.
Is the modem in the same room as the computer? In the interlingua, the answer is "yes", but in English the sentence is ambiguous. Does the computer also have bad ICs? In the interlingua, only the modem has bad ICs, but in English it is not clear.
In English, the sentence is doubly ambiguous, not only because attachment on the right is ambiguous, but also because we're not sure where the coordinated structure begins. Does it begin with "the room" or does it begin with "the new computer"?
Now, the interlingua is not ambiguous - only the modem has bad ICs. Also, in the interlingua, there is no doubt that both the computer and modem are in the same room. How, though, can we indicate that both the computer and the modem have bad ICs or that they are not in the same room? Again, periphrasis can sometimes work:
I just looked at both the room with the new computer and the modem with the bad ICs.
However, this option is not always available, and if it is, it's not often used, since either context will resolve the ambiguity or the speaker simply won't realize that there is an ambiguity.
In the interlingua, we will also have the option of using periphrasis. In fact, this may be the only option when translating from a natural language statement that uses periphrasis, because the translation software may not be sophisticated enough to realize what is actually happening.
However, we will also implement a solution that is purely syntactic and which can be used when practical. We will do this by allocating two new particles. The particle "cijop" will be the equivalent of an opening parenthesis and the particle "jejopi" will be the equivalent of a closing parenthesis. The part-of-speech of "cijop" must be an open version of the item being parenthesized, and the part-of-speech of "jejopi" will always be noun. Thus, in addition to the default linkages, we can also do the following:
Cijopaw the boy and the girl jejopi who ran away...
    -> both the boy and the girl ran away.

Cijopaw the boy with the red hat jejopi and the girl with the puppy...
    -> the boy with the hat is separate from the girl with the puppy.

I just looked at cijopaw the room with the new computer jejopi and the modem with the bad ICs.
    -> the modem is not with the computer and is probably not even in the same room.

I just looked at the room with cijopaw the new computer and the modem jejopi with the bad ICs.
    -> the computer and the modem are in the same room and both have bad ICs.
It's important to note that the particles "cijop" and "jejopi" may only be used when the default interpretation is not the desired one. And since most coordinated structures are relatively simple, these particles will probably not be needed very often.
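Because "cijop" and "jejopi" behave like parentheses, a program can recover the intended grouping with an ordinary stack. The following Python sketch illustrates this on the mixed English/interlingua examples above. The function name is hypothetical, and matching the opening particle by its root (so that "cijopaw" and other part-of-speech forms are all accepted) is an assumption about how a tokenizer would present these words.

```python
# Sketch only: groups a token list using the bracketing particles.
OPEN_ROOT = "cijop"     # opening particle root; a part-of-speech suffix follows
CLOSE_WORD = "jejopi"   # closing particle (always a noun form in the text)


def group_brackets(tokens):
    """Return a nested list in which each bracketed span becomes a sub-list."""
    stack = [[]]                      # stack[-1] is the group being filled
    for token in tokens:
        word = token.lower()
        if word.startswith(OPEN_ROOT):
            stack.append([])          # start a new bracketed group
        elif word == CLOSE_WORD:
            if len(stack) == 1:
                raise ValueError("closing particle without an opening one")
            finished = stack.pop()
            stack[-1].append(finished)
        else:
            stack[-1].append(token)
    if len(stack) != 1:
        raise ValueError("opening particle was never closed")
    return stack[0]


if __name__ == "__main__":
    sentence = "Cijopaw the boy and the girl jejopi who ran away"
    print(group_brackets(sentence.split()))
```

In the first example, the coordinated phrase comes back as a single unit, so a following relative clause can attach to the whole group rather than to "girl" alone.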
[Incidentally, some natural languages achieve a bracketing effect similar to that of "cijop" and "jejopi" by using explicit open/close morphemes that are very reminiscent of parentheses. Here's an example from Malagasy:
ity   trano  fotsy  ity
this  house  white  this
'this white house'
There are also many languages, such as Persian (Iran), Yoruba (West Africa), and Hewa (Papua New Guinea), that bracket their relative clauses with explicit start and end morphemes. Although this may seem unnecessary or even redundant, it can be useful at times to prevent ambiguity.]
Parenthetical expressions which elaborate or exemplify a concept sometimes use conjunctions, but not always. Here are some examples in English:
Some people, SUCH AS JOHN, BOB, AND MIKE, had to leave early.
When John travels in the winter, SUCH AS TO BOSTON OR TO NEW YORK, he always forgets to bring his gloves.
Many birds do not fly south for the winter (E.G. SPARROWS AND PIGEONS).
The man who managed the finance department, BILL JOHNSON, also managed the marketing department.
The single disadvantage (I.E. THE HIGHER COST) will probably kill the project.
John Smith, WHO JUST FILED FOR BANKRUPTCY, recently moved to Texas.
In the interlingua, we will use the following particles to bracket a parenthetical expression:
bajop -> start particle for a parenthetical expression
jujopi -> end particle for an incomplete parenthetical expression (equivalent to "such as" ... "etc")
xijopi -> end particle for a complete parenthetical expression (equivalent to "i.e.")
The start particle will introduce a list of one or more items and an end particle must terminate the list. If a list has more than one item, then they must be separated by the special conjunction "byetesye". These words correspond to pauses used in speech, or parentheses and commas used in writing.
The part-of-speech of the start particle "bajop" should be both open and compatible with the item it is modifying. Thus, in the first example above, the open-adjective form "bajopyu" should follow the interlingua words for "some people", since it modifies that phrase. In the second example, the case tag form "bajope" must be used since it essentially adds two new oblique arguments to the verb "travel". And so on for the remaining examples.
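The rules for parenthetical expressions are regular enough to express as a small function: a start particle carrying the appropriate part-of-speech suffix, the items separated by "byetesye", and one of the two end particles. The Python sketch below is only an illustration. The function name and parameters are hypothetical, and the items are left in English, as in the examples in this monograph.

```python
# Sketch only: assembles a parenthetical expression from the particles in the text.
START_ROOT = "bajop"           # start particle root; a part-of-speech suffix follows
ITEM_SEPARATOR = "byetesye"    # special conjunction between list items
END_INCOMPLETE = "jujopi"      # list is only a sample ("such as ... etc")
END_COMPLETE = "xijopi"        # list is exhaustive ("i.e.")


def parenthetical(items, pos_suffix, complete):
    """Return the bracketed list as a single string.

    items      -- one or more already-translated items
    pos_suffix -- suffix for the start particle, e.g. "yu" (open adjective)
                  or "e" (case tag), chosen by the caller
    complete   -- True if the list is exhaustive, False if it is a sample
    """
    if not items:
        raise ValueError("a parenthetical expression needs at least one item")
    start = START_ROOT + pos_suffix
    end = END_COMPLETE if complete else END_INCOMPLETE
    body = f" {ITEM_SEPARATOR} ".join(items)
    return f"{start} {body} {end}"


if __name__ == "__main__":
    print(parenthetical(["John", "Bob", "Mike"], "yu", complete=False))
    print(parenthetical(["the higher cost"], "yu", complete=True))
```

The caller must still decide which part-of-speech suffix is compatible with the item being modified, since that depends on the surrounding sentence.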
To handle quotes, we will use the particle "tejop" to start the quote, and will terminate the quote with "xijopi". The part-of-speech of "tejop" must be appropriate for the way the quote is used. See below for examples.
These particles can often be used in the same way that English uses quotes in writing or the words "quote" and "unquote" in speech. Here are some examples:
I asked Bill tejopaw Do you have enough money? xijopi
He shouted the words tejopyu Go away! xijopi at the teacher.
Note that the open adjective part-of-speech marker "-yu" must be used if the quoted material modifies another noun, as in the second example.
When one event is conditional upon another, English normally links the events with an "if...then" construction, as in the following example:
If the law is passed, (then) tax forms will be simpler.
In the interlingua, we will allocate the true conjunction "citesye" to represent the contingency relationship:
True Conjunction "citesye" = 'if', 'as long as':
    (Tax forms will be simpler) citesye (the new law passes)
    = Tax forms will be simpler if the new law passes.

True Conjunction "citesangye" = 'then' ("-ang" = inverse suffix):
    (The new law passes) citesangye (tax forms will be simpler)
    = If the new law passes, then tax forms will be simpler.
    [Note that the word "citesye" meaning 'if' cannot be used here. This is similar to Hindi, where the word meaning 'then' is always required, while the word meaning 'if' is optional. In English, of course, the exact opposite is true.]

The modifier root morpheme "ci" can also be used to create verbal derivations with similar meanings to the conjunctions:

Verb P/F-s "cidapa":
    (Tax forms are simpler) cidapa (the new law passes)
    = Having simpler tax forms is contingent upon/depends on passage of the new law.

Case tag "cidape" = 'provided/providing (that)', 'on condition (of)', 'as long as', 'if', 'depending on', etc.
    We'll go to the lake cidape the weather.
    = We'll go to the lake depending on the weather.
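The two contingency conjunctions differ only in which clause comes first: "citesye" follows the consequent, while its inverse "citesangye" follows the condition. The Python sketch below makes that choice explicit. The function and parameter names are hypothetical, and the clauses are left in English, as in the examples above.

```python
# Sketch only: renders a contingency in either clause order.
IF_CONJUNCTION = "citesye"        # consequent first: (consequent) citesye (condition)
THEN_CONJUNCTION = "citesangye"   # inverse ("-ang"): (condition) citesangye (consequent)


def contingency(condition, consequent, condition_first):
    """Link two clauses, choosing the conjunction that matches the order."""
    if condition_first:
        return f"({condition}) {THEN_CONJUNCTION} ({consequent})"
    return f"({consequent}) {IF_CONJUNCTION} ({condition})"


if __name__ == "__main__":
    condition = "the new law passes"
    consequent = "tax forms will be simpler"
    print(contingency(condition, consequent, condition_first=False))
    print(contingency(condition, consequent, condition_first=True))
```

Since the conjunction is determined entirely by clause order, a generator never has to decide separately whether to emit an 'if' word, a 'then' word, or both.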
Another kind of conditional expression is called the counterfactual expression. Here's an English example:
If Joe had opened the window, Louise would have screamed.
The implication here is that Joe did not open the window and Louise did not scream; i.e., that the event is purely hypothetical.
[Incidentally, do not confuse counterfactual "would" with habitual "would"; e.g., "If I arrived early, he would offer me some coffee". A good test for this is to replace "if" with "when". If the replacement is grammatical and the meaning is essentially the same, then it is not a true counterfactual. For example, "When I arrived early, he would offer me some coffee" is habitual, not counterfactual. In a true counterfactual, "if" must always be used or implied in English.]
Counterfactuals are only useful in the past and perhaps the present tense:
If Sam had arrived earlier, Joe would have been angry.
If Sam were to arrive now, Joe would be angry.
*If Sam arrives later, Joe would be angry.
Even if the third example is grammatically acceptable (I find it very awkward), it's still not a true counterfactual because "would" is simply a synonym for "will".
It's also possible to modify the probability of the hypothetical implication by using polarity prefixes, as in the following:
Maximum: If Joe had opened the window, Louise would have screamed.
High: If Joe had opened the window, Louise probably would have screamed.
Minimal: If Joe had opened the window, Louise just possibly would have screamed.
    OR If Joe had opened the window, Louise just might have screamed.
    [Note that "would" is not used here. The use of "had", however, forces a counterfactual interpretation.]
Unspecified: If Joe had opened the window, Louise would possibly have screamed.
In effect, 'counterfactuality' is a combination of 'hypotheticality' and 'probability', and the result is also an epistemic modality. The degree of the modality will determine the degree of probability of the hypothetical implication.
In the interlingua, we will allocate the epistemic modal root "xiv" for counterfactuality.
Two obvious derivations of this modality are "bixivemo" meaning 'counterfactual' (i.e., 'both hypothetical and untrue') and "xivemo" meaning 'hypothetical' or 'speculative' (i.e., 'hypothetical and possibly true').
Compounds are single words or simple expressions that represent unique concepts, but which are formed by combining two or more root morphemes. There are three kinds of compounds:
1. Compounds which represent the sum of their components (i.e., both components are present):

    to test-fly = to test AND to fly
        also drop-kick, stir-fry, go swimming/shopping/etc

2. Compounds in which one root is the argument (core or oblique) of the other root:

    watchmaker = X makes watch (argument = object)
        also mousetrap, fly swatter, housecleaning, blood test(er)

    Compounds of this type can also be created using verbs that are derived from basic nouns: baby oil (= X 'oils' baby), dish towel (= X 'towels' dish), doghouse (= X 'houses' dog), towel rack (= X 'racks' towel), dancehall (= X 'halls' dance), water skis (= X 'skis' water), snowshoes (= X 'shoes' snow), etc.

    rescue team = team rescues Y (argument = subject)
        also team rescue, student association, fan club, manmade
        [Note that the grammatical voice of the verb meaning 'rescue' determines whether the interpretation is 'rescue team' or 'team rescue'.]

    college education = X educates Y in/at college (argument = oblique locative)
        also beach party, mountain warfare, barn dance, city life

    spring showers = it rains DURING spring
        battle fatigue, evening prayers, marital sex, night flight

    to towel dry = X dries Y using towel (argument = oblique instrument)
        also steam iron, to water cool, handwriting, windmill

    to backpedal = X pedals backwards (argument = oblique method/manner)
        also to sidestep, freestanding, to dog-paddle, to bunny-hop

    And so on. Many more oblique relationships are possible.

3. Compounds in which BOTH roots are core arguments of an IMPLIED verb:

    bedsore = bed CAUSES sore
        also disease germ, storm damage, tear gas, birth pain
        [Note that the INVERSE sense of the verb "cause" is used for "disease germ" and "tear gas".]

    tax laws = laws BEING FOCUSED ON taxes
        also murder investigation, UFO sighting, food requirements

    houseboat = boat BEING-THE-SAME-AS house
        also dungheap, girl friend, infantry battalion, snowball
        [Note that this group could also be considered as the noun equivalent to verb compounds like "stir-fry" mentioned above, since both components are present.]

    olive oil = oil BEING A DERIVATIVE OF olives
        also solar energy, buffalo hide, wood pulp, cane sugar
        also inverses meat calf, milk cow, pulp wood

    toolbox = box CONTAINING tools
        also apple pie, pea pod, salt marsh
        also inverses lemon peel, door knob, windowpane

    And so on. There may be others that fall into this category. However, if there are, I doubt there are very many of them.
Note that many compounds can appear in more than one category. For example, "tree nursery" can be derived from "X GROWS trees AT nursery" or the inverse of "trees BEING LOCATED IN nursery". The compound "towel rack" can be derived from "X places towel ON rack" or the inverse of "towel BEING LOCATED ON rack". It is important to keep this in mind, since it's possible that one version may be implemented more efficiently than another, even though they have essentially the same meanings. Also, some are more specific, and thus less useful, than others.
[Incidentally, Mandarin Chinese has many compounds in which each component means essentially the same thing. However, since most Chinese morphemes have several meanings, using just one would be ambiguous. By using two with the same or close meanings, the result is a word whose meaning is the meaning that the two components have in common. In a properly designed language, this type of compound is totally unnecessary.]
Some languages implement compounds by simply juxtaposing complete words (e.g. English, Chinese, Indonesian, and Quechua). Unfortunately, this approach is useless if you want the resulting compounds to be semantically precise. (By "precise" I mean 'as precise as the inherent precision of the basic components will allow'.) For example, what is the relationship between "house" and "boat" in the word "houseboat"? What is the relationship between "house" and "maid" in the word "housemaid"? Obviously, the relationships are different.
Another way to implement compounds is to use a combination of a headword and a morphologically correct modifier (e.g. English adjective-noun compounds "solar panel", "marital sex", "marine life", "academic transfer", etc.). English uses this approach occasionally, French uses it more often, while Russian and Arabic use it quite often. In general, a language is more likely to use this approach if it has a regular and productive way to convert words from one part-of-speech to another. However, while the semantics of this kind of construction is more precise than simple juxtaposition, it can still be ambiguous.
In many languages, ambiguity is somewhat reduced by using linking morphemes such as English prepositions. Swahili uses this approach for almost all of its compounds, and French uses it for most (French examples: "salle à manger", "eau de toilette", "film en couleurs", etc.). English uses it occasionally, as in "son-in-law", "hand-to-hand", and "bed of nails". Note, though, that these linking words can be very vague and their use is often idiosyncratic. If we want the semantics of our compounds to be precise, then the semantics of the linkers must also be precise.
With the above comments in mind, let's look again at each type of compound and ask ourselves the following questions:
a. Do we already have a way to implement this type of compound? b. If not, what new technique should we create to do it?
As I will show below, the answer to question "a" is always "yes", making question "b" unnecessary. Here goes...
1. Verb-Verb Compounds
Compounds similar to English "stir-fry" seem to be quite rare among natural languages. The only languages I know of that use them frequently are Chinese and a few others that make extensive use of serial verb constructions.
In the interlingua, we can implement these compounds easily by creating case tags and adverbs that perform the same semantic function as serial verbs.
2. Open-Word Compounds
We can often accomplish this in the interlingua by 'opening up' the argument structure of nouns and adjectives derived from verbs. Here are three examples using words we've already created:
duck teacher = "kobaycalinzaw lubodami" = 'teacher about ducks',
    where "kobaycala" = AP-s verb 'to teach', verb "kobaycalinza" = AP/F-s 'to teach', "bodami" = 'duck', and "lu-" = generic prefix.
    [Remember, we open up the argument structure of a normally 'closed' noun by using the part-of-speech suffix "-aw".]

duck teacher = "kobaycalapaw lubodami" = 'teacher of ducks' or 'one who teaches ducks',
    where "kobaycalapa" = A/P-d verb 'to teach (someone)'.
However, do not confuse these with:
duck teacher = "bodamo kobaybegi" = 'a teacher who is a duck'
    [Here, there is no need to 'open up' the noun "kobaybegi" = 'teacher' to make the subject position available for use. Instead, we simply use the adjective version of the noun meaning 'duck'.]
Finally, many compounds are really not necessary. For example, the English word "backpedal" can be just as easily implemented as "to pedal backwards", where "backwards" is a basic adverb.
3. More Complex Compounds
Some compounds will require that two concepts be the arguments of another verbal concept. Here is another one where the implied verb is "dapa" = 'to be':
snow duck = "cinjavo bodami" = 'duck which is snow' (cf. "snowman", "snowball", etc.)
    where "cinjavi" = 'snow'
Note that the above is just an adjective-noun compound, where the basic relationship is not stated separately, but is the result of normal derivational rules. The interlingua can create many compounds this way, as is commonly done in languages such as French, Russian, and Arabic, but with true semantic precision.
Now, let's create some compounds in which the relationship must be indicated by a separate word. Here are two examples:
silver mine = mine XXX silver = 'mine being-the-source-of silver'
    where "XXX" is the open adjective inverse of the paronymy relationship that we discussed earlier.

hydrology textbook = textbook XXX hydrology = 'textbook that contains hydrology'
    where "XXX" is the open adjective inverse of the constituency relationship that we discussed earlier.
And so on. These compounds are similar to Swahili compounds and most French compounds, but are semantically precise. English often creates similar constructions, such as "blood-sucking mosquitos", "swamp-dwelling amphibians", "man-eating tigers", "house-cleaning lady", etc. In these, however, only the hyphenated part of the construction is usually classified as a compound.
Thus, since any relationship can be expressed by a transitive verb, and since any transitive verb can be converted to an open adjective, there is no limit on the number of compounds that can be created with semantic precision.
We can create vaguer noun-noun compounds by using "xumyu", which we discussed earlier. For example, "hydrology textbook" could be implemented as simply "textbook xumyu hydrology". In fact, this approach is just as semantically vague as compounding in most natural languages, and can be used for any noun-noun compound.
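The pattern described in this section is always the same: a headword, an open adjective that names the relationship, and the second noun. The Python sketch below captures that pattern and falls back to the vague linker "xumyu" when no precise relationship is supplied. The function name is hypothetical, the nouns are left in English as in the examples above, and "XXX" stands in for relation words that this section does not spell out.

```python
# Sketch only: builds a noun-noun compound as "head RELATION modifier".
VAGUE_LINKER = "xumyu"   # vague relationship, usable for any noun-noun compound


def noun_compound(head, modifier, relation=None):
    """Return the compound; use the vague linker if no relation is given.

    relation -- an open-adjective word naming the precise relationship;
                the caller must supply it, since none is listed here.
    """
    linker = relation if relation is not None else VAGUE_LINKER
    return f"{head} {linker} {modifier}"


def is_precise(compound):
    """A compound is precise unless it relies on the vague linker."""
    return VAGUE_LINKER not in compound.split()


if __name__ == "__main__":
    print(noun_compound("textbook", "hydrology"))          # vague version
    print(noun_compound("textbook", "hydrology", "XXX"))   # "XXX" = placeholder
```

A translation program could use the vague form whenever it cannot determine the relationship, leaving the precise form for cases where the relationship is known.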
Finally, the approach we are using here allows us to create many useful compounds that, in a language like English, would be either ambiguous or even impossible to create. For example, the English compound "woman teacher" could mean 'woman who teaches', 'teacher of women', 'teacher who focuses on women', 'one who teaches like a woman', etc. With the system presented here, we can create more compounds, and their meanings are always obvious. This ability is especially important because an MT interlingua is likely to be used by people who have different native languages. For example, if we were to create compounds as in English (by the simple juxtaposition of two root morphemes), the results would often be gibberish for some people, or would be interpreted differently by people of different linguistic backgrounds.
Unfortunately, in all natural languages, most compounds are created as needed and do not appear in dictionaries. Machine translation software that attempts to translate these compounds from a natural language to the interlingua will generally not be able to provide a precise translation, but will instead be forced to provide a vaguer substitute. However, human translators should always provide precise compounds to ensure that subsequent translations into other natural languages are as accurate as possible.
Some compounds are not semantically precise, but actually refer to a subset of entities within a class. In other words, a literal interpretation of the compound actually describes more entities than it is intended to represent. For example, we might be tempted to create the adjective+noun compound with the literal meaning 'black bear' to represent the species 'Black Bear'. However, this would be incorrect, since 'black bear' can apply to any bears that are black in color, even those that are not members of the species 'Black Bear'. Because of this, a normal compound cannot be used.
What we need is a way to make a distinction between normal, semantically precise phrases and mnemonic compounds.
In the interlingua, we will accomplish this by using the prefix "le-" for derivations that refer to distinct concepts that are over-described by normal derivation. This prefix will be used on the modifier or argument of a headword of a normally formed compound. For example, if the word for 'bear' is "bunzovi", and the word for 'black' is "kunzigo", then the expression "kunzigo bunzovi" can be applied to any bear that is black in color, while the mnemonic compound "Lekunzigo Bunzovi" will refer only to members of the species 'Black Bear'. Note that the prefix is applied to the modifier, since it is being used for its mnemonic value.
With this approach, we are providing ourselves with the ability to use normal compounding techniques where we feel that a simple basic noun is inappropriate.
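The difference between a literal phrase and a mnemonic compound is purely mechanical: the prefix "le-" is added to the modifier. The Python sketch below shows both forms for the 'black bear' example. The function names are hypothetical, and capitalizing both words of the mnemonic compound simply mirrors the way "Lekunzigo Bunzovi" is written above; it is an assumption, not a stated rule.

```python
# Sketch only: literal phrase versus mnemonic compound.
MNEMONIC_PREFIX = "le"   # marks a modifier used only for its mnemonic value


def literal_phrase(modifier, head):
    """Ordinary, semantically precise modifier + headword phrase."""
    return f"{modifier} {head}"


def mnemonic_compound(modifier, head):
    """Prefix the modifier with "le-" and capitalize both words."""
    prefixed = MNEMONIC_PREFIX + modifier
    return f"{prefixed.capitalize()} {head.capitalize()}"


if __name__ == "__main__":
    print(literal_phrase("kunzigo", "bunzovi"))      # any bear that is black
    print(mnemonic_compound("kunzigo", "bunzovi"))   # the species 'Black Bear'
```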
[Later, I will discuss a consistent and objective approach for naming species.]
We've already discussed some of the ways in which an argument of a verb can be 'topicalized' or made more salient than other arguments. In this section, I will discuss and summarize all of the various degrees of topicalization that an MT interlingua will need.
Topical constructions add emphasis and sometimes contrast over and above the normal topicalization indicated by argument structure. In natural language, there are basically four degrees of topicalization:
1. Normal topicalization. Topicalization is indicated by the basic argument structure of the verb; i.e. a subject is more topical than an object or an oblique argument. In some languages, especially those with an anti-passive construction, objects may be more topical than oblique arguments. (English does not seem to make a distinction in topicality between objects and obliques. This view is supported by the fact that so many English verbs are inherently anti-passive but do not have active counterparts with clear differences in topicality; e.g. "to listen to", "to talk to", "to look at/for/up", "to wink/shout/laugh at", "to complain to", etc.)

2. Contrasting topicalization. Topicalization provides both emphasis and contrast. Here are some English examples:

    It's John who killed the chicken OR JOHN killed the chicken.
    It's a chicken that John killed OR John killed a CHICKEN OR A chicken is what John killed OR What John killed is a chicken.

3. Heavy topicalization. An argument of the verb is made more topical than the subject. Here are some English examples:

    Bill, I saw him yesterday.
    The new amusement park, it opens for business today.
    On Sunday, I plan to relax all day.
    With his new suit, he can attend the conference without embarrassment.

4. Reference-switching. A new entity is introduced into the conversation and singled out for special attention. Here are some examples:

    As for the chair, John broke it.
    As regards John, he left in disgust.
    As far as the meeting is concerned, I decided not to attend.
    The thing about John is that he's never on time.
    With regard to the delays, I assure you they won't happen again.
Normal topicalization is an inherent part of the verbal derivational system that we are discussing in this monograph. This system is not only perfectly regular, but it allows us to create four sub-degrees of topicality (subject vs. object vs. expressible oblique vs. inexpressible oblique). And, if the syntax is designed properly, then even normally oblique case roles can be promoted relative to the core roles. In contrast, most languages provide only two or three sub-degrees, while typically displaying a considerable amount of idiosyncrasy.
The second kind of topicalization, contrasting topicalization, is used to add both emphasis and contrast to an argument of a verb. English is somewhat unusual among the world's languages in implementing this function using cleft sentences. Most languages achieve this function by somehow marking the item with an inflection or particle and leaving the item in its normal position in the sentence. However, I believe that most (if not all) languages can achieve the same effect by simply giving the word additional stress.
In the interlingua, we will achieve this effect with the special particle root "kunjop", and its part-of-speech should be compatible with the constituent that it modifies. Here are some examples:
John killed a chicken kunjopo.
    = A chicken is what John killed.
    OR = What John killed is a chicken.
    OR = John killed a CHICKEN.

Billy hit Jimmy kunjopo?
    = Was it Jimmy that Billy hit?
    OR = Billy hit JIMMY?

Louise may kunjopay have bought a lamp.
    (where "may" is the modal disjunct "tamu", and "-ay" is the part-of-speech marker for previous-word modifiers)
    = It MAY be a lamp that Louise bought.
    OR = Louise MAY have bought a lamp.
And so on.
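In each of the examples above, the contrast particle appears immediately after the constituent it marks, and only its part-of-speech suffix changes. The Python sketch below mirrors that placement. The function name is hypothetical, the sentences are left in English as in the examples, and placing the particle directly after the marked word is an assumption drawn from these three examples rather than a stated rule.

```python
# Sketch only: inserts the contrast particle after a chosen word.
CONTRAST_ROOT = "kunjop"   # particle root; a part-of-speech suffix follows


def mark_contrast(words, index, pos_suffix):
    """Return a new word list with the particle placed after words[index].

    pos_suffix -- e.g. "o" or "ay", chosen by the caller to be compatible
                  with the constituent being marked
    """
    if not 0 <= index < len(words):
        raise IndexError("index does not point at a word in the sentence")
    marked = list(words)                 # leave the caller's list untouched
    marked.insert(index + 1, CONTRAST_ROOT + pos_suffix)
    return marked


if __name__ == "__main__":
    sentence = "Louise may have bought a lamp".split()
    print(" ".join(mark_contrast(sentence, 1, "ay")))
```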
The third type of topicalization, heavy topicalization, focuses the listener's attention on a particular argument of the verb. In effect, it makes the argument even more topical than a normal subject. Most natural languages, including English, accomplish heavy topicalization by a process called left dislocation; i.e. by moving the emphasized argument out of the sentence and placing it before the sentence. In addition, an anaphor of the moved item normally appears in the original position in the sentence if the moved item is a core argument of the verb. Thus, in English:
The Smiths, THEY left early.
Here, "the Smiths" is left-dislocated and the anaphor "they" takes its place in the sentence. In addition to the dislocation, languages mark the emphasized item either by an explicit marker, such as a particle, by a change in stress and timing, or both.
As noted above, left-dislocation seems to be the way that most natural languages implement heavy topicalization, with an anaphor of the dislocated item occupying its original position in the sentence whenever that item is a core argument. We will use the same approach in the interlingua.
In the interlingua, we will reserve the particle "xojopa" for this purpose. Here are some examples:
    Xojopa bodami, the sailors ate bohi.
        = The duck, the sailors ate it.

    Xojopa on Sunday, I plan to relax all day.
        = On Sunday, I plan to relax all day.
However, "xojopa" is really not necessary if the topicalized argument is oblique. In this case, we can simply place the oblique argument ahead of the subject, as we discussed earlier. For example, assuming a right-branching syntax, the sentence "Went on Monday Joan to the movies" would be translated "On Monday, Joan went to the movies".
The fourth kind of topicalization, reference-switching, introduces or re-introduces an entity into the conversation, and singles it out for special attention. This is also normally implemented as a type of left-dislocation, since the argument is moved to the left of the sentence and the gap in the main sentence is almost always filled with an anaphor of the moved argument. In English, this is usually accomplished with phrases such as "As for X, ...", "With regards to X, ...", "As far as X is concerned, ...", "The thing about X is that ...", etc. In the interlingua, we will use the particle "zunjopa" for this purpose:
    Zunjopa John, I think the boss is going to fire him.
        = As far as John is concerned, I think the boss is going to fire him.

    Zunjopa the new employee, I think he'll do very well.
        = As for the new employee, I think he'll do very well.
Note that both "xojopa" and "zunjopa" are verbs and require that a complete sentence immediately follow their argument.
Proper names are the names of individual people, places, and things. However, what is considered "proper" can differ from language to language. Here is the precise definition that we will use for the interlingua:
A proper name is a word that names or labels a specific, unique representative of a category designated by a basic noun. The proper noun word itself cannot have sub-categories.
Thus, using the above definition, words such as "Atlantic", "Johnson", "IBM", "Christianity", "New York", "Caucasian", "1996", and "USA" are all proper nouns. They are intended to name unique instances of, respectively, the following common nouns: "ocean", "person", "corporation", "religion", "city", "race", "year", and "nation".
With the above definition, it would appear that a word such as "Christian" cannot be a proper name because it has sub-categories such as "Catholic" and "Methodist". However, "Catholic" and "Methodist" are not true sub-categories of the word "Christian"; they are sub-categories of the word "sect" or, more precisely, they are the names of specific sects.
Names of people, such as "Mike Johnson", often refer to more than one person. However, they are still proper names because they are intended to isolate a particular individual.
Common nouns such as "tiger", "catfish", and "professor" cannot be proper names because they are generic terms that are not intended to isolate specific entities. If we wish to isolate a specific entity, then we must qualify the common noun, as in "this catfish" or "the new professor".
Names of specific activities, such as "soccer" (a specific sport), "opera" (a specific musical form), and "geology" (a specific field of study) are not proper names because they must either be qualified to isolate a specific instance (e.g. "yesterday's soccer match") or must be titled (e.g. the opera "Carmen").
A proper name such as "1996" represents a specific period of time, and in the semantics of the interlingua, time is considered an entity (it is classified along with all of the other physical nouns). Words such as "March" and "Tuesday" are also proper names, even though they can refer to more than one period of time, because they are intended to isolate a particular time period.
In the interlingua, the prefix "la-" will be used to create proper names, and the normal rules of self-segregation must be applied. However, the root classifier must be appropriate for the proper noun. Other modifying morphemes that follow "la-" will not have semantic significance, but they may be used for their mnemonic value.
Here are some examples:
    Africa - Lafibitisi ("-tis" = appropriate class for 'continent')

    France - Larandugi ("-dug" = appropriate class for 'nation')
      [Some consonants, such as 'r', are not used in normal word design, but may be used in proper names.]

    The Nile (River) - Lafizebivi ("-biv" = appropriate class for 'river')
      [It's impossible to come even close to the pronunciation of "Nile" without violating the morphological rules of the interlingua. Thus, I chose to use "fi" for its mnemonic association with "Lafibitisi" (meaning 'Africa') plus the word "zebivi", meaning 'river'. Note also that "The" is required in the English translation, even though the definite article is not normally used with proper names.]

    John - Lajonbegi ("-beg" = appropriate class for 'person')
The 'person' classifier "-beg" should be used for all sentient beings as well as for animals that are normally given individual names, such as pets.
For the attributive adjective associated with a proper name, we must use the quality suffix "-on". For example, "Larandugono history" means 'French history' and "Larandugono food" means "French food". However, to conform with the use of "-beg" discussed in the preceding paragraph, we will use "Laranbegi" to mean 'French person/Frenchman'.
The normal rules of derivation that apply to basic nouns will also apply to proper nouns. For example, the P-s verb "Laranduga" means 'to be France'.
Conventions can also be adopted that apply to proper names that come in groups. For example, days of the week can all have the form "LaXXtovi", where the sub-string XX is a numeric CV and "tovi" is the word meaning 'day':
    Labatovi - Sunday  ("ba-" = numeric 'one')
    Laxetovi - Monday  ("xe-" = numeric 'two')
    Laditovi - Tuesday ("di-" = numeric 'three')
And so on. A similar approach can be used for months of the year, the years themselves (e.g. "1996"), letters of the alphabet, stellar constellations, etc.
A proper noun can be modified by adjectives to indicate titles. Here's an example:
    kekobaybegi - 'professor'
    Lajonzonbegi - 'Johnson'
    kekobaybego Lajonzonbegi - 'Professor Johnson'
Note that the above literally means 'Johnson who is a professor' or simply 'Johnson the professor'.
If an entire translatable expression is a proper name and it does not have a dictionary entry, such as a book or report title, or the complex names of not-very-well-known places or events, then it should be bracketed by the proper name particle "lajopaw" and the parenthetical stop particle "xijopi". We can use these to create proper names such as "The White House", "The Sea of Japan", "The American Revolutionary War", and so on. Expressions bracketed by "lajopaw ... xijopi" are always nouns but cannot undergo further derivation. [TBD: Should "The White House" be a mnemonic noun?]
It will not be practical to have equivalents in the interlingua for all proper names of all natural languages. In general, we will only provide equivalents for names that are truly international, such as country names and the names of well-known cities, monuments, etc. Because of this, all translation software must be able to accept natural language names.
We will accomplish this by enclosing the name in curly brackets, preceding it with any necessary prefixes (especially a gender prefix, if applicable), and appending an appropriate classifier and part-of-speech marker. Here are some examples:
    Loyla{Mahatma Gandhi}begi = "Mahatma Gandhi"
      ("loy-" = male prefix, and "-beg" = classifier for a person)

    La{IBM}jagi = "IBM"
      ("-jag" = classifier for a corporation)

    La{Flash Gordon}zugi = "Flash Gordon"
      ("-zug" = classifier for a performance)

    La{Flash Gordon}begi = "Flash Gordon"
Note that all of the above use the prefix "La-" since they are all proper names.
We will also extend this to apply to non-proper words that are not likely to have equivalents in other languages. For example, we can have legitimate words such as {pepperoni}fupi = "pepperoni (sausage meat)" and {pici}tavi = Indonesian man's hat.
Note that an appropriate classifier is mandatory, even for proper names, because it provides valuable information that can be used by the translation software to improve the result.
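The bracketing convention above can be sketched as a small Python helper. This is only an illustration, not part of the interlingua's definition: the function name "borrow" and its parameters are hypothetical, the noun marker "-i" is taken from the examples, and the restriction on nested curly brackets is an assumption made to keep the result unambiguous.

```python
def borrow(name, classifier, pos="i", proper=True, prefix=""):
    """Wrap a natural-language string in curly brackets, as in the examples.

    name       -- the untranslated string, e.g. "IBM"
    classifier -- mandatory classifier without its hyphen, e.g. "beg", "jag"
    pos        -- part-of-speech marker; "i" is the noun ending used above
    proper     -- True adds the proper-name prefix "la-"
    prefix     -- optional prefix placed before "la-", e.g. "loy" (male)
    """
    if not classifier:
        raise ValueError("a classifier is mandatory, even for proper names")
    # Assumption: the borrowed string itself contains no curly brackets.
    if "{" in name or "}" in name:
        raise ValueError("borrowed string must not contain curly brackets")
    head = prefix + ("la" if proper else "")
    word = head + "{" + name + "}" + classifier + pos
    # Proper names are capitalized in the examples ("Loyla{...}", "La{...}").
    return word[:1].upper() + word[1:] if proper else word


print(borrow("Mahatma Gandhi", "beg", prefix="loy"))
print(borrow("IBM", "jag"))
print(borrow("pepperoni", "fup", proper=False))
```

Making the classifier a required argument mirrors the rule that it is mandatory. Translation software could read it back from the end of the word to decide how to treat the bracketed string.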
This technique can also be used to quote parts of words, complete words, abbreviations, words in other languages, or even longer strings that cannot be translated. For example, we could discuss the P-s suffix "-su" by referring to it as {P-s "-su"}kusi, where "-kus" is the classifier for components of a performance.
A problem arises when we need to create an anaphor of a borrowed word, because an anaphor is formed from the first syllable of the root, and the root of a borrowed word may not conform with the rules of the interlingua. We will deal with this problem by adopting the following rule:
The anaphor of a borrowed word will always use the mandatory classifier.
A vocative is a word or phrase intended to directly address or get a person's attention. In the interlingua, vocatives are implemented with the special part-of-speech suffix "-we" which must be applied to the head noun of the vocative noun phrase.
The true generic "Tomwe" can be used as a general way to get someone's attention, and is equivalent to English "Say there!", "Hey!", or even "Ahoy!". Thus, it is, in effect, a stand-alone sentence. We can also consider it to be a disjunct for whatever sentence follows it.
Syntactically, all other vocatives are nouns because they can take modifiers and arguments. For example, they can be modified by adjectives, and, if they are open, they can have arguments. However, from the clause's point-of-view, they are oblique arguments, and can appear in any part of the clause suitable for an oblique argument.
Here are some examples (assume VSO word order):
    Tomwe!  Tamenu   ximunza xevi botimi?
            interrog have    you  boat
        = Say there! Did you have the boat?

    Ximunza Kekobaybegwe   bavi botimi.
    Have    Professor(voc) I    boat
        = Professor, I have the boat.

    Fagipoy    bape Lajonbegwe!
    Leave(imp) now  John(voc)
        = Leave now, John!
Note that, if a vocative is the first argument of the verb (as in the second example), it has even more salience than the subject. Thus, it must be fronted in the English translation.
In this chapter, I would like to discuss some of the strategies used to design the vocabulary of the interlingua.
Very early in this monograph, we decomposed the verb meaning 'to know' into a root concept and an argument structure. We then applied all other possible argument structures to the same root. This process resulted in many unexpected and extremely useful derivations. The number of useful derivations increased even more as we applied prefixes and other suffixes.
With the above in mind, we can state several general guidelines for word design:
    1. Start with simple, common verbs and adjectives. Isolate their root concepts and apply them to every classifier. Appropriate suffixes should be used when related verbs have different argument structures (e.g. "to say" vs. "to tell"). In the process, a very large number of less common concepts will be automatically derived. This principle also applies to numeric, deictic, tense-aspect, and modal concepts.

    2. Keep in mind the inherent difference between basic state concepts and modal concepts. When in doubt, always test new concepts to determine if they are modal.

    3. If there's difficulty defining a basic state or modality, or if it has limited usefulness when combined with most classifiers, it is very likely that the state is not very basic. When this occurs, postpone derivation until later. You may be able to "accidentally" derive it from a different root.

    4. Always be suspicious of roots that represent energetic states. Many of these concepts can actually be derived from non-energetic states that end up being much more productive.
The fourth principle is the most difficult to apply, since the nature of the more basic state may not be obvious. In a situation like this, postpone derivation of the particular verb. There's a good chance that the desired word will be derivable from a different root concept that you haven't yet defined.
Another tactic is to examine words that have similar meanings (a thesaurus can be very useful for this), or to create a few paraphrases of a sentence that uses the word. For example, how do we deal with the verb "to establish"?
    He established his innocence.
    He proved his innocence.
    He convinced others of his innocence.

where

    "He" = agent
    "others" = patient
    "his innocence" = focus
Thus, "to establish" is simply the A/F-d [-P] (i.e. anti-middle) derivation of the A/P/F-d verb meaning 'to convince (of/that)'. And, as we saw earlier, the verb meaning 'to convince' is "bijegamba" and is derived from the evidentiality modality. Thus, this sense of the English word meaning 'to establish' is simply "bijegamboma". Similarly, the anti-passive derivation "bijegambosa" is also useful, and is equivalent to the English verb "to prove", since it can take an oblique patient.
Incidentally, by now it shouldn't be too surprising that obscure grammatical voice operations such as middle, anti-passive, and inverse can produce so many useful words. Languages that do not have these voice operations must instead use unique root morphemes, periphrasis, metaphors, or even idioms. Because of this, it is important to constantly keep these 'obscure' derivations in mind, especially when you run into difficulties. There are many hidden and pleasant surprises in such a powerful derivational system as the one presented here.
When designing our vocabulary, we will often have to ask ourselves whether a concept should be implemented as a single word or as a compound. Natural languages differ considerably in this respect. For example, English has unique unrelated words meaning 'mouse' and 'rat', while Japanese does not. On the other hand, Swahili has unique, unrelated words for 'soldier ant', 'white ant', and 'brown ant', whereas English forms compounds.
Obviously, a word designer will be heavily influenced by his native language, and may unintentionally copy it. In order to avoid this inherent kind of bias, we need to employ a consistent approach.
In the interlingua, we will adopt the following approach for the design of words to represent living entities:
For the living noun classes, a single word should be created for each biological category (phylum, order, class, family, or genus) that is linguistically useful; i.e., which is likely to have a single-word representation in a natural language. A single word may also be used to represent a super-category consisting of more than one category, if the categories are similar enough, and if a natural language is unlikely to differentiate between them. For sub-categories (such as individual species) within a category or super-category, a descriptive mnemonic compound should be created. For extremely common sub-categories, a unique common noun can be created as well.
To illustrate this approach, consider the following chart:
    Common name           Family     Genus & species
    ----------------------------------------------------------
    Arctic fox            Canidae    Alopex lagopus
    Bat-eared fox         Canidae    Otocyon megalotis
    Bushdog               Canidae    Speothos venaticus
    Cape hunting dog      Canidae    Lycaon pictus
    Coyote                Canidae    Canis latrans
    Crab-eating fox       Canidae    Cerdocyon thous
    Dingo                 Canidae    Canis familiaris dingo
    Dog                   Canidae    Canis familiaris
    Grey or Timber wolf   Canidae    Canis lupus
    Raccoon dog           Canidae    Nyctereutes procyonoides
    Red fox               Canidae    Vulpes vulpes
As you can see, there is very little consistency in the English names.
Using the above guidelines, we will allocate a single word for all members of family Canidae. In the interlingua, we will allocate the word "kanzovi", where "-zov" is the classifier for all carnivores. Thus, the simple noun "kanzovi" will refer to any canine, such as 'dog', 'fox', or 'wolf', and the adjective "kanzovo" will be equivalent to the English adjective meaning 'canine'. Now, if the proper noun for 'Arctic' is "Larikitisi", we can create the mnemonic compound "Lelarikitiso Kanzovi" for 'Arctic fox', where "le-" is the mnemonic name prefix. If the color word meaning 'gray' is "kuncinzigo", then the compound "Lekuncinzigo Kanzovi" will mean 'Gray Wolf'. (Note that this is the same approach we used earlier to derive the mnemonic compound meaning 'Black Bear'.)
For Canis familiaris, we need to allocate a unique common noun. In the interlingua, we will allocate the root "zov". In other words, the stand-alone word "zovi" will mean 'dog'. The scientific name can be derived as a mnemonic compound using the word that means 'familiar' or 'common'.
For the non-living noun classes, we will use the following approach:
    1. If a combination of a modifying morpheme plus a noun classifier is highly suggestive or mnemonic, then use it.

    2. Otherwise, if a concept can be implemented, without ambiguity, by exactly two simpler words, then use the two-word compound, even if the result is slightly too general.

    3. However, if a concept requires more than one word to prevent ambiguity, then a single word should be created to represent the concept.
By allowing compounds that are slightly more general in meaning than their English counterparts, the results are more likely to encompass the meanings of equivalent words in other natural languages.
Very early in this monograph, we discussed the semantics of exchange verbs, such as buy/sell, swap or exchange, and borrow/lend. For these verbs, we will allocate the binary relational 'exchange' classifiers "-dog" (when the subject gains possession of the focus) and "-cem" (when the subject loses possession of the focus). These roots will be AP/F-d by default. Here are some examples ("ja" is a modifying morpheme with the senses 'commerce', 'money', 'finance', etc.):
    jadoga = AP/F-d verb 'to buy'
    jacema = AP/F-d verb 'to sell'
In both cases, we will use the 0/AP case tag "tomime" for the secondary agent-patient. When used with "jadoga" = 'buy', it will be equivalent to English "from". When used with "jacema" = 'sell', it will be equivalent to English "to". We will also use the 0/F case tag "tomege" for the amount paid; i.e., the secondary focus, and in both cases, it will be equivalent to English "for", as in "I bought/sold the boat for 200 dollars". Finally, since the patient of "jadoga" is the person who achieves custody of the focus, the A/P/F-d form "jadogamba" means 'to buy F for P' or simply 'to buy P F', as in "I bought the children a puppy".
Many concepts come in groups of closely related members, and we can considerably ease the learning burden by providing regular paradigms to "derive" the corresponding roots or words. We've already seen examples of this when we "derived" deictic roots, tense-aspect roots, numbers, and words for days of the week. Regular paradigms also tend to be inherently neutral. Without them, a language designer is likely to duplicate words and senses directly from his native language, not realizing that other natural languages divide up the same semantic space differently.
Paradigms can be developed for any concept groups that are sequential (e.g. names of months of the year) or componential (e.g. tense-aspect roots). As another example, here is the componential paradigm we will use in the interlingua to represent color concepts:
    Color        Components
    -----------------------
    black        kun-
    purple       jan-
    blue         da-
    green        boy-
    yellow       fe-
    orange       tu-
    red          zo-
    white        cin-

    Shades:
    normal       (default)
    deep/dark    ke-
    light/pale   fo-
The classifier for colors will be "-zig", and by default will be P-s. Simple colors are formed from a single color component plus "-zig". For example, "kunzigo" = 'black', "boyzigo" = 'green', "ketuzigo" = 'deep orange', "fofezigo" = 'pale yellow', and so on. If two primary colors are combined, the rightmost component will indicate the major color. For example, "kuncinzigo" = 'black white' = 'gray'. Here are some more examples:
    white       cinzigo
    orange      tuzigo
    yellow      fezigo
    purple      janzigo
    brown       tukunzigo = orange black
    pink        zocinzigo = red white
    magenta     janzozigo = purple red
    turquoise   foboydazigo = light greenish-blue
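The color paradigm lends itself to a mechanical sketch. The Python below is a minimal illustration built only from the component table and examples above. The function name "color_word" is hypothetical, and limiting a word to at most two color components is an assumption based on the examples given.

```python
# Color components and shade prefixes, copied from the paradigm above.
COLOR = {"black": "kun", "purple": "jan", "blue": "da", "green": "boy",
         "yellow": "fe", "orange": "tu", "red": "zo", "white": "cin"}
SHADE = {"normal": "", "deep": "ke", "dark": "ke", "light": "fo", "pale": "fo"}


def color_word(*colors, shade="normal", pos="o"):
    """Build a color word: shade prefix + component(s) + "zig" + POS marker.

    When two components are given, the rightmost one is the major color,
    so color_word("black", "white") is a blackish white, i.e. gray.
    """
    # Assumption: the examples never combine more than two components.
    if not 1 <= len(colors) <= 2:
        raise ValueError("expected one or two color components")
    return SHADE[shade] + "".join(COLOR[c] for c in colors) + "zig" + pos


print(color_word("black", "white"))                # gray
print(color_word("orange", "black"))               # brown
print(color_word("green", "blue", shade="light"))  # turquoise
```

The default "-o" marker yields the adjective form used in the examples. Passing a different "pos" value would give the other parts of speech, to the extent that the normal derivation rules apply.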
We can also use "zigo" as a stand-alone root with the meaning 'colored' or 'having color'. With this root, we can create words such as the A/P-d verb "zigapa" = 'to color' and the quality noun "zigoni" = 'color' or 'hue'.
[Technically, "zigo" should really be "byezigo". However, if we used "byezigo", then "zigo" would be useless because color derivations follow a paradigm. Thus, we will use "zigo" instead of "byezigo".]
Further derivation of specific colors is also possible. For example, P-d "fezigupa" would be the verb 'to yellow' in a sentence like "The wallpaper yellowed over time".
By nature, kinship terms are binary relationships. In the interlingua, we will allocate morphemes that can be combined to create whatever degree of consanguinity is needed. The classifier "tas" will be used to mark kinship relationships, and it will be P/F-s by default.
Here are the kinship morphemes for the ancestor generation:
    da - parent (either sex)
    di - female parent
    du - male parent
Here are some examples:
    datasi     - English "parent"
    ditasi     - English "mother"
    dutasi     - English "father"
    daditasi   - English "grandmother" (a parent's female parent)
    dadatasi   - English "grandparent" (a parent's parent)
    dadadutasi - English "great grandfather" (a grandparent's father)
    dudutasi   - Seri "hipaz" (parallel grandfather = father's father)
    duditasi   - Seri "himaz" (cross grandmother = father's mother)

    [Seri is a Hokan language spoken in Sonora state, Mexico.]
Note that, like prefixes, a kinship morpheme modifies everything to its right, and the rightmost modifier is the head morph. Thus, "daditasi" = "parent's mother" = 'grandmother'. The word "didatasi", however, is "mother's parent" = 'maternal grandparent'.
Here are the morphemes for siblings:
    za  - sibling (either sex)
    zi  - female sibling
    zu  - male sibling
    zen - sibling of the same sex
    zon - sibling of the opposite sex
Here are some examples:
    zatasi   - English "sibling"
    zitasi   - English "sister"
    dazutasi - English "uncle" (a parent's male sibling)

    [We'll see examples using "zen" and "zon" below.]
Here are the kinship morphemes for the descendent generation:
    ba - child (either sex)
    bi - female child
    bu - male child
Here are some examples:
    batasi     - English "child"
    butasi     - English "son"
    babitasi   - English "granddaughter" (a child's female child)
    zabutasi   - English "nephew" (a sibling's male child)
    dazabatasi - English "cousin" (child of the sibling of a parent)
To handle ancestors and descendents, we will use the following:
    ci - ancestor of (the parent and everyone above on the tree)
    je - descendent of (the child and everyone below on the tree)
Here are some examples:
    citasi   = ancestor = parent or grandparent or great grandparent etc.
    dicitasi = mother's ancestor = her parent or her grandparent or her great grandparent etc.
    jetasi   = descendent = child or grandchild or great grandchild etc.
    bujetasi = son's descendent = his child or his grandchild or his great grandchild etc.
We will also need morphemes to indicate marriage relationships:
    ka - spouse (either sex)
    ki - female spouse
    ku - male spouse
Here are some examples:
    katasi     - English "spouse/mate"
    katasa     - English verb "be married to"
    kataso     - English adjective "married"
    katasumba  - English verb "marry (transitive)", "get married to"
    jukataso   - English adjective "single/unmarried"
    kutasi     - English "husband"
    kadatasi   - English "in-law" (spouse's parent)
    kaditasi   - English "mother-in-law" (spouse's female parent)
    zakutasi   - English "brother-in-law" (sibling's male spouse)
    dikutasi   - English "stepfather" (mother's male spouse)
    kabutasi   - English "stepson" (spouse's son)
    dakabitasi - English "stepsister" (parent's spouse's daughter)
    dabitasi   - English "half sister" (parent's daughter)
Note in the last four examples that we are using spouse and parent prefixes to indicate step and half relations. We are, in effect, adopting the convention that the simpler terms (e.g. "dutasi" = 'father', "zitasi" = 'sister') will indicate natural relationships, while the more complex terms (e.g. "dikutasi" = 'stepfather', "dabitasi" = 'half sister') will indicate "step" or "half" relationships. With this approach, there is no need to allocate separate morphemes for "step" and "half" relationships.
Note also that we cannot use "lokataso" for the meaning 'single/unmarried'. "Lokataso" literally means 'not married to an unspecified person'. Thus, the separate root "jukatas" is needed to handle the meaning 'single/unmarried', even though its focused forms are probably not useful.
Two of the above examples do not overlap their English equivalents precisely. "Zakutasi" meaning 'brother-in-law' applies only to the husband of a sibling. If a brother-in-law is the brother of a spouse, we must instead use "kazutasi". Similarly, we cannot use "dazutasi" meaning 'uncle' to refer to the husband of a parent's sibling. Instead, we must use "dazakutasi". [We could, of course, implement a new triplet of morphemes to represent 'a sibling OR a spouse of a sibling', but, to my knowledge, only a few European languages have this bizarre conflation in meaning, and even they only apply it in a very limited fashion.]
Here are some useful modifiers:
    ke  - modifier meaning 'older'
    fo  - modifier meaning 'younger'
    xa  - referent/focus is male
    do  - referent/focus is female
    tay - modifier meaning 'adopted'
And here are some examples:
    taybutasi    - adopted son
    taytasi      - adopted person
    taytasapa    - adopt
    kezentasi    - Hawaiian "kaikua'ana" (older sibling of the same sex)
    fozatasi     - Hawaiian "pooki'i" (younger sibling)
    dokezutasi   - Korean "oppa" (a female's older brother)
    xakaditasi   - Korean "caangmo" (a male's mother-in-law)
    xabutasi     - Seri "hisáac" (a male's son)
    xabubatasi   - Seri "hiquípaz" (a male's parallel grandchild = son's child)
    xafozitasi   - Seri "hicóome" (a male's younger sister)
    dokezibatasi - Seri "hipxaz" (a female's elder sister's child)
    dokaditasi   - Seri "hiquémez" (a female's mother-in-law)
    dozukitasi   - Seri "hicóaac" (a female's brother's wife)
Note that, when translating from the interlingua into English, "ke-" and "fo-" can be ignored because English does not make these distinctions. They are needed only for translating to languages that have equivalent words.
The most basic P/F-s verb form, "tasa", will have the general meaning 'have a kinship relationship with', 'be a kin of', 'be related to', etc. The simple noun "tasi" means 'relative'.
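Because the kinship morphemes combine left to right, a kinship noun can be segmented and glossed mechanically. The following Python sketch illustrates this under stated assumptions: it handles only nouns ending in "-tasi", it segments by longest match first, and the English glosses (e.g. "mother" for 'female parent') and the function names "split_kin" and "gloss_kin" are choices made for this example rather than anything defined by the interlingua.

```python
# Morpheme inventories copied from the kinship tables above.
RELATION = {
    "da": "parent", "di": "mother", "du": "father",
    "za": "sibling", "zi": "sister", "zu": "brother",
    "zen": "same-sex sibling", "zon": "opposite-sex sibling",
    "ba": "child", "bi": "daughter", "bu": "son",
    "ci": "ancestor", "je": "descendant",
    "ka": "spouse", "ki": "wife", "ku": "husband",
}
ADJECTIVE = {"ke": "older", "fo": "younger", "tay": "adopted"}
REFERENT = {"xa": "a male", "do": "a female"}
# Longest morphemes first, so that "zen" is tried before any two-letter form.
MORPHEMES = sorted({**RELATION, **ADJECTIVE, **REFERENT}, key=len, reverse=True)


def split_kin(word):
    """Split a kinship noun such as "dazabatasi" into its morphemes."""
    if not word.endswith("tasi"):
        raise ValueError("this sketch only handles nouns ending in -tasi")
    stem, parts = word[:-4], []
    while stem:
        for m in MORPHEMES:
            if stem.startswith(m):
                parts.append(m)
                stem = stem[len(m):]
                break
        else:
            raise ValueError("unknown kinship morpheme at: " + stem)
    return parts


def gloss_kin(word):
    """Gloss a kinship noun as an English possessive chain."""
    owner, chain, pending = None, [], []
    for m in split_kin(word):
        if m in REFERENT:
            owner = REFERENT[m]
        elif m in ADJECTIVE:
            pending.append(ADJECTIVE[m])   # applies to the next relation
        else:
            chain.append(" ".join(pending + [RELATION[m]]))
            pending = []
    if pending or not chain:               # bare "tasi" means 'relative'
        chain.append(" ".join(pending + ["relative"]))
    if owner:
        chain.insert(0, owner)
    return "'s ".join(chain)


print(gloss_kin("dazabatasi"))
print(gloss_kin("dokezibatasi"))
```

A translator targeting English would still need a further step that maps chains such as "parent's sibling's child" onto single words like "cousin". The sketch stops at the literal gloss.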
To indicate the time-of-day in the interlingua, simply use one or more numeric morphemes with the point-time classifier "-dus". Here are some examples:
    baduse = 1:00 or 1 AM
    tuduse = 5:00 or 5 o'clock
    bacoduse = 14:00 or 2 PM
    za-ditayduse = 6:37 = 37 minutes past 6 o'clock
    za-jutay-xejuduse = 6:07:20 = 7 minutes and 20 seconds past 6 o'clock
    za-ditay-xeju-boy-kofibaduse = 6:37:20.981 = 37 minutes and 20.981 seconds past 6 o'clock
The adverbial forms (part-of-speech marker "-e") are probably the most useful. For example, "tuduse" literally means 'the time being 5 o'clock', or simply 'at 5 o'clock'.
Note that this approach implies that any "-dus" derivation that starts with a numeric morpheme must be a time-of-day root. Thus, non-time-of-day roots that use the classifier "-dus" must not start with a numeric morpheme, but may contain one or more of them after the first morpheme.
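The time-of-day examples above suggest a regular encoding, sketched here in Python. The formatting rules are inferred from the examples rather than stated in the text: the hour has no leading zero, minutes and seconds are always two digits, groups are separated by hyphens, and "boy" introduces a decimal fraction. The digit morphemes for 4 through 8 are likewise read off those examples, and the function name "time_word" is hypothetical.

```python
# Digit morphemes 0-9; "ju", "ba", "xe", "di", "ko" are glossed in the text,
# the others are inferred from the time-of-day examples (an assumption).
DIGIT = ["ju", "ba", "xe", "di", "co", "tu", "za", "tay", "fi", "ko"]


def digits(text):
    """Turn a string of decimal digits into concatenated numeric morphemes."""
    return "".join(DIGIT[int(ch)] for ch in text)


def time_word(hour, minute=0, second=0, fraction="", pos="e"):
    """Build a time-of-day word with the point-time classifier "-dus".

    fraction -- digits after the decimal point of the seconds, as a string
    pos      -- part-of-speech marker; "-e" gives the adverbial form
    """
    if not (0 <= hour <= 23 and 0 <= minute <= 59 and 0 <= second <= 59):
        raise ValueError("time out of range")
    if any(ch not in "0123456789" for ch in fraction):
        raise ValueError("fraction must contain only decimal digits")
    groups = [digits(str(hour))]
    if minute or second or fraction:
        groups.append(digits(f"{minute:02d}"))
    if second or fraction:
        groups.append(digits(f"{second:02d}"))
    if fraction:
        groups += ["boy", digits(fraction)]
    return "-".join(groups) + "dus" + pos


print(time_word(14))
print(time_word(6, 7, 20))
print(time_word(6, 37, 20, "981"))
```

Passing the fraction as a string rather than a float avoids rounding surprises and preserves trailing zeros, which might matter if precision is meant to be significant.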
[TBD: The following scheme violates self-segregation rules. (See the example below which allows "van mi vin ne" - "vanmi" is probably ok, but "vinne" is not.) I'm open to suggestions.]
There will often be times when a word or phrase must be spelled out. In the interlingua, all spellings start with the particle "fijopaw" and will be terminated by the particle "xijopi". Letters, digits, punctuation marks, and other symbols will be represented by a series of special morphemes and will appear between the particles.
Consonants are CV, where V is 'i' for voiced consonants and 'u' for unvoiced consonants (exception - 'n' is "naw"):
    b = bi      n = naw
    c = cu      p = pu
    d = di      q = qi
    f = fu      r = ri
    g = gi      s = su
    h = hu      t = tu
    j = ji      v = vi
    k = ku      x = xu
    l = li      z = zi
    m = mi
Vowels and semi-vowels are as follows:
    a = van
    e = gen
    i = vin
    o = gon
    u = vun
    w = swe
    y = syo
An upper case letter is preceded by "ke":
    B = kebi
    H = kehu
    I = kevin
    T = ketu
    W = keswe
    etc.
Number symbols will consist of "nu" followed by the numeric morpheme for the digit:
    0 = nuju
    1 = nuba
    2 = nuxe
    9 = nuko
    . = nuboy (= decimal point OR period)
    - = nufaw (= minus sign OR hyphen)
    etc.
Punctuation and special symbols will also have dedicated C(S)Vs:
    question mark = ne
    space/blank = to
    single quote = te
    double quote = kete
    etc (TBD).
Accented letters use "day" plus a special C(S)V after the letter. The special C(S)V will, if possible, be the same as the punctuation symbol that most closely resembles the accent mark:
    acute accent = dayte
    umlaut = dayxe
    circumflex = dayne
    tilde accent = dayci
    macron = daysay
    etc (TBD).

    á = vandayte
    ñ = nidayci
    ü = vundayxe
    Ü = kevundayxe
    etc.
For stand-alone accents, use the space/blank character "to" as a placeholder. For example, acute accent (stand-alone) = "todayte".
Here's an example of a complete, spelled-out word:
"Latejami?" = fijopaw keli van tu gen ji van mi vin ne xijopi
Non-Roman alphabets and syllabaries can use the closest equivalent of the above with an appropriate modifier. For example, Greek "kappa" will be "fijopaw ku xijopi xxx", where "xxx" is the interlingua adjective meaning 'Greek'. Syllabic scripts, such as Japanese Hiragana and Katakana, can use two or more morphemes. For example, Hiragana "ba" will be "fijopaw bi van xijopi xxx", where "xxx" is the interlingua adjective meaning 'Hiragana'. (Obviously, the modifying adjective "xxx" will not be needed if the context makes it unnecessary.)
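The Roman-alphabet part of the spelling scheme can be sketched as a lookup table in Python. This is an illustration only: it covers letters, digits, and the few symbols listed above, it leaves out accents and non-Roman scripts, and the digit morphemes for 3 through 8 are taken from the time-of-day examples rather than from the spelling tables. The function name "spell" is hypothetical.

```python
# Letter morphemes from the consonant and vowel/semi-vowel tables above.
LETTER = {
    "b": "bi", "c": "cu", "d": "di", "f": "fu", "g": "gi", "h": "hu",
    "j": "ji", "k": "ku", "l": "li", "m": "mi", "n": "naw", "p": "pu",
    "q": "qi", "r": "ri", "s": "su", "t": "tu", "v": "vi", "x": "xu",
    "z": "zi",
    "a": "van", "e": "gen", "i": "vin", "o": "gon", "u": "vun",
    "w": "swe", "y": "syo",
}
# Assumption: digits 3-8 use the numeric morphemes seen in the time examples.
DIGIT = dict(zip("0123456789",
                 ["ju", "ba", "xe", "di", "co", "tu", "za", "tay", "fi", "ko"]))
SYMBOL = {".": "nuboy", "-": "nufaw", "?": "ne", " ": "to",
          "'": "te", '"': "kete"}


def spell(text):
    """Spell out text between the particles "fijopaw" and "xijopi"."""
    out = []
    for ch in text:
        low = ch.lower()
        if low in LETTER:
            # An upper case letter is preceded by "ke".
            out.append(("ke" if ch.isupper() else "") + LETTER[low])
        elif ch in DIGIT:
            out.append("nu" + DIGIT[ch])
        elif ch in SYMBOL:
            out.append(SYMBOL[ch])
        else:
            raise ValueError("no spelling morpheme known for " + repr(ch))
    return " ".join(["fijopaw", *out, "xijopi"])


print(spell("Latejami?"))
```

The sketch raises an error on any character it does not know rather than guessing, since the symbol inventory above is still marked TBD.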
Chinese characters will be based on their Unicode descriptions, as follows (TBD).
Just for fun, here's a fairly large (but incomplete) set of derivations using the simplest speech act root "teg", meaning 'say/tell/speak' (default = A/P/F-d). Here goes...
    tega        to tell (e.g. "I told John a joke")
    tegenvi     statement, remark, utterance, comment ("-env" = event suffix)
    tegoma      to say, utter ("-om" = anti-middle; e.g. "I said that I was sorry")
    teginza     to speak ("-inz" = AP/F-s activity; e.g. "I spoke about dolphins" or "I spoke a few words" or "I can speak French". Use A/P/F-d "tegamba" or A/P-d "tegapa" if there is a separate patient, as in "I spoke (three words) TO THE STUDENTS".)
    tegawna     to discuss, to talk/confer on/about ("-awn" = reciprocal)
    tegawnenvi  conversation, discussion, dialogue
    tedami      parrot ("dam" = bird)
    tecimi      laryngitis ("cim" = bodily state classifier)
    tezegi      forum, meeting place for discussions ("zeg" = other artificial location)
    tefusi      megaphone ("fus" = non-powered tool/instrument)
    tebisi      telephone ("bis" = other powered device)
    fotebisi    intercom ("fo" = low polarity)
    tekusi      word ("kus" = performance component)
    kutekusi    vocabulary ("ku" = group modifier)
    zutekusi    morpheme ("zu" = minimal polarity)
    fotekusi    phrase ("fo" = low polarity)
    cantekusi   clause ("can" = average polarity)
    ketekusi    sentence ("ke" = high polarity)
    bitekusi    paragraph ("bi" = maximum polarity)
    tecala      give/deliver a speech/talk/address ("cal" = activity)
    tecalenvi   speech, talk, oration, address ("-env" = event suffix)
    tejami      language ("jam" = protocol/standard/system)
    Latejami    the name of the interlingua ("la-" = proper name prefix)
Keep in mind that a modifying morpheme does not have to be semantically precise - it can be used for its approximate, metaphoric, or sound value.
Throughout this monograph, we've seen many examples of derivations whose English counterparts were periphrastic, polysemic, metaphoric, or even idiomatic. In fact, when speakers of natural languages use non-literal language it is almost always because they are forced to do so. They cannot avoid it either because their vocabulary does not have an appropriate literal construction available, or because it is something that the speaker is not comfortable using.
This is unfortunate because the way that a non-literal construction will be interpreted will depend very much on the native language and culture of the listener. For example, metaphoric use of the word "pig" can have meanings such as "slob", "sex maniac", or "over-eater" in English, but will have different meanings to speakers of other languages. Also, as we've seen many times throughout this monograph, many metaphors, including the above examples, can be avoided by using appropriate derivations instead. For example, pejorative morphemes or more precisely derived compounds can be used to implement the above examples. In fact, I have become completely convinced that a properly derived word can replace any required or unavoidable metaphor, and it can never be misinterpreted by native speakers of other languages.
The goal of a designer of an MT interlingua should be to provide the means to say anything without the need for non-literal language. In other words, metaphor, polysemy, and idiom should be optional - they should never be obligatory. It is also my opinion that non-literal language should be generally avoided (except where its use is obvious to all listeners or readers), since the possibility for misunderstanding is so great.
[If you would like to read more about the dangers of metaphor, see my separate essay entitled "Metaphor".]
At the very beginning of this monograph, I stated that the focus case role is vague and even somewhat "out-of-focus". Furthermore, even our working definition of focus is vague: that the focus is the referent of an actual or potential relationship with the patient, or the elaboration of an action.
Actually, I don't really think that the above definition is needed, even though I do believe that it is accurate. In fact, we can come up with a different and perhaps better definition if we look at our primary case roles as sets of binary features, an approach which is often quite useful in linguistics. There are only two features, agent and patient, and there are only four possible combinations:
A case role   ->  +agent, -patient
P case role   ->  -agent, +patient
AP case role  ->  +agent, +patient
F case role   ->  -agent, -patient
In other words, the focus case role is the primary case role that has neither agent nor patient attributes.
Thus, focus is indeed vague, but it is definitely not ambiguous.
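The feature analysis above can be written down as a small lookup table. The following Python sketch is only an illustration; the names `Features`, `CASE_ROLES`, and `role_for` are hypothetical and are not part of the interlingua or its software.

```python
from typing import NamedTuple


class Features(NamedTuple):
    """The two binary features used to define the primary case roles."""
    agent: bool
    patient: bool


# The four possible feature combinations, exactly as tabulated in the text.
CASE_ROLES = {
    "A": Features(agent=True, patient=False),
    "P": Features(agent=False, patient=True),
    "AP": Features(agent=True, patient=True),
    "F": Features(agent=False, patient=False),
}


def role_for(agent: bool, patient: bool) -> str:
    """Return the case role name that has the given feature values."""
    wanted = Features(agent, patient)
    for name, features in CASE_ROLES.items():
        if features == wanted:
            return name
    raise ValueError("no case role for these features")
```

Because every combination of the two features maps to exactly one role, `role_for(False, False)` can only return "F", which is the sense in which focus is not ambiguous.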
I hope that by now I have convinced the reader of the value of a powerful derivational system. I cannot emphasize too much that a system like the one that I've described here will maximize the neutrality of the vocabulary of an artificial language, while completely eliminating the need for ad hoc and arbitrary word creation, thus making it ideal for an MT interlingua. It will also reduce to an absolute minimum the number of morphemes that a user of the language will have to memorize.
One of the greatest difficulties in learning a new language is mastering the idiosyncrasies of the vocabulary. This is so because a word in one language rarely means exactly the same thing as its closest counterpart in a different language. In other words, the "semantic space" of each word in a natural language is arbitrary - the result of centuries of evolution and accident. In effect, each word of a natural language has built-in irregularities that the student must learn.
Unfortunately, most language designers unwittingly clone their native vocabulary, not realizing the difficulty that will be faced by potential users of the language. The net result is that the meaning of a word cannot be deduced from more basic and universal concepts that have the same meaning for everyone, but instead depends almost exclusively on its meaning in only one natural language - the native language of the designer.
In such a design, as in the natural language on which it is based, the semantic space of each word is arbitrary, and mastering the idiosyncrasies of the entire vocabulary will take years of effort. Thus, different speakers will use the words differently, and misunderstandings will occur because there are no rules that can be followed to determine the precise semantic space of a word. Instead, each speaker will use the word in the same way he would use the closest equivalent in his native language.
In the system described here, the semantic space of each word is precisely defined in terms of the much more basic meanings of the components that make up each word. And while there may be some arbitrariness in the selection of the root concepts, the overall arbitrariness of the entire vocabulary is much, much less. Thus, even though we may never be able to achieve true neutrality, we can certainly come very close.
Definitions:
()  indicates that the enclosed item is optional
{}  indicates that the enclosed item may appear zero or more times
[]  indicates that the enclosed item must appear one or more times
|   ::= logical or
V   ::= any vowel ::= a | e | i | o | u
S   ::= any semivowel ::= y | w
C   ::= any consonant ::= b | c | d | f | g | j | k | l | m | n | p | q | r | s | t | v | x | z
C1  ::= modifier starter ::= b | c | d | f | j | k | q | r | t | x | z   [q and r not used in native words]
C2  ::= classifier terminator ::= g | l | m | p | s | v
C3  ::= suffix terminator ::= g | m | n | p | s | v
[Note that C3 is any classifier terminator except l, which is reserved for prefixes and classifier terminators. C3 also includes n, which can never start a modifier (but can terminate one).]
A vocalic nucleus N has the following form:
N ::= vocalic-nucleus ::= [V]
More precisely, a vocalic nucleus can consist of one or more vowels, and, if there is more than one vowel, then 'i' or 'u' is converted to the corresponding semi-vowel 'y' or 'w'. For example, "eua" becomes "ewa". I'll have more to say about this later.
A prefix has the form:
prefix ::= l N (n)
examples: la, loy, lawe, len, loyn
A suffix has the form:
suffix ::= N C3 | N m C | N n C
examples: on, int, ayn, ev, umb, wav
A suffix changes the syntax and semantics of a word in a precise (i.e., totally predictable) way. For example, if we add the A/P-d suffix "-ap" and the final verb marker "-a" to the root "bodam" (meaning 'duck'), the result "bodamapa" means 'to turn P into a duck', which is a dynamic state verb. In other words, we have changed both the syntax and meaning from a 'duck' noun to a 'change-of-state' verb.
In summary, a prefix modifies the meaning of the entire word that follows it without changing its syntax. A suffix changes both meaning and syntax of the root plus any intervening suffixes. In other words, we start with the root, add the suffixes, and then add the prefixes to obtain the final meaning.
There are two kinds of root morphemes: modifiers and classifiers.
A classifier has the form:
classifier ::= C1 N C2
examples: cop, del, tus, bam, fig, zav
[Only CVC will be used for everyday vocabulary. Unused CNC, especially CSVC and CVSC, can be used for scientific and technical classifiers.]
A modifier has the form:
modifier ::= C1 N (n)
examples: bu, co, day, kwi, zen, tayn
Thus, a root morpheme and a root are defined as follows:
root-morpheme ::= modifier | classifier
root          ::= {modifier} classifier
Note that a classifier may be preceded by zero or more modifiers but may not be followed by one. Thus it automatically terminates a root.
Finally, a word has the following form:
POS     ::= part-of-speech marker ::= a, e, aw, yu, etc
word    ::= {prefix} + root + {suffix} + POS
anaphor ::= first-root-CN(n) + h + POS
As for pronunciation, vowels are cardinal, although laxer versions are acceptable (i.e., pronounce vowels as in Italian or Swahili). Pronounce /w/ as in "awake", /y/ as in "soybean", /c/ like "ch" in "chin", /j/ as in "judge", /x/ like "sh" in "ship", /q/ like "s" in "measure", and /r/ as any rhotic (flap, trill, retroflex, uvular, etc). The consonant /h/ may be pronounced like 'h' in "house", as a glottal stop (i.e., like "tt" in "button"), or as [x] (i.e., like "ch" in German "acht"). [More generally, /h/ may be pronounced as a glottal stop or as any unvoiced velar, uvular, pharyngeal, or glottal fricative.]
Geminates (i.e., two or more consecutive, identical vowels, semivowels, or consonants) are not allowed. For example, "kk", "bb", "uu", and "yy" are not allowed. The sequences /uw/, /wu/, /iy/, /yi/, /ou/, /ow/, /ei/, /ey/, /ao/, /ae/, /wy/, and /yw/ are also not allowed. However, it is always legal to pronounce /e/ as either [e] or [ey], and /o/ as either [o] or [ow]. For example, /ea/ may be pronounced [ea] or [eya], and /oa/ may be pronounced [oa] or [owa].
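The restrictions above are simple enough to check mechanically. The following Python sketch scans a spelling for geminates and for the listed two-letter sequences. The function name `phonotactic_violations` is hypothetical, and the check looks only at adjacent letters of the written form, not at permitted pronunciation variants such as [ey] for /e/.

```python
# Two-letter sequences that the text lists as not allowed.
FORBIDDEN_PAIRS = {
    "uw", "wu", "iy", "yi", "ou", "ow",
    "ei", "ey", "ao", "ae", "wy", "yw",
}


def phonotactic_violations(word: str) -> list:
    """Return every adjacent letter pair that is a geminate or a forbidden sequence."""
    found = []
    for first, second in zip(word, word[1:]):
        pair = first + second
        if first == second or pair in FORBIDDEN_PAIRS:
            found.append(pair)
    return found
```

An empty result means the spelling passes these particular checks; it does not by itself show that the string is a well-formed word.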
The vowels 'i' and 'u' may never appear adjacent to another vowel - use 'y' or 'w' instead. For example, the roots "foidam" and "kuentis" are illegal, but "foydam" and "kwentis" are legal. If 'i' and 'u' are adjacent, convert the first to a semi-vowel. Thus, "ui" becomes "wi" and "iu" becomes "yu".
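The conversion of 'i' and 'u' to semivowels can be sketched as a single left-to-right pass. The function name `normalize_nucleus` is hypothetical. The sketch assumes that a letter already converted to a semivowel no longer counts as a vowel for its right-hand neighbor, which is what makes "ui" come out as "wi" rather than "wy". Runs of three or more vowels that contain both 'i' and 'u' are not spelled out in the text and are not handled specially here.

```python
VOWELS = set("aeiou")
SEMIVOWEL = {"i": "y", "u": "w"}


def normalize_nucleus(text: str) -> str:
    """Replace 'i'/'u' with 'y'/'w' wherever they touch another vowel."""
    chars = list(text)
    for k, ch in enumerate(chars):
        if ch not in SEMIVOWEL:
            continue
        # Neighbors are read from the partially converted list, so a
        # semivowel produced earlier in the pass is no longer a vowel.
        left = chars[k - 1] if k > 0 else ""
        right = chars[k + 1] if k + 1 < len(chars) else ""
        if left in VOWELS or right in VOWELS:
            chars[k] = SEMIVOWEL[ch]
    return "".join(chars)
```

With this sketch, the illegal spellings "foidam" and "kuentis" are rewritten as the legal "foydam" and "kwentis".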
Although stress is not necessary, we will adopt the following convention for the sake of consistency:
If a root contains at least one modifier, then the first vowel of the first modifier should be stressed. [Examples: BA-kav-o, li-JO-zip-i, TWA-cu-zum-i, li-la-KO-ke-tov-i]
If a word contains at least one suffix, then the final vowel of the final suffix should be stressed. [Examples: fag-AP-a, bin-IMB-a, KE-dap-OG-e, BI-jeg-unz-ANG-yu]
If a word contains at least one modifier and one suffix, the suffix should be given primary (i.e., heavier) stress, and the modifier should be given secondary (i.e., lighter) stress.
If a word contains neither a modifier nor a suffix, then the final vowel of the classifier should be stressed. [Examples: CAL-a, FOM-o, li-KIG-i, li-law-BEG-i]
The above approach is completely self-segregating at both the morpheme and word level. In addition, the syntax of the interlingua will ensure self-segregation at the constituent and sentence levels.
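Self-segregation at the morpheme level can be illustrated by turning the word grammar directly into a regular expression. The Python sketch below is one possible reading of the rules, not the author's software. The names `parse_word` and `Parsed` are hypothetical, a vocalic nucleus is approximated as any run of vowels and semivowels, anaphors are not handled, and the sketch checks only the shape of each morpheme, not whether it is a morpheme that actually exists.

```python
import re
from typing import NamedTuple, Tuple

N = r"[aeiouyw]+"             # vocalic nucleus (vowels plus semivowels)
C = r"[bcdfgjklmnpqrstvxz]"   # any consonant
C1 = r"[bcdfjkqrtxz]"         # modifier/classifier starter
C2 = r"[glmpsv]"              # classifier terminator
C3 = r"[gmnpsv]"              # suffix terminator

PREFIX = rf"l{N}n?"                 # prefix     ::= l N (n)
MODIFIER = rf"{C1}{N}n?"            # modifier   ::= C1 N (n)
CLASSIFIER = rf"{C1}{N}{C2}"        # classifier ::= C1 N C2
SUFFIX = rf"{N}(?:[mn]{C}|{C3})"    # suffix     ::= N m C | N n C | N C3

# word ::= {prefix} + {modifier} classifier + {suffix} + POS
WORD = re.compile(
    rf"((?:{PREFIX})*)((?:{MODIFIER})*)({CLASSIFIER})((?:{SUFFIX})*)({N})"
)


class Parsed(NamedTuple):
    prefixes: Tuple[str, ...]
    modifiers: Tuple[str, ...]
    classifier: str
    suffixes: Tuple[str, ...]
    pos: str


def parse_word(word: str) -> Parsed:
    """Split a word into prefixes, modifiers, classifier, suffixes, and POS marker."""
    match = WORD.fullmatch(word.lower())
    if match is None:
        raise ValueError(f"not a well-formed word: {word!r}")
    prefixes, modifiers, classifier, suffixes, pos = match.groups()
    return Parsed(
        tuple(re.findall(PREFIX, prefixes)),
        tuple(re.findall(MODIFIER, modifiers)),
        classifier,
        tuple(re.findall(SUFFIX, suffixes)),
        pos,
    )
```

For example, `parse_word("bodamapa")` separates the modifier "bo", the classifier "dam", the suffix "ap", and the part-of-speech marker "a". The sketch requires a final part-of-speech marker, as the word rule does.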
This appendix contains a complete list of all prefixes, suffixes, and compositional root morphemes of the interlingua. Classifiers are listed in Appendix C. Classifiers with their stand-alone, classifier, and modifier (if any) meanings are listed in Appendix D.
Prefixes:
lu-    generic noun
li-    plural, more than one
loy-   male
law-   female
lo-    negator: un-, non-, not, other than
la-    proper name
le-    mnemonic name
lay-   again/repeat
lyu-   back/in return
lwa-   back/to former state

Register prefixes:
laye-  humble, subservient, inferior, fawning, groveling
lea-   praising, complimentary, flattering
lin-   polite, formal, respectful
????-  formal, correct
--     neutral (default)
lewi-  slang, informal
lun-   cold, unfriendly, unsociable
loa-   contemptuous, rude, insulting
layo-  vulgar, filthy, tasteless
Suffixes:
Basic argument structure suffixes:
  A/P/F-s: -anz    A/P/F-d: -amb
  A/P-s:   -as     A/P-d:   -ap
  AP/F-s:  -inz    AP/F-d:  -imb
  AP-s:    -is     AP-d:    -ip
  P/F-s:   -unz    P/F-d:   -umb
  P-s:     -us     P-d:     -up
The above should only be used if the default argument structure is being changed. To change just the part-of-speech of a root, use an appropriate part-of-speech marker instead (see below).

Non-linking suffixes:
  0/A:  -am
  0/AP: -im
  0/P:  -um
  0/F:  -eg
  0:    -og

Derivational suffixes:
  -ep   mass noun
  -op   count noun
  -TBD  kind/type-of (TBD)
  -TBD  result-of (TBD) (translate -> translation, mix -> mixture, wreck -> wreckage, build -> construction, collect -> collection, spill -> spillage, damage -> damage, copy/duplicate -> copy/reproduction, handwrite -> handwriting)
  -ayg  get/determine/measure state (to weigh, to time, to test (i.e. to determine if something is in working order), to compare, etc - result = AP/F-d verb)
  -ig   apply/use noun to/on patient (to brush, to knife, to pencil, inject, etc - result = A/P-d verb)
  -ent  add root to patient (to water, to stamp, to salt, to lengthen, to cover, to enlarge, etc - result = A/P-d verb)
  -unk  remove root from patient (to erase, to de-salt, to shorten, undress, to strip paint from, to shrink, etc - result = A/P-d verb)
  -aym  associated position noun (result = P/F-s noun)
  -on   quality or ability (result = noun, structure unchanged)
  -ink  process noun (result = noun, structure unchanged)
  -env  event noun (result = noun, structure unchanged)

Voice suffixes:
[Note: an earlier version of this interlingua had voice morphemes for anti-anti-passive and anti-anti-middle, which I have not implemented in the current version. Instead, if it's necessary to demote the focus of a ditransitive verb, one of the unfocused A/P suffixes can be used to represent a combined anti-anti-passive and anti-anti-middle operation. Obviously, this implies that the focus of these verbs can never be expressed obliquely and that we can no longer make a semantic distinction between anti-anti-passive and anti-anti-middle. However, I do not consider this a disadvantage because I know of no natural language that can do these things.]
  -ang  inverse (A/P/F-x -> P/A/F-x)
  -av   reflexive (object is identical to subject, for states A/P/F becomes A=F/P, for actions A/P/F becomes A=P/F)
  -awn  reciprocal (A/P/F -> A+P/F, P/F -> P+F)
  -ind  P+F reciprocal (A/P/F -> A/P+F)
  -es   passive
  -os   anti-passive
  -em   middle
  -om   anti-middle
  -ev   co-subject (comitative, demotes part of the subject and makes it obliquely expressible)
  -ov   non-subject (anti-comitative, an entity is specifically excluded from being subject)

Special voice suffixes:
  -iv   infinitive - this must always be a verb - nothing can follow this suffix!
  -oys  same arguments as the first conjunct - nothing can follow this suffix!

Special suffix:
  -en   interrogative

Part-of-speech (POS) suffixes:
  Verb:                                   -a
  Noun:                                   -i     open: -aw
  Adjective:                              -o     open: -yu
  Adverb/Case tag:                        -e
  Previous-word modifier:                 -ay    open: -wa
  Vocative:                               -we
  Imperative:                             -oy
  True conjunction:                       -ye
  Tense, modal, and other special words:  -u
The order of suffixes is not semantically important. However, for the sake of consistency, the following order is recommended since it is the order in which they are processed:
closest to the root
  - basic argument structure, non-linking, and derivational suffixes (excluding quality suffix)
  - one or more voice suffixes (excluding infinitive suffix)
  - quality, process, or event suffix
  - infinitive, or same args suffix
  - interrogative suffix
farthest from the root
If suffixes are not in the above order, the word will be processed as if they were in the above order.
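The recommended ordering can be sketched as a stable sort over five suffix groups. In the Python sketch below, the group assignments are copied from the suffix list in this appendix, while the names `SUFFIX_RANK` and `recommended_order` are hypothetical. Suffixes still marked TBD are omitted, and an unknown suffix raises a `KeyError`.

```python
# Rank 0 is closest to the root, rank 4 is farthest from it.
SUFFIX_GROUPS = [
    # basic argument structure, non-linking, and derivational suffixes
    # (excluding the quality, process, and event suffixes)
    "anz amb as ap inz imb is ip unz umb us up "
    "am im um eg og ep op ayg ig ent unk aym",
    # voice suffixes (excluding the infinitive)
    "ang av awn ind es os em om ev ov",
    # quality, process, or event suffix
    "on ink env",
    # infinitive or same-arguments suffix
    "iv oys",
    # interrogative suffix
    "en",
]

SUFFIX_RANK = {
    suffix: rank
    for rank, group in enumerate(SUFFIX_GROUPS)
    for suffix in group.split()
}


def recommended_order(suffixes):
    """Return the suffixes in the recommended (processing) order.

    sorted() is stable, so suffixes of the same group keep their
    original relative order.
    """
    return sorted(suffixes, key=lambda suffix: SUFFIX_RANK[suffix])
```

For instance, `recommended_order(["env", "awn"])` places the reciprocal "awn" before the event suffix "env", matching the order seen in "tegawnenvi".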
Scope and order of prefixes and suffixes:
In general, all suffixes are applied to the root to create a basic stem and then prefixes are applied to modify the meaning of the result. However, the order of analysis will be the order that makes sense - regardless of the actual order in the word. For example, most derivational suffixes must apply before any voice suffixes so that the argument structure can be changed if needed. Also, a quality suffix must apply after any voice suffixes so that the correct argument slot can be selected before being converted to the quality associated with that argument.
Polarity root morphemes (all are modifiers):
A polarity root morpheme creates an effective new root whose meaning depends on the nature of the unmodified root. For scalar states, the degree of the state will be changed (e.g. warm vs. hot). For physical nouns, the magnitude will be changed (e.g. lake vs. pond). For sentient concepts, the degree of age, rank, or experience will be changed (e.g. pope vs. bishop, college vs. high school, etc.) If there are insufficient degrees of polarity, then more than one can be used (e.g. sergeant vs. master sergeant).

bi-   maximal polarity
ke-   high polarity
can-  average polarity
fo-   low polarity
zu-   minimal polarity
ju-   0% polarity
kin-  too, excessively, over-
bon-  insufficient, too little, inadequate, not enough
dan-  extra, spare, surplus, over and above, above and beyond
xi-   enough, adequately, sufficiently
ta-   almost, not quite, nearly, all but, well-nigh
coy-  just, only, exclusively, simply
fen-  about, approximately, circa, more or less
dun-  exactly, precisely, no more and no less
baw-  especially, particularly, in particular
Numerics:
The numeric classifier is "-kum".

Numeric modifiers:
  faw-  minus sign (default = positive)
  --    cardinal (This is the default.)
  -bye  ordinal
  -ci   previous, minus one-th ordinal
  -je   next, plus one-th ordinal
  -da   N-ary, Nth in importance, rank, or value
  -zen  N-tuple, N-fold, N of a kind, N in one
  -ku   N at a time, N per group, in groups of N

Numeric components:
  ju-   zero
  ba-   one
  xe-   two
  di-   three
  co-   four
  tu-   five
  za-   six
  tay-  seven
  fi-   eight
  ko-   nine

Numeric linkers:
  -boy-  decimal point
  -xo-   positive exponent
  -twa-  negative exponent
  -fu-   real/imaginary separator
  -tin-  fraction, X/Y

Simple numbers are formed by appending the classifier "-kum" to the digit. For example, "fikumo" = 'eight', "bajujukumo" or "baxoxekumo" = 'one hundred', "taybyekumo" = 'seventh', and "didakumo" = 'tertiary'. Larger numbers are formed by linking individual digits and terminating them with "-kumo". For example, "tayjubakumo" = 701, "xejukumo" = 20, and so on.

If a linker does not have a number to its left, then the default is assumed to be "ju" = 'zero' for decimal point and real/imaginary separator, and "ba" = 'one' for all the other linkers. For example, "boytutaykumo" = 0.57, and "tindikumo" = 'one-third'.

If the fraction linker "-tin-" does not have a string to its right, it will be assumed to be 'all', and only a polarity modifier may precede "-tin-". For example, "ketinkumo" means 'a large fraction of' or 'most'.

All numeric roots terminated by "-kum" are P-s by default.

Non-specific numeric words are formed from the scalar polarity morphemes plus the numeric classifier "-kum", as follows:
  bikumo   all, every, the whole amount of, the maximum amount possible of
  kekumo   many, much, lots of, a lot of, a large amount of, numerous, plenty of
  cankumo  several, some, a number of, a moderate/average/typical amount of
  fokumo   a few, a little, a small amount of, not too many, not too much
  zukumo   very few, very little, a tiny/minimal amount of, hardly any, almost no
  kumo     any, some, an unspecified number/quantity/amount of
  kumeno   how much?, how many?, what quantity of?
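The digit-linking rule for simple numbers described above is easy to mechanize. The following Python sketch covers only unsigned decimal numerals with an optional decimal point; exponents, fractions, ordinals, and the minus sign are left out. The function name `numeric_word` is hypothetical, and dropping a lone leading zero before the decimal point follows the "boytutaykumo" example rather than an explicit rule.

```python
# Digit morphemes indexed by their numeric value, as listed in the text.
DIGITS = ["ju", "ba", "xe", "di", "co", "tu", "za", "tay", "fi", "ko"]


def numeric_word(number: str, pos: str = "o") -> str:
    """Build a numeric word from a decimal numeral such as "701" or "0.57"."""
    whole, dot, frac = number.partition(".")
    digits = whole + frac
    if not digits or any(ch not in "0123456789" for ch in digits):
        raise ValueError(f"not an unsigned decimal numeral: {number!r}")
    if dot and not frac:
        raise ValueError("a decimal point must be followed by digits")
    if dot and whole in ("", "0"):
        whole = ""  # zero is the default to the left of the decimal point
    parts = [DIGITS[int(ch)] for ch in whole]
    if dot:
        parts.append("boy")  # decimal point linker
        parts.extend(DIGITS[int(ch)] for ch in frac)
    return "".join(parts) + "kum" + pos
```

Called on "701", "20", and "0.57", the sketch reproduces the forms "tayjubakumo", "xejukumo", and "boytutaykumo" given above.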
Deictic roots:
Person            Deictic Type      Default
------------      --------------    ---------
1:      ba-       Pers:  -v         P-s
2:      xe-       Gen:   -m         P-s
3:      di-       Dem:   -s         P-s
1+2:    co-       Loc:   -l         "0"
1+3:    tu-       Tem:   -p         "0"
2+3:    za-
1+2+3:  tay-
Tense-aspect roots:
By default, unmodified "cip" is past-perfect, "das" is present-imperfect, "jev" is future-perfect, "kom" and "xus" are both perfect, and "bul" is imperfect. All tense-aspect roots are deictic disjuncts when the part-of-speech is "-u", and P/F-s by default when the part-of-speech is not "-u".

Aspect                  Tense
--------------------    ---------------------
Perfective:    fin-     Past:            -cip
Imperfective:  doy-     Present:         -das
Iterative:     kaw-     Future:          -jev
Habitual:      xa-      Past+Present:    -kom
Inceptive:     ci-      Present+Future:  -xus
Terminative:   ju-      Unspecified:     -bul
Resumptive:    da-
Completive:    je-
Unspecified:   to-
Modality roots:
All modal roots are deictic disjuncts when the part-of-speech is "-u". When the part-of-speech is not "-u", epistemic roots are AP/F-s and deontic roots are P/F-s by default.

Modality                         Morpheme
-----------------------------    --------
Probability (epistemic)          tam
Evidentiality (epistemic)        jeg
Inevitability (epistemic)        cav
Acceptability (epistemic)        zim
Significance (epistemic)         dup
Hedge (epistemic)                fug
Counterfactuality (epistemic)    xiv
Reasonableness (epistemic)       top
Obligation (deontic)             dov
Necessity (deontic)              kes
Consequentiality (deontic)       zul
Color modifiers (classifier = "-zig", P-s adjective by default):
Color       Components
--------------------
black       kun-
purple      jan-
blue        da-
green       boy-
yellow      fe-
orange      tu-
red         zo-
white       cin-

Shades:
normal      (default)
deep/dark   ke-
light/pale  fo-

Simple colors are formed from a single color component plus "-zig". For example, "kunzigo" = 'black', "boyzigo" = 'green', "ketuzigo" = 'deep orange', "fofezigo" = 'pale yellow', and so on.

If two primary colors are combined, the rightmost component will indicate the major color. For example, "kuncinzigo" = 'black white' = 'gray'. Here are some more examples:

white       cinzigo
orange      tuzigo
yellow      fezigo
purple      janzigo
brown       tukunzigo = orange black
pink        zocinzigo = red white
magenta     janzozigo = purple red
turquoise   foboydazigo = light greenish-blue
Kinship roots:
classifier = "-tas" (default = P/F-s noun)

da  - parent (either sex)
di  - female parent
du  - male parent
za  - sibling (either sex)
zi  - female sibling
zu  - male sibling
zen - sibling of the same sex
zon - sibling of the opposite sex
ba  - child (either sex)
bi  - female child
bu  - male child
ci  - ancestor of (the parent and everyone above on the tree)
je  - descendant of (the child and everyone below on the tree)
ka  - spouse (either sex)
ki  - female spouse
ku  - male spouse
ke  - modifier meaning 'older'
fo  - modifier meaning 'younger'
xa  - referent/focus is male
do  - referent/focus is female
tay - modifier meaning 'adopted'

A kinship morpheme modifies everything to its right, and the rightmost morpheme is the head morpheme. Thus, "daditasi" = "parent's mother" = 'grandmother'. The word "didatasi", however, means "mother's parent" = 'maternal grandparent'.
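The left-to-right modification rule for kinship words can be sketched as a simple gloss generator. The Python sketch below handles only the twelve two-letter parent, sibling, child, and spouse morphemes. The English glosses ("mother" for 'female parent', and so on) and the names `KIN` and `gloss_kinship` are illustrative assumptions, and the remaining morphemes ("zen", "zon", "ci", "je", "ke", "fo", "xa", "do", "tay") are deliberately left out.

```python
# Illustrative English glosses for the basic kinship morphemes.
KIN = {
    "da": "parent", "di": "mother", "du": "father",
    "za": "sibling", "zi": "sister", "zu": "brother",
    "ba": "child", "bi": "daughter", "bu": "son",
    "ka": "spouse", "ki": "wife", "ku": "husband",
}


def gloss_kinship(word: str) -> str:
    """Gloss a kinship word such as "daditasi" as an English possessive chain."""
    stem, classifier, _pos = word.rpartition("tas")
    if not classifier or not stem or len(stem) % 2:
        raise ValueError(f"unsupported kinship word: {word!r}")
    # Every supported morpheme is two letters long, so the stem can be
    # cut into fixed-width pieces; unknown pieces raise KeyError.
    morphemes = [stem[k:k + 2] for k in range(0, len(stem), 2)]
    # Each morpheme modifies what follows it; the rightmost one is the head.
    return "'s ".join(KIN[m] for m in morphemes)
```

The sketch reproduces the two glosses given in the text: "daditasi" comes out as "parent's mother" and "didatasi" as "mother's parent".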
Comparative words:
bijopay   = 'most'
kejopay   = 'more'
canjopay  = 'as much/many'
fojopay   = 'less/fewer'
zujopay   = 'least'
jopenay   = 'how (much)', 'to what degree'
zuntesye  = 'than', 'as', 'compared with'
Other particles and related items:
Unknown number marker:               kujopo   (For use in machine translation only.)
Unknown definiteness marker:         tojopo   (For use in machine translation only.)
Contrasting topicalization particle: kunjop
Heavy topicalization particle:       xojopa
Reference-switching particle:        zunjopa
Opening parenthesis:                 cijop
Closing parenthesis:                 jejopi
Parenthetical start:                 bajopu
List separator:                      byetesye
Quote start:                         tejop
Proper expression start:             lajopaw
Spelling start:                      fijopaw
Parenthetical/quote/proper/spelling stop:
    complete                         xijopi
    incomplete                       jujopi
Valency terminator:                  jojope
Each word in the interlingua consists of zero or more optional prefixes, a single mandatory root, and zero or more optional suffixes. In this appendix, we are only concerned with the root and its components.
Each root consists of one or more root morphemes. The class of a root that has more than one root morpheme is determined by the rightmost root morpheme, and this root morpheme is referred to as the classifier.
A class can contain more than one classifier; i.e. it can contain sub-classes, and they, in turn, can also contain sub-classes. For example, the 'vertebrate' class has classifiers for 'mammals', 'birds', 'reptiles', and 'other vertebrates'.
A stand-alone classifier (i.e., one that is not modified by other root morphemes) will represent a specific member of the class, rather than the entire class. For example, the 'vertebrate' class has a 'bird' sub-class. When the 'bird' classifier "dam" is used alone, it will actually represent the particular category of birds called 'pigeon' rather than the more general meaning 'bird'. This classifier can then be modified by other root morphemes to represent other birds such as 'eagle', 'duck', 'robin', and so on. If we need to create a root representing the entire class, we will modify the classifier with the modifying root morpheme "bye", meaning 'generic member'. For example, the 'member' morpheme plus the 'bird' classifier "dam" means 'bird', and thus "byedami" can refer to any member of the class.
Classes that have sub-classes will have one final class called an 'other' class, and "bye" will be used with this classifier to represent any member of the larger class. For example, there are several 'vertebrate' sub-classes (such as the 'bird' class mentioned above), and one final class called the 'other vertebrates' class. This final class will be used for all vertebrate species for which there is not a more specific class, and the root meaning 'vertebrate' is simply "bye" plus the 'other vertebrates' classifier.
The member root morpheme "bye" will not be applied to a classifier unless the result is useful and has a counterpart in many natural languages. For example, there is a classifier for 'abstract attributes and qualities'. Since I doubt that any natural language has a single word to represent this concept, we will not create a generic word using "bye" plus this classifier.
Note that a specific member of a class does not have to represent a single species or a specific kind or type of entity. For example, there is a stand-alone root morpheme meaning 'pigeon' even though there are several species of pigeon.
The classifier root morpheme of a root is semantically and syntactically precise. However, the modifier root morphemes to the left of a classifier will provide no syntactic information at all and may not necessarily be semantically precise, but will provide a semantic clue that will help the student remember the meaning of the complete root. In other words, the root morphemes to the left of the classifier will be used for their mnemonic value to modify the classifier. The classifier, however, will always be semantically precise. For example, the root meaning "bicycle" consists of the numeric root morpheme meaning 'two' plus the 'vehicle' classifier.
Also, some modifier root morphemes can have completely different meanings in different contexts. For example, the modifier root morpheme with the meaning 'six' would be useless with most classifiers except the numeric classifier and certain shapes (such as the hexagon). In cases like this, the root morpheme can be given one or more completely different modifier meanings that will be more useful in other contexts. Even so, however, we will always try to assign multiple meanings that are at least somewhat reminiscent of or related to each other. For example, the modifier meaning 'two' will have the alternate meanings 'divided/opposition'.
A small number of classifiers have opposites. If so, the first one (i.e., the "positive" classifier) will represent the concept that is higher in magnitude or more humanly appealing. The second one (i.e., the "opposite" or "antonymic" classifier) will represent the concept that is lower in magnitude or that has less human appeal. If there is a conflict between magnitude and appeal, magnitude will have precedence. For example, 'drunk/inebriated' will use the positive classifier because it is higher in the magnitude of alcohol ingested, even though its opposite 'sober' is more positive in human appeal.
A similar approach will be used for classes that do not have opposites. In these cases, the basic, unmarked form will represent the concept that is higher in magnitude or more appealing, while the modifier "ju" will be used for the concept that is lower in magnitude or is less appealing.
Thus, the approach used here will allow an entire, easily learned vocabulary of roots to be flexibly designed using a relatively small number of root morphemes.
Now, here are the classifiers (for a complete, alphabetical list of all modifying root morphemes and their meanings, see Appendix D).
All matter, energy, and time classifiers are P-s by default:
Animals:
Vertebrates:
  Mammals:
    -tig  large grazing mammals
            Artiodactyla    camels, cattle, deer, giraffes, goats, hippos, llamas, pigs, sheep
            Perissodactyla  horses, rhinoceroses, tapirs
            Proboscidea     elephants, mammoths
          [w: goat, no corresponding modifier]
    -zov  carnivores (badgers, bears, cats, dogs, weasels, otters)
          [w: dog, modifier "zoy" = enjoy/friend]
    -cam  primates (humans, lemurs, marmosets, monkeys, tamarins, vervets)
          [w: monkey, no corresponding modifier]
    -des  sea mammals:
            Cetacea         dolphins, whales
            Sirenia         dugongs, manatees
            Pinnipedia      seals, walruses
          [w: dolphin, no corresponding modifier]
    -kup  other mammals:
            Rodentia        cavies, chinchillas, hamsters, mice, porcupines, rats
            Insectivora     hedgehogs, moles, shrews
            Hyracoidea      hyraxes
            Lagomorpha      hares, rabbits, pikas
            Edentata        anteaters, armadillos, sloths
            Pholidota       pangolins
            Tubulidentata   aardvarks
            Chiroptera      bats, flying foxes
            Dermoptera      flying lemurs
            Marsupialia     kangaroos, koalas, wombats
            Monotremata     (egg-laying mammals) echidna, platypus
            etc.
          [w: mouse, no corresponding modifier]
  -dam  all birds (ostrich, emu, kiwi, chicken, pheasant, grouse, quail, duck, swan, sparrow, lark, thrush, cardinal, crow, jay, hawk, eagle, vulture, osprey, pelican, albatross, loon, gull, penguin, owl, woodpecker, kingfisher, hummingbird, parrot, cuckoo, pigeon, crane)
        [w: pigeon/dove, modifier "da" = bird/fly/high/lift/gas/sky/blue]
  -buv  snakes (boa, cobra, viper, grass snake)
        [w: grass snake, modifier "bun" = dangerous/harm/fear]
  -feg  other reptiles (lizard, turtle, crocodile, dinosaur)
        [w: lizard, no corresponding modifier]
  -kim  amphibians (toad, newt, frog, salamander)
        [w: frog, no corresponding modifier]
  -bom  fish [class Pisces] (perch, sardine, flounder, catfish, tuna, flying fish)
        [w: herring, modifier "bo" = fish/water/liquid/swim]
  -xip  other vertebrates [classes Marsipobranchii, Selachii, Bradyodonti, etc] (shark, stingray, lamprey)
        [w: shark, no corresponding modifier]

Arthropods:
  -kag  insects (grasshopper, fly, mosquito, bee, butterfly)
        [w: bee, modifier "kay" = sweet/sugar]
  -cup  other arthropods (lobster, krill, crab, shrimp, water flea, barnacle, spider, centipede)
        [w: crab, modifier "cun" = safe/protect/armor/skin]

Other animals:
  -fas  all other animals (clam, mussel, snail, slug, squid, octopus, worm, jellyfish)
        [w: worm, modifier "fay" = animal/meat]
Plants:
-jig  trees and shrubs (pine, juniper, redwood, apple, cherry, walnut, maple, ash, elm, oak)
      [w: tree, modifier "ji" = tree/wood/paper/tall/vertical/leg]
-dop  farm, garden, and other cultivated plants (fruit, vegetable, wheat, rice, corn, pepper, parsley, carrot, beet, potato, onion, bean, squash, spinach, tomato, cabbage, eggplant, celery, asparagus, melon, strawberry) [Note that these words represent the entire plant, not just the edible portion.]
      [w: herb, modifier "don" = taste/tongue]
-bos  other plants (grass, fern, vine, weed, and the many specific varieties)
      [w: grass, modifier "boy" = plant/green/growth/leaf]
Living organs and components of plants and animals:
-ces  animal organs and regions of the animal body (meat, liver, heart, pancreas, gland, muscle, tendon, bladder, brain, vocal cords, hand, head, chest, tail, eye, chin, ear, abdomen, body, skin, egg)
      [w: head, modifier "cen" = top/summit/cap]
-bem  other organs and body parts - mostly plant parts and regions (vegetable, organ, fruit, seed, leaf, branch, nut, berry, root, cell, rind/skin, mitochondrion, tuber). For the edible parts of specific plants (such as "apple"), the name of the whole plant (such as "apple tree") will be formed with the same modifier(s) plus the appropriate classifier.
      [w: organ/body-part, modifier "bye" = membership/constituency/partitive/ordinal]
Illnesses & diseases:
-cim all illnesses & diseases & other bodily states (illness, disease, flu, malaria, diabetes, claustrophobia, upset stomach, cancer, hot or cold) [w: cold/rheum, no corresponding modifier]
Wounds & Growths:
-fav all wounds & growths (blister/boil, scab, ulcer, tumor, sore, rash) [w: sore/ulcer, modifier "faw" = negative/loss/cancel/minus]
Other living matter & energy:
-kuv all other living matter & energy (organism/lifeform/living thing, virus, plus all members of kingdoms Fungi, Monera, and Protista: bacteria, amoeba, microbe, germ, algae, spirochete, protozoans, slime mold, fungus, wort, mushroom) [w: unassigned, no corresponding modifier]
Non-living, natural matter & energy:
-fep all non-living phenomena (tornado, rainbow, flood, blizzard, climate, snow(fall), typhoon, (weather) front, wind, storm, cloud, hot spring, earthquake, volcanic eruption, mudslide, fire/blaze, spark, waterfall, sunspot, star/sun) [A word such as "hot spring" could also be implemented as a locative. However, since its most salient feature is its implied energy, it will instead be derived using this class.] [w: fire/blaze, modifier "fe" = hot/anger/fire/yellow/spicy]
Non-living, artificial matter & energy:
-tim vehicles (boat, canoe, catamaran, rowboat, ship, ocean liner, submarine, sailboat, raft, automobile, bulldozer, airplane, locomotive, rickshaw, bicycle, truck, train) [w: car/automobile, modifier "ti" = fast/agile/move/go] -fop weapons and explosive devices (rifle, shotgun, missile, cannon, bomb, bullet, bazooka, machine gun) [w: gun, modifier "foy" = loud/sound/hearing] -ceg electrical components which (typically) modify or transform energy (resistor, transistor, light bulb, transducer, speaker, capacitor, keyboard, anode, bullhorn, battery) [w: (electrical) switch, modifier "ce" = electric/shock/strike] -bis other powered and power-producing devices (windmill, jacuzzi, oscilloscope, generator, lamp, transmitter, refrigerator, telephone, washing machine, toilet, turbine, computer, clock/watch, walkie-talkie, television, music synthesizer, organ, camera, jackhammer, lathe, drill (press), circular saw, lawnmower) [w: toilet, modifier "bin" = mechanical/technical/complex]
Natural substance:
-civ elements and compounds (hydrogen, oxygen, sodium, chlorine, uranium, sodium chloride, potassium sulfate, biochemicals (including drugs), insulin, DNA, nucleotide, amino acid, methane, butanol, polybutadiene, benzoic acid, chlorobenzene, dimethylamine) [w: (chemical) element, modifier "cin" = chemical/clean/white] -zop plant/animal substances and mixtures (blubber, frankincense, beeswax, beef, honey, blood, wood, marrow, milk, feces, coral, tears, spit/spittle, urine, bark) [w: blood, modifier "zo" = red/blood/combat/violence] -jav other natural substances (air, coal, soil, clay, bauxite, dust, sand, ore, ruby, snow, gypsum, poison) [w: sand, modifier "jay" = sand/fine/filter/dig/deep]
Natural location:
-biv bodies of water (river, bay, swamp, puddle, ocean, lake) [w: lake, modifier "bi" = maximal polarity] -tis other natural locations (cave, cliff, island, mountain, desert, forest, beach, continent, peninsula, planet, pole (e.g. 'North Pole'), sky, outer space, galaxy) [w: land/ground, modifier "tin" = land/location/earth/natural/wild]
Natural other:
-tev organic things and things made by animals and plants (shell, claw, tooth, hair (strand), feather, bone, thorn, web, burrow, nest, beehive, anthill) Do not confuse with the living organ classifiers "-ces" and "-bem". [w: bone, modifier "twe" = arm/branch/extension/lever/crane] -xam other natural things, all inorganic (drop (eg. of water), boulder, stalagmite, gem, snowflake, atom, molecule, subatomic particles such as neutron and electron) [w: rock/stone, no corresponding modifier]
Artificial substance:
This section contains substances that are explicitly man-made and which cannot occur in nature without some processing. All other substances (including all drugs and other relatively pure chemical compounds) should be considered natural, even if they do not occur naturally on Earth.
-fup processed food substance (food, gravy, mustard, spices and herbs, cheese, beer, sugar, flour, vegetable oil, coffee, (table) salt, syrup, broth, medicine, alcohol/spirits, soda pop, milkshake) [w: food/victuals, modifier "fu" = food/eat/stomach] -cap other artificial substance (steel, alloy, paint, cloth, soap, glue, ink, gasoline, plywood, salve, glass, paper, cement, antiseptic, gunpowder) [w: steel, modifier "cay" = hard/tough/cold/solid]
Artificial location:
-dep sections of buildings and similar supporting or enclosing structures or artificial plots of land (garage, room, pantry, kitchen, bedroom, parlor, attic, hallway, (prison/jail) cell, apartment, closet, basement/cellar, porch, balcony, story/floor, hold, bleacher, platform/stage, gallows, wharf, staircase) [w: room/chamber, modifier "de" = flat/two-dimensional] -kig buildings and groups of buildings, including places of business (house, stadium, skyscraper, library, shed, farm, ranch, school, university, factory, refinery, prison, military base, bakery, restaurant, hotel, shopping mall, zoo, doctor's office, bazaar/marketplace, gas station, museum, hospital, tent) [w: house, modifier "kwi" = building/home/reside/build/construct] -dug large, typically political areas (nation, city, county, region/district, enclave, colony, village, suburb, state/province, empire) Use classifier "-jag" for specific government types such as "monarchy", "republic", etc. The word "empire" will appear in both classes, even though English does not differentiate between the government type and the political location. [w: town, modifier "du" = government/nation/politics] -zeg other artificial locations - typically infrastructure (park, canal, dike, road, trail, reservoir, plaza, interchange, parking lot, garden, patio, yard, border, bridge) [w: road, modifier "ze" = long/road/walk/line/string/hair]
Artificial other:
-jos food items (cocktail, pizza, lollipop, steak, TV dinner, egg roll, taco, sandwich, cake, bread loaf, pie, ham [the whole ham, not just the meat], sushi) [w: bread(loaf), modifier "joy" = healthy/functional/ready]
-fus non-powered instruments and tools, including hand weapons (club, spear, bow, arrow, dagger, sword, pike, battle-ax, quarterstaff, tool, knife, hammer, key, glasses/spectacles, scissors, pencil, telescope, brush, broom, fork, strainer/colander, whisk, ladle, thermometer, ruler, scale, compass, level, calipers, gauge, hourglass, abacus, sponge, flute, violin, horn, guitar, piano, bell) [w: knife, modifier "fyu" = cut/slice/sharp]
-tav clothing and related items (shirt, hat, shoe, sleeping bag, coat, tie, belt, sleeve, collar, handkerchief, towel) [w: shirt/blouse, modifier "twa" = clothing/shirt/thick/wide/chest]
-jis furniture (chair, table, bed, sofa, bookcase, desk, tripod, ladder, shelf) [w: chair, modifier "jin" = folded/sit/chair/angle]
-fim other informative/artistic/entertaining/social items (map, book, CD-ROM, painting, statue, tombstone, flag, encyclopedia, toy, coin, award/prize, photograph, money, letter/missive, degree/diploma, work of art, flagpole) [w: book, modifier "fi" = eight/writing]
-zip containers and conduits (box, tank, bottle, basket, bucket, sack, case, suitcase, aquarium, fireplace, trunk (of a car), sink, cup, bowl, pan, bathtub, drawer, cage, coffin, hose, pipe, tube, faucet, gutter, airduct, chimney, tailpipe) [w: container/vessel, modifier "zi" = contain/inside/meaning/hold/hand]
-fig separators/barriers (mat, fence, carpet, curtain, door, window, floor/ceiling, pane (of glass/plastic/etc), roof) [w: wall, modifier "fin" = closed/obstructed/disallowed/stopped]
-dag components - integral or essential components of larger items that do not fit into any of the above categories (nail, shoelace, button (on clothing), plug/stopper, cap, hinge, anchor, spring, zipper, shingle, knob, wheel, tile, rudder, trigger, handrail, (push)button, flywheel) [w: nail, modifier "caw" = straight/unbent/smooth/calm]
-zum other artificial items (pin, hook, clip, (clothes) hanger, string, chain, ball, rope) [w: pin, modifier "zu" = minimal polarity]
Living energy (i.e., supernatural and primarily non-physical):
-dev all living energy (I was going to have two categories, religious and mythological, but I figured that it might cause nasty arguments.) [w: spirit/soul, modifier "den" = religious/spiritual/supernatural]
Non-living energy:
-cug count nouns (sunray, thunderclap, lightning bolt, flash (of light)) [w: ray/beam, no corresponding modifier]
-xog other non-living energy - mass nouns (sunshine, thunder, lightning, electricity, hydropower, force, heat, sound) [w: power/energy rate, modifier "xo" = energy/power/strong/leadership]
Time:
-dus point in time (midnight, sunrise, 6 o'clock) [w: (clock) time, modifier "dun" = bound/restrained/certain/exact polarity]
-fem other times - periods of time (summer, morning, fall, equinox, childhood, season, monsoon, cycle, birthday, anniversary) [w: season/time of year, modifier "fen" = free/unrestrained/uncertain/approximate polarity]
Abstract nouns (all are P-s by default unless stated otherwise):
Groups/organizations (including government types):
-jag all groups/organizations (parliament, republic, platoon, bureaucracy, army, kingdom, political party, senate, theocracy, government, jury, corporation, faculty, commune, partnership, sorority, business, cartel, construction company, union, law firm, restaurant chain, club, choir, circle, clan, community, congregation, sect, team/crew, department, organization, parish, gang, caste, brotherhood) [w: business/firm, modifier "ja" = business/money/finance]
Members of groups, including ranks and titles:
-beg all members of groups (president, congressman, soldier, officer, cadet, astronaut, politician, nobleman, ambassador, colonel, duchess, policeman, judge, dictator, carpenter, musicologist, physician, teacher, electrician, farmer, journeyman, mathematician, biologist, musician, salesman, actor, linguist, plumber, scholar, gymnast, gardener, chess player, hobbyist, person, worker/laborer, prisoner, professional, student, pilot, fireman, polyglot, adult) [w: person/individual, modifier "be" = person/thinking/intelligent]
Professions, occupations, fields of study or endeavor, and activities, including ideologies and schools of thought:
-jep all professions, endeavors, and activities (politics, military science, engineering, acting/show business, farming, carpentry, linguistics, history, mathematics, plumbing, science, writing, gambling, debate, soccer, gymnastics, chess, gardening, hobby, divination, ministry/priesthood, profession/occupation, democracy, communism, religion, behaviorism). [w: profession/occupation, modifier "jen" = professional/expert/skillful]
Performances, components, and attributes:
[Since all members of this group are the products of human activities, they can be focused, and the focus will elaborate the associated action or activity. Examples: "circle of fire", "law of the jungle", "gallon of ice cream", "game of chess", "legality of an act", "words of sorrow", "promotion to captain", etc.]
-jum symbols and shapes (letter/alphabet, squiggle, note (musical), swastika, comma, minus-sign, number, parenthesis, caduceus, exclamation point, circle, line, triangle, ellipse, hexagon, sphere, dome, rectangle, point, disc, polyhedron, cylinder, sheet, digit) [Orientations, such as 'vertical' and 'flat' will be included in the binary non-relational class.] [w: symbol/emblem, modifier "jun" = shape/symbol/face]
-jam protocols, paradigms, formalisms, programs, designs, and systems (language, creole, dialect, program, design, protocol, code, plan/scheme, recipe, score/music, rule, law, contract, instruction/direction, standard, treaty, curriculum, custom/more) [w: law/statute, modifier "jan" = law/justice/court/civilization]
-tov measures - P/F-s by default (meter, acre, gallon, gram, ton, second, century, radian, dollar, joule, newton, hertz, watt, ampere, ohm) [w: day, modifier "toy" = duration/temporal/long-lasting]
-zug complete performances (poem, song, opera, symphony, novel, war, job/task, lecture, tournament, game, speech, research, fieldwork, meal, battle, movie, accident, recession/depression) [w: show/performance, modifier "zun" = performance/actor/substitute/similar/compare]
-xas abstract attributes/qualities of performances and their components - P-s by default (legal/legality, rhyming/rhyme, musical/music, sovereign/sovereignty, having pitch/pitch, easy/ease, efficient/efficiency, perfect/perfection, complex/complexity, automatic/automation, secret/secrecy, accurate/accuracy, successful/success) WARNING!: Be careful not to confuse this class with modal concepts. [w: easy, modifier "xa" = easy/habitual/common/male]
-bes opposites of abstract attributes/qualities of performances and their components - P-s by default (illegal/illegality, arrhythmic/arrhythm, discordant/discord, difficult/difficulty, inefficient/inefficiency, imperfect/imperfection, simple/simplicity, open/openness, inaccurate/inaccuracy, unsuccessful/failure) WARNING!: Be careful not to confuse this class with modal concepts. [w: difficult, modifier "ben" = arithmetic/abstract/difficult (see also "-bel")]
-kus other abstract concepts - typically components or sections of a performance (problem, enigma, stanza, scene, lap, movement, chapter, climax, word, morpheme, subroutine, lesson, promotion/demotion, concept, equation) [w: problem/puzzle, modifier "kun" = dark/asleep/black/secret/unexpected]
Actions:
-teg speech acts, default = A/P/F-d (tell, shout, ask, explain, flatter, lie, mock, offer, thank, curse, congratulate, recommend, decree) [w: tell, modifier "te" = language/mouth/message/word]
-cal activities, default = AP-s (go, smoke, eat, ski, swim, walk, work, study, sing, bark) [w: try/attempt, modifier "ca" = good/want/purpose]
-kas involuntary acts, default = P-d (sneeze, blink, laugh, trip/stumble, blush, drool, burp, hiccup, cry/sob, bleed, smile, burst) [w: shiver/shake, modifier "kaw" = jump/spring/back-and-forth/iterate]
-bus other acts, default = A/P-d (do something to, push, tickle, betray, spill, kick, throw, catch, drop, punish, drag, bring, grasp/grab, pick up, put down, build, manipulate, spit on, choose/select, cancel) [w: do something to/affect, modifier "bu" = work/deed/project]
Scalar relational states:
-kop mental states, default = P/F-s (fear, be angry, love, want, be happy, be eager, like/enjoy, be greedy, be emotional, have fun, know, remember, be conceited, understand, wonder, be intent/focused on, imagine) [w: know, modifier "ko" = knowledge/wisdom/correct]
-dum opposite of mental states, default = P/F-s (hate, be sad, be reluctant, dislike, be generous) [w: be unaware of, no corresponding modifier]
-kiv physical relations, default = P/F-s (hear, be allergic to, taste, feel pain in, detect (eg. an instrument)/sense, be hungry for) [w: see, modifier "ki" = see/light/evident/show]
-xum other scalar relationships, default = P/F-s (about/involved with, similar to, taste like, compatible with, ready to/for, being by nature/inherently, resemble/look like) [w: of/about/have something to do with, modifier "xu" = connect/intermediary/neck/between]
-bim opposite of other scalar relationships, default = P/F-s (uninvolved with, dissimilar to, incompatible with, not ready to/for) [w: have nothing to do with/uninvolved with, no corresponding modifier]
Scalar non-relational states, default = P-s:
-cul measurable physical states with corresponding, commonly used, measure words, (hot, tall, heavy, thick, fast, old/aged, expensive, high, long, late, loud, deep) [w: heavy/weighty, modifier "cu" = low/under/on/support/foot/heavy] -del antonyms of measurable physical states (cold, short, light, thin, slow, young, cheap, low, short, early, quiet, soft, shallow) [w: light/low in weight, no corresponding modifier] -kem other scalar non-relational states - typically vague or subjective (big, wet, sharp, strong, light/lit, sunny, smooth, common, sweet, good, attractive, fragrant, normal, wise, friendly, talkative/garrulous, wealthy, tasty, convenient) [w: big/large, modifier "ke" = high polarity/big] -fom antonyms of other scalar non-relational states (small, dry, blunt, weak, dark/unlit, cloudy/overcast(?), rough, rare, sour, bad, ugly, smelly, abnormal, foolish, unfriendly, quiet/reticent, poor, bad-tasting, inconvenient) [w: small, modifier "fo" = small/low polarity]
Binary relational states:
-tas kinship relations, default = P/F-s (mother, cousin, grandchild, uncle, brother) [w: relative/kinsman, modifier "tan" = kinship/family/trust]
-zev social/economic/political/etc relations, excluding kinship, default = P/F-s (friend, colleague, acquaintance, employee, enemy, member, guest, representative, assistant) [w: assistant/aide, modifier "zen" = involvement/together/help]
-dog exchange and transfer verbs, default = AP/F-d (buy/sell, borrow/lend, swap/exchange, invest in, donate, export/import, confiscate/commandeer, steal) [w: exchange/swap, modifier "doy" = open/unobstructed/allowed/exchange]
-cem opposite of exchange and transfer verbs, default = AP/F-d (buy/sell, borrow/lend, swap/exchange, invest in, donate, export/import, confiscate/commandeer, steal) [Convention: "-dog" is used when the agent gains possession of the focus, and "-cem" is used when the agent loses possession of the focus.] [w: pass/hand (over), no corresponding modifier]
-bel arithmetic functions, default = P/F-s (addition, square root, logarithm, cosine, reciprocal, integral) [w: multiplication, modifier "ben" = arithmetic/abstract/difficult (see also "-bes")]
-zog relational locatives, default = P/F-s (at/in/on, between, above, to the left of, to the north of, upstairs/downstairs to) [w: at/in/on, no corresponding modifier]
-fag opposites of relational locatives, default = P/F-s (away from, under, to the right of, to the south of) [w: away from, modifier "fa" = away/distant/foreign]
[Compound locatives can be formed by modifying the base relation by the previous-word modifier (PWM) form of the other orientation(s) or location(s). For example, to create a case tag with the meaning 'on the inner surface of', modify the case tag meaning 'on (the surface of)' by the PWM form of the word meaning 'in(side)'. To indicate the names of the locations or orientations, use the suffix "-aym".]
-dap other binary relational states, default = P/F-s (be a part/constituent of, own, be equal to, be a substitute for, be full of, mean/signify, provide/be the source of, be the consequence of, connected/linked/joined to, owe/be in debt for, be a kind/type of, to be a member of species F) [w: be (equal to), modifier "day" = equal/same/fair] -fes opposite of other binary relational states, default = P/F-s (be different from/not equal to, be empty of) [w: differ/be different from, no corresponding modifier]
Binary non-relational states, default = P-s:
-zig colors (red, blue, green, turquoise, magenta, colored) [w: colored/having color, modifier "zin" = color] -kep attributes of living entities (alive, pregnant, gendered/sexual, sober, standing/upright, healthy, sitting, growing, intelligent, instinctive, with others/not alone) [w: alive, modifier "kwe" = life/alive] -kol opposites of attributes of living entities (dead, non-pregnant, non-gendered/asexual, inebriated, prone/lying down, sick/ill, kneeling, non-intelligent, (a)lone/by oneself) [w: dead, modifier "koy" = death/inactive/sorrow] -kav other binary non-relational states (open, real/existent, straight, exposed, vertical, solid, authentic, clean, functional/operational, natural, clear/transparent, colorful, whole/complete, rising, pure/unadulterated, free/unrestrained, safe) [w: real/actual, modifier "ka" = reality/truth/inherent] -juv opposites of other binary non-relational states (closed, imaginary, crooked, hidden, horizontal, hollow, fake, dirty, dysfunctional/broken, artificial, opaque, colorless, partial/incomplete, falling, impure/adulterated, bound/restrained, dangerous) [Concepts that do not have true opposites, such as 'written', 'material/matter', and 'translucent', can use either of the two opposite classifiers.] [w: imaginary, modifier "ju" = zero/not/opposite/imaginary/0% polarity]
Deictic classifiers:
The deictic classifiers are formed by combining the modifiers for numbers 1-7 plus 'v' for personal pronouns, 'm' for possessives, 's' for demonstratives, 'l' for locative deictics, and 'p' for temporal deictics. Ex: "bavi" = 'I/me', "bamo" = 'my', "baso" = 'this/these', "bale" = 'here', "bape" = 'now', "xepe" = 'earlier', etc. In effect, each deictic root has its own classifier. Deictic classifiers are all P-s by default.
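The combination just described can be illustrated with a minimal Python sketch. It takes the number modifiers from the modifier list ("ba" = one through "tay" = seven) and the five consonants named above, and checks the result against the example words in the paragraph. The function name `deictic_classifier` and the dictionary names are hypothetical. The sketch also assumes that the parts are joined by simple concatenation (which the examples with "ba" and "xe" support, but which is only an assumption for the other modifiers), and it makes no attempt to explain the final vowel of each word.

```python
# Number modifiers 1-7, as given in the list of modifying root morphemes.
NUMBER_MODIFIERS = {1: "ba", 2: "xe", 3: "di", 4: "co", 5: "tu", 6: "za", 7: "tay"}

# Consonants that select the kind of deictic, as stated in the text.
DEICTIC_CONSONANTS = {
    "personal pronoun": "v",
    "possessive": "m",
    "demonstrative": "s",
    "locative": "l",
    "temporal": "p",
}


def deictic_classifier(number, kind):
    """Join a number modifier and a deictic consonant (assumed: plain concatenation)."""
    return NUMBER_MODIFIERS[number] + DEICTIC_CONSONANTS[kind]


# The example words from the text, with the number and kind they appear to use.
EXAMPLES = {
    "bavi": (1, "personal pronoun"),
    "bamo": (1, "possessive"),
    "baso": (1, "demonstrative"),
    "bale": (1, "locative"),
    "bape": (1, "temporal"),
    "xepe": (2, "temporal"),
}

if __name__ == "__main__":
    for word, (number, kind) in EXAMPLES.items():
        stem = deictic_classifier(number, kind)
        # Every example word should begin with the stem built above.
        assert word.startswith(stem), (word, stem)
        print(f"{word}: stem {stem!r} + ending {word[len(stem):]!r}")
```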
Tense classifiers (all are P/F-s disjuncts):
-cip past tense [w: past perfect, modifier "ci" = past/before/pre/early/start/birth/contingency]
-das present tense [w: present imperfect, modifier "daw" = present/now/recent]
-jev future tense [w: future perfect, modifier "je" = future/after/post/late/finish]
-kom past+present tense [w: past+present perfect, no corresponding modifier]
-xus present+future tense [w: present+future perfect, no corresponding modifier]
-bul unspecified tense [w: unspecified imperfect, modifier "byu" = temporary]
Modal classifiers (all are disjuncts):
-tam epistemic probability (default AP/F-s) [w: may (possibility), modifier "ta" = possible/potential/almost polarity]
-jeg epistemic evidentiality
-cav epistemic inevitability
-zim epistemic acceptability
-dup epistemic significance
-fug epistemic hedge
-xiv epistemic counterfactuality
-top epistemic reasonableness
-dov deontic obligation (default P/F-s) [w: can/may (option), modifier "do" = vote/opinion/choice/female]
-kes deontic necessity
-zul deontic consequentiality
Other classifiers:
-xim genitive/relative clause linker (default = F/P-s) [w: of/that, modifier "xi" = own/possess/enough/sufficient polarity] -tes true conjunctions (and, or, but, if, default = P/F-s) [w: and, no corresponding modifier] -kum numerics, (one, 7.23E-5, seventh, three-fourths, three-at-a-time, default = P-s) [w: some/any (quantity), modifier "ku" = number/count/group] -tom true generic root (default = P-s) [w: a/an/some, modifier "to" = indefinite/nonspecific/general] -jop particles (modifying root morpheme indicates default) [w: unassigned, no corresponding modifier]
Here is a list of all the modifying root morphemes of the interlingua in alphabetical order. If the modifier is formed from a classifier, then the classifier, the meaning of the stand-alone word (i.e., without modifiers), and the description of the class are listed in parentheses. Note that many potential root morphemes are undefined and are reserved for future use.
ba one/unity/harmony/child/offspring
baw especially/particularly polarity
bay push/press/give/exit
be person/thinking/intelligent (-beg = 'person/individual' = all members of groups)
ben arithmetic/abstract/difficult (from both -bel = 'multiplication' = arithmetic functions and -bes = 'difficult' = opposites of abstract attributes/qualities of performances)
bi maximal polarity (-biv = 'lake' = bodies of water)
bin mechanical/technical/complex (-bis = 'toilet' = other powered and power-producing devices)
bo fish/water/liquid/swim (-bom = 'herring' = fish)
bon insufficient polarity
boy plant/green/growth/leaf/numeric decimal point (-bos = 'grass' = other plants)
bu work/deed/project (-bus = 'do something to/affect' = other acts)
bun dangerous/harm/fear (-buv = 'grass snake' = snakes)
bye membership/constituency/partitive/ordinal (-bem = 'organ/body-part' = other organs and body parts - mostly plant parts and regions)
byu temporary (-bul = 'unspecified imperfect' = unspecified tense)
ca good/want/purpose (-cal = 'try/attempt' = activities)
can average polarity
caw straight/unbent/smooth/calm (-dag = 'nail' = components - integral or essential components of larger items)
cay hard/tough/cold/solid (-cap = 'steel' = other artificial substance)
ce electric/shock/strike (-ceg = '(electrical) switch' = electrical components)
cen top/summit/cap (-ces = 'head' = animal organs and regions of the animal body)
ci past/before/pre/early/start/birth/contingency (-cip = 'past perfect' = past tense)
cin chemical/clean/white (-civ = '(chemical) element' = elements and compounds)
co four
coy just/only polarity
cu low/under/on/support/foot/heavy (-cul = 'heavy/weighty' = measurable physical states)
cun safe/protect/armor/skin (-cup = 'crab' = other arthropods)
da bird/fly/high/lift/gas/sky/blue/numeric Nth in importance/resumptive aspect (-dam = 'pigeon/dove' = all birds)
dan positive/gain/extra polarity
daw present/now/recent (-das = 'present imperfect' = present tense)
day equal/same/fair (-dap = 'be (equal to)' = other binary relational states)
de flat/two-dimensional (-dep = 'room/chamber' = sections of buildings and similar supporting or enclosing structures or artificial plots of land)
den religious/spiritual/supernatural (-dev = 'spirit/soul' = all living energy)
di three
do vote/opinion/choice/female (-dov = 'can/may (option)' = deontic obligation modality)
don taste/tongue (-dop = 'herb' = farm, garden, and other cultivated plants)
doy open/unobstructed/allowed/exchange (-dog = 'exchange/swap' = exchange and transfer verbs)
du government/nation/politics (-dug = 'town' = large, typically political areas)
dun bound/restrained/certain/exact polarity (-dus = '(clock) time' = point in time)
fa away/distant/foreign (-fag = 'away from' = opposites of relational locatives)
faw negative/loss/cancel/minus (-fav = 'sore/ulcer' = all wounds & growths)
fay animal/meat (-fas = 'worm' = all other animals)
fe hot/anger/fire/yellow/spicy (-fep = 'fire/blaze' = all non-living phenomena)
fen free/unrestrained/uncertain/approximate polarity (-fem = 'season/time of year' = other times - periods of time)
fi eight/writing (-fim = 'book' = other informative/artistic/entertaining/social items)
fin closed/obstructed/disallowed/stopped (-fig = 'wall' = separators/barriers)
fo small/low polarity (-fom = 'small' = antonyms of other scalar non-relational states)
foy loud/sound/hearing (-fop = 'gun' = weapons and explosive devices)
fu food/eat/stomach/numeric 'real/imaginary' number (-fup = 'food/victuals' = processed food substance)
fyu cut/slice/sharp (-fus = 'knife' = non-powered instruments and tools, including hand weapons)
ja business/money/finance (-jag = 'business/firm' = all groups/organizations)
jan law/justice/court/civilization/purple (-jam = 'law/statute' = protocols, paradigms, formalisms, programs, designs, and systems)
jay sand/fine/filter/dig/deep (-jav = 'sand' = other natural substances)
je future/after/post/late/finish (-jev = 'future perfect' = future tense)
jen professional/expert/skillful (-jep = 'profession/occupation' = all professions, endeavors, and activities)
ji tree/wood/paper/tall/vertical/leg (-jig = 'tree' = trees and shrubs)
jin folded/sit/chair/angle (-jis = 'chair' = furniture)
jo round/rotate/fat/full
joy healthy/functional/ready (-jos = 'bread(loaf)' = food items)
ju zero/not/opposite/imaginary/0% polarity (-juv = 'imaginary' = opposites of other binary non-relational states)
jun shape/symbol/face (-jum = 'symbol/emblem' = symbols and shapes)
ka reality/truth/inherent (-kav = 'real/actual' = other binary non-relational states)
kaw jump/spring/back-and-forth/iterate (-kas = 'shiver/shake' = involuntary acts)
kay sweet/sugar (-kag = 'bee' = insects)
ke high polarity/big (-kem = 'big/large' = other scalar non-relational states - typically vague or subjective)
ken thin/narrow/finger/touch
ki see/light/evident/show (-kiv = 'see' = physical relations)
kin excessive polarity
ko nine/knowledge/wisdom/correct (-kop = 'know' = mental states)
koy death/inactive/sorrow (-kol = 'dead' = opposites of attributes of living entities)
ku number/count/group (-kum = 'some/any (quantity)' = numerics)
kun dark/asleep/black/secret/unexpected (-kus = 'problem/puzzle' = other abstract concepts - typically components or sections of a performance)
kwa pull/receive/collect/enter
kwe life/alive (-kep = 'alive' = attributes of living entities)
kwi building/home/reside/build/construct (-kig = 'house' = buildings and groups of buildings, including places of business)
ta possible/potential/almost polarity (-tam = 'may (possibility)' = epistemic probability modality)
tan kinship/family/trust (-tas = 'relative/kinsman' = kinship relations)
tay seven/adopted
te language/mouth/message/word (-teg = 'tell' = speech acts)
ti fast/agile/move/go (-tim = 'car/automobile' = vehicles)
tin land/location/earth/natural/wild/numeric fraction (-tis = 'land/ground' = other natural locations)
to indefinite/nonspecific/general (-tom = 'a/an/some' = true generic root)
toy duration/temporal/long-lasting (-tov = 'day' = measures)
tu five/orange
twa clothing/shirt/thick/wide/chest/numeric negative exponent (-tav = 'shirt/blouse' = clothing and related items)
twe arm/branch/extension/lever/crane (-tev = 'bone' = other natural things, organic things and things made by animals)
xa easy/habitual/common/male (-xas = 'easy' = abstract attributes/qualities of performances and their components)
xe two/divided/opposition
xi own/possess/enough/sufficient polarity (-xim = 'of/that' = genitive/relative clause linker)
xo energy/power/strong/leadership/numeric exponent (-xog = 'power/energy rate' = other non-living energy - mass nouns)
xu connect/intermediary/neck/between (-xum = 'of/about/have something to do with' = other scalar relationships)
za six
ze long/road/walk/line/string/hair (-zeg = 'road' = other artificial locations - typically infrastructure)
zen involvement/together/help/numeric N-tuple (-zev = 'assistant/aide' = social/economic/political/etc relations)
zi contain/inside/meaning/hold/hand (-zip = 'container/vessel' = containers and conduits)
zin color (-zig = 'colored/having color' = colors)
zo red/blood/combat/violence (-zop = 'blood' = plant/animal substances and mixtures)
zon old/slow/dense/clumsy/immobile
zoy enjoy/friend (-zov = 'dog' = carnivores)
zu minimal polarity (-zum = 'pin' = other artificial items)
zun performance/actor/substitute/similar/compare (-zug = 'show/performance' = complete performances)
Throughout this monograph, I have generally used English word order in my examples to make them easier for the English-speaking reader to understand. However, from the start, I have always intended the language to be purely right-branching (i.e. VSO). Right-branching languages are inherently easier to parse for both computers and humans.
Here is a complete listing of the production rules and general rules of the interlingua:
Production rules:
| = logical 'or'
() = enclosed item is optional
{} = enclosed item may appear zero or more times

sentence ::= (topic) clause | vocative-noun-phrase
topic ::= topic-particle argument
topic-particle ::= heavy-topicalization-particle | reference-switching-particle
clause ::= {disjunct} verb {argument} (valency-terminator)
argument ::= core-argument | oblique-argument
core-argument ::= expression
oblique-argument ::= adverb | case-tag expression
expression ::= noun-phrase | clause
noun-phrase ::= noun (noun-modifier) | open-noun (noun-modifier) expression
noun-modifier ::= {light-modifier | heavy-modifier}
light-modifier ::= adjective
heavy-modifier ::= open-adjective expression
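As a reading aid, the production rules can be written down as data and checked mechanically. The Python sketch below is only an illustration under stated assumptions: the names `RULES`, `lexical_categories`, and `reachable` are hypothetical, and any symbol that is used but never defined in the listing (including "vocative-noun-phrase") is simply treated as a word-level category supplied by the lexicon.

```python
import re

# The production rules, copied from the listing (left side -> right side).
RULES = {
    "sentence": "(topic) clause | vocative-noun-phrase",
    "topic": "topic-particle argument",
    "topic-particle": "heavy-topicalization-particle | reference-switching-particle",
    "clause": "{disjunct} verb {argument} (valency-terminator)",
    "argument": "core-argument | oblique-argument",
    "core-argument": "expression",
    "oblique-argument": "adverb | case-tag expression",
    "expression": "noun-phrase | clause",
    "noun-phrase": "noun (noun-modifier) | open-noun (noun-modifier) expression",
    "noun-modifier": "{light-modifier | heavy-modifier}",
    "light-modifier": "adjective",
    "heavy-modifier": "open-adjective expression",
}

# A grammar symbol is a run of lowercase words joined by hyphens.
SYMBOL = re.compile(r"[a-z]+(?:-[a-z]+)*")


def lexical_categories(rules):
    """Symbols that are used on a right side but never defined by a rule."""
    used = {sym for rhs in rules.values() for sym in SYMBOL.findall(rhs)}
    return sorted(used - set(rules))


def reachable(rules, start="sentence"):
    """All symbols that can be reached from the start symbol."""
    seen, todo = set(), [start]
    while todo:
        sym = todo.pop()
        if sym in seen:
            continue
        seen.add(sym)
        if sym in rules:
            todo.extend(SYMBOL.findall(rules[sym]))
    return seen


if __name__ == "__main__":
    print("lexical categories:", ", ".join(lexical_categories(RULES)))
    print("rules not reachable from 'sentence':", set(RULES) - reachable(RULES))
```

Under these assumptions, every rule is reachable from "sentence", and the undefined symbols are exactly the word classes (verb, noun, open-noun, adjective, and so on) that the parser has to obtain from the dictionary.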
General rules:
Previous-word Modifier Rule: Any word except a conjunction or a delimiting particle may be immediately followed by a previous-word modifier (pwm). Here is the syntax:
pwm ::= {light-previous-word-modifier | heavy-previous-word-modifier}
light-previous-word-modifier ::= previous-word-modifier
heavy-previous-word-modifier ::= open-previous-word-modifier expression
The scope of a pwm will be the preceding word plus any of the arguments or modifiers of the preceding word (i.e., the pwm immediately follows the headword of the item that it applies to).
Coordination Rule: For the purpose of coordination, the following are considered "constituents": sentence, clause, noun phrase, heavy-modifier, adjective, adverb/case tag, and pwm. Any constituent "X" may be replaced by a coordinated constituent "X" of the same type, as follows:
X ::= (coordination-initiator) X {coordinating-conjunction X} (coordination-terminator)
Note that explicit coordination of adjectives using the word meaning 'and' is allowed but optional: the meaning is the same whether or not 'and' is used. For example, "dog big black", "dog black big", and "dog black and big" are all equivalent to English "big black dog". Similarly, "house red green" means 'the green and red house'. It is not possible for disjuncts to be coordinated, although more than one may occur in sequence. This restriction must exist to prevent a constituent from having two or more heads. For the same reason, verbs also cannot be coordinated (e.g. "*He washed and polished the car."), although entire clauses may be coordinated (e.g. "He washed the car and he polished it.").
Parsing Rule: When one constituent is embedded inside another, the parser will not exit the current level of embedding until all syntactically acceptable constituents have been parsed; i.e., it will leave its current level only when it encounters a constituent that violates the syntax for the current level. The semantics of the construction will never be a consideration.
For example, in a sentence such as "verb1 noun verb2 noun oblique1 oblique2" where verb2 takes a single core argument, both oblique arguments are arguments of verb2. Since there will be cases in which it is necessary to prematurely terminate the argument structure of an embedded verb, we will need a particle to perform this function. In the interlingua, we will use "jojope" for this purpose, and refer to it as a valency terminator. For example, in "verb1 noun verb2 noun oblique1 jojope oblique2", "oblique1" is an argument of "verb2" while "oblique2" is an argument of "verb1". Similarly, if there are more core arguments present than are allowed by the argument structure of the verb, then parsing at that level will stop and the additional arguments will be available at the higher level. For example, in "verb1 verb2 noun1 adverb1 noun2", if "verb2" is P-d, then both "noun1" and "adverb1" are arguments of "verb2", while "noun2" must be an argument of "verb1". Again, use of the valency terminator "jojope" can override the default parse.
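The three token sequences above can be traced with a small Python sketch of the parsing rule. It is a simplified illustration, not the actual parser: the names `kind`, `parse_clause`, and `parse` are hypothetical, token categories are guessed from the placeholder names used in the examples, each oblique argument is assumed to be a single token, and the maximum number of core arguments for each verb is supplied in a table (with "P-d" taken to allow one core argument, as the third example implies).

```python
VALENCY_TERMINATOR = "jojope"


def kind(token):
    """Classify a placeholder token by its name (an assumption of this sketch)."""
    if token == VALENCY_TERMINATOR:
        return "terminator"
    if token.startswith("verb"):
        return "verb"
    if token.startswith(("oblique", "adverb")):
        return "oblique"
    return "core"  # e.g. "noun", "noun1", "noun2"


def parse_clause(tokens, pos, max_core):
    """Parse one clause starting at tokens[pos]; return ((verb, args), next_pos)."""
    verb = tokens[pos]
    pos += 1
    args, core_used = [], 0
    while pos < len(tokens):
        k = kind(tokens[pos])
        if k == "terminator":
            pos += 1  # consume "jojope" and close this level explicitly
            break
        if k == "oblique":
            args.append(tokens[pos])  # obliques are always acceptable here
            pos += 1
            continue
        if core_used >= max_core[verb]:
            break  # one core argument too many: leave it for the higher level
        if k == "verb":
            sub, pos = parse_clause(tokens, pos, max_core)  # embedded clause
            args.append(sub)
        else:
            args.append(tokens[pos])
            pos += 1
        core_used += 1
    return (verb, args), pos


def parse(sentence, max_core):
    tokens = sentence.split()
    tree, pos = parse_clause(tokens, 0, max_core)
    if pos != len(tokens):
        raise ValueError("unattached tokens: " + " ".join(tokens[pos:]))
    return tree


if __name__ == "__main__":
    max_core = {"verb1": 2, "verb2": 1}
    print(parse("verb1 noun verb2 noun oblique1 oblique2", max_core))
    # ('verb1', ['noun', ('verb2', ['noun', 'oblique1', 'oblique2'])])
    print(parse("verb1 noun verb2 noun oblique1 jojope oblique2", max_core))
    # ('verb1', ['noun', ('verb2', ['noun', 'oblique1']), 'oblique2'])
    print(parse("verb1 verb2 noun1 adverb1 noun2", max_core))
    # ('verb1', [('verb2', ['noun1', 'adverb1']), 'noun2'])
```

Note that the sketch never consults meaning: it stays inside the embedded clause for as long as the next token is syntactically acceptable there, and leaves only on "jojope", on a surplus core argument, or at the end of the input.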