MT from Northern Sotho to English: Transfer phenomena of parts of speech

The slot system

As described in paragraph 3.2.1.1, the verb stem may be followed by up to two objects, one of which may be replaced by an object concord directly preceding the stem. We define the basic verbal phrase, VBP, as containing the verb stem and its subcategorised arguments.
This VBP may appear together with bound morphemes to its left, forming the VIE. Some of these morphemes appear in complementary distribution, i.e. the presence of some prevents the presence of others. The future morpheme, MORPH fut, for example, never appears together with the present tense morpheme, MORPH pres. On the other hand, some morphemes in certain constellations have to occur together, for example ga (MORPH neg) and se (MORPH neg), which form a negation cluster. Such distributional issues will be examined and summarised in chapter 4.
In order to simplify the graphical representation and to give a better overview of the many different VIEs, a slot system is designed: the positions that certain parts of speech, or clusters of parts of speech, occupy within phrases are defined. The slot system is then used to build morphosyntactic rules aiming at unambiguous analyses, which can later be translated into, e.g., phrase grammar rules.
The VBP is defined as ‘slot zero’, representing the core element of the VP. It makes use of one to four fields or positions, numbered from the leftmost, pos-1, to the rightmost, pos+2, as Table 3.4 demonstrates. The central position, pos-0, contains the verb stem. Each of the positions pos-1 and pos-0 may hold only one token of a specific part of speech (an object concord and a verb stem respectively), while the positions pos+1 and pos+2 are defined to contain the object(s) of (double) transitive verbs, which can be nouns, nominals, noun phrases or even clauses. Slot zero forms part of all further descriptions of the verbal moods, as it remains unchanged.
The VIE slots are then built to the left of the VBP and numbered zero-1 and zero-2 (from right to left). Slot zero-1 may contain only one tense marker, i.e. the present tense morpheme or one of the future morphemes, while slot zero-2 contains the constellations of subject concord and/or negation marker(s). Except for pos-0 of slot zero, which contains the verb stem, all positions are permitted to be empty, as in Boeletša! ‘Repeat!’, an imperative.
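The slot layout described above can be sketched as a small data structure. This is only an illustration under stated assumptions, not the implementation used in this study: the class and field names are hypothetical, and the tokens "CS", "CO", "V", "N" and "MORPH_fut" are placeholder category labels rather than Northern Sotho word forms.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class BasicVerbalPhrase:
    """Slot zero (VBP): positions pos-1, pos-0, pos+1 and pos+2 (hypothetical names)."""
    verb_stem: str                        # pos-0, the only obligatory position
    object_concord: Optional[str] = None  # pos-1, at most one object concord
    first_object: Optional[str] = None    # pos+1, noun, nominal, NP or clause
    second_object: Optional[str] = None   # pos+2, second object of a double transitive verb

    def __post_init__(self) -> None:
        if not self.verb_stem:
            raise ValueError("pos-0 (the verb stem) must not be empty")

    def tokens(self) -> List[str]:
        # Left to right: pos-1, pos-0, pos+1, pos+2; empty positions are skipped.
        ordered = [self.object_concord, self.verb_stem,
                   self.first_object, self.second_object]
        return [t for t in ordered if t is not None]


@dataclass
class SlottedVerbalPhrase:
    """Slots zero-2 and zero-1, built to the left of slot zero."""
    vbp: BasicVerbalPhrase                                    # slot zero
    tense_marker: Optional[str] = None                        # slot zero-1, one marker at most
    concord_cluster: List[str] = field(default_factory=list)  # slot zero-2

    def tokens(self) -> List[str]:
        # Surface order: zero-2, zero-1, then slot zero.
        tense = [self.tense_marker] if self.tense_marker is not None else []
        return list(self.concord_cluster) + tense + self.vbp.tokens()


# An imperative: every position except pos-0 is empty.
imperative = SlottedVerbalPhrase(BasicVerbalPhrase("boeletša"))
print(imperative.tokens())  # ['boeletša']

# A fully specified phrase, written with placeholder category labels.
full = SlottedVerbalPhrase(
    BasicVerbalPhrase("V", object_concord="CO", first_object="N"),
    tense_marker="MORPH_fut",
    concord_cluster=["CS"],
)
print(full.tokens())  # ['CS', 'MORPH_fut', 'CO', 'V', 'N']
```

Typing the tense slot as a single optional value, rather than a list, mirrors the restriction that slot zero-1 holds at most one tense marker. The co-occurrence constraints inside slot zero-2 are deliberately left unmodelled here.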

Labels used on nodes

In this study, up to four labels will be attached to the nodes of trees and to elements of morphosyntactic rules. Some of these labels are retrieved from the lexicon, e.g. the data that forms part of the word class label (e.g. V or N01 loc); others are added by means of the rules defined in this chapter (e.g. a VIE01 containing a subject concord of class 1, CS01). The appearance of labels is demonstrated in (35). The superscript to the left of the node is reserved for information on the verbal mood of a node, e.g. INDVP, to describe an indicative VP. The subscript to the left of the node shows a syntactic function, e.g. OBJNP (an object NP). When used on lexical items, however, it can also mark the perfect tense form of a verb stem, i.e. perfV. The superscript to the right of the node is used whenever it is necessary to know the transitivity of an element, e.g. Vitr (i.e. an intransitive verb), and the subscript to the right of the node shows the noun class of the node, e.g. NP01 (an NP of noun class 01), if necessary.
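As a rough sketch of this labelling scheme, the four optional labels can be stored as fields around a node category. The class name, the field names and the flat text rendering (with `^` for superscripts and `_` for subscripts) are assumptions made for illustration; they are not the notation of (35).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class NodeLabel:
    """A node category with up to four optional labels (hypothetical structure)."""
    category: str                       # e.g. "VP", "NP", "V"
    mood: Optional[str] = None          # superscript left, e.g. "IND"
    function: Optional[str] = None      # subscript left, e.g. "OBJ" or "perf"
    transitivity: Optional[str] = None  # superscript right, e.g. "itr"
    noun_class: Optional[str] = None    # subscript right, e.g. "01"

    def render(self) -> str:
        # Flat rendering: left labels, bracketed category, right labels.
        left = (f"^{self.mood}" if self.mood else "") + \
               (f"_{self.function}" if self.function else "")
        right = (f"^{self.transitivity}" if self.transitivity else "") + \
                (f"_{self.noun_class}" if self.noun_class else "")
        return f"{left}[{self.category}]{right}"


print(NodeLabel("VP", mood="IND").render())                      # ^IND[VP]
print(NodeLabel("NP", function="OBJ", noun_class="01").render())  # _OBJ[NP]_01
print(NodeLabel("V", transitivity="itr").render())               # [V]^itr
```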

Labeling information on the transitivity of verbs

A parser should be aware of the transitivity of a verb in order to avoid ambiguous and/or incorrect analyses. The problem arises especially in the case of Northern Sotho, where punctuation is often used sparsely. As a subject concord might represent an omitted subject nominal, any nominal placed between a verb and a subject concord might function either as the object of the preceding clause or as the subject of the following clause (cf. Figures 3.3 to 3.7 below). Therefore, whenever the respective punctuation is not present, it is difficult to identify the correct position of the sentence boundary. Moreover, nouns may generally be used either as arguments or as adverbials; only of locative or locativised nouns can it be said that they probably appear more often with an adverbial function than as the object of a verb, cf. example (36) of Van Wyk et al. (1992, p. 41).
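The role of transitivity in resolving this attachment ambiguity can be sketched as follows. This is a deliberately simplified illustration, not the parser described in this study: the function name is hypothetical, the labels "tr" and "dtr" are assumed counterparts of the label "itr", and the adverbial reading of nouns is ignored.

```python
from typing import List

# Assumed number of objects each transitivity label licenses.
OBJECT_CAPACITY = {"itr": 0, "tr": 1, "dtr": 2}


def nominal_readings(verb_transitivity: str, objects_seen: int = 0) -> List[str]:
    """Possible readings of a nominal standing between a verb and a subject concord."""
    readings = []
    # The object reading is available only while the verb can still take an object.
    if objects_seen < OBJECT_CAPACITY[verb_transitivity]:
        readings.append("object of preceding verb")
    # The subject reading is always possible without disambiguating punctuation.
    readings.append("subject of following clause")
    return readings


print(nominal_readings("itr"))                 # ['subject of following clause']
print(nominal_readings("tr"))                  # ['object of preceding verb', 'subject of following clause']
print(nominal_readings("tr", objects_seen=1))  # ['subject of following clause']
```

Under these assumptions, knowing that a verb is intransitive removes the object reading outright, whereas a transitive verb with an unfilled object position leaves both readings open.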

1 Introduction 
1.1 Language introduction
1.2 Aims
1.3 Methods
1.4 A general introduction to grammar
1.5 Layout of the study
2 The word classes of Northern Sotho 
2.1 Introduction
2.2 The noun (Ncateg)
2.3 The pronoun
2.4 The concords
2.5 The adjective (ADJcateg)
2.6 The enumerative (ENUM)
2.7 The verb stem (V)
2.8 Adverbs (ADV)
2.9 The morphemes (MORPH)
2.10 Particles (PART)
2.11 Summary
3 A fragment of the grammar of Northern Sotho 
3.1 Introduction
3.2 The Verbal Phrase (VP)
3.3 The copulative verbal phrase (VP cop)
3.4 Auxiliary verbs
3.5 Other verbal structures
3.6 Adverbial phrases (ADVP)
3.7 Summary of the verbal phrases
3.8 Constellations of the Noun Phrase (NP)
3.9 The Particle Phrase (PP)
3.10 A sentence of Northern Sotho
3.11 A brief summary of our grammar fragment
4 Features of verbal phrases 
4.1 Introduction
4.2 Parsers: approaches to describe natural languages
4.3 A sample analysis
4.4 Data categories of main verbs
4.5 Data categories of copulative verbs
4.6 Conclusions
5 Implementation of a grammar fragment 
5.1 Introduction
5.2 Implementation
6 A basis for an automated translation 
6.1 Introduction
6.2 An Introduction to Machine Translation
6.3 MT from Northern Sotho to English:
6.4 MT from Northern Sotho to English: Transfer phenomena of parts of speech
6.5 XLE in machine translation
6.6 Summary
7 Summary and conclusions
7.1 Aims of this study
7.2 Summary of results
7.3 Conclusions and future work
Bibliography
