Automatic extraction of phonemic inventory in Russian sign language

Learning inventory in sign languages. Assay of the inventory of the shape of the hands and its interaction with the hands. Phonemic inventory of hand forms. Formation of a phonemic inventory for the Russian sign language. Location cash setting feature.



Government of the Russian Federation

Federal State Autonomous Educational Institution of Higher Education

National Research University

“Higher School of Economics”

Graduation thesis

Automatic extraction of phonemic inventory in Russian Sign Language

Anna Gennadyevna Klezovich

Moscow 2019

Acknowledgements

Most of all I would like to thank my scientific advisor Vadim Kimmelman who has been like a mentor to me in the scientific world for the past three years and whose example led me in my personal development as an independent linguistic researcher.

I also owe a lot to George Moroz for being my unofficial scientific advisor. He has guided me no less than Vadim and spent a lot of time with me discussing this thesis and helping me out with new ideas. In addition to that, George Moroz was the one who advised me to get acquainted with Carl Börstell's work, without which this thesis would have looked completely different. Carl Börstell's make-signs-still code initially gave me the whole idea of this research and inspired this work.

Without George and Vadim's influence I would not have grown so much as a researcher. I could not thank you enough for teaching me so much!

I would also like to thank my colleagues from our research group “Modality Effects in the Syntax of Russian Sign Language” at the Higher School of Economics for their help and consultation, especially Lena Pasalskaya and Lera Dushkina for answering my questions about their intuitions and experience with RSL any time of the day and for being so supportive.

I also thank our consultants from Novosibirsk. In particular I want to thank Alexey Prihodko who is not only our consultant, but at the same time a fellow researcher of sign linguistics at the NSTU. He discussed this thesis with me a lot and inspired me to pursue some new ideas.

Finally, I thank the students from the NSTU who came to my presentation of the thesis in Novosibirsk and asked questions.

All remaining errors are my sole responsibility.

Introduction

Phonological research on sign languages started with Stokoe's (1960) work Sign Language Structure: An Outline of the Visual Communication Systems of the American Deaf. This work is now considered the one that launched sign linguistics as a research field. By showing that sign languages have phonological components, Stokoe contributed to establishing that sign languages are no less “real” languages than spoken languages. His work therefore drew attention to sign linguistics in general and to the phonological structure of sign languages in particular. In the years following Stokoe's pioneering work, a number of more elaborate theoretical frameworks for sign language phonology emerged (e.g. Sandler 1989, Liddell & Johnson 1994, Brentari 1998). They introduced segmental structure and hierarchical relationships between the phonological components discovered by Stokoe (1960) (see a detailed discussion in Section 3). What is of particular interest for this research is that they separated movement from the other phonological features. Movement has been perceived as the glue between holds (still positions), and in many signs it can be predicted from the knowledge of the holds (Liddell & Johnson 1994). Some even compare movements to vowels and holds to consonants (Liddell 1984; Brentari 2002, cited by Brentari 2012), which makes movement the nucleus of the syllable in all sign languages (Brentari 2012). All other phonological components, such as handshape, location, and orientation, are specified for the holds.

So what can be considered phonemes in sign languages? And how can we establish the phonemic inventory of a particular sign language, in this case Russian Sign Language (RSL)? A lot of research on different sign languages focuses on establishing phonetic and phonemic handshape inventories (e.g. Marsaja 2008; Bauer 2012; Schuit 2014; Nyst 2007; van der Kooij 2002). These studies additionally establish location inventories and estimate which handshapes can occur with which locations. In all of them, signs were collected and annotated manually. However, if phonetic handshapes can be established from the knowledge of all possible holds, then by extracting holds automatically we can establish a phonetic handshape inventory more easily and quickly. So, how can holds be extracted automatically? Börstell's (2018) code for making image overlays can be used for this purpose (see Section 3.2 for details). In this study I update this code so that it suits the RSL dataset from the Spreadthesign dictionary, and use it for hold extraction in RSL. For each hold I annotate the handshapes of both hands and derive the phonetic handshape inventory from this annotation. Employing van der Kooij's (2002) methodology, I determine which handshapes are allophones and give a preliminary description of the phonemic inventory of handshapes in RSL.
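The idea that holds are the still stretches of a sign can be illustrated with a minimal sketch. It is not Börstell's (2018) code and not the procedure used in this thesis. It assumes that some per-frame motion score is already available (for instance, the mean absolute pixel difference between consecutive video frames); the function name `find_holds`, the threshold, and the toy numbers are all hypothetical.

```python
from typing import List, Tuple


def find_holds(motion: List[float], threshold: float,
               min_frames: int = 3) -> List[Tuple[int, int]]:
    """Return (start, end) frame indices (inclusive) of low-motion runs.

    `motion[i]` is an assumed per-frame motion score; a run of at least
    `min_frames` consecutive frames below `threshold` counts as a hold.
    """
    holds = []
    start = None  # index where the current low-motion run began
    for i, score in enumerate(motion):
        if score < threshold:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_frames:
                holds.append((start, i - 1))
            start = None
    # Close a run that lasts until the final frame.
    if start is not None and len(motion) - start >= min_frames:
        holds.append((start, len(motion) - 1))
    return holds


# Made-up motion profile for illustration only: still, moving, still.
toy_motion = [0.1, 0.2, 0.1, 5.0, 6.5, 4.8, 0.3, 0.1, 0.2, 0.1]
print(find_holds(toy_motion, threshold=1.0))  # [(0, 2), (6, 9)]
```

One frame from each detected run could then be taken as the image of the hold and annotated for handshape.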

Another question is what linguistic information other than holds can be extracted automatically. I hypothesized that the length of a movement might indicate its type. For instance, signs with repeated movement could in general be longer than signs without repeated movement. However, there turned out to be no significant interaction between length and movement type.
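A hypothesis of this kind can be checked, for example, with a permutation test on sign durations. The sketch below is only one possible way to do it, and it does not reproduce the analysis of this thesis. The function name and the duration values are invented for illustration and are not data from the study.

```python
import random
from statistics import mean
from typing import List


def permutation_p_value(a: List[float], b: List[float],
                        n_perm: int = 10000, seed: int = 0) -> float:
    """Two-sided permutation test for a difference in group means."""
    rng = random.Random(seed)
    observed = abs(mean(a) - mean(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # random relabelling of the two groups
        diff = abs(mean(pooled[:len(a)]) - mean(pooled[len(a):]))
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)


# Invented durations in seconds, for illustration only.
repeated_movement = [1.9, 2.4, 2.1, 2.8, 2.2]
single_movement = [2.0, 2.3, 1.8, 2.6, 2.5]
print(permutation_p_value(repeated_movement, single_movement, n_perm=2000))
```

A large p-value would be in line with the absence of a significant relation between length and movement type.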

Overall, the aims of the current study are the following: 1) to develop an algorithm for the automatic extraction of hold positions; 2) to establish the phonetic inventory of handshapes in RSL; 3) to establish the phonemic handshape inventory in RSL; 4) to make cross-linguistic comparisons where possible; and 5) to describe how well different phonological frameworks apply to RSL, with a particular focus on the Prosodic model of phonology (Brentari 1998).

1. Theoretical background

In section 3.1 I give an overview of four main phonological theories proposed for sign languages in chronological order (which is also the order of importance in this case). In section 3.2 I discuss methods of computational linguistics which have been or potentially can be applied to research on the phonology of sign languages.

1.1 Phonological theories

A number of phonological theories have developed gradually, from the simplest structure based solely on minimal pairs (the Cheremic model (Stokoe 1960)) to the hierarchical structure of the Prosodic model (Brentari 1998), with a few steps in between. At first it was observed that some signs form minimal pairs with each other, in the sense that only one of the phonological parameters differs (e.g. location or handshape). Then it was noticed that sign languages, like spoken languages, have linear structure, and this observation led to segmental phonological theories such as that of Liddell & Johnson (1994). However, sign languages obviously exhibit much more simultaneity than spoken languages, and phonological models should take this into account. So, as the next step, theories that try to find a balance between simultaneous and sequential structure in phonological representation started to emerge (see, for example, Sandler 1989). After that it was noticed that phonological features, in addition to everything discovered before, stand in hierarchical relationships. For example, the orientation feature distinguishes far fewer minimal pairs than the handshape or place of articulation features. Furthermore, it was established that the movement feature represents the nucleus of the syllable in sign languages and should occupy a separate branch in the structure. All these advancements are incorporated in the tree structure of the Prosodic model. In addition, the Prosodic model argues against the three-segment structure of the Movement-Hold model (hold-movement-hold). It argues that a two-segment structure is enough, meaning that it is redundant to postulate that movement is a segment too.

In the following sections I discuss these phonological models and their impact on the phonological theory of sign languages in more detail and compare them with each other. Particular attention is paid to the Movement-Hold model, because the initial idea of this research is based on this theory, and to the Prosodic model, because it is currently the most descriptive and plausible model. Like the Movement-Hold model, it separates movement from the other features and therefore does not contradict the initial idea of this research, namely to extract holds.

The Cheremic model (Stokoe 1960)

The very first phonological theory, and at the same time the very first linguistic study of sign languages, emerged in 1960: Sign Language Structure: An Outline of the Visual Communication Systems of the American Deaf by William C. Stokoe. This work was the first to claim that the signs of any sign language have internal structure and are more than just iconic drawings in space.

In this work Stokoe proposed that all signs have three major phonological components, namely handshape, place of articulation, and movement. These components constitute the internal structure of a sign, and Stokoe named them tabula (i.e. location or place of articulation), designator (i.e. handshape), and signation (i.e. movement). He pointed out that there are minimal pairs of signs which are distinguished by each of these three parameters. Consider, for instance, an example from RSL (Figures 1-2). Figure 1 shows the sign meaning 'sandwich', which is produced with two hands. Both hands have the so-called “B”-handshape (a palm with all fingers extended and non-spread). Figure 2 shows the RSL sign meaning 'house', which is also two-handed and has the same “B”-handshape, but a different movement. The movement in house is symmetric and rather epenthetic in nature. By contrast, the movement in sandwich is asymmetric and somewhat iconic: it stands for putting sandwich ingredients on a piece of bread.

Figure 1. sandwich, RSL Figure 2. house, RSL

One might also notice that the signs in Figures 1-2 differ not only in movement, but also in hand orientation. In the sign house the hands form an angle and face each other, while in the sign sandwich the hands lie on one another without forming an angle. Stokoe did not point this out. It was noticed later, in 1978, by Robbin Battison in his book Lexical Borrowing in American Sign Language. Battison (1978) added orientation as the fourth component of the sign, because orientation can also distinguish minimal pairs in sign languages. However, orientation should have a different status compared to the other three phonological components, because it distinguishes only a few minimal pairs. In addition, Battison (1978) was the first to take the handedness of a sign into account. He suggested that there are three main types of two-handed signs in sign languages. The first type consists of signs in which both hands are active and perform the same symmetrical (synchronous or asynchronous) movement (see the sign house in Figure 2). In the second type, one hand is active (dominant) and the other is passive, but both hands have the same handshape (see the sign sandwich in Figure 1). Finally, the third type also consists of signs with an active and a passive hand, but here the hands have different handshapes (see the RSL sign banana in Figure 3). In most recent works on sign languages researchers do not consider handedness to be one of the main phonological features. It is often said that signs have five components: handshape, location, movement, orientation, and the non-manual component (see, for example, Bross 2015). However, handedness is a phonological feature too. Östling et al. (2018) and Kimmelman et al. (2018) have shown that the handedness feature can interact with the type of iconicity of a sign. Therefore, it is, in fact, a phonological feature. For example, the RSL sign meaning 'morning' (Figure 4) also has the “B”-handshape, like the signs sandwich and house, but unlike them it is one-handed. However, the signs house and sandwich do not form a minimal pair with the sign morning, because the location of morning is not in neutral space and therefore differs as well.

Figure 3. banana, RSL Figure 4. morning, RSL

The theory of phonological components is called the Cheremic theory after Stokoe's (1960) work. To sum up, all signs in all sign languages have five or six phonological features, which help to distinguish minimal pairs to different extents. Handshape, location, and movement distinguish many minimal pairs, while orientation distinguishes only a few. As for handedness, there is actually no research into how often it distinguishes minimal pairs in sign languages, although it can influence the iconicity patterns used in signs. The Cheremic theory, being the most basic one, has a number of serious drawbacks and cannot be used alone: (1) it does not take into account hierarchical relations between phonological features; (2) it does not take into account the timing units in which a sign is produced, and it is therefore impossible to postulate syllables under this theory.
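The notion of a minimal pair used in this section can be made explicit with a small sketch. It treats a sign as a flat bundle of parameter values and calls two signs a minimal pair if exactly one parameter differs. The parameter values below are informal paraphrases of the descriptions of house and sandwich given above, not an annotation scheme; the function names are hypothetical, and the non-manual component is left out for simplicity.

```python
from typing import Dict, List

PARAMETERS = ("handshape", "location", "movement", "orientation", "handedness")


def differing_parameters(a: Dict[str, str], b: Dict[str, str]) -> List[str]:
    """List the parameters in which two signs differ."""
    return [p for p in PARAMETERS if a[p] != b[p]]


def is_minimal_pair(a: Dict[str, str], b: Dict[str, str]) -> bool:
    """Two signs form a minimal pair if exactly one parameter differs."""
    return len(differing_parameters(a, b)) == 1


# Simplified, informal descriptions based on the text (assumptions).
house = {
    "handshape": "B",
    "location": "neutral space",
    "movement": "symmetric",
    "orientation": "hands at an angle, facing each other",
    "handedness": "two-handed",
}
sandwich = {
    "handshape": "B",
    "location": "neutral space",
    "movement": "asymmetric",
    "orientation": "hands lying on one another",
    "handedness": "two-handed",
}

print(differing_parameters(house, sandwich))  # ['movement', 'orientation']
print(is_minimal_pair(house, sandwich))       # False
```

Under this flat representation house and sandwich differ in two parameters, which is exactly why orientation had to be added as a separate component.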

The Movement-Hold model (Liddell & Johnson 1994)

The next theory in the field of sign language phonology was the Movement-Hold model, proposed by Liddell & Johnson in 1994 in American Sign Language: The Phonological Base. Liddell & Johnson suggested that it is possible and sufficient to represent all signs as sequences of holds (i.e. positions without movement) and movements, where the movements can be derived from the knowledge of the holds and their linear order. Furthermore, all the phonological components described above (handshape, orientation, and location, for each hand if the sign is two-handed) are specified for each hold position. To show what a phonological representation looks like in this model, let us consider the RSL sign meaning 'thank you' (Figure 5). This sign consists of the so-called “A”-handshape and one movement, from the forehead to the chin. The phonological structure of this sign under the Movement-Hold framework is given in Table 1. For this sign we specify two hold positions and one movement. The hold positions have the same handshape (“A”) and the same orientation (palm facing the signer). The non-manual signal extends over the whole sign and is therefore specified once for the sequence. The location of the two holds differs: the hand moves from the forehead of the signer to the chin.

Figure 5. thank-you, RSL

Table 1. Phonological representation of the sign thank-you for RSL under the Movement-Hold framework

thank-you, RSL

SEGMENTS         | Hold (H)            | Movement (M) | Hold (H)
RIGHT HAND       |                     |              |
handshape        | A                   |              | A
location         | forehead            |              | chin
orientation      | palm towards signer |              | palm towards signer
nonmanual signal | mouthing 'thanks' ('спасибо', spasibo) in spoken Russian (specified once for the whole sequence)
LEFT HAND        | - (all cells empty)
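The claim that movements can be derived from the holds can be sketched as follows. The representation below is a simplification for illustration, not Liddell & Johnson's formalism: each hold is a bundle of three features, and the movement between two holds is described by whichever features change. The class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass
class Hold:
    handshape: str
    location: str
    orientation: str


def changed_features(first: Hold, second: Hold) -> Dict[str, Tuple[str, str]]:
    """Describe the transition between two holds by the features that change."""
    changes = {}
    for name in ("handshape", "location", "orientation"):
        before, after = getattr(first, name), getattr(second, name)
        if before != after:
            changes[name] = (before, after)
    return changes


# The two holds of thank-you, with the values from Table 1.
thank_you = [
    Hold("A", "forehead", "palm towards signer"),
    Hold("A", "chin", "palm towards signer"),
]
print(changed_features(thank_you[0], thank_you[1]))
# {'location': ('forehead', 'chin')}
```

Only the location changes, so the movement segment of thank-you is fully determined by its two holds. This also makes visible how much of the matrix is repeated between them.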

Now consider again the sign meaning 'house' in Figure 2 above. Table 2 gives the phonological representation of this sign in the Movement-Hold model. The crucial difference in the phonological structure of this sign is that there is a slight movement, which is repeated once or twice (once in connected signing, twice in the citation form). This kind of movement is called epenthetic, because it does not serve as a transition between different holds: the holds here are exactly the same. In Liddell & Johnson (1989) these cases need to be explained with an additional rule, called the M(ovement) Epenthesis rule. So, the internal phonological structure of the sign has one movement and one hold (see Table 2). A sign cannot appear without any movement at all (Channon & van der Hulst 2011); it is a word-formation requirement for all sign languages. Thus, we postulate one 'empty' movement in the internal phonological structure, even before applying the M Epenthesis rule. Then the M Epenthesis rule applies to this internal structure and results in the structure shown in Figure 6. As a result, the sign house consists of the sequence “H | M | H | M | H”, where all holds share the same features and the movements are epenthetic.

Figure 6. Schematic representation of the sign house for RSL after the M Epenthesis rule application

Table 2. Phonological representation of the sign house for RSL under the Movement-Hold framework

house, RSL

SEGMENTS          | Movement (M) | Hold (H)
RIGHT HAND        |              |
handshape         |              | B
location          |              | neutral space
orientation       |              | at right angle to the other hand in symmetric way
non-manual signal | mouthing 'house' ('дом', dom) in spoken Russian
LEFT HAND         |              |
handshape         |              | B
location          |              | neutral space
orientation       |              | at right angle to the other hand in symmetric way
non-manual signal | mouthing 'house' ('дом', dom) in spoken Russian
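The surface sequence described above for house can be reproduced with a toy expansion. This is not a formalization of Liddell & Johnson's M Epenthesis rule. It only assumes that the single underlying hold is copied and that an epenthetic movement is placed between each pair of identical holds; the function name and its parameter are hypothetical.

```python
from typing import List


def apply_m_epenthesis(n_movements: int = 2) -> List[str]:
    """Expand one underlying hold into H (M H)*, with identical holds.

    `n_movements` is the number of epenthetic movements on the surface.
    """
    if n_movements < 1:
        raise ValueError("a sign needs at least one movement")
    segments = ["H"]
    for _ in range(n_movements):
        segments += ["M", "H"]  # epenthetic movement plus a copy of the hold
    return segments


print(" | ".join(apply_m_epenthesis(2)))  # H | M | H | M | H
```

With two epenthetic movements the function yields the sequence given in the text for the sign house.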

The Movement-Hold model proposes additional rules for seven different phonological processes including M Epenthesis. Here is the list (see Liddell & Johnson 1989: 237-254 for the examples):

1) Movement Epenthesis

2) Hold Deletion - occurs between some words when a hold between two different movements is eliminated from the surface form

3) Metathesis - when a sign changes some of the properties of its initial hold because it accommodates to the previous sign

4) Gemination - when the final hold of one sign is identical to the initial hold of the following sign

5) Assimilation - either properties of the secondary hand accommodate to properties of the active hand, or some phonological features of two signs in a sequence assimilate with each other, or a sign loses its two-handedness because of the other signs in the sequence

6) Reduction - when location of a sign becomes less marked, moves closer to the neutral space

7) Perseveration and Anticipation - when a two-handed sign follows a one-handed sign, the one-handed sign can have a secondary hand in the anticipation of the two-handed sign or the other way around

All this being said, let us turn to the limits and disadvantages of this model. The first problem of the Movement-Hold model is that its feature matrices usually contain a lot of redundant material. For instance, Table 1 for the sign thank-you specifies all features twice, once for each hold, although only one feature actually varies: after the movement, the location of the sign switches from the forehead to the chin. This raises the question of whether we really need to specify all the other features twice. The same observation holds for the sign house, described in Table 2. This sign is two-handed, and both hands move symmetrically. Instead of just specifying that the hands are symmetric, the matrix contains the same information twice, for the right hand and for the left hand. Furthermore, the terminology of right hand and left hand does not seem very reasonable. Liddell & Johnson (1989) themselves and Battison (1978) point out that there are different kinds of two-handed signs: those where the hands are symmetric, so that we do not need to specify phonological features for them separately, and those where the hands are asymmetric, in which case it is crucial to specify which hand is secondary and which is active. This is crucial because some signers are right-handed while others are left-handed, so the right hand is not always the active one.

The second disadvantage of this theory is that it does not take into account hierarchical relations between phonological features. It does have a segmental structure, which is a great advance compared to the Cheremic model, but this is not sufficient. Phonological features do interact with each other (Channon & van der Hulst 2011). For instance, Mak & Tang (2011) find for Hong Kong Sign Language (HKSL) that features such as orientation change and aperture change depend on the local movement, not on the path movement (see a more thorough discussion of Mak & Tang (2011) after the discussion of Brentari's (1998) theory in this section). Brentari (1998), in turn, proposes that orientation is derived from handshape and place of articulation, because there are far fewer minimal pairs distinguished by orientation only. Clearly, some phonological features depend on others and form hierarchical relations. Later phonological theories address this issue to different extents.

The third drawback of the Movement-Hold model is that only path movements are taken into account. Later works have shown that all sign languages have two types of movement: path movement and local movement (Brentari 1998; Mak & Tang 2011; Sandler 1989; etc.). By path movement we mean movement that originates in the elbows and/or shoulders and results in a change of hand position, while local movements originate in the wrist and/or finger joints. Local movements are very important because they control changes in the orientation and aperture/handshape of a sign. Although the Movement-Hold model does take some local movements into account, it has to apply additional rules and specify that a sign with a local movement undergoes a phonological process. For instance, in the RSL sign house the Movement Epenthesis rule has to be applied to an abstract internal phonological structure in order to derive a sign with the trill movement. In the end, although the Movement-Hold model has some tools to describe local movements, it seems overcomplicated to postulate phonological process rules for a huge number of signs instead of just specifying these local movements in the initial “internal” phonological structure.

Similarly, the Movement-Hold model does not account for complex movement. This is essentially a consequence of the fact that the model does not address local movements. Complex movements are simultaneous combinations of more than one type of movement in a sign. Different path movements cannot co-occur simultaneously in one sign. Consequently, a complex movement is either a path movement combined with a local movement, or two local movements combined. An example of a sign with complex movement is shown in Figure 7: the sign to-adapt. This sign has a path movement of the two hands downwards at a small angle. In addition to the path movement, there are numerous trill movements in the finger joints of both hands, i.e. local movements. Signs with complex movement constitute a large class in the lexicon (e.g. HKSL has 283 signs with such movements out of 1376 dictionary entries according to Tang 2007) and should be taken into consideration by a phonological theory.
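The definition of complex movement given above amounts to a small classification rule, sketched here for illustration. The labels "path" and "local" follow the text; the function name and the "ill-formed" label for two simultaneous path movements are hypothetical.

```python
from typing import List


def classify_movement(components: List[str]) -> str:
    """Classify the simultaneous movement components of one sign."""
    if not components:
        raise ValueError("a sign needs at least one movement")
    n_path = components.count("path")
    n_local = components.count("local")
    if n_path + n_local != len(components):
        raise ValueError("components must be 'path' or 'local'")
    if n_path > 1:
        # Different path movements cannot co-occur in one sign.
        return "ill-formed"
    if len(components) == 1:
        return components[0]  # simple path or simple local movement
    return "complex"  # path + local, or local + local


# to-adapt: downward path movement plus finger trills (a local movement).
print(classify_movement(["path", "local"]))  # complex
```

The trills of to-adapt are treated here as a single local component, which is a simplification.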

Figure 7. to-adapt, RSL

Furthermore, the Movement-Hold model does not address the fact that signs are produced in time, and that different movements take different amounts of time. The length of signing is not contrastive in this model. However, it does to some extent take into account the fact that signs have syllables. In the Movement-Hold model, the binary structure originates from the binary opposition “vowels vs. consonants” in spoken languages, where movements supposedly behave like vowels and holds supposedly behave like consonants. This analogy is based on the fact that movements, like vowels, are responsible for forming a syllable. In sign languages the syllable nucleus is the movement (Brentari 2012), and the Movement-Hold model was among the first to point this out.

The Hand-Tier model (Sandler 1989)

The Hand-Tier model (Sandler 1989) proposes a segmental hierarchical structure of phonology, which is absent from both the Movement-Hold model (Liddell & Johnson 1994) and the Cheremic model (Stokoe 1960). The structure of this model is organized around the so-called hand configuration segment (i.e. around the hand configuration tier). The hand configuration tier divides into two trees of associated phonological features, namely a handshape tree and a location tree.

The first tree refers to the handshape, on the one hand, and to the orientation, on the other. The Hand-Tier model was also the first to notice that for the handshape it is reasonable to describe only the selected fingers, without giving special names to handshapes. Later, this observation was also adopted in the Prosodic model. All in all, the handshape + orientation tree specifies the following parameters of a sign: which fingers are selected and what the position (i.e. aperture) of these fingers is (closed, open, curved, bent or spread); then, in the second branch, it specifies the palm orientation (up, in, prone or contra). If a sign is two-handed, there is a second branch in this tree, specified according to the same parameters: selected fingers, their aperture, and palm orientation. The basic structure of the hand configuration tree is shown in Figure 8.

Figure 8. Hand configuration tree structure

Figure 9. Location tree structure

The second tree under the hand configuration tier is called the location tree (see Figure 9). Features in the location tree are specified statically, meaning that locations are specified for each position without movement. Location trees therefore play the role of holds in the Hand-Tier model. For each location tree the following phonological parameters are specified: position with respect to distance, height, and laterality; manner; and, finally, place.

Figure 10. The RSL sign mother representation in the Hand-Tier model [open]

Figure 10 shows how this model can be applied to sign language data, here the RSL sign mother. The sign itself is shown in Figure 11. The main tier, the hand configuration tier, produces two trees: the location tree below and the hand configuration tree above. Let us start with the hand configuration tree. Since all fingers in this sign are selected, it is a so-called “5”-handshape. All selected fingers are extended, so in terms of the Hand-Tier model the selected fingers are in the open position. As for the location tree, this sign has two locations and one movement in which the hand moves from one location to the other. The main place of articulation is the chin of the signer, and it is the same for both locations. In addition, the hand moves from the ipsilateral position to the contralateral one.

Figure 11. mother, RSL
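The description of mother in the Hand-Tier model can be written down as a small data structure. The sketch below is a reduced illustration rather than Sandler's (1989) full feature geometry: only the parameters mentioned in the text for this sign are filled in, the remaining ones (orientation, height, distance, manner) are left out or unspecified, and all class and method names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class HandConfiguration:
    selected_fingers: List[str]
    position: str                      # closed, open, curved, bent or spread
    orientation: Optional[str] = None  # up, in, prone or contra


@dataclass
class Location:
    place: str
    laterality: Optional[str] = None


@dataclass
class HandTierSign:
    gloss: str
    hand_configuration: HandConfiguration  # specified once per sign
    locations: List[Location]              # specified sequentially

    def location_changes(self) -> List[Tuple[str, Optional[str], Optional[str]]]:
        """List the location features that change between adjacent locations."""
        changes = []
        for before, after in zip(self.locations, self.locations[1:]):
            for name in ("place", "laterality"):
                old, new = getattr(before, name), getattr(after, name)
                if old != new:
                    changes.append((name, old, new))
        return changes


mother = HandTierSign(
    gloss="mother",
    hand_configuration=HandConfiguration(
        selected_fingers=["thumb", "index", "middle", "ring", "pinky"],
        position="open",
    ),
    locations=[Location("chin", "ipsilateral"), Location("chin", "contralateral")],
)
print(mother.location_changes())
# [('laterality', 'ipsilateral', 'contralateral')]
```

The hand configuration is stored once for the whole sign, while the locations form a sequence.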

If we put this phonological model in the perspective of the development of sign language phonology, we can say that the Hand-Tier model was the first to introduce feature geometry and a hierarchical structure of features. As stated in Brentari's (2012) review of existing phonological models, the Hand-Tier model strikes a balance between simultaneous and sequential structure. The location tree stands for sequential structure, which is at first glance very similar to that of the Movement-Hold model. By contrast, the hand configuration tree stands for simultaneous structure, because a number of phonological parameters, such as the selected fingers and their position, are specified only once per sign. This type of simultaneity in the phonological structure is further used in the Prosodic model (Brentari 1998).

However, the Hand-Tier model also has a couple of disadvantages. For instance, the Prosodic model of phonology (Brentari 1998) argues that it is more reasonable to put location on a par with handshape configuration and to analyze them not as segments but as prosodic units. In addition, the Prosodic model shows that orientation is hierarchically below the handshape and location tiers. For one thing, orientation distinguishes only a few minimal pairs. For another, the direction of the movement (which is derived from the knowledge of the places of articulation or aperture positions) suggests the palm orientation. So the orientation feature is structurally different from the handshape and location features. Mak & Tang (2011) provide a thorough discussion of the relationship between the orientation feature and some movement parameters (see the overview in section 3.1.5).

The Dependency model (van der Hulst 1995)

The next model reviewed here, the Dependency model of phonology (van der Hulst 1995), is based on the assumption that movement can be derived from handshape and place of articulation and that, consequently, there is no need to include movement in the model at all. This assumption, however, has been shown to be incorrect for a number of reasons. First of all, complex movements, such as local movements, combinations of movements in different joints in one sign, trill movements, etc., cannot be derived just from the knowledge of the other, static phonological features. In this respect the Dependency model is similar to the Movement-Hold model, which also assumes that movements are less important than the static phonological features.

Additionally, the Dependency model exploits feature geometry and segmental structure, as the Hand-Tier model does. The highest position in the feature geometry is given to the handshape and place of articulation features. The orientation feature is specified under the handshape node. Furthermore, the manner of movement can be specified where relevant under the place of articulation node. The Prosodic model (Brentari 1998) likewise places handshape and place of articulation on a par. However, unlike the Dependency model, it adds a separate movement node, opposed to the handshape + place of articulation node.

Following Sandler's (1989) Hand-Tier theory, the Dependency model also suggests that timing units play a relevant role in sign phonology. However, van der Hulst (1995) does not postulate syllables for sign languages. Under the Dependency model, the prosodic segment, in other words the lexeme, is the most important phonological unit. The reasoning is that syllables in sign languages do not follow the usual onset-nucleus structure: there is basically only a nucleus, and it is very hard to postulate a syllable onset for sign languages.

The Prosodic model (Brentari 1998)

In this section I discuss the Prosodic model of phonology by Brentari (1998). The idea behind the Prosodic model is rooted in The Sound Pattern of English (SPE) by Chomsky & Halle (1968). SPE suggests analyzing the phonology of any language as a sequence of segments built up from a set of distinctive features.

The main idea behind this model is that every sign has prosodic features and inherent features. In sign languages the inherent features are handshape, orientation, and place of articulation, i.e. the parts of the sign that do not change. Note that orientation is specified under the handshape and place of articulation features. Prosodic features, on the other hand, are the parts of the sign that do change, namely movement. The Prosodic model representation of the RSL sign mother (Figure 11) is shown in Figure 12. The inherent features consist of the handshape and place of articulation branches. The sign mother is one-handed, and the hand has the “B”-handshape (all fingers extended and non-spread, the hand flat). The sign is articulated on the chin, and only the side of the fingers touches the chin. The prosodic features, by contrast, have only one branch, the type of movement, which is a path movement in this case: the hand moves in the contralateral direction along the chin. According to the Prosodic model, a path movement constitutes two timing units, the Xs in Figure 12.

Figure 12. The Prosodic model representation of RSL sign mother
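To make the structure of such a representation concrete, the following minimal Python sketch encodes the description of the sign mother given above as a small data structure. All class and field names are hypothetical and chosen only for illustration; they are not part of Brentari's (1998) notation. Only the one timing-unit case stated in the text (a path movement has two timing units) is encoded.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InherentFeatures:
    """Parts of the sign that do not change (hypothetical field names)."""
    handshape: str              # e.g. "B": all fingers extended, non-spread, flat hand
    place_of_articulation: str  # e.g. "chin"
    contact: str                # part of the hand that touches the location


@dataclass(frozen=True)
class ProsodicFeatures:
    """Parts of the sign that do change, i.e. movement."""
    movement_type: str          # "path" in this example
    direction: str              # e.g. "contralateral"

    def timing_units(self) -> int:
        # Only the case stated in the text is encoded here:
        # a path movement corresponds to two timing units (two X slots).
        if self.movement_type == "path":
            return 2
        raise ValueError("timing units for this movement type are not encoded in this sketch")


@dataclass(frozen=True)
class Sign:
    gloss: str
    handedness: str
    inherent: InherentFeatures
    prosodic: ProsodicFeatures


# The RSL sign "mother" as described in the text.
mother = Sign(
    gloss="mother",
    handedness="one-handed",
    inherent=InherentFeatures(handshape="B",
                              place_of_articulation="chin",
                              contact="side of the fingers"),
    prosodic=ProsodicFeatures(movement_type="path", direction="contralateral"),
)

print(mother.gloss, mother.prosodic.timing_units())
```

Orientation has no field of its own in this sketch, which mirrors the point that in the Prosodic model it follows from the handshape and place of articulation specifications (here, from which part of the hand touches the chin).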

Compared to the Prosodic model, the Movement-Hold theory does not have timing units and does not fully account for syllables. In the Movement-Hold model the nucleus of the syllable - the movement - occurs between two holds, which makes this a three-segment structure. However, the Prosodic model shows that a two-segment structure is enough to account for syllables, where the first segment slot is the beginning of the syllable and the second slot is its end.

Another difference between the Prosodic model and the Movement-Hold model is that in the Prosodic model the orientation feature is not specified on a par with the handshape and place of articulation features. Brentari (1998) motivates this by the fact that in ASL, and in sign languages in general, only a few minimal pairs are distinguished by orientation alone.

One disadvantage of the Prosodic model is that the length of signing is not contrastive as long as it involves the same number of timing units, because each type of movement implies an exact number of timing units. Under this framework, it is assumed that only a few minimal pairs are distinguished by length alone, which is why length can be omitted from the model. However, RSL, for example, has a durative aspect which is expressed by lengthier signing. The verb change in example (2) is produced longer than in (1) due to the difference in aspect. The Prosodic model does not account for this type of non-segmental morphology and is therefore under-generating. Channon & van der Hulst (2011) propose that irregularities of this type can be accounted for with the help of iconicity. The length of the verb change in (2) is iconic, because it refers to a prolonged action, i.e. to the durative aspect. However, one must keep in mind that the length of signing in the durative does not correspond to the actual duration of the action in reality. So iconicity here is restricted, and consequently it might not be descriptive enough to serve as a reason for excluding the length of signing from the phonological model.

(1) 0,46sec poss:loc second mood change / what happen not.understand (Filimonova 2015: 149)

‘Her mood has suddenly changed. I don't understand what happened.’

(2) 0,90sec russia indx weather gradually change (Filimonova 2015: 149)

‘Climate in Russia is gradually changing.’

Similarly to the Movement-Hold model and other models, the Prosodic model separates movement (i.e. prosodic) features from other phonological features. In the Prosodic model all types of movement are represented together, sequentially, on one branch of the tree. However, sign languages are generally described as having two types of movement with respect to the articulators, namely path and local movements (Liddell & Johnson 1994, Brentari 1998, Brentari 2012, etc.), and two types with respect to complexity, namely simple movements and complex movements (more than one simultaneous movement on different articulators). Mak & Tang (2011) propose an extension of the Prosodic model which accounts for the fact that movements can be of different types and reflects this in the tree structure of the prosodic features branch.

Mak & Tang (2011) suggest a different structure for the PF branch (which they rename the MF, or movement features, branch), so that repetition of the movement, or return of the movement to its initial point, can be postulated for any articulator (i.e. at any level of the MF branch). They distinguish four types of signs with respect to repetition: no repetition, full repetition ([repeat] feature), return ([return] feature), and trilled and bidirectional movements ([repeat][return] features). These four types can thus be described with just two phonological features, [repeat] and [return], which can be specified for any articulator. To make this possible, the PF branch in Mak & Tang's theory has two sub-branches, Path and Local movement. In addition, the Local movement branch is subdivided into Orientation and Aperture. For example, the Prosodic model representation of the sign aggressive (which has a repeated circular movement on the chest) can be extended under this theory as shown in Figure 13. The Path movement node here has both the [return] and [repeat] features, because the movement is circular. Another example is the RSL sign shark (see Figure 14). It has a complex movement: an orientation movement with repetition plus a path movement.

Figure 13. Phonological representation of the RSL sign aggressive under the Mak & Tang (2011) framework

Figure 14. Phonological representation of the RSL sign shark under the Mak & Tang (2011) framework
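The way two binary features yield four repetition types can be illustrated with a short Python sketch. The class, the field names and the string labels are hypothetical and serve only as an illustration of the feature combinations described above; they are not Mak & Tang's (2011) own formalism.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MovementNode:
    """One node of the movement features branch (hypothetical encoding)."""
    articulator: str           # "path", "orientation" or "aperture"
    repeat: bool = False       # [repeat] feature
    return_: bool = False      # [return] feature ("return" is a Python keyword)

    def repetition_type(self) -> str:
        # Two binary features give the four types listed in the text.
        if self.repeat and self.return_:
            # Trilled and bidirectional movements; the text also assigns
            # both features to the repeated circular movement of "aggressive".
            return "repeat + return"
        if self.repeat:
            return "full repetition"
        if self.return_:
            return "return"
        return "no repetition"


# "aggressive": a repeated circular path movement on the chest.
aggressive_path = MovementNode("path", repeat=True, return_=True)

# "shark": a complex movement, i.e. an orientation movement with
# repetition combined with a path movement.
shark = [MovementNode("orientation", repeat=True), MovementNode("path")]

print(aggressive_path.repetition_type())
print([node.repetition_type() for node in shark])
```

Because the features are attached to a node rather than to the sign as a whole, a complex movement such as the one in shark can carry a different repetition specification on each articulator.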

Relevance to this research

As discussed above, each sign in practically any sign language can be specified for a number of phonological features, such as handshape, place of articulation, orientation, movement, handedness, non-manual component, etc. In this research I follow Brentari's (1998) terminology. Of the inherent features, only handshapes are taken into account systematically. The prosodic features are annotated following Mak & Tang's (2011) extension of the Prosodic model.

Unlike Brentari's (1998) model, I specify prosodic features, or movement features, for more parameters: whether the movement is path or local, and whether there is a full repeat, a return, or a trill. This approach is borrowed from Mak & Tang (2011). It helps to represent complex movements, such as trills, in the form of a tree in a way that is consistent with the Prosodic model (Brentari 1998).

1.2 Computational methods

Researchers in the field of computational linguistics have also tackled the issue of handshape retrieval in order to implement sign language recognition (SLR) systems. Sign language recognition is relevant to this research because any SLR algorithm inevitably defines what it considers to be the minimal features of a language.

Metaxas et al. (2018) proposed the first linguistically-driven SLR framework. Their main idea is to add to the model training a 110-dimensional feature vector in which all the features are linguistically motivated. This feature vector consists of handshape, motion trajectory, number of hands, start and end position of a sign, dependencies between the active and the secondary hand in two-handed signs, and the non-manual component. What is crucial for us is that they take into account the start and end position of a sign, namely the two holds. This approach results in high recognition accuracy (top-1 93.3%), among other things because signs tend to be monosyllabic (Brentari 2012). This means that signs usually do have only two holds (in terms of the Movement-Hold model).

The main problem with this kind of research is the amount of data available. Koller et al. (2016), Rakowski et al. (2018), and others use large datasets to train their models. Recurrent and convolutional neural networks (RNNs and CNNs) require large amounts of training data, so they can hardly be used for handshape retrieval on a small dataset such as a language dictionary. This poses a challenge to sign linguists, because there is virtually no big data available for the purposes of handshape extraction and recognition. Metaxas et al. (2018) use a relatively small ASL dataset, but they have several recordings by different signers for each sign, while in a dictionary we have only one recording per sign.

Finally, another methodological contribution to the field is Börstell's (2018) script for making overlay pictures of signs for the Swedish Sign Language (SSL) dictionary. Börstell (2018) intended it to reduce the amount of manual annotation of the SSL dictionary. Initially, to make such overlay pictures (see Figure 16 for the sign bear in SSL), people manually annotated all signs for hold positions. The script does this work automatically with the help of the OpenCV module for Python, which registers differences between the frames of a video. First, it calculates a histogram for each frame of the video with the calcHist() function. The resulting histogram has pixel values on the x-axis and the number of pixels with each value on the y-axis. See the example for the first frame of the RSL sign shark in Figure 15 below. Then it compares the histogram of each frame with that of the previous frame using the correlation metric (compareHist() function). The correlation metric returns a value of at most 1, where 1 means that the two histograms are identical, as they are for two copies of the same image; the lower the value, the less similar the frames are. The result is used as a measure of the difference between frames. Finally, for each video the script tracks these frame differences over frame number (very roughly speaking, over time).

Figure 15. Result of the calcHist() function
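The frame-comparison step can be sketched in a few lines of dependency-free Python. This is not the original script: it is a minimal illustration that computes intensity histograms and their Pearson correlation by hand, which is the kind of computation that OpenCV's cv2.calcHist() and cv2.compareHist() with the cv2.HISTCMP_CORREL method perform. The toy frames (flattened lists of 8-bit grayscale values) and all function names are assumptions made for the example.

```python
from math import sqrt


def histogram(pixels, bins=256):
    """Count how many pixels take each intensity value 0..bins-1."""
    counts = [0] * bins
    for value in pixels:
        counts[value] += 1
    return counts


def correlation(h1, h2):
    """Pearson correlation of two histograms; 1.0 means identical histograms."""
    n = len(h1)
    mean1 = sum(h1) / n
    mean2 = sum(h2) / n
    cov = sum((a - mean1) * (b - mean2) for a, b in zip(h1, h2))
    var1 = sum((a - mean1) ** 2 for a in h1)
    var2 = sum((b - mean2) ** 2 for b in h2)
    if var1 == 0 or var2 == 0:
        # A completely flat histogram has no defined correlation;
        # returning 0.0 is a convention of this sketch only.
        return 0.0
    return cov / sqrt(var1 * var2)


def frame_similarities(frames, bins=256):
    """Correlation between the histogram of each frame and the previous one."""
    hists = [histogram(frame, bins) for frame in frames]
    return [correlation(prev, cur) for prev, cur in zip(hists, hists[1:])]


# Toy "video": three tiny frames; the second repeats the first (no movement),
# the third differs (movement).
frames = [
    [0, 0, 10, 10],
    [0, 0, 10, 10],
    [0, 200, 10, 255],
]
similarities = frame_similarities(frames)
print(similarities)
```

In this toy example the first value is 1 (up to rounding) because the first two frames are identical, and the second value is lower because the third frame has a different distribution of pixel values.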

The first peak in the frame differences over frame number marks the start of the sign, where the hands are just moving into the first hold position, and is omitted from further analysis. Negative peaks in the frame differences correspond to positions without movement (i.e. holds, as in the Movement-Hold theory). As the next step, prominent negative peaks are retrieved with a function based on the continuous wavelet transform (find_peaks_cwt()). Snapshots are then taken from the video at the moments where the prominent negative peaks occur, since these are taken to be holds. As the last step, the snapshots are overlaid, which results in pictures like the one in Figure 16. This algorithm does not employ any machine learning, which would require big data, and still manages to register hold positions in signs.

Figure 16. Overlay picture for bear in SSL (retrieved from Carl Börstell's GitHub)
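The idea of locating holds as prominent negative peaks can be illustrated with a simplified stand-in. The sketch below does not use the continuous wavelet transform and is not equivalent to find_peaks_cwt(); it simply keeps local minima of a frame-difference series that lie sufficiently far below their surroundings. The function name, the parameters start, window and min_drop, and the toy values are all assumptions made for this example.

```python
def find_holds(differences, start=1, window=2, min_drop=0.05):
    """Return indices of prominent local minima ("negative peaks").

    differences: frame-difference values, one per pair of consecutive frames.
    start:       first index to inspect; earlier values (the initial peak,
                 where the hands move into position) are skipped.
    window:      how many neighbours on each side are used to judge depth.
    min_drop:    minimal depth below the neighbouring maxima to count as a hold.
    """
    holds = []
    for i in range(max(start, 1), len(differences) - 1):
        here = differences[i]
        # A candidate must be lower than its left neighbour
        # and not higher than its right neighbour.
        if not (here < differences[i - 1] and here <= differences[i + 1]):
            continue
        left_max = max(differences[max(0, i - window):i])
        right_max = max(differences[i + 1:i + 1 + window])
        depth = min(left_max, right_max) - here
        if depth >= min_drop:
            holds.append(i)
    return holds


# Toy series: an initial peak at index 1, then two clear dips (holds).
differences = [0.02, 0.30, 0.12, 0.01, 0.10, 0.25, 0.08, 0.02, 0.15]
hold_indices = find_holds(differences, start=2)
print(hold_indices)  # [3, 7]
```

The returned indices point to the moments from which snapshots would be taken; a shallow dip that does not reach min_drop is ignored, which is a crude analogue of keeping only prominent peaks.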

1.3 Summary

All in all, until we know the phonemic inventory of handshapes for a given sign language, it is hard to predict what would work as the minimal segment of this language for sign language recognition. Nowadays, most SLR algorithms are based on large datasets, which make it possible to use neural networks (such as RNNs or CNNs; Koller et al. 2016; Rakowski et al. 2018). However, in the case of the current research, there are no RSL datasets that are large enough. What we have for RSL on a large scale is the Spreadthesign dictionary and corpus (Burkova 2015). The dictionary is rather small, while the corpus does not contain word entries, only lengthy stretches of signing or conversations. The corpus does have a lot of data; however, for the purposes of handshape extraction we need dictionary-like entries, because in continuous signing signs can assimilate to one another, which is an obstacle to retrieving citation-form handshapes.

This situation poses a challenge for sign language recognition research on RSL. How can we automatically learn handshapes from a small linguistically-driven dataset? In this research I focus on the phonemic inventory of RSL and its automatic retrieval without machine learning methods. The result is a compact description of the minimal phonological segments of RSL, which may in the future be helpful for automatic linguistically-driven sign language recognition.

2. Studies of handshape inventories in sign languages

In this section I give an overview of works on Sign Language of the Netherlands (NGT) and Adamorobe Sign Language (AdaSL, in Ghana) which pay particular attention to the question of the phonemic inventory of handshapes in these sign languages. In addition, I briefly discuss research on such rural sign languages as Inuit SL (IUR) (Schuit 2014), Yolngu SL (YSL, in Australia) (Bauer 2012), and Kata Kolok (Marsaja 2008), which puts more emphasis on phonetic inventories of handshapes than on phonemic ones.

Els van der Kooij (2002) described her own phonological model and the phonemic inventory of NGT. This phonological model is an update of the Dependency model by van der Hulst (1996). It suggests that movement as a separate segment and as a separate node of the tree structure can be omitted from the model. However, the features of the start and end point of the movement and the manner of movement are still specified in the tree. Under this framework, van der Kooij (2002) postulated the main rules for distinguishing between phonetic and phonemic handshapes in any sign language. A handshape is phonemic if and only if: 1) it is not (only) iconically motivated, i.e. there are signs where this handshape is arbitrary; 2) it cannot be predicted from the phonetics. Iconic motivation implies that the handshape refers to an object itself (“object” iconicity pattern), or stands for the contour of an object (“contour” iconicity pattern), or traces the shape of the object (“tracing” iconicity pattern), or that the hands refer to the hands of someone holding or using an object (“handling” iconicity pattern). So, if a particular handshape occurs only in iconic signs, regardless of their iconicity type, then it is not phonemic. The phonetic restriction, by contrast, is less clearly defined. Consider the RSL sign subtitles (see Figure 17), where the handshape of the second hold differs from the first one only in that the fingers are flattened. This second handshape is phonetically motivated: if the first handshape were repeated in the second hold, the signer would have to align the elbow with the hand, and this extra movement is more energy consuming (and, consequently, more marked) than just flattening the fingers. It is important to note that if a particular handshape sometimes occurs in iconic signs or is predicted from the phonetics, but also occurs in non-iconic signs where it is not phonetically predicted, then it is phonemic.
On the basis of this theory and these Phonetic Implementation rules, van der Kooij investigates NGT and proposes that, out of 70 phonetic handshapes, NGT has 31 phonemic handshapes (see van der Kooij (2002: 154-158) for the whole list of phonemic handshapes).

Figure 17. subtitles, RSL
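The two criteria can be read as a simple decision rule over annotated occurrences of a handshape. The Python sketch below follows one possible reading, stated here as an assumption: a handshape counts as phonemic as soon as it occurs in at least one sign where it is both arbitrary (not iconically motivated) and not phonetically predicted. The record type, the field names and the toy annotations are hypothetical and do not come from van der Kooij (2002).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Occurrence:
    """One annotated use of a handshape in a sign (hypothetical schema)."""
    sign: str
    handshape: str
    iconic: bool                  # handshape is iconically motivated in this sign
    phonetically_predicted: bool  # handshape follows from phonetic implementation


def phonemic_handshapes(occurrences):
    """Handshapes with at least one arbitrary, non-predicted occurrence."""
    return {
        occ.handshape
        for occ in occurrences
        if not occ.iconic and not occ.phonetically_predicted
    }


# Toy annotations with invented sign and handshape labels.
annotations = [
    Occurrence("sign-1", "hs-A", iconic=True, phonetically_predicted=False),
    Occurrence("sign-2", "hs-A", iconic=False, phonetically_predicted=False),
    Occurrence("sign-3", "hs-B", iconic=True, phonetically_predicted=False),
    Occurrence("sign-4", "hs-C", iconic=False, phonetically_predicted=True),
]

print(sorted(phonemic_handshapes(annotations)))  # ['hs-A']
```

In the toy data, hs-A is phonemic although it also occurs in an iconic sign, hs-B is excluded because it is only ever iconic, and hs-C is excluded because its only occurrence is phonetically predicted.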

The phonemic inventory of AdaSL, with a particular focus on the handshape and location inventories, is described in Nyst (2007). Nyst (2007) collected 365 videos of AdaSL signs. Her dataset includes compounds but no initialized signs, that is, signs in which the handshape refers to the first letter of the corresponding word in the spoken language. All of these videos were then annotated for handshape, handshape change (movement in the aperture), and location. Handshape annotation in this work fully relies on HamNoSys (Hanke 2004). The analysis is carried out separately for the active and the secondary hand. She establishes that in total there are 29 phonetic handshapes in AdaSL. Three of them (the so-called “K”, “8”, and “F” handshapes, see Nyst (2007: 57) for images) occur exclusively in signs which have an aperture change, and only in the first part of those signs. Furthermore, four handshapes on the active hand together occur in half of all the signs: the “1”-handshape, the “B”-handshape, the “S”-handshape, and the lax “B”-handshape (see Figures 18 and 19 for the “S”-handshape and the lax “B”-handshape respectively). Nyst (2007) has also found that the active hand behaves similarly across sign types: the most frequent handshapes are almost the same regardless of whether the sign is two-handed symmetric, two-handed asymmetric, or one-handed. In addition, she discusses van der Kooij's (2002) hypothesis that the most frequent handshapes have approximately the same frequencies in all sign languages. Basically, this means that, according to van der Kooij (2002), all sign languages have about 3-4 most frequent handshapes on the active hand which cover about half of the whole dataset of signs. Nyst (2007) shows that this hypothesis holds for AdaSL (as well as for NGT, ASL, British SL (BSL), and Israeli SL (ISL)).
As for the secondary hand, she has discovered that 83% of its handshapes have all fingers selected. Of the 29 phonetic handshapes of AdaSL, only seven turn out to be phonemic. These seven handshapes are arbitrary. Additionally, Nyst postulates six iconic handshapes. These six occur systematically, but they are always only iconically motivated, and following van der Kooij's (2002) theory we cannot assume that they are phonemes too.

Figure 18. “S”-handshape Figure 19. lax “B”-handshape
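The hypothesis that a handful of handshapes covers about half of a dataset is easy to check on annotated data. The following Python sketch computes the share of signs covered by the k most frequent handshapes. The function name and the toy list of annotations are invented for illustration; the counts are not Nyst's (2007) data.

```python
from collections import Counter


def top_k_coverage(handshapes, k):
    """Share of signs whose handshape is among the k most frequent ones."""
    counts = Counter(handshapes)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    covered = sum(count for _, count in counts.most_common(k))
    return covered / total


# Toy annotation of active-hand handshapes for 20 signs (invented counts).
toy_annotation = (
    ["1"] * 4 + ["B"] * 3 + ["S"] * 2 + ["lax B"] * 2
    + ["V", "Y", "5", "A", "C", "F", "K", "8", "L"]
)

print(top_k_coverage(toy_annotation, 4))
```

With these invented counts the four most frequent handshapes cover 11 of the 20 signs, i.e. slightly more than half.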

In comparison with NGT, AdaSL shows a much smaller inventory of phonemic handshapes: NGT has 31 phonemic handshapes, while AdaSL has only seven. Crucially, Nyst (2007) also postulates that the AdaSL phonemic inventory is restricted to less complex, or in other words less marked, handshapes. She claims that phonemic handshapes usually have either only one finger selected (index or thumb) or all fingers selected, and that there is only one phonemic handshape, the V-handshape, which has two fingers selected. However, it seems natural for any language that a smaller phonemic inventory consists of less marked and less complex phonemes.

There are also three more works which focus mostly on the phonetic inventories of handshapes. Note that all of these languages, Kata Kolok, IUR, and YSL, are rural sign languages, as is AdaSL. The phonetics and phonology of rural sign languages are usually somewhat different from those of urban sign languages such as NGT or RSL. Nyst (2012) points out that in general rural SLs have three main properties: 1) they have smaller phonetic inventories of handshapes; 2) they have a larger signing space, in other words more types of places of articulation; and 3) not only the hands can act as articulators. IUR, however, is quite different in this respect, although it is a rural sign language. For example, there are only a few signs which use something other than the hands as articulators (Schuit 2014). In addition, the signing space in IUR is not very large and is comparable to the signing space usually used in urban sign languages (Schuit 2014). Nevertheless, the inventory of phonetic handshapes in IUR is small, as expected: there are 33 phonetic handshapes (Schuit 2014). In contrast to IUR, all three of these properties of rural sign languages apply to YSL (Bauer 2012) and Kata Kolok (Marsaja 2008). For instance, in Kata Kolok the inventory of locations includes the tongue and the crotch of the signer, the latter being very far from the neutral space (Marsaja 2008). However, Kata Kolok does not have a lot of mouthings (Marsaja 2008). Both YSL (Bauer 2012) and Kata Kolok (Marsaja 2008) have small inventories of phonetic handshapes, 33 and 28 respectively. Schuit (2014) proposes a typological classification of sign languages with respect to the size of the phonetic handshape inventory. The groups are separated by multiples of twenty.
According to Schuit (2014), IUR, AdaSL, Kata Kolok, and YSL belong to the group with a phonetic inventory size of 21-40, while NGT, with 70 handshapes, and Australian SL (Auslan) (Johnston & Schembri 2007 cited by Schuit 2014: 36), with 62 handshapes, belong to the group with an inventory size of 61-80. Furthermore, Schuit (2014) hypothesized that it is not very probable that there are sign languages which have more than 80 phonetic handshapes. I will show later that this does not hold for RSL.
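The grouping by multiples of twenty amounts to a one-line computation. The Python sketch below assigns an inventory size to its group; the function name is hypothetical, while the inventory sizes in the example are the ones cited in this section.

```python
from typing import Tuple


def inventory_size_group(n_handshapes: int, step: int = 20) -> Tuple[int, int]:
    """Return the (lower, upper) bounds of the size group, e.g. 33 -> (21, 40)."""
    if n_handshapes < 1:
        raise ValueError("an inventory must contain at least one handshape")
    lower = ((n_handshapes - 1) // step) * step + 1
    return lower, lower + step - 1


# Phonetic handshape inventory sizes as cited in this section.
inventories = {
    "IUR": 33,
    "AdaSL": 29,
    "Kata Kolok": 28,
    "YSL": 33,
    "Auslan": 62,
    "NGT": 70,
}

for language, size in inventories.items():
    lower, upper = inventory_size_group(size)
    print(f"{language}: {size} handshapes, group {lower}-{upper}")
```

The rural sign languages all fall into the 21-40 group, while Auslan and NGT fall into the 61-80 group, as stated above.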

...
