Automated method of hyper-hyponymic verbal pairs extraction from dictionary definitions based on syntax analysis and word embeddings

O. Antropova, E. Ogorodnikova; Ural Federal University named after the first President of Russia B.N. Yeltsin

This study is devoted to developing methods for the computer-aided extraction of hyper-hyponymic verbal pairs. The methods rest on the observation that, in a dictionary entry, the defined verb is commonly a hyponym and its definition is formulated through a hypernym in the infinitive. The first method extracts all infinitives that have dependents. The second method additionally requires the extracted infinitive to be the root of the syntax tree. The first method extracted almost twice as many hypernyms, whereas the second provided higher precision. The results of the first method were post-processed with word embeddings, which improved precision without a crucial drop in the number of extracted hypernyms.

Key words: semantics, verbal hyponymy, syntax models, word embeddings, RusVectores



Introduction

The study and identification of semantic relations play a special role in contemporary computational linguistics. The main application of semantic relation extraction is the construction of lexicographic resources (e.g. electronic dictionaries and thesauri). In addition, semantic relations are used in natural language processing and in language teaching resources.

The number of semantic relations is large, and the number of semantic units in a language is enormous. It is therefore almost impossible to establish all links manually; the task calls for other solutions, including automated methods of semantic relation extraction. The present research is devoted to developing such a method for the automated extraction of Russian hyper-hyponymic verbal pairs.

One of the problems is that there are no unified, precise criteria of hyponymy (in particular, troponymy); its definition is highly subjective and depends on individual comprehension and linguistic experience.

Elena Kotsova characterises the hypernym and the hyponym as follows:

1. Hypernym

a. Is more frequent in a natural language;

b. Allows more active synonymic substitution of species words in speech;

c. Can act as a hypernym at different grades and levels of a genus-species hierarchy;

d. Is simpler in morphemic structure and has no nominal motivation;

e. Cannot be a word with an utterly broad meaning.

2. Hyponym

a. Has a more specific meaning and can be divided into semantic features; usually it is a monosemantic word;

b. Has a seme which is important for hyponymic links;

c. Is less frequent, especially in the case of specific words with a professional meaning;

d. Can substitute for a word with a genus meaning more rarely, and only within its syntagmatic field;

e. Has a two-seme prototypical semantic structure (hyperseme + hyposeme);

f. Is in equivalent relations with other hyponyms of the same hyponymic group;

g. Usually has a more complex morphemic structure [Kotsova, 2010, p. 25-27].

The authors of Princeton WordNet formulate the idea of hyponymy among verbs as follows: “...the many different kinds of elaborations that distinguish a 'verb hyponym' from its superordinate have been merged into a manner relation that [Fellbaum, Miller, 1990] have dubbed troponymy (from the Greek tropos, manner or fashion). The troponymy relation between two verbs can be expressed by the formula To V1 is to V2 in some particular manner” [Fellbaum, 1993, p. 47].

For example, the verbs плестись, брести, тащиться `to trail, to plod, to trudge' are synonyms, as they express the same notion and carry the same amount of information. They are also interchangeable in context. The verb идти `to go' is their hypernym, as it has a broader meaning. It also meets all the hypernym criteria described above and fits the formula: to trail/plod/trudge is to go in some particular manner.

The relevance of our research is determined by the lack of studies dedicated to the automated extraction of semantic relations between verbs. As shown in the Related Work section, the majority of research is focused on nouns. At the same time, the methods applied to nouns in most cases do not work well with verbs. The method developed here could help address many theoretical and applied issues in linguistics, in particular the automated filling of lexical resources.

Related Work

Numerous studies on semantic relations extraction have been published since the pioneering work of [Hearst, 1992]. The methods applied to the task vary greatly: lexico-syntactic patterns [Hearst, 1992, 1998], automatic translation from a different language [Pianta et al., 2002], extraction from knowledge databases [Zesch et al., 2008; Panchenko et al., 2012], conversion from a linguistic ontology [Loukachevitch et al., 2016], crowdsourcing [Braslavski et al., 2016], extraction grammars [Gonçalo et al., 2009, 2010], morpho-syntactic rules [Rubashkin et al., 2010] and different combinations of the aforementioned methods with machine-learning techniques such as clustering and word embeddings [Kiselev, 2016; Alekseevsky, 2018; Karyaeva et al., 2018].

Despite this abundance of research, studies on verbs are not easily found, since most works are focused on nouns. An exception is [Gonçalo et al., 2010], which extracted different types of relations for four open grammatical categories (nouns, verbs, adjectives, adverbs). The authors obtained 58,362 pairs of synonyms for nouns and 30,180 pairs for verbs, but 122,478 noun hypernyms and no verb hypernyms at all. These results and our ongoing research suggest that verbal hyponymy extraction demands dedicated research, since methods developed for nouns do not work well with verbs.

The authors of [Goncharova, Cardenas, 2013] designed a method for extracting a hypo-hypernymic hierarchy of verbs from domain-specific corpora. The method is based on the cognitive theory of terminology [Benitez et al., 2005] and is developed further in [Cardenas, Ramisch, 2019]. First, the authors automatically extracted noun-verb-noun triples from specialized corpora of environmental science texts in English and Spanish. Second, they manually annotated each triple with the lexical domain of the verb and the semantic class and role of each noun. Finally, they manually inferred the hypo-hypernymic hierarchy of the extracted verbs according to their syntactic potential: the more types of semantic subclasses of nouns a verb accepts, the higher its position in the hierarchy. The method is very different from all the aforementioned ones because it focuses on domain-specific terminology and demands much more human effort. Therefore, despite being a useful tool for the creation of domain-specific ontologies, this method is hardly applicable to common language.

To the best of our knowledge, there is no other research on hypernym extraction for Russian verbs. Our research started from lexico-syntactic patterns [Hearst, 1992, 1998], that is, specific linguistic expressions or constructions which usually include both a hyponym and a hypernym in one context, for example <hyponym> and other <hypernym>; <hypernym> such as <hyponym>, and so on. First, we attempted to find typical lexico-syntactic patterns in corpus data, but this turned out to be inefficient for verbs. Even though hypernyms and hyponyms can both be found in the nearest context, we failed to discover any regular patterns in corpus data.

We manually analysed more than 400 contexts for 100 hyper-hyponymic pairs and found that these pairs fulfil the hyponymy function very rarely, in no more than 5-6 of the contexts in our set. For example, the sentence К тому же хотелось сучить ногами, вертеться, вообще - двигаться, хотя несколько минут назад он мечтал только об одном - лечь `In addition, he wanted to curl his toes, to spin, generally - to move, although a few minutes ago he dreamed of only one thing - to lie down' includes the hyper-hyponymic pair вертеться/двигаться `to spin / to move', but it contains no regular pattern that could be applied to other texts to find other hyper-hyponymic pairs. It also turned out that hyper-hyponymic pairs more often act as contextual synonyms in texts [Ogorodnikova, 2017].

Second, we tried to process dictionary data and find lexico-syntactic patterns there, since dictionary entries commonly contain both the hyponym and the hypernym. Frequent and universal lexico-syntactic patterns are easily detected for nouns: <hyponym> - род/вид/разновидность/... `class, sort, kind' <hypernym>. However, we failed to detect any similarly universal lexico-syntactic patterns for verbs. Nonetheless, we noticed that in most cases a hypernym in a definition is accompanied by a recurring specifying word. We call such words “lexical markers”. Unfortunately, the lexical markers differ drastically between semantic groups of verbs. For example, lexical markers such as вверх/вниз `up/down' are typical for verbs of movement and useless for verbs of speech, which are usually defined with markers such as громко / тихо, невнятно, отрывисто `loudly / quietly, incomprehensibly, abruptly'. We manually created a list of such markers for verbs of movement, automatically extracted hyper-hyponymic pairs from six dictionaries and manually evaluated them [Antropova, Ogorodnikova, 2019].

The method based on this idea showed a moderate precision of 0.61, but its coverage depends on the list of markers, which has to be created separately for every semantic group. Manual creation of such lists is time-consuming, and automating their creation does not seem to be much easier than the task of hypernym extraction itself.

Data

The study is mainly based on the material of dictionary definitions for verbs which were taken from seven Russian dictionaries:

1. Babenko L.G.: The Dictionary of Synonyms of the Russian Language, 2011;

2. Babenko L.G.: The Explanatory Dictionary of Russian Verbs, 1999;

3. Efremova T.F.: The New Dictionary of Russian Language. Explanatory-derivational, 2000;

4. Evgenyeva A.P.: The Small Academic Dictionary: in 4 v. The 4th ed., 1999;

5. Kuznetsov S.A.: The Big Explanatory Dictionary of the Russian Language, 2000;

6. Ushakov D.N.: The Explanatory Dictionary of the Russian Language: in 4 v., 1935-1940;

7. Linguistic Ontology Thesaurus RuThes.

These dictionaries are available in electronic form, so they can be processed easily. Besides, they are well known in Russian linguistics and offer the fullest coverage of the vocabulary.

To check the effectiveness of the proposed methods, we used one hundred Russian verbs. A test set of this size is sufficient to evaluate the methods while still allowing all the results to be analysed manually. The verbs were taken from The Explanatory Dictionary of Russian Verbs because it contains a detailed semantic classification of verbs, which makes it possible to take into account differences between semantic groups that may influence the results. Verbs from different groups were sampled in proportion to the sizes of those groups in the dictionary. We then tested our methods on the definitions of these verbs taken from all seven dictionaries.

Methods

Syntactic analysis of definitions

The method is made possible by the traditional way in which definitions are constructed. According to [Komarova, 1990] and [Shelov, 2003], there are several typical classes of definitions. The distinction that matters for this research is that definitions can be extended or unextended. Extended definitions are usually based on hyponymy, meronymy, or contextual explanations (рвать - резким движением разделять на части `to rip - to divide into parts with an abrupt movement'). Unextended definitions contain synonyms of the entry word or its derivatives that refer to another entry (жульничать - плутовать, мошенничать `to cheat - to palter, to swindle'; a perfective verb defined by reference to its imperfective counterpart).

The most common type of semantic relation in extended verbal definitions is hyponymy. This observation allows us to develop a method for the automated extraction of verbal hyper-hyponymic pairs.

In such definitions, a hypernym is usually expressed as an infinitive with dependent words. However, this rule still extracts some noise, since an infinitive can also occur in various extending constructions. We hypothesised that some of this noise can be removed by adding a rule that the target infinitive must be the root of the syntax tree of the definition.

In order to implement these methods, we used a pre-trained UDPipe model to obtain syntax trees for the definitions. UDPipe offers three models for the Russian language. We chose the Russian model trained on SynTagRus because, according to its authors' estimates, it provides better quality. On the basis of this model we created two methods of hypernym extraction from dictionary definitions:

1. “InfsWithDependants”. It extracts all the infinitives having dependent words in a given definition.

2. “RootInfWithDependants”. It extracts the infinitive having dependent words in a given definition only if it is the root of the syntax tree.

Thus, the definition стучать - ударять (ударить) в дверь, окно коротким, отрывистым звуком, выражая этим просьбу впустить кого-л., куда-л. `to knock - to hit a door, a window with short, choppy sounds wishing to let somebody in' is processed differently by the two methods. The second method discovers only one infinitive, ударять `to hit', which is the true hypernym for стучать `to knock', while the first one extracts two verbs, ударять and впустить `to hit, to let in', of which впустить `to let in' is an example of noise.
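The two rules can be sketched in a few lines of Python. This is only an illustration, not the authors' implementation: the `Token` structure, the function names and the toy parse are hypothetical, the parse is written by hand rather than produced by UDPipe, and the code assumes Universal Dependencies conventions (an infinitive is tagged `VERB` with the feature `VerbForm=Inf`, and the root has head 0). Ignoring punctuation when checking for dependants is likewise an assumption.

```python
from dataclasses import dataclass


@dataclass
class Token:
    id: int      # 1-based position in the definition
    lemma: str
    upos: str    # universal POS tag, e.g. "VERB"
    feats: str   # morphological features, e.g. "Aspect=Imp|VerbForm=Inf"
    head: int    # id of the syntactic head; 0 marks the root


def is_infinitive(tok):
    return tok.upos == "VERB" and "VerbForm=Inf" in tok.feats.split("|")


def has_dependants(tok, tokens):
    # Assumption: punctuation does not count as a dependent word.
    return any(t.head == tok.id and t.upos != "PUNCT" for t in tokens)


def infs_with_dependants(tokens):
    """Lemmas of all infinitives that have at least one dependent word."""
    return [t.lemma for t in tokens
            if is_infinitive(t) and has_dependants(t, tokens)]


def root_inf_with_dependants(tokens):
    """Same rule, but the infinitive must also be the root of the tree."""
    return [t.lemma for t in tokens
            if t.head == 0 and is_infinitive(t) and has_dependants(t, tokens)]


# Hand-written toy parse (not real parser output) of a shortened definition:
# "ударять в дверь, выражая просьбу впустить кого-либо"
PARSE = [
    Token(1, "ударять", "VERB", "Aspect=Imp|VerbForm=Inf", 0),
    Token(2, "в", "ADP", "_", 3),
    Token(3, "дверь", "NOUN", "Case=Acc", 1),
    Token(4, ",", "PUNCT", "_", 5),
    Token(5, "выражать", "VERB", "VerbForm=Conv", 1),
    Token(6, "просьба", "NOUN", "Case=Acc", 5),
    Token(7, "впустить", "VERB", "Aspect=Perf|VerbForm=Inf", 6),
    Token(8, "кто-либо", "PRON", "Case=Acc", 7),
]

print(infs_with_dependants(PARSE))      # ['ударять', 'впустить']
print(root_inf_with_dependants(PARSE))  # ['ударять']
```

On this toy tree the first rule returns both infinitives, while the root condition keeps only ударять, which mirrors the behaviour described for the стучать definition.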

Post-processing with word embeddings

As shown in Table 1, “InfsWithDependants” finds almost twice as many correct hypernyms as “RootInfWithDependants”, whereas the second method delivers considerably higher precision. We therefore decided to improve the “InfsWithDependants” results by post-processing them with word embeddings.

A word embedding is a mathematical model of a language. It is based on the idea that similar words tend to appear in similar contexts. A word embedding is a trained neural network which transforms words into vectors (or points) in some N-dimensional semantic space: if the words appear in similar contexts, the points are close to each other in the space. Strictly speaking, the output of the neural network is N numbers, a set of coordinates in the N-dimensional semantic space. These numbers can be represented visually either as a vector that begins at the origin and ends at the given coordinates, or simply as a point with the given coordinates. Figure 1 shows an example of such a representation. For the current research it is important that word embeddings also allow the calculation of a similarity measure for given words, namely the cosine similarity, which is treated here as scaling from 0 (least similar) to 1 (most similar); mathematically, cosine similarity can also take negative values down to -1. For example, according to the embedding from Figure 1, the cosine similarity of the verbs глядеть `to gaze' and смотреть `to look' is 0.836, of глядеть and делать `to do' 0.310, and of глядеть and обладать `to possess' 0.144. Models differ from each other in the following parameters: the corpora used for model training; the part of speech (POS) tags used to distinguish homonymic parts of speech (e.g., whether “go” is a verb or a noun); the size of the sliding window, i.e. the number of neighbouring words taken into account; and a number of other technical parameters such as the learning algorithm or dimensionality. See [Kutuzov, Kuzmenko, 2017] for details.

We employed pre-trained embeddings from the RusVectores project. The idea of the method is to drop an extracted candidate verb if its similarity with the defined verb is lower than a threshold.
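The filtering step can be illustrated with a minimal sketch. The two-dimensional vectors, the threshold of 0.3 and the function names below are invented for illustration and are not taken from any RusVectores model; in practice the vectors would come from a pre-trained embedding. Dropping a candidate that is missing from the vector table is also an assumption of this sketch.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equally long vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def filter_candidates(defined_verb, candidates, vectors, threshold):
    """Keep candidates whose similarity to the defined verb reaches the threshold."""
    kept = []
    for cand in candidates:
        # Assumption: words absent from the embedding are dropped.
        if defined_verb not in vectors or cand not in vectors:
            continue
        if cosine_similarity(vectors[defined_verb], vectors[cand]) >= threshold:
            kept.append(cand)
    return kept


# Toy 2-dimensional vectors, invented for illustration only.
TOY_VECTORS = {
    "стучать": [1.0, 0.0],
    "ударять": [0.8, 0.6],
    "впустить": [0.0, 1.0],
}

kept = filter_candidates("стучать", ["ударять", "впустить"], TOY_VECTORS, 0.3)
print(kept)  # ['ударять']
```

With these toy vectors the similarity of стучать and ударять is 0.8, so the candidate is kept, while впустить has similarity 0 and is dropped as noise.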

We took the “InfsWithDependants” results for the hundred verbs as a starting point and randomly divided them into test (30) and development (70) sets. Then, for each available RusVectores model, we did the following. First, we removed from the development set the verbs absent from the model. Second, we performed 7-fold cross-validation on the development set in order to get a better estimate of the model's quality and to find the best threshold for it. After that, we compared all the models by their mean performance on cross-validation, chose the best one and applied it to the test set.
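The threshold search inside cross-validation might look roughly as follows. This is a sketch under several assumptions that the text does not specify: each candidate is represented as a hypothetical `(similarity, is_correct)` pair, the threshold is chosen from a fixed grid by training-fold precision, ties are broken in favour of the lower threshold, and folds are formed by simple striding rather than random shuffling.

```python
def precision_recall(pairs, threshold):
    """Precision and recall of keeping candidates with similarity >= threshold."""
    kept = [ok for sim, ok in pairs if sim >= threshold]
    total_true = sum(ok for _, ok in pairs)
    tp = sum(kept)
    precision = tp / len(kept) if kept else 0.0
    recall = tp / total_true if total_true else 0.0
    return precision, recall


def best_threshold(pairs, grid):
    # Highest precision wins; on ties prefer the lower threshold,
    # which keeps more candidates (an assumption of this sketch).
    return max(grid, key=lambda t: (precision_recall(pairs, t)[0], -t))


def cross_validate(pairs, k, grid):
    """Return one (threshold, precision, recall) triple per held-out fold."""
    folds = [pairs[i::k] for i in range(k)]
    scores = []
    for i, held_out in enumerate(folds):
        train = [p for j, fold in enumerate(folds) if j != i for p in fold]
        t = best_threshold(train, grid)
        scores.append((t, *precision_recall(held_out, t)))
    return scores


# Invented toy data: true hypernyms have high similarity, noise has low.
TOY_PAIRS = [(0.9, True), (0.8, True), (0.2, False), (0.1, False)] * 7
GRID = [0.0, 0.25, 0.5, 0.75]

scores = cross_validate(TOY_PAIRS, 7, GRID)
mean_precision = sum(p for _, p, _ in scores) / len(scores)
print(scores[0], mean_precision)
```

Averaging the per-fold measures over the folds gives one figure per model, and such averages are what allow the models to be compared before the test set is touched.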

Fig. 1. A visualization of word embeddings

The embeddings are created by a RusVectores model trained on the Russian National Corpus and Wikipedia. Represented verbs: делать 'to do', работать 'to work', иметь 'to have', обладать 'to possess', мочь 'to be able to', уметь 'can', глядеть 'to gaze', видеть 'to see', смотреть 'to look', глазеть 'to stare', слышать 'to hear'

Results and Discussion

“InfsWithDependants” and “RootInfWithDependants” were applied to all the definitions of the hundred verbs. The extracted hypernym candidates were then manually marked for correctness. The evaluation results are summarized in Table 1. Obviously, “RootInfWithDependants” cannot extract more true hypernyms than “InfsWithDependants”, since it simply adds one more filtering condition. Calculating actual recall is not possible because only the extracted infinitives were marked. Nonetheless, the drop in the true positive rate gives some notion of the drop in recall. “InfsWithDependants” extracts almost twice as many true hypernyms as “RootInfWithDependants”, but it demonstrates a significantly lower precision.

Table 1. “InfsWithDependants” and “RootInfWithDependants” results for the hundred verbs

Method name	True Positive Rate	Precision
InfsWithDependants	1.000	0.466
RootInfWithDependants	0.595	0.571
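The two columns of Table 1 can be read as follows. The sketch below rests on one interpretation of the text, namely that the true positive rate is the number of true hypernyms found by a method divided by the number found by “InfsWithDependants”, which is why the latter scores 1.000 by construction. The function name and the labelled pairs are hypothetical and serve only as an illustration.

```python
def evaluate(extracted, correct, reference_true_count):
    """Precision and true positive rate relative to a reference method.

    extracted: set of (defined verb, candidate) pairs returned by a method
    correct: set of pairs manually marked as true hypernyms
    reference_true_count: number of true hypernyms found by the reference
        method (assumed here to be InfsWithDependants)
    """
    tp = len(extracted & correct)
    return {
        "precision": tp / len(extracted),
        "true_positive_rate": tp / reference_true_count,
    }


# Hypothetical annotation of three extracted pairs.
CORRECT = {("стучать", "ударять"), ("рвать", "разделять")}
INFS = {("стучать", "ударять"), ("стучать", "впустить"), ("рвать", "разделять")}
ROOT = {("стучать", "ударять")}  # subset of INFS: one extra filtering condition

reference = len(INFS & CORRECT)
print(evaluate(INFS, CORRECT, reference))
print(evaluate(ROOT, CORRECT, reference))
```

In this toy setting the stricter method loses half of the true hypernyms but returns no noise, the same kind of trade-off that Table 1 reports at a larger scale.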

Let us consider some typical mistakes that arise during the syntactic analysis of the definitions. First, for the definition вертеться - совершать круговые движения; вращаться, крутиться `to revolve - to carry out circular motions; to rotate, to spin' the model marks крутиться `to spin' as dependent on вращаться `to rotate', whereas they are actually homogeneous and neither has dependent words, which is typical for synonyms in definitions and helps to distinguish them from hypernyms. Second, a common problem may be illustrated by the definition доходить - понимая и осознавая что-либо, разбираться/разобраться в чем-либо (в каком-либо сложном вопросе, запутанном деле и т.п.) `to see the light - to figure something out understanding or realizing it (about challenging issue, complicated problem etc.)'. Here the model mistakenly marks the first verb of the verbal adverbial construction as the root, while the root should have been the homogeneous verbs разбираться/разобраться `to figure out'. Such mistakes might be avoided by customising the syntax model for our task. This issue will be addressed in further research.

The following example illustrates the drawbacks of our methods. Заедать - подвергая что-л. (обычно какие-л. механизмы) отрицательному воздействию, зажимать/зажать, защемлять/защемить, зацеплять/зацепить какую-л. деталь, препятствуя движению, нормальному функционированию `to jam - exposing negatively (usually some devices), to press, to squeeze, to hook a detail, so that movement or action is prevented'. Even if the syntax tree for this definition were perfect, our methods could not distinguish hypernyms from synonyms when the latter have dependent words. The application of our methods is also limited to extracting one-verb hypernyms. Finding the exact boundaries of a multi-word hypernym is much more difficult. A frequent case of a multi-word hypernym involves the verb совершать `to carry out', which can collocate with different specifying complements. For instance, the definition of the verb вертеться `to revolve' starts with the expression совершать движения `to carry out motions' in many dictionaries.

As mentioned earlier, we decided to post-process the results of “InfsWithDependants” in order to improve its precision. The cross-validation performance of every available RusVectores model on the “InfsWithDependants” results is shown in Table 2. It was possible to calculate recall in this case because we processed only the extracted hypernyms, which had been manually marked for correctness, so we knew exactly how many true hypernyms the set contained.

Table 2. Average quality measures on cross-validation for RusVectores models (Corpora, POS tags and Window size are the model parameters)

N	Corpora	POS tags	Window size	Precision	Recall	F-score
1	Ruscorpora	Universal tags	20	0.5171	0.6835	0.5813
2	Russian Wikipedia and Ruscorpora	Universal tags	2	0.4851	0.798	0.5943
3	Tayga	Universal tags	2	0.5201	0.7295	0.6011
4	Tayga	None	10	0.5005	0.5889	0.5338
5	Russian news	Universal tags	5	0.4796	0.8957	0.6221
6	Araneum	None	5	0.5215	0.436	0.4728

The models can be downloaded from http://rusvectores.org/models/. Model filenames:

1 - ruscorpora_upos_cbow_300_20_2019;

2 - ruwikiruscorpora_upos_skipgram_300_2_2019;

3 - tayga_upos_skipgram_300_2_2019;

4 - tayga_none_fasttextcbow_300_10_2019;

5 - news_upos_skipgram_300_5_2019;

6 - araneum_none_fasttextcbow_300_5_2018
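For reference, the three measures in Table 2 can be written out as below. The text does not name the F-score variant, so the balanced F1 is assumed here; TP, FP and FN denote the numbers of true hypernyms kept, false candidates kept and true hypernyms dropped by the threshold.

```latex
P = \frac{TP}{TP + FP}, \qquad
R = \frac{TP}{TP + FN}, \qquad
F_1 = \frac{2PR}{P + R}
```

Because the table reports averages over the folds, the F-score column need not equal the F1 value computed from the averaged precision and recall. For model 1, for instance, 2 · 0.5171 · 0.6835 / (0.5171 + 0.6835) ≈ 0.589, whereas the reported mean is 0.5813, which is consistent with the F-score being averaged fold by fold.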

When fitting the thresholds and choosing the best model, we decided to rely on precision rather than F-score, because precision for this task does not grow monotonically as the threshold increases. A typical graph of precision, recall and F-score is shown in Figure 2. Precision grows only up to a certain threshold and then decreases. This happens because the word embeddings that we used do not distinguish between the different meanings of a word and combine them all into a single averaged vector. Therefore, if at least one word of a hyper-hyponymic pair is used in a meaning other than its most frequent one, the similarity of the pair may be rather low. Figure 2 also demonstrates that recall varies over a much wider range and thus has a greater impact on the F-score.

We chose the third model (Tayga, Universal tags, Window Size = 2) from all the models presented in Table 2: even though the sixth model (Araneum, No tags, Window Size = 5) has slightly higher precision (0.5215 versus 0.5201), the third model has significantly higher recall (0.7295 versus 0.436).

Fig. 2. A typical dependency of precision, recall and F-score on the threshold

We applied the chosen model with the threshold found during cross-validation to the test set and compared the outcome with the corresponding parts of the “InfsWithDependants” and “RootInfWithDependants” results (see Table 3). In this way we obtained results with a higher true positive rate and a higher precision than those of the “RootInfWithDependants” method.

Table 3. Final results for the test set

Method name	True Positive Rate	Precision
InfsWithDependants	1.000	0.401
Post-processed InfsWithDependants	0.832	0.517
RootInfWithDependants	0.740	0.504

Conclusion

A preliminary linguistic analysis allowed us to conclude that dictionary definitions are a worthwhile source for verb hyponymy extraction. In this kind of linguistic source, verbal hypernyms and hyponyms occur together more frequently than in others (e.g. corpus data).

Our previous study also allowed us to conclude that lexico-syntactic patterns, widely used for the extraction of hyper-hyponymic pairs of nouns, are not suitable for verbs: we were unable to find any verbal lexico-syntactic patterns either in corpora or in dictionary definitions. Therefore, extraction methods should be developed specifically for verbs.

The study shows that syntactic analysis of definitions is a good starting point for the extraction of hyper-hyponymic verbal pairs. We developed two methods based on syntactic analysis of definitions and applied them to seven Russian dictionaries. The first method extracts all infinitives that have dependants. The second method additionally requires the extracted infinitive to be the root of the syntax tree. The use of pre-trained word embeddings from the RusVectores project improved the precision of the first syntax-based method without a crucial drop in the number of extracted true hypernyms, which allowed it to outperform the second syntax-based method in both precision and the number of extracted true hypernyms.

Nonetheless, the analysis of mistakes showed that the syntax model should be customised for our task to improve the results of the developed methods. We will address these issues in future research.

References

1. Antropova, Ogorodnikova, 2019 - Антропова О.И., Огородникова Е.А. Возможности автоматизированного выделения гипо-гиперонимических пар из словарных определений глаголов // Вестник Южно-Уральского государственного университета. Серия: Лингвистика. 2019. Т. 16. № 2. С. 51-57.

2. Alekseevsky, 2018 - Алексеевский Д.А. Методы автоматического выделения тезаурусных отношений на основе словарных толкований: Дис. ... канд. филол. наук. М., 2018.

3. Benitez et al., 2005 - Benitez F., Exposito C.M., Exposito M.V., Linares C.M. Framing Terminology: A Process-Oriented Approach. Meta: Translators' Journal. 2005. Vol. 50. No. 4.

4. Braslavski et al., 2016 - Braslavski P., Ustalov D., Mukhin M., Kiselev Y. YARN: Spinning-in-Progress. Proceedings of the Eighth Global Wordnet Conference. V.B. Mititelu, C. Forascu, C. Fellbaum, P. Vossen (eds.). Bucharest, 2016. Pp. 58-65.

5. Cardenas, Ramisch, 2019 - Cardenas B.S., Ramisch C. Eliciting specialized frames from corpora using argument-structure extraction techniques. Terminology. International Journal of Theoretical and Applied Issues in Specialized Communication. 2019. 25 (1). Pp. 1-31.

6. Fellbaum, Miller, 1990 - Fellbaum C., Miller G.A. Folk psychology or semantic entailment? A reply to Rips and Conrad. Psychological Review. 1990. No. 97. Pp. 565-570.

7. Fellbaum, 1993 - Fellbaum C. English verbs as a Semantic Net. WordNet: Lexical Database - Revised August, 1993.

8. Goncharova, Cardenas, 2013 - Goncharova Yu., Cardenas B.S. Specialized Corpora Processing with Automatic Extraction Tools. Procedia - Social and Behavioral Sciences. A. Hamilton (ed.). Elsevier, 2013. Pp. 293-297.

9. Gonçalo et al., 2009 - Gonçalo H.O., Santos D., Gomes P. Relations extracted from a Portuguese dictionary: Results and first evaluation. Local Proc. 14th Portuguese Conference on Artificial Intelligence (EPIA). L.S. Lopes, N. Lau, P. Mariano, L.M. Rocha (eds.). Springer, 2009. Pp. 541-552.

10. Gonçalo et al., 2010 - Gonçalo H.O., Gomes P. Onto.PT: Automatic Construction of a Lexical Ontology for Portuguese. Proceedings of 5th European Starting AI Researcher Symposium (STAIRS 2010). T. Abudawood, P.A. Flach (eds.). Lisbon, 2010. Pp. 199-211.

11. Hearst, 1992 - Hearst M.A. Automatic acquisition of hyponyms from Large Text Corpora. Proceedings of the 14th Conference on Computational Linguistics. Nantes, 1992. Pp. 539-545.

12. Hearst, 1998 - Hearst M.A. Automated Discovery of WordNet Relations. WordNet: An Electronic Lexical Database. C. Fellbaum (ed.). Cambridge, 1998. Pp. 132-152.

13. Karyaeva et al., 2018 - Karyaeva M., Braslavski P., Kiselev Yu. Extraction of hypernyms from dictionaries with a Little Help from Word Embeddings. Analysis of Images, Social Networks and Texts - 7th International Conference, AIST. A. Panchenko, W.M. van der Aalst, M. Khachay et al. (eds.) Moscow, 2018.

14. Kiselev, 2016 - Киселев Ю.А. Разработка автоматизированных методов выявления семантических отношений для электронных тезаурусов: Дис. ... канд. техн. наук. Екатеринбург, 2016.

15. Komarova, 1990 - Комарова З.И. Русская отраслевая терминология и терминография. Каменец-Подольский, 1990.

16. Kotsova, 2010 - Котцова Е.Е. Гипонимия в лексической системе русского языка (на материале глагола): Дис. ... д-ра филол. наук. Архангельск, 2010.

17. Kutuzov, Kuzmenko, 2017 - Kutuzov A., Kuzmenko E. WebVectors: A Toolkit for Building Web Interfaces for Vector Semantic Models. Analysis of Images, Social Networks and Texts. AIST 2016. Ignatov D. et al. (eds.). Springer, Cham, 2017.

18. Loukachevitch et al., 2016 - Loukachevitch N.V., Lashevich G., Gerasimova A.A. et al. Creating Russian WordNet by Conversion. Proceedings of Conference on Computational linguistics and Intellectual technologies Dialog-2016. V.P. Selegey et al. (eds.). Moscow, 2016. Pp. 405-415.

19. Ogorodnikova, 2017 - Огородникова Е.А. Использование лексико-синтаксических шаблонов для формализации родовидовых отношений в толковом словаре // Евразийский гуманитарный журнал. 2017. № 2. С. 20-24.

20. Panchenko et al., 2012 - Panchenko A., Adeykin S., Romanov P., Romanov A. Extraction of Semantic Relations between Concepts with KNN Algorithms on Wikipedia. Concept Discovery in Unstructured Data Workshop (CDUD) of International Conference On Formal Concept Analysis. D. Ignatov, S. Kuznetsov, J. Poelmans (eds.). Leuven, 2012. Pp. 78-88.

21. Pianta et al., 2002 - Pianta E., Bentivogli L., Girardi C. MultiWordNet: Developing an aligned multilingual database. Proceedings of the 1st International WordNet Conference. Ch. Fellbaum, P. Vossen (eds.). Mysore, 2002. Pp. 293-302.

22. Rubashkin et al., 2010 - Опыт автоматизированного пополнения онтологий с использованием машиночитаемых словарей / Рубашкин В.Ш., Бочаров В.В., Пивоварова Л.М., Чуприн Б.Ю. // Компьютерная лингвистика и интеллектуальные технологии: По материалам ежегодной Международной конференции Диалог. Вып. 9 (16). Кибрик A.E. и др. (ред.). М., 2010. С. 413-418.

23. Shelov, 2003 - Шелов С.Д. Термин. Терминологичность. Терминологические определения. СПб., 2003.

24. Zesch et al., 2008 - Zesch T., Muller Ch., Gurevych I. Extracting lexical semantic knowledge from Wikipedia and Wiktionary. Proceedings of the Sixth International Conference on Language Resources and Evaluation. Marrakech, 2008. Pp. 1646-1652.
