What neural networks know about linguistic complexity
Linguistic complexity is a multifaceted phenomenon: it manifests itself at different levels (from texts to sentences to words to subword units), through different features, and in different tasks (reflecting the specific needs of different kinds of audiences).
Category | Foreign languages and linguistics |
Type | article |
Language | English |
Date added | 16.08.2023 |
13. Fytas, Panagiotis, Georgios Rizos & Lucia Specia. 2021. What makes a scientific paper be accepted for publication? Proceedings of the First Workshop on Causal Inference and NLP. Association for Computational Linguistics, Punta Cana, Dominican Republic. 44-60.
14. Halliday, M.A.K. 1992. Language as system and language as instance: The corpus as a theoretical construct. In J. Svartvik (ed.), Directions in corpus linguistics: Proceedings of Nobel Symposium 82 Stockholm 65, 61-77. Walter de Gruyter.
15. Hosmer Jr, David W., Stanley Lemeshow & Rodney X. Sturdivant. 2013. Applied Logistic Regression. John Wiley & Sons.
16. Janizek, Joseph D., Pascal Sturmfels & Su-In Lee. 2021. Explaining explanations: Axiomatic feature interactions for deep networks. Journal of Machine Learning Research 22(104). 1-54.
17. Juilland, Alphonse. 1964. Frequency Dictionary of Spanish Words. Mouton.
18. Käding, Friedrich Wilhelm (ed.). 1897. Häufigkeitswörterbuch der deutschen Sprache. Selbstverlag.
19. Khallaf, Nouran & Serge Sharoff. 2021. Automatic difficulty classification of Arabic sentences. Proceedings of the Sixth Arabic Natural Language Processing Workshop. Association for Computational Linguistics, Kyiv, Ukraine (Virtual). 105-114.
20. Kunilovskaya, Maria & Ekaterina Lapshinova-Koltunski. 2019. Translationese features as indicators of quality in English-Russian human translation. Proceedings of the Human-Informed Translation and Interpreting Technology Workshop (HiT-IT 2019). Incoma Ltd., Shoumen, Bulgaria, Varna, Bulgaria. 47-56.
21. Laposhina, Antonina N., Tatyana Veselovskaya, Maria Lebedeva & Olga Kupreshchenko. 2018. Automated text readability assessment for Russian second language learners. Computational Linguistics and Intellectual Technologies: Proceedings of the International Conference “Dialogue”.
22. Lorge, Irving. 1944. Predicting readability. Teachers College Record.
23. Nadeem, Farah & Mari Ostendorf. 2018. Estimating linguistic complexity for science texts. Proceedings of the Thirteenth Workshop on Innovative Use of NLP for Building Educational Applications. Association for Computational Linguistics, New Orleans, Louisiana. 45-55.
24. Council of Europe. 2001. Common European Framework of Reference for Languages: Learning, Teaching, Assessment (CEFR). Technical report, Council of Europe, Strasbourg.
25. Orlov, Jurij. 1983. Ein Modell der Häufigkeitsstruktur des Vokabulars. In H. Guiter & M. Arapov (eds.), Studies on Zipf's law, 154-233.
26. Paun, Silviu, Bob Carpenter, Jon Chamberlain, Dirk Hovy, Udo Kruschwitz & Massimo Poesio. 2018. Comparing Bayesian models of annotation. Transactions of the Association for Computational Linguistics 6. 571-585.
27. Pitler, Emily & Ani Nenkova. 2008. Revisiting readability: A unified framework for predicting text quality. Proc EMNLP. 186-195.
28. Rogers, Anna, Olga Kovaleva & Anna Rumshisky. 2020. A primer in BERTology: What we know about how BERT works. Transactions of the Association for Computational Linguistics 8. 842-866.
29. Sharoff, Serge. 2021. Genre annotation for the web: Text-external and text-internal perspectives. Register Studies 3. 1-32.
30. Sharoff, Serge, Svitlana Kurella & Anthony Hartley. 2008. Seeking needles in the Web haystack: Finding texts suitable for language learners. Proc Teaching and Language Corpora Conference, TaLC 2008. Lisbon.
31. Shavrina, Tatiana & Olga Shapovalova. 2017. To the methodology of corpus construction for machine learning: Taiga syntax tree corpus and parser. CORPORA, International Conference. Saint-Petersburg.
32. Sheehan, Kathleen M., Michael Flor & Diane Napolitano. 2013. A two-stage approach for generating unbiased estimates of text complexity. Proceedings of the Workshop on Natural Language Processing for Improving Textual Accessibility. Association for Computational Linguistics, Atlanta, Georgia. 49-58.
33. Solovyev, Valery, Marina Solnyshkina, Vladimir Ivanov & Ildar Batyrshin. 2019. Prediction of reading difficulty in Russian academic texts. Journal of Intelligent & Fuzzy Systems 36(5). 4553-4563.
34. Straka, Milan & Jana Strakova. 2017. Tokenizing, POS tagging, lemmatizing and parsing UD 2.0 with UDPipe. Proc CoNLL 2017 Shared Task. Association for Computational Linguistics, Vancouver, Canada. 88-99.
35. Vajjala, Sowmya & Detmar Meurers. 2012. On improving the accuracy of readability classification using insights from second language acquisition. Proceedings of the Seventh Workshop on Building Educational Applications Using NLP. Association for Computational Linguistics, Montreal, Canada. 163-173.
36. Vajjala, Sowmya & Detmar Meurers. 2014. Readability assessment for text simplification: From analysing documents to identifying sentential simplifications. ITL-International Journal of Applied Linguistics 165(2). 194-222.
37. Wolf, Thomas, Lysandre Debut, Victor Sanh, Julien Chaumond, Clement Delangue, Anthony Moi, Pierric Cistac, Tim Rault, Remi Louf, Morgan Funtowicz, Joe Davison, Sam Shleifer, Patrick von Platen, Clara Ma, Yacine Jernite, Julien Plu, Canwen Xu, Teven Le Scao, Sylvain Gugger, Mariama Drame, Quentin Lhoest & Alexander M. Rush. 2019. HuggingFace's Transformers: State-of-the-art natural language processing. arXiv preprint arXiv:1910.03771.
38. Xia, Menglin, Ekaterina Kochmar & Ted Briscoe. 2016. Text readability assessment for second language learners. Proceedings of the 11th Workshop on Innovative Use of NLP for Building Educational Applications. Association for Computational Linguistics, San Diego, CA. 12-22.
39. Yuan, Yu & Serge Sharoff. 2020. Sentence level human translation quality estimation with attention-based neural networks. Proc LREC, Marseilles.
40. Zhai, Yuming, Gabriel Illouz & Anne Vilnat. 2020. Detecting non-literal translations by fine-tuning cross-lingual pre-trained language models. Proceedings of the 28th International Conference on Computational Linguistics. International Committee on Computational Linguistics, Barcelona, Spain (Online). 5944-5956.
Appendix 1. Linguistic features
The order of the linguistic features and their codes are taken from (Biber 1988). The conditions for detecting the features in English replicate the published procedures from (Biber 1988); many of them are expressed via lists of lexical items or via POS annotations, which in this study are provided by UDPipe (Straka & Strakova 2017). The Russian features are based either on translations of the English word lists or on identical or functionally similar constructions.
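As a rough illustration of how conditions of this kind can be checked against POS-annotated text, the sketch below counts a handful of features over tokens that are assumed to be already tagged in a CoNLL-U-like form (lemma, universal POS tag, morphological features). It is not the procedure used in the study: the token dictionaries, the `matches` helper and the `FEATURES` mapping are hypothetical, the word lists contain only the items quoted in the appendix (the full lists are in Biber 1988), and "lex" is assumed here to refer to the lemma.

```python
from collections import Counter

# Truncated sample lists: only the items quoted in the appendix table.
# The complete lists are given in (Biber 1988) and are NOT reproduced here.
PLACE_ADVERBIALS = {"aboard", "above", "abroad", "across"}
SECOND_PERSON = {"you", "your", "yourself", "yourselves"}
NOMINAL_SUFFIXES = ("tion", "ment", "ness", "ism")


def matches(token, upos=None, feat=None, lex=None):
    """Return True if the token satisfies every condition that is given."""
    if upos is not None and token["upos"] != upos:
        return False
    if feat is not None and feat not in token["feats"].split("|"):
        return False
    if lex is not None and token["lemma"].lower() not in lex:
        return False
    return True


# Hypothetical mapping from feature codes to token-level tests.
FEATURES = {
    "A01": lambda t: matches(t, upos="VERB", feat="Tense=Past"),
    "A03": lambda t: matches(t, upos="VERB", feat="Tense=Pres"),
    "B04": lambda t: matches(t, upos="ADV", lex=PLACE_ADVERBIALS),
    "C07": lambda t: matches(t, upos="PRON", lex=SECOND_PERSON),
    "E14": lambda t: t["lemma"].lower().endswith(NOMINAL_SUFFIXES),
    "I39": lambda t: matches(t, upos="ADP"),
}


def count_features(tokens):
    """Count how many tokens satisfy each feature condition."""
    counts = Counter()
    for token in tokens:
        for code, test in FEATURES.items():
            if test(token):
                counts[code] += 1
    return counts


# Hand-annotated example sentence: "You went abroad for treatment"
sentence = [
    {"lemma": "you", "upos": "PRON", "feats": "Case=Nom|Person=2|PronType=Prs"},
    {"lemma": "go", "upos": "VERB", "feats": "Mood=Ind|Tense=Past|VerbForm=Fin"},
    {"lemma": "abroad", "upos": "ADV", "feats": "_"},
    {"lemma": "for", "upos": "ADP", "feats": "_"},
    {"lemma": "treatment", "upos": "NOUN", "feats": "Number=Sing"},
]

print(dict(count_features(sentence)))
```

Features whose conditions span several tokens (for example passives with by, or that deletion) would need sentence-level patterns rather than the per-token tests shown here.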
Code | Label | Condition |
A01 | past verbs | VERB, Tense=Past |
A03 | present verbs | VERB, Tense=Pres |
B04 | place adverbials | ADV, lex in (aboard,above,abroad,across...) |
B05 | time adverbials | ADV, lex in (afterwards,again,earlier...) |
C06 | first person pronouns | PRON, lex in (I,we,me,us,my...) |
C07 | second person pronouns | PRON, lex in (you,your,yourself,yourselves) |
C08 | third person pronouns | PRON, lex in (she,he,they,her,him,them,his...) |
C09 | impersonal pronouns | Conditions from (Biber 1988) |
C10 | demonstrative pronouns | Conditions from (Biber 1988) |
C11 | indefinite pronouns | PRON, lex in (anybody,anyone,anything,everybody...) |
C12 | do as pro-verb | Conditions from (Biber 1988) |
D13 | wh-questions | Conditions from (Biber 1988) |
E14 | nominalizations | lex ends with ('tion', 'ment', 'ness', 'ism') |
E16 | nouns | Conditions from (Biber 1988) |
F18 | passives with by | Conditions from (Biber 1988) |
G19 | be as main verb | Conditions from (Biber 1988) |
H23 | wh-clauses | Conditions from (Biber 1988) |
H34 | sentence relatives | Conditions from (Biber 1988) |
H35 | causatives | CONJ, lex in (because) |
H36 | concessives | CONJ, lex in (although,though,tho) |
H37 | conditionals | CONJ, lex in (if, unless) |
H38 | other subordination | Conditions from (Biber 1988) |
I39 | prepositions | ADP |
I40 | attributive adjectives | Conditions from (Biber 1988) |
I41 | predicative adjectives | Conditions from (Biber 1988) |
I42 | adverbs | ADV |
J43 | type-token ratio | Using 400 words as in (Biber 1988) |
J44 | word length | Average length of orthographic words |
K45 | conjuncts | Conditions from (Biber 1988) |
K46 | downtoners | lex in (almost,barely,hardly,merely...) |
K47 | general hedges | lex in (maybe, at about, something like...) |
K48 | amplifiers | lex in (absolutely,altogether,completely,enormously...) |
K49 | general emphatics | Conditions from (Biber 1988) |
K50 | discourse particles | Conditions from (Biber 1988) |
K55 | public verbs | VERB, lex in (acknowledge,admit,agree...) |
K56 | private verbs | VERB, lex in (anticipate,assume,believe...) |
K57 | suasive verbs | VERB, lex in (agree,arrange,ask...) |
K58 | seem/appear | VERB, lex in (appear, seem) |
L52 | possibility modals | VERB, lex in (can,may,might,could) |
L53 | necessity modals | VERB, lex in (ought,should,must) |
L54 | prediction modals | VERB, lex in (shall,will,would), excluding future tense |
N59 | contractions | Conditions from (Biber 1988) |
N60 | that deletion | Conditions from (Biber 1988) |
P66 | synthetic negation | Conditions from (Biber 1988) |
P67 | analytic negation | Conditions from (Biber 1988) |
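The two text-level measures in the table, J43 (type-token ratio) and J44 (word length), can be sketched as follows. This is a minimal illustration under stated assumptions rather than the study's implementation: the ratio is assumed to be computed over the first 400 words of a text, orthographic words are approximated with a simple `\w+` regular expression, and the function names `orthographic_words`, `type_token_ratio` and `mean_word_length` are hypothetical.

```python
import re


def orthographic_words(text):
    """Split text into orthographic words (assumption: runs of word characters)."""
    return re.findall(r"\w+", text)


def type_token_ratio(words, window=400):
    """Distinct word forms divided by tokens, over the first `window` words."""
    sample = [w.lower() for w in words[:window]]
    if not sample:
        return 0.0
    return len(set(sample)) / len(sample)


def mean_word_length(words):
    """Average number of characters per orthographic word."""
    if not words:
        return 0.0
    return sum(len(w) for w in words) / len(words)


words = orthographic_words("The cat sat on the mat")
print(type_token_ratio(words))   # 5 distinct forms out of 6 tokens
print(mean_word_length(words))   # 17 characters over 6 words
```

Fixing the window at a constant number of words keeps the ratio comparable across texts of different lengths, since the type-token ratio otherwise tends to fall as a text gets longer.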