Towards a new linguistic model for detecting political lies


Amr M. El-Zawawy, Alexandria University

Abstract

The present study addresses the problem of how the two US presidential candidates Donald Trump and Hillary Clinton use statements judged to be false by the Politifact site while delivering their campaign speeches. Two corpora of Clinton's and Trump's alleged lies were compiled. Each corpus contained 16 statements judged to be false or ridiculously untrue (`pants on fire') by the Pulitzer Prize-winning site Politifact. Some statements were accompanied by the video recordings where they appeared; others had no associated video recordings because they were either tweets or their events had not been recorded on YouTube or elsewhere.

The present research made use of CBCA (Criteria-based Content Analysis) but as a stepping stone for building a new model of detecting lies in political discourse to suit the characteristics of campaign discourse. This furnished the qualitative dimension of the research. As for the quantitative dimension, data were analyzed using software, namely LIWC (Linguistic Inquiry & Word Count), and also focused on the content analysis of the deception cues that can be matched with the results obtained from computerized findings. When VSA (Voice Stress Analysis) was required, Praat was used. Statistical analyses were occasionally applied to reach highly accurate results. The study concluded that the New Model (NM) is not context-sensitive, being a quantitative one, and is thus numerically oriented in its decisions. Moreover, when qualitative analysis intervenes, especially in examining Politifact rulings, context plays a crucial role in passing judgements on deceptive vs. non-deceptive discourse.

Keywords: Clinton, Trump, LIWC, Politifact, Lie detection

Abstract (Russian version)

Amr M. El-Zawawy

Alexandria University, El-Guish Road, El-Shatby, 21526 Alexandria, Egypt

The present study examines how the campaign speech of the US presidential candidates Donald Trump and Hillary Clinton misleads voters. Two corpora of alleged false statements by Clinton and Trump were compiled, each containing 16 statements rated by Politifact (a Pulitzer Prize winner) as false or untrue. Some statements were accompanied by video recordings, while others were not, because they were either tweets or events that had not been posted on YouTube or elsewhere. The study used content analysis as a springboard for building a new model for detecting lies in political discourse that fits the characteristics of campaign discourse, which provided the qualitative dimension of the research. The quantitative data were analyzed with the LIWC (Linguistic Inquiry and Word Count) software, and the analysis also drew on the content analysis of deception cues that could be matched against the computerized results. Praat was used to analyze stress-related changes in the voice. Statistical analysis was also applied in some cases to achieve highly accurate results. The study concludes that the new model is not context-sensitive, being quantitative, and is thus numerically oriented in its decisions. At the same time, qualitative analysis, especially in examining Politifact rulings, shows that context plays a decisive role in judging discourse as deceptive or non-deceptive.

Keywords: Clinton, Trump, LIWC software, Politifact project, lie detection.

Introduction

Lying is usually defined as not telling the truth. However, what is more important than this simplistic definition is why lying has become significant in human communication. DePaulo et al. (1996) maintain that people lie in 31 percent of their social interactions. Their study thus points to the amount of lying committed, but how can this amount be studied linguistically in political campaign discourse?

Although political campaign discourse is part of the overarching political discourse, its language is unique in that it possesses a number of characteristics. One feature, according to Emerich et al. (2001), is the recurrence of imagery as a means of rendering the campaign discourse charismatic. Another feature is the use of a `consilience' strategy, where the candidate-audience understanding stems from the mediation and embrace of different languages, values and traditions in an attempt to encourage the listeners to remember the common principles shared by the candidate and voter (Frank and McPhail, 2005). A third feature, Fairclough (2006) maintains, is that campaign language is capable of weaving visions and imaginaries which can change realities, obfuscate realities and construe them ideologically. A fourth is the topics that dominate campaign speeches. As Donella (1988) contends, campaign speeches serve as emotional triggers, spanning a range of issues such as the environment, taxes and good governance that can guarantee good jobs, among others. A final feature is the focus on populism as a discursive strategy that juxtaposes the virtuous populace with a corrupt elite and views the former as the sole legitimate source of political power (cf. Bonikowski and Gidron, 2015). Thus, campaign discourse is basically emotional and is geared towards canvassing support from voters.

Given this picture of campaign discourse, it is legitimate to ask how presidential candidates can strike a balance between emotionalism and truth-telling. They are required to be as persuasive as possible while at the same time sounding truthful. This inherently impinges on their ability to remain consistent and reliable all the time.

The two US presidential candidates Hillary Clinton and Donald Trump are now running in the election as representatives of the Democratic Party and the Republican Party, respectively. The two nominees have delivered several speeches and written posts and tweets on social media in the course of their campaigning. These electioneering channels can be a rich source for examining whether they tell the truth or not.

A special site called Politifact (www.politifact.com) was set up years ago to gauge the veracity of American politicians' releases. The site contains thousands of excerpts from past and present US politicians, including updates on Clinton's and Trump's statements. As a Pulitzer Prize Winner, the site claims that it adopts a criterion-based analysis of any statement. Such an analysis attempts to answer the following set of questions:

• Is the statement based on a fact that is subject to verification?

• Is the statement leaving a particular impression that may be misleading?

• Is the statement significant (barring slips of the tongue)?

• Is the statement likely to be carried over and repeated by others?

• Would a typical person hear or read the statement and wonder: Is that true?

The result is a meter that has six pointers as follows:

TRUE -- The statement is accurate; nothing significant is missing.

MOSTLY TRUE -- The statement is accurate but needs clarification or additional information.

HALF TRUE -- The statement is partially accurate but leaves out important details or takes things out of context.

MOSTLY FALSE -- The statement contains an element of truth but ignores critical facts that would give a different impression.

FALSE -- The statement is not accurate.

PANTS ON FIRE -- The statement is not accurate and makes a ridiculous claim.

The website claims that it is sometimes necessary to consider factors such as context, timing and promise-keeping. At other times it resorts to acoustic analysis, as was done with a contentious statement by Clinton about raising taxes on the middle class that was flagged by Trump's supporters.

Another dimension of the present research is the use of LIWC (Linguistic Inquiry and Word Count), developed and continually updated by Pennebaker and others since 2000. LIWC is a tool that reads a given text and counts the percentage of words that reflect different emotions, thinking styles, social concerns, and even parts of speech. The part of this electronic tool most relevant to the present research is that it includes two dimensions that directly affect judgements on truth-telling, namely Authenticity and Emotional Tone. Authenticity refers to writing that is personal and honest. Emotional Tone is scored such that higher numbers are more positive and upbeat and lower numbers are more negative.
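The word-counting idea behind tools of this kind can be sketched as follows. This is only a minimal illustration, assuming a tiny hypothetical dictionary: the category names, word lists and sample sentence are invented here and are not LIWC's own dictionaries or scoring method.

```python
import re

# Hypothetical mini-dictionary for illustration only; it is NOT taken from LIWC.
CATEGORIES = {
    "negative_emotion": {"hate", "worthless", "enemy", "afraid"},
    "first_person_singular": {"i", "me", "my", "mine"},
}


def category_percentages(text):
    """Return, per category, the percentage of tokens that belong to it."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return {name: 0.0 for name in CATEGORIES}
    return {
        name: 100.0 * sum(tok in words for tok in tokens) / len(tokens)
        for name, words in CATEGORIES.items()
    }


# Invented sample sentence (12 tokens).
sample = "I hate to say it, but my opponent is afraid of me."
result = category_percentages(sample)
print(result)
```

Each percentage is relative to the total number of tokens, so a short statement with a single category word can score as high as a long statement with many.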

The present paper attempts to examine 16 statements for each candidate judged by Politifact as false (whether downright false or `pants on fire'). This study derives its significance from the fact that it provides a suitable vantage point for investigating the topic of lying in the context of political discourse, particularly the case of Clinton and Trump, as a major human interactive encounter. This is set within the context of contrasting two US candidates' speeches with the aid of a linguistic model of analysis, which will eventually lead to providing a better understanding of the nature of lying as a verbal immediacy activity in political campaign discourse.

Linguistic approaches to lie detection

Linguistic approaches to lie detection can be divided into three categories: communication approaches, disfluency-based approaches (usually acoustically oriented), and holistic approaches. The review below provides a bird's eye view of the three approaches in tandem.

Three studies can be subsumed under the communication category. The first is Zuckerman et al.'s (1981). As early as 1981, Zuckerman et al. focused on the meta-analysis of deception detection (traditionally known as the Four-Factor Theory), and stated that no cue or cues to deception could be accurate all the time because deception was an individual psychological process.

The second is Newman et al.'s (2003), where they investigated linguistic features that discern true from false stories. They applied a computerized analysis of five independent samples, achieving a classification of liars and truth-tellers at a rate of 67% when the topic was constant and a rate of 61% overall. When compared to truth-tellers, liars exhibited lower cognitive complexity, used fewer self-references and other-references, and showed a tendency towards more negative emotive words.

The third is Zhou et al.'s (2004a). They foregrounded the Interpersonal Deception Theory. The theory is based on the assumption that deceivers' number of words, verbal self-distancing tactics, and use of adjectives and adverbs increase during a conversation. Thus, while communicating, deceivers use feedback from recipients' messages to modify their deception strategy. According to this theory, cues to deception are divided into three categories: verbal, nonverbal, and physiological.

Some other studies later laid much emphasis on disfluencies in speech, particularly pauses, as a viable linguistic marker of false statements. Anolli and Ciceri (1997) found that a longer time lapse occurs between the question and a lie than the response latency that occurs with truthful statements. A major study in this direction is Benus et al. (2006), where they made use of a corpus of spontaneously recorded interviews to investigate the relationship between the distributional and prosodic features of silent and filled pauses and the interviewee's intention to deceive the interviewer. They concluded that the use of pauses correlated more with truthful than with deceptive speech. They also found that prosodic features extracted from filled pauses, as well as features describing contextual prosodic information in adjacent phonetic environments of the filled pauses, may facilitate the detection of lies in speech.

Demenko (2008) attempted to introduce voice stress extraction and classification into the investigation of deceptive speech. She made use of the authentic Poznan police database with the recordings of the 997 emergency phone line, and selected 20,000 recordings out of 60,000, of which around a hundred were acoustically analyzed. It was concluded that the range of fundamental frequency per se did not correlate with stress whereas the shift in fundamental frequency register constituted the primary indicator of stress. Through Linear Discriminant Analysis based on 12 acoustic features, it was shown that it is possible to reach the three categories of neutral, depressive, stressed, highly stressed speech.

Arciuli et al. (2009) followed suit and examined the frequency of use of the filler `um' during lying versus truth-telling statements in two laboratory-elicited lies about a murder case. They found that, within participants, lying statements contained fewer instances of `um' than truthful statements. These results pointed to the fact that `um' is a major filler in lying statements, and thus can be reliably used to differentiate between deceptive and non-deceptive statements in ordinary communication. Therefore, the filler `um' may not be accurately categorized as an instance of filled pauses, whose increase is proportionate with increased cognitive load. Rather, it may assume a lexical status similar to interjections, and so constitute an important part of authentic, natural communication.

Latency or gaps in discourse were also used in recent studies as another indicator of deceptive speech. In fact, there are several studies in that domain; however, the best-known is Reynolds and Rendle-Short's (2011). They adopted a rigorous methodological framework of conversation analysis (CA) as an analytic toolkit to demonstrate the importance of context, particularly interactional context, when researching cues to deception in order to understand whether there is a relationship between response latency and deception. They thus followed DePaulo et al. (2003), who emphasized the interactional context in detecting lies in speech. Reynolds and Rendle-Short examined data from outside laboratory settings taken from The Jeremy Kyle Show, adopting strict criteria for the data collection. Criteria were based on how participants in the outside-laboratory interactions formulate their verbal output. Lies were detected according to the following criteria: (1) agreement by the liar that a lie had occurred; (2) explicit labelling of talk as lies by other participants; and (3) the liar's `revision' of a prior action, thereby changing the course of action, in a `lie relevant' sequential context. They found that participants in the show could display a longer transition space to signal that a concessionary stance is close, or they could reduce the transition space to reduce the risk of an upcoming turn, which can be considered a concession.

Preferring an overall perspective, Kirchhubel and Howard (2011) explored the acoustic changes in speech during deceptive statements. Truthful, deceptive and control speech was collected from ten speakers during an interview. Results were displayed according to the parameters of fundamental frequency, intensity and vowel formants. They found that no significant correlation could be established for any of the acoustic features, a result that runs counter to many mainstream studies in the field.

The holistic approach, on the other hand, is adopted by Picornell (2012), where she examined deception in written witness statements. She employed marked sentence structures to code discourse markers in written narratives, and mapped the progression of lying as it unfolded through the course of the narrative based on the interaction of linguistic cues. She found out that what may be important is not the individual cues, but the way they are utilized.

The same approach is also adopted by Burgoon et al. (2012), where they focused on whether indicators of truth or deception are context-independent or context-sensitive. The factors they suggested are motivation and modality. They ran a 2 (veracity: truthful/deceptive) by 2 (incentives: high/low) by 3 (modality: FtF/audio/text) factorial experiment, which revealed that linguistic indicators are significantly related to veracity, but the results are highly sensitive to context.

In view of the previous review, there appears to be a gap in the studies that focus on content analysis (i.e. the linguistic features of a potential liar's output) and the prosodic features that verify spots in the speech that signal lying, i.e. latency responses, pauses, fillers, speech errors and the like in political discourse. Bringing the two dimensions together in one project that studies lies committed by politicians in English would eventually enrich the field, and help formulate a new theoretical framework open to application in a wider context. The present research project is an attempt at studying how lies can be detected in human interactions, especially political discourse in English.

Context of the problem

The present study addresses the problem of how the two US presidential candidates Donald Trump and Hillary Clinton use statements judged to be false by Politifact while delivering their campaign speeches. A normal search through Google would yield 6 pages that provide discussions on how both candidates lie to their audiences, each page having 10 hits. This suggests that the candidates' use of lies is a rampant phenomenon that merits further research. However, there are few studies that tackle presidential candidates' lies. Wortham and Locher (1999) suggested embedded metapragmatics to investigate politicians' lies by examining television network news coverage of the 1992 and 1996 US presidential campaigns. Their article describes an approach to the social functions of language, which draws heavily on Bakhtin, and gives a more formal account of embedded metapragmatic constructions.

Another extended study is David Corn's (2004) book entitled The Lies of George Bush. Although the book is an amalgam of Bush's lies about health programs, Iraq and tax policies, it does not offer a linguistic approach that can be put to use in further analysis. Moreover, the tone of the book is polemical, and it sometimes sounds like a personal war. A third study, by Kangas (2014), focused on computerized analysis of politicians' discourse, and touched on honesty as composed of the z-scores of exclusive words, references to self, references to others, motion words and negative emotion words. The paper did not allot ample space to deceptive discourse, having a major focus on how software could analyze political discourse.

Therefore, it is important to draw attention to the impact of lies on the US candidates' image, to how the amount of lying and/or truthfulness can be linguistically analyzed, and to how various linguistic tools can contribute to detecting these lies in the candidates' speeches and, sometimes, tweets.

Methods and data. Corpus

Two corpora of Clinton's and Trump's alleged lies were compiled. Each corpus contained 16 statements judged to be false or ridiculously untrue (`pants on fire') by the Pulitzer Prize-winning site Politifact. Some statements were accompanied by the video recordings where they appeared; others had no associated video recordings because they were either tweets or their events had not been recorded on YouTube or elsewhere. All in all, the two corpora comprise 1536 words (639 for Clinton's statements and 897 for Trump's statements) and their 16 videos are 7.02 minutes in total length (3.02 minutes for Clinton and 4 minutes for Trump).

A note on the method of analysis. Model of analysis

One major approach to lie detection is CBCA (Criteria-based Content Analysis), one of the major elements of Statement Validity Assessment (SVA), a technique developed to determine the credibility of child witnesses' testimonies in trials for sexual offenses and recently applied to assessing testimonies given by adults (cf. Raskin and Esplin 1991). The present research makes use of CBCA but as a stepping stone for building a new model of detecting lies in political discourse to suit the characteristics of campaign discourse. This will furnish the qualitative dimension of the research. As for the quantitative dimension, it will analyze data using software, namely LIWC, and will also focus on the content analysis of the deception cues that can be matched with the results obtained from computerized findings. When VSA (Voice Stress Analysis) is required, Praat will be used. Statistical analyses will also be occasionally applied to reach highly accurate results.

Based on an extensive reading of the literature on the linguistic markers of deceptive speech, the holistic approach was favored for a number of reasons. First, the present model can be considered the first to subject political campaign speeches and/or posts and tweets to lie detection analyses. It is difficult to zoom in on one aspect, such as acoustics, at the expense of other ones. Second, the model adopted here is just a starting point that can be broadened to include other modifications, and it is therefore far from being perfect. It just highlights how campaign discourse may deviate from the norms of truthful speech. Third, the present model is adapted from Burgoon et al.'s (2012) version, which is summarized in the following table.

Table 1. Linguistic classes and indicators

| Linguistic category | Definition | Operationalization of indicators |
|---|---|---|
| Quantity | Refers to the length of an utterance, expressed at the lowest level in terms of morphemes and at the highest levels in terms of entire utterances or turns at talk | 1. Syllables (morphemes and affixes); 2. Verbs (words that characteristically are the grammatical center of a predicate and express an act, occurrence, or mode of being) |
| Complexity | The degree to which a lexical item has few or many syllables (lexical complexity) or a sentence has few or many phrases and clauses (syntactic complexity) | 1. Big words (# of words with 6 or more characters); 2. Readability (indices, e.g., Flesch-Kincaid or SMOG index, that measure reading grade level or difficulty of comprehending a segment of text) |
| Diversity | Degree to which a segment of text uses many unique words and phrases relative to the total number of words or phrases in it | 1. Lexical diversity (total # of different words divided by total # of words, i.e., percentage of unique words in all words) |
| Specificity | Degree to which a segment of text is concrete and specific or abstract | 1. Sensory details (sensory experiences such as sounds, smells, physical sensations and visual details); 2. Expressivity (a measure of vividness, quantified as the relationship of # of adjectives + # of adverbs, divided by # of nouns + # of verbs) |
| Uncertainty | Degree to which words or constructions introduce ambiguity in meaning | 1. Modal verbs (auxiliary verbs like would, should, could that are characteristically used with a verb of predication) |
| Verbal Nonimmediacy | Language that expresses and creates psychological distance | 1. Passive voice (form of a verb used when the subject is being acted upon rather than doing something) |
| Personalization | Pronoun use that increases the specificity of reference to self and others | 1. Self-reference (first-person singular pronouns: I, me, my); 2. Second person reference (you-references) |
| Affect | Words and expressions that convey the subjective aspect of an emotion apart from bodily changes | 1. Affect ratio (number of affect-laden words from a dictionary of affect terms relative to total number of words); 2. Pleasantness (positive or negative feelings associated with a term, based on pre-scaled dictionary of terms) |
| Activation | Degree of dynamism expressed by emotional terms, based on pre-scaled dictionary of terms | |
| Informality | Degree of adherence to formal, standard language forms | 1. Typographical errors (# of errors in written text) |
| Cognitive Processes | Terms describing the respondent's thinking process (e.g., “thought”, “surmised”) | |
| Cognitive Difficulty | Degree of nonfluencies in a segment of text | 1. Filled pauses (um, er, ah, you know, and similar nonlexical expressions that do not disrupt the flow of speech and substitute for a silent pause) |
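Some of the count-based operationalizations in Table 1 can be sketched in a few lines. This is only an illustrative sketch: the function names, the simple regular-expression tokenizer and the sample sentence are assumptions made here, not part of Burgoon et al.'s procedure.

```python
import re

# First-person singular forms listed under Personalization in Table 1.
SELF_REFERENCES = {"i", "me", "my"}


def tokenize(text):
    """Lower-case the text and split it into word tokens (simplistic tokenizer)."""
    return re.findall(r"[a-z']+", text.lower())


def big_word_share(tokens, min_chars=6):
    """Percentage of tokens with 6 or more characters ('big words')."""
    return 100.0 * sum(len(t) >= min_chars for t in tokens) / len(tokens)


def lexical_diversity(tokens):
    """Number of different words divided by total words, as a percentage."""
    return 100.0 * len(set(tokens)) / len(tokens)


def self_reference_share(tokens):
    """Percentage of tokens that are first-person singular pronouns."""
    return 100.0 * sum(t in SELF_REFERENCES for t in tokens) / len(tokens)


# Invented sample sentence (10 tokens), not taken from either corpus.
tokens = tokenize("I know my record and I stand by my record")
print(big_word_share(tokens), lexical_diversity(tokens), self_reference_share(tokens))
```

Indicators such as passive voice, modal verbs or expressivity would additionally need part-of-speech information, which this sketch deliberately leaves out.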

The above table seems at first sight to be comprehensive, yet it contains a number of redundancies that can be conflated. For example, informality is not a viable marker of deception and can be excluded. The same is true for readability, which is measured for written texts only and can be difficult to apply to speeches. An alternative benchmark, as suggested by Burgoon and Qin (2006), is the average sentence length (see note 3). Moreover, the idea of relating cognitive difficulty to filled pauses runs counter to the view held by Arciuli et al. (2009), where false statements usually contain fewer `um' instances than truthful statements. Finally, theirs being a predictive study, Burgoon and her colleagues omitted two important aspects: (a) the minimum amount (or percentage) of each feature that should be present for a statement to be false and (b) a rating scale that could locate the degree of veracity. The same problem is also detected in LIWC, where the scale from 0 to 100 cannot be reliable in cases where half of the statement is true and the rest is false. The present model thus adopted Vrij and Winkel's (1991), Connell's (2012) and Picornell's (2012) results, which can be summarized in the following points:

1. Deceivers use fewer first-person pronouns than truth tellers.

2. Deceivers use more words and more exact language (psychological distancing) than truth tellers.

3. Deceivers' language is simpler (shorter clauses) than that of truth tellers.

4. Deceivers are more uncertain (passive voice usage).

5. Deceivers exhibit a higher cognitive load (through simpler structures and cognitive verbs).

6. Deceivers exhibit more tension through higher pitch.

Therefore, for the purposes of the present research, the following table summarizes the new model with the scale included:

Table 2. A modified version of Burgoon et al's (2012) model (the New Model)

| Indicator/Marker | Truthful | Half-Truthful | False | Ridiculously False |
|---|---|---|---|---|
| 1. Complexity: the degree to which a lexical item has few or many syllables (lexical complexity) or a sentence has few or many phrases and clauses (syntactic complexity) | | | | |
| a. Big words (more than 6 characters or three syllables, excluding proper names) | 100–89% | 90–59% | 60–10% | 9–0% |
| b. Average sentence length (relative to longest sentence in the same piece of discourse) | 100–89% | 90–59% | 60–10% | 9–0% |
| 2. Specificity: degree to which a segment of text is concrete and specific or abstract | | | | |
| a. Sensory details (sensory experiences such as sounds, smells, physical sensations and visual details) | 100–89% | 90–59% | 60–10% | 9–0% |
| b. Lexical density (a measure of vividness, quantified as the relationship of # of adjectives + # of adverbs divided by # of nouns + # of verbs) | 0–10% | 11–60% | 61–90% | 91–<100% |
| 3. Uncertainty: degree to which words or constructions introduce ambiguity in meaning | | | | |
| a. Modal verbs | 0–10% | 11–60% | 61–90% | 91–<100% |
| b. Qualifiers like `somewhat', `maybe', etc. | 0–10% | 11–60% | 61–90% | 91–<100% |
| 4. Verbal Non-immediacy: terms or constructions that express and create psychological distance | | | | |
| a. Passive voice | 0–10% | 11–60% | 61–90% | 91–<100% |
| 5. Personalization: pronoun use that increases the specificity of reference to self and others | | | | |
| a. Self-reference | 100–89% | 90–59% | 60–10% | 9–0% |
| b. Second and third person references | 0–10% | 11–60% | 61–90% | 91–<100% |
| 6. Emotiveness: words or terms that convey emotions | | | | |
| a. Affect ratio (number of affect-laden words from a dictionary of affect terms relative to total number of words) | 0–10% | 11–60% | 61–90% | 91–<100% |
| 7. Cognitive process terms: terms describing the respondent's thinking process (e.g., “thought,” “surmised”) | 0–10% | 11–60% | 61–90% | 91–<100% |
| 8. VSA (voice stress analysis): acoustic features that signal tension on the part of the deceiver | | | | |
| a. Higher pitch (means are calculated; a pitch amounts to zero if below 65 Hz for males and if below 100 Hz for females*) | 0–10% | 11–60% | 61–90% | 91–<100% |
| b. Fillers, especially `um' | 0–10% | 11–60% | 61–90% | 91–<100% |
| Total = degree of veracity | Truthful: 100% | Half-truthful: 99–50% | False: 49–5% | Ridiculously false: 4–0% |

Note 3. It is unclear why Burgoon et al. (2012, p. 324) mentioned a similar criterion in their definition of complexity, maintaining that it refers to `a sentence [which] has few or many phrases and clauses (syntactic complexity)', and then subsumed readability under it. It is well documented that Flesch-Kincaid readability tests are used with children and adults, whereas SMOG is used particularly for checking health messages.

It is clear from the above table that eight indicators are adopted in the present model. They have been adapted from Burgoon et al.'s (2012) version. Some indicators follow a reverse order of intensity on the scale from truthful to ridiculously false, since deceivers may have fewer self-references than truth-tellers, yet they may have more cognitive verbs such as `think', `believe', `guess', etc. In any event, the new model is a so-called `test-bed' for manually checking veracity in political campaign discourse, and will be compared with LIWC and Politifact judgements.

It is noteworthy that the degree of veracity is calculated through summing up the percentages obtained in all the indicators. Then the total is divided by the 11 indicators and sub-indicators. In the case where there is no video available to measure pitch, the pitch indicator is excluded and the degree is calculated relative to 10 indicators only.
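The calculation just described can be sketched as follows. This is a rough illustration under stated assumptions: the indicator keys and example percentages are hypothetical, the mean is taken over however many scores are supplied (the paper divides by 11, or by 10 when pitch is unavailable), and each band of the Total row in Table 2 is treated as "score at or above its lower bound", which is an interpretation rather than something the table states.

```python
def degree_of_veracity(scores, has_video=True):
    """Mean of the indicator percentages; pitch is dropped when no video exists."""
    used = {name: pct for name, pct in scores.items()
            if has_video or name != "pitch"}
    return sum(used.values()) / len(used)


def veracity_label(degree):
    """Map a degree of veracity onto the Total row of Table 2 (assumed cut-offs)."""
    if degree >= 100:
        return "Truthful"
    if degree >= 50:
        return "Half-truthful"
    if degree >= 5:
        return "False"
    return "Ridiculously false"


# Hypothetical indicator percentages, for illustration only.
example = {"big_words": 20, "sentence_length": 40, "self_reference": 10, "pitch": 30}

with_video = degree_of_veracity(example)                      # (20+40+10+30) / 4
without_video = degree_of_veracity(example, has_video=False)  # (20+40+10) / 3
print(with_video, veracity_label(with_video))
print(round(without_video, 2), veracity_label(without_video))
```

Because the result is a plain average, a single extreme indicator has limited influence on the final label, which is consistent with the model being numerically oriented in its decisions.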

Data analysis

The analysis of the data follows a three-way measure:

1. New Model-LIWC Agreement/Discrepancy

2. New Model-Politifact Agreement/Discrepancy

3. LIWC-Politifact Agreement/Discrepancy

Under each of the first two sections, the eight indicators will be examined.

New Model-LIWC Agreement/Discrepancy

The New Model (henceforth NM) differs greatly from the LIWC tool. The following table summarizes the results obtained in both NM and LIWC for Clinton's statements.

Table 3. NM and LIWC results for Clinton's statements*

| Statement | NM | LIWC |
|---|---|---|
| Benghazi | 22.22 | 35.4 |
| FBI | 25.76 | 99.9 |
| GOP | 17.5 | 37.2 |
| Mortgage | 12.97 | 2.1 |
| Gun factory | 14.18 | 1.0 |
| Healthcare | 14.93 | 67.3 |
| ISIS | 12.63 | 50.4 |
| Legislation | 14.22 | 78.9 |
| Hampshire | 17.64 | 20.2 |
| Oil | 21.67 | 98.0 |
| Sanders | 15.11 | 96.0 |
| Scott | 17.47 | 32.4 |
| Not a thing in America | 20.06 | 1.0 |
| Education | 25.90 | 2.4 |
| Clean Power | 22.31 | 1.0 |
| Emails | 15.92 | 43.4 |

* Statements are named after their central themes. For verbatim transcripts of Clinton's statements selected, visit Politifact's website: http://www.politifact.com/personalities/hillary-clinton/statements/byruling/false.

It is clear from the above table that 3 statements are judged by LIWC to be half-truthful, i.e. around 98 and 99%, while they are labeled false by NM. This discrepancy is not found only in the direction of truthfulness, so to speak; it also figures clearly in the direction of ridiculously false statements. Thus, 4 statements are judged as ridiculously false by LIWC while they are labeled only as false by NM. The problem is one of degree. If the rating scale proposed by NM is applied, then the above discrepancies are obviously problematic, since a statement cannot be true and false at the same time. The scale proposed in NM is illustrated below:

Fig. 1: An envisaged continuum of the NM veracity scale

This leads to considering 18.75% of LIWC results as completely inaccurate and 25% as partially inaccurate. In the first case, the discrepancy points to false statements judged as truthful, while in the second case, a statement is false but is labeled as ridiculously false. However, from the point of view of LIWC, a statement is false if it does not attain 100% on its scale. In view of this, the above discrepancy vanishes, but the question of degree is not fully tackled. In other words, a statement which attains 99.9% on the LIWC scale cannot be true although it has only a fraction left to be true. This interpretation makes the 99.9% statements equal to the 1.0% statements, which is a baffling decision. The same is true for statements which are considered half-truthful from the point of view of NM: they range from 65 to 79%, and are false according to LIWC, though their veracity exceeds their falsehood.

As for the rest of the statements, which are judged by both NM and LIWC to be false, they suffer from the same obstacle of degree. A statement, for example, can score 17.5 on the NM scale but 37.2 on LIWC. The net result is that both are false, and so they are treated as being on a par with each other on the 'falsity scale', so to speak, despite the difference in their scores.

A similar situation is found in analyzing Trump's statements. The following table summarizes the NM and LIWC results for Trump's statements:

Table 4. NM and LIWC results for Trump's statements*

Statement	NM	LIWC
Clinton campaign	11.99	63.5
Coal	16.23	86.4
Cruz	13.96	2.8
Economy	10.67	17.0
FBI	23.85	1.0
Freddie	15.05	3.0
Iran	17.56	33.6
Iraq	14.85	96.2
ISIS	11.53	1.0
ISIS foundation	19.63	1.0
Marshal	21.39	41.4
Money laundering	10.75	1.0
Muslims	17.0	36.4
Obamacare	20.15	7.2
Ohio	15.31	8.3
Second amendment	23.51	1.0

* Statements are named after their central themes. For verbatim transcripts of Trump's statements selected, visit Politifact's website: http://www.politifact.com/personalities/donald-trump/statements

It is clear from the above table that 3 statements are judged by LIWC to be half-truthful, i.e. around 63 and 99%, while they are labeled false by NM. This discrepancy is not found only in the direction of truthfulness, so to speak; it also figures clearly in the direction of ridiculously false statements. Thus, 6 statements are judged as ridiculously false by LIWC while they are labeled only as false by NM. The problem is again one of degree. The conclusion is similar to the one reached when discussing Clinton's statements: 18.75% of LIWC's results are completely inaccurate and 37.5% are partially inaccurate. In the first case, the discrepancy points to false statements judged as truthful, while in the second case, a statement is false but is labeled as ridiculously false. However, from the point of view of LIWC, a statement is false if it does not attain 100% on its scale.

Statistics can come to the aid of the analysis at this point. The ANOVA analysis yields the following two tables:

ANOVA results for NM (Clinton and Trump)

It is clear that p is not significant in either case: the NM for Clinton's and Trump's statements, and LIWC for both candidates. Statistically, this means that the NM and LIWC are equal in their judgements when broadly compared according to ANOVA results. However, if this mode of analysis is the only one adopted, the details are not fully addressed. Table 3 above shows that only one statement appears to receive similar judgements by NM and LIWC, namely the Hampshire one: it scores 17.64 and 20.2 on NM and LIWC, respectively. The difference of 2.56 points can be considered insignificant, and this is the only point of agreement between NM and LIWC.
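Since the ANOVA tables themselves are not reproduced here, the following is only a hedged sketch of how a one-way ANOVA F statistic could be computed for the NM scores in Tables 3 and 4. It assumes that the comparison is between the two candidates' NM scores; the helper name `one_way_anova_f` is hypothetical, and no result from the original analysis is implied. Obtaining a p-value additionally requires the F distribution, for which a statistics library such as SciPy (`scipy.stats.f_oneway`) could be used.

```python
from statistics import mean

# NM scores copied from Table 3 (Clinton) and Table 4 (Trump).
NM_CLINTON = [22.22, 25.76, 17.5, 12.97, 14.18, 14.93, 12.63, 14.22,
              17.64, 21.67, 15.11, 17.47, 20.06, 25.90, 22.31, 15.92]
NM_TRUMP = [11.99, 16.23, 13.96, 10.67, 23.85, 15.05, 17.56, 14.85,
            11.53, 19.63, 21.39, 10.75, 17.0, 20.15, 15.31, 23.51]


def one_way_anova_f(*groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of each score around its own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


f_stat, df_b, df_w = one_way_anova_f(NM_CLINTON, NM_TRUMP)
print(f"F({df_b}, {df_w}) = {f_stat:.3f}")
```

The same function can be applied to the two LIWC columns to mirror the second comparison mentioned in the text.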

New Model-Politifact Agreement/Discrepancy

In this section, quantitative analysis is not possible, since Politifact does not provide numerical figures that can be set side by side with the NM results. The alternative, by nature, is qualitative analysis. The following table summarizes the qualitative results of both NM and Politifact for Clinton's statements:

Table 7. NM and Politifact results for Clinton's statements

Statement	NM	Politifact
Benghazi	False	False
FBI	False	False
GOP	False	False
Mortgage	False	False
Gun factory	False	False
Healthcare	False	False
ISIS	False	False
Legislation	False	False
Hampshire	False	False
Oil	False	False
Sanders	False	False
Scott	False	False
Not a thing in America	False	False
Education	False	False
Clean Power	False	False
Emails	False	False

It is clear that the results of both NM and Politifact are identical. The discrepancies detected in LIWC are not there. The sole comment that can be made is related to the indicators of Lexical Density and VSA in NM. In 81.25% of the statements examined, Lexical Density scores point to the falsity of the statements in question, but the remaining 18.75% point to ridiculously false statements according to NM. Consider, for example, the following statement by Clinton:

"I think this is a major challenge and I want us to address it. Not one word from the other side. And you take somebody like Governor Walker of Wisconsin, who seems to be delighting in slashing the investment in higher education in his state. And most surprisingly to me, rejecting legislation that would have made it tax deductible for you, on your income tax, to deduct the amount of your loan payments. I don't know why he wants to raise taxes on students. But that's the result when you don't look for ways to help people who are not sitting around asking for something, who are actually working hard every day to get ahead."

This long statement has a Lexical Density score of 93.3%, being full of verbs and nouns. The problem is that the higher the lexical density, the higher the falsity score a statement attains (where details are provided to cover up any misinformation). According to NM, this statement is ridiculously false, while Politifact judges it false due to its context. Politifact maintains that it is true that Governor Scott Walker did not publicly support the Democratic-sponsored measures that would have provided the tax deduction, but he had never rejected such legislation, either. This inherently means that Clinton passed her judgement without a sufficient amount of information. In a sense, the details of the indicators would at times point to judgements that are different from the overall decision of whether a statement is false or not, and this is where context plays its role.

As for the VSA scores, the NM provides a mean of 44.87%, which indicates that Clinton's statement is half-truthful. The upper bound for female voice pitch is 525 Hz, while the lower bound is 100 Hz. A Praat spectrogram has been created for a section of this statement as follows:

Fig. 2: A spectrogram for the first part of Clinton's example statement

In this illustration, the blue streaks refer to pitch contours: they range from 239 Hz to 148 Hz. This means that Clinton is not stressed; she speaks normally. Yet, in another analysis later in the same segment, she starts to lose control and shout:

Fig. 3: A spectrogram for the second part of Clinton's example statement

The pitch contours change from 239 Hz to 394.1 Hz, which indicates emotional speech, and thus the deceptive part starts at the extract "who seems to be delighting in slashing the investment in higher education in his state". This is exactly what Politifact states about the context of Clinton's judgement: Governor Scott Walker remained silent about the tax decision; he was neither delighted nor opposed. This also tallies with Demenko's (2008) study of pitch contours in stressed males and females, which reveals that the average frequency for extremely stressed females is 366 Hz. Stress is a major indicator of deception (cf. Ekman, 1991).

As for Trump's statements, the following table summarizes the qualitative results of both NM and Politifact:

Table 8. NM and Politifact results for Trump's statements

Statement	New Model	Politifact
Clinton campaign	False	Ridiculously false
Coal	False	False
Cruz	False	Ridiculously false
Economy	False	Mostly false
FBI	False	False
Freddie	False	False
Iran	False	False
Iraq	False	False
ISIS	False	Ridiculously false
ISIS foundation	False	False
Marshal	False	Ridiculously false
Money laundering	False	False
Muslims	False	False
Obamacare	False	False
Ohio	False	False
Second amendment	False	False

There are five discrepancies, which means that 31.25% of NM decisions are not accurate. In 62.50% of the statements examined, Lexical Density scores point to the falsity of the statements in question, but the remaining 37.50% point to ridiculously false statements according to NM. Again, context has to be taken into account.
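The discrepancy rate quoted above can be re-derived directly from Table 8. The snippet below is an illustrative sketch only; the names `TABLE_8` and `discrepancies` are hypothetical, and a discrepancy is assumed to mean any statement whose NM verdict differs from the Politifact ruling.

```python
# Verdicts copied from Table 8 (Trump's statements): (NM, Politifact).
TABLE_8 = {
    "Clinton campaign": ("False", "Ridiculously false"),
    "Coal": ("False", "False"),
    "Cruz": ("False", "Ridiculously false"),
    "Economy": ("False", "Mostly false"),
    "FBI": ("False", "False"),
    "Freddie": ("False", "False"),
    "Iran": ("False", "False"),
    "Iraq": ("False", "False"),
    "ISIS": ("False", "Ridiculously false"),
    "ISIS foundation": ("False", "False"),
    "Marshal": ("False", "Ridiculously false"),
    "Money laundering": ("False", "False"),
    "Muslims": ("False", "False"),
    "Obamacare": ("False", "False"),
    "Ohio": ("False", "False"),
    "Second amendment": ("False", "False"),
}


def discrepancies(verdicts):
    """Names of statements whose two verdicts are not identical."""
    return [name for name, (nm, pf) in verdicts.items() if nm != pf]


mismatched = discrepancies(TABLE_8)
rate = 100 * len(mismatched) / len(TABLE_8)
print(mismatched)
print(f"{rate:.2f}%")  # 31.25%
```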

Recourse to VSA might clarify the moot point. One case in point is the statement accusing fire marshals in Colorado and Ohio of incompetence. The following spectrogram illustrates the variations in pitch:

Fig. 4: A spectrogram for Trump's statement about marshals

Trump's pitch oscillates between 279 Hz and 312 Hz, especially when he speaks about fire marshals. This tallies with Demenko's (2008) study of pitch contours in stressed males and females, which reveals that the average frequency for extremely stressed males is 238 Hz. Stress is a major indicator of deception (cf. Ekman, 1991).

As a concluding remark for this section, it is important to juxtapose context and VSA in order to achieve a sound judgement in deceptive speech analysis. Relying on Lexical Density and/or context alone would lead to erroneous decisions.

LIWC-Politifact Agreement/Discrepancy

Here again, quantitative analysis is not possible. The following table summarizes the LIWC and Politifact results for Clinton's statements:

Table 9. LIWC and Politifact results for Clinton's statements

Statement	LIWC according to NM scale	LIWC	Politifact
Benghazi	False	False	False
FBI	Half-truthful	False	False
GOP	False	False	False
Mortgage	Ridiculously false	False	False
Gun factory	Ridiculously false	False	False
Healthcare	Half-truthful	False	False
ISIS	Half-truthful	False	False
Legislation	Half-truthful	False	False
Hampshire	False	False	False
Oil	Half-truthful	False	False
Sanders	Half-truthful	False	False
Scott	False	False	False
Not a thing in America	Ridiculously false	False	False
Education	Ridiculously false	False	False
Clean Power	Ridiculously false	False	False
Emails	False	False	False

The two columns provided for the LIWC decisions are meant to show that, according to the scale proposed under section 4.1, discrepancy is easily detected, but according to the 'loose' criteria of LIWC (where the two extremes 0 and 100 are at work), the discrepancy is absent. The first column is a glaring example of discrepancy. The LIWC results point to six statements that are half-truthful, which means more than 50% of each statement is true. Since LIWC does not provide detailed results for its 'authenticity' indicator, it is clear that there is a major problem with the program. Even false statements are considered in five cases out of sixteen as ridiculously false. It can be said that LIWC vacillates between the two extremes of truthful and ridiculously false without an intermediate level. The reason for this is two-fold. First, LIWC, like the present NM, is not context-sensitive. Second, according to the developers of LIWC, Pennebaker et al. (2015), the program has mean standard deviations (SD) of 0.70 and 0.32% for certainty and anxiety, respectively. The two dimensions are closely related in the study of deceptive discourse, and the above statements might have fallen within this level of SD.

A similar situation is found in Trump's statements. The following table summarizes the LIWC and Politifact results for Trump's statements:
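The first column of Table 9 can be reproduced by placing the LIWC scores from Table 3 on the NM veracity scale. The sketch below assumes cut points of 100, 50 and 5, read off the 'Total = degree of veracity' row of the indicator table; the names `LIWC_CLINTON` and `nm_band` are hypothetical and the code is not part of either tool.

```python
from collections import Counter

# LIWC scores copied from Table 3 (Clinton's statements).
LIWC_CLINTON = {
    "Benghazi": 35.4, "FBI": 99.9, "GOP": 37.2, "Mortgage": 2.1,
    "Gun factory": 1.0, "Healthcare": 67.3, "ISIS": 50.4,
    "Legislation": 78.9, "Hampshire": 20.2, "Oil": 98.0,
    "Sanders": 96.0, "Scott": 32.4, "Not a thing in America": 1.0,
    "Education": 2.4, "Clean Power": 1.0, "Emails": 43.4,
}


def nm_band(score: float) -> str:
    """Place a 0-100 score on the NM veracity scale (assumed cut points)."""
    if score >= 100:
        return "Truthful"
    if score >= 50:
        return "Half-truthful"
    if score >= 5:
        return "False"
    return "Ridiculously false"


bands = {name: nm_band(score) for name, score in LIWC_CLINTON.items()}
counts = Counter(bands.values())
print(counts["Half-truthful"], counts["False"], counts["Ridiculously false"])
# prints: 6 5 5
```

The counts match the six half-truthful and five ridiculously false verdicts discussed above, with the remaining five statements falling in the false band.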

Table 10. LIWC and Politifact results for Trump's statements

Statement	LIWC according to NM scale	LIWC	Politifact
Clinton campaign	Half-truthful	False	Ridiculously false
Coal	Half-truthful	False	False
Cruz	Ridiculously false	False	Ridiculously false
Economy	False	False	Mostly false
FBI	Ridiculously false	False	False
Freddie	Ridiculously false	False	False
Iran	False	False	False
Iraq	Half-truthful	False	False
ISIS	Ridiculously false	False	Ridiculously false
ISIS foundation	Ridiculously false	False	False
Marshal	False	False	Ridiculously false

M...


Indicator/Marker	Truthful	Half-Truthful	False	Ridiculously False
1. Complexity:	The degree to which a lexical item has few or many syllables (lexical complexity) or a sentence has few or many phrases and clauses (syntactic complexity)
a. Big words (more than 6 characters or three syllables, excluding proper names)	100--89%	90--59%	60--10%	9--0%
b. Average sentence length (relative to longest sentence in the same piece of discourse)	100--89%	90--59%	60--10%	9--0%
2. Specificity:	Degree to which a segment of text is concrete and specific or abstract
a. Sensory details (sensory experiences such as sounds, smells, physical sensations and visual details)	100--89%	90--59%	60--10%	9--0%
b. Lexical density (a measure of vividness, quantified as the relationship of # of adjectives + # of adverbs divided by # of nouns+ # of verbs)	0--10%	11--60%	61--90%	91--<100%
3. Uncertainty:	Degree to which words or constructions introduce ambiguity in meaning
a. Modal verbs	0--10%	11--60%	61--90%	91--<100%
b. Qualifiers like 'somewhat', 'maybe', etc.	0--10%	11--60%	61--90%	91--<100%
4. Verbal Non-immediacy:	Terms or constructions that express and create psychological distance
a. Passive voice	0--10%	11--60%	61--90%	91--<100%
5. Personalization:	Pronoun use that increases the specificity of reference to self and others
a. Self-reference	100--89%	90--59%	60--10%	9--0%
b. Second and third person references	0--10%	11--60%	61--90%	91--<100%
6. Emotiveness:	Words or terms that convey emotions
a. Affect ratio (number of affect-laden words from a dictionary of affect terms relative to total number of words)	0--10%	11--60%	61--90%	91--<100%
7. Cognitive process terms: Terms describing the respondent's thinking process (e.g., "thought," "surmised")	0--10%	11--60%	61--90%	91--<100%
8. VSA (voice stress analysis):	Acoustic features that signal tension on the part of the deceiver
a. Higher pitch (means are calculated; a pitch amounts to zero if below 65 Hz for males and if below 100 Hz for females*)	0--10%	11--60%	61--90%	91--<100%
b. Fillers, especially `um'	0--10%	11--60%	61--90%	91--<100%
Total = degree of veracity	Truthful 100%
	Half-truthful 99--50%
	False 49--5%
	Ridiculously false 4--0%
