Shreyas edited this page Apr 4, 2013 · 7 revisions

Textual Patterns :: Keywords and Corpus Analysis in Language Education

By: Mike Scott & Christopher Tribble


My Notes

Ch4 : Keywords of Individual Texts

Pg : 55

Objectives in the Chapter:

  • method of identifying KeyWords (KW) in text
    • identify items of unusual frequency in comparison with a reference corpus
  • 2 kinds of outputs in a key words list:
    • aboutness indicators
    • stylistic indicators
  • demonstrate how a plot of keywords can illustrate themes and progressions of text.

Keyness

Keyness is a quality that words may have in a given text or set of texts: key words are important and reflect what the text is really about, leaving aside trivia and insignificant detail.

There are 3 methods for establishing the keyness of a term:

  1. Kintsch & van Dijk (1978)
    • propositional analysis of the text identifies its structure, establishing a hierarchy which we might label a hierarchy of keyness
  2. Hoey (2001)
    • analyze sentences
  3. Author's process

Method 1 (Kintsch & van Dijk):

Consider this sample text:

EastEnders star Steve McFadden was 'stable' in St. Thomas's Hospital, London, 
last night after being stabbed in the back, arm and hand under Waterloo Bridge, 
Central London, on Friday. 

Propositional analysis will give us:

1. S. McF is a star
2. S. McF is in EastEnders
3. S. McF was stable
4. someone said that
5. S. McF is in hospital
6. The hospital is called St. Thomas's
7. The hospital is in London
8. was so last night

Then work out which propositions are referred to most across the entire set (the macropropositions). The propositions then fall naturally into a hierarchy of prominence, reflecting their keyness. The macropropositions, more than the others, are what the text is really 'about'.
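The ranking step can be sketched in a few lines of Python. This is only an illustration: the "refers to" links between the propositions below are an assumed reading made up for the sketch, not the analysis given in the book, and `rank_by_references` is a hypothetical helper name.

```python
from collections import Counter

# id -> (proposition text, ids of the propositions it refers back to).
# The reference links are illustrative assumptions, not the book's analysis.
propositions = {
    1: ("S. McF is a star", []),
    2: ("S. McF is in EastEnders", []),
    3: ("S. McF was stable", []),
    4: ("someone said that", [3]),
    5: ("S. McF is in hospital", []),
    6: ("The hospital is called St. Thomas's", [5]),
    7: ("The hospital is in London", [5]),
    8: ("was so last night", [3]),
}


def rank_by_references(props):
    """Order proposition ids by how often other propositions refer to them."""
    counts = Counter({pid: 0 for pid in props})
    for _text, refs in props.values():
        counts.update(refs)
    # Most-referenced first; ties broken by proposition id.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


ranking = rank_by_references(propositions)
```

Under these assumed links, the propositions at the top of `ranking` would be the candidates for macropropositions.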

Method 2 (Hoey):
Step 1

This method works with sentences rather than propositions. It seeks out the elements that are most linked. A link is a repetition of some kind, such as:

  • Grammatical variants ( want - wanted )
  • Synonyms ( desire , etc)
  • Hyponyms
  • Meronyms
  • Antonyms
  • Equivalents (such as McFadden being referred to as he, McFadden, star, actor)

Any of these relations to the word being considered counts as a repetition and so makes a link.

Step 2

Links alone do not suffice. What makes a sentence truly prominent is being bonded by a pattern of at least 3 links with other sentences in the text.
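A minimal sketch of the bonding test, assuming each sentence has already been reduced by hand to a set of concept labels (so want, wanted and desire would all map to one label, and he, star and actor to "mcfadden"). That reduction, the pairwise reading of "at least 3 links", and the name `bonded_pairs` are assumptions of this sketch.

```python
from itertools import combinations


def bonded_pairs(sentences, min_links=3):
    """Return (i, j, shared_labels) for sentence pairs sharing >= min_links labels."""
    bonds = []
    for (i, first), (j, second) in combinations(enumerate(sentences), 2):
        links = first & second  # concept labels repeated across both sentences
        if len(links) >= min_links:
            bonds.append((i, j, sorted(links)))
    return bonds


# Hypothetical, hand-labelled sentences.
sentences = [
    {"mcfadden", "stab", "hospital", "london"},
    {"mcfadden", "stab", "hospital"},
    {"mcfadden", "police"},
]
bonds = bonded_pairs(sentences)
```

Here only the first two sentences share three labels, so they form the single bond; the third sentence is linked to the others but not bonded.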

Both these methods rely on conceptual repetition to come up with keyness.

Method 3 (Author's process):
Step 1:

Count verbatim repetitions of words only (so wanted, want, wants are treated as different words)

Step 2:

Use a reference corpus (preferably large) to generate an expected frequency for each of those words.

Then find the words that occur outstandingly more often than their expected values based on the reference corpus. These are the keywords (KW).

To decide whether a word is outstanding, run a chi-square test (observed vs expected frequency).
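The two steps can be sketched as follows. This is a rough illustration rather than the authors' tool: it uses a plain Pearson chi-square on a 2x2 table (the word vs all other words, the text vs the reference corpus), counts tokens exactly as given, and uses 3.84 (the 5% critical value at 1 degree of freedom) as an assumed default threshold. The function names are hypothetical.

```python
from collections import Counter


def chi_square_keyness(word_count, text_size, ref_count, ref_size):
    """Pearson chi-square for one word: text vs reference corpus."""
    observed = [
        [word_count, text_size - word_count],
        [ref_count, ref_size - ref_count],
    ]
    total = text_size + ref_size
    row_totals = [text_size, ref_size]
    col_totals = [word_count + ref_count, total - word_count - ref_count]
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            if expected > 0:
                chi2 += (observed[i][j] - expected) ** 2 / expected
    return chi2


def keywords(text_tokens, ref_tokens, threshold=3.84):
    """Words relatively more frequent in the text than in the reference corpus."""
    text_freq, ref_freq = Counter(text_tokens), Counter(ref_tokens)
    n_text, n_ref = len(text_tokens), len(ref_tokens)
    found = []
    for word, count in text_freq.items():
        ref_count = ref_freq[word]
        # Keep only words that are over-represented in the text.
        if count / n_text <= ref_count / n_ref:
            continue
        score = chi_square_keyness(count, n_text, ref_count, n_ref)
        if score >= threshold:
            found.append((word, score))
    return sorted(found, key=lambda pair: -pair[1])


# Toy data: "the" is about equally common in both, "hospital" is not.
text = ["hospital"] * 5 + ["the"] * 5
reference = ["the"] * 500 + ["hospital"] + ["other"] * 494
result = keywords(text, reference)
```

A stricter threshold would shorten the list; with a very small text, expected counts are low and the chi-square approximation is less reliable.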

Sample results:

KeyWords

Figures: key words; plot of key words

KeyWords Links

Figures: key word links; key word links of collocated

Links between KW

Consider linkage between KWs as collocational neighbours. This gives KeyWords which share not only textual keyness but local proximity as well; the collocation neighbourhood is, say, a span of x words to the left or right of the keyword.

Note that KW linkages are not like ordinary collocational linkages, since they require both node and collocate to be key. A general collocate just returns pairs like strong tea, where the collocate merely classifies the tea. By looking for keywords within a span of, say, 5 words to the left or right of a key word, KW linkage establishes a relation between 2 different key concepts.
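A small sketch of this idea, assuming the text is already tokenised and the set of KWs is already known. The function name `keyword_links` and the choice to count each unordered pair once per co-occurrence are assumptions of the sketch.

```python
from collections import Counter


def keyword_links(tokens, keywords, span=5):
    """Count pairs of distinct key words occurring within `span` words of each other."""
    keyset = set(keywords)
    # Positions of key words only: both node and collocate must be key.
    positions = [(i, tok) for i, tok in enumerate(tokens) if tok in keyset]
    links = Counter()
    for a, (i, node) in enumerate(positions):
        for j, collocate in positions[a + 1:]:
            if j - i > span:
                break  # later key words are even further away
            if node != collocate:
                links[tuple(sorted((node, collocate)))] += 1
    return links


tokens = ("the star was taken to hospital after the attack "
          "near the bridge in london").split()
links = keyword_links(tokens, {"star", "hospital", "bridge"}, span=5)
```

In this toy sentence star and hospital are 4 words apart and so are linked, while hospital and bridge are 6 apart and fall outside the span.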

Ch5 : Keywords and genres

Pg : 73

Objectives in the Chapter:

  • explain notion of association i.e. the contextual relationship between words that are key in the same texts.

Keyword linkage between texts

We have established keyness and co-keyness.

Jones's (1971) patterns in keyword linkage

Figure: keyword links in text

  • Strings occur when a series of KWs are connected one to the next within a text
  • Star linkage occurs when a central KW is shared in linkage by several texts
  • Cliques are sets of inter-connected KWs
  • Clumps are also inter-connected but have links to other parts of the text as well (represented with dashed lines)

Keywords across texts can be plotted as a graph.
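As a rough sketch, the graph can be held as an adjacency map built from KW link pairs. The helpers below are hypothetical, and they read the patterns loosely: a star centre is taken to be any KW linked to at least `min_spokes` others (an assumed threshold), and a clique is a set of KWs in which every pair is linked.

```python
from itertools import combinations


def build_graph(links):
    """Undirected adjacency map from an iterable of (kw_a, kw_b) pairs."""
    graph = {}
    for a, b in links:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def star_centres(graph, min_spokes=3):
    """Key words linked to at least `min_spokes` other key words."""
    return sorted(kw for kw, nbrs in graph.items() if len(nbrs) >= min_spokes)


def is_clique(graph, nodes):
    """True if every pair of the given key words is directly linked."""
    return all(b in graph.get(a, set()) for a, b in combinations(nodes, 2))


# Hypothetical link pairs for illustration.
graph = build_graph([
    ("hospital", "star"),
    ("hospital", "london"),
    ("hospital", "stabbed"),
    ("star", "london"),
])
centres = star_centres(graph)
```

With these made-up links, hospital is the only star centre, and hospital, star and london form a clique.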





Reference