Name: ………………………………………………………………………………

1.  Which of the following statements is NOT true about corpus linguistics?

a.  Corpus linguistics is a prescriptive field of study.

b.  Corpus linguistics combines qualitative and quantitative research methods.

c.  Corpus linguistics is a study tool for multiple disciplines, including linguistics, translation, and literary studies.

2.  A corpus of fiction and non-fiction books as well as newspaper articles, political speeches, and letters from the 1800s to the 2000s is best described as …

Page 4 of 5

a.  a synchronic, general corpus

b.  a synchronic, specialized corpus

c.  a diachronic, general corpus

d.  a diachronic, specialized corpus

3.  Unlike corpus-based studies, corpus-driven studies do not try to test or validate any given theory.

a.  True

b.  False

4.  Typically, corpus-based translation studies use …

a.  comparable corpora

b.  bilingual corpora

c.  monolingual corpora

d.  raw corpora

5.  Which of the following statements is NOT true about what a corpus is?

a. It is a machine-readable collection of texts.

b. It must be authentic.

c. It should be both representative and balanced.

d. It must include both old and modern texts.

6.  ... are collections of historical texts.

a.  Diachronic corpora

b.  Synchronic corpora

c.  Modern corpora

d.  General corpora

7.  To know the vocabulary size of any corpus, we need to look at …

a.  the total number of tokens

b.  the total number of types

c.  the token/type ratio

d.  the collocations of the corpus

8.  Raw frequency takes into account the total number of words in the corpus.

a.  True

b.  False

9.  With online corpus processors, you can download the corpus to your computer.

a.  True

b.  False

10.  A researcher wants to know the most frequent vocabulary mistakes of first-year students. He should use a … corpus.

a.  monitor

b.  learner

c.  comparable

d.  parallel

11.  Authenticity is best defined as …

a.  texts occurring in natural communicative settings

b.  texts tailored to prove the researcher’s hypotheses

c.  texts collected from native speakers

d.  texts collected in written formats

12.  What type of corpus should a researcher use to identify obsolete Arabic words that are no longer used in today’s movies?

a.  a diachronic monolingual corpus of Arabic movies

b.  a diachronic bilingual corpus of both English and Arabic movies

c.  a synchronic monolingual corpus of Arabic movies and books

d.  a synchronic monolingual corpus of Arabic movies

13.  In a corpus of 3,000 words, the raw frequency of ‘reserve’ as a noun is 100, while its raw frequency as a verb is 50. What is the best way to calculate the normalized frequency of ‘reserve’ as a noun?

a.  (3000 ÷ 100) × 1000

b.  (100 ÷ 3000) × 1000

c.  (50 ÷ 3000) × 1000

d.  (50 ÷ 100) × 1000
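
Study note: normalized frequency rescales a raw count to a common base so that counts from corpora of different sizes can be compared. The sketch below is only an illustration; the function name, the default base of 1,000, and the figures are assumptions and are not taken from the question above.

```python
def normalized_frequency(raw_count, corpus_size, base=1000):
    """Return the number of occurrences per `base` tokens.

    `base` is a hypothetical default; any common base can be used
    as long as it is the same for every corpus being compared.
    """
    if corpus_size <= 0:
        raise ValueError("corpus_size must be a positive number of tokens")
    return raw_count * base / corpus_size


# Illustrative figures only: a word found 25 times in 5,000 tokens.
print(normalized_frequency(25, 5000))  # 5.0 occurrences per 1,000 tokens
```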

14.  Unlike offline corpus processors, online corpus processors enable you to work on your own corpus.

a.  True

b.  False

15.  To find out which words can replace the word ‘penny’ in the idiom ‘a penny for your thoughts’, we can use the wildcard option as follows:

a.  a penny * for your thoughts

b.  a * for your thoughts

c.  a* penny for your thoughts

d.  a * penny for your thoughts

16.  All corpus processors – both online and offline – can differentiate between collocations, idioms, and phrasal verbs.

a.  True

b.  False

17.  To find out the adjacent right-hand collocations of ‘give’, we need to set the window size to …

a.  0 give 1

b.  1 give 1

c.  0 give 0

d.  1 give 0

18.  A wordlist is … (choose the best answer)

a.  a list of words

b.  a list of words sorted by their normalized frequencies

c.  a list of words sorted by their raw frequencies

d.  a list of phrase raw frequencies

19.  To find out the most frequent nouns in American English, the researcher needs …

a.  a morphologically annotated specialized corpus

b.  a semantically annotated raw corpus

c.  a raw corpus annotated for parts of speech

d.  a corpus annotated for parts of speech

20.  To know the most frequent spelling mistakes in the essays of the first-year students in the Department of English at Al-Alsun, which of the following corpora should you use?

a.  a corpus of essays collected from the first-year students in the mainstream and the credit-hours systems in the English and Spanish departments of Al-Alsun.

b.  a corpus of essays collected from the first-year students in the mainstream and the credit-hours systems in the English department of Al-Alsun.

c.  a corpus of essays collected from the first-year students in the mainstream system in the English and Spanish departments of Al-Alsun.

d.  a corpus of essays collected from the first-year students in the credit-hours system in the English and Spanish departments of Al-Alsun.

21.  One disadvantage of online corpus processors such as COCA is that …

a.  they are portable

b.  they are not available for free

c.  you can’t upload your own corpus

22.  The Web Interface of online corpus processors is …

a.  computer software that you can download and install on your local machine.

b.  online software that enables you to search your offline corpora.

c.  computer software that is only accessible through tablets.

d.  online software that enables you to search the online corpus.

23.  The advantages of offline corpus processors are … (choose all correct answers)

a.  many of them are available for free

b.  you can work on your own corpora

c.  they come with their own built-in corpora

d.  they are portable as you can use them everywhere once installed on your PC.

24.  Corpus analysis can be done manually without using computers.

a.  True

b.  False

25.  Big Data is an expression which means that nowadays corpora tend to have billions of words.

a.  True

b.  False

26.  The main difference between monitor and diachronic corpora is that …

a.  Monitor corpora are static while diachronic corpora are dynamic

b.  Monitor corpora have historical texts but diachronic corpora have modern texts.

c.  Monitor corpora are dynamic while diachronic corpora are static

27.  A corpus of English biology research papers is … (choose the best answer)

a.  a monolingual corpus

b.  a monolingual specialized corpus

c.  a monolingual learner corpus

d.  a monolingual general corpus

28.  A well-designed corpus is … (choose the best answer)

a.  representative, written, large, balanced, and authentic

b.  machine-readable, representative, authentic, large, and balanced

c.  machine-readable, representative, authentic, and balanced

d.  general, machine-readable, authentic, representative, and balanced.

29.  For a corpus of 1000 tokens and 200 types, how will you calculate its lexical richness?

a.  200 ÷ 1000

b.  1000 ÷ 200

c.  200 × 1000

d.  1000 + 200
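
Study note: tokens are the running words of a corpus, and types are the distinct word forms among them. The sketch below counts both for a short sample and reports the type/token ratio, one common measure of lexical richness. The tokenizer (lower-cased runs of letters and apostrophes) and the sample sentence are simplifying assumptions, not part of the course material.

```python
import re


def tokenize(text):
    """Split text into lower-cased word tokens (a deliberately simple rule)."""
    return re.findall(r"[a-z']+", text.lower())


def type_token_stats(text):
    """Return (number of tokens, number of types, type/token ratio)."""
    tokens = tokenize(text)
    types = set(tokens)  # each distinct word form is counted once
    ratio = len(types) / len(tokens) if tokens else 0.0
    return len(tokens), len(types), ratio


# Hypothetical sample sentence.
n_tokens, n_types, ttr = type_token_stats(
    "The cat sat on the mat and the dog sat too"
)
print(n_tokens, n_types, round(ttr, 2))
```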

30.  The reference corpus used to identify keywords must be … (choose all correct answers)

a.  large

b.  specialized

c.  general

d.  academic

31.  The common base of the normalized frequency is typically set based on …

a.  the number that you like the most

b.  the number of the word types in the corpus

c.  the number of word tokens in the corpus

d.  none of the above; it is always set to 100

32.  Unlike parallel corpora, comparable corpora contain translated texts.

a.  True

b.  False

33.  A corpus of Medieval English texts is a … corpus.

a.  synchronic

b.  diachronic

c.  monitor

d.  learner

34.  Historical linguistics is interested in studying language change. The best type of corpora for historical linguistics is …

a.  learner corpora

b.  annotated corpora

c.  raw corpora

d.  monitor corpora

35.  To look for all possible prefixes that come with ‘danger’, we can try …

a.  *danger

b.  * danger

c.  danger*

d.  danger *
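
Study note: in a wildcard query, `*` stands for any run of characters. Attached to a word, it fills in part of that word; standing alone between spaces, it fills a whole word slot. The sketch below imitates this behaviour on a tokenized sample using Python's standard `fnmatch` module. Real corpus processors differ in their exact wildcard syntax, and the function name, queries, and sample text here are illustrative assumptions.

```python
import re
from fnmatch import fnmatchcase


def find_matches(query, text):
    """Return every token sequence in `text` that fits the wildcard `query`.

    Each space-separated item in the query is matched against one token.
    """
    patterns = query.lower().split()
    tokens = re.findall(r"[a-z']+", text.lower())
    n = len(patterns)
    hits = []
    for i in range(len(tokens) - n + 1):
        window = tokens[i:i + n]
        if all(fnmatchcase(tok, pat) for tok, pat in zip(window, patterns)):
            hits.append(" ".join(window))
    return hits


sample = "They rewrite and reread the end of the book of poems."
print(find_matches("re*", sample))       # ['rewrite', 'reread']
print(find_matches("the * of", sample))  # ['the end of', 'the book of']
```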

36.  Studies that do not try to teach people how to use language but, instead, try to describe how language is being used are referred to as …

a.  general studies

b.  descriptive studies

c.  monolingual studies

d.  prescriptive studies

37.  One main disadvantage of qualitative approaches is that …

a.  they are applied

b.  they are descriptive

c.  they are statistical

d.  they are biased towards the experts’ opinions

38.  Corpus annotation refers to …

a.  adding linguistic information to the corpus raw texts.

b.  collecting corpus texts from online and offline resources

c.  converting audio data into a written format via transcription or transliteration

d.  normalizing spelling variations and separating punctuation markers.

39.  In corpus linguistics, a word is a keyword if …

a.  it is the most frequent word in the text

b.  it is written in upper-case letters

c.  it comes at the beginning of the text

d.  none of the above

40.  A query is …

a.  a word or a phrase with a low frequency in the corpus

b.  a word or a phrase with a high frequency in the corpus

c.  a word or a phrase that does not exist in the corpus

d.  a word or a phrase that we look up in the corpus
