Linguistik online 43, 3/2010
Morphologically complex nouns in English Scientific Texts after Empiricism [*]
Isabel Moskowich (A Coruña)
The way in which language was used in the early days of the scientific register has been studied by different authors in a number of different ways. Thus, medical writing has been approached from a discourse perspective (Taavitsainen 2004), from the point of view of its lexicon (Norri 1998, 2004), in studies on code-switching (Pahta 2004), etc. Other disciplines within the scientific register/genre have also been the object of study, mainly as regards the particular processes by which they acquired new vocabulary; in other words, how a lexical repertoire particular to it was created (Cowie 1998; Moskowich/Crespo 2006; Crespo/Moskowich 2007) in a period known to have been especially rich from a lexical point of view (Görlach 1991: 136–137).
This paper examines the different characteristics of new vocabulary formations (in nouns) following the introduction of the scientific method in the 17th century in order to ascertain whether the morphological processes in these special languages behave the same way across two different scientific disciplines and levels of formality of texts or if, on the contrary, each adapts to the needs of its particular type of users and audiences. In the latter case, the use of different morphological devices could be viewed as an in-group strategy on the part of authors. However, given the scarcity of the data surveyed at this stage of the research, the analysis and results should not be interpreted as extensible to other periods, text-types or disciplines.
For some time now there has been a growing interest in the features and development of scientific English. However, it has often been dealt with as if it were a monolithic structure. It is the aim of this paper to show that variation may also be inherent to scientific English and that the study of the word-formation patterns chosen in texts may help, though very modestly, to draw a better portrait of that development. To this end, section 2 offers an outline of the different word formation processes present in the second half of the 17th century, in the period known as empiricism or modern science (Latour 1999), while section 3 contains a description of the material from which my data have been extracted. In section 4, I present the results of my analysis in relation to each individual word-formation process. Unclear cases are dealt with in section 5, and some very preliminary conclusions are drawn in the final section (6).
2 Some Preliminary Remarks
Though there were calls to enhance the eloquence of English through borrowings and other devices, scholars at the time, such as Galileo, Bacon and Lever (as early as 1573), argued that accurate language tools were needed to describe complex natural phenomena (Stubbs 1996: 18). Experimentation and observation according to the new scientific method should have their verbal equivalent in language, they claimed. The polysemy and ambiguity of everyday language were deemed unacceptable and the mechanisms of a 'new language' designed to avoid both began to take form.
Halliday (1978: 195f.) identifies seven strategies used in the formation of specialised terminology:
1. Reinterpretation of existing words
In my analysis of the patterns by which scientific language created new lexical units in the 17th century texts analysed, I will be concentrating on Halliday's strategies 2 and 7, that is to say, the creation of new words from already existing material.[1] These methods encompass probably the two most important word-formation processes in the English language: compounding and derivation or affixation.
By derivation, I understand the combination of two or more elements, where one of them functions as the base (being either a free or a bound morpheme, as in greatnesse, Strangehopes 1663: 15) and the remaining elements are affixes. The result of this merging is the creation of a new word. While the number of affixes listed in grammar books or other linguistic works dealing with word-formation can vary,[2] my own approach here will include all those affixes that are considered by most authors. Similarly, I will adopt for the purpose here Dalton-Puffer's (1996) view of derivation, in which forms such as restorative are considered not borrowings (although they contain a foreign, non-native element) but derivatives.
As in earlier periods, language contact favoured the introduction of new vocabulary items, both simple and complex. In the dissemination of scientific knowledge, authors reading works published in other languages (Latin mainly, though not exclusively) would have acquired new words and structures which, once assimilated by the scientific discourse community, could then be used in combination with more familiar native bases (e. g. unfortunately). The reverse was also true: foreign bases could be combined with native affixes (observing). I share Adams's (2001: 11) view that a complete account of English word formation must take into consideration the hybrid nature of English vocabulary and the external circumstances under which the processes involved in enlarging the lexicon developed. However, no such complete account is relevant for our purposes here.
The issue of compounding has proved much more complicated. In fact, Plag (2003: 132) is right on the mark when he explains that
Plag (2003) proposes a characterisation of compound nouns in terms of the properties observed in formations of this kind, which he groups according to four different levels of linguistic analysis: syntactic, morphological, phonological and semantic.
To illustrate his point, he quotes Weinreich's (1963) definition of idiom as 'a grammatically complex expression a+b whose designatum is not completely expressible in terms of the designata of A & B respectively'. However, Bauer himself (1978, 1983) claims a stage of lexicalisation in the formation of a compound. There would seem, therefore, to be no clear-cut distinction between the concepts of compound and idiom/collocation.
What is it that makes us consider celestial body/glittering star (Strangehopes 1663: 63) a compound, a single entity referring to some kind of star or planet, rather than an NP in which celestial is just a pre-modifier denoting the properties of a body?[4] Conversely, what makes us reject widely used collocations (in the broad sense of the term) such as whisky and soda[5] as compounds and class them together with other co-ordinated constructions? To answer these questions, we have often relied upon semantic criteria (cohesion, mainly), and intuition.[6]
Where both compounding and derivation are involved, certain authors (Bauer 1983) claim that if derivation takes place subsequent to compounding, the result should not be taken as an example of compounding. My own survey contains formations of this kind that have undergone both processes. However, in all instances derivation appears to have taken place first. Ecliptick line (Strangehopes 1663: 21, 22), for example, seems to have been formed later than the word ecliptick itself.
For the present paper I have, once again, selected texts of two sorts. Firstly, medical texts, represented by A Choice Manual of Rare and Select Secrets in Physick and Chyrurgery; Collected, and Practised by the Right Honorable, the Countesse of Kent, late deceased by William Shears, 1653. This text has been transcribed by the team working at present on the compilation of the Corpus of Early English Recipes (CoER).[7] Astronomy texts, secondly, are represented by Vincent Wing's fifth book in the Armonicum Coeleste: or, the Coelestial Harmonie of the Visible World, dated 1651; and A Book of Knowledge in three Parts, published in 1663 by Samuel Strangehopes. These last two have been extracted from CETA (a Corpus of English Texts on Astronomy), the astronomy section of the Coruña Corpus of English Scientific Writing.[8] Though they are all scientific texts, they differ in domain or discipline, which makes it possible to see whether word-formation patterns are related to discipline and intended audience.
My three samples contain 36'268 words overall, as Table 1 shows, which is just a small corpus for a very preliminary approach to the topic. Of those, 20'874 words belong to the medical recipes sample, and 15'394 to the astronomy texts (Strangehopes contains 8'634 and Wing 6'650).
Table 1: Corpus material
For the purpose of my study, those word classes which could be subject to different native morphological processes seem to be especially significant. I have elected to examine the lexical category of nouns in order to observe how they combine with affixes of different origins or other lexical stems, and to see to what extent the scientific register has been able to create and increase its own repertoire as a reflection of the identity and specificity of the field. The lexicon of the scientific register feeds off nouns more than off any other part of speech since they are used to convey ideas or to identify new objects or inventions, processes in nature, etc. In fact, as Nevalainen (1999) states, nouns are one of the largest and most important lexical categories in scientific terminology.
Place names and proper nouns have been excluded from my data, except when they appear as part of a compound (e. g. Jordan almonds) or when used to denote ingredients or other things of that nature. Thus Arundell, for instance, is excluded as it appears in
whereas proper nouns have been included when they form part of an ingredient, as in
Cardinal points and nouns denoting seasons, days of the week, months of the year, constellations and zodiac signs have also been ignored. Adjectival nominalisations (the good) and -ing nominalised forms (rising, akyng) have been taken into consideration.
4 Analysis of data
The three variables I will be looking at here are: word formation processes involved, namely derivation and compounding; etymological origin, i. e. the source language from which a particular item has been adopted; and discipline, namely medicine and astronomy.
After manual collection of data, I have found 1'013 different nouns or types, corresponding to 9'681 tokens (astronomy: 2'654 + 2'010; medicine: 5'017). This represents 26.7 % of the total 36'268 words in our samples, which seems to support Nevalainen's (1999) claim.
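The proportion just quoted can be re-derived from the counts given in the text. The following is only a minimal sketch in Python; the variable names are illustrative, and since the text does not say which of the two astronomy token counts belongs to which author, they are kept as an unlabelled pair.

```python
# Noun token counts as reported in the text.
# The two astronomy figures are not attributed to a specific author there.
astronomy_noun_tokens = [2654, 2010]
medicine_noun_tokens = 5017
corpus_words = 36268  # running words in the three samples overall

noun_tokens = sum(astronomy_noun_tokens) + medicine_noun_tokens
noun_share = 100 * noun_tokens / corpus_words  # percentage of all words

print(noun_tokens)           # 9681
print(round(noun_share, 1))  # 26.7
```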
4.1 Word formation processes
Analysing the word formation processes in these nouns, the results show that simple forms predominate (581 types), followed by derivation with 311 types and, finally, compounds with 121 types. These figures are displayed in Graph 1 below:
Graph 1: Word formation processes
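As a rough aid to reading these figures, the type counts can be turned into percentage shares. The sketch below uses only the three counts reported above; the dictionary keys and variable names are my own labels, not terms fixed by the analysis.

```python
# Noun types per category, as reported in the text.
type_counts = {"simple": 581, "derivation": 311, "compounding": 121}

total_types = sum(type_counts.values())

# Share of each category among all noun types, in per cent.
shares = {name: round(100 * n / total_types, 1) for name, n in type_counts.items()}

# Share of morphologically complex types (derivatives plus compounds).
complex_types = type_counts["derivation"] + type_counts["compounding"]
complex_share = round(100 * complex_types / total_types, 1)

print(total_types)    # 1013
print(shares)         # {'simple': 57.4, 'derivation': 30.7, 'compounding': 11.9}
print(complex_share)  # 42.6
```

On these counts, morphologically complex nouns account for somewhat under half of all types.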
The reason for this higher proportion of derivative forms may lie in the abundance of borrowed forms in general attested since the Inkhorn controversy. Many of these borrowings were derivatives themselves already in the source language, taken mainly from Latin through translations and later adapted and incorporated into the English language. This is in line with the findings presented in 4.2 below, where items of Romance origin clearly predominate.
Compounding provided a useful mechanism for conveying scientific content in a simple, clear, concise style, in keeping with the new demands upon contemporary writers in the field. That is the case of Zodiacal circle (Wing 1651) where the form zodiacal shows both a Latinate base and suffix (according to the Oxford English Dictionary, OED).
Were we to consider derivatives from a different perspective (i. e. Moskowich/Crespo 2006), our figures for them and compounds would be more balanced. Moreover, all the texts studied here were produced at a time when vernacularisation was already well established, though some authors did still continue to write in Latin.
4.2 Etymological origin
In order to examine the second variable I have resorted to the Oxford English Dictionary. In it a number of different origins – OE, F, OF, ON, L, Gr, NLG, NHG, MDu, AF[9] – are given but, for the sake of simplicity, I have decided to separate them into two groups, Romance or Latinate (Harley 2006: 165) and Germanic. Graph 2 shows the clear predominance of nouns of Romance origin:[10]
Graph 2: Etymological origins
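The two-way grouping of OED origin labels described above can be sketched as a simple lookup. This is an illustration under stated assumptions, not the procedure actually followed: the assignment of each label to a group is inferred rather than given in the text, and placing Greek (Gr) with the Latinate group in particular is a guess. The function names are hypothetical.

```python
# OED origin labels listed in the text, mapped to two broad groups.
# ASSUMPTION: this assignment is inferred, not stated in the paper;
# grouping Greek (Gr) with the Romance/Latinate set is a guess.
ROMANCE = {"F", "OF", "L", "AF", "Gr"}
GERMANIC = {"OE", "ON", "NLG", "NHG", "MDu"}


def group_of(label: str) -> str:
    """Return the broad etymological group for one OED origin label."""
    if label in ROMANCE:
        return "Romance"
    if label in GERMANIC:
        return "Germanic"
    raise ValueError(f"unknown origin label: {label}")


def classify(element_labels: list[str]) -> str:
    """Classify a formation by the origins of its elements.

    A formation whose elements all fall in one group receives that
    group's name; one that mixes the two groups is a hybrid.
    """
    groups = {group_of(label) for label in element_labels}
    return groups.pop() if len(groups) == 1 else "hybrid"


print(classify(["L", "L"]))    # Romance
print(classify(["OE", "ON"]))  # Germanic
print(classify(["L", "OE"]))   # hybrid
```

The label pairs in the usage lines are abstract inputs and do not stand for particular words in the samples.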
Taking the two variables so far considered, we find that in derivation Romance forms predominate, with both affixes and bases sharing a common origin:
Graph 3: Etymology in derivative forms
The examples below illustrate the types of formations found in my samples in relation to their different provenance, all of them taken from Shears (1653):
The hybrids in example (6) can be considered to illustrate the degree of vernacularisation in English scientific writing: the OED dates the first occurrence of the verbal noun vomiting to 1495 and the verb vomit, from which it derives, to 1422, which suggests a very rapid development. In the case of faintenesse, the abstract noun is first recorded in 1398.
Upon closer inspection, compounds also show some peculiarities with regard to etymology. In this case, and in contrast to what has just been described for derivatives, the nouns seem to stem, preferably, from Germanic sources. The results of my analysis are displayed in Graph 4:
Graph 4: Etymology in compounds.
These data seem to confirm the general tendency in the English language to adopt derivative forms from the Romance languages but to create compounds out of native stock. Consider examples (7) to (9):
According to the OED, the hybrid compounds in (9) were introduced later than the hybrid derivatives described previously: claret wine in 1513 and dead palsie in 1592. Since the elements in these instances occur in phrases rather than in single words, the cohesion between them seems weaker than that between the elements of hybrid derivatives.
4.3 Word formation and etymology in the two disciplines
This section combines the first two variables, word formation process and etymology of the elements involved, and examines their distribution across the two disciplines under survey. The results are presented separately for each discipline.
4.3.1 Astronomy texts
As Table 2 shows, derivation is the predominant process in astronomy texts:
Table 2: Word formation and etymology in astronomy texts
It also shows that there is a considerable difference in the number of types between derivation and compounding. Most of these derivatives are of Romance or Latinate origin, such as those in the example below:
In compound nouns we observe the reverse pattern, where Germanic forms (51.6 %) predominate over those of Romance provenance (11.6 %). Some of these Germanic forms are exemplified in (11) below:
What is striking here is the number of hybrid formations, such as those in (12) below:
In fact, the percentages in my table illustrate that hybrid formations prefer compounding (36.6 %) over derivation (5.3 %) in the same way that native structures do (where 51.6 % involve compounding against only 17.7 % derivation). This may be due to the fact that many of these hybrids are felt as native forms by speakers and therefore undergo similar processes.
4.3.2 Medicine text
My study of the medical text reveals a clear preference for compounding over derivation (with examples such as handful, restorative, swelling), as seen in Table 3:
Table 3: Word formation and etymology in the medicine text
Here, too, the compounds found have different origins, as Table 3 shows: 31.14 % of my cases are of Romance provenance, as in examples (13) to (15); 22.9 % have a mixed etymology (see (16)); and 45.9 % have been obtained from native forms (see (17)):
The striking difference between the number of derivatives and compounds in my data is not due to the discipline to which the text could be ascribed but to the provenance of the lexicon used. The medical sample used for this study is not an academic text but rather a practice-oriented one (Taavitsainen 2004). The selection of vocabulary, therefore, is undoubtedly determined by its intended audience. The author, keeping in mind the average practitioner, may have decided to resort to common words to ensure his message was understood. Indeed, writers at the time, such as Ralph Lever in the Arte of Reason (quoted by Jones 1953: 126), highlighted the usefulness of forming compounds in the English language. As Gotti (1996: 22) points out:
An example of this is found in (18) below, which also meets the pragmatic principle of maximum transparency:[11]
It would seem, therefore, that etymology is the decisive factor in choosing between derivation and compounding: derivation is linked to the use of Romance-origin or Latinate words while native stock seems to have a tendency to follow compounding patterns. In other words, it seems it is not discipline but the etymological origin of the base that determines the choice of word formation process.
5 Some unclear cases
Not all the nouns examined have been easy to classify as either derivatives or compounds; such cases have not been included in my database. Problems arose when analysing them as I realised that the formations in question in many cases were somewhere in the middle of a lexicalisation process. How, for instance, should one categorise constructions such as the ones in examples (19) to (23)?
In an attempt to clarify the issue, I applied, when possible, the various criteria posited by Plag (2003) and mentioned in section 2 that are normally used to distinguish between compounds and other structures:
The phonological criterion was of no use, since stress cannot be an indicator of compounding in written material.
Turning to the syntactic criteria, I found that none of my unclear cases fulfilled any of the properties listed there (recursivity, right-hand head rule, modification by very).
In relation to morphology, I observed that not all my examples behave the same. When expressing the plural, for instance, the right-hand element is not always marked for number, owing, it would seem, to some inherent semantic constraints upon the right-hand element (such as being a proper or an abstract noun).
The varying degree of lexicalisation in each instance may also account for the difference in behaviour between examples (20) and (21), on the one hand, and (22) and (23), on the other. Other proposals view prepositional phrases in these cases as phrasal adjectives functioning as post-modifiers in the noun phrase (Gross/Miller 1990).
From a semantic point of view, however, these unclear cases can be considered compounds, since they are perceived as a single unit and express a single content (Zandvoort 1972) and since, as Bauer (1978, 1983) has explained, a certain degree of lexicalisation is present in the formation of compounds.
No doubt further research is needed in order to classify these unclear cases and to determine which criteria apply in each case.
6 Concluding remarks
Though the samples analysed are not at all sufficient to obtain any definitive conclusions, some preliminary ones can be posited. Almost half the types examined exhibit some word formation process, primarily derivation. The process of compounding tends to concentrate meaning and this was a practice that was to increase gradually from the second half of the 17th century onwards as part of the effort to use the resources of the English language for scientific purposes (Jones 1930; Shapin 1984), but one which is not yet prevalent in the texts in my survey. As regards origins and disciplines, I have seen that a Germanic origin is more frequent in the medical text while Romance is more so in astronomy. However, it could be claimed that this distribution is not determined by discipline but by the text's intended audience. That is to say, the pragmatic concerns of the sub-genre are dictating linguistic choices: the choice of word-formation patterns can be considered a tool employed to meet the expectations of different readerships.
Although the texts included in my analysis all belong to the so-called 'scientific register' and though samples are not large at all, my data have demonstrated that the register is not monolithic but, rather, shows different levels of scientism. Scientific writing depends on its users and what their needs are because the linguistic mechanisms employed will vary according to what is to be transmitted.
Similar results have been obtained for earlier periods in the language using comparative samples from the same two disciplines (Moskowich/Crespo 2006; Crespo/Moskowich 2007). This parallelism may be explained by the fact that in the seventeenth century special languages were not so different from the general language (Gotti 1996), a situation that would continue until the Royal Society was created and authors such as Boyle and Priestley became conscious of the necessity of creating a particular way of writing about scientific matters.
All in all, we have seen that some ad hoc constructions are created for scientific writing, especially in particular disciplines as is the case of Astronomy. Perhaps we need to rethink the existing taxonomy of text-types/genres (particularly in terms of how we conceive of science as a discipline or different disciplines), and our understanding of the relationship between word-formation processes, level of technicality and genre.
* The research which is here reported on has been funded by the Xunta de Galicia through its Dirección Xeral de Investigación e Desenvolvemento, grant number PGIDIT07PXIB104160PR. This grant is hereby gratefully acknowledged. Here I use the term 'complex' in the same sense as Adams (2001); that is to say, to refer to any noun to whose root some other linguistic material has been added so that it represents both compounding and derivation.
1 Strategy number 1 will not be dealt with since it entails semantic shift rather than any word-formation process. Strategy number 5, though also related to word creation, has been excluded from this study because it does not resort to forms or parts of forms already existing in the language.
3 This phenomenon is known in the literature on word-formation as 'feature percolation': the right-hand member is the head, from which the grammatical properties are transferred to the whole compound.
8 The Coruña Corpus is a project currently being carried out in the University of A Coruña (Spain) by the Research Group for Multidimensional Corpus-based Studies in English (MuStE). The main interest of the group is the study of language change as observed in scientific texts, not only from a diachronic point of view but also for each linguistic period. Its purpose is to facilitate investigation at all linguistic levels, excluding phonology. More information about the research group can be found at http://www.udc.es/grupos/muste.
9 These abbreviations used in the OED correspond to the following source languages: Old English, French, Old French, Old Norse, Latin, Greek, North Low German, North High German, Middle Dutch, Anglo-French.
10 When referring to Romance origin forms here we have not analysed the possible influence of Latin constructions since that is not the aim of this paper. No doubt, the works by Charles D. F. Du Cange (Glossarium mediae et infimae Latinitatis. Paris: 1678) and Alexander Souter (Glossary of later Latin 150–600 AD. Oxford: 1949) would have been relevant if this were the case.
Adams, Valerie (2001): Complex Words in English. Edinburgh.
Bauer, Laurie (1978): The Grammar of Nominal Compounding. Odense University Studies in Linguistics. Odense.
Bauer, Laurie (1983): English Word-Formation. Cambridge.
Biber, Douglas (1995): Dimensions of Register Variation. A Cross-Linguistic Comparison. Cambridge.
Cowie, Claire (1998): "The Discourse Motivations for Neologising: Action Nominalization in the History of English". In: Coleman, Julie/ Kay, Christian J. (eds.): Lexicology, Semantics and Lexicography – Selected Papers from the Fourth G. L. Brook Symposium, Manchester, August 1998. Amsterdam: 179–206.
Crespo, Begoña/Moskowich, Isabel (2006): "Derivational Morphology Revisited: the Study of Verbal Forms in English Scientific Discourse". Paper presented at the 18th SELIM (Spanish Society for Medieval English Language and Literature) Conference Málaga 2006.
Crespo, Begoña/Moskowich, Isabel (2007): "Medicine, Astronomy, Affixes and Others: An Account Of Verb Formation In Some Early Scientific Works". Journal of the Spanish Society for Medieval English Language and Literature 13: 179–198.
Dalton-Puffer, Christiane (1996): The French Influence on Middle English Morphology. A Corpus-Based Study of Derivation. Berlin.
Gotti, Maurizio (1996): Robert Boyle and the language of science. Milan.
Görlach, Manfred (1991): Introduction to Early Modern English. Cambridge.
Gross, Derek/Miller, Katherine (1990): "Adjectives in WordNet". International Journal of Lexicography 3/4: 265–277.
Halliday, Michael A. K. (1978): "Sociolinguistic aspects of mathematical education". In: Halliday, Michael A. K. (ed.): Language as Social Semiotic. London: 194–204.
Harley, Heidi (2006): English Words. A Linguistic Introduction. Oxford.
Hay, Jennifer/Baayen, Harald (2002): "Parsing and Productivity". In: Booij, Geert E./van Marle, Jaap (eds.): The Yearbook of Morphology 2001. Dordrecht: 203–235.
Jespersen, Otto (1942): A Modern English Grammar on Historical Principles. Part IV: Morphology. London/Copenhagen.
Jones, Richard F. (1930): "Science and English Prose Style in the Third Quarter of the Seventeenth Century". Publications of the Modern Language Association 45/4: 977–1009.
Jones, Richard F. (1974): The Triumph of the English Language. Stanford.
Kastovsky, Dieter (1992): "Semantics and Vocabulary". In: Hogg, Richard (ed.): The Cambridge History of the English Language: From the Beginnings to 1066. Cambridge: 290–408.
Katamba, Francis (2005): English Words. Structure, History, Usage. New York.
Kruisinga, Etsko (1932): A Handbook of Present-Day English. 5th ed. Groningen.
Latour, Bruno (1999): Pandora's Hope: Essays on the reality of Science Studies. Cambridge MA.
Lever, Ralph (1573): The Arte of reason, rightly termed, Witcraft, teaching a perfect way to argue and dispute. London.
Marchand, Hans (1969): The Categories and Types of Present-Day English Word-Formation. Munich.
Moskowich, Isabel/Crespo, Begoña (2006): "Lop-webbe and henne cresse: Morphological Aspects of the Scientific Register in Late Middle English". Studia Anglica Posnaniensia 42: 133–145.
Moskowich, Isabel/Crespo, Begoña (2007): "Presenting the Coruña Corpus: A Collection of Samples for the Historical Study of English Scientific Writing". In: Pérez-Guerra, Javier et al. (eds.): Of Varying Language and Opposing Creed. New Insights into Late Modern English. Berlin/New York: 341–357.
Nevalainen, Terttu (1999): "Lexis and Semantics". In: Lass, Roger (ed.): Cambridge History of the English Language: 1476–1776. Cambridge: 332–458.
Norri, Juhani (1998): Names of Body Parts in English, 1400–1550. Helsinki. (= Annales Academiae Scientiarum Fennicae, Humaniora 291).
Norri, Juhani (2004): "Entrances and exits in English Medical Vocabulary, 1400–1550". In: Taavitsainen, Irma/Pahta, Päivi (eds.): Medical and Scientific Writing in Late Medieval English. Cambridge: 100–143.
Pahta, Päivi (2004): "Code-Switching in Medieval Medical Writing". In: Taavitsainen, Irma/Pahta, Päivi (eds.): Medical and Scientific Writing in Late Medieval English. Cambridge: 73–99.
Plag, Ingo (2003): Word-Formation in English. Cambridge.
Shapin, Steven (1984): "Robert Boyle's Literary Technology". Social Studies of Science 14/4: 481–520.
Stockwell, Robert/Minkova, Donka (2001): English Words: History and Structure. Cambridge.
Stubbs, Michael (1996): Text and Corpus Analysis. Oxford.
Taavitsainen, Irma (2004): "Transferring Classical Discourse Conventions into the Vernacular". In: Taavitsainen, Irma/Pahta, Päivi (eds.): Medical and Scientific Writing in Late Medieval English. Cambridge: 37–72.
Weinreich, Uriel (1963): "Lexicology". In: Sebeok, Thomas A. (ed.): Current Trends in Linguistics. Vol. I: Soviet and East European Linguistics. The Hague: 60–93.
Williams, Edwin (1981): "On the notions 'lexically related' and 'head of a word'". Linguistic Inquiry 12/2: 245–274.
Zandvoort, Reinard Willem (1972): A Handbook of English Grammar. London.