Simon Wolming, 1998: Validity. A modern approach to a traditional concept /Validitet: Ett traditionellt begrepp i modern tillämpning/. Pedagogisk Forskning i Sverige, Vol 3, No 2, Pp 81-103. Stockholm. ISSN 1401-6788.
The discussion about the quality of measurements includes concepts like precision, credibility, accuracy, trustworthiness and relevance of the measurement. The concepts of reliability and validity are central aspects of the quantitative tradition of measurement. This article argues that these two concepts, and especially validity, can be seen as general concepts independent of scientific approach.
Educational research can be classified as either quantitative or qualitative. The concept of measurement, and especially the concepts of reliability and validity, are usually connected with the quantitative research tradition. This article argues that different types of measurement occur, in one way or another, in all types of empirical sciences. Thus, measurements can be performed in quite different ways and with a variety of instruments. The problems connected with different types of measurements (e.g. psychological tests, ratings, interviews, observations, etc.) are basically the same for all scientific approaches. The shift away from the opinion that the quantitative and qualitative paradigms are in conflict can be illustrated with the following quotation from Husén (1997): "The two paradigms are not exclusive, but complementary to each other" (p. 21). This statement supports the opinion that the concepts of validity and reliability can be applied to both quantitative and qualitative methods.
This article focuses on validity and presents a historical perspective of the changed meaning of the concept of validity. It is possible to identify three significantly different periods in the development of the outlook on the concept of validity. One of the early definitions of validity was that a test is valid for anything with which it correlates (Guilford 1946). This definition of validity prevailed during the first period and focused exclusively on the statistical correlation between a predictor and a criterion.
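The first-period view, in which validity is simply the correlation between a predictor and a criterion, can be sketched as follows. This is a minimal illustration and not taken from the article: the function name `validity_coefficient` and the score lists are hypothetical, and the Pearson product-moment correlation is assumed as the correlation measure.

```python
from math import sqrt


def validity_coefficient(predictor, criterion):
    """Pearson correlation between predictor scores and criterion scores."""
    if len(predictor) != len(criterion) or len(predictor) < 2:
        raise ValueError("need two equally long sequences with at least two scores")
    n = len(predictor)
    mean_x = sum(predictor) / n
    mean_y = sum(criterion) / n
    # Deviations of each score from its own mean.
    dx = [x - mean_x for x in predictor]
    dy = [y - mean_y for y in criterion]
    cov = sum(a * b for a, b in zip(dx, dy))
    var_x = sum(a * a for a in dx)
    var_y = sum(b * b for b in dy)
    if var_x == 0 or var_y == 0:
        raise ValueError("scores must vary for a correlation to be defined")
    return cov / sqrt(var_x * var_y)


# Hypothetical data: test scores (predictor) and later grades (criterion).
test_scores = [10, 12, 15, 18, 20]
later_grades = [2.1, 2.0, 3.2, 3.4, 4.0]
print(validity_coefficient(test_scores, later_grades))
```

Under this early definition, a coefficient close to 1 would count as evidence of validity for predicting that particular criterion, without regard to how the scores are interpreted.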
During the second period this relatively simple definition of validity was replaced with a somewhat more complex one. This period was characterised by different types of validity related to the different aims of testing. Content validity was used for tests describing an individual’s performance on a defined subject. Predictive validity was used for tests predicting future performance. Concurrent validity was used for new tests proposed as substitutes for less convenient tests. Construct validity was used to make inferences about psychological traits like intelligence.
The third and current period can be characterised as the uniting of the different types of validity described above. The new outlook can be illustrated with the following quotation: "They are interrelated operationally and logically; only rarely is one of them alone important in a particular situation" (American Psychological Association, American Educational Research Association, and National Council on Measurement in Education 1974). The most important change between the third and the earlier periods is that all types of validity can now be included in the concept of construct validity. Another major difference, compared with the earlier definitions of validity, is that "one validates, not a test, but an interpretation of data arising from a specified procedure" (Cronbach 1971 p. 447). Thus, the emphasis of the validity question is now on the interpretations which can be made from a measurement procedure, not on the specific instrument used for the measurement.
A third major contribution to validity theory during the third period was presented by Messick (1989) in his four-faceted model of validity. In Messick's model, construct validity was set up to be the connecting link. In the first aspect of the model he set up construct validity in the traditional way. The second aspect was the relevance and utility of the measurement in the specific context. The third aspect was the values that could be connected with the specific measurement and, finally, the fourth aspect was the social consequences of the measurement. Messick stated that the former classification of validity was fragmented and, therefore, could not be seen as a complete model for validation. He emphasised the values connected with the measurement and the potential social consequences of the measurement for the test takers. The use of an unfair test could, for example, have negative effects for different ethnic groups, and it could also produce unfair gender differences.
The general opinion is that the consequences of a measurement have to be taken into account, but it is a matter of controversy whether they should be included in the definition of validity. The opinion of Mehrens (1997) and Popham (1996) was that, even if the social consequences are important within all research, it is questionable whether they should be connected with the theoretical concept of validity.
The four-faceted model of validity presented by Messick requires new methods and techniques for practical and applied validation. Validation has changed from being the final "quality check" in research to becoming a central, never-ending process. During the research process it has become necessary to use multiple techniques to continuously evaluate, question, and check the inferences and interpretations that are being made. Zeller (1997) stated that it is more difficult to be misled by the result that triangulates multiple techniques with compatible strengths than it is to be misled by a single technique which suffers from an inherent weakness (p. 828).
To sum up, the concept of validity has changed gradually over the years. Early definitions of validity focused on the relation between two variables, subsequent definitions focused on different types of validity, and still later definitions focused on the interpretations and meaning of test scores in the light of construct validity. According to Messick (1989), validation is a process consisting of four aspects. Thus, according to Messick's model, the main questions for the validation process are: How does the measurement correspond with the theoretical construct? Is the measurement relevant for the specified purpose? What values can be connected with the construct and the measurement? And, finally, what are the consequences of the measurement? It can also be stated that these questions are relevant irrespective of whether a quantitative or a qualitative approach is used. With this in mind we can also draw attention to the fact that there is a tendency within the field of behavioural measurement to use both quantitative and qualitative methods.
Thus, these questions cannot be fully answered with quantitative or qualitative approaches alone. The nature of these questions requires different methods and approaches for the validation. The values connected with different constructs and measurements are one aspect requiring several types of evidence. The validity evidence can therefore best be gathered with multiple methods and approaches.
In the light of this multiple approach for gathering validity evidence and the opinion that the two major paradigms, the quantitative and the qualitative, are complementary to each other, the current concept of validity can be linked to both these paradigms. Validity can therefore be regarded as a general concept independent of scientific approach.
Simon Wolming, Department of Educational Measurement, Umeå University, SE-901 87 Umeå, Sweden
Helge Strömdahl, 1998: Phenomenon and property: the physical quantities elucidated by science education research /Fenomen och egenskap: de fysikaliska storheterna i didaktisk belysning/. Pedagogisk Forskning i Sverige, Vol 3, No 2, Pp 104-1112. Stockholm. ISSN 1401-6788.
Words like energy, force, electric current and heat are generally used in everyday language as denotations for phenomena. In physical theory they are primarily denotations for physical quantities. Such distinctions are often neglected in science education. Students’ problems with scientific knowledge, revealed by educational research, are probably partly dependent on semantic indistinctness.
An analysis of the national syllabus in physics for the Swedish compulsory school reveals a mixture of everyday language and scientific language. This is especially evident in expressions concerning phenomena and physical quantities. The international system of physical quantities and units, SI, is not mentioned. Nor are the base physical quantities mass, length, time, temperature, electric current, amount of substance and luminous intensity identified as making up a special concept category from which all other quantities are derived. Instead phenomena and physical quantities are mixed up. This mirrors the dilemma of writing a syllabus in "school-physics" for citizen education without using too much systematic disciplinary physics. However, if school-physics is intended to provide scientific knowledge, it is necessary to elucidate the conceptual situation.
In contemporary philosophy of science it is a well-founded standpoint that concepts are embedded in theories and get their meanings from theory-contexts. Hence, it is evident that physical quantities are only comprehensible within the theory-frames where they belong. For instance, it is not possible to understand the classical Newtonian physical quantity of force outside Newtonian theory. It is a derived quantity connected to the base quantities mass, length and time. The physical quantity energy, also derived from the base physical quantities mass, length and time, is comprehensible within e.g. thermodynamic theory. The importance of mathematics as a communicative tool in this context cannot be overestimated. It is evident that when non-scientific interpretations of terms like force and energy are revealed among students, it is a symptom of the fact that the terms are interpreted outside of scientific theories.
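As a brief illustration of how these derived quantities connect to the base quantities, the dimensions of force and energy can be written out explicitly. This is standard dimensional analysis rather than material from the article; the symbols M, L and T are assumed here to denote the dimensions of mass, length and time.

```latex
\[
  \dim F = \mathsf{M}\,\mathsf{L}\,\mathsf{T}^{-2},
  \qquad
  \dim E = \mathsf{M}\,\mathsf{L}^{2}\,\mathsf{T}^{-2}
\]
```

In SI units this corresponds to 1 N = 1 kg m s^-2 for force and 1 J = 1 kg m^2 s^-2 for energy.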
The pedagogical answer to the problem of concept learning in science education is often concretion and laboratory work. The learner is supposed to make his or her own observations and conclusions. In the light of the reasoning above, this kind of approach is problematic. There is no guarantee of a correspondence between a student’s and a scientist’s observation and interpretation of a given experiment. Students’ "discovery" as a basis for knowledge attainment in science is often insufficient. Observation and interpretation of a mechanical experiment including an everyday conception of force does not contribute to the growth of scientific knowledge. Scientific knowledge formation is not primarily the result of direct perception but a demanding intellectual challenge, where abstraction and idealization are often to the fore. Observations and interpretations are theory-laden. To know science is to be able to talk and reason about the world in disciplined ways. One cannot expect that a student during a short laboratory course, split up into 40- or 80-minute periods, should be able to rediscover and communicate what scientists have worked with for centuries. The need for communicative guidance in laboratory work is evident. The object of learning science is more a question of becoming a participant in an existing discourse than being a free observer. It is a project with linguistic implications. The learner is situated at the border between natural language and scientific language. A teacher’s assignment is to introduce and guide the learner into the scientific way of viewing and reasoning about the world. One important issue is to make the student aware of the fact that words have separate meanings in different contexts and that words used rather arbitrarily in everyday language have precise meanings in science.
This awareness seems to be a necessary prerequisite for enlarging one’s conceptual repertoire in order to encompass the scientific meaning of words like force and energy. Hence, it is suggested that syllabuses in science education explicitly include the semantic/communicative aspects of science learning.
Helge Strömdahl, Department of Engineer Education Didactics, Royal Institute of Technology, SE-100 44 Stockholm, Sweden