These are terms used in educational assessment and testing. For more information about assessment in education, go to the educational assessment page.
Alignment – The degree of correspondence between assessments and learning objectives.
Classical Test Theory - X (observed score) = T (true score) + E (measurement error). Error scores are assumed to be normally distributed with a mean of 0. In addition, the correlation between error scores and true scores is assumed to be 0. (See Generalizability Theory and Item Response Theory.)
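The relationship X = T + E can be illustrated with a small simulation. This is only a sketch under stated assumptions: the true score of 75, the error standard deviation of 5, and the function name `simulate_observed_scores` are hypothetical values chosen for illustration, not figures from any real test.

```python
import random
import statistics

def simulate_observed_scores(true_score, error_sd, n_administrations, seed=0):
    """Return observed scores X = T + E for repeated hypothetical administrations.

    E is drawn from a normal distribution with mean 0, matching the
    classical test theory assumption about measurement error.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    return [true_score + rng.gauss(0.0, error_sd) for _ in range(n_administrations)]

TRUE_SCORE = 75.0  # hypothetical true score T
scores = simulate_observed_scores(TRUE_SCORE, error_sd=5.0, n_administrations=10_000)

# Errors are recovered as E = X - T; their mean should be close to 0,
# so the mean observed score should be close to the true score.
errors = [x - TRUE_SCORE for x in scores]
mean_observed = statistics.fmean(scores)
mean_error = statistics.fmean(errors)

print(f"mean observed score: {mean_observed:.2f}")
print(f"mean error: {mean_error:.2f}")
```

With many simulated administrations the errors roughly cancel, which is why the mean observed score approaches the true score.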
Concept – The basic unit of thought; a non-verbal mental representation of a category (e.g., types of sentences; see 18-1).
Criterion referenced tests – A sample of a clearly defined domain. The advantage is that results can be interpreted in terms of what the student can do rather than by comparison with other people (see 1-10). The domain is narrow (one or two categories at most), e.g., a spelling test with 84 questions all dealing with the rule of doubling or not. The Amplified Objective Approach can be useful (3-14). CR tests are concerned with what you know (or at least a representative percentage of the domain).
Differential Item Functioning (DIF) - A test item functions differentially when equally-able examinees from different groups do not have equal probabilities of answering the item correctly.
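One informal way to look for differential item functioning is to group examinees into ability strata and compare each group's proportion correct on the item within each stratum. The sketch below does only that descriptive comparison; it is not a formal DIF statistic, and the group labels, strata, and response data are hypothetical.

```python
from collections import defaultdict

def proportion_correct_by_stratum(records):
    """Compute the proportion correct per group within each ability stratum.

    records: iterable of (group, ability_stratum, item_correct) tuples,
    where item_correct is 1 (correct) or 0 (incorrect).
    Returns {stratum: {group: proportion_correct}}.
    """
    counts = defaultdict(lambda: defaultdict(lambda: [0, 0]))  # [correct, total]
    for group, stratum, correct in records:
        cell = counts[stratum][group]
        cell[0] += int(correct)
        cell[1] += 1
    return {
        stratum: {group: c / n for group, (c, n) in groups.items()}
        for stratum, groups in counts.items()
    }

# Hypothetical responses to a single item from two groups, A and B,
# already matched on ability ("low" or "high").
records = [
    ("A", "low", 1), ("A", "low", 0), ("B", "low", 0), ("B", "low", 0),
    ("A", "high", 1), ("A", "high", 1), ("B", "high", 1), ("B", "high", 0),
]

rates = proportion_correct_by_stratum(records)

# A gap that persists among equally able examinees suggests the item
# may function differentially and deserves closer review.
gaps = {stratum: groups["A"] - groups["B"] for stratum, groups in rates.items()}
print(gaps)
```

Matching on ability first matters: a raw difference between groups could simply reflect a difference in ability, which is not what the definition describes.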
Domain - A domain consists of any clearly-defined, delimited set of homogeneous tasks that a test user wishes to make inferences about. For example, "distinguishes complete sentences from fragments", or "demonstrates knowledge of rules of punctuation".
Generalizability Theory - (see Classical Test Theory and Item Response Theory)
Item Analysis - a formative evaluation process which permits a test maker to make decisions about how to improve a test based on statistical summaries of the performance characteristics of individual test items.
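Two statistical summaries often used in item analysis are the proportion of examinees answering an item correctly and the difference in that proportion between high and low scorers. The sketch below assumes those common definitions (with the upper and lower groups taken as the top and bottom 27% by total score); they are assumptions for illustration and may differ from the definitions on the pages this glossary references. The function names and data are hypothetical.

```python
def item_difficulty(item_responses):
    """Proportion of examinees who answered the item correctly (1 = correct)."""
    return sum(item_responses) / len(item_responses)

def item_discrimination(item_responses, total_scores, fraction=0.27):
    """Upper-lower discrimination: p(correct | top group) - p(correct | bottom group).

    Groups are the top and bottom `fraction` of examinees ranked by total score.
    """
    n = len(total_scores)
    k = max(1, round(n * fraction))  # size of each comparison group
    order = sorted(range(n), key=lambda i: total_scores[i])  # ascending by total
    lower, upper = order[:k], order[-k:]
    p_upper = sum(item_responses[i] for i in upper) / k
    p_lower = sum(item_responses[i] for i in lower) / k
    return p_upper - p_lower

# Hypothetical data: one item's responses and each examinee's total test score.
item = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0]
totals = [10, 9, 8, 7, 6, 5, 4, 3, 2, 1]

print(item_difficulty(item))              # 0.6
print(item_discrimination(item, totals))  # 1.0
```

In this made-up example every high scorer answers correctly and every low scorer does not, so the item separates the two groups as sharply as it can.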
Item Banks - 8-17
Item Discrimination Index - 8-7
Item difficulty index - 8-3
Knowledge – Derived from information through thinking. Knowledge is idiosyncratic (peculiar to each person). Knowledge is a private possession and must be constructed by each person for themselves.
Maximum performance – What a person can do, e.g., how you drive on a driver's test. It occurs when someone knows they are being observed. Results from a maximum performance exam should not be interpreted as measuring typical performance (and results should be qualified).
Norm referenced test – Describes a student’s relative standing in a group (see 1-10). The domain is wider (more categories). A table of specifications (3-6) is useful for norm-referenced tests. The GRE is an example of this: questions that discriminate between two students are kept and ones that don’t are thrown out.
Principles – made up of two or more concepts
Reasoning – one form of thinking
Reliability - The degree of consistency in the ratings of the quality of an essay. Inconsistency in ratings produces a lack of reliability.
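One simple way to quantify consistency between two raters is the correlation between their scores for the same essays. The sketch below uses the Pearson correlation for this purpose, which is an assumption made for illustration rather than a method named in this glossary; the ratings are hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of ratings."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)

# Hypothetical scores (1-5 scale) given by two raters to the same five essays.
rater_1 = [4, 3, 5, 2, 4]
rater_2 = [4, 3, 4, 2, 5]

r = pearson_r(rater_1, rater_2)
print(f"inter-rater correlation: {r:.2f}")  # values near 1 indicate consistent ratings
```

The raters disagree on two essays here, so the correlation falls below 1; identical ratings would give exactly consistent results.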
Test - A test is a standardized series of problems or tasks used to elicit comparable responses as a basis for making inferences about the degree to which the examinee(s) possess or lack some attribute or trait.
True Score - The hypothetical average score a person would receive if they took a test an infinite number of times with no learning or forgetting occurring between administrations. (See Classical Test Theory.)
Typical performance – What a person will do, e.g., how you drive normally. To observe typical performance, you have to observe when people don’t know they are being observed, be there on enough occasions that people get used to the observer, or make sure they don’t know what they are being tested on.
Understanding – (see 2-14b)