
 

The following is a list of commonly used derived score types. Each entry gives a definition and a bit of perspective about the score type. Notice that a raw score is not included in this list, because raw scores are simply a frequency count and should not be used to describe performance on an instrument. They cannot be interpreted directly or compared between subtests. The only purpose of a raw score is to serve as the basis for derived scores, so that performance on an instrument can be accurately described.

 

Standard Score...

A standard score indicates the distance of a person's raw score from average, taking into account the variability of scores among examinees of that age or grade. Standard scores are expressed in whole numbers with a mean of 100 and a standard deviation of 15. Standard scores are usually reported with a confidence interval. Confidence intervals are expressed first as a percent (for example, a 90% confidence level) and then as a standard score range. The percent indicates the degree of certainty that the range contains the examinee's true score, that is, the score that reflects his or her ability rather than just the person's performance on a particular test. The score range indicates the standard scores within which that true score is likely to fall.
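The definition above can be sketched as a simple linear conversion. This is a minimal illustration only: the function name and the norm-group mean and standard deviation used here are hypothetical, and published tests typically derive standard scores from norm tables rather than from a single formula.

```python
def standard_score(raw, norm_mean, norm_sd, mean=100, sd=15):
    """Convert a raw score to a standard score (mean 100, SD 15).

    norm_mean and norm_sd describe the raw scores of the examinee's
    age or grade reference group (hypothetical values in this sketch).
    """
    z = (raw - norm_mean) / norm_sd   # distance from average in SD units
    return round(mean + sd * z)       # reported as a whole number


# A raw score one standard deviation above the reference-group mean.
print(standard_score(raw=34, norm_mean=28, norm_sd=6))  # 115
```

A raw score equal to the reference-group mean always converts to 100, whatever the raw-score scale of the subtest, which is what makes standard scores comparable across subtests.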

As an equal-interval measure, a standard score is one of the most common and useful metrics because it can be compared across subtests and across other tests and can also be arithmetically manipulated. One important point needs to be made about standard scores. Often, people assume that if someone receives the same standard score from year to year, that individual is demonstrating no growth. This is not accurate. Because the reference point for a standard score is the individual's own age group at the time of testing, which changes and grows in skill from year to year, the exact same standard score from one September to the next indicates one full year of growth.

 

Percentiles or Percentile Rank...

A percentile indicates the percentage of people in the reference group who performed at or below the examinee's score. Despite its popularity, this score type is easily confused (for example, with the percentage of items answered correctly) and unfortunately is widely misused. Percentiles are an ordinal or rank-order scale of measurement, rather than an equal-interval scale: the distance between two adjacent percentiles is not the same everywhere on the scale. That means one cannot subtract or average percentile scores in order to represent growth or change.
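A short sketch may help make both points concrete. The first function applies the "at or below" definition to a small, made-up reference group; the second assumes scores follow a normal curve with a mean of 100 and a standard deviation of 15 and shows that equal steps in standard score do not produce equal steps in percentile. All names and sample values are illustrative assumptions.

```python
from statistics import NormalDist


def percentile_rank(score, reference_scores):
    """Percentage of the reference group scoring at or below `score`."""
    at_or_below = sum(1 for s in reference_scores if s <= score)
    return 100 * at_or_below / len(reference_scores)


# Hypothetical reference group of ten raw scores (illustrative only).
reference = [12, 15, 15, 18, 20, 21, 23, 25, 27, 30]
print(percentile_rank(21, reference))  # 6 of 10 at or below -> 60.0

# Assumption: standard scores are normally distributed (mean 100, SD 15).
NORMS = NormalDist(mu=100, sigma=15)


def percentile_from_standard_score(ss):
    """Percentile implied by a standard score under the normal-curve assumption."""
    return 100 * NORMS.cdf(ss)


# Equal 15-point steps in standard score give unequal percentile steps.
for ss in (85, 100, 115, 130):
    print(ss, round(percentile_from_standard_score(ss)))
```

Under these assumptions the percentiles come out near 16, 50, 84, and 98, so the same 15-point gain is worth about 34 percentile points near the middle of the curve but only about 14 near the top.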

 

Stanine...

The term stanine is a contraction of "standard nines." Stanines provide a single-digit scoring metric with a range from 1 to 9, a mean of 5, and a standard deviation of 2. Each stanine score represents a specific range of percentile scores in the normal curve. Stanines are useful when a researcher is interested in providing a "band" interpretation rather than a single score cutoff. Stanines 1 and 2 represent the bottom 11 percent of the reference group's performance distribution, indicating a need in the tested skill area. Stanines 8 and 9 indicate a performance within the top 11 percent and a strength in a skill area. Stanines 4, 5, and 6 represent the average range.
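The banding can be sketched as a lookup from percentile to stanine. The cut points below are the commonly cited cumulative percentages (4, 11, 23, 40, 60, 77, 89, 96); how a percentile that falls exactly on a cut point is assigned is an assumption here and may differ between test publishers.

```python
from bisect import bisect_right

# Cumulative percentage at the upper edge of stanines 1 through 8
# (commonly cited values; exact boundary handling is an assumption).
STANINE_CUTS = [4, 11, 23, 40, 60, 77, 89, 96]


def stanine_from_percentile(percentile):
    """Map a percentile (1-99) to a stanine band (1-9)."""
    return bisect_right(STANINE_CUTS, percentile) + 1


for p in (3, 10, 50, 89, 99):
    print(p, stanine_from_percentile(p))
```

With these cut points, percentiles below 11 land in stanines 1 and 2, and percentiles of 89 and above land in stanines 8 and 9, matching the bottom and top 11 percent described above.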

 

Normal Curve Equivalent...

Normal curve equivalents (NCEs) are based on percentiles but are statistically converted to an equal-interval scale of measurement. NCEs range from 1 to 99, with a mean of 50 and a standard deviation of 21.06. NCEs of 1, 50, and 99 correspond to percentiles of 1, 50, and 99. However, other NCE values do not have a direct relationship to percentiles.
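The conversion from percentile to NCE can be sketched as follows, assuming the usual approach of finding the normal-curve z-score for the percentile and rescaling it to a mean of 50 and a standard deviation of 21.06. The function name is illustrative.

```python
from statistics import NormalDist

UNIT_NORMAL = NormalDist()  # mean 0, SD 1


def nce_from_percentile(percentile):
    """Convert a percentile (1-99) to a normal curve equivalent."""
    z = UNIT_NORMAL.inv_cdf(percentile / 100)  # z-score for this percentile
    return 50 + 21.06 * z


# The two scales agree at 1, 50, and 99 but diverge in between.
for p in (1, 25, 50, 75, 99):
    print(p, round(nce_from_percentile(p), 1))
```

Under this sketch a percentile of 25 corresponds to an NCE of roughly 36 and a percentile of 75 to roughly 64, which illustrates why values other than 1, 50, and 99 do not line up between the two scales.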

NCEs are used in many federal and state programs as a method of reporting results for specialized programs, such as Title I. Since they can be arithmetically manipulated, they are particularly helpful for reporting group data.

 

Grade Equivalent...

A grade equivalent (GE) is the grade at which a person's raw score is the median (or at the 50th percentile) score. Grade equivalents are expressed in tenths of a grade (1.2 = the second month of first grade).

Keep in mind that a GE has nothing to do with how the examinee performs against the local school curriculum or standards for a particular grade, nor does it take into account the person's life experiences.  Again, the reference for this score is the standardization sample of the test.  Grade equivalents are also a rank-order scale; they place an examinee on a growth continuum, which may or may not increase at regular intervals.  The same grade equivalent on two different subtests may not mean the same thing.  Therefore, GEs are not the best option for making diagnostic and placement decisions.

 

Test-Age Equivalent or Age Equivalent...

Similar to a grade equivalent, a test-age equivalent represents the age in years and months at which a particular raw score is the median score.  Like GEs, test-age equivalents are a rank-order scale; they place an examinee on a growth continuum, which may or may not increase at regular intervals.  The same test-age equivalent on two different subtests may not mean the same thing.  Thus, test-age equivalents are not the best option for making diagnostic and placement decisions.

 

Summary...

Derived scores that are an equal-interval scale of measurement (standard scores, NCEs) can be arithmetically manipulated (i.e., added, subtracted, multiplied, or divided).  Those that are rank-order scales (percentiles, stanines, grade and test-age equivalents) cannot be used this way.  This is very important to remember when working with individuals or groups of students and their data. For example, a teacher can average a class's standard scores on a particular test, but that teacher may not average the students' percentile ranks.
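The teacher's example can be checked with a small sketch. The three standard scores are made up for illustration, and the percentile for each score is computed under the assumption of a normal curve with a mean of 100 and a standard deviation of 15.

```python
from statistics import NormalDist, mean

# Assumption: standard scores are normally distributed (mean 100, SD 15).
NORMS = NormalDist(mu=100, sigma=15)

# Hypothetical standard scores for three students (illustrative only).
standard_scores = [85, 100, 130]
percentiles = [100 * NORMS.cdf(ss) for ss in standard_scores]

# Valid: average the equal-interval scores, then look up the percentile.
mean_ss = mean(standard_scores)
percentile_of_mean = 100 * NORMS.cdf(mean_ss)

# Not valid: average the rank-order percentiles directly.
mean_of_percentiles = mean(percentiles)

print(mean_ss)                      # 105
print(round(percentile_of_mean))    # about 63
print(round(mean_of_percentiles))   # about 55
```

The two results disagree by several percentile points, which is the practical consequence of averaging a rank-order scale as though it were an equal-interval one.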

The different score types offer different information, but each is also a mathematical scale with its own rules about which operations are valid.