What To Keep In Mind When Using Different Grading Scales

Summary: Marking assessment components with varying weights can create challenges. Instructors are prone to a number of errors, particularly when using different marking scales. Test your accuracy at simple mental mark conversions, and then raise the issue with your colleagues and your institution.

Meet The Challenge Of Using Different Grading Scales: What Is 17/23 Out Of 100?

You have just started marking 100 student research reports worth 30% of the total unit grade. The marking guidelines specified the weight of each of the three criteria but did not clarify whether you should:

  • mark using the possible marks for each criterion, for example 5 points for presentation, 20 points for the content and 5 points for references, or
  • mark each criterion using the standard 100 point scale, and then apply the relative weights.

Does it matter which option you choose? It certainly does! So, what percentage did you have in mind when you awarded 17/23 for that section?

1. Not Convinced? Take This Test

You (we'll call you Marker A in this scenario) have chosen option 1, that is, to mark based on the criteria and the marking scales provided by the Unit Convenor - see columns A and B in Table 1 below. Hence, you marked the Abstract 4 out of a possible 5 points, the Introduction 16 out of a possible 20 points, and so on.

You have completed your marking. We now invite you to estimate the approximate percentage mark for the Abstract. The catch: you need to do this in your head, without using a calculator. OK? Great!

You probably thought that was easy. In the case of the Abstract, it is indeed easy, since 4 out of 5 is 80%. Correct. Enter 80% in column D. Perfect, thank you. Except that we haven't finished yet. Please continue with the task and do the same for each of the other sections: the Introduction, the Method, the Results, and so on.

When you are done, perhaps you would care to consider the following issues.

a. Accuracy

Review your mark for the Method section. If you followed the instructions, you calculated the percentage mark in column D in your head, without using a calculator, and did the same for all sections. Now, to examine the accuracy of your mental arithmetic, redo the calculations with an actual calculator and check each mark against your mental estimate. Was that the percentage and the grade you had in mind when you awarded an 11/15 for the Method?

What percentage did you have in mind when you awarded an 18/25 for the Discussion section?

The point is that our mental calculations of grades are inaccurate when using uncommon marking scales!
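The conversions in this exercise are easy to check programmatically. The short sketch below uses the raw marks quoted in this section (4/5, 17/23, 11/15, 18/25); the `to_percentage` helper is ours, introduced only for illustration:

```python
# Convert raw criterion marks to the standard 0-100 scale.
def to_percentage(mark, max_mark):
    """Return the mark as a percentage of the maximum, rounded to one decimal."""
    return round(100 * mark / max_mark, 1)

for mark, max_mark in [(4, 5), (17, 23), (11, 15), (18, 25)]:
    print(f"{mark}/{max_mark} = {to_percentage(mark, max_mark)}%")
# 4/5 = 80.0%, 17/23 = 73.9%, 11/15 = 73.3%, 18/25 = 72.0%
```

Note how close 17/23, 11/15 and 18/25 actually are to each other - a closeness that mental arithmetic on three different scales is unlikely to reveal.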

b. Lack Of Variability

When the 0-10 scale is used, as in the Results section, for instance, variability is significantly reduced. Markers rarely use decimals on a 10-mark scale, so the possible percentage marks are multiples of 10, for example 40-50-60-70-80-90, which eliminates possible in-between marks such as 63, 77, etc. Review your mark for the "In-text citations" criterion. You awarded a 6 out of 8. On a 0-100 point scale, this is the equivalent of any mark from 69, which is a credit, to 81, which is a high distinction. What mark did you have in mind when you decided on the 6?
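The loss of variability is easy to see by listing every percentage a whole-number mark on a coarse scale can produce. A small sketch (the `reachable_percentages` name is ours; the 0-8 scale matches the "In-text citations" example):

```python
# Every percentage reachable with whole-number marks on a coarse scale.
def reachable_percentages(max_mark):
    return [round(100 * m / max_mark, 1) for m in range(max_mark + 1)]

print(reachable_percentages(8))
# [0.0, 12.5, 25.0, 37.5, 50.0, 62.5, 75.0, 87.5, 100.0] - jumps of 12.5 points
print(reachable_percentages(10))
# multiples of 10 only
```

On the 0-8 scale there is simply no mark between 62.5% and 75%, so an entire grade band can be unreachable.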

2. A Reality Test

An inter-rater reliability exercise we conducted among tutors of large online first-level units indicated that their mean marks differed substantially - by up to 20 points! - even though their students were studying the same subject, using the same text and completing identical assessments. Students had been randomly assigned to classes, and such large differences could not be explained by student ability or other performance issues, since tutors did not do any actual lecturing. Hence, it appears that students' marks depended more on who the marker was than on their actual performance. A closer examination indicated that, among other factors that affect marking, such as leniency, different tutors were applying different methods to assess the same papers. Although we had provided clear marking criteria, we had not clarified whether tutors should mark using a 0-30 or a 0-100 point marking scale for an assessment worth 30 points.

3. Marking Options: A Closer Look

Indeed, a number of marking options are available for any assignment. In the examples below, we'll continue using 30% as the total weight of the assessment in our discussion.

a. Option 1

The instructor can mark the paper as a whole. If a 0-30 marking scale is used, marks may then have to be converted to reflect the standard grades that the university uses, for example the 100-point scale or NP, P, C, D, HD. As we will illustrate, using non-standardised scales creates inter-rater reliability challenges. For example, the pass mark out of 30 possible points is 15 - that was not difficult to calculate - but what is a credit, a distinction, or a high distinction on a 30-point scale? An instructor may, for example, assign a 19/30 with only a vague idea of how much that is out of 100. We urge you to do the calculation in your head, without a calculator, and estimate what that number is out of 100. Is it a pass, credit, or distinction? High or low? What about a 27/40 on a 40-point scale?
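The grade-boundary question can be made concrete by projecting the 0-100 cut-offs onto the non-standard scale. The sketch below assumes a common Australian banding (P from 50, C from 65, D from 75, HD from 85) - substitute your institution's actual boundaries:

```python
# Grade cut-offs on the 0-100 scale (a common Australian banding - an
# assumption for illustration; use your institution's actual boundaries).
BANDS = {"P": 50, "C": 65, "D": 75, "HD": 85}

def cutoffs_for_scale(max_mark):
    """Translate 0-100 grade boundaries onto a 0-max_mark scale."""
    return {grade: max_mark * cut / 100 for grade, cut in BANDS.items()}

print(cutoffs_for_scale(30))
# {'P': 15.0, 'C': 19.5, 'D': 22.5, 'HD': 25.5}
print(round(100 * 19 / 30, 1))
# 63.3 - so the 19/30 in the text sits just below a credit
```

Few markers would guess, mentally, that a credit on a 30-point scale begins at 19.5 rather than at a round number.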

However, marking the paper as a whole is not the recommended option. Rather, instructors are urged to set assessment criteria that address the learning objectives and to break the assessment down into smaller assessed sections.

b. Option 2

Assume a rubric with three criteria is used to assess separate components of the paper: for example, the content can receive 70 points, the presentation 15 points and the referencing another 15 points. Once again, if the instructor marks each part using non-standard scales, such as 0-70 for content, the same inter-rater reliability challenges arise.

c. Option 3

For these reasons, we suggest using the standard 0-100 marking scale for each individual section of the paper. That is, marking the content, the presentation and the referencing out of 100. Once done, the instructor can enter these marks into a custom-made Excel sheet, which will calculate final marks based on the weight of each section. The same process should be used at the end of the unit. Here is a working example. Assume a unit's assessment is made up of three parts, a multiple choice quiz (A1), an essay (A2), and a final exam (A3), worth 10, 30 and 60%, respectively. In this case, the formula will look like this: (A1*0.1) + (A2*0.3) + (A3*0.6). If for an imaginary student we substitute 67% for the MCQ, 75% for the essay and 63% for the exam, then the formula results in 67*0.1 + 75*0.3 + 63*0.6 = 67.
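The worked example above can be reproduced directly in a few lines (the weights and marks are those in the text; the variable names are ours):

```python
# Weighted final mark: each component is marked out of 100, then weighted.
weights = {"A1": 0.1, "A2": 0.3, "A3": 0.6}   # quiz, essay, exam
marks   = {"A1": 67,  "A2": 75,  "A3": 63}

final = sum(weights[a] * marks[a] for a in weights)
print(round(final, 1))  # 67.0
```

Because every component is already on the 0-100 scale, the weighted sum is itself a 0-100 mark and needs no further conversion.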

4. Recommendations

On top of the existing sources of inter-rater reliability error, using different or non-standardised scales creates serious additional challenges. We strongly recommend that you:

a. Develop All Marking Guides Using A Standardised System.

You could, for example, use a 0-100 scale.

b. Instruct Tutors To Use The Same System.

Using an Excel sheet, tutors can then convert marks to any scale they want.
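The conversion itself is a one-liner, whether done in a spreadsheet or in code. A minimal sketch, assuming all marks are entered out of 100 (the `rescale` name is ours):

```python
# Rescale a 0-100 mark to any target scale (e.g. an assessment worth 30 points).
def rescale(mark_out_of_100, target_max):
    return round(mark_out_of_100 * target_max / 100, 1)

print(rescale(80, 30))  # 24.0 - an 80% mark on a 30-point assessment
```

The key point is the direction of the workflow: mark once on the common 0-100 scale, and derive every other scale from it mechanically.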

Finally, we see no reason why marking scales should differ among departments or, for that matter, among higher education institutions (HEIs) in the same or different countries.

An earlier version of this paper was presented at the 26th Annual Teaching and Learning Forum, Curtin University, 2-3 February 2017.