Analysis of Instructor Evaluation Forms

performed by

Dr. Daniel J. Ghezzi

January 26, 2004

In this analysis, I examined all of the evaluations for the Fall 2001 and Spring 2003 semesters.

 

Recall that the evaluation forms have 10 questions regarding the student's perception of the instructor. For each question, the student gives a rating of 1, 2, 3, 4, 5, or 0.

 

A "5" is the best rating and a "1" is the worst.

A "0" represents a "don't know" response and is treated as a blank (missing value).

-----------------------------------------------------------------------------------------------------------

For Fall 2001, there were 6014 total evaluations performed on 173 distinct instructors.

 

A table of responses follows.

 

Responses:

FALL 2001

Question                             Score                                  Total
Number        0      1      2      3      4      5      6      *          Eval's

    1        23     39     95    204   2135   3517      1      0            6014
    2        10     30     60    185   1698   4031      0      0            6014
    3        14     19     63    167   1295   4455      1      0            6014
    4        65     32     72    291   1920   3631      3      0            6014
    5        18     32    102    294   1547   4018      2      1            6014
    6       774     53     80    529   1385   3191      1      1            6014
    7        28     34     60    306   1445   4140      1      0            6014
    8        18     54    125    320   1546   3950      1      0            6014
    9        20     98    169    310   1532   3884      1      0            6014
   10        22    142    189    369   1651   3640      1      0            6014

Total       992    533   1015   2975  16154  38457     12      2           60140
          1.65%  0.89%  1.69%  4.95% 26.86% 63.95%  0.02%  0.00%

TABLE #1

 

 

Recall that question #6 concerns the instructor's availability for office hours.

I am not sure how "6"s wound up in the data.
However, they are few in number, so I treated them as blanks.
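For reference, the count and percentage tables can be tabulated along the following lines. This is a minimal sketch, assuming the raw responses are loaded into a pandas DataFrame with one row per evaluation and columns q1 through q10; the file name and column names are my assumptions for illustration, not the actual dataset layout.

    import pandas as pd

    # One row per evaluation; q1..q10 hold the recorded responses
    # ('0'-'6' or '*'), read as strings so the stray values survive.
    evals = pd.read_csv("fall2001.csv", dtype=str)   # hypothetical file name

    questions = [f"q{i}" for i in range(1, 11)]

    # Table #1: count each response value per question.
    counts = pd.DataFrame({q: evals[q].value_counts() for q in questions}).T
    counts = counts.fillna(0)

    # Table #2: convert each question's counts to row percentages.
    percents = counts.div(counts.sum(axis=1), axis=0) * 100
    print(percents.round(2))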

 

A table of percentages for the Fall 2001 responses follows:

 

FALL 2001

Question                             Score
Number        0      1      2      3      4      5      6      *

    1     0.38%  0.65%  1.58%  3.39% 35.50% 58.48%  0.02%  0.00%
    2     0.17%  0.50%  1.00%  3.08% 28.23% 67.03%  0.00%  0.00%
    3     0.23%  0.32%  1.05%  2.78% 21.53% 74.08%  0.02%  0.00%
    4     1.08%  0.53%  1.20%  4.84% 31.93% 60.38%  0.05%  0.00%
    5     0.30%  0.53%  1.70%  4.89% 25.72% 66.81%  0.03%  0.02%
    6    12.87%  0.88%  1.33%  8.80% 23.03% 53.06%  0.02%  0.02%
    7     0.47%  0.57%  1.00%  5.09% 24.03% 68.84%  0.02%  0.00%
    8     0.30%  0.90%  2.08%  5.32% 25.71% 65.68%  0.02%  0.00%
    9     0.33%  1.63%  2.81%  5.15% 25.47% 64.58%  0.02%  0.00%
   10     0.37%  2.36%  3.14%  6.14% 27.45% 60.53%  0.02%  0.00%

TABLE #2

 

 

For Spring 2003, there were 5883 total evaluations performed on 166 distinct instructors.

 

A table of responses follows.

 

Responses:

SPRING 2003

Question                             Score                                  Total
Number        0      1      2      3      4      5      6      *          Eval's

    1        36     33     88    202   1871   3653      0      0            5883
    2        16     34     65    179   1570   4019      0      0            5883
    3        29     28     42    172   1167   4445      0      0            5883
    4        66     52    108    292   1662   3703      0      0            5883
    5        29     70     99    280   1443   3962      0      0            5883
    6       749     40     57    416   1307   3313      0      0            5882
    7        32     49     86    263   1361   4092      0      0            5883
    8        29     81    138    311   1411   3913      0      0            5883
    9        21    125    161    351   1400   3825      0      0            5883
   10        21    156    211    365   1485   3645      0      0            5883

Total      1028    668   1055   2831  14677  38570      0      0           58829
          1.75%  1.14%  1.79%  4.81% 24.95% 65.56%  0.00%  0.00%

TABLE #3

 

 


A table of percentages for the Spring 2003 responses follows:

 

SPRING 2003

Question                             Score
Number        0      1      2      3      4      5      6      *

    1     0.61%  0.56%  1.50%  3.43% 31.80% 62.09%  0.00%  0.00%
    2     0.27%  0.58%  1.10%  3.04% 26.69% 68.32%  0.00%  0.00%
    3     0.49%  0.48%  0.71%  2.92% 19.84% 75.56%  0.00%  0.00%
    4     1.12%  0.88%  1.84%  4.96% 28.25% 62.94%  0.00%  0.00%
    5     0.49%  1.19%  1.68%  4.76% 24.53% 67.35%  0.00%  0.00%
    6    12.73%  0.68%  0.97%  7.07% 22.22% 56.32%  0.00%  0.00%
    7     0.54%  0.83%  1.46%  4.47% 23.13% 69.56%  0.00%  0.00%
    8     0.49%  1.38%  2.35%  5.29% 23.98% 66.51%  0.00%  0.00%
    9     0.36%  2.12%  2.74%  5.97% 23.80% 65.02%  0.00%  0.00%
   10     0.36%  2.65%  3.59%  6.20% 25.24% 61.96%  0.00%  0.00%

TABLE #4

 

 

A table of the changes in percentages between the two semesters follows:

 

COMPARE FALL 2001 TO SPRING 2003

Spring-03 minus Fall-01

Question                             Score
Number        0      1      2      3      4      5      6      *

    1     0.23% -0.09% -0.08%  0.04% -3.70%  3.61% -0.02%  0.00%
    2     0.11%  0.08%  0.11% -0.03% -1.55%  1.29%  0.00%  0.00%
    3     0.26%  0.16% -0.33%  0.15% -1.70%  1.48% -0.02%  0.00%
    4     0.04%  0.35%  0.64%  0.12% -3.67%  2.57% -0.05%  0.00%
    5     0.19%  0.66% -0.01% -0.13% -1.20%  0.54% -0.03% -0.02%
    6    -0.14% -0.20% -0.36% -1.72% -0.81%  3.26% -0.02% -0.02%
    7     0.08%  0.27%  0.46% -0.62% -0.89%  0.72% -0.02%  0.00%
    8     0.19%  0.48%  0.27% -0.03% -1.72%  0.83% -0.02%  0.00%
    9     0.02%  0.50% -0.07%  0.81% -1.68%  0.44% -0.02%  0.00%
   10    -0.01%  0.29%  0.44%  0.07% -2.21%  1.43% -0.02%  0.00%

Overall   0.10%  0.25%  0.10% -0.14% -1.91%  1.61% -0.02%  0.00%

TABLE #5

 

For example, the -3.70% in Question #1, response 4, was obtained as follows:
In Spring 2003, 31.80% of the responses to Question #1 were 4's,
and in Fall 2001, 35.50% of the responses to Question #1 were 4's.
31.80% minus 35.50% gives the -3.70%.
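Expressed in code, and continuing with the hypothetical frames from the earlier sketch (one percents table built per semester), Table #5 is just an elementwise difference:

    # percents_fall and percents_spring built as in the earlier sketch,
    # one from each semester's file.
    diff = (percents_spring - percents_fall).round(2)
    # e.g. diff.loc["q1", "4"] would give the -3.70 worked out above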

 

 

Generally, students in Spring 2003 tended to give more 5's and fewer 4's compared to Fall 2001.

 


We can further compare Fall 2001 to Spring 2003 by looking at the mean and quartiles of the instructors' overall average ratings.

 

 

                          FALL 2001     SPRING 2003

Number of Instructors        173            166
Average                     4.55           4.57
Minimum                     2.82           2.22
25th percentile             4.43           4.42
Median                      4.61           4.67
75th percentile             4.78           4.83
Maximum                     5.00           5.00
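These summary statistics can be reproduced roughly as follows, assuming a hypothetical instructor column identifying the instructor on each evaluation; 0, 6, and '*' are recoded as missing before averaging, matching the treatment described earlier.

    # Keep only the valid ratings 1-5; 0, 6, and '*' become missing.
    scores = evals[questions].apply(pd.to_numeric, errors="coerce")
    scores = scores.where((scores >= 1) & (scores <= 5))

    # Mean over the ten questions for each evaluation, then the average
    # of those means within each instructor.
    eval_mean = scores.mean(axis=1)
    instructor_avg = eval_mean.groupby(evals["instructor"]).mean()

    print(instructor_avg.describe())   # count, mean, min, quartiles, max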

 

 

 

In Fall 2001, 2663 (44.28%) of the 6014 evaluations had no variance. That is, the student gave the instructor the same rating on all 10 questions.

 

Of these:    88.43% recorded all 5's
             11.12% recorded all 4's
              0.38% recorded all 3's
              0.00% recorded all 2's
              0.08% recorded all 1's
              0.00% recorded all *'s

 

In Spring 2003, 2919 (49.62%) of the 5883 evaluations had no variance.

 

Of these:    87.84% recorded all 5's
             11.17% recorded all 4's
              0.41% recorded all 3's
              0.10% recorded all 2's
              0.34% recorded all 1's
              0.14% recorded all *'s
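A sketch of the no-variance check on the same hypothetical evals frame, comparing the raw recorded responses (including '*') across the ten questions:

    # An evaluation has "no variance" when all ten responses are identical.
    raw = evals[questions]
    no_var = raw.nunique(axis=1) == 1

    print(no_var.sum(), f"({no_var.mean():.2%})")

    # Breakdown of the constant value: all 5's, all 4's, ..., all *'s.
    print(raw.loc[no_var, "q1"].value_counts(normalize=True) * 100)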

 

 


Analysis of the three distinct categories "Professionalism", "Student Centeredness", and "Effectiveness".

 

FALL 2001

For the 4 "Professionalism" items (questions 1-4), 58.86% of the evaluations had no variation in the responses,
and an additional 21.83% of the evaluations had 3 alike responses with a 4th response differing by 1.

For example: 2 2 2 3 or 3 3 2 3, etc.

These two percentages total 80.69%.

 

For the 4 "Student Centeredness" items (questions 5-8), 58.26% of the evaluations had no variation in the responses,
and an additional 18.31% of the evaluations had 3 alike responses with a 4th response differing by 1.

For example: same as above.

These two percentages total 76.57%.

 

For the 2 "Effectiveness" items (questions 9-10), 84.27% of the evaluations had no variation in the responses,
and an additional 14.71% of the evaluations had the two responses differing by 1.

These two percentages total 98.98%.

 

Clearly, the two questions in the "Effectiveness" category are redundant.

 

SPRING 2003

For the 4 "Professionalism" items, 62.38% of the evaluations had no variation in the responses,
and an additional 20.43% of the evaluations had 3 alike responses with a 4th response differing by 1.
For example: 2 2 2 3 or 3 3 2 3, etc.
These two percentages total 82.81%.

 

For the 4 "Student Centeredness" items, 62.45% of the evaluations had no variation in the responses,
and an additional 15.49% of the evaluations had 3 alike responses with a 4th response differing by 1.
For example: same as above.
These two percentages total 77.94%.

 

For the 2 "Effectiveness" items, 85.23% of the evaluations had no variation in the responses,
and an additional 13.27% of the evaluations had the two responses differing by 1.
These two percentages total 98.50%.
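The within-category agreement figures above can be reproduced with a pattern check like the following. This is a sketch on the hypothetical scores frame from the earlier sketches; dropping evaluations with a blank inside the category is my assumption, since the report does not spell out how those were handled.

    # Classify a 4-item category by within-category agreement:
    # all four responses alike, or three alike with the fourth off by one.
    def agreement_pattern(row):
        vals = sorted(row)
        if vals[0] == vals[3]:
            return "no variation"
        for i in range(4):                     # try leaving each item out
            rest = vals[:i] + vals[i + 1:]
            if rest[0] == rest[2] and abs(vals[i] - rest[0]) == 1:
                return "3 alike, 4th off by 1"
        return "other"

    # "Professionalism" items; rows with any blank dropped (an assumption).
    prof = scores[["q1", "q2", "q3", "q4"]].dropna()
    print(prof.apply(agreement_pattern, axis=1).value_counts(normalize=True) * 100)

For the two-item "Effectiveness" category, the check reduces to equality or an absolute difference of 1.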

 

IS THE LACK OF VARIATION INSIDE THE THREE DISTINCT CATEGORIES THE RESULT OF THE EVALUATION FORM GROUPING QUESTIONS 1-4, 5-8, AND 9-10 INTO SINGLE CATEGORIES?

 

 

CORRELATIONS BETWEEN QUESTIONS:

 

 

FALL 2001

 

The correlations between the individual questions and the student's evaluation mean range from 0.722 (Question #6) to 0.903 (Question #9).

 

 

SPRING 2003

 

The correlations between the individual questions and the student's evaluation mean range from 0.740 (Question #6) to 0.911 (Question #9).

 

Question #9 is a very good predictor (explainer) of the student's evaluation mean.
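These correlations can be computed directly. A minimal sketch on the hypothetical scores frame; I correlate each question with the mean over all ten questions, though whether the reported figures instead used the mean of the remaining nine is not stated:

    # Pearson correlation of each question with the per-evaluation mean.
    corr = scores.corrwith(scores.mean(axis=1))
    print(corr.sort_values())   # lowest (Question #6) to highest (Question #9)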

 

 


Questions to consider

 

1.  What is the response rate?

2.  How can we increase the response rate?

3.  How can we better control the process (i.e., reduce variation)?

4.  Should "inverted" response question(s) be added?
    That is, a question in which a "5" is the worst rating and a "1" is the best.

5.  How should evaluations with no variation be treated?

6.  Should we eliminate the words "Professionalism", "Student Centeredness", and
    "Effectiveness"?

7.  In the instructor's overall average evaluation, do we want "Professionalism"
    (questions 1-4) and "Student Centeredness" (questions 5-8) weighted 40% each
    while "Effectiveness" (questions 9-10) gets weighted 20%?
    (A sketch of this weighting follows the list.)

8.  What categorical data is worth collecting for statistical analysis?

    Note that currently "Student's Major", "Student's Class Year", "Student's
    Expected Grade", and "Student's Gender" are collected but do not get
    downloaded into the dataset and are therefore of no analytical use.
    However, individual instructors do benefit from this collected data.

9.  Do we want more room for open-ended comments?

10. How many questions do we want?
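Regarding question 7, here is a minimal sketch of the 40/40/20 category weighting on the hypothetical scores frame. Note that when all ten questions are answered, a plain mean over the questions already weights the categories 40%/40%/20%, since each question carries 10%; an explicit category weighting only changes the result when some responses are blank.

    # Category means, then an explicitly weighted overall evaluation score.
    prof_mean = scores[["q1", "q2", "q3", "q4"]].mean(axis=1)
    cent_mean = scores[["q5", "q6", "q7", "q8"]].mean(axis=1)
    eff_mean  = scores[["q9", "q10"]].mean(axis=1)

    weighted_overall = 0.4 * prof_mean + 0.4 * cent_mean + 0.2 * eff_mean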


REMARKS:

 

It is clear that one question inside each category is sufficient. I would suggest that we reduce the total number of questions to 3, 4, or 5. I would also suggest that room be provided after each question for the student to explain or comment on their response. Although these comments would not be downloaded for statistical analysis, they would be of value to the individual instructors. In addition, a form of this design would send a message to the students that we are serious about wanting their input.

 

Additionally, we should collect the student's gender and the student's class year so that this data gets downloaded and can be analyzed.

 

I believe that a survey with fewer questions (as suggested) would generate more thoughtful responses and might therefore result in a lower percentage of evaluations that have no variation.

 

I would not recommend that we throw out evaluations that have no variation. The student may simply have decided that the instructor is, say, a "4", and given all "4"s without reading the questions. This does not invalidate their belief that the instructor rates as a "4".

 

Since the purpose of an "inverted" question is to identify those evaluations for which the student did not read the questions, and since not reading the questions does not automatically invalidate the evaluation, I see no reason to confuse the evaluations by inserting an "inverted" question.

 

Finally, we can better control the process (i.e., reduce variation) by implementing any or all of the following:

 

*  Eliminate verbal communication between instructor and students
   immediately before and during the evaluation time.

*  Limit the time frame for evaluations.
   (All evaluations given during the same week, for instance.)

*  Have all instructors remain (in / out of) the classroom during the
   evaluation period.

*  Have all instructors give evaluations at the (beginning / end) of the
   class period.