Professors’ Beauty, Ability, and Teaching Evaluations in Italy

Michela Ponzo
  • Corresponding author
  • Department of Economics and Statistics, University of Naples Federico II, Via Cintia Monte S. Angelo, I-80126, Napoli, Italy
Vincenzo Scoppa
  • Department of Economics, Statistics and Finance, University of Calabria, Via Ponte Bucci, 87036 Arcavacata di Rende (CS), Italy
Published Online: 2013-08-21 | DOI: https://doi.org/10.1515/bejeap-2012-0041


Using data from an Italian university, we relate student evaluations of teaching quality to the physical attractiveness of instructors (as evaluated by external raters using photos), controlling for a number of teacher and course characteristics. We first show that teachers’ beauty significantly affects evaluations of their teaching. We carry out a number of checks to tackle threats to internal validity: course fixed effects and individual research productivity are controlled for; an IV estimation strategy is undertaken using a second measure of beauty as an instrument; and measures of grooming and fastidiousness are introduced. Notwithstanding these controls, we find that more attractive teachers receive much better evaluations.

Keywords: beauty; discrimination; teaching quality; subjective evaluations


Further evidence finds associations between wages and height (Persico, Postlewaite, and Silverman 2004) and between wages and obesity (Cawley 2004).

Physical attractiveness also appears to be a relevant factor in explaining the success of politicians, lawyers, economists, and prostitutes. See the recent book by Hamermesh (2011) for a detailed account (and related references) of the evidence on the impact of beauty in a number of economic and social contexts.

Teaching evaluations, in turn, tend to have an effect on instructors’ wages, since university administrators often take these evaluations into account in setting salaries (Becker and Watts 1999).

On the difficulties of implementing these incentive mechanisms effectively, see the detailed account in Perotti (2002).

Alternatively, Teaching Evaluations can be seen as the average score (multiplied by 100), if 1 is assigned to each student’s positive rating and 0 to each student’s negative rating.
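
The equivalence described in this note can be illustrated with a minimal sketch: averaging 0/1 scores and multiplying by 100 gives the same number as the percentage of positive ratings. The function names and the example counts are hypothetical and are not taken from the paper's data.

```python
def percent_positive(n_positive: int, n_negative: int) -> float:
    """Share of positive ratings, on a 0-100 scale."""
    total = n_positive + n_negative
    if total == 0:
        raise ValueError("at least one rating is required")
    return 100.0 * n_positive / total


def average_binary_score(scores: list[int]) -> float:
    """Average of 0/1 scores (1 = positive, 0 = negative), multiplied by 100."""
    if not scores:
        raise ValueError("at least one rating is required")
    return 100.0 * sum(scores) / len(scores)


# Hypothetical course: 30 positive and 10 negative ratings.
ratings = [1] * 30 + [0] * 10
share = percent_positive(30, 10)
average = average_binary_score(ratings)
```

Both calls describe the same course, so `share` and `average` coincide.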

The University Statistical Office did not publicly provide the number of students expressing each judgment. However, we asked the Statistical Office to assign a score of 1 to “very negative”, 2 to “negative”, 3 to “positive”, and 4 to “very positive” (a similar scale is used in Hamermesh and Parker 2005) and to calculate the mean evaluation of each course. They agreed to do this for the academic year 2009–2010. We also asked them to calculate the correlation coefficient between Mean Evaluations and the variable we use, Teaching Evaluations. The correlation coefficient is 0.92 (p-value = 0.000). In a regression of Teaching Evaluations on Mean Evaluations, the coefficient is 37.65 and the t-statistic is 47.41 (R-squared = 0.86). Therefore, we are confident that the measure we use, although imperfect, is a good representation of students’ evaluations.
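
As a rough illustration of the four-point scoring described in this note, the sketch below computes a course's mean evaluation from the number of students giving each judgment. The counts are invented for the example (the actual counts were not released), and `mean_evaluation` is a hypothetical helper, not the Statistical Office's procedure.

```python
# Scores assigned to each judgment, as described in the note.
SCORES = {
    "very negative": 1,
    "negative": 2,
    "positive": 3,
    "very positive": 4,
}


def mean_evaluation(counts: dict[str, int]) -> float:
    """Weighted mean of the 1-4 scores, using student counts as weights."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("at least one rating is required")
    weighted_sum = sum(SCORES[label] * n for label, n in counts.items())
    return weighted_sum / total


# Hypothetical course with ten respondents.
example_counts = {
    "very negative": 1,
    "negative": 1,
    "positive": 4,
    "very positive": 4,
}
example_mean = mean_evaluation(example_counts)
```

In this invented course 8 of 10 students give a positive judgment, so the binary measure would be 80, while the four-point mean also reflects how strongly students feel.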

The rating forms also include questions on whether the instructor begins classes on time and is available during office hours, whether students possess adequate knowledge to understand the subject, whether the rooms are satisfactory, whether the study load in the period was tolerable, and so on.

We asked these students whether they knew the instructors. Some of them declared that they knew four instructors (because they are or have been Dean or Department Chair). As a robustness check, we re-estimate the model excluding these four instructors and obtain almost identical results.

Hamermesh (2011) argues that people are not able to disentangle the effect of age from beauty, even if they are asked to do so.

As a robustness check, in Section 4 we use the original evaluations of each rater directly and control for rater fixed effects.

Nonetheless, age is missing for three instructors.

On average, each teacher in our sample taught 12 courses.

We find a strong negative correlation between Age and Beauty (t-stat = –5.36).

When regressing Age on the dummies for academic positions, it emerges that all coefficients are highly statistically significant.

Estimating a model on the whole sample including an interaction term between Second Level Degree and Beauty (not reported), we find that the difference is statistically significant.

They also show that unattractive individuals are more likely to be involved in criminal activities as young adults.

For a simple theoretical model relating beauty to teaching ability and research productivity, see Ponzo and Scoppa (2012).

See De Paola and Scoppa (2011).

For example, Law professors typically publish less in international journals and more in books (in Italian). Publications of this type are less likely to appear in the Google Scholar archives.

Ashenfelter and Krueger (1994) adopt a similar strategy by using the level of education of i as reported by his/her twin as an instrument for the self-reported level of education of individual i.
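
The logic of instrumenting one noisy measure with a second, independent one can be sketched on simulated data. Everything below is an assumption made for illustration (the data-generating process, the true slope of 2.0, the variable names); it is not the authors' specification or data. With classical measurement error, OLS on a single noisy measure is biased toward zero, while the simple IV estimator that uses the second measure as an instrument recovers the true slope in large samples.

```python
import numpy as np


def ols_slope(x: np.ndarray, y: np.ndarray) -> float:
    """Slope from a bivariate OLS regression of y on x (with intercept)."""
    x_c = x - x.mean()
    y_c = y - y.mean()
    return float(x_c @ y_c / (x_c @ x_c))


def iv_slope(z: np.ndarray, x: np.ndarray, y: np.ndarray) -> float:
    """Simple IV (Wald) estimator: cov(z, y) / cov(z, x)."""
    z_c = z - z.mean()
    return float(z_c @ (y - y.mean()) / (z_c @ (x - x.mean())))


rng = np.random.default_rng(0)
n = 50_000

# Hypothetical latent attractiveness and two independently noisy ratings of it.
true_beauty = rng.normal(size=n)
beauty_1 = true_beauty + rng.normal(size=n)  # measure used as the regressor
beauty_2 = true_beauty + rng.normal(size=n)  # second measure, used as instrument

# Outcome depends on latent attractiveness with an assumed true slope of 2.0.
evaluation = 2.0 * true_beauty + rng.normal(size=n)

b_ols = ols_slope(beauty_1, evaluation)           # attenuated toward zero
b_iv = iv_slope(beauty_2, beauty_1, evaluation)   # close to the true slope
```

Under these assumptions the OLS slope should sit near half the true value (the signal share of the noisy measure's variance is 0.5), whereas the IV slope should be close to 2.0. The instrument is valid here only because the two rating errors are independent by construction.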

Note that the number of observations is lower, because not all of the instructors were evaluated by administrative employees.

In a similar spirit, Hamermesh and Parker (2005) use a dummy variable “Formal Dress”, equal to one for male faculty members who are wearing neckties in their pictures and for female faculty who are wearing a jacket and blouse. In the same way, we also build a variable Formal Dress and find that it is positively and significantly correlated with Well Dressed/Groomed.

In the Italian academic system, all professors (Full, Associate, and Assistant) have a permanent position.

As a further check, to verify whether the beauty ratings are consistent across raters, we also regress Teaching Evaluations separately on each rater’s evaluation (rjk). Confirming the high degree of agreement among raters, the coefficients of Beauty are always positive and, with only a few exceptions, statistically significant (results are not reported).

