Notes on a Remarkable Finding from Finland
I recently came across a paper by Ursina Schaede and Ville Mankki that contains a fascinating empirical finding with major implications for how we think about meritocracy.
The paper examines the long-run effects on students of a change in how their teachers were selected into a graduate program. Finland is well known for having an extremely effective school system, in part because primary teacher education has been “exclusively taught as a research-oriented, five year masters’ degree at universities” since the 1970s. These programs are in very high demand, with acceptance rates of about 10 percent. The admissions process has a first stage based largely on scores on a high school matriculation exam, followed by a second stage involving interviews and the evaluation of live teaching. Candidates are ranked again at the end of the process, and those at the top are admitted until capacity is filled.
For a number of years, admission to the second stage was subject to a quota ensuring that at least 40 percent of candidates making it through the first stage were male. Although this placed no constraint on second stage outcomes, it turned out that the entering classes (and hence several cohorts of trained teachers) did not differ much in gender composition from those making it through the first stage. The quota was abolished in 1989, leaving first stage outcomes unconstrained. Given the five-year program, the first post-quota cohort graduated in 1994.
The paper examines the causal effects of this change on long-run outcomes for students. Identification is facilitated by variation across municipalities in the age distribution of teachers at the time of quota removal, coupled with mandatory retirement at age 60. This means that students were differentially exposed to the newer post-quota cohorts, which had a different gender composition (fewer males) and a different distribution of scores on the matriculation exam (higher scores on average).
The authors find that students more heavily exposed to the quota-constrained cohorts of teachers ended up with higher educational attainment and labor force participation at age 25. In other words, removal of the quota led to a decline in student outcomes. While this finding is interesting in its own right, even more interesting are the mechanisms that the authors rule out, and the one that they eventually accept.
Could the effect be arising through a role model channel, with particular benefit for boys? No, the authors find “no evidence for boys’ educational attainment being more affected from exposure to male quota teachers relative to girls,” and none of their “main effects differ systematically or significantly by pupil gender.”
Could it be that male and female teachers bring different benefits to the table, with the whole being greater than the sum of its parts? This would be a benefit from diversity. The authors cannot rule this out entirely (the estimates are too noisy) but they do find that “the benefits of adding an additional male teacher are similar in magnitude between places with few male teachers and places where the share of men among colleagues is already high.”
What, then, do the authors think is driving their results? They argue that the evidence “is consistent with male quota teachers contributing positive qualities to the school environment that are not sufficiently captured by the selection criterion in absence of the quota.”
It is important to be clear about this, because the finding can be so easily misunderstood. It is not that the quota teachers proved effective because they were male. It is that the distributions of important characteristics (unmeasured by scores) were not identical across male and female applicant pools. The quota was picking up individuals with these characteristics by proxy. It is the characteristics that mattered for students, not the gender of the teacher.
For example, male teachers in the data were “slightly more likely to come from rural areas and to live in their region and municipality of birth when compared to female teachers.” So a quota that favored rural applicants or those who had not moved from their municipality of birth could have had similar effects. In fact, this helps explain the mechanism: if rural applicants have fewer resources on average, they will tend to have higher ability conditional on any given score than applicants from more resource-rich urban environments.
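To make the conditional-ability logic concrete, here is a minimal sketch under assumptions of my own that are not taken from the paper: a score equals ability plus a resource boost plus noise, with ability and noise both normally distributed. The function name, the parameter values, and the size of the resource boost are all hypothetical.

```python
def expected_ability(score, resource_boost, prior_mean=0.0,
                     var_ability=1.0, var_noise=1.0):
    """Posterior mean of ability given an observed score.

    Assumed (hypothetical) model, not from the paper:
        score = ability + resource_boost + noise
        ability ~ Normal(prior_mean, var_ability)
        noise   ~ Normal(0, var_noise)
    """
    # Weight placed on the resource-adjusted score (standard normal updating).
    weight = var_ability / (var_ability + var_noise)
    return prior_mean + weight * (score - resource_boost - prior_mean)


# Two applicants with the same score but different resource environments.
urban = expected_ability(score=1.5, resource_boost=0.5)
rural = expected_ability(score=1.5, resource_boost=0.0)

print(urban)  # 0.5
print(rural)  # 0.75
```

With identical scores, the applicant from the lower-resource environment has the higher expected ability, simply because less of the score can be attributed to resources.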
Moreover, it is extremely likely that even within a given applicant group (men or women), the conditional distribution of these other valuable characteristics is not independent of score. There may be a particular range of scores at which these other characteristics happen to be especially abundant. In this case a policy that optimizes benefits for students may not even be monotonic within group: some people with higher scores would be skipped over in favor of those with somewhat lower scores.
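A toy illustration of such a non-monotonic rule, with entirely made-up numbers: suppose that within one group the unmeasured characteristic is especially common among applicants scoring between 65 and 75, and that it is worth the equivalent of 15 score points to students. The score band, the bonus, and the function names are hypothetical and chosen only to show the shape of the argument.

```python
def expected_benefit(score):
    """Hypothetical expected benefit to students from admitting an applicant.

    Assumption for illustration only: the unmeasured trait is concentrated
    in the 65-75 score band and is worth 15 score points.
    """
    trait_bonus = 15 if 65 <= score <= 75 else 0
    return score + trait_bonus


def select(scores, capacity, key):
    """Admit the top `capacity` applicants according to the ranking `key`."""
    return sorted(scores, key=key, reverse=True)[:capacity]


scores = [60, 70, 78, 82, 90]  # applicants within a single group

by_score = select(scores, 2, key=lambda s: s)
by_benefit = select(scores, 2, key=expected_benefit)

print(by_score)    # [90, 82]
print(by_benefit)  # [90, 70]
```

Ranking by expected benefit admits the applicant with a score of 70 ahead of those with 78 and 82, which is exactly the kind of skipping described above.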
This possibility is discussed at length in a recent paper with Rohini Somanathan that I will present at a symposium at Yale next week. The event has been organized by Gerald Jaynes and Rohini Pande, and is open to all (with registration).
Of course, policies that are non-monotonic (within group) would give rise to perverse incentive effects, since some applicants could gain by scoring lower, and probably could not be sustained. But the conceptual point they raise is that the understanding of meritocracy in public discourse is terribly impoverished. If one were to design a truly meritocratic policy, it could well have features that resemble the pursuit of representation targets. Meritocratic policies will not, in general, involve the application of a common score threshold across all candidates.
I understand that Ursina is on the academic job market this year and that this is the paper she’ll be presenting. I think that the work will be influential, and I wish her luck.