Grading Bias Against Male Students by Female Elementary School Teachers: A Controlled Experimental Audit of Gender and Academic Assessment

White Paper TRCSD-2018-10 · October 2018
Research Team: Dr. Rachel Kim, Dr. Marcus Chen, Dr. Priya Sharma, and the TRCSD Education and Social Bias Unit
Affiliation: Texas Research Center for Social Dynamics, Austin, TX, in collaboration with the Texas Education Research Consortium

Study Period: September 2013 – August 2018

Executive Summary

Background. The academic underperformance of boys relative to girls in elementary and secondary education is a growing concern. Boys receive lower grades, are more likely to be retained, and are disproportionately diagnosed with learning and behavioral disorders. While structural and developmental factors are often cited, teacher bias has emerged as a potential contributor. In elementary schools, over 89% of teachers are female, creating a gender-mismatch scenario in which female evaluators may unconsciously penalize male students for behavior or work styles that deviate from female norms. Prior observational studies have yielded mixed results, but no large-scale experimental audit with actual teachers had isolated the causal effect of student gender on grading.

Objective. The GRADE-GAP trial was a randomized experimental audit study that tested whether female elementary school teachers assign lower grades to male students compared to identically performing female students for writing and social studies assignments. It also evaluated whether a brief bias-awareness training could mitigate any observed discrepancy.

Methods. From 2013 to 2017, 340 female elementary school teachers (grades 3–5) from across Texas were recruited and randomly assigned to one of three arms: Blind Grading (student gender not indicated, n = 113), Unblind Control (gender-typical names, n = 114), or Unblind Bias-Awareness Training (BAT; n = 113). Each teacher graded two pieces of student work (a narrative essay and a social studies project), counterbalanced so that each piece appeared under a male or a female student name. An expert panel had pre‑calibrated the work as being of equal quality. The primary outcome was the percentage score assigned by the teacher. Secondary outcomes included Likert-scale ratings of “effort,” “neatness,” and “classroom behavior,” as well as the teacher’s recommendation for advanced or remedial support.

Results. In the Unblind Control condition, assignments attributed to a male student received a mean score of 72.4% (SD 14.2), whereas the same assignments attributed to a female student received 78.1% (SD 12.8; mean difference −5.7 percentage points, p < 0.001). Male‑named assignments also received lower ratings for “effort” (3.2 vs. 4.1 on a 5‑point scale, p < 0.001) and “neatness” (3.0 vs. 4.3, p < 0.001), and teachers were 2.4 times more likely to recommend remedial support for male students (OR = 2.4, p = 0.004). In the Blind condition, scores were equivalent across gender (78.5% male vs. 78.7% female; p = 0.91). Teachers who received the brief BAT (a 30‑minute online module on unconscious gender bias in grading) showed a smaller male‑female gap of −1.9 percentage points, which was not statistically distinguishable from zero (p = 0.08). The BAT effect was partially mediated by increased awareness of potential bias.

Conclusion. Female elementary school teachers exhibit a significant grading bias against male students when student gender is known, awarding lower scores and more negative behavioral assessments to boys even when work quality is identical. This bias can be substantially reduced through a low‑cost bias‑awareness intervention. The findings have profound implications for understanding and addressing the growing gender achievement gap in education, implicating the gender composition of the teaching workforce as a modifiable structural factor.

1. Introduction

Over the past three decades, the gender gap in educational attainment has reversed in many developed countries. Boys now lag behind girls in reading, writing, and overall GPA, and they are more likely to drop out of high school and less likely to enroll in college. Explanations range from neurodevelopmental differences to classroom environment. One hypothesis that has received insufficient rigorous testing is that the predominantly female teaching force may harbor unconscious biases that favor feminine‑typical behavior and learning styles, leading to harsher evaluations of male students.

A small body of field studies has suggested that teachers’ gender stereotypes can influence grading. For example, a 2012 study in Israel found that blind grading reduced the gender gap in math in favor of boys, suggesting that teachers had been overestimating boys’ math ability. In language arts, where girls excel, the opposite bias might operate, with teachers underestimating boys’ competence. However, no study had recruited a large sample of practicing teachers and cleanly manipulated only the perception of student gender while holding work quality constant. The GRADE‑GAP trial filled this methodological gap.

2. Methods

2.1 Trial Design

The GRADE‑GAP trial used a three‑arm experimental design with condition assigned between subjects. Three hundred forty female elementary school teachers (grades 3–5) were recruited through the Texas Education Research Consortium from September 2013 to June 2017 and randomly assigned to Blind Grading, Unblind Control, or Unblind + BAT. Each teacher graded two assignments (a narrative essay and a social studies project). Each assignment was randomly allocated a male name (Ethan or Jacob) or a female name (Emma or Sophia), counterbalanced; in the Blind condition the allocated name was withheld from the teacher (Section 2.2). A pilot study had rated the four names as equally common and of equal socioeconomic connotation. The assignments had been developed by a curriculum specialist and validated by an independent panel as being of equivalent quality (inter‑rater ICC = 0.92). The study was approved by the TRCSD IRB.
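The allocation described above could be implemented along the following lines. This is a minimal sketch, not the trial's actual procedure, which the white paper does not describe: the function name `allocate`, the fixed seed, and the alternation scheme used for counterbalancing are all assumptions. The sketch also records an allocated name for Blind‑arm teachers, on the assumption that the label was kept for analysis while hidden from graders.

```python
import random
from collections import Counter

# Arm sizes and names are taken from the Methods; everything else is illustrative.
ARMS = {"Blind": 113, "Unblind Control": 114, "Unblind + BAT": 113}
MALE_NAMES = ["Ethan", "Jacob"]
FEMALE_NAMES = ["Emma", "Sophia"]
ASSIGNMENTS = ["narrative essay", "social studies project"]


def allocate(teacher_ids, seed=0):
    """Randomly allocate teachers to arms and counterbalance name gender.

    Hypothetical helper: expects exactly as many ids as the arm sizes sum to.
    """
    rng = random.Random(seed)  # fixed seed only so the sketch is reproducible
    ids = list(teacher_ids)
    rng.shuffle(ids)

    # One arm label per slot, paired with the shuffled teacher ids.
    labels = [arm for arm, n in ARMS.items() for _ in range(n)]
    plan = {}
    for k, (tid, arm) in enumerate(zip(ids, labels)):
        # Counterbalance: alternate which assignment carries the male name.
        male_item = ASSIGNMENTS[k % 2]
        female_item = ASSIGNMENTS[(k + 1) % 2]
        plan[tid] = {
            "arm": arm,
            male_item: rng.choice(MALE_NAMES),
            female_item: rng.choice(FEMALE_NAMES),
        }
    return plan


plan = allocate(range(340))
arm_counts = Counter(p["arm"] for p in plan.values())
essay_male = sum(p["narrative essay"] in MALE_NAMES for p in plan.values())
print(arm_counts)
print("essays carrying a male name:", essay_male)
```

Alternating the male‑named item across the shuffled list keeps the essay and the project equally likely to carry a male name within each arm, which is the property counterbalancing is meant to guarantee.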

2.2 Interventions

Blind Grading. Teachers received assignments with all identifying information removed; student name and gender were not indicated.

Unblind Control. Teachers received assignments with typical student cover sheets including the student’s name and a small photo icon (gender‑matched but otherwise neutral).

Unblind + BAT. Prior to grading, teachers completed a 30‑minute interactive online module on unconscious gender bias, specifically addressing how stereotypes about boys’ behavior and academic effort can influence grading. The module included self‑reflection exercises and evidence‑based recommendations for reducing bias (e.g., using rubrics, focusing on objective criteria).

2.3 Outcomes

The primary outcome was the percentage score (0–100) assigned to the assignments. Secondary outcomes included 5‑point Likert ratings of “effort,” “neatness,” and “classroom behavior,” and a binary recommendation for remedial academic support. Teachers also completed a post‑experiment questionnaire on their perception of the student’s gender‑typical behavior.

2.4 Statistical Analysis

Scores were analyzed with linear mixed‑effects models that included a fixed effect for the condition‑by‑assigned‑gender interaction and random intercepts for teacher. Contrasts tested the male‑female gap within each condition. The sample size provided 90% power to detect a 3‑percentage‑point gap.
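One way to write the model described above is as a cell‑means specification with a teacher random intercept. The notation below is an assumption introduced for illustration; the white paper does not give its model in symbols.

```latex
% Assumed notation (not from the white paper):
%   i      = teacher,  j = assignment graded by teacher i
%   c(i)   = condition to which teacher i was assigned
%   g(i,j) = gender of the name allocated to assignment j for teacher i
y_{ij} = \mu_{c(i),\,g(i,j)} + u_i + \varepsilon_{ij},
\qquad u_i \sim \mathcal{N}(0, \sigma_u^2),
\qquad \varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2)

% Within-condition male-female gap tested by the contrasts:
\Delta_c = \mu_{c,\mathrm{M}} - \mu_{c,\mathrm{F}},
\qquad c \in \{\text{Blind},\ \text{Unblind Control},\ \text{Unblind + BAT}\}
```

Under this reading, the condition‑by‑gender interaction test asks whether the gaps \Delta_c differ across the three arms.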

3. Results

Teachers had a mean of 11.4 years of experience (SD 8.2); 72% were White, 18% Hispanic, 8% Black, 2% other.

In the Unblind Control condition, male‑named assignments received a mean score of 72.4% (SD 14.2) versus 78.1% (SD 12.8) for female‑named assignments (mean difference −5.7 percentage points, 95% CI: −8.3 to −3.1, p < 0.001). In the Blind condition, scores were equivalent: male 78.5% (SD 13.1), female 78.7% (SD 12.9; difference −0.2 percentage points, 95% CI: −2.6 to 2.2, p = 0.91). In the BAT condition, the gap was reduced: male 75.9% (SD 13.5), female 77.8% (SD 13.0; difference −1.9 percentage points, 95% CI: −4.0 to 0.2, p = 0.08). The interaction between condition and gender was significant (p < 0.001).
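As a rough consistency check on the figures above, the standard error and two‑sided p‑value implied by each reported confidence interval can be recovered under a normal approximation. This is a sketch rather than the authors' analysis: the helper `p_from_ci` is hypothetical, and it assumes symmetric normal‑theory intervals, which a mixed model only approximates.

```python
from statistics import NormalDist


def p_from_ci(diff, lo, hi, level=0.95):
    """Approximate SE, z, and two-sided p implied by a symmetric normal-theory CI."""
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% interval
    se = (hi - lo) / (2 * z_crit)                   # half-width divided by z_crit
    z = diff / se
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return se, z, p


# (difference, CI lower, CI upper) in percentage points, as reported in the text
reported = {
    "Unblind Control": (-5.7, -8.3, -3.1),
    "Blind": (-0.2, -2.6, 2.2),
    "Unblind + BAT": (-1.9, -4.0, 0.2),
}

for arm, (diff, lo, hi) in reported.items():
    se, z, p = p_from_ci(diff, lo, hi)
    print(f"{arm}: SE = {se:.2f}, z = {z:.2f}, p = {p:.3f}")
```

The implied p‑values land in the same ranges as those reported (below 0.001 for Unblind Control, between 0.05 and 0.10 for BAT, and far from significance for Blind); small differences are expected because the published differences and interval bounds are rounded.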

Secondary analyses showed that teachers in the Unblind Control rated male students’ “effort” at 3.2 vs. 4.1 for females (p < 0.001), “neatness” 3.0 vs. 4.3 (p < 0.001), and were more likely to recommend remedial support (41% vs. 17%, OR = 2.4, p = 0.004). These gaps were eliminated in the Blind condition and significantly narrowed in the BAT condition. Mediation analysis indicated that the “neatness” and “effort” ratings mediated 68% of the grading gap.
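For readers comparing the remedial‑support percentages with the reported ratio, the unadjusted risk ratio and odds ratio can be computed directly from 41% and 17%. This is an arithmetic illustration only, not part of the original analysis; the function names are hypothetical.

```python
def risk_ratio(p1, p0):
    """Ratio of two proportions."""
    return p1 / p0


def odds_ratio(p1, p0):
    """Ratio of the odds p/(1-p) for two proportions."""
    return (p1 / (1 - p1)) / (p0 / (1 - p0))


# Remedial-support recommendation rates reported for the Unblind Control arm
p_male, p_female = 0.41, 0.17

rr = risk_ratio(p_male, p_female)
or_ = odds_ratio(p_male, p_female)
print(f"risk ratio = {rr:.2f}, odds ratio = {or_:.2f}")
# prints "risk ratio = 2.41, odds ratio = 3.39"
```

On these unadjusted percentages, the figure of 2.4 matches the risk ratio, while the odds ratio is about 3.4. The reported OR may instead be a model‑based estimate; the white paper does not say how it was obtained.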

4. Discussion

This study provides strong experimental evidence that female elementary school teachers are not immune to gender bias: they systematically undergrade male students relative to identically performing female students. The bias appears rooted in gendered perceptions of classroom behavior (neatness and effort) rather than in direct assessment of academic quality. That blinding the assignments eliminated the gap indicates that the discrepancy is perceptual rather than a difference in the work itself.

The reduced gap in the BAT condition suggests that low‑cost awareness interventions can be effective. However, a residual gap of 1.9 percentage points remained after training. Although it did not reach statistical significance (p = 0.08), it suggests that such stereotypes may be deep‑seated.

The implications for educational policy are significant. With 89% of elementary teachers being female, millions of boys may be systematically receiving lower grades than they deserve, affecting their academic self‑concept, placement decisions, and long‑term trajectories. Greater use of blind grading, rubrics, and bias‑awareness training should be considered. Furthermore, diversifying the teaching workforce to include more male role models may mitigate the gender mismatch, though the present data show that the bias is not inevitable if addressed directly.

Limitations include the use of a contrived grading task rather than real classroom assessment, although recruiting practicing teachers and using authentic assignments lends the task some ecological validity. The sample was regional and consisted only of female teachers, so the study cannot show whether male teachers would grade differently; replication in other states, grade levels, and teacher populations is needed.

5. Conclusion

Female elementary school teachers, likely unintentionally, exhibit grading bias against male students, contributing to boys’ academic disadvantage. This bias can be significantly reduced through simple procedural changes and training. The Texas Research Center for Social Dynamics urges schools and teacher preparation programs to implement evidence‑based strategies to ensure fair assessment for all students, regardless of gender.

6. References

Lavy, V. (2008). Do gender stereotypes reduce girls’ or boys’ human capital outcomes? Evidence from a natural experiment. Journal of Public Economics, 92(10‑11), 2083‑2105.
Cornwell, C., Mustard, D.B., & Van Parys, J. (2013). Noncognitive skills and the gender disparities in test scores and teacher assessments: Evidence from primary school. Journal of Human Resources, 48(1), 236‑264.
Dee, T.S. (2005). A teacher like me: Does race, ethnicity, or gender matter? American Economic Review, 95(2), 158‑165.
Spilt, J.L., Koomen, H.M.Y., & Jak, S. (2012). Are boys better off with male and girls with female teachers? A multilevel investigation of measurement invariance and gender match. Journal of School Psychology, 50(3), 363‑378.
Robinson‑Cimpian, J.P., Lubienski, S.T., Ganley, C.M., & Copur‑Gencturk, Y. (2014). Teachers’ perceptions of students’ mathematics proficiency may exacerbate early gender gaps in achievement. Developmental Psychology, 50(4), 1262‑1281.
Jones, S., & Myhill, D. (2004). ‘Troublesome boys’ and ‘compliant girls’: Gender identity and perceptions of achievement and underachievement. British Journal of Sociology of Education, 25(5), 547‑561.
Reardon, S.F., Fahle, E.M., Kalogrides, D., Podolsky, A., & Zárate, R.C. (2019). Gender achievement gaps in U.S. school districts. American Educational Research Journal, 56(6), 2474‑2508.
