Effective #teaching is a difficult and counter-intuitive task, and it's not something you can master from the Internet. So it's not surprising that AI is pretty bad at it and bad at evaluating it: its evaluations are even negatively correlated with student learning. Put another way, AI has poor pedagogical content knowledge:
* Knowledge without Wisdom: Measuring Misalignment between LLMs and Intended Impact arxiv.org/abs/2603.00883
Podcast summary: drive.google.com/file/d/1n09DU…
More examples:
#AIEd
Knowledge without Wisdom: Measuring Misalignment between LLMs and Intended Impact
LLMs increasingly excel on AI benchmarks, but doing so does not guarantee validity for downstream tasks. This study evaluates the performance of leading foundation models (FMs, i.e. …
arXiv.org

Doug Holton
in reply to Doug Holton
arxiv.org/abs/2603.00925
* Benchmarking the Pedagogical Knowledge of Large Language Models
arxiv.org/abs/2506.18710v1
fab-ai.org/initiatives/ai-for-…
* AI‑generated lesson plans fall short on inspiring students and promoting critical thinking
theconversation.com/ai-generat…
#AIEd #mathed #teaching #education