Generative AI is no longer a hypothetical in education; it is sitting on the same devices students use to write, solve, and submit their work. The old assumption that a polished essay or lab report reflects only the author’s effort has collapsed, and with it the comfort of traditional grading. If teachers want grades to mean “this is what the student can actually do,” they now have to evaluate thinking, process, and judgment in a world where a chatbot can draft a passable answer in seconds.
That shift is not just about catching cheaters; it is about redesigning assessment so students learn how to use AI without outsourcing their minds to it. I see the most promising work happening where educators combine low-tech safeguards, explicit AI use, and richer feedback loops, rather than trying to ban tools that are already embedded in students’ lives.
From catching AI to grading thinking
The first instinct in many schools has been to treat AI as a new form of plagiarism and to hunt for it with detectors. That approach is already breaking down, because large language models can paraphrase, detectors are unreliable, and students can mix AI text with their own. A more durable response is to redesign assessment so that what is being graded is the student’s reasoning, not just the final prose. That is the core argument behind work on assessment in what Xiao Hu of the College of Information Science calls a “post-plagiarism era,” where process and critical reflection carry as much weight as the finished product.
In practice, grading thinking means asking students to show how they got there: annotated drafts, planning notes, and short metacognitive reflections that explain choices and revisions. It also means valuing oral explanations, quick in-class problem solving, and other performances that are hard to fake with a chatbot. When teachers shift rubrics toward these elements, they are less vulnerable to invisible AI help and more aligned with what learning scientists describe as deeper understanding.
AI-resistant does not have to mean AI-free
Some educators are experimenting with what one guide bluntly calls “Going Medieval,” reintroducing traditional pen-and-paper tests for key checkpoints so they can be confident about who did the work. In that framework, teachers selectively use handwritten quizzes, in-class writing, and closed-note problem sets as part of a broader assessment strategy, not as a nostalgic return to blue books. The point is to design moments where students must retrieve knowledge and reason independently, then connect those moments to larger projects where AI can be used more openly.
Similar thinking shows up in lists of AI-resistant practices, which put paper-and-pencil, in-person work at the top: tasks that force students to grapple with the material themselves. I see these as guardrails rather than a full solution. If every assignment is AI-proofed by banning devices, students never learn how to use AI responsibly. The more sustainable model is to carve out AI-free assessments for core skills while also building assignments that assume AI is present and ask students to go beyond what a generic chatbot can do.
Designing assignments for an AI-saturated world
When teachers do invite AI into the process, the design of the task matters as much as the tool. One set of key principles for effective assessment urges instructors to align tasks with learning outcomes and to use AI to support, not replace, those goals. In that view, teachers start by clarifying what they want students to know or be able to do, then decide where AI can help them practice or get feedback without undermining the target skill. That guidance keeps returning to alignment, so AI becomes a scaffold for learning rather than a shortcut around it.
Another thread of work focuses on making tasks personal enough that generic AI output is obviously insufficient. One guide to AI-proofing assignments urges instructors to make assignments personal, connect them to students’ lived experiences, and make the thinking process visible so it can be graded. When I ask students to tie a concept to their own community, job, or family, or to critique how an AI answer fits (or fails to fit) with their experience, they cannot simply paste in a chatbot response. That approach, outlined in guidance that frames AI as a potential “teaching superpower,” turns the tool into a foil for deeper reflection rather than a ghostwriter.
Oral defenses, handwritten work, and the return of the viva
One of the clearest ways to verify that a student understands something is still to talk with them about it. Some assessment guides are reviving the viva, an oral defense where students must explain their work and answer questions on the spot. In that model, the written product can be drafted with AI help, but the grade hinges on whether the student can discuss, justify, and extend what is on the page. Advocates of this approach describe the viva as a powerful assessment tool that sidesteps unreliable AI detection and instead checks for genuine understanding.
Handwriting is also making a comeback, not as a romantic gesture but as a practical control. One professor, Meger, requires students to handwrite their assignments so that, even in what she calls the absolute worst-case scenario, where a student has gone to AI for help, they still have to write the work out at least one time. Her approach, described in a piece on how professors envision classrooms in the Age of AI, treats handwriting as both a deterrent and a learning aid, since the act of writing can deepen memory. Her account shows how low-tech tweaks can coexist with high-tech tools in the same course.
Teachers are using AI too, but they are not ready to surrender grading
While schools worry about students outsourcing work to chatbots, teachers themselves are experimenting with AI to manage their own workload. In one account, Alex Rainey, who teaches English to fourth graders at Chico Country Day School in northern California, used GPT-4 to help score student writing. She described how the model could quickly sort work into rough bands, after which she would review and adjust the scores, and said she could not go back now that she had seen the time savings. Her experience captures the tension: AI can triage, but human judgment still anchors the grade.
Research on automated scoring backs up that caution. One study of large language models found that they could reliably evaluate short science responses, such as a question about what happens to particles when heat energy is transferred to them, and could even suggest directions for feedback. At the same time, the researchers warned that, despite the speed of LLMs, humans still have to design rubrics and interpret edge cases. Their findings, summarized in a report on how that science question was scored, and a companion piece noting that humans still have to oversee the process, both stress that AI should offer ideas, not deliver grades. A follow-up from Eshan Latif and Ninghao Liu reinforces that point.
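For readers who want a concrete picture of the “AI triages, human decides” pattern these accounts describe, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the workflow Rainey or the researchers actually used: the rubric text, band labels, and model name are placeholders, and it uses the OpenAI Python client simply as one familiar way to call an LLM.

```python
# Minimal sketch of "AI triages, human decides": the model suggests a rough
# scoring band and one feedback direction; every suggestion is surfaced for
# teacher review rather than recorded as a grade.
# Assumptions: the rubric, band names, and model name below are illustrative.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

RUBRIC = """Bands (teacher-written, not generated):
- Emerging: restates the prompt, little evidence of understanding
- Developing: partial explanation, some inaccuracies
- Proficient: accurate explanation in the student's own words
"""

def suggest_band(student_response: str) -> str:
    """Ask the model for a suggested band and one next step for feedback."""
    completion = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[
            {
                "role": "system",
                "content": (
                    "You are helping a teacher triage short student answers. "
                    "Suggest ONE band from the rubric and ONE next step for "
                    "feedback. Do not assign a grade."
                ),
            },
            {
                "role": "user",
                "content": f"Rubric:\n{RUBRIC}\nStudent response:\n{student_response}",
            },
        ],
    )
    return completion.choices[0].message.content

if __name__ == "__main__":
    responses = [
        "When heat energy is added, the particles move faster and spread apart."
    ]
    for text in responses:
        print("MODEL SUGGESTION (needs teacher review):")
        print(suggest_band(text))
```

The design choice that matters here is that the model’s output is labeled a suggestion and never written to a gradebook; the teacher remains the one who assigns the grade.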
*This article was researched with the help of AI, with human editors creating the final content.