✨ For Visitors ✨
An overview of what we did in this course
Overview
This page is for those who said "tell me more" after hearing about our Agentic Studio course in Spring 2026. The rest of the site is what we provided to students during the semester (and we didn't always do a good job of keeping it up to date).
In a nutshell, we:
- hypothesized that teaching the early stages of agentic coding amounts to teaching core principles of software engineering and design
- wondered where our expert blind-spots were around students' abilities and interest in using these principles when developing applications with agents
- wanted to explore these ideas with students who had had some, but not too much, programming experience
So we ran a research project under the structure of a group independent-study course.
You can hear the hour-long summary of the course and its design in the Oxide and Friends podcast episode that Shriram and Kathi did in May 2026.
Further down, we provide info about our research design, a summary of our observations and findings, as well as a more detailed schedule of the topics and activities (along with relevant handouts), and where we'll be going from here.
Research Design
As part of the coding projects in the course, students were required to keep reflection journals about their experiences: what feelings were evoked at different points, what technical content they drew on, what they felt they needed to know, and how they split up the work between themselves and Claude. They also submitted their chat transcripts alongside their assignments (which came in via GitHub).
Students also wrote various documents to separately capture what they were learning or what advice they would give to other students about working with agents. These were given as assignments.
Our research questions include:
- How do Claude’s capabilities and behaviors interact with students’ experience (technical and affective) of the program-development process?
- In what ways do students utilize software engineering practices, and when do they do so relative to what Claude is doing?
- How do student reactions to working with Claude and software engineering practices change across the semester?
Findings
We are actively analyzing the data in Summer 2026. So far, we have found:
- Most students developed their own personal boundaries for when to use agents and how much to rely on them
- A majority of students were concerned about their lack of understanding of their code bases
- Most students were concerned about the loss of learning if agentic coding were introduced in intro courses
- Students found significant value in the peer-review and crit activities
- Several students wished the course had spent more time on advanced agentic skills (like orchestration and developing Claude skills), though we had warned upfront that we wouldn't be covering those. Others felt the course hadn't taught as much as it could have (though they noted the likely differences between 2nd- and 4th-semester students in this regard)
- Students by and large adopted testing and design practices, including the use of model-view-controller, but they did not pick up on externalization as much as we might have liked
- Students appreciated how the variety in the projects let us expose them to many areas of CS that are covered in more advanced courses
- Students didn't necessarily understand how websites worked
- Even in a small class centered around group feedback, some students were still reluctant to talk
- Differences in prior experience were more noticeable than we expected, both in discussions and in submitted work
- The later assignments took students longer than we anticipated, particularly as they ran into systems issues and token/plan limits (they had $20/month Claude plans by design); this in turn curtailed some of what we asked them to do.
We are still extracting blind spots from the student journals (as part of data analysis).
Looking Ahead
In Fall 2026, we will be expanding the experiment downward by weaving agents into an experimental section of an intro CS course with programming novices. We still intend to teach students how to program manually, but we will include some agentic projects so that we can also motivate design and testing with more nuance than our intro courses usually allow. Part of the challenge here will be training teaching assistants in the code and design review skills that such a course will demand.
Shriram will also expand the experiment upward by re-designing his programming languages course, bringing in many of the themes (types, constraints, testing) from this one.
Our project designs focused on implementation and testing, not front-end UI design. We realized how much our students would benefit from some attention to this area.
Detailed Class Schedule
| Date | Topics | Main class activities | Assignments | Readings and Handouts |
|---|---|---|---|---|
| Thu, Jan 22 | Course Overview | Overview of course | Project 1 (Tetris basic) | Offloading? No, Outsourcing |
| Tue, Jan 27 | Variations on Tetris, Model-View-Controller | Generate and critique anti-gravity Tetris | Project 1.5 (Tetris dual) | |
| Thu, Jan 29 | Intro to Code Review and testing plans | Code review a solution, generate first testing plans | Project 1.75 (Tetris tested) | Crit handout |
| Tue, Feb 3 | Tetris recap | Share out final testing plans and reflection journal entries | Project 2 (Airport Weather) | Writing an Agents.md file |
| Thu, Feb 5 | Review airport weather project | Live code reviews, reviewing guidelines on testing | Project 2.5 (Airport Weather more data) | Peer-review rubric |
| Tue, Feb 10 | Designing data structures with Claude | Group exercise to design data structure for tournaments | Project 3 (Peer Review Airport Weather Testing) | Testing review rubric |
| Thu, Feb 12 | Auditing implementations for performance | Contrast tournament solutions generated with CS1-level knowledge (in prompt) and more advanced knowledge | Checkpoint 1 (Memo to Peer and Course Eval) | |
| Tue, Feb 17 | No class | President's Day weekend | ||
| Thu, Feb 19 | Leveraging programming language features for readability | Contrast assembly, Python, and TypeScript code for a library checkout system to identify features that lead to readability | Project 4: Requirements Checker Design | Simon Willison's LLM predictions for 2026 |
| Tue, Feb 24 | Crit requirements checker designs | Follow the crit rubric on each other's designs through three roles (Designer, Manager, Client). | Project 4.5: Requirements Checker Design Revision | Requirements Checker Crit (Spr 26) document |
| Thu, Feb 26 | Externalization of data | Review an online shop builder from perspectives of user experience, product quality assurance (testing), and code quality (implementation) | Project 4.75: Requirements Checker Implementation | Shopping Site Code Review document |
| Tue, Mar 3 | Code review rubric and readability | Work through a code-review rubric on a revised version of the online shop. Expert code review of same codebase by instructor | Project 4.83 (Peer Review Requirements Code) | Code-review rubric |
| Thu, Mar 5 | Taking responsibility for products | Set up TestFest, in which each student's code runs against every student's tests | Project 4.91 (Requirements Checker Product) | |
| Tue, Mar 10 | SQL | Return to airport weather and consider how different data structures would be better for different queries. Explain SQL and how it handles performance across different queries | ||
| Thu, Mar 12 | Mapping requirements to SAT solving | Inspect the design of the TestFest harness, revisit SQL, show how requirements and transcripts map to Boolean formulas | Project 5 (Music Tour Design) | |
| Tue, Mar 17 | Designing constraints for applications | Review the subtleties of analyzing TestFest results, contrast three student approaches to representing requirements as data, discuss the different constraints for the Music Tour exercise | Project 4.96 (Requirements Checker Analysis) | |
| Thu, Mar 19 | No class | Class cancelled | ||
| Tue, Mar 24 | No class | Spring break | ||
| Thu, Mar 26 | No class | Spring break | ||
| Tue, Mar 31 | Mental models and SBF (Structure, Behavior, Function) | Work through models, SBF framework, and Jackson's software concepts using a desktop trash can example and the Brown waitlist system for course registration | Project 6 (Metaphors); Project 6.5.1 (Metaphor Peer Review) | Some Observations on Mental Models (Donald Norman); Structure-Behavior-Function (Cindy Hmelo-Silver) |
| Thu, Apr 2 | Practicing concepts and concept mapping | Look at concepts in different styles of chat (Slack, email, SMS) and talk through goals of Chat Client assignment | Final Project (Chat Client); Project 6.5.2 (Metaphor Revision) | |
| Tue, Apr 7 | APIs and Constraint Solvers | Compare web API methods (GET, PUT, POST), introduce the Matrix protocol and constraint solving (4-coloring problem) | Chat Client: Mapping and Testing | Matrix protocol |
| Thu, Apr 9 | Privilege levels and Security | Overview of usable security, explore privilege levels in common course software | Chat Client: Server Testing | |
| Tue, Apr 14 | Studying impact of agents via education research | Dissect what makes for a good education research question, learn about cognitive load, discuss the MIT Brain on ChatGPT study, group exercise on qualitative coding of reflection journal entries | Chat Client: Implementation | |
| Thu, Apr 16 | How coding agents work | Overview of the algorithmics that underlie coding agents | Readings on AI and Mental Fatigue | |
| Tue, Apr 21 | Fatigue and productive agentic coding practices | Overview of cognitive science of multitasking and cognitive open loops; discussing own experiences with fatigue and cognitive load | Reflecting on Progress | |
| Thu, Apr 23 | Agents in intro courses? | Focus groups to consider benefits, risks, and potential policies for weaving agents into intro CS courses | Chat Client: Implementation with AI | |