I’m launching rlbook.ai. I’m excited, but I also have mixed feelings.
Five years ago, I spent close to two years writing Mastering Reinforcement Learning with Python. It was exhausting but rewarding. I felt I’d contributed something useful to the RL community.
Over this new year break, I built rlbook.ai in two days. The content is better than my book in many ways—more interactive, more current, more comprehensive. I’m still reviewing and improving what the AI generated, but the quality is already really good.
This feels both exciting and unsettling. What does it mean when two years of human effort can be surpassed by two days of AI-assisted work? I don’t have all the answers, but I want to explore this openly.
Why Traditional Content Creation Is Painful
I’ve spent years teaching—as a TA during my PhD, as adjunct faculty, and as a book author. Writing that book was truly challenging. Each chapter took weeks. I was never satisfied. The process looked like this:
- Research everything thoroughly
- Plan the chapter flow and examples
- Struggle to express concepts clearly (especially hard as a non-native English speaker)
- Create diagrams that never quite looked right
- Debug code examples through endless iterations
- Move on, knowing I could have done better
The worst part? After all that effort, the content was frozen. A reader suggests a better explanation? Great, maybe I can fix it in the second edition in 3-5 years. A new technique emerges? The book is already outdated.
Books force you to stop iterating and ship. But learning never stops evolving.
What Changed (And What Didn’t)
LLMs are good at explaining established knowledge. They’re getting better at synthesis and creativity, but for now, their strength is conveying existing ideas clearly and consistently. For technical education, that’s actually most of the work.
I saw this recently when I worked on an education project for my local community. We spent weeks manually creating content. When we switched to using Claude with carefully designed prompts, the results were significantly better. The lesson: we were better at designing what to teach than how to write it.
Then a publisher reached out about another RL book. I had to think: what would I contribute that an LLM couldn’t? Technical books aren’t about novel research—they’re about clear explanations. LLMs are extremely capable of that.
But curriculum design, quality supervision, ensuring coherence across many chapters, and building community—those still need human judgment. At least for now.
How rlbook.ai Works
The key difference between rlbook.ai and just “asking ChatGPT to explain RL” is this:
The prompts that generate content are versioned in the codebase and evolve over time.
Every chapter is generated from prompts that encode:
- Platform-wide standards for notation and code style
- Specific learning objectives and narrative arcs
- Cross-references to other chapters
- Requirements for interactive demos
- Three complexity levels (intuition, math, implementation)
When someone suggests a better way to explain something, we don’t just fix that one chapter. We update the prompt template. Then we can improve all similar content.
Think of it as content-as-code: versioned, testable, and improvable through pull requests.
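To make the content-as-code idea concrete, here is a minimal sketch of what a versioned prompt template could look like. This is purely illustrative: the file path, field names, and `build_chapter_prompt` function are hypothetical, not rlbook.ai's actual implementation.

```python
# prompt_templates/chapter.py -- hypothetical sketch of a versioned prompt template.
# None of these names come from rlbook.ai's actual codebase.

PLATFORM_STANDARDS = """
- Notation: states s, actions a, rewards r; discount factor gamma.
- Code style: PEP 8, type hints, NumPy for numerics.
"""

CHAPTER_TEMPLATE = """
You are writing Chapter {number}: {title} for an interactive RL textbook.

Platform standards (apply to all chapters):
{standards}

Learning objectives:
{objectives}

Cross-references: build on {prerequisites}.
Include an interactive demo for: {demo}.
Write three passes: intuition, math, implementation.
"""

def build_chapter_prompt(number: int, title: str, objectives: list[str],
                         prerequisites: list[str], demo: str) -> str:
    """Assemble a chapter-generation prompt from the shared standards."""
    return CHAPTER_TEMPLATE.format(
        number=number,
        title=title,
        standards=PLATFORM_STANDARDS.strip(),
        objectives="\n".join(f"- {o}" for o in objectives),
        prerequisites=", ".join(prerequisites),
        demo=demo,
    )

prompt = build_chapter_prompt(
    number=4,
    title="Temporal-Difference Learning",
    objectives=["Derive the TD(0) update", "Contrast TD with Monte Carlo"],
    prerequisites=["Chapter 3: Monte Carlo Methods"],
    demo="TD(0) value estimation on a gridworld",
)
```

The point of this shape: because the template lives in the repository, a reader's suggested improvement becomes a pull request against the template, and every chapter regenerated from it inherits the fix.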
The Quality Process
Content moves through clear stages:
- AI-Generated: Fast and comprehensive, but may contain errors or inconsistencies. Clearly marked so readers know what they’re getting.
- Editor Reviewed: Human review (currently just me) for accuracy and coherence. Most problems get caught here.
- Community Reviewed: Published with open comments via Giscus and Discord. Readers help improve it.
- Verified: Code tested, feedback incorporated, cross-references validated. This is mature content.
The big difference from traditional publishing: you don’t need perfection before shipping. You improve continuously based on real feedback.
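The stages above amount to a small state machine. A sketch of how they could be tracked in code (the stage names follow this post; the `Stage` enum and `can_promote` check are hypothetical, not how rlbook.ai actually stores state):

```python
from enum import Enum

class Stage(Enum):
    """Content maturity stages, in publication order, as described above."""
    AI_GENERATED = 1
    EDITOR_REVIEWED = 2
    COMMUNITY_REVIEWED = 3
    VERIFIED = 4

def can_promote(current: Stage, target: Stage) -> bool:
    """Content moves forward one stage at a time; no skipping review."""
    return target.value == current.value + 1

# A chapter starts as AI-generated and is promoted after each review pass.
stage = Stage.AI_GENERATED
assert can_promote(stage, Stage.EDITOR_REVIEWED)
assert not can_promote(stage, Stage.VERIFIED)  # can't jump straight to Verified
```

Making the stage explicit is what lets readers see exactly how mature a given chapter is before they rely on it.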
What I’m Trying to Build
If I could combine my favorite learning resources, it would be:
- 3Blue1Brown’s visual explanations
- Distill.pub’s interactive technical content
- D2L.ai’s hands-on code examples
All kept current through AI assistance and community input.
That’s the aspiration. Whether it works, we’ll see.
What I’m Still Figuring Out
This is an experiment. Here are the things I don’t know yet:
- How do you prevent drift? As prompts evolve, how do you keep Chapter 20 consistent with Chapter 1?
- Where does AI help most? Some content might need more human craft than others.
- How do you maintain voice? Can AI-generated content feel coherent across 30+ chapters?
- Can community feedback scale? What happens when there are 100 comments on one chapter?
I’m learning as I go. If you have thoughts on these, I’d genuinely like to hear them.
A Note on Authenticity
I could claim this is “the future of education” or “revolutionizing learning.” But honestly? I’m not sure. This is one person’s attempt to use AI to create better educational resources than I could alone. Maybe it works, maybe it doesn’t.
What I do believe: the old model of spending years writing static content doesn’t make sense anymore. We can iterate faster and improve based on actual learner feedback. The tools have changed; we should change how we use them.
Get Involved
If this interests you:
- Contribute on GitHub: Fix errors, improve prompts, suggest examples
- Leave feedback: Comment on chapters with what worked or didn’t
- Join Discord: Discuss RL, content design, or just ask questions
This is an open experiment. Let’s see where it goes.