Recently, “the bitter lesson” is having a moment. Coined in an essay by Rich Sutton, the bitter lesson is that, “general methods that leverage computation are ultimately the most effective, and by a large margin.” Why is the lesson bitter? Sutton writes: