Oscine songbirds have been an important study system for social learning, particularly because their learned songs provide an analog for human languages and music. Here we propose a different analogy: from an evolutionary perspective, could a bird’s song be more like an arrowhead than an aria? We modify an existing model of human tool evolution to accommodate cultural evolution of birdsong: each song learner chooses the most skilled available tutor to emulate, and is more likely to produce an inferior copy than a superior one. As in human tool evolution, we show that larger populations foster greater improvements in song over time, even when learners restrict their pool of tutors to a subset of individuals. We also demonstrate that randomly sampling tutors from the population offers no clear benefit over sampling only existing connections in a structured social network, and that when a lower-quality trait is easier to imitate than a higher-quality one, simpler songs can be maintained after population bottlenecks. We show that these processes could plausibly generate empirically observed patterns of song evolution, and we make predictions about the types of song elements most likely to be lost when populations shrink. More broadly, we aim to connect the modeling approaches used by researchers studying social learning in human and non-human systems, moving toward a cohesive theoretical framework that accounts for both cognitive and demographic processes.
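
The learning rule summarized above can be illustrated with a minimal sketch. It is not the implementation used in this study. It assumes that copy error follows a Gumbel distribution whose mode sits below the tutor's skill, which is one common formulation in tool-evolution models. The function names and the parameters `alpha` (average skill lost in copying), `beta` (spread of copy outcomes), and `tutor_pool` (number of tutors each learner samples) are hypothetical, chosen only for illustration.

```python
import math
import random


def gumbel(mode, scale, rng):
    """Draw one value from a Gumbel (max) distribution by inverse-CDF sampling."""
    u = rng.random()
    while u == 0.0:  # avoid log(0)
        u = rng.random()
    return mode - scale * math.log(-math.log(u))


def simulate(pop_size, generations, alpha=4.0, beta=1.0, tutor_pool=None, seed=0):
    """Return the mean song skill of the population after each generation.

    Assumed rule: every learner copies the most skilled tutor in its pool,
    and its own skill is drawn from a Gumbel distribution with mode
    (tutor skill - alpha), so an inferior copy is more likely than a
    superior one whenever alpha is large relative to beta.
    """
    rng = random.Random(seed)
    skills = [0.0] * pop_size
    history = []
    for _ in range(generations):
        next_skills = []
        for _ in range(pop_size):
            # Learners either see the whole population or a random subset of it.
            if tutor_pool is None:
                pool = skills
            else:
                pool = rng.sample(skills, min(tutor_pool, pop_size))
            best_tutor = max(pool)
            next_skills.append(gumbel(best_tutor - alpha, beta, rng))
        skills = next_skills
        history.append(sum(skills) / pop_size)
    return history


if __name__ == "__main__":
    for n in (10, 200):
        final_mean = simulate(n, generations=50)[-1]
        print(f"population {n}: mean skill after 50 generations = {final_mean:.1f}")
```

Under these assumptions, a larger population makes it more likely that at least one learner surpasses its tutor in each generation, so mean skill tends to rise in large populations and decline in small ones. Setting `tutor_pool` to a small integer restricts each learner to a random subset of tutors.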