ACM ByteCast - Roger Dannenberg - Episode 61

Episode Date: December 17, 2024

In this episode of ACM ByteCast, Bruke Kifle hosts ACM Fellow Roger Dannenberg, a Professor Emeritus of Computer Science, Art & Music at Carnegie Mellon University. Dannenberg is internationally renowned for his research in computer music, particularly in the areas of programming language design, real-time interactive systems, and AI music. Throughout his career, he has developed innovative technologies that have revolutionized the music industry, and he is known for creating Audacity, the widely known and used audio editor. In addition to his academic work, his other projects include Music Prodigy, aiming to help thousands of beginning musicians, and Proxor, aiming to help software developers launch a successful career. Roger is also an accomplished musician and composer, having performed in prestigious venues around the world. Roger traces his two lifelong passions for computer science and music, and his fascination with the connection between sound, mathematics, and physics. He describes the significant changes in interactive computer music, which once required specialized hardware but now runs on ubiquitous software-based audio processing. Roger and Bruke discuss the promise of AI in music, especially for enhancing creativity and live performance, as well as the challenges of balancing AI with human labor and creativity. Roger also describes his work on the powerful open-source audio editor Audacity (co-developed with former student Dominic Mazzoni), which has democratized music production and is now used by millions of users worldwide. Finally, he talks about some recent projects in music analysis and composition, and reflects on his role as an academic and advisor.

Transcript
Starting point is 00:00:00 This is ACM ByteCast, a podcast series from the Association for Computing Machinery, the world's largest educational and scientific computing society. We talk to researchers, practitioners, and innovators who are at the intersection of computing research and practice. They share their experiences, the lessons they've learned, and their own visions for the future of computing. I am your host, Bruke Kifle.
Starting point is 00:00:33 The intersection of computer science and music has led to remarkable innovations in how we create, perform, and experience music and the arts. Advancements in real-time interactive systems, AI, and digital audio processing have opened up new possibilities for musicians and composers. At the forefront of this exciting field is our next guest, Dr. Roger Dannenberg, who has been pioneering groundbreaking technologies that bridge the gap between computer science and music for over four decades. Dr. Roger Dannenberg is a professor emeritus of computer science, art, and music at Carnegie Mellon University. He received his PhD in computer science from Carnegie Mellon in 1982 and has since made significant contributions to the field of computer music. Dr. Dannenberg is internationally renowned
Starting point is 00:01:15 for his research in computer music, particularly in the areas of programming language design, real-time interactive systems, and AI music. Throughout his career, he has developed innovative technologies that have revolutionized the music industry and is known for creating Audacity, the widely known and used audio editor. Surprisingly, in addition to his academic work, Dr. Dannenberg is an accomplished trumpet player and composer, having performed in prestigious venues around the world. Dr. Roger Dannenberg, welcome to Bycast. Thanks for having me. It's good to be here. Very, very excited to have you. You know, you have such a wide range of interests and experiences, and you've been able to bridge this personal passion and love for music with a very accomplished
Starting point is 00:02:04 academic career. Could you describe some key inflection points in your personal and professional journey that have inspired you to pursue a career in computing and specifically within your area of research at the intersection of computer science, arts, and music? Sure. I think that one thing that might be good to understand is that I've always been really attracted to mathematics and science and engineering. When I was a kid, I liked to build things. My father had a workshop and taught me a lot about using tools. And so that's just kind of part of my nature.
Starting point is 00:02:42 And I was also really attracted to music early on and pursued music through trumpet playing and writing and arranging. And so I kind of grew up with both of these things as very separate directions. And one inflection point that comes to mind is in high school, I went to a music store that had an analog synthesizer on display. And these days, that would be nothing or be hard to imagine discovering synthesizers that late in life. But for me, these were very new things and really fascinating. And I just really took to it. I had a chance to just play around with this thing in the music store. And being able to turn knobs and parametrically adjust sounds was something that just was really awakening.
Starting point is 00:03:36 And it sort of tied all of everything that I knew about mathematics and functions and physics of sound to the musical ideas that I had in my head. And in some sense, looking back on that, I see that that was really an awakening and there was no going back. So I did try to pursue computer science as a profession, a solid way to make a living and enjoy it. And at the same time, practicing and developing as a musician, which I thought I would, you know, through computer science,
Starting point is 00:04:11 I would have the resources and the freedom to make the kind of music I wanted to make. And it wasn't really until after graduate school in computer science that I started really putting the two together and doing research in the computer music area. And I guess one other thing I would mention is that after I finished my PhD, there were people around the computer science department at Carnegie Mellon that were very encouraging. I think Alan Newell and Herb Simon were among those. You know, these are great names and great people in computer science. And their attitude was that computer science was so all-encompassing that the important thing was to do good research and not worry about anything else.
Starting point is 00:05:00 And that attitude was really pervasive in the department. There were just all kinds of things going on. And so I felt invited and welcome to really pursue my passion. And that was tremendously helpful. That's amazing. I think going from having a personal passion and then seeing that as independent or somewhat exclusive of your academic pursuits, but finally getting to a point where you were able to combine these two pursuits. And so being in a position where not only music is something that you can pursue in your free time as an exercise of enjoyment, but rather becomes the focal point of your career. And so I think that's certainly something that a lot of people will take great inspiration from. I think, as you described it, you've had this long career, not just as an academic, but also as an accomplished trumpet player and composer.
Starting point is 00:05:50 How do you think your musical background and performance in sort of different cultural spaces have influenced your approach to teaching and scholarship more broadly? Do you find that there are parallels between music and your academic work, or do you find that it's shaped your perspective as a researcher? I would say the most important thing that music has done for me is given me a real basis for asking important questions and really forming research questions. So, you know, it's through the practice of music. Well, let me give just one example. One of the early things that I worked on was getting computers to listen to live performing musicians and play along with them. And that was sort of inspired by a talk that I heard where someone was building a sensor for a conductor's baton. And their idea was that if you could sense baton motions, then someone could actually conduct a computer
Starting point is 00:06:56 through processing sensor data to figure out what the beat was and what the tempo was. And that would be a good way to achieve expression through computer music. And it's a good idea, but actually I think it was musically uninformed because as a trumpet player who had played in orchestras and bands and worked with many different conductors, I knew that performers in the orchestra are not really following directly what comes from the baton. And if you try to synchronize visually with the baton, you're just not going to play in time. And so I started thinking about, well, how does that
Starting point is 00:07:39 actually work? Where does time and tempo come from in an orchestra or a band? And it comes from listening to the musicians around you, that there's this kind of collective entrainment is the technical term we use that enables people to play together. And our sense of hearing is temporally much more precise than our vision. So it's really necessary to listen. And so that got me thinking about, okay, so maybe if you want to synchronize a computer to musicians, you should be listening to musicians. And I thought about how to do that. And that led to a computer accompaniment system and patents and a spinoff company. And that happened very early in my computer music career. And it's just an example of how being a musician and being concerned with real musical problems led to interesting solutions and interesting research
Starting point is 00:08:34 questions. Well, that's amazing. So the ability to ask the right questions, but also having the personal context and knowledge yourself being a musician to pursue the right areas of research and solutions. You know, you touched on some of your early work, looking back from score following in the 1980s to some of your more recent projects like the compo system. From your point of view, how has the field of interactive computer music involved since you began your work in the 80s? One of the big changes, and I would say probably the biggest change, has been the speed of computing and hardware. It's hard to imagine in some ways how slow computers used to be. We're looking at machines that are a million times faster than just a few decades ago.
Starting point is 00:09:26 And it's something that I thought about decades ago that, wow, computers are growing in speed exponentially. So what are the interesting problems going to be 10 or 20 years from now when we have so much more computational power? And it's amazing how difficult it is to think about a world in which you have a million times more compute power. Maybe some problems, it's obvious what you would do with more power. But I think with music and audio, it wasn't so clear. But one thing definitely that happened that changed is when I started working in the field, it was only possible really to make sound with specialized hardware. And now it's just routine to build synthesizers and very sophisticated audio processing totally in software. And that's just opened up so many
Starting point is 00:10:21 possibilities for interaction. Now everyone actually has a very sophisticated audio system in every browser in the form of web audio, which is just one example of how software audio signal processing has become just ubiquitous. Very interesting. And I think to your point, which kind of leads me to my next question which is one of the key implications present day that we've seen of the acceleration of of computing technologies is also in the field of ai and so i'm sure as we look specifically at ai you know i think maybe if you asked a lot of folks in the field 15 20 years ago a lot of folks saw a lot of folks in the field 15, 20 years ago, a lot of folks saw a lot of potential for AI to enhance productivity with more routine tasks and less of some of the interesting creative
Starting point is 00:11:11 tasks that we're seeing AI sort of excelling in today. And so what potential do you see for applications of some of the emerging AI capabilities that we have today in enhancing music composition and potentially live performances as well? It's a great question. The potential is really huge. And I was just thinking that back when I worked on score following and computer accompaniment, I helped a company develop a product that came out initially for the Macintosh. And at the time, it was billed as the first consumer AI application. So that was considered artificial intelligence. And certainly, for the time, it checked all the boxes. I think AI has been part of music research and making music, along with so many other things.
Starting point is 00:12:01 And we've seen an incredible acceleration of AI capabilities through deep learning. And I think we're just beginning to really understand what that can do for music and how to integrate that with music. One of the things I'm interested in pursuing is using AI to build more interactive real-time systems. So far, most of the work has been towards less interactive systems, for example, doing music search or prompting an audio generation system with text and getting an audio composition out without a lot of interaction or real-time control. And I think that music making can be a very physical, performative task, and connecting some of these AI tools to real-time gesture sensors and live performance is a really interesting direction to pursue, and I think
Starting point is 00:13:03 we'll see more of that in the future. Very interesting. And how do you generally think about how, as a community, we can think about balancing the role of AI or potentially AI assistance with human creativity in music while addressing sort of the importance of preserving some of these things to musicians and composers and the general ethical considerations with the role of AI in the
Starting point is 00:13:25 space? It's a very complicated situation. One thing I like to do is take a historical perspective and look at not just what's happening right now, but what's happened over the years and even over centuries, that musicians have always adopted new technologies. And I think it's true of the arts in general. Even oil painting in the early days involved a lot of chemistry and processes and techniques that were being developed largely by artists themselves. And in the music world, you look at the piano, it's just a marvel of mechanical construction, huge pieces of cast iron to withstand the tension of all those strings,
Starting point is 00:14:14 and intricate mechanisms in the keys. So huge amounts of engineering and technology to make a musical instrument just that could play loud and soft with a keyboard. And so technology has always gone hand in hand with musical developments and that followed into electronics and computers. And so it's not like AI is making fundamental changes in that sense. The other thing to think about is that while there is some threat to musicians from artificial intelligence, generative audio algorithms for composition and synthesis and processing, it wasn't that long ago that we didn't have electronics and broadcast and transmission. And if you look at what the recording industry, a different technology,
Starting point is 00:15:06 did to live performance and the lives of many practicing musicians, I like to point out to people that when they talk about threats to musicians, that there aren't that many musicians left making a living at music because they've already been put out of work by the recording and the broadcast industries and all of that technology. So things are going to change, and I think that change is inevitable. But I think that just like electronic musicians embraced the new electronics technology and created new artistic opportunities, we've got the same situation with new AI techniques. And so this will keep the arts alive and keep lots of interesting work out there for composers and creators. The downside of some of this is maybe it's going to be easier to make
Starting point is 00:16:00 music coming out of black boxes and not requiring so much human labor, which has advantages and disadvantages, but it's just going to take some adjustment. to make music coming out of black boxes and not requiring so much human labor, which has advantages and disadvantages, but it's just going to take some adjustment. I think that's a really good way to put it. Of course, historically, the role of technology has always existed and has always been a key enabler for further accelerating or improving the production of music, how we consume music. And so while there are, of course, certainly downsides in this day and age and sort of the advent of AI, we kind of have to embrace these technologies as an enabler. So I think that's well said.
Starting point is 00:16:39 I would love to turn to some of your work around software design for computer music. Of course, as the co-creator of Audacity, as the developer of the Nyquist programming language, what are some key principles that have guided your approach to designing software for computer music? Well, one thing I can say about Audacity, which by the way, I should mention that it was a co-creation with Dominic Mazzoni, who did a huge amount of work in the early days and was really the leader in working with people to put Audacity out there, which led to it becoming a huge open source project used by millions and millions of people. When we started Audacity, our initial goal was just to do visualization of audio waveforms and spectra to help with research because there weren't really great tools out there and we thought it wouldn't be that hard to build something,
Starting point is 00:17:41 which was really the case. And so I think one thing that would say is that the commercial stuff out there tended to be very task specific. The software was designed specifically for audio editing tasks. And I think the designers were thinking more about taking a very narrow view of what users might want to do with something rather than looking at editors and audio displays as a general capability that could be offered in a general form for any purpose. And that's what we set out to do with Audacity. The other thing is that the models for early audio editors were treated audio files as these monolithic things as if maybe they were pieces of tape like from a tape recorder and so if you wanted to
Starting point is 00:18:34 cut copy paste the early editors would copy entire sound files which at the time, well, I mean, they're still big, they're megabytes in size. And at that time, disk speeds were not so great and copying a big multi-megabyte chunk of data took a lot of time. And so these editors tended to be very slow. And we were able to look at this as computer scientists with some knowledge of data structures and really think about, well, where's the time going and is that necessary? And do we have to store audio in one big monolithic file and keep it that way? Or can we break it up into chunks and reassemble chunks in ways that are quick so that we can make editing go much faster. And so that was early on. One of the reasons I think Audacity caught on
Starting point is 00:19:28 is that it was so fast for doing simple editing operations compared to all of the commercial software. And it was just because we took maybe a computer science viewpoint to simple data structures. It's not even that sophisticated, but it was obviously a good idea. And then I think similarly, if you look at Nyquist, which is a audio music composition language also used for sound synthesis, and it's actually embedded within Audacity now. And so you can write audio effects in a very high level language within Audacity, if you know how. So the whole point of Nyquist was to kind of revisit some existing computer music languages, more from a language design and a computer science standpoint, where things could be very much more unified. Thinking about time and signals and building abstractions that could be put together
Starting point is 00:20:33 in a very flexible way. And that led to a functional programming language. And interestingly, by using these abstractions, the implementation could be very specialized to different sorts of data types. And it actually ended up running much faster than some earlier designs that were more concrete. So normally we think that if you build a low-level, very close to the metal programming language, it's going to go faster than some abstract functional programming system. But Nyquist is at least one exception where having those nice abstractions allows you to do a lot of optimization that wouldn't happen otherwise.
Starting point is 00:21:17 ACM ByteCast is available on Apple Podcasts, Google Podcasts, Podbean, Spotify, Stitcher, and TuneIn. If you're enjoying this episode, please subscribe and leave us a review on your favorite platform. One thing that at least stood out to me is thinking about existing problems and, you know, identifying interesting or novel approaches from computer science or sort of core principles from computer science to solve those existing problems have been key to unlocking new opportunities in these spaces. How much do you see open source development, I guess, particularly in the case of Audacity, influencing both the
Starting point is 00:21:57 accessibility and the widespread availability of music production? Well, I think it's been huge. Audacity has had something like a million downloads a month for well over a decade. So you can do the math and figure out how many people are getting access to some high quality audio production tools that run on average, even low-end hardware. And I talked to so many young people that say that Audacity was their first exposure. And I think back to my days wandering into a music store in high school and discovering an analog synthesizer. And maybe the same thing is happening now to high school kids that download Audacity and record their guitar in their bedroom or whatever and start doing music production. It's really had a big impact there. And at the same time, Audacity has been used. Almost every professional music producer that I talk to
Starting point is 00:22:57 uses Audacity for one thing or another, just because it's maybe quicker to set up for small tasks, even though they have access to very sophisticated systems in their studios. So yeah, I think open source software, just because it's free, it encourages people to try it out and enables people to get things done. And the other thing I would say, and I like this comes from Dick Moore, who is a computer music guy at UCSD. And he, in talking about software and music software in particular, said that the software that academics and researchers create and put out there as open source may not always be the best software or the most sophisticated for any given task. But one thing it does is it sets a bar over which if you want to introduce some
Starting point is 00:23:54 software into the market and sell it, you'd better do better than what's already out there for free. And so it's kind of establishes, has a strong influence in product design and systems design and just overall quality and capabilities that if you can't beat the stuff that researchers put out there for free, then you've got to try harder. And I think we've seen a lot of software kind of step up their game a little bit. Oh, that is certainly a very true point. It has indeed set a bar. And I think from your point in terms of democratizing sort of music production, but also more generally developing this new generation of folks who are just as inspired in music production and potentially the role of computers and computer science, just as you were inspired
Starting point is 00:24:45 by the analog synthesizer way back when, I think must be an inspiring and moving driving force for you. Right. Well, I hope so. So, you know, I think we touched on a couple of your contributions and I sort of want to give space for you to highlight maybe one other project that, you know, you're most proud of. Of course, looking back on your on your career you know we talked about a couple things with you know software design programming languages piano tutor i think audacity as well you know is there maybe another project or achievement that you're most proud of and would like to highlight sure it's kind of hard to pick something because which of your children is your favorite? But before I move on to another thing, I just want to mention one thing I'm very proud of. It hasn't been that long since I started collaborations with Jorge Sastre in writing an opera.
Starting point is 00:25:39 This has been translated into English and has had a few performances. And it's been a really great experience. And it's a huge project that I'm very proud of. But another thing, in the more technical direction, one thing that I've worked on recently is trying to understand music through studies of prediction and entropy, because it's key as a music composer and music listener that music has, at a very abstract level, it's all about repetition and anticipation. And we've tried to formalize some of that, as have many other people in computer science and music theory. But our approach has been to use especially hidden Markov models as predictors. So it's a form of sequence learning. And we've used the predictions to form estimates of how much repetition there is in music and where repetition occurs.
Starting point is 00:26:52 And the repetition is how music achieves structure in many cases, maybe most cases. And once you can identify structure, you can start looking at the influence and interaction between structure and harmony and pitch and entropy or predictability. And so I think this work is by no means finished, but it's a direction. And what I really hope that it will lead to is a deeper understanding of music, which currently there's a lot of, especially with all of the efforts to model music with artificial intelligence systems and deep learning, there's been a real focus on popular music because it's popular. And it's what most people hear and listen to. And it actually tends to be a little more formulaic than a lot of other musical forms. And so I think it's easier to learn. And so what these systems are,
Starting point is 00:27:47 I think, failing to accomplish is an understanding of what underlies all music. So not just popular music, but let's look at Beethoven or let's look at Stravinsky or let's look at bebop. How do all these forms evolve, and how do they all call themselves music? What's going on that's common across all of this music? And I think a lot of it has to do with repetition and anticipation. And that's, of course, all tied to human perception, because if we don't perceive it, then it's, you know, it's not going to, doesn't really matter so much. And so, anyway, I think that line of research has been really interesting, and we have a long way to go, but that's something I'd like to highlight. Very exciting, very exciting, and certainly an exciting area of further development.
Starting point is 00:28:50 I'd love to touch on some of your work sort of in the academia space as a professor and as an instructor. Having advised, I'm sure, a lot of students who've gone through the department and the field, what skills do you believe are essential for students specifically in music, but also entering more generally this interdisciplinary field, especially in light of some of the emerging technologies and trends that we're seeing these days. I think for my students, the most successful ones have been the ones that really know a lot about music and have been educated musically and are really passionate about music. So I run into a lot of other researchers and lots of students who are technical and everyone loves music. And so it's common for people with a lot of technical skills and computer science interests to say, well, I could apply this to music, but they
Starting point is 00:29:41 don't really know so much about music. And I think that makes it difficult and less successful. So there's a lot said about interdisciplinary studies and interdisciplinary teams with the assumption or just observation that not everyone's an expert in everything. And maybe the way to make progress across disciplines is you have disciplinary specialists that talk to each other a lot and you get some kind of benefit from getting experts together. And I think that's certainly true. And there are lots of good case studies and examples of that. But personally, I've always worked more on personal projects or projects with me and a student and not so much in these broader interdisciplinary team sorts of approaches. And I think it's just really valuable to have to be really solidly grounded in music or whatever it is that you're interested in pursuing.
Starting point is 00:30:45 Yeah. Yeah. very much so. And so I guess maybe as we sort of wrap up this conversation, I'd love to get some of your thoughts on future directions for the field of computer music. So from your point of view as somebody who's worked in this field for a number of decades, where do you see the field heading in, say, maybe the next five to 10 years, particularly as we think about some of the role with advancements in computing, some of the rapid progress that
Starting point is 00:31:17 we're seeing with AI integration and interactive systems? So where do you see some of the exciting areas of research? Well, it seems to me that artificial intelligence has become, in a sense, the new computer science. And so just as in the early days when computers were becoming available, people were looking around saying, oh, I can apply computing to this and I can apply computing to that. We're seeing that with artificial intelligence that, oh, now I can apply machine learning to this and I can apply computing to that. We're seeing that with artificial intelligence that, oh, now I can apply machine learning to this and to that. And just as computers and computer programming became pervasive, I think intelligent systems are going to be pervasive in almost everything that we do, including music. And so I think for the future, the big question out there is how do we make intelligence systems more AI and machine learning, even though machine learning has traditionally been a batch-oriented, take a whole bunch of data,
Starting point is 00:32:31 try to distill it into some knowledge and build an inference engine. So that's kind of the paradigm that's developing, but I think that it has its shortcomings, and thinking about real-time and interaction is going to open up a lot of new possibilities. So I think that's a future thing to work on and otherwise just generally interaction and control. We have amazing systems that are capable of taking prompts and creating stuff from that. What we don't have are systems
Starting point is 00:33:05 that after they create something, allowing artists to go in and modify them, perfect them, teach the system something it didn't know that would make everything better or would push everything in a certain direction. People do that in a very limited way with prompting, but I think we have a lot to learn about how to really control systems especially for artistic purposes where the the standards are so high very promising
Starting point is 00:33:34 and exciting areas of work and i guess maybe just to wrap up of course we have a lot of folks who tune into bytecast who are excited about computing, who are potentially interesting in establishing a career in the computing space, and much like you, potentially in the interdisciplinary field of study. And I know you provided some interesting nuggets or perspectives on your work in music and some of the models of students that you found to have great success. But more generally, what advice would you give to young professionals, young researchers, students who are interested in computing more broadly and potentially interdisciplinary field of study? Well, I think the best advice I can give, and it's advice that many others have given, so it's nothing new, but pursuing your passions is really important, that you will be
Starting point is 00:34:26 tempted to do what people tell you to do or do what your parents tell you to do or do whatever there is out there that's offering the most money and the most income. And ultimately, I think that the thing that people should prioritize is having fun and enjoying themselves and being proud of their work. And you can achieve that by following what you're passionate about, and at least not having regrets of leaving opportunities behind that you really wish you had pursued. And I think it's a difficult world to do that. And so it's difficult to give that advice to people because it doesn't guarantee that you'll be successful. But I think it's still a good thing to aspire to and a way to provide some guidance and look for the right opportunities
Starting point is 00:35:20 and make the right decisions. I think that is a great piece of advice to pursue your passions, to find things that provide you joy and fun and allow you to live a sense of purpose. And I think coming from you as somebody who's been able to live that advice and live that mission, I'm sure a lot of our listeners will take great inspiration in your journey
Starting point is 00:35:41 and not just your advice. So Dr. Dannenberg, thank you so much for joining us on ByteCast. And we're very excited to see the future of the computer music field. Oh, thank you. ACM ByteCast is a production of the Association for Computing Machinery's Practitioner Board. To learn more about ACM and its activities, visit acm.org. For more information about this and other episodes, please visit our website at learning.acm.org.
Starting point is 00:36:19 That's learning.acm.org.
