Some Musings on AI in Art, Music and Protein Design

When I started my PhD in late 2018, AI hadn’t really entered the field of de novo protein design yet – at least not in a big way. Rosetta’s approach of sampling and scoring side-chain rotamers on a fixed backbone was still the gold standard for the ‘structure-to-sequence’ problem. And of course before long we had AI making waves in the structure prediction field, eventually culminating in the AlphaFold2 we all know and love.

Now, towards the end of my PhD, we are seeing the emergence of new generative models that learn from existing protein sequence and structure data to produce sequences that will (or at least should) fold into viable, sensible and, crucially, natural-looking shapes. ProtGPT2 is a good example (https://www.nature.com/articles/s41467-022-32007-7), but there are several more. How long before these models start reliably generating not only shapes but functions too? The jury’s out, but it’s looking more and more feasible. Safe to say the field as a whole has evolved massively during my time as a graduate student.

This comes at a time when AI is encroaching further and further into the fields of art/graphic design and prose/text generation. Right now I’m seeing people post AI-generated avatars of themselves, produced by the Lensa photo-editing app (https://www.theguardian.com/us-news/2022/dec/09/lensa-ai-portraits-misogyny), on their Instagram feeds, while a Guardian article entitled ‘AI bot ChatGPT stuns academics with essay-writing skills and usability’ (https://www.theguardian.com/technology/2022/dec/04/ai-bot-chatgpt-stuns-academics-with-essay-writing-skills-and-usability) recently popped up on my Facebook newsfeed.

There’s a subtle difference between these endeavours – using AI to design functional proteins takes something that human designers have had little success at beyond Frances Arnold-style directed evolution, and attempts to make it a reality. But AI-generated art and text takes two things that humans have traditionally had no trouble doing and attempts to do them in a far shorter timeframe for little or no cost. How ethical is this? Lensa is already being called out for effectively plagiarising the artists whose works make up its training set; meanwhile chatbots have a history of being accused of perpetuating racist and sexist stereotypes due to their training set being… well, the internet (https://www.theverge.com/2016/3/24/11297050/tay-microsoft-chatbot-racist).

But say a workaround is eventually found for the issue of training set ownership in AI graphics; say, for example, various artists’ works are bought for the sake of training AIs, or perhaps a large number of artists are hired to create works for the sole purpose of constructing suitable training sets for generative models. Will this become the new paradigm? Will artistic tasks that have traditionally been seen as purely human start to be routinely generated by AI models? Arguably this is already the case for some areas of prose generation – or at least if the myriad ‘this advert was written by AI’-type adverts I see online are to be believed.

A part of me worries that a knock-on effect of this could be a gradual reduction in the number of capable human artists and writers. If most bread-and-butter digital graphics and text generation tasks can be done pretty well by AI – or at least well enough to be worth the massively reduced cost compared to hiring a person – then this means far fewer training opportunities for human creatives: far fewer chances to practise and hone their skills.

Whilst not AI per se, I see parallels with how technology has changed the way popular music is both consumed and produced. Over the last century or so, as day-to-day provision of music in venues such as bars and clubs has shifted further towards speakers hooked up to a source of audio (i.e. radio or vinyl/cassette/CD/Spotify) instead of groups of human instrumentalists and vocalists, so too has the number of bread-and-butter music performance jobs decreased. Fewer regular performance jobs mean fewer chances for instrumentalists and vocalists to practise and hone their skills (in particular their ensemble performance skills), resulting in a smaller pool of highly skilled, well-trained performers.

Many highly successful bands from the past half-century or so were successful because they comprised the cream of the crop from a larger pool of day-to-day performers, but with a smaller pool to fish from, many popular musicians now tend to find greater success working solo (either in a performer sense or a producer sense). Just ask Wolf Alice (https://www.theguardian.com/music/2021/mar/18/why-bands-are-disappearing-young-people-arent-excited-by-them). Unless there is a sudden drive by record companies to search for and recruit pop music-oriented performers with intent to form groups, it is far less likely that the next Beatles will arise through chance encounters within this smaller pool.

And whilst graphical art and the written word are traditionally less group-mediated art forms and more solo affairs, I do wonder if an analogous future reduction in the talent pool, driven by advances in AI-generated content, may lead to a similar change in the creative landscape – specifically, the human-generated creative landscape. It would be a huge shame if, by using AI to cut corners in creative industries, we end up starving future human artists of the resources and opportunities they need to flourish, and in doing so leave ourselves, ironically, with only the AI content to turn to.
