Abstract
This essay takes the concept of “digital ekphrasis” as an opportunity to look at contemporary multimodal AI, or more precisely at text-to-image generators, understood as the latest phenomenon in the media history of technical images. I raise the question of whether the digitally programmed image generation performed by programs like Stable Diffusion, Midjourney or DALL-E can be thought of as ekphrasis. Following recent discussions in media theory, I ask whether the criterion of operativity is decisive for distinguishing text-to-image generation from ekphrasis in the classical sense. My discussion proceeds by revisiting ekphrasis in the art-historical sense and confronting it with the structural processes of multimodal AI. The comparison of these two modes of ekphrasis, however, reveals how impoverished the concepts of both image and language risk becoming in the context of text-to-image modelling. This, in turn, does not mean that current AI imagery cannot be discussed in terms of a (post-)digital aesthetics. Rather, as recent work by media artists using multimodal AI shows, text-to-image generation reformulates old questions about the relationship between art or aesthetics and (media) technology.