
Fantasy on Demand: The Temptation of Text-to-Image AI

History, Theory & Criticism



Abstract

This paper explores the implications of text-to-image artificial intelligence (T2I AI) on architectural design practice. It begins with an overview of generative artificial design. It then explains why T2I AI is the focus of this study, and its potential impact on design methodology and on the day-to-day practice of architects and designers. We draw parallels between the effects of T2I AI and past architectural and urban design futurists to situate this technology within a larger historical context. Finally, we test the effects of T2I AI on architectural design methodology, demonstrating its potential for architects to conceptualize, design, and communicate ideas. Architects should actively explore the possibilities of T2I AI as a design tool but understand its limitations. T2I AI can assist designers and stakeholders through visual inspiration, but it is not a tool to replace the human expertise required to identify, address, and solve complex challenges in the built environment.

 

Keywords: Architectural Design, Artificial Intelligence, Text-to-Image

Introduction

Architects and urbanists have long been fascinated by the potential for new technologies to shape the built environment, and the emergence of Artificial Intelligence (AI) is no exception. Some argue that, as in many other industries, AI has the potential to drastically alter architecture, urbanism, and the future of cities. Others, however, view the new technology as only potentially transformative and less of an existential threat. As discourse over AI technology continues, it is imperative that experts explore the technology and consider its true impact. Foster + Partners, the largest architecture firm in the United Kingdom, has been exploring AI and machine learning tools in the early conceptual design phase of its projects. “In addition to using reference images or studying precedents online, we can now use these tools to quickly illustrate an idea or a feeling that we want a particular space to evoke,” says Martha Tsigkari, head of Applied R+D at Foster + Partners. “This side of machine learning is shaping up to be a useful tool for drawing inspiration” (Berg 2022).


Architectural Design and T2I AI

Like many creative endeavors, architectural design is not about flashes of inspiration or sudden insights, but rather the result of persistent effort, experimentation, and refinement. Architects and designers embrace the complexity of the world around them and recognize that the built environment is not a passive backdrop, but an active agent that shapes and is shaped by our actions. Style and form are not sufficient to address the complex challenges we face today (technical, economic, social, and environmental), as these require rigorous analysis, creative thinking, and disciplined execution.

In a world where the challenges we face are increasingly interconnected and urgent, it is important for architecture and design to be not merely an expression of aesthetics but a vital tool for confronting issues such as climate change and social inequality. Ultimately, it is our ability to ask the right questions, uncover the needs of our clients and communities, and respond with intelligent, thoughtful solutions that sets true design apart from Artificial Intelligence (AI). Text-to-image (T2I) AI generates visual content by using a combination of machine-learning algorithms, neural networks, and image processing techniques, but has no understanding of the norms and values of serious architectural design methodologies. It has no awareness of the broad range of complex issues architects, designers, and urban planners need to address.

Figure 1. An initial set of thumbnails created on the first round of data input into Midjourney.

Figure 2. An upscaled, more detailed iteration of the initial Midjourney rendering.

The basic process of creating an image in T2I AI starts with a text prompt on a Discord server. In a message prompt, a user begins by typing “/imagine,” which activates the image-generating bot of the popular AI platform Midjourney. One could enter only a few words or a lengthy description, but a common format is to begin by defining the desired image’s style, followed by the subject and any adjectives or modifiers, and to end with specific parameters. Parameters allow users to shape the image options, give more importance or “weight” to some descriptive words than others, or set a specific aspect ratio (Wolff 2023). Once the description is entered, the bot generates four thumbnails (see Figure 1). From there, the user can either “upscale” one or more of the images to generate larger, more detailed versions of them (see Figure 2) or prompt the bot to provide four more variations of any of the four original images. This offers a quick and easy way to generate multiple options and explore different design possibilities. The process can be repeated as many times as needed until the desired image and description are obtained.
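The prompt format described above (style, then subject, then modifiers, then parameters) can be sketched as a small helper. This is only an illustration under stated assumptions: the `PromptSpec` class and `build_prompt` function are hypothetical names, not part of Midjourney, and the `--ar` aspect-ratio flag is assumed to follow the syntax covered in the cited cheat sheet, which may differ between Midjourney versions.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class PromptSpec:
    """Hypothetical container for the prompt parts described in the text."""
    style: str
    subject: str
    modifiers: List[str] = field(default_factory=list)
    aspect_ratio: Optional[str] = None  # e.g. "16:9"; assumed "--ar" syntax


def build_prompt(spec: PromptSpec) -> str:
    """Join style, subject, and modifiers, then append parameters last."""
    parts = [spec.style, spec.subject, *spec.modifiers]
    # Drop empty entries so stray commas never reach the prompt.
    text = ", ".join(p.strip() for p in parts if p.strip())
    if spec.aspect_ratio:
        text += f" --ar {spec.aspect_ratio}"
    return text


if __name__ == "__main__":
    spec = PromptSpec(
        style="architectural rendering",
        subject="linear city in a desert",
        modifiers=["mirrored facade", "aerial view"],
        aspect_ratio="16:9",
    )
    # The resulting string is what a user would type after the bot command.
    print(build_prompt(spec))
```

Keeping the parts separate makes it easy to change one keyword at a time and compare the resulting thumbnails, which mirrors the iterative process the text describes.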

T2I AI can create high-quality images with specific detail in a broad range of styles and character. The images can be anything, from a landscape to an architectural structure, or even a piece of furniture. Viewpoints can be from any perspective or angle. The images are created by a neural network that has been trained on a vast dataset of images, allowing it to reproduce the patterns and features of nearly anything. The bot functions as a facilitator of collaboration, in effect assembling a kind of collage from patterns learned across the Internet’s graphic content. Because of the ease, speed, and ability of the bot to generate images, it is an enticing tool for testing the breadth of a limitless architectural future, rather than replicating present conventions.

 

There are several nuances in the process of creating the images. First, the selection of keywords and phrases has a significant impact on the outcome of the generated images. Using conventional descriptions such as “building,” “city,” and “street” led to more predictable, mundane, and in some cases incoherent expressions of the built environment. Attempts to create derivatives or replicate existing concepts were also challenging. This point was illustrated after entering a description of the real-world, futurist city proposal, The Line, which is planned in the Kingdom of Saudi Arabia. The Line is a linear megacity wrapped in a mirror-like material that is projected to hold nine million people (see Figure 3). As a purely graphic exercise, its overall features are clear and distinctive. It is 500 meters high, 200 meters wide, and 170 kilometers long. Attempting to generate a large, mirror-walled structure with proportions similar to those of The Line using T2I AI was challenging: the scale of the generated image was often underwhelming (Figure 4), and when “mirror” was substituted with other words, the results displayed nondescript coastal cityscapes (Figure 5). More promising attempts were created using words and phrases that were either more abstract or not typically associated with architecture. Using words such as “hot-air balloon,” “greenery,” or “interstellar” led to a broader interpretation and expression of believable concepts, even if they were rooted in fantasy.
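To give a sense of why The Line's proportions are hard to capture in a single generated image, the dimensions quoted above can be turned into simple ratios. The figures below come straight from the text; the derived footprint and density are back-of-the-envelope arithmetic for illustration only, not official project data.

```python
# Dimensions of The Line as stated in the text, in meters.
HEIGHT_M = 500
WIDTH_M = 200
LENGTH_M = 170_000  # 170 kilometers
POPULATION = 9_000_000  # projected residents, per the text

# Proportions: how many times longer the structure is than it is tall or wide.
length_to_height = LENGTH_M / HEIGHT_M
length_to_width = LENGTH_M / WIDTH_M

# Ground footprint in square kilometers, and the implied density over it.
footprint_km2 = (LENGTH_M * WIDTH_M) / 1_000_000
density_per_km2 = POPULATION / footprint_km2

print(f"length : height = {length_to_height:.0f} : 1")
print(f"length : width  = {length_to_width:.0f} : 1")
print(f"footprint       = {footprint_km2:.0f} km^2")
print(f"implied density = {density_per_km2:,.0f} people per km^2")
```

A structure that is 340 times longer than it is tall cannot be shown whole at a legible scale in a conventional image frame, which may help explain why the generated results felt underwhelming in scale.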

Another useful tactic was to input well-known architectural and urban design sources, precedents, and/or architects to the image generator. This resulted in a clearer outcome that seemed to bridge science-fiction imagery with futuristic architecture. Providing examples of notable architects and their work was useful for Midjourney as an information source, similar to asking ChatGPT to write something in the style of a particular author. This is underscored by the work of Zaha Hadid Architects, where designers are encouraged to use T2I AI and enter “Zaha Hadid” as part of their working process, which creates a feedback loop of design exploration (Budds 2023).

Human decision and interaction are still necessary throughout this process. The curation and discernment of the images (how they are refined, which are selected, and what purpose they might ultimately serve) cannot be decided by artificial intelligence. T2I AI is clearly about image production and not about providing a framework to create architecture in the real world. Using T2I AI as a visual tool can be exhilarating and unpredictable; just a few keywords can produce a vast array of outcomes in seconds, each having the potential to inspire new iterations and ideas.


Figure 3. Rendering of The Line, the 170-kilometer-long city under construction in Saudi Arabia. © NEOM

While T2I AI can provide a visual “sugar rush,” the emergence of the technology has raised concerns among architects and educators about its impact on students and professionals alike. Architectural education and professional practice both emphasize the importance of understanding and responding to a wide range of real-world issues. Focusing solely on subjective images and aesthetics, such as those T2I AI provides, is not an acceptable approach to addressing serious needs and issues. Therefore, the question arises: Does this technology have any true value for architects, urban planners, and designers? Slightly altering key phrases, like pulling the handle on a slot machine, yields an extensive range of graphic outcomes and thus reveals T2I AI’s greatest power: providing answers to “what if?” in compelling ways.

Figure 4. Using keywords describing The Line literally, the AI-generated image bore little resemblance to the original, particularly in terms of scale.

Figure 5. Removing the word “mirror” from the text produced nondescript coastal cityscapes.

Figure 6. Étienne-Louis Boullée’s Cenotaph for Sir Isaac Newton, 1784, is in some ways a pre-digital work of AI—its purpose was to express Newtonian ideas and symbolism, not to be a real building. Source: Miller, 2014

Figure 7. Claude-Nicolas Ledoux, House of Surveyors of the Loue River (1804), is the visionary work of an architect, with no intention to build. Source: Ricci, n.d.

T2I AI as Architectural “What Ifs?”

While seductive and instantaneous images are not the goal of architecture, the value of T2I AI lies in its ability to inspire new ideas. It’s a tool free from constraints, which can facilitate unconventional design solutions. Proposing grand, audacious, and intentionally conceptual ideas is not new in architecture. Throughout history, there has been a consistent parallel form of expression and thought, separate from everyday practice, intended to push the boundaries of what is possible.

 

Étienne-Louis Boullée and Claude-Nicolas Ledoux were French architects who lived during the 18th century, a time of great intellectual and artistic innovation in Europe. Both were leaders of the Neoclassical movement, which sought to revive the classical forms and principles of ancient Greece and Rome. Their designs were bold and innovative, often characterized by massive scale and powerful geometry.

 

One of the most notable works of Boullée was his Cenotaph for Sir Isaac Newton (Figure 6), which he designed in 1784, 57 years after Newton’s death (Miller 2014). The Cenotaph was not intended to be a practical or functional building, but rather an avant-garde work of architecture. Boullée’s design was highly conceptual and was meant to express the grandeur and significance of Newton’s achievements through a monumental, symbolic form.

 

Ledoux’s most famous project, his 1804 design for the House of Surveyors of the Loue River (see Figure 7), was also intended to be visionary work. The project was part of Ledoux’s larger plan for a utopian city called Chaux, which grew out of the royal saltworks he had built at Arc-et-Senans in France. It was a vision of a self-sufficient community, with all the necessary functions of daily life contained within a single, massive complex. The buildings were to be surrounded by gardens and farmland, with a central courtyard containing a fountain and statue (Etlin 1995).

 

Avant-garde groups such as Archigram (1961–74) are modern examples of conceptual thinkers who proposed radical and innovative concepts that challenged notions of architecture and urban design (Moore 2018). Their ideas were diverse and futuristic. Their most notable concepts include the “Plug-in City,” the “Walking City,” and the “Instant City.” The “Plug-in City” was a modular and adaptable city that could be built and rebuilt as needed, while the “Walking City” was made up of giant, self-contained robots or “pods” that could move around and reconfigure themselves based on the needs of their inhabitants. The “Instant City” (see Figure 8), on the other hand, was a concept for creating temporary and mobile structures that could be deployed quickly to meet the needs of people in emergency situations. While these concepts may have seemed far-fetched at the time, they have had a profound impact on the field of architecture and continue to inspire new ways of thinking about the built environment. By pushing the boundaries of what is possible in architecture, groups like Archigram have influenced architects to explore new possibilities.

 

Archigram’s influence on real-world architecture can be seen in many contemporary buildings and designs that have been inspired by their futuristic and unconventional concepts (see TalkingTall, page 58). For example, one could compare the design of the Centre Pompidou in Paris (see Figure 9), completed in 1977 and designed by Richard Rogers and Renzo Piano, to the Plug-in City concept. The 20th-century high-tech design was driven by the concept of “inside-out” or “reverse architecture,” where the building’s mechanical and functional systems are exposed on the outside. This design approach allows for flexible, column-free, and adaptable interior spaces, with the ability to reconfigure over time. The building’s colorful pipes and ducts are not merely decorative but serve a functional purpose, distributing air, water, and electricity throughout the structure.

 

Another megaproject echoing Archigram’s out-of-the-box conceptual thinking is the Taipei Performing Arts Center (see Figure 10), designed by OMA. The multi-functional cultural center is located in the Shilin District of Taipei. The project’s primary concept was to create a space that seamlessly integrates the arts and the surrounding urban context. Archigram’s Instant City concept, in turn, was a proposal for a flexible and adaptable urban infrastructure. Today, large-scale modular architecture has become a practical solution for many building projects, offering benefits such as cost savings, time efficiency, and sustainability. The modular construction of 461 Dean (see Figure 11) in Brooklyn, New York, designed by SHoP Architects and built by Forest City Ratner, is an example of a large-scale project that focuses on prefabrication and adaptability.

Figure 8. “Instant City” was a concept for creating temporary and mobile structures that could be deployed quickly to meet the needs of people in emergency situations. © Peter Cook

Figure 9. Centre Pompidou, Paris, designed by Renzo Piano and Richard Rogers, was presaged by Archigram’s megastructure designs. © Jean-Pierre Dalbéra

T2I AI in Everyday Applications

In architecture, there is value in envisioning big ideas, and a compelling image generated by T2I AI can inspire new concepts. AI doesn’t share our self-imposed rules or personal biases, meaning it could add a fresh perspective on design problems. The ability to use T2I AI as a tool to spark new and innovative ideas is like having an extra team member with a unique perspective. Several prominent global architecture firms have the resources to establish in-house research and development teams to test big ideas and enter competitions. With the invention of T2I AI, a readily available, inexpensive tool becomes accessible for small- and medium-sized firms to explore.

Designers face constraints, such as client preferences, budgets, zoning regulations, and building codes, which, in turn, shape the design process. T2I AI does not account for real-world constraints and conditions, which leaves room for unconventional design ideas and perhaps even a future real-world solution. AI technology could assist in creating something truly innovative.

The ability to rapidly generate images through AI technology can foster an innovative design process, involving diverse groups of stakeholders at an earlier stage. This can enable architects and designers to better understand the needs and perspectives of various groups and incorporate them into designs that reflect the unique needs of the community. With the help of AI-generated images, community members and other stakeholders can create and share their own compelling images, facilitating communication and collaboration throughout the design process. This ultimately leads to more informed decisions and a stronger sense of community investment in the final product.

Conclusion

Despite all the excitement, it’s worth remembering that AI is still in its infancy and rapidly evolving. Current versions of T2I AI, readily available to the public, have limited usefulness in professional architectural design. It could aid architects and designers, but is not a replacement. The most promising use of T2I AI is early concept and idea generation or non-specific, open-ended, conceptual studies. Even with that narrow application, it should be noted that T2I AI would not be able to respond to specific design goals or form concepts around the particular needs of people, which is important in the early stages of design. T2I AI would not be suitable for visualizing real-world architectural projects with specific contexts and technical drawings, such as images needed for successful permitting and city-level approvals. The risk of project delay, cost uncertainties, and inaccurate depictions of projects in high-stakes city approval processes is too great to use the current forms of T2I AI applications.

Architectural design and creativity involve responding to the needs of clients, with a serious and sustained effort to reveal and respond to complex social, environmental, and technical challenges for the human condition. The strength of T2I AI, and the visionary imagery it can create, may help inspire design teams, clients, and city officials to tackle today’s challenges, but it cannot solve them. While T2I AI can provide another tool to add to our design process, we must be wary of the temptation of instant, fantastic images of artificial architecture. However, we should continue to embrace the unexpected and use it as one of many starting points for thought, creativity, and innovation. As designers, we should be willing to use any tool at our disposal that could potentially lead to the betterment of the human condition.

See more examples of Hewitt’s Midjourney AI explorations at: ctbuh.org/Nageletext-2- AI. 

Figure 10. The Taipei Performing Arts Center by OMA is another megaproject echoing Archigram’s out-of-the-box conceptual thinking. © Supanut Arunoprayote.

Figure 11. The prefabricated 461 Dean project in Brooklyn, New York City, can be seen as a latter-day advancement of Archigram’s Instant City. © Kidfly182 (cc by-sa)

About the Author

Julia Nagele brings more than 20 years of experience to her role as senior principal and director of design at HEWITT Architecture. Her design approach in HEWITT Architecture’s classic studio environment is similar to the project-based learning she leads as an affiliate assistant professor at the University of Washington’s College of Built Environments.

Nagele is known for her design of tall buildings. Her 40-story Emerald Tower, completed in 2020, is the tallest building on the West Coast created by an all-female design team. Mama Tower, which is moving through permitting, will reach 484 feet (147.6 meters). Nagele graduated cum laude with a Bachelor of Arts in Architecture from Lehigh University and completed her Master of Architecture degree at the University of Maryland’s School of Architecture, Planning and Preservation.

References

Berg, N. (2022). “AI Tools Like DALL-E 2 and Midjourney are Helping Architects – and Their Clients – Design New Buildings.” Fast Company. https://www.fastcompany.com/90780871/ai-tools-like-dall-e-2andmidjourney-are-helping-architects-and-their-clientsdesign-new-buildings.

Budds, D. (2023). “The Zaha Hadid AI Feedback Loop.” Curbed. https://www.curbed.com/2023/04/zaha-hadidarchitects-ai-design-midjourney-dall-e-2.html.

Etlin, R. (1995). Symbolic Space. Chicago: University of Chicago Press.

Hobson, B. (2020). “Archigram’s Instant City Concept Enables ‘A Village to Become A Kind of City for A Week’ Says Peter Cook.” Dezeen. https://www.dezeen.com/2020/05/13/archigram-instant-city-peter-cook-video-interview-vdf.

Miller, M. (2014). “AD Classics: Cenotaph for Newton / Etienne-Louis Boullée.” ArchDaily. https://www.archdaily.com/544946/ad-classics-cenotaph-for-newton-etiennelouis-boullee.

Moore, R. (2018). “The World According to Archigram.” The Guardian, 18 November 2018. https://www.theguardian.com/artanddesign/2018/nov/18/archigram-60s-architectsvision-urban-living-the-book.

Ricci, B. (n.d.). “Imagining New Worlds. or How Architecture Has Dreamt of Utopia.” Artland Magazine. https://magazine.artland.com/imagining-new-worlds-or-how-architecturehas-dreamt-of-utopia/.

Wolff, T. (2023). “The Midjourney Cheat Sheet (v5).” Medium. https://medium.com/design-bootcamp/the-midjourneycheat-sheet-v5-54b5fd92d2da.
