AI in the kitchen

Designing with Stable Diffusion and ControlNet

Doug Cook, Aug 2023

We recently experimented with ControlNet, an extension to Stable Diffusion, the popular open-source text-to-image model released by the CompVis group, Runway, and Stability AI.

ControlNet is a neural network structure that adds control over diffusion models in image and video creation. It addresses the problem of spatial consistency by letting you specify which structural features of a source image should be preserved, giving designers more control over the composition of AI-generated images.

With ControlNet, you can extract things like depth maps, poses, and edge lines from an image to inform new generations, avoiding random compositions or the need to rely on a seed. This process goes by a few different names, but is commonly referred to as annotation or conditioning. From a workflow perspective, it's just a form of preprocessing that takes place before generation.
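The preprocessing step can be illustrated with a small, dependency-free sketch. It uses a Sobel filter as a simplified stand-in for the edge detectors typically paired with ControlNet (such as Canny). The function name `sobel_edges`, the threshold, and the toy image are assumptions made for illustration, not the exact preprocessor described above.

```python
def sobel_edges(gray, threshold=128):
    """Return a binary edge map (0 or 255) for a 2D list of grayscale values."""
    height, width = len(gray), len(gray[0])
    edges = [[0] * width for _ in range(height)]
    # Skip the one-pixel border, where the 3x3 kernel does not fit.
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            # Horizontal gradient: right column minus left column.
            gx = (gray[y - 1][x + 1] + 2 * gray[y][x + 1] + gray[y + 1][x + 1]
                  - gray[y - 1][x - 1] - 2 * gray[y][x - 1] - gray[y + 1][x - 1])
            # Vertical gradient: bottom row minus top row.
            gy = (gray[y + 1][x - 1] + 2 * gray[y + 1][x] + gray[y + 1][x + 1]
                  - gray[y - 1][x - 1] - 2 * gray[y - 1][x] - gray[y - 1][x + 1])
            magnitude = (gx * gx + gy * gy) ** 0.5
            edges[y][x] = 255 if magnitude >= threshold else 0
    return edges


# Toy 5x6 "photo": dark on the left, bright on the right.
image = [[0, 0, 0, 255, 255, 255] for _ in range(5)]
edge_map = sobel_edges(image)
for row in edge_map:
    print(row)
```

The white pixels in the result trace the boundary between the two regions, which is the kind of structural outline a control map carries forward. A production preprocessor would also smooth the image and thin the edges.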

Figure: The workflow runs from an uploaded image through edge detection, a text prompt, and Stable Diffusion to an output image.

To test this approach, we used ControlNet to extract depth and edge lines from a photo of our studio kitchen to create a template, called a control map, for generating new designs.

Combined with a text prompt, we created a variety of different designs and treatments based on our original photo. While each is stylistically different, the architectural elements of our kitchen are maintained between designs.
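One way to script this pairing of a single control map with several prompts is sketched below, assuming the Hugging Face `diffusers` library. The article does not specify its tooling, so the model IDs, the helper names `load_pipeline` and `generate_variations`, and the example prompts are all assumptions rather than the settings behind the kitchen designs.

```python
def load_pipeline(
    base_model="stable-diffusion-v1-5/stable-diffusion-v1-5",  # assumed checkpoint
    controlnet_model="lllyasviel/sd-controlnet-canny",  # assumed edge-conditioned ControlNet
):
    """Load Stable Diffusion with a ControlNet attached (downloads weights)."""
    # Imported lazily so the helper below can be used without these packages.
    import torch
    from diffusers import ControlNetModel, StableDiffusionControlNetPipeline

    use_gpu = torch.cuda.is_available()
    dtype = torch.float16 if use_gpu else torch.float32
    controlnet = ControlNetModel.from_pretrained(controlnet_model, torch_dtype=dtype)
    pipe = StableDiffusionControlNetPipeline.from_pretrained(
        base_model, controlnet=controlnet, torch_dtype=dtype
    )
    return pipe.to("cuda" if use_gpu else "cpu")


def generate_variations(pipe, control_map, prompts, steps=20):
    """Render one image per prompt, reusing the same control map each time."""
    results = {}
    for prompt in prompts:
        output = pipe(prompt, image=control_map, num_inference_steps=steps)
        results[prompt] = output.images[0]
    return results


# Hypothetical prompts, not the ones used for the designs in this article.
EXAMPLE_PROMPTS = [
    "a mid-century modern kitchen, walnut cabinets, warm light",
    "a minimalist kitchen, white marble, natural daylight",
]
```

With the dependencies installed, `generate_variations(load_pipeline(), control_map, EXAMPLE_PROMPTS)` would return one image per prompt, where `control_map` is a PIL image of the extracted edges. Because every call shares the same control map, only the styling should change between results.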

In fact, you can create an almost infinite number of designs with a single control map, using basic prompts to guide each exploration. Below are two quick outtakes, along with the prompts we used to achieve each design.

But this is just one example of what's possible with ControlNet. Imagine creating control maps for an entire digital product or domain. At first glance, this might look like a stylesheet for a set of photographic assets. But the use cases actually go far beyond that. Below are just a few:

Interior and architectural design

With increased fidelity and more mature application of materials and finishes, interior designers could quickly conceptualize new interior designs based on an existing structure or even a sketched design.

Product photography and photo shoots

Object selection, pose detection, and more make it easy to experiment with colors, environments, and textures. Product mockups, marketing collateral, and photo shoots could all be achieved without complex photography, rigs, or lighting.

Digital product customization

By customizing a product's appearance, imagery, and content based on individual user preferences, technologies like ControlNet could provide more personalized experiences. Dynamic adjustments to the look and feel could help increase user satisfaction and engagement, fostering a stronger connection between products and their users.

Characters and environments

Imagine a game that can generate different 3D environments, lighting, and caustics, all based on the same characters or level design. In-game customizations, marketing, and branding could all be produced on the fly without having to re-render each design.

Augmented and virtual reality

By mapping digital content onto virtual or real-world environments, ControlNet could make AR/VR experiences more interactive and realistic, expanding the ways in which digital products can be used.

Internationalization and localization

Abstracting common styles from bitmap images could also provide unique opportunities to internationalize or localize individual assets and images in a digital product. Customization based on locale could improve customer response while providing greater personalization.

For more information on ControlNet, see the original research by Lvmin Zhang and Maneesh Agrawala at Stanford University. Their project is also actively maintained on GitHub.

Have an idea or interested in learning more? Feel free to reach out to us on Instagram or Twitter!

Doug Cook

FOUNDER AND PRINCIPAL

Doug is the founder of thirteen23. When he’s not providing strategic creative leadership on our engagements, he can be found practicing the time-honored art of getting out of the way.
