AI in the kitchen
Designing with Stable Diffusion and ControlNet
We recently experimented with ControlNet, an extension to Stable Diffusion, the popular open-source text-to-image model released by Stability AI.
ControlNet is a neural network architecture that adds spatial control to diffusion models for image and video creation. It addresses the problem of spatial consistency by letting you specify which parts of an image should be preserved, giving designers more control over the composition of AI-generated images.
With ControlNet, you can extract things like depth maps, poses, and edge lines from an image to inform new generations, avoiding random compositions or the need to rely on a seed. This process goes by a few different names, but is commonly referred to as annotation or conditioning. From a workflow perspective, it's just a form of preprocessing that takes place before generation.
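To make the preprocessing step concrete, here is a minimal, library-free sketch of deriving an edge-based control map from an image. Real ControlNet workflows typically use dedicated annotators (Canny edges, MiDaS depth, OpenPose, and so on); this simple gradient-threshold version just illustrates the idea that a control map is extracted from a source image before any generation happens. The image and threshold here are synthetic stand-ins.

```python
import numpy as np

def edge_control_map(gray: np.ndarray, threshold: float = 32.0) -> np.ndarray:
    """Return a 3-channel uint8 edge image suitable as a conditioning input."""
    # Horizontal and vertical gradients (prepend keeps the output shape equal
    # to the input shape).
    gx = np.abs(np.diff(gray.astype(float), axis=1, prepend=gray[:, :1]))
    gy = np.abs(np.diff(gray.astype(float), axis=0, prepend=gray[:1, :]))
    # Threshold the gradient magnitude into a binary edge mask.
    edges = ((gx + gy) > threshold).astype(np.uint8) * 255
    # Stack to H x W x 3, the layout generation pipelines usually expect.
    return np.stack([edges] * 3, axis=-1)

# Synthetic stand-in for a kitchen photo: a bright counter on a dark wall.
photo = np.zeros((64, 64), dtype=np.uint8)
photo[20:44, 12:52] = 200
cmap = edge_control_map(photo)
print(cmap.shape)  # (64, 64, 3)
```

The resulting array plays the role of the control map described above: a fixed structural template that every subsequent generation is conditioned on.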
To test this approach, we used ControlNet to extract depth and edge lines from a photo of our studio kitchen to create a template, called a control map, for generating new designs.
Combining the control map with text prompts, we created a variety of designs and treatments based on our original photo. While each is stylistically different, the architectural elements of our kitchen are preserved across designs.
In fact, you can create an almost infinite number of designs with a single control map, using basic prompts to guide each exploration. Below are two quick outtakes, along with the prompts we used to achieve each design.
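In code, this one-map-many-prompts workflow is just a loop over prompts with the conditioning image held fixed. In practice the pipeline would be something like diffusers' StableDiffusionControlNetPipeline loaded with a ControlNet checkpoint; the sketch below swaps in a stub pipeline so the loop itself is runnable, and the prompts and filename are purely illustrative.

```python
from typing import Callable, Dict, List

def explore_styles(pipe: Callable, control_map, prompts: List[str]) -> Dict[str, object]:
    """Generate one image per prompt, conditioning every run on the same
    control map so composition stays fixed while style varies."""
    return {prompt: pipe(prompt=prompt, image=control_map) for prompt in prompts}

# Stub pipeline: stands in for a real text-to-image pipeline and simply
# records what it was asked to render.
def fake_pipe(prompt: str, image) -> str:
    return f"render of {prompt!r} conditioned on control map {image!r}"

results = explore_styles(
    fake_pipe,
    "kitchen_edges.png",  # hypothetical control map from the annotation step
    [
        "mid-century modern kitchen, warm wood, morning light",
        "industrial kitchen, concrete and brushed steel",
    ],
)
print(len(results))  # 2
```

Because the control map never changes, every prompt in the list yields a different treatment of the same underlying composition.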
But this is just one example of what's possible with ControlNet. Imagine creating control maps for an entire digital product or domain. At first glance, this might look like a stylesheet for a set of photographic assets. But the use cases actually go far beyond that. Below are just a few:
Interior and architectural design
As fidelity improves and materials and finishes are applied more convincingly, interior designers could quickly conceptualize new interiors based on an existing structure, or even a sketched design.
Product photography and photo shoots
Object selection, pose detection, and more make it easy to experiment with colors, environments, and textures. Product mockups, marketing collateral, and photo shoots could all be achieved without complex photography, rigs, or lighting.
Digital product customization
By customizing a product's appearance, imagery, and content based on individual user preferences, technologies like ControlNet could provide more personalized experiences. Dynamic adjustments to the look and feel could help increase user satisfaction and engagement, fostering a stronger connection between products and their users.
Characters and environments
Imagine a game that can generate different 3D environments, lighting, and caustics, all based on the same characters or level design. In-game customizations, marketing, and branding could all be handled on the fly without having to re-render each design.
Augmented and virtual reality
By mapping digital content onto virtual or real-world environments, ControlNet could deepen immersion, enabling more interactive and realistic AR/VR experiences and expanding the ways digital products can be used.
Internationalization and localization
Abstracting common styles from bitmap images could also create unique opportunities to internationalize or localize individual assets and images in a digital product. Customization based on locale could improve customer response while providing greater personalization.