For Google Cloud Next ‘23, the Foundry for AI by Rackspace (FAIR™) team designed a fantastic app that combined interactive fun with the power of generative AI at our booth. With the help of Vertex AI’s Imagen, we crafted a web application that allowed users to input prompts and generate images of their choosing. To add an extra layer of realism to the experience, we connected this app to a printer and used sticker paper, ensuring that participants could take home their very own stickers as souvenirs.
The outcome was the user-friendly interface above, which empowers users to create stickers based on their interactions with the AI. We had a blast testing it for hours on end, but it was even more heartwarming to witness many people visiting our booth, some returning multiple times to engage with the app and grab a sticker of their choice.
First up, the technical bits: here’s the high-level application design, built on Google Cloud native services and products.
The process begins with a React-based front-end interface, where users enter a text prompt. The prompt then passes through a moderation step that uses the Cloud Natural Language API and an internally developed model endpoint.
Once the text prompt is deemed safe, it is sent to either the Imagen or the Stable Diffusion model endpoint, which generates an image from the text. The resulting image then goes through an image moderation step to ensure the visual content is safe and appropriate.
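The flow described above (moderate the prompt, generate the image, moderate the image) can be sketched roughly as follows. This is a minimal illustration, not the code that ran at the booth: `StickerResult`, `generate_sticker`, and the three `demo_*` stand-ins are hypothetical names. In a real deployment the three callables would wrap the text moderation services, the chosen image model endpoint (Imagen or Stable Diffusion), and the image moderation step.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class StickerResult:
    status: str                    # "ok", "prompt_rejected", or "image_rejected"
    image: Optional[bytes] = None  # image bytes, set only when status == "ok"


def generate_sticker(
    prompt: str,
    is_prompt_safe: Callable[[str], bool],
    generate_image: Callable[[str], bytes],
    is_image_safe: Callable[[bytes], bool],
) -> StickerResult:
    """Run the three stages in order and stop at the first failed check."""
    if not is_prompt_safe(prompt):
        # Unsafe prompts never reach the image model.
        return StickerResult(status="prompt_rejected")
    image = generate_image(prompt)
    if not is_image_safe(image):
        # The image is discarded rather than shown or printed.
        return StickerResult(status="image_rejected")
    return StickerResult(status="ok", image=image)


# Hypothetical stand-ins so the sketch runs without any cloud services.
def demo_prompt_check(prompt: str) -> bool:
    blocked = {"gore"}  # assumed toy blocklist, not a real moderation policy
    return not (blocked & set(prompt.lower().split()))


def demo_generate(prompt: str) -> bytes:
    return f"<image for: {prompt}>".encode("utf-8")


def demo_image_check(image: bytes) -> bool:
    return len(image) > 0


if __name__ == "__main__":
    result = generate_sticker(
        "a pink unicorn", demo_prompt_check, demo_generate, demo_image_check
    )
    print(result.status)
```

Passing the stages in as callables keeps the ordering logic independent of which model endpoint or moderation service sits behind each step.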
Moving on from the technical details, let’s look at what we found more interesting: how people interacted with the AI.
Cat lovers can rejoice: cats won out over dogs, but only by a very small margin.
75-word prompt:
“Couple on a bench watching photorealistic shot of a starry night sky with meteor shower. A magnified view of the moon in the sky. A lone tree in the foreground of the picture with a shadow view”
50-word prompt:
“A cartoon of a pink unicorn and a purple unicorn playing with a background of a bright blue sky with fluffy white clouds and a bright rainbow with stars in the sky”
20-word prompt:
“A Pomeranian is sitting on the kings throne wearing a crown. Two tiger soldiers are standing next to the throne”
Single-word prompt:
There were many single-word prompts, which produced more generic images.
All said, the team had loads of fun building, testing and watching the application in action at the booth. We will leave you with some takeaways.
- AI image generation is quite advanced, and it is amazing what it has managed to create from general descriptions.
- AI still does not completely replace humans for content creation. If you look closely at the images above, you will find small flaws such as poorly constructed hands or feet. These need to be fixed, but the generated images definitely give a great starting point for new content.
If we were to suggest some enhancements, we would recommend the following:
- Educate end-users on effective communication with these systems. Unlike web searches where keyword usage is the focus, providing more elaborate descriptions in your prompts can yield better results.
- Implement prompt engineering to facilitate smoother communication between humans and AI. This approach can involve behind-the-scenes prompt adjustments or offer interactive guidance to users. For instance, we could modify the initial prompt and generate images based on both the original and altered prompts, allowing users to choose the most appealing results.
- Transition to a more context-based conversational model, akin to ChatGPT, where an image is produced and then refined through successive prompts.
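As a rough illustration of the last two suggestions, the sketch below builds an altered prompt alongside the original and keeps a running prompt across successive refinements. Everything here is an assumption for demonstration: `STYLE_SUFFIX`, `build_prompt_variants`, `generate_candidates`, and `RefinementSession` are hypothetical names, and `generate_image` stands in for whichever image model endpoint is used. Folding refinements into one growing prompt is a simplification; a truly conversational model would also take the previous image into account.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Assumed style hint appended behind the scenes; not taken from the booth app.
STYLE_SUFFIX = "sticker style, bold outlines, plain white background"


def build_prompt_variants(user_prompt: str, style_suffix: str = STYLE_SUFFIX) -> List[str]:
    """Return the original prompt plus one altered version of it."""
    original = user_prompt.strip()
    if not original:
        raise ValueError("prompt must not be empty")
    altered = f"{original.rstrip('.')}, {style_suffix}"
    return [original, altered]


def generate_candidates(
    user_prompt: str, generate_image: Callable[[str], bytes]
) -> Dict[str, bytes]:
    """Generate one image per variant so the user can pick a favorite."""
    return {p: generate_image(p) for p in build_prompt_variants(user_prompt)}


@dataclass
class RefinementSession:
    """Tracks a base prompt and the refinements a user adds turn by turn."""

    base_prompt: str
    refinements: List[str] = field(default_factory=list)

    def refine(self, instruction: str) -> str:
        """Record one refinement and return the prompt for the next image."""
        self.refinements.append(instruction.strip())
        return self.current_prompt()

    def current_prompt(self) -> str:
        return ", ".join([self.base_prompt, *self.refinements])


if __name__ == "__main__":
    for variant in build_prompt_variants("A Pomeranian on a throne."):
        print(variant)
    session = RefinementSession("A Pomeranian on a throne")
    print(session.refine("wearing a crown"))
```

Showing the user both results side by side keeps the behind-the-scenes prompt adjustment transparent, since the image from the unmodified prompt is always one of the options.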
That wraps up our review. Generative AI got its fair share of buzz at Google Cloud Next ’23, as it has massive potential to enhance business processes. Prompt-based AI tools will ultimately evolve into faster, simpler solutions that solve a wide range of business problems and eliminate bottlenecks. While Google Cloud has been at the forefront of making these technologies accessible to customers, successfully integrating generative AI requires both strategic and tactical approaches to data platform management, data lifecycle oversight, ethics, and security.
Read more about key announcements at Google Next ’23.