In 2022, excitement surrounding AI surged with growing interest in large language models such as OpenAI’s GPT-3.5 (the model behind ChatGPT) and image generation models such as Stable Diffusion and Midjourney, among others.
The narrative began to shift toward how AI tooling might shape the future, and how and where it would interact with different technologies and creative processes. That discussion continues today. Seemingly every day brings a new generative model, a new version or feature, or a new paradigm. AI is becoming a more ingrained and fundamental part of our lives, affecting how we live, work, and create.
At Fiction Tribe, our development team thrives on staying at the forefront of innovative and impactful technologies such as AI and the use cases offered by generative models. That gave us the opportunity to turn our focus inward for a new project and consider how these models could benefit us as a team, and how we could build on the foundations they offer in our own way.
We set about building our own internal AI interface—one not unlike the experience offered by ChatGPT—but specially tailored to fit our creative team with our own unique blend of components and services.
Not only has this been an exercise in staying current with the technology as it is happening, but it has also been about discovering what’s possible and exploring the latest developments in AI to deliver features sometimes before even the sharpest players in the room.
Fiction Tribe Mind Authentication Screen
Why did we create an AI chat interface?
When we first started building our interface, the limitations of early ChatGPT were clear. Beyond our love of exploring all things tech, we saw an opportunity to create an interface shaped by feedback and input from our team. We could remedy our various pain points quite easily, from basic but essential UI/UX improvements such as a copy button to features like sharing chats internally and continuing a colleague’s chat with an AI model.
With team feedback, we continually added new features and refined the interface. We built a prompt system where users could create and save their own prompts, optionally share them with the team, or keep them private.
Over time, as the number of AI-integrated services and APIs grew, we wanted to streamline access to them and give our team a curated selection of the latest services in one place. This gave us a useful internal tool that we could maintain, experiment with, and use to implement new AI services at the API level, while adapting it to custom use cases for our team.
We saw immediate benefits from giving our entire team unlimited queries to GPT-4 (OpenAI’s state-of-the-art model at the time) when GPT-4 was a “ChatGPT Plus” feature with heavily rate-limited queries. This allowed our team to explore the model sooner and more substantively.
Fiction Tribe Mind; Querying GPT-4o
The development process. Building the AI interface.
Our development process began with a prototype and a deep dive into the early OpenAI developer platform, so we could implement a structure that would work well with their APIs while also offering a similar chat interface focused on ease of use. We used serverless functions, via Google Cloud Platform’s Cloud Run functions, to help secure our API calls and protect our organizational keys. We also used the OAuth 2.0 standard with Google’s platform to connect to our organizational email accounts. This was essential for managing authentication for this internal tool and ensuring that only accounts originating from our domain would have access.
For the backend, we did not know how the tool might develop as new APIs were released, and we wanted to scale effectively without worrying about migrations or major structural changes to the database. We therefore opted for a loosely structured NoSQL database offered by Google’s Firebase. This is a database we have substantial experience with, as we regularly use it for internal projects and prototypes.
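To make the domain restriction concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the production implementation: the article does not say which language the serverless functions use, the domain "example.com" and the function name is_allowed_account are hypothetical, and the token's signature, audience, and expiry are assumed to have been verified already by a trusted library. The "hd" (hosted domain), "email", and "email_verified" fields are claims that Google ID tokens carry for Workspace accounts.

```python
from typing import Any, Mapping

ALLOWED_DOMAIN = "example.com"  # hypothetical; stands in for the organization's domain


def is_allowed_account(claims: Mapping[str, Any],
                       allowed_domain: str = ALLOWED_DOMAIN) -> bool:
    """Return True only for verified accounts that belong to the allowed domain.

    `claims` is assumed to be the payload of a Google ID token whose
    signature, audience, and expiry were already verified elsewhere.
    """
    email = str(claims.get("email", "")).lower()
    hosted_domain = str(claims.get("hd", "")).lower()
    domain = allowed_domain.lower()

    # Reject accounts whose email address has not been verified.
    if claims.get("email_verified") is not True:
        return False

    # Require both the hosted-domain claim and the address suffix to match,
    # so a missing or mismatched "hd" claim is rejected.
    return hosted_domain == domain and email.endswith("@" + domain)
```

Checking both the "hd" claim and the address suffix is a deliberately conservative choice: personal accounts carry no "hd" claim, so they fail the first comparison even if the address looks similar.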
Early on, we decided to adopt and experiment with Astro (which at the time had just released its version 1.0). Astro is a fast, component-based, server-first web framework that we use to render the front end of our web app. Two years later, this framework has considerably increased in popularity in the developer community.
Although intended as an internal-facing application, our chat interface would see a couple of overhauls to its look and feel: the first styling matched our Fiction Tribe Open Source project developer pages, and the second coincided with the branding from the complete overhaul of our agency website.
Highlighted features and functionality.
One of the first features we wanted to add was the ability to share prompts internally within our team. This came in the form of our prompt editor.
Fiction Tribe Mind; Prompt Editor
With the prompt editor, users can create their own prompts that can be easily shared amongst the team. After creating and saving a prompt, a user can access their private and team prompts from within the main chat tab through dropdowns.
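The split between private and team prompts can be pictured with a small Python sketch. Everything here is an assumption made for illustration: the Prompt fields, the shared flag, and the two helper functions are hypothetical names, and the article does not describe how prompts are actually stored or queried.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Prompt:
    title: str
    body: str
    owner: str    # account that created the prompt
    shared: bool  # True when the owner chose to share it with the team


def private_prompts(prompts: Iterable[Prompt], user: str) -> List[Prompt]:
    """Prompts the given user created and kept private."""
    return [p for p in prompts if p.owner == user and not p.shared]


def team_prompts(prompts: Iterable[Prompt]) -> List[Prompt]:
    """Prompts any team member chose to share."""
    return [p for p in prompts if p.shared]
```

Each helper would feed one of the two dropdowns in the chat tab: one list scoped to the signed-in user, one visible to everyone.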
We also added image generation models including DALL-E 2, Stable Diffusion XL 2.1, and Flux Realism Lora—a more capable image generation model useful for creating more realistic stock imagery.
Fiction Tribe Mind; Prompting Flux Realism Lora with Flags
Within our serverless functions, we implemented a middleware flag system for the Stable Diffusion and Flux image generation endpoints: a double dash (--) followed by a parameter name that corresponds to a field in the API. Our implementation is similar to the parameter system Midjourney offers for its image generation model in Discord.
The goal is to keep the user interface uncluttered while still exposing every available API parameter, so each model can be used to its full potential from within our interface. Whether choosing a sampler (the algorithm that progressively denoises raw noise into an image), the number of steps for the image generation process, or a specific seed (a number that initializes the random noise, so the same seed and settings reproduce the same image), we allow granular control of all possible inputs.
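A flag system like this can be sketched in a few lines of Python. This is a hedged illustration rather than the actual middleware: the function name parse_flags, the regular expression, and the flag names used in the example (--steps, --seed, --sampler) are assumptions chosen to mirror the parameters mentioned above, and each flag is assumed to take exactly one whitespace-free value.

```python
import re
from typing import Dict, Tuple, Union

Value = Union[int, float, str]

# Matches "--name value" at the start of the prompt or after whitespace.
# The value is a single token without spaces (an assumption of this sketch).
FLAG_PATTERN = re.compile(r"(?<!\S)--([A-Za-z_][\w-]*)\s+(\S+)")


def _coerce(raw: str) -> Value:
    """Convert a flag value to int or float when possible, else keep the string."""
    for cast in (int, float):
        try:
            return cast(raw)
        except ValueError:
            continue
    return raw


def parse_flags(prompt: str) -> Tuple[str, Dict[str, Value]]:
    """Split a prompt into its plain text and a dict of API parameters."""
    params = {
        name.replace("-", "_"): _coerce(value)
        for name, value in FLAG_PATTERN.findall(prompt)
    }
    # Remove the flags, then collapse any leftover runs of whitespace.
    text = " ".join(FLAG_PATTERN.sub(" ", prompt).split())
    return text, params
```

For example, parse_flags("a red fox in snow --steps 30 --seed 12345 --sampler euler") returns the text "a red fox in snow" together with the parameters {"steps": 30, "seed": 12345, "sampler": "euler"}, which a serverless function could then merge into the request body for the image generation endpoint. A real implementation would also need to validate flag names against the fields each API accepts.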
Fiction Tribe Mind; An image generated with Flux Realism Lora
In addition to generative models, we have also implemented the ability to make adjustments to images through various image manipulation tooling.
After a user uploads an image to a temporary Google Cloud Storage bucket, the image can be piped to various AI services: an ESRGAN 2x/4x image upscaling pipeline, AI background removal, img2img functionality (converting one image to another using a prompt), or a Scribble API (passing a basic sketch and a prompt to a specialized Stable Diffusion model in order to create a more elaborate image).
Fiction Tribe Mind; Image manipulation tooling
Continuous development, and future plans.
Our AI interface is an expanding, ever-changing tool. It has grown into a large project with numerous integrated services, shaped both by feedback from our team and by the way AI integrations themselves have shifted and grown.
Our commitment closely aligns with the ideals we had at the start of the project: a commitment to continuously understand and experiment with new technologies, models, and APIs and to offer a playground for our team to continue to explore AI capabilities.
With the knowledge we've gained from building this project, we are excited to apply this experience to solving challenges for our clients, whether through custom deployments or by finding practical and trusted solutions in a world where AI transforms expectations and changes business processes daily.
Try it now.
We have opened up the Alpha version of our chat interface to guests live on our studio site. Guests have access with a rate limit of 20 queries across all APIs and services. This includes integrations with ChatGPT, Claude, Stable Diffusion, Flux Realism Lora, image manipulation APIs, and a prompt editor.
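A guest limit shared across all services can be modeled as a single counter per guest. The Python sketch below is only an illustration: the class name GuestQuota, its methods, and the in-memory dictionary are assumptions, and a serverless deployment would need to keep the counts in a persistent store rather than in process memory.

```python
from collections import defaultdict
from typing import DefaultDict

GUEST_QUERY_LIMIT = 20  # from the article: 20 queries across all APIs and services


class GuestQuota:
    """Tracks one shared query budget per guest, regardless of which API is called."""

    def __init__(self, limit: int = GUEST_QUERY_LIMIT) -> None:
        self.limit = limit
        self._used: DefaultDict[str, int] = defaultdict(int)

    def try_consume(self, guest_id: str) -> bool:
        """Record one query and return True, or return False if the budget is spent."""
        if self._used[guest_id] >= self.limit:
            return False
        self._used[guest_id] += 1
        return True

    def remaining(self, guest_id: str) -> int:
        """Number of queries the guest can still make."""
        return self.limit - self._used[guest_id]
```

Because every endpoint would call the same try_consume method, a chat query and an image generation request draw from the same budget.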