The Ultimate Tech Stack for Building AI Products

A Peek Into the Tech Stack That Powered My Viral AI Web Application

This is the era of solopreneurs. It has never been easier to build end-to-end AI-powered applications, thanks to the most recent developments in AI and developer-friendly frameworks in particular. What used to be a complex domain, mainly led by experienced data scientists, is now democratized and is as straightforward as calling an API (and will only get better and easier over time). However, with all the noise, hype, and continuous advancements in the field, it is difficult to know where to focus and with what stack.

In July 2022, I released Cowriter, an AI-powered text editor aimed at writers. The product quickly went viral and grew to over 500K users worldwide within a few months. In the process, I learned the challenges and benefits of using various tech stacks at scale. Building impressive AI demos is one thing, but scaling AI applications to millions of users is another, and that is where most current frameworks fall short.

In this post, I will share with you the entire tech stack used for building my viral AI web application. You may already be familiar with some of this stack, so my goal is not only to introduce them to you, but also to help you gain intuition on why they were optimal for my needs. Let’s dive right in!

OpenAI API

This one is less obvious than you might think. Of all the AI services out there, which one is the best? What defines the best? For me, the best is about quality, reliability, security, performance and pricing.

Over the past few years, I have researched numerous AI services and models, including OpenAI, AI21, Anthropic, Bloom, GPT-J, and more. Recently, there has been a significant increase in open source models, such as Falcon, Alpaca, Vicuna and Llama, some of which are said to outperform larger, better-known models. However, these models currently require LLMOps/MLOps experience, as well as investing in your own hosting and monitoring. As a solopreneur, you have to decide which parts of the business deserve your effort.

The research behind these conclusions deserves its own post. However, the TL;DR is that OpenAI’s GPT models are far superior in terms of performance and reliability when compared to their competitors. Although OpenAI is on the higher end of pricing, you can build powerful AI tools using gpt-3.5-turbo for only $0.002 per 1,000 tokens (roughly ~750 words). When you do the math, it is quite remarkable how cheap it is.
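
To make "doing the math" concrete, here is a minimal back-of-the-envelope sketch. It uses only the $0.002 per ~750 words figure quoted above; the workload (1,000 users writing 1,500 words a day) is a made-up assumption for illustration, and real bills depend on prompt length as well as generated text.

```python
# Back-of-the-envelope cost estimate based on the price quoted in the post.
PRICE_PER_CHUNK_USD = 0.002  # quoted price
WORDS_PER_CHUNK = 750        # approximate number of words that price covers


def estimate_cost_usd(total_words: int) -> float:
    """Return the estimated cost in USD for processing `total_words` words."""
    return total_words / WORDS_PER_CHUNK * PRICE_PER_CHUNK_USD


# Hypothetical workload: 1,000 users, 1,500 words per user per day, 30 days.
monthly_words = 1_000 * 1_500 * 30
monthly_cost = estimate_cost_usd(monthly_words)
print(f"{monthly_words:,} words -> ${monthly_cost:,.2f}")
```

Under these assumptions, 45 million words a month comes to about $120.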

In addition, OpenAI offers a fine-tuning API, which is very easy to use. I leveraged fine-tuning to create a model to generate content optimized for SEO and marketing.

Replicate

While building an editor for blog writers, it was important to support image generation as well. DALL·E was the obvious API, but I was frustrated by the lack of dimension options (such as 1024x768, which is optimal for blogs), and it is pretty expensive. Also, DALL·E only hosts your images for one hour, so you need to integrate with a storage service such as AWS S3 to keep generated images for the long term. Lastly, I was determined to find an alternative that was closer to Midjourney’s quality. That’s when I discovered Replicate.

Replicate makes it easy to deploy and host machine learning models, and to run them with a few lines of code, without needing to understand how machine learning works. One particularly powerful and viral model hosted on Replicate was trained on Midjourney images, and it shows some pretty impressive results! Moreover, the model lets you choose from multiple widths and heights and set the number of outputs. On top of that, it’s fairly cheap, much cheaper than DALL·E, and you “pay as you go”, based on inference computation time.

LangChain

LangChain is a framework for developing applications powered by large language models (LLMs). It’s an open-source library that provides developers with the tools to build applications using LLMs, such as OpenAI or other similar models. LangChain simplifies the creation of these applications and connects to the AI models you want to work with.

Without a doubt, LangChain is a must for all developers and the definitive leader among developer AI frameworks. Not only does LangChain have the largest community of contributors and support, but it is built by developers for developers. There are numerous examples and snippets online showing “how to build a chatpdf with 10 lines of code”. And it’s true, it’s that easy to build LLM chains and applications.

LangChain offers various solutions for building AI agents. It even supports the latest autonomous agent solutions such as BabyAGI and AutoGPT. One very powerful and popular feature of LangChain is its support for ReAct, a prompting pattern in which the LLM interleaves reasoning with actions, calling the tools it has been given only when needed. To learn more about ReAct and how to use it with LangChain, see here.
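
To build some intuition for ReAct, here is a toy sketch of the loop in plain Python. It is not LangChain’s API: `scripted_llm` is a hypothetical stand-in for a real model call, and the `word_count` tool and the `Action: tool[input]` text format are assumptions made for illustration only.

```python
# Toy illustration of the ReAct pattern: the model either requests a tool
# ("Action"), whose result is fed back as an "Observation", or it finishes
# with a "Final Answer". Nothing here is LangChain code.
from typing import Callable, Dict, List


def word_count(text: str) -> str:
    """Example tool: count the words in a piece of text."""
    return str(len(text.split()))


TOOLS: Dict[str, Callable[[str], str]] = {"word_count": word_count}


def scripted_llm(transcript: List[str]) -> str:
    """Hypothetical fake model: asks for a tool once, then answers."""
    observations = [line for line in transcript if line.startswith("Observation:")]
    if not observations:
        return "Action: word_count[the quick brown fox]"
    count = observations[-1].split(": ", 1)[1]
    return f"Final Answer: the text has {count} words"


def react_loop(question: str, llm: Callable[[List[str]], str], max_steps: int = 5) -> str:
    """Run the model until it gives a final answer or the step limit is hit."""
    transcript = [f"Question: {question}"]
    for _step in range(max_steps):
        reply = llm(transcript)
        transcript.append(reply)
        if reply.startswith("Final Answer:"):
            return reply.split(": ", 1)[1]
        if reply.startswith("Action:"):
            # Parse "Action: tool_name[tool input]" and run the named tool.
            call = reply.split(": ", 1)[1]
            name, _sep, rest = call.partition("[")
            tool_input = rest.rstrip("]")
            transcript.append(f"Observation: {TOOLS[name](tool_input)}")
    return "No answer within the step limit"


print(react_loop("How many words are in 'the quick brown fox'?", scripted_llm))
```

The key point is that the tool only runs when the model asks for it, and the result goes back into the transcript the model sees on its next turn.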

Pinecone

Vector databases are the hottest kids on the block right now. With investments exceeding $1 billion in funding over the past year, it is worth understanding why vector DBs, such as Pinecone, should be incorporated into almost any AI application.

Vector databases are specialized databases designed to store and manage high-dimensional data represented as vectors. These databases are highly efficient in dealing with complex data and allow for quick searches of similar items based on specific criteria. This is especially important when working with LLMs such as GPT, because retrieving only the relevant pieces of data helps work around the model’s token limit.

However, it is important not to be fooled by the idea of “what if there was no token limit?”. When working with AI models and services, you pay per inference, or in other words, you pay per request based on the number of tokens sent to the model. Sending over 100,000 words with every request would be both slow and expensive. Vector databases make it easy to retrieve only the data related to the user’s context, based on text similarity, thus optimizing both the relevancy of the data and the cost.
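
The retrieval idea can be sketched without any database at all. The snippet below is a minimal illustration, not Pinecone’s API: the bag-of-words `embed` function is a toy stand-in for a real embedding model, and the sample documents are made up. A vector database does the same ranking, but at scale and with learned embeddings.

```python
# Minimal similarity search: rank documents by cosine similarity to the query
# and keep only the top matches, so the prompt carries a few relevant
# snippets instead of the whole corpus.
import math
from collections import Counter
from typing import Dict, List, Tuple


def embed(text: str) -> Dict[str, int]:
    """Toy 'embedding': word counts. A real app would call an embedding model."""
    return dict(Counter(text.lower().split()))


def cosine_similarity(a: Dict[str, int], b: Dict[str, int]) -> float:
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(value * b.get(token, 0) for token, value in a.items())
    norm_a = math.sqrt(sum(value * value for value in a.values()))
    norm_b = math.sqrt(sum(value * value for value in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query: str, documents: List[str], k: int = 2) -> List[Tuple[float, str]]:
    """Return the k documents most similar to the query, best first."""
    query_vector = embed(query)
    scored = [(cosine_similarity(query_vector, embed(doc)), doc) for doc in documents]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)[:k]


documents = [
    "Slate is a framework for building rich text editors",
    "Stripe Checkout is a hosted payment page",
    "Pinecone is a hosted vector database",
]
results = top_k("which vector database is hosted", documents, k=1)
print(results[0][1])
```

Only the top-ranked snippets would then be added to the prompt, which keeps each request short and cheap.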

In addition, due to the nature of vector databases, they can serve as long-term memory for AI agents, by retrieving relevant “memories” from past interactions related to the conversation context.

Choosing the right database depends on many considerations. The leading ones to consider are Pinecone, Chroma, Milvus, Weaviate, Vespa, and Elasticsearch. Some are open source, while others are not; some offer customizability while others are aimed at simplicity.

I decided to go with Pinecone because it is a simple hosted service that takes most of the burden off me. Additionally, Pinecone has a very large community, is well-funded and highly scalable, and has been around since 2019.

FastAPI

FastAPI is a modern web framework for building RESTful APIs in Python. It was first released in 2018 and has quickly gained popularity among developers due to its ease of use, speed and robustness. A good alternative to FastAPI is Flask, as seen in this example.

Building your back-end system in Python is a great advantage. Due to the exponential growth of AI development, the Python community has grown massively, with many crucial open source projects being built exclusively in Python. These projects range from application and infrastructure frameworks to LLMs and vector databases.

Other popular ecosystems such as Node.js are lagging behind, with a much smaller open source community building AI-related tools. With the fast-paced development of AI, staying ahead is business critical.

ReactJS

There is a lot of discussion around AI not being a moat for tech startups, since it has become so powerful and easy to integrate. But I truly believe that building great products that aim at specific verticals and optimize for user experience creates a significant advantage in a very competitive market. This is why UX/UI is still king, and more crucial than ever when building your AI application.

You might’ve already heard about Streamlit or Gradio, which are Python frameworks for building data-driven web interfaces. There is a lot of noise around frameworks like these for easily building web applications on top of backend code that runs some AI functionality. However, these frameworks usually go as far as great demos, but don’t scale. They don’t scale both in terms of the UX/UI you can actually build and in terms of performance.

As a full stack developer, nothing beats good old React or the more advanced frameworks built on top of it, such as Next.js (Vue.js is a comparable alternative outside the React ecosystem). By leveraging the React community, you can build super powerful user experiences, which are critical for tailoring AI to user needs.

Slate

As already mentioned, Cowriter is a rich text editor, so I had to find the best React framework for building editors. This was probably the longest research of all, since there is no definitive winner. It included hands-on trial and error with several frameworks until I found the one. Each editor optimizes for different business goals. I researched Draft.js, Editor.js, Slate.js, Plate.js, Quill and more.

For example, if you are looking for the stability of a large organization behind your project and fairly wide adoption, then Draft.js from Facebook is the way to go. However, features such as nesting and collaborative editing may be difficult to achieve. Alternatively, Editor.js offers a range of features out of the box, and is a great option if you want to get started quickly. If ultimate control and support is what you are after, and the possibility of collaboration is also desirable, then Slate.js is the way to go, but be prepared to do some extra work to get set up. Finally, Plate.js is built on top of Slate.js and offers a wide range of plugins for powerful features, but may not be the best choice if you are looking to create customizable AI experiences.

Ultimately, Slate.js was the best choice for me, since it has a great community and is fully customizable for building powerful editors. With AI-powered editors, it is important to create experiences such as automated writing, colorful animated gestures, selection toolbox, markdown support, DOM manipulation, adding elements such as images, etc.

Netlify

Netlify is a cloud computing platform that automates the deployment of web projects. It enables developers to build, deploy, and manage modern websites and web applications more easily and efficiently. Netlify also provides other features such as continuous integration and deployment (CI/CD), serverless functions for dynamic content, form handling, and more.

Overall, Netlify simplifies the process of web development and hosting, making it more accessible and efficient for developers and businesses.

I’ve recently been exploring different deployment options, and I was pleasantly surprised to discover how easy and efficient it is to work with Netlify. Not only that, but I was amazed to find that performance didn’t suffer at all compared to other options I’ve used, like deploying directly on AWS. And the best part? It’s completely free for single projects!

Stripe

Stripe is an online payment processing platform that allows businesses to accept payments over the internet. It provides a suite of APIs and software tools for securely accepting and managing online payments. Stripe also offers additional features, such as customizable checkout pages, fraud detection, subscription billing, and more.

I was looking for the quickest time to market with payment integration, and I found Stripe’s incredible recent feature, Checkout. Checkout is a prebuilt, hosted payment page that can be configured with no code in only a few minutes. I was able to go from no monetization to a great payment checkout flow in less than a day of work.

There are many more frameworks to consider in your technical stack, such as Hugging Face, Supabase, LlamaIndex and many more. It all depends on the product, vertical, time to market, resources and scale you’re aiming for. For me, it was important to choose a stack that enables me to succeed as a sole developer, while focusing mostly on building a great product and supporting customers, without the hassle of dealing with infrastructure, scale, performance, non-strategic assets, etc. The more resources and funding you have, the more creative you can be with choosing the best stack for you.

Feel free to drop a comment below if you have any questions, or if you’re wondering what would be the best stack for you.