
Generative AI: The ultimate beginner's guide

This image was created by Assaf Elovic and DALL-E

Why you should care about the generative AI revolution

This summer has been a game-changer for the AI community. It’s almost as if AI has erupted into the public eye in a single moment. Now everyone is talking about AI — not just engineers, but Fortune 500 executives, consumers, and journalists.

There is plenty written about how GPT-3 and Transformers are revolutionizing NLP and their ability to generate human-like creative content. But I've yet to find a one-stop shop offering a simple overview of the field's capabilities, limitations, current landscape, and potential. By the end of this article, you should have a broad sense of what the hype is all about.

Let’s start with basic terminology. Many confuse GPT-3 with the broader term Generative AI. GPT-3 (short for Generative Pre-trained Transformer, version 3) is a language model built by OpenAI, one model within the Generative AI space. Just to be clear, the current disruption is happening across the entire space, with GPT-3 being one of its enablers. In fact, there are now many other incredible models for generating content and art, such as BLOOM, Stable Diffusion (by Stability AI), EleutherAI’s GPT-J, and DALL-E 2, each with its own unique set of advantages.

The What

So what is Generative AI? In short, Generative AI is a type of AI that focuses on creating new content or data, whether images, videos, or text. As the saying goes, “a picture is worth a thousand words”:

Jason Allen’s A.I.-generated work, “Théâtre D’opéra Spatial,”

The image above was created by AI and recently won the Colorado State Fair’s annual art competition. Yes, you read that correctly. And the reactions were far from positive.

If AI can generate art so well that it is not only indistinguishable from “human” art but good enough to win competitions, then it’s fair to say we’ve reached the point where AI can take on some of the most challenging human tasks, or as some say, “create superhuman results”.

Another example is Cowriter.org which can generate creative marketing content, while taking attributes like target audience and writing tone into account. 

cowriter.org - creative marketing content generator

In addition to the above examples, hundreds if not thousands of new companies are leveraging this technology to build disruptive products. The biggest impact so far can be seen in areas such as text, video, images, and coding. The landscape below names a few of the leaders:

However, there are some risks and limitations associated with using generative models. 

One risk is prompt injection, where a malicious user crafts a prompt that causes the model to generate harmful output, for example racist or sexist text. In addition, some users have found that GPT-3 fails when prompted with out-of-context instructions, as seen below:

Another risk is that of data leakage. If the training data for a generative model is not properly secured, then it could be leaked to unauthorized parties. This could lead to the model being used for malicious purposes, such as creating fake news articles or generating fake reviews. This is a concern because GPT models can be used to generate text that is difficult to distinguish from real text.

Finally, there are ethical concerns about using generative models. For example, if a model is trained on data from a particular population, it could learn to be biased against that population. This could have harmful consequences if the model is used to make decisions about things like hiring or lending.

The Why

So why now? There are a number of reasons why generative AI has become so popular. One reason is that the availability of data has increased exponentially. With more data available, AI models can be better trained to generate new data that is realistic and accurate. For example, GPT-3 was trained on about 45TB of text from different datasets (mostly scraped from the internet). To give you an intuition of how fast AI is progressing: GPT-2 (the previous version of GPT) was trained on 8 million web pages, only a year before GPT-3 was released. That means GPT-3 was trained on 5,353,569x more data than its predecessor a year earlier.

Another reason is that the computing power needed to train generative AI models has become much more affordable. In the past, only large organizations with expensive hardware could afford to train these types of models. However, now even individuals with modest computers can train generative AI models. 

Finally, the algorithm development for generative AI has also improved. In the past, generative AI models were often based on simple algorithms that did not produce realistic results. However, recent advances in machine learning have led to the development of much more sophisticated generative AI models such as Transformers.

The How

Now that we understand the what and why of generative AI, let’s dive into the potential use cases and technologies that power this revolution.

As discussed earlier, GPT-3 is just one of many solutions, and the market is rapidly growing with alternatives, especially free-to-use open-source ones. As history shows, the first is not usually the best. My prediction is that within the next few years, free alternatives will be so good that it will be common to see AI integrated into almost every product out there.

To better understand the application landscape and which technologies can be used, see the following landscape mapped by Sequoia:

Generative AI Application landscape, mapped by Sequoia

As described earlier, there are already many alternatives to choose from, but OpenAI still leads the market in terms of quality and usage. For example, Jasper.ai (powered by GPT-3) just raised $125M at a $1.5B valuation, reportedly surpassing OpenAI in annual revenue. Another example is GitHub, which released Copilot (powered by OpenAI’s Codex, a descendant of GPT-3), an AI assistant for coding. OpenAI already dominates three main sectors: GPT-3 for text, DALL-E 2 for images, and Whisper for speech.

It seems that the current headlines and business use cases are around creative writing and images. But what else can generative AI be used for? NFX has curated a great list of potential use cases as seen below:


NFX (and I) believe that generative AI will eventually dominate almost all business sectors, from understanding and creating legal documents to teaching complex topics in higher education.

More specifically, the chart below illustrates Sequoia’s timeline for how fundamental models are expected to progress and the applications that become possible along the way:

Based on their prediction, AI will be able to code products from simple text product descriptions by 2025, and write complete books by the end of 2030.

If art generated today is already good enough to compete with human artists, if generated marketing copy cannot be distinguished from a copywriter’s, and if GPT-3 was trained on 5,353,569x more data than its predecessor only a year later, then you tell me whether the hype is real, and what we can achieve ten years from today. Oh, and by the way, this article’s cover image and title were generated by AI :).

I hope this article provided you with a simple yet broad understanding of the generative AI disruption. My next articles will dive deeper into the more technical understanding of how to put GPT-3 and its alternatives into practice.

Thank you very much for reading! If you have any questions, feel free to drop me a line in the comments below!

How to build a URL text summarizer with simple NLP

To view the source code, please visit my GitHub page.

Wouldn’t it be great if you could automatically get a summary of any online article? Whether you’re too busy or have too many articles in your reading list, sometimes all you really want is a short summary.

That’s why TL;DR (too long; didn’t read) is so commonly used these days. While this internet acronym can criticize a piece of writing as overly long, it is often used to give a helpful summary of a much longer story or complicated phenomenon. While my last piece focused on how to estimate any article’s read time, this time we will build a TL;DR generator for any article.

Getting started

For this tutorial, we’ll be using two Python libraries:

  1. Web crawling: Beautiful Soup. Beautiful Soup is a Python library for pulling data out of HTML and XML files. It works with your favorite parser to provide idiomatic ways of navigating, searching, and modifying the parse tree, commonly saving programmers hours or days of work.

  2. Text summarization: NLTK (Natural Language Toolkit). NLTK is a leading platform for building Python programs to work with human language data. It provides easy-to-use interfaces to over 50 corpora and lexical resources such as WordNet, along with a suite of text-processing libraries for classification, tokenization, stemming, tagging, parsing, and semantic reasoning, and wrappers for industrial-strength NLP libraries.

Go ahead and get familiar with the libraries before continuing, and make sure to install them locally. Alternatively, run this command from the project repo directory:

pip install -r requirements.txt

Next, we will download the stopwords corpus from the NLTK library. Open a Python shell and enter:

import nltk
nltk.download("stopwords")

Text Summarization using NLP

Let’s describe the algorithm:

  1. Get URL from user input

  2. Web crawl to extract the page text from the HTML page (by paragraphs <p>).

  3. Execute the summarize frequency algorithm (implemented using NLTK) on the extracted text sentences. The algorithm ranks sentences according to the frequency of the words they contain, and the top sentences are selected for the final summary.

  4. Return the highest ranked sentences (I prefer 5) as a final summary.

For step 2 (step 1 is self-explanatory), we’ll develop a method called getTextFromURL, shown below:

import requests
from bs4 import BeautifulSoup

def getTextFromURL(url):
    # Fetch the page and parse its HTML
    r = requests.get(url)
    soup = BeautifulSoup(r.text, "html.parser")
    # Join the text of all <p> paragraphs into a single string
    text = ' '.join(map(lambda p: p.text, soup.find_all('p')))
    return text

The method issues a GET request to the given URL and returns the concatenated paragraph text from the HTML page.

From Text to TL;DR

We will use several methods here, including some that are not shown (to learn more, see the source code in the repo).

def summarizeURL(url, total_pars):
    # Retrieve the page text and strip stray encoding artifacts
    url_text = getTextFromURL(url).replace(u"Â", u"").replace(u"â", u"")
    fs = FrequencySummarizer()
    # Summarize with newlines collapsed into spaces
    final_summary = fs.summarize(url_text.replace("\n", " "), total_pars)
    return " ".join(final_summary)

The method calls getTextFromURL above to retrieve the text, then cleans it of stray encoding characters and newlines (\n).

Next, we execute the FrequencySummarizer algorithm on the text. The algorithm tokenizes the input into sentences, then computes a term-frequency map of the words. The frequency map is then filtered to ignore both very rare and very frequent words; this discards noisy words such as determiners, which are frequent but carry little information, as well as words that occur only a few times. To see the source code, click here.
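The full FrequencySummarizer implementation lives in the linked repo; the sketch below is only a self-contained approximation of the idea, with a tiny regex tokenizer and stop-word list standing in for NLTK's, and illustrative cut-off thresholds:

```python
import re
from collections import defaultdict
from heapq import nlargest

# Tiny stand-in for nltk.corpus.stopwords (illustrative only)
STOPWORDS = {"a", "an", "the", "is", "are", "of", "to", "in", "and", "it"}


class FrequencySummarizer:
    def __init__(self, min_cut=0.1, max_cut=0.9):
        # Words whose normalized frequency falls outside [min_cut, max_cut]
        # are discarded as noise (too rare or too common).
        self.min_cut = min_cut
        self.max_cut = max_cut

    def _frequencies(self, sentences):
        freq = defaultdict(int)
        for sentence in sentences:
            for word in re.findall(r"[a-z']+", sentence.lower()):
                if word not in STOPWORDS:
                    freq[word] += 1
        max_freq = max(freq.values(), default=1)
        # Normalize, then filter out the extremes
        return {
            w: f / max_freq
            for w, f in freq.items()
            if self.min_cut <= f / max_freq <= self.max_cut
        }

    def summarize(self, text, n):
        # Naive sentence split; NLTK's sent_tokenize is far more robust
        sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
        freq = self._frequencies(sentences)
        # Score each sentence by the summed frequency of its words
        ranking = defaultdict(float)
        for i, sentence in enumerate(sentences):
            for word in re.findall(r"[a-z']+", sentence.lower()):
                ranking[i] += freq.get(word, 0.0)
        top = nlargest(n, ranking, key=ranking.get)
        # Return the top-n sentences in their original order
        return [sentences[i] for i in sorted(top)]
```

Note how the most frequent word gets normalized frequency 1.0 and is therefore cut by max_cut; that is deliberate, since a word dominating the page is usually uninformative for ranking.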

Finally, we return a list of the highest ranked sentences which is our final summary.


Summary

That’s it! Try it out with any URL and you’ll get a pretty decent summary. Many summarization algorithms have been proposed in recent years (such as TF-IDF-based ranking), and there’s much more that could be done here. For example, go ahead and improve the filtering of the text. If you have any suggestions or recommendations, I’d love to hear about them, so comment below!

Chatbots - The beginner's guide

If you search for chatbots on Google, you'll probably come across hundreds of pages, from what a chatbot is to how to build one. That's because it's 2016, the year of the chatbot revolution.

I've been introduced to many people who are new to this space and very motivated to enter it, whether they're software developers, entrepreneurs, or just tech hobbyists. Entering this space for the first time has become overwhelming in just a few months, particularly after Facebook announced the Messenger API at its F8 developer conference. For that reason, I've decided to simplify the basic steps of entering this fascinating world.

What is a chatbot?

To fully understand what a chatbot is and its potential, let's start by watching the following example:

Get the idea? The conversation above was conducted between an end user and a chatbot built on the Facebook Messenger platform.

So what is a chatbot? It is a piece of software designed to automate a specific task. More specifically, a chatbot is essentially a conversational user interface that can be plugged into a number of data sources via APIs, so it can deliver information or services on demand, such as weather forecasts or breaking news.

Why now?

Chatbots have been around for decades, right? So why all this noise all of a sudden? This question has many different answers, depending on who you ask. If you ask me, there are two main reasons:

1. Messaging has become the most essential and most popular tool for communication. 

2. We're closer to breakthroughs in AI (artificial intelligence) and NLP (natural language processing) than ever before. This means that talking to a chatbot may soon feel as natural as talking to a human. Today, developers can find many APIs that offer AI/NLP services without even needing to understand how AI/NLP works. This is HUGE. A few examples I recommend are Crunchable.io, Chatbots.io, Luis.ai (a must!), API.ai and Wit.ai.

Basically, the point I'm trying to make is that messaging platforms are the place we all go to on a regular basis. So why not bring all the other places into these platforms? This is what Facebook did with Facebook Messenger.

Facebook Messenger is far more than a messaging app. It is a store for thousands of apps that are integrated into our daily conversations. Furthermore, as stated above, Facebook released its chatbot platform in April 2016; since then, developers have added more than 11,000 bots to Messenger.

Where are the chatbots?

The first chatbot I built was on WhatsApp. I chose WhatsApp because all my friends use it as their main messaging platform. Unfortunately, WhatsApp doesn't offer an official API, which means it doesn't approve of building chatbots on its platform (not a surprise, since WhatsApp is a Facebook company, and Facebook itself offers an extensive API). This doesn't mean there aren't workarounds. If you're as stubborn as I am, take a look at yowsup and start from there. You'll also need a registered phone number before starting the process. To conclude, WhatsApp is probably not the place you'll find rich and powerful chatbots.

Platforms that do offer official APIs are:

1. Facebook Messenger

2. Slack

3. Telegram

4. Kik

There are other deployment channels, such as Android and iOS (via SMS), Skype, and even email. However, the ones listed above are those I would focus on.

You can find a rich list of most of the chatbots out there by clicking here, thanks to our friends at Botlist.co, who did an amazing job.

How do I build a chatbot?

This requires a long answer. An answer I will save for my next blog post, in which I will describe how to build your very first chatbot using Node.js and MongoDB.

If you're not a developer, or are looking for an easier approach that doesn't require programming, here are a few solutions for you:

1. Chatfuel - My first choice. No coding required. Easily add and edit content: what you see is what you get.

2. Botsify - Build a Facebook Messenger chatbot without any coding knowledge.

3. Meya.ai - Meya helps with the mechanics of bot building so you can concentrate on the fun stuff.

There are some downsides to using a service instead of building your own. First, these services limit your creativity in many ways, offering only a glimpse of what can be done. Secondly, you are using a third-party hosting service, which means you're stuck with them. Nevertheless, these are great solutions for getting started with chatbots without any coding knowledge.

Summary

There has been a lot of controversy over whether bots will succeed or fail in the near future. To understand the controversy, you have to understand the difference between "stupid" bots and "smart" bots. "Stupid" bots work with structured input, while "smart" bots process your natural language and provide a more human-to-human experience.

The main issue with "stupid" bots is that as soon as people start bundling things up, changing their minds, going back to what has been mentioned earlier in the chat, etc., the bot falls apart. Therefore, as long as chatbots can't fully conduct a conversation naturally, while understanding the intent of the user at every stage, bots will be limited and ineffective. 

Having said that, in my opinion chatbots don't have to be smart in order to succeed. There are thousands of use cases in which a "stupid" chatbot can simplify both the end user's experience and the business's productivity. Take ordering pizza, for example. You can create a flow in which the user answers questions by choosing from preset options. Because you deliberately constrain the input you expect from the user, the need for NLP or AI becomes irrelevant. I would rather order pizza from a "stupid" bot than over the phone, or through some cheap website, any day.
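Such a structured pizza-ordering flow can be sketched as a simple state machine with no NLP at all. The steps, prompts, and options below are purely illustrative:

```python
# Each step: (field name, prompt, set of accepted answers). All illustrative.
STEPS = [
    ("size", "What size pizza? (small/medium/large)",
     {"small", "medium", "large"}),
    ("topping", "Which topping? (cheese/pepperoni/veggie)",
     {"cheese", "pepperoni", "veggie"}),
]

def run_order(answers):
    """Walk the flow over a sequence of user answers, re-asking on bad input."""
    order = {}
    transcript = []
    it = iter(answers)
    for field, prompt, allowed in STEPS:
        transcript.append(prompt)
        while True:
            answer = next(it, None)
            if answer is None:
                return order, transcript  # user stopped answering
            if answer.lower() in allowed:
                order[field] = answer.lower()
                break
            transcript.append("Please pick one of: " + ", ".join(sorted(allowed)))
    transcript.append("Ordering a {} {} pizza!".format(order["size"], order["topping"]))
    return order, transcript
```

For example, run_order(["large", "pepperoni"]) walks straight through, while an invalid answer like "huge" just triggers a re-ask. Since every answer is matched against a fixed set of options, the bot never needs to understand free-form language.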

To fully summarize the above and much more, have a look at the Chatbot ecosystem, brought together by Business Insider.

Stay tuned for my next blog post, about how to develop your very first Facebook Messenger chatbot, using Node.js and MongoDB.