"Introduction to Prompt Engineering for Generative AI" - Notes

Key points from the 44-minute LinkedIn Learning course Introduction to Prompt Engineering for Generative AI by Ronnie Sheer:

Generative AI is a broad term for technology that leverages AI to generate data. This data can include text, images, audio, video, and even code.

Prompt engineering refers to constructing inputs that help us get the most out of generative AI, and language models in particular.

Large language models are trained on huge amounts of text: much of the internet, plus enormous numbers of books. On top of that, they are sometimes fine-tuned for particular tasks.

GPT-3, or Generative Pre-trained Transformer 3, refers to a model, or really, an ecosystem of models, which have been trained on enormous amounts of data. These models vary in size and capability.

Some are optimized for zero-shot learning or few-shot learning, which are two approaches to prompt engineering.
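To make the distinction concrete, here is a minimal sketch of what the two kinds of prompts look like. The translation task and the `=>` delimiter are illustrative choices, not from the course:

```python
# Zero-shot: only an instruction, no worked examples.
zero_shot = "Translate English to French: cheese =>"

# Few-shot: the same instruction plus one or more worked examples,
# so the model can infer the pattern before completing the last line.
few_shot = (
    "Translate English to French.\n"
    "sea otter => loutre de mer\n"
    "cheese =>"
)

print(zero_shot)
print(few_shot)
```

The model completes the final `=>` in both cases; the few-shot version simply gives it a pattern to imitate.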

In generative AI, a token is a small unit of text that can easily be understood by a large language model. Take the word "everyday": it can be broken into two tokens, "every" and "day".

One word can be made up of multiple tokens. Different models have different mechanisms with which they split inputs into tokens. This is the step known as tokenization. And often, the tokenization method really changes the results of the models.
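Real models use learned subword vocabularies (such as byte-pair encoding); a toy sketch of the idea is a greedy longest-match tokenizer over a tiny, invented vocabulary:

```python
# Invented mini-vocabulary purely for illustration -- real tokenizers
# learn tens of thousands of subword pieces from training data.
VOCAB = {"every", "day", "to", "ken", "iz", "ation", "prompt"}

def tokenize(word: str) -> list[str]:
    """Greedily match the longest vocabulary entry at each position."""
    tokens = []
    i = 0
    while i < len(word):
        for j in range(len(word), i, -1):  # try the longest match first
            if word[i:j] in VOCAB:
                tokens.append(word[i:j])
                i = j
                break
        else:
            tokens.append(word[i])  # unknown character becomes its own token
            i += 1
    return tokens

print(tokenize("everyday"))      # ['every', 'day']
print(tokenize("tokenization"))  # ['to', 'ken', 'iz', 'ation']
```

Because different models learn different vocabularies, the same input can split into different tokens, which is one reason tokenization affects results.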

A stop sequence tells the model where to stop generating; it is often the same delimiter used to separate examples in your prompt. Two hashtags (##) are a common choice.
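A small sketch of how a stop sequence works, with "##" separating examples and also cutting off the completion; the prompt text and raw completion are invented for illustration:

```python
STOP = "##"

# Examples in the prompt are separated by the stop sequence.
prompt = (
    "Translate English to French.\n"
    "English: cat\nFrench: chat\n" + STOP + "\n"
    "English: dog\nFrench: chien\n" + STOP + "\n"
    "English: bird\nFrench:"
)

def truncate_at_stop(completion: str, stop: str = STOP) -> str:
    """What an API does with a stop sequence: discard everything
    from the first occurrence of the stop sequence onward."""
    return completion.split(stop)[0]

# Without a stop sequence, the model might keep inventing examples.
raw_completion = " oiseau\n##\nEnglish: fish\nFrench: poisson"
print(truncate_at_stop(raw_completion))  # " oiseau" -- extra examples dropped
```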

You can also control the maximum number of tokens you want the model to generate. This is important because you have a limited number of tokens to work with, and you pay per token. Then there are various other settings.
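Since you pay per token, a back-of-envelope cost estimate follows directly from the prompt size and the maximum-tokens setting. The price below is a made-up placeholder, not a real rate:

```python
PRICE_PER_1K_TOKENS = 0.002  # hypothetical USD rate, for illustration only

def estimate_cost(prompt_tokens: int, max_tokens: int) -> float:
    """Worst-case cost: the prompt plus the longest completion you allow."""
    return (prompt_tokens + max_tokens) / 1000 * PRICE_PER_1K_TOKENS

print(round(estimate_cost(prompt_tokens=500, max_tokens=256), 6))
```

This also shows why large few-shot prompts cost more: `prompt_tokens` grows with every example you include.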

When we build a prompt for few-shot learning, we start with some examples.

One of the limitations of few-shot learning is that your prompts end up much larger, so you spend more money on tokens.
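A minimal sketch of assembling a few-shot prompt from example pairs; the sentiment-classification task, the examples, and the "##" separator are illustrative choices:

```python
# Invented example pairs -- in practice you would use real labeled data.
examples = [
    ("I loved this movie!", "positive"),
    ("Total waste of time.", "negative"),
]

def build_prompt(examples, query):
    """Join an instruction, the worked examples, and the new query."""
    parts = ["Classify the sentiment of each review."]
    for text, label in examples:
        parts.append(f"Review: {text}\nSentiment: {label}\n##")
    parts.append(f"Review: {query}\nSentiment:")
    return "\n".join(parts)

print(build_prompt(examples, "Great acting and a clever plot."))
```

Each added example makes the prompt (and the token bill) larger, which is the trade-off noted above.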

DALL·E by OpenAI and Midjourney (which uses Discord as its interface) use large Transformer-based models that can turn words into images that look like drawings and photographs. They were trained on millions of images paired with descriptions.

If your model is fine-tuned, you need fewer tokens in your prompt because the model already has a sense of what the completion should look like. You can also expect lower latency on some tasks. There are costs associated with fine-tuning models, as it is a computationally heavy operation, but a good fine-tuned model may save you quite a lot of money in the long run, requiring fewer tokens, and perhaps a smaller model, to achieve the same results. For fine-tuning, you need extremely high-quality examples of prompts and completions. A good example of this is the GPT-based model that powers GitHub's Copilot, which has been fine-tuned on a large amount of open-source code.
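Fine-tuning data is commonly prepared as JSONL, one prompt/completion pair per line. A sketch, assuming that format; the summarization examples, the "###" prompt terminator, and the " END" completion marker are invented for illustration:

```python
import json

# High-quality prompt/completion pairs are the key ingredient.
training_examples = [
    {"prompt": "Summarize: The meeting covered Q3 results...\n\n###\n\n",
     "completion": " Q3 results were discussed. END"},
    {"prompt": "Summarize: The new policy takes effect in May...\n\n###\n\n",
     "completion": " A new policy starts in May. END"},
]

# Write one JSON object per line -- the usual JSONL training-file layout.
with open("train.jsonl", "w") as f:
    for ex in training_examples:
        f.write(json.dumps(ex) + "\n")
```

A consistent terminator in the prompts and a consistent end marker in the completions help the model learn where inputs end and outputs stop.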

Many of these models were trained on datasets that contain human biases, and as a result the models themselves become biased. Always verify generative AI output to make sure it is correct, and keep AI biases in mind.

Links referenced in the course:

AI21 Studio

DALL·E: Creating images from text
