Thursday, June 29, 2023

Generative AI - Introduction to Large Language Models (LLM)

  Note: The notes below are based on my learning from https://www.cloudskillsboost.google/course_templates/536 (Introduction to Generative AI).

Large Language Models (LLMs) are a subset of Deep Learning.

They can be pre-trained and then fine-tuned for specific purposes.

What do we mean by pre-trained and fine-tuned?

In everyday life we train dogs on basic commands such as sit/stand/walk etc. This is basic training. But if we need to train a dog to be a police dog, it needs specialized training on top of the basics. This is the difference between pre-trained and fine-tuned.

Similar idea applies to LLMs.

LLMs are trained to solve common language problems like document summarization, text classification, text generation etc.

They can then be tailored to solve specific problems in the field of finance, retail etc.
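To make the pre-train/fine-tune idea concrete, here is a minimal sketch using a toy bigram word predictor, not a real LLM: the same model is first trained on generic text, then trained further on domain (finance-flavored) text, which shifts its predictions. All the class and variable names are my own invention for illustration.

```python
from collections import Counter, defaultdict

class ToyBigramModel:
    """A toy bigram 'language model' that predicts the most likely next word.
    Purely illustrative of pre-training vs fine-tuning; real LLMs are neural
    networks with billions of parameters."""

    def __init__(self):
        # counts[word] maps each following word to how often it was seen
        self.counts = defaultdict(Counter)

    def train(self, text):
        words = text.lower().split()
        for a, b in zip(words, words[1:]):
            self.counts[a][b] += 1

    def predict_next(self, word):
        following = self.counts[word.lower()]
        return following.most_common(1)[0][0] if following else None

# "Pre-training" on generic text
model = ToyBigramModel()
model.train("the market is open the market is busy")

# "Fine-tuning": continued training on domain-specific (finance) text
model.train("the market crashed today the market crashed again and the market crashed")

print(model.predict_next("market"))  # prints "crashed"
```

The point of the sketch: fine-tuning does not start from scratch; it updates an already-trained model so that its behavior reflects the new domain.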

Benefits of using LLMs:

  • A single model can be used for many different tasks
  • Fine-tuning an LLM requires minimal domain-specific data
  • Performance keeps improving as more data and parameters are added




Note: the above image has been taken from the internet.

If we think of an example of getting answers for questions, the question answering model is able to search for an answer from a large document. Depending on the model used, the answer will either be extracted from a document or a new answer will be generated.
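As a rough illustration of the *extractive* case, here is a crude sketch that returns the sentence from a document sharing the most keywords with the question. Real extractive QA models score answer spans with a neural network; this keyword overlap and the stop-word list are simplifications of my own.

```python
def extractive_answer(question, document):
    """Return the document sentence that best matches the question's keywords.
    A crude stand-in for extractive QA, for illustration only."""
    stop_words = {"what", "is", "the", "a", "an", "of", "are"}
    q_words = set(question.lower().split()) - stop_words
    sentences = [s.strip() for s in document.split(".") if s.strip()]
    # Pick the sentence with the largest keyword overlap with the question.
    return max(sentences, key=lambda s: len(q_words & set(s.lower().split())))

doc = ("Bard is a chatbot from Google. "
       "LLMs are a subset of deep learning. "
       "Fine-tuning adapts a model to a domain.")
print(extractive_answer("What is Bard", doc))  # prints "Bard is a chatbot from Google"
```

A generative model, by contrast, would compose a new answer in its own words rather than copying a span from the document.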

e.g. see the answer from Google's AI chatbot Bard, which can be accessed by going to bard.google.com



In the above example, we gave the prompt (question) to get the desired answer.
Prompts involve instructions and context passed to the language model to achieve the desired result.
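A prompt, then, is just text assembled from those pieces. The template below is one common pattern for combining an instruction, context, and question; it is a sketch of my own, not a format required by any particular model.

```python
def build_prompt(instruction, context, question):
    """Assemble a prompt from an instruction, supporting context, and the
    user's question. Illustrative template; exact formats vary by model."""
    return (
        f"{instruction}\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
        f"Answer:"
    )

prompt = build_prompt(
    instruction="Answer using only the context provided.",
    context="LLMs are a subset of deep learning.",
    question="What are LLMs a subset of?",
)
print(prompt)
```

The trailing "Answer:" cue invites the model to continue the text with the desired response.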

There are three main kinds of LLMs (and each needs to be prompted in a different way).

  • Generic (or Raw) Language Model
    • Predict the next word
  • Instruction Tuned
    • Predict a response
  • Dialog Tuned
    • Have a dialog by predicting the next response
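To show how the prompting style differs across the three kinds, here are example prompts as plain Python values. These strings and the chat-turn structure are illustrative only; real APIs have their own formats.

```python
# Generic (raw) LM: give it text, and it simply continues with likely next words.
generic_prompt = "The cat sat on the"

# Instruction-tuned LM: state a task, and it predicts a response to the instruction.
instruction_prompt = (
    "Summarize the following text in one sentence:\n"
    "LLMs are pre-trained on large text corpora and then fine-tuned for tasks."
)

# Dialog-tuned LM: the prompt is a conversation, and it predicts the next turn.
dialog_prompt = [
    {"role": "user", "content": "What is fine-tuning?"},
    {"role": "assistant", "content": "Adapting a pre-trained model to a task."},
    {"role": "user", "content": "Give an example."},
]
```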

