My Learning Cafe

Tuesday, July 4, 2023

Introduction to Image Generation

Note: The below is from my learning from https://www.cloudskillsboost.google/course_templates/541 (Introduction to Image Generation).

While many approaches have been implemented for image generation, the following model families have looked more promising over time:

Variational autoencoders (VAEs)

Encode images to a compressed size and then decode back to the original size while learning the distribution of the data itself.

Generative adversarial networks (GANs)

Pit two neural networks against each other.

One neural network, the generator, creates images.

The other neural network, the discriminator, predicts whether the image is real or fake.

Over time, the discriminator gets better and better at distinguishing between real and fake, and the generator gets better and better at creating real looking fakes.

Autoregressive models

Generate images by treating an image as a sequence of pixels.

Let's discuss Diffusion Models.

Unconditioned diffusion models have no additional input or instruction. They can be trained on images of a specific thing, such as faces, and will learn to generate new images of that thing.

Conditioned diffusion models take an additional input, such as text, and support tasks like:

Text to image, where we generate an image from a text prompt

e.g. Batman with a cat face

Image inpainting

e.g. Remove the apple from the image

Text-guided image to image, where we can remove or add things and edit the image itself.

e.g. Horse in space with glowing headbands

How do Diffusion models work?

The idea is to systematically and slowly destroy structure in a data distribution through an iterative forward diffusion process. Really, this is going to be adding noise iteratively to an image. We then learn a reverse diffusion process that restores structure in the data, yielding a highly flexible and tractable generative model of the data.

In other words, we can add noise to an image iteratively, and we can then train a model that learns how to de-noise an image, thus generating novel images.

Process:

We start with a large dataset of images.
For each image, we add a small amount of noise in an iterative way.
We do this over and over, adding more noise each time, so we need to decide how many times to perform that operation.
By the end of it, we should reach a state of pure noise.
At this point, all structure in the initial image is completely gone.
The challenging part is how to go from a noisy image to a slightly less noisy image.
This process is called the "reverse diffusion process".
Note that at every step where we add noise, we also learn the reverse diffusion process. That is, we train a machine learning model that takes the noisy image as input and predicts the noise that has been added to it.
Over time, after seeing enough examples, this model gets very very good at removing noise from images.
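
The noising half of the process above can be sketched in a few lines of Python. This is only a toy under stated assumptions: the "image" is a short list of pixel values, the function names are made up for illustration, and a fixed noise amount `beta` is used at every step (real models typically vary it with a schedule).

```python
import math
import random

def add_noise_step(pixels, beta, rng):
    """One forward step: shrink the signal slightly and mix in Gaussian noise."""
    keep = math.sqrt(1.0 - beta)
    mix = math.sqrt(beta)
    return [keep * p + mix * rng.gauss(0.0, 1.0) for p in pixels]

def forward_diffusion(pixels, num_steps=1000, beta=0.02, seed=0):
    """Apply the noising step num_steps times (beta and num_steps are assumed values)."""
    rng = random.Random(seed)
    noisy = list(pixels)
    for _ in range(num_steps):
        noisy = add_noise_step(noisy, beta, rng)
    return noisy

def remaining_signal(num_steps, beta):
    """Fraction of the original pixel value that survives num_steps steps."""
    return math.sqrt(1.0 - beta) ** num_steps

image = [0.0, 0.25, 0.5, 0.75, 1.0]  # hypothetical 5-pixel "image"
noisy_image = forward_diffusion(image)
```

With these assumed values, `remaining_signal(1000, 0.02)` is tiny, which is the "all structure is gone" state described above.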

How do we generate images with it?

We can just start with pure noise and send that noise through our trained model.
We then take the output (the predicted noise) and subtract it from the noisy input.
And if we do that over and over and over again, we end up with a generated image.
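
The generation loop above might look roughly like the sketch below. It is not a real diffusion sampler: `toy_predict_noise` is a hypothetical stand-in for the trained model and simply "cheats" by knowing the clean target, and `step_size` is an assumed value. The point is only to show the shape of the loop: predict noise, subtract a bit of it, repeat.

```python
import random

def toy_predict_noise(noisy, target):
    """Stand-in for a trained model; it cheats by knowing the clean target."""
    return [n - t for n, t in zip(noisy, target)]

def generate(target, steps=50, step_size=0.1, seed=0):
    """Start from pure noise and repeatedly subtract part of the predicted noise."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in target]  # pure noise
    for _ in range(steps):
        predicted = toy_predict_noise(x, target)
        x = [xi - step_size * pi for xi, pi in zip(x, predicted)]
    return x

clean = [0.0, 0.5, 1.0]  # hypothetical 3-pixel "image" the toy model knows
result = generate(clean)
```

After enough iterations `result` ends up very close to `clean`. A real model has no access to the target and has to learn the noise prediction from many training images.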

Generative AI Studio

What is Generative AI?

It is a type of artificial intelligence that generates content for you.

Note: The below is from my learning from https://www.cloudskillsboost.google/course_templates/556 (Introduction to Generative AI Studio).

What type of content?

Any type of content like text, images etc.

How does it generate this content?

It generates this content by learning from a large amount of existing content. The process of learning from this existing content is called training, and it produces a foundation model. An example of a foundation model is an LLM (large language model). The foundation model can be used to generate content and to perform tasks such as content extraction.

One can add new data sets to the foundation model for a specific task, thus creating a new model.

How can I create a new model from a foundation model? Is it easy?

Use the Google Cloud tool called Vertex AI. Vertex AI is an end-to-end ML development platform on Google Cloud that helps you build, deploy, and manage machine learning models.

What is Generative AI Studio?

Generative AI Studio allows a user to quickly prototype and customize generative AI models with no code or low code. Generative AI Studio supports language, vision, and speech.

Language - Tune Language models

Vision - Generate images based on prompts

Speech - Generate text from speech or vice versa.

Best practices for prompt design

What is a prompt?

A prompt is the text input that you pass to the model.

Best practices for prompt design:

Be concise
Be specific and well-defined
Ask one task at a time
Ask to classify instead of generating (e.g. "Is X better to learn?" instead of "What is better to learn?")
Include examples (Adding examples tends to yield better results)
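
As a small illustration of the "classify" and "include examples" tips, here is a sketch that assembles a few-shot classification prompt as plain text. The helper name, the `Text:`/`Label:` layout and the sample sentences are all assumptions made for this example, not a format required by any particular model.

```python
def build_few_shot_prompt(instruction, examples, query):
    """Build one prompt string: instruction, labeled examples, then the new input."""
    lines = [instruction, ""]
    for text, label in examples:
        lines.append(f"Text: {text}")
        lines.append(f"Label: {label}")
        lines.append("")
    lines.append(f"Text: {query}")
    lines.append("Label:")
    return "\n".join(lines)

prompt = build_few_shot_prompt(
    "Classify the sentiment of the text as positive or negative.",
    [
        ("I loved this course.", "positive"),
        ("The video kept buffering.", "negative"),
    ],
    "The labs were easy to follow.",
)
print(prompt)
```

The prompt ends with an empty `Label:` so the model only has to fill in one word, which keeps the task specific and well-defined.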

There are a few model parameters one can experiment with to try to improve the quality of responses:

Temperature
Top P
Top K

Temperature is a number used to tune the degree of randomness.

Low temperature means selecting the words that are most probable and therefore more predictable.

High temperature implies more random, unexpected and, some may say, "creative" responses.

Top K lets the model randomly return a word from the K most probable words. For example, top 2 means you get a random word from the 2 most probable words.

Top P lets the model randomly return a word from the smallest set of most probable words whose probabilities add up to P.
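
To make the three parameters concrete, here is a minimal sketch of how they could act on a model's raw scores for the next word. The function names and the example scores are assumptions for illustration; real services may implement and combine these steps differently.

```python
import math
import random

def softmax_with_temperature(logits, temperature=1.0):
    """Turn raw scores into probabilities; a lower temperature sharpens them."""
    scaled = [score / temperature for score in logits]
    top = max(scaled)
    exps = [math.exp(s - top) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def top_k_filter(probs, k):
    """Keep the k most probable words and renormalize their probabilities."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep = order[:k]
    total = sum(probs[i] for i in keep)
    return {i: probs[i] / total for i in keep}

def top_p_filter(probs, p):
    """Keep the smallest set of top words whose probabilities add up to at least p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep, cumulative = [], 0.0
    for i in order:
        keep.append(i)
        cumulative += probs[i]
        if cumulative >= p:
            break
    total = sum(probs[i] for i in keep)
    return {i: probs[i] / total for i in keep}

def sample(candidates, rng):
    """Randomly pick one word index according to its probability."""
    indices = list(candidates)
    weights = [candidates[i] for i in indices]
    return rng.choices(indices, weights=weights, k=1)[0]

logits = [2.0, 1.0, 0.1]  # hypothetical scores for three candidate words
probs = softmax_with_temperature(logits, temperature=0.5)
choice = sample(top_k_filter(probs, k=2), random.Random(0))
```

Here `choice` is the index of the sampled word, and with `k=2` it can only be one of the two most probable words.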

Conversations

Before we try to create conversations, we need to specify the conversation context.

Context instructs how the model should respond.

We can add words that the conversation can or cannot use. Same goes for the topic to focus on or avoid.

Tune a Language Model

Prompt design allows for fast experimentation and customization.

However, we have to understand that even small changes in the wording of a prompt can change the model's responses significantly. Hence, we look to tune the model.

Thursday, June 29, 2023

Generative AI - Introduction to Large Language Models (LLM)

Note: The below is from my learning from https://www.cloudskillsboost.google/course_templates/536 (Introduction to Generative AI).

Large Language Models (LLMs) are a subset of Deep Learning.

They can be pre-trained and then fine-tuned for specific purposes.

What do we mean by pre-trained and fine-tuned?

Assume that in everyday life we train dogs in basic commands such as sit/stand/walk etc. This is basic training. But if we need to train a dog to be a police dog, it needs more specialized training on top of the basics. This is the difference between pre-trained and fine-tuned.

Similar idea applies to LLMs.

LLMs are trained to solve common language problems like document summarization, text classification, text generation etc.

They can then be tailored to solve specific problems in the field of finance, retail etc.

Benefits of using LLMs:

A single model can be used for various purposes.
Fine-tuning an LLM requires minimal field data.
Performance grows continuously as more data and parameters are added.

Note: the above image has been taken from the internet.

If we think of an example of getting answers for questions, the question answering model is able to search for an answer from a large document. Depending on the model used, the answer will either be extracted from a document or a new answer will be generated.

e.g. see the answer from Google's AI chatbot Bard, which can be accessed by going to bard.google.com

In the above example, we gave the prompt (question) to get the desired answer.

Prompts involve instructions and context passed to the language model to achieve the desired result.

There are three main kinds of LLMs (and each needs prompting in a different way).

Generic (or Raw) Language Model

Predict the next word

Instruction Tuned

Predict a response

Dialog Tuned

Have a dialog by predicting the next response

Tuesday, June 27, 2023

Generative AI (Introduction)

Note: The below is from my learning from https://www.cloudskillsboost.google/course_templates/536 (Introduction to Generative AI).

What is Generative AI?

Generative AI is a type of artificial intelligence technology that can produce various types of content like text, images, speech, audio etc.

What is AI?

AI has to do with the theory and methods to build machines that think and act like humans.

Machine learning (ML), which is a subfield of AI, is a program or system that trains a model from input data. The trained model can make useful predictions. ML gives computers the ability to learn without explicit programming.

Two of the most common classes of machine learning models are unsupervised and supervised (labeled data) ML models.

What problem can a supervised ML model solve?

A supervised ML model learns from labeled data. If we have historical labeled data of bill amounts and how much different people tipped based on order type, the model learns from these past examples to predict future values.
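
The tip example can be sketched as a tiny supervised model. This is a simplification under stated assumptions: the bill and tip numbers are made up, the order type is ignored, and the "model" is just a straight line fitted with least squares.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept with ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical labeled history: bill amount -> tip actually paid
bills = [10.0, 20.0, 30.0, 40.0]
tips = [1.5, 3.0, 4.5, 6.0]

slope, intercept = fit_line(bills, tips)
predicted_tip = slope * 50.0 + intercept  # predict the tip for a new bill of 50
```

The labels (the tips) are what make this supervised: the model is corrected against known answers and then used on a bill it has not seen.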

If you want to look at tenure and income and then group or cluster employees to see whether someone is on the fast track, that is a case for an unsupervised ML model. Unsupervised problems are all about discovery, about looking at the raw data and seeing if it naturally falls into groups.

What is Deep Learning?

Deep learning is a type of machine learning that uses artificial neural networks. Deep learning models typically have many layers of neurons, which allows them to learn more complex patterns than traditional machine learning models.

Gen AI is a subset of deep learning. Large language models are also a subset of deep learning.

Deep learning models can be divided into two types, generative and discriminative.

A discriminative model is a type of model that is used to classify or predict labels for data points. Discriminative models are typically trained on a data set of labeled data points. Once trained, it can be used to predict the label for new data points.

A generative model generates new data instances based on a learned probability distribution of existing data. Thus generative models generate new content.

A discriminative model, given a picture of a dog as input, classifies it as a dog and not a cat.

The generative model learns the joint probability distribution of the data and its labels, predicts the conditional probability that this is a dog, and can then generate a picture of a dog.

So generative models can generate new data instances while discriminative models discriminate between different kinds of data instances.

In generative AI, we as users can generate our own content, whether it be text, images, audio, video etc. For example, models like PaLM (Pathways Language Model) ingest very large amounts of data from multiple sources across the internet and build foundation language models that we can use simply by asking a question in a prompt. (The PaLM API lets you test and experiment with Google's large language models and gen AI tools.)

Given the above, let's now formally define Generative AI.

Generative AI is a type of artificial intelligence that creates new content based on what it has learned from existing content.
The process of learning from existing content is called training and results in the creation of a statistical model.
When given a prompt, AI uses the model to predict what an expected response might be, and this generates new content.

Generative AI Studio helps developers create and deploy Generative AI models by providing a variety of tools and resources that make it easy to get started.

Generative AI App Builder lets you create gen AI apps without having to write any code. It has a drag and drop interface.

Tuesday, May 30, 2023

Make a talking Avatar in 5 min (using AI)

Yes, you read it right!

Make an AI talking avatar in 5 min.

How?

Step 1 - Go to https://playgroundai.com/ and make your own avatar (you need to sign in). Alternatively, you can skip this and go to steps 2 and 3 (in step 3 you can choose from existing avatars).

Step 2 - Go to https://beta.elevenlabs.io/ and convert text to audio.

Step 3 - Go to https://studio.d-id.com/ and create your own video:

Take the avatar from step 1 (or an existing one on this site).
Upload the audio from step 2.
Click Generate video.

Example below:

Sunday, February 5, 2023

Kafka - Basics

Hello All!!!

Starting a new series on learning Kafka (from basics)!!!

What is Kafka?

Apache Kafka allows you to decouple your data streams and your systems. So your source system will feed data into Kafka and target system will take data from Kafka.
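
The decoupling idea can be illustrated without Kafka itself. The sketch below uses Python's standard `queue.Queue` as a stand-in for a topic, so it shows only the pattern and none of Kafka's actual API; the function names and event strings are made up.

```python
import queue
import threading

def source_system(topic, events):
    """Producer side: only knows about the topic, not about who reads it."""
    for event in events:
        topic.put(event)
    topic.put(None)  # sentinel telling the reader there is nothing more

def target_system(topic, received):
    """Consumer side: only knows about the topic, not about who wrote to it."""
    while True:
        event = topic.get()
        if event is None:
            break
        received.append(event)

topic = queue.Queue()  # stand-in for a Kafka topic (an assumption for this sketch)
received = []
consumer = threading.Thread(target=target_system, args=(topic, received))
consumer.start()
source_system(topic, ["order-1", "order-2", "order-3"])
consumer.join()
```

Neither function calls the other directly, which is the decoupling described above: the source and target systems can be changed or scaled independently as long as both agree on the topic.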

Why Apache Kafka?

Open Source
Resilient Architecture
Distributed
Fault Tolerant
Horizontally scalable
High performance with low latency

Saturday, July 30, 2022

Google Cloud - Q&A

What is a critical outcome of API management? - Measuring and tracking business performance.
What provides the highest level of security? - Titan Security Keys.
What are the 4 key benefits of managing cloud costs? - Visibility, accountability, control and intelligent recommendations.
What is Chronicle? - It is a service built on top of Google Cloud infrastructure to ingest data (logs etc.) and scan for threats.
What are the types of support? - Basic, Standard, Enhanced and Premium.
What is each of the following?

Dataproc - Hadoop/Spark
Dataflow - Streaming data
Dataprep - Wrangle data based on tabular/interactive or visual structure
Dataplex - Unified data management

What are the three components of Google Cloud's defence-in-depth data security design? - Sharding, encryption key, key encryption key
What is each of the following?

Cloud Profiler - Analyze application performance (CPU and memory usage)
Cloud Debugger - Inspect the state of a running application without stopping it
Cloud Trace - Collect latency data to find performance bottlenecks
Cloud Monitoring - Monitor the performance of the entire cloud infrastructure
Cloud Vision API - Identify images, text etc. in a document

What is BYOIP? - Bring your own IP.
You build a new application on cloud while keeping the old application on-premises. What is this pattern called? - Invent in brownfield. [Greenfield implies something completely new]
How to minimize payment for traffic from Google Cloud to the internet? - Use Cloud VPN.
Your org uses Active Directory to authenticate users. Google account access must be removed when their AD account is terminated. - Use single sign-on in the Google domain.
You are migrating from on-premises to Google Cloud. Which functions are owned by the cloud provider? - Infrastructure architecture and hardware maintenance.
Which product provides a consistent platform for multi-cloud application deployments and extends other Google Cloud services to your environment? - Anthos
Your organization needs to restrict access to a Cloud Storage bucket. Only employees who are based in Canada should be allowed to view the contents. What is the most effective and efficient way to satisfy this requirement? - Configure Google Cloud Armor to allow access only from IP addresses in Canada.
Which Google Cloud managed solution automates your build, testing, and deployment process? - Cloud Build
You want Google Cloud to privately and securely access your large volume of on-premises data, and you also want to minimize latency. - Google Edge network
2 hour SLA - Enhanced support model
Plug-and-play AI components with which you can easily build ML services - AI Hub
Recommendations AI delivers highly personalized product recommendations at scale.
Document AI uses AI to unlock insights from documents.
Cloud Talent Solution uses AI with job search and talent acquisition capabilities.
Preview, Early Access, Alpha, and Beta do not have any SLA commitments.
Which of the following NIST Cloud characteristics uses the business model of shared resources in a cloud environment? - Multi-Tenancy
What are the network requirements for Private Google Access?

Because Private Google Access is enabled on a per-subnet basis, you must use a VPC network. Legacy networks are not supported because they don't support subnets.
Private Google Access does not automatically enable any API. You must separately enable the Google APIs you need to use via the APIs & services page in the Google Cloud Console.
If you use the private.googleapis.com or the restricted.googleapis.com domain names, you'll need to create DNS records to direct traffic to the IP addresses associated with those domains.
Your network must have appropriate routes for the destination IP ranges used by Google APIs and services. These routes must use the default internet gateway next hop. If you use the private.googleapis.com or the restricted.googleapis.com domain names, you only need one route (per domain). Otherwise, you'll need to create multiple routes.
Egress firewalls must permit traffic to the IP address ranges used by Google APIs and services. The implied allow egress firewall rule satisfies this requirement.

How to manage a bunch of API keys for external services that are accessed by different applications, which are used by a few teams? - Store the information in Secret Manager. Secret Manager is a secure and convenient storage system for API keys, passwords, certificates, and other sensitive data. It provides a central place and single source of truth to manage, access, and audit secrets across Google Cloud.
Which Google Cloud product gives you a consistent platform for multi-cloud application deployments and extends other Google Cloud services to your environment? - Anthos
Bigtable is best suited for time series data. It also has high read-write throughput and the ability to scale globally.
VM instances that only have internal IP addresses (no external IP addresses) can use Private Google Access. They can reach the external IP addresses of Google APIs and services.
Google offers Firebase. In the Firebase console, a message that has to be delivered to a customer at a certain degree of change in behavior can be managed through _________________ >> Notifications composer
Google Cloud's Web App and API Protection (WAAP) protects applications from bots.
You are working with a user to set up an application in a new VPC behind a firewall, and the user is concerned about data egress. To help, you want to configure the fewest open egress ports. >>> Set up a low-priority rule (65534) that blocks all egress. Create a higher-priority rule (1000) that allows only the specific ports needed. (A lower number means a higher priority.)
Container Registry is multi-regional only, but Artifact Registry supports both multi-regional and regional repositories.
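
The egress firewall answer above relies on how priorities are evaluated, which the toy evaluator below tries to mimic. It is a simplified sketch, not the Google Cloud API: the `EgressRule` class, the function name and port 443 are assumptions chosen for illustration, and only the port is matched.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class EgressRule:
    priority: int                # lower number = higher priority
    action: str                  # "allow" or "deny"
    port: Optional[int] = None   # None matches every port

def evaluate_egress(rules, port):
    """Return the action of the highest-priority rule that matches the port."""
    for rule in sorted(rules, key=lambda r: r.priority):
        if rule.port is None or rule.port == port:
            return rule.action
    return "allow"  # stands in for the implied allow egress rule

rules = [
    EgressRule(priority=65534, action="deny"),             # block all egress
    EgressRule(priority=1000, action="allow", port=443),   # hypothetical allowed port
]
```

With these rules, `evaluate_egress(rules, 443)` returns `"allow"` because the priority-1000 rule is checked first, while any other port falls through to the 65534 deny rule.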