WEBVTT

00:00.320 --> 00:01.000
Welcome.

00:01.280 --> 00:07.280
In this episode, we are diving into a really important topic for building agents and AI automations:

00:07.600 --> 00:08.920
vector databases.

00:09.200 --> 00:10.200
So let's jump in.

00:10.840 --> 00:15.280
This is something you really need to understand, especially for building agents.

00:15.600 --> 00:21.880
And later in this course we will build agents with long-term memory, and vector databases are key for

00:21.920 --> 00:22.280
that.

00:22.520 --> 00:27.640
So we will see how vector databases work in practice and why they are so crucial.

00:27.840 --> 00:33.600
So in this video you will learn what vector databases are, why they are essential for large language

00:33.600 --> 00:40.040
models like GPT, and how they help AI systems understand meaning, not just exact words.

00:40.760 --> 00:41.840
So let's get into it.

00:42.160 --> 00:42.720
All right.

00:42.720 --> 00:44.120
Let's start with the basics.

00:44.560 --> 00:46.000
What is a vector database?

00:47.040 --> 00:54.680
A vector database is a new kind of database designed especially for large language models and AI systems.

00:55.440 --> 01:02.440
Traditional databases store data in rows and columns and match exact keywords.

01:03.400 --> 01:10.640
Vector databases, on the other hand, store information as vectors, allowing them to find results

01:10.640 --> 01:15.400
based on meaning and context, not just exact text.

01:15.840 --> 01:23.920
So instead of matching exact words, they retrieve data that's similar in context, which is super useful

01:23.920 --> 01:25.440
for large language models.

01:25.920 --> 01:27.320
So think of it like this.

01:27.320 --> 01:32.480
Traditional databases work like looking for an exact street address.

01:33.000 --> 01:42.000
Vector databases are more like searching for places similar to your favorite coffee shop, so they understand

01:42.000 --> 01:45.200
the vibe or meaning, not just the exact name.

01:46.400 --> 01:49.720
Now what about vectors and vector stores?

01:50.160 --> 01:56.400
A vector is just a group of numbers that captures the meaning of some text or data.

01:56.760 --> 02:04.360
So you can think of it like a summary of meaning, not what something says word for word, but what

02:04.360 --> 02:05.120
it's about.

02:05.120 --> 02:11.630
So vectors focus on meaning, similarity, and context, not just exact matches of words.
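
NOTE
A minimal sketch of "vectors capture meaning", assuming hand-made 3-number vectors.
The numbers and the names coffee_shop, cafe, and tax_form are invented for illustration.
Real embeddings come from a model and are much longer.
Cosine similarity is one common way to compare two vectors.
```python
import math
def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)
# Made-up vectors: similar meanings are given similar numbers on purpose.
coffee_shop = [0.9, 0.8, 0.1]
cafe = [0.8, 0.9, 0.2]
tax_form = [0.1, 0.2, 0.9]
sim_cafe = cosine_similarity(coffee_shop, cafe)
sim_tax = cosine_similarity(coffee_shop, tax_form)
# The two related phrases score higher than the unrelated one.
print(sim_cafe > sim_tax)
```
A score near 1 means the vectors point in almost the same direction, which is how "similar meaning" is measured here.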

02:12.510 --> 02:18.750
A vector store is where all these vectors are saved, kind of like a searchable memory for LLMs.

02:19.030 --> 02:26.470
So when you ask a question, it can go look through the stored meanings and find the most relevant information.

02:26.790 --> 02:35.070
So when you input text or data, vector databases store it as vectors, which are like unique fingerprints

02:35.070 --> 02:37.070
representing the meaning of your data.

02:37.230 --> 02:43.870
I will explain how that works visually in a moment, but for now, just keep in mind that vectors equal

02:43.910 --> 02:47.910
meaning and vector stores equal memory banks for meaning.

02:48.910 --> 02:49.710
Let's move on.

02:53.590 --> 03:00.790
Now, why are vector databases important for agents? Because large language models like GPT, the brains

03:00.790 --> 03:05.790
behind ChatGPT, rely on understanding context and meaning.

03:06.230 --> 03:13.990
So vector databases help agents with that because they store relevant knowledge, and they act as a

03:13.990 --> 03:19.710
memory for AI and enable LLMs to store and retrieve information dynamically.

03:20.150 --> 03:21.430
Now semantic search.

03:22.030 --> 03:27.830
So instead of just finding exact words, they let LLMs search by meaning.

03:28.070 --> 03:35.990
So basically, semantic search is a technique that supports searching, finding, and real-time insights for LLMs.

03:36.150 --> 03:42.510
So they retrieve the most relevant data, whether it's customer frequently asked questions or personalized

03:42.510 --> 03:43.390
recommendations.

03:43.710 --> 03:44.590
And the next slide.

03:46.950 --> 03:50.910
Now let's quickly go over how the vectorizing process works.

03:51.390 --> 03:56.830
And this is what actually powers things like RAG agents and smart retrieval.

03:57.390 --> 04:00.510
So first we start by chunking the data.

04:01.430 --> 04:07.950
So this just means breaking large documents into smaller, more manageable pieces of text.

04:08.630 --> 04:11.110
Next we create vectors.

04:11.110 --> 04:17.430
So each chunk is converted into a numerical representation called an embedding.

04:18.150 --> 04:20.510
And this captures the meaning of that chunk.

04:21.750 --> 04:25.950
Then we store these vectors in a vector database.

04:26.830 --> 04:32.110
This is like saving all those meanings into a searchable memory.

04:32.750 --> 04:41.710
When a user asks a question, the system turns the question into its own vector and compares it against

04:41.710 --> 04:45.950
the stored vectors to find the closest matches.

04:46.870 --> 04:55.630
And finally, it uses the best matching chunks to generate a response that's accurate and relevant to

04:55.630 --> 04:56.310
your data.
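
NOTE
A rough sketch of the store-and-search steps just described, not a real vector database.
The names toy_embed and ToyVectorStore are hypothetical, and the sample chunks are invented.
Big assumption: toy_embed uses plain word counts as a stand-in, so it only matches shared words.
A real system would call an embedding model, which is what lets it match by meaning.
```python
import math
from collections import Counter
def toy_embed(text):
    """Stand-in embedding: word counts. A real system would call an embedding model."""
    return Counter(text.lower().split())
def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0
class ToyVectorStore:
    """Keeps (vector, chunk) pairs in memory and ranks them against a question."""
    def __init__(self):
        self.items = []
    def add(self, chunk):
        self.items.append((toy_embed(chunk), chunk))
    def search(self, question, k=1):
        q = toy_embed(question)
        ranked = sorted(self.items, key=lambda item: cosine(q, item[0]), reverse=True)
        return [chunk for _, chunk in ranked[:k]]
store = ToyVectorStore()
for chunk in [
    "refunds are issued within 14 days",
    "our office is open monday to friday",
    "shipping takes 3 to 5 business days",
]:
    store.add(chunk)
# The question becomes a vector too, then gets compared against the stored vectors.
top = store.search("how long do refunds take", k=1)
print(top)
```
The flow is the same as in the slide: store vectors, turn the question into a vector, return the closest chunks.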

04:56.710 --> 05:00.710
Now let's take a closer look at how the vectorization process actually works.

05:01.390 --> 05:03.350
And I will walk you through this diagram.

05:04.230 --> 05:08.870
I created this diagram for my paper for my AI system.

05:10.030 --> 05:12.550
It all starts with your source documents.

05:13.590 --> 05:17.470
These can be PDFs, JSON, or plain text files.

05:17.470 --> 05:22.310
These are the raw materials we want to make searchable by meaning.

05:22.710 --> 05:29.910
So first we extract the content from those files, just the useful text inside.

05:30.550 --> 05:35.790
Then we chunk that content into smaller, more manageable pieces.

05:35.790 --> 05:40.230
So it's much easier for LLMs to work with smaller chunks.

05:40.390 --> 05:45.630
Think of this like breaking a big book into individual paragraphs or sections.
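
NOTE
A minimal sketch of chunking, assuming a simple split by word count with a small overlap.
The function name chunk_text and the parameters chunk_size and overlap are hypothetical.
Real pipelines often split by paragraphs, sentences, or tokens instead.
```python
def chunk_text(text, chunk_size=50, overlap=10):
    """Split text into chunks of up to chunk_size words; neighbours share overlap words."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        # Stop once this chunk has reached the end of the text.
        if start + chunk_size >= len(words):
            break
    return chunks
# Twelve placeholder words, split into chunks of five with one shared word.
sample = " ".join(f"word{i}" for i in range(12))
pieces = chunk_text(sample, chunk_size=5, overlap=1)
for piece in pieces:
    print(piece)
```
The overlap is a design choice: it keeps a sentence that straddles a boundary from being cut off in both chunks.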

05:45.630 --> 05:53.150
Each chunk is then converted into a vector, a group of numbers that represents the meaning of

05:53.190 --> 05:53.910
that chunk.

05:54.310 --> 05:58.990
And this is done using an embedding model, which generates what we call an embedding.

05:58.990 --> 06:06.190
And these embeddings are stored in a vector database, also called a vector store or knowledge base.

06:06.190 --> 06:11.870
So it's like saving the meaning of every chunk into a memory bank that can be searched later.

06:11.910 --> 06:18.830
Now let's say a user asks a question and that question is also turned into a vector.

06:18.950 --> 06:20.710
So this is the question embedding.

06:20.710 --> 06:23.620
Then the system performs semantic search.

06:24.340 --> 06:31.660
So in other words, it compares the question's meaning to the stored vectors and finds the most relevant matches

06:31.660 --> 06:39.820
and the best matching chunks are passed into an LLM like GPT, which uses them to generate a final

06:40.220 --> 06:41.580
answer back to the user.
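
NOTE
A minimal sketch of the last step, assuming the retrieved chunks are simply pasted into the prompt.
The function name build_prompt and the prompt wording are hypothetical.
The call to the LLM itself is left out, and the retrieved chunk is invented sample data.
```python
def build_prompt(question, chunks):
    """Combine retrieved chunks and the user's question into one prompt string."""
    context = "\n".join(f"- {chunk}" for chunk in chunks)
    return (
        "Answer the question using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {question}\n"
        "Answer:"
    )
# In a real system this list would come from the semantic search step.
retrieved = ["refunds are issued within 14 days"]
prompt = build_prompt("how long do refunds take", retrieved)
print(prompt)
```
The resulting string is what would be sent to the LLM so that its answer is grounded in your own data.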

06:42.340 --> 06:47.580
Now let's talk about popular vector databases: Pinecone, Supabase, and Qdrant.

06:47.940 --> 06:54.180
So Pinecone is specifically designed for vector search and is widely recognized as one of the easiest

06:54.180 --> 06:57.020
tools to use for working with vector data.

06:58.100 --> 07:04.260
Why is it great for agents? Pinecone excels at semantic search, so it's perfect for agents that need

07:04.300 --> 07:07.220
to retrieve contextually relevant information quickly.

07:08.340 --> 07:11.900
It's a fully managed solution, so you don't need to worry about infrastructure.

07:13.660 --> 07:15.700
Just focus on your agent's functionality.

07:16.620 --> 07:19.580
So it's plug and play for fast implementation.

07:20.420 --> 07:24.620
It's optimized for high-speed searches when working with real-time agents.

07:25.260 --> 07:32.940
Pricing can be on the higher side for enterprise-scale solutions, but for individual users or small teams

07:32.940 --> 07:33.940
it is very affordable.

07:34.140 --> 07:38.220
And because it's fully managed, there is less control over customization.

07:38.820 --> 07:45.180
And the best use cases are chatbots, semantic search for customer support, and personalized recommendations.

07:45.620 --> 07:48.060
We will use Pinecone later in this course.

07:48.340 --> 07:49.300
Now, Supabase.

07:49.740 --> 07:53.420
Supabase is not strictly a vector database.

07:53.420 --> 07:56.540
It's an open source alternative to Firebase.

07:57.060 --> 08:01.660
However, with extensions like pgvector, it can handle vector embeddings.

08:02.460 --> 08:09.220
So Supabase combines the flexibility of traditional relational databases with the capability to store

08:09.260 --> 08:09.940
vectors.

08:10.260 --> 08:17.940
So it's great if your agent needs both structured data like user profiles and vector based search.

08:18.420 --> 08:19.260
So it's open source.

08:19.260 --> 08:22.020
You can self-host it, giving you full control.

08:22.140 --> 08:27.460
It combines relational and vector storage so you can centralize all your data.

08:28.020 --> 08:34.620
It requires more setup and maintenance compared to fully managed options, and may not match Pinecone's

08:34.620 --> 08:37.660
speed for large scale vector searches.

08:38.820 --> 08:45.940
And the best use case is agents that handle hybrid data, so combining customer information with semantic

08:45.940 --> 08:48.420
search for frequently asked questions.

08:51.660 --> 08:57.460
We will be using Supabase later in this course, so we will see how it works in practice. Now,

08:57.500 --> 08:58.100
Qdrant.

08:58.140 --> 09:04.020
Qdrant is an open source vector database designed specifically for AI and machine learning applications.

09:04.300 --> 09:09.140
So it's a favorite among developers who value full control over the database.

09:09.820 --> 09:15.740
Built from the ground up for vector search, it is great for self-hosted solutions.

09:15.980 --> 09:22.500
It offers integration with embeddings from models like OpenAI and is perfect for cost efficient AI workflows.

09:23.380 --> 09:27.380
It's fast and efficient for mid to large sized datasets.

09:27.660 --> 09:33.170
It integrates with frameworks like LangChain, so you can build multi-agent systems. Because it's

09:33.170 --> 09:33.930
self-hosted,

09:33.930 --> 09:39.450
it requires technical expertise to set up and maintain, and it does not provide a fully managed service

09:39.450 --> 09:40.610
like Pinecone does.

09:41.290 --> 09:47.850
And the best use case is self-hosted agents for semantic search, such as customer

09:47.850 --> 09:50.370
recommendations or knowledge management systems.

09:50.650 --> 09:52.970
To conclude, which one should you choose?

09:53.370 --> 09:59.890
In my opinion, Pinecone is perfect if you are building agents with a focus on semantic search and need

09:59.890 --> 10:01.250
a fully managed solution.

10:02.210 --> 10:09.530
Supabase is great for hybrid data workflows, especially if you are already using relational data alongside

10:09.530 --> 10:17.010
vectors. And Qdrant is great if you want full control and need a cost-efficient open source solution

10:17.010 --> 10:18.090
for your agents.

10:19.010 --> 10:24.890
So to wrap up, vector databases are the backbone of modern agents and RAG systems.

10:25.410 --> 10:30.210
They help your agents remember, understand, and respond with context.

10:30.450 --> 10:37.170
Now that you know the basics, you will learn how to use vector databases in projects later in this

10:37.170 --> 10:37.650
course.