WEBVTT

00:00.080 --> 00:07.160
This slide introduces a critical shift in how we must think about security in AI powered applications.

00:07.640 --> 00:12.080
Large language models do not simply extend traditional software systems.

00:12.360 --> 00:15.320
They fundamentally change the threat landscape.

00:15.920 --> 00:23.240
As shown in the visual security is no longer just about protecting servers and APIs, but about safeguarding

00:23.240 --> 00:26.040
reasoning, context, and behavior.

00:26.680 --> 00:29.400
The subtitle emphasizes an important reality.

00:30.040 --> 00:35.160
Llms introduce unprecedented security challenges that require a new mindset.

00:35.600 --> 00:41.520
Traditional application security assumes inputs are structured and validated against strict schemas.

00:42.160 --> 00:48.960
Llms, however, operate on natural language, which blurs the boundary between instructions and data.

00:49.000 --> 00:53.520
This section focuses on understanding the threats before attempting to solve them.

00:54.080 --> 00:59.440
Without a clear threat model defenses are incomplete and often ineffective.

01:00.120 --> 01:07.300
By the end of this section, you'll understand why LLM systems must be designed with security as a foundational

01:07.300 --> 01:10.260
requirement, not a feature added later.

01:10.300 --> 01:17.340
This slide explains why LM security requires a fundamentally different approach from traditional application

01:17.340 --> 01:18.020
security.

01:18.620 --> 01:25.660
Unlike conventional systems where inputs are parsed and validated against predefined formats, LMS accept

01:25.660 --> 01:32.100
free form language, making it difficult to distinguish legitimate requests from malicious manipulation.

01:32.740 --> 01:39.540
The diagram highlights four critical layers of risk the prompt layer, context layer, tool layer,

01:39.660 --> 01:40.860
and output layer.

01:41.260 --> 01:44.140
Each of these represents a unique attack surface.

01:44.580 --> 01:47.140
User prompts can contain hidden instructions.

01:47.380 --> 01:49.740
Documents may embed malicious content.

01:50.060 --> 01:56.580
Tools can be exploited through trusted responses, and outputs themselves may leak sensitive information.

01:57.060 --> 02:04.180
The most important takeaway is the principle stated clearly on this slide LMS must be treated as untrusted

02:04.180 --> 02:05.060
components.

02:05.420 --> 02:10.470
You cannot assume the model will follow instructions Functions correctly or reject malicious input on

02:10.470 --> 02:10.990
its own.

02:11.430 --> 02:19.470
Security controls must therefore exist outside the model, validating inputs, isolating context, and

02:19.470 --> 02:22.070
sanitizing outputs at every step.

02:22.110 --> 02:28.270
This slide provides a structured overview of the major categories of threats facing LM systems.

02:28.750 --> 02:35.070
Unlike traditional vulnerabilities that exploit code or memory, LM attacks exploit behavior.

02:35.190 --> 02:37.870
How models interpret and respond to language.

02:38.470 --> 02:45.110
The first category is prompt based attacks which manipulate model behavior using carefully crafted inputs.

02:45.830 --> 02:52.190
The second is data exfiltration, where attackers attempt to extract sensitive information from prompts,

02:52.230 --> 02:54.110
context, or training data.

02:55.070 --> 03:01.310
The third is abuse and misuse, or systems are repurposed for malicious activities such as scams or

03:01.310 --> 03:02.670
malware generation.

03:03.190 --> 03:09.470
Finally, supply chain risks arise from dependencies on third party models, tools and plugins.

03:10.190 --> 03:15.530
The reality check at the bottom is critical Attackers don't hack LMS in the traditional sense.

03:15.930 --> 03:19.090
They influence outcomes by speaking the model's language.

03:19.690 --> 03:25.090
Effective defences must therefore focus on controlling behaviour, not just infrastructure.

03:25.330 --> 03:31.410
This slide introduces prompt injection, one of the most serious threats in LMS systems.

03:31.850 --> 03:38.050
Prompt injection occurs when an attacker embeds malicious instructions into user controlled input,

03:38.210 --> 03:42.890
causing the model to override system rules or perform unsafe actions.

03:43.490 --> 03:49.210
Unlike SQL injection or XSS, prompt injection does not exploit parsing bugs.

03:49.570 --> 03:56.410
It exploits the fact that LMS process all instructions, system prompts, and user input as natural

03:56.410 --> 03:58.970
language within the same context window.

03:59.650 --> 04:04.090
The attack example shown illustrates how simple and powerful this can be.

04:04.650 --> 04:11.050
A single sentence asking the model to ignore previous instructions can expose system prompts, internal

04:11.050 --> 04:13.170
logic, or safety constraints.

04:13.930 --> 04:18.580
The danger lies in how difficult prompt injection is to fully eliminate.

04:19.020 --> 04:24.020
Defending against it often requires trade offs between usability and security.

04:24.460 --> 04:30.820
This makes prompt injection a first class threat that must be addressed explicitly in system design.

04:31.580 --> 04:37.820
This slide breaks prompt injection into three distinct forms, each requiring different defenses.

04:38.340 --> 04:43.100
Direct injection involves malicious instructions entered directly by users.

04:43.620 --> 04:49.700
These attacks are easy to execute and often used to probe system boundaries, but they are also easier

04:49.700 --> 04:50.580
to detect.

04:50.980 --> 04:54.260
Indirect injection is more subtle and dangerous.

04:54.700 --> 05:01.460
Here, malicious instructions are embedded inside documents, web pages, or emails that the LM later

05:01.460 --> 05:02.340
processes.

05:02.900 --> 05:08.780
This is especially problematic for rack systems where untrusted documents are ingested automatically.

05:09.380 --> 05:14.380
Tool based injection exploits trusted tool outputs or API responses.

05:14.940 --> 05:20.960
Because the system assumes tool responses are safe, attackers can chain tools together to escalate

05:20.960 --> 05:21.560
attacks.

05:22.160 --> 05:24.080
The core challenge is stated clearly.

05:24.480 --> 05:27.960
Models cannot reliably distinguish instructions from data.

05:28.480 --> 05:29.680
This is not a flaw.

05:29.960 --> 05:32.880
It is a fundamental property of how llms work.

05:33.400 --> 05:37.160
Security must therefore be layered and external to the model.

05:37.360 --> 05:40.440
This slide focuses on data leakage.

05:40.640 --> 05:44.760
One of the most subtle and damaging risks in LLM systems.

05:45.200 --> 05:50.720
Data leakage often occurs quietly without triggering traditional security alerts.

05:51.200 --> 05:58.760
The slide highlights high risk scenarios such as Rag systems, multi-user applications, and enterprise

05:58.760 --> 06:01.400
tools with access to sensitive documents.

06:02.000 --> 06:07.960
Leakage can occur through overly verbose outputs, where models include more information than necessary,

06:08.240 --> 06:12.240
or through prompt logging where sensitive data is stored in logs.

06:12.800 --> 06:19.440
Another serious risk is training data memorization, where models reproduce sensitive content when prompted

06:19.440 --> 06:20.160
correctly.

06:20.760 --> 06:28.330
Context crossover is especially Dangerous in multi-user systems, where one user's information contaminates

06:28.370 --> 06:29.330
another session.

06:29.930 --> 06:35.730
These risks make it clear that protecting data in LM systems requires more than encryption.

06:36.050 --> 06:41.210
It requires strict isolation, filtering, and disciplined system design.

06:41.450 --> 06:46.290
This slide outlines a defense in depth strategy for preventing data leakage.

06:46.730 --> 06:49.530
No single mitigation is sufficient on its own.

06:49.890 --> 06:53.050
Effective protection comes from layering controls.

06:53.570 --> 07:00.330
Strict input and output filtering ensures sensitive data patterns are detected before and after model

07:00.330 --> 07:01.170
interaction.

07:01.890 --> 07:08.530
Output redaction adds an additional safety net by removing sensitive entities before responses are delivered.

07:09.370 --> 07:12.050
User and session isolation is critical.

07:12.530 --> 07:18.650
Context must never be shared across users, and permissions must be validated on every access.

07:19.410 --> 07:26.410
Minimal prompt logging reduces exposure by storing only what is necessary with encryption and automatic

07:26.410 --> 07:27.650
deletion policies.

07:28.270 --> 07:30.550
The golden rule at the bottom is essential.

07:30.910 --> 07:33.350
Never trust model outputs blindly.

07:33.830 --> 07:41.030
Validation, sanitization and verification must happen outside the model, especially in security sensitive

07:41.030 --> 07:41.830
workflows.

07:42.230 --> 07:44.110
This slide addresses a reality.

07:44.150 --> 07:49.630
Many teams underestimate even well-intentioned LLM systems can be weaponized.

07:50.150 --> 07:56.670
Attackers may use your system to generate malware, phishing content, or large scale disinformation.

07:57.190 --> 08:02.670
Social engineering attacks can manipulate customer service bots or impersonate users.

08:03.190 --> 08:09.590
Policy circumvention allows attackers to bypass content filters through creative prompt engineering.

08:10.150 --> 08:14.630
Automated disinformation enables misuse at unprecedented scale.

08:15.070 --> 08:18.670
The defense techniques listed provide a practical response.

08:19.270 --> 08:24.110
Clear usage policies and technical guardrails define acceptable behavior.

08:24.470 --> 08:27.910
Content moderation filters both inputs and outputs.

08:28.110 --> 08:30.390
Rate limiting prevents automated abuse.

08:30.790 --> 08:34.200
Continuous monitoring enables detection and escalation.

08:34.640 --> 08:36.840
The best practice at the bottom is critical.

08:37.080 --> 08:40.120
Designed for misuse, not just intended use.

08:40.440 --> 08:43.960
Security planning must assume adversarial creativity.

08:44.320 --> 08:48.920
The final slide summarizes the core lessons of this section.

08:49.240 --> 08:56.040
First, llms expand the attack surface beyond anything traditional security models were designed for.

08:56.480 --> 09:02.080
Second, prompt injection must be treated with the same seriousness as SQL injection.

09:02.120 --> 09:04.960
It cannot be ignored or fully eliminated.

09:05.360 --> 09:10.960
Third, data leakage is subtle but dangerous, requiring multiple layers of defense.

09:11.280 --> 09:14.960
Finally, preventing misuse is not a one time task.

09:15.160 --> 09:19.920
It demands continuous monitoring, policy enforcement, and adaptation.

09:20.360 --> 09:25.080
The final insight captures the mindset required to build secure systems.

09:25.440 --> 09:27.760
Assume hostile inputs by default.

09:28.040 --> 09:30.280
Security is not a feature you add.

09:30.440 --> 09:36.240
It is a foundational requirement embedded into architecture workflows and monitoring.