WEBVTT

00:00.860 --> 00:03.590
Welcome back to the Knowledge Portal Video series.

00:03.830 --> 00:12.290
So today we are going to talk about a very interesting topic called image hotlinking and the HTTP Referer

00:12.290 --> 00:12.860
header.

00:13.670 --> 00:21.590
So before we go and understand how to implement image hotlink protection, let's see why it is required.

00:23.780 --> 00:26.590
So let's say you have a website.

00:26.600 --> 00:31.520
So let's say this is a.com and this is a website.

00:31.760 --> 00:38.420
Now one day you post a blog on your website and suddenly the website or the blog has become very, very

00:38.420 --> 00:41.900
famous and lots of users are visiting it.

00:42.980 --> 00:49.760
So on the Internet there is a tendency that if there is a very famous blog, a lot of other websites will

00:49.760 --> 00:54.590
copy and paste the same contents of that blog onto their own server.

00:55.820 --> 00:59.000
So for our example, we have three sites.

00:59.130 --> 01:02.610
This is site one, this is site two and this is site three.

01:03.270 --> 01:10.920
Now all of these sites have copied and pasted the contents from your blog onto their own websites.

01:12.710 --> 01:17.540
Now, generally, if you talk about blogs, there are two major components of it.

01:18.320 --> 01:23.000
One is the textual information and the other is the image files.

01:28.010 --> 01:36.320
So let us understand what happens once a user visits a website or the site which has copied your

01:36.320 --> 01:37.070
content.

01:37.310 --> 01:39.200
So this is a user.

01:41.860 --> 01:49.120
Now the user was searching for something on Google, and since so many websites have copied your content,

01:49.120 --> 01:55.570
some of those websites actually came up in the Google search results, so the user clicks on this link.

01:57.900 --> 02:03.540
And the site will give the response back to the user, which is your blog.

02:04.110 --> 02:09.450
So the blog has textual information as well as an image file.

02:11.490 --> 02:18.210
So generally what happens is most of the websites, they actually store the textual information on the

02:18.210 --> 02:19.380
server itself.

02:19.410 --> 02:26.220
However, as far as the image file is concerned, they still reference it back to the originating server

02:26.220 --> 02:28.710
from which they had copied the contents.

02:32.880 --> 02:41.220
So whenever a user opens this particular website, it can retrieve the text from the server

02:41.370 --> 02:42.270
itself.

02:42.300 --> 02:47.730
However, the image on the back end is retrieved from the original website.

02:50.480 --> 02:58.820
So during this time there is a header called the HTTP Referer header which comes into the picture.

02:58.820 --> 02:59.600
So sorry.

03:01.070 --> 03:03.620
So let's take an example for that.

03:06.060 --> 03:08.700
So these are two websites.

03:09.630 --> 03:15.090
I'll say a.com and b.com.

03:18.860 --> 03:22.940
b.com basically has copied the contents from a.com.

03:22.940 --> 03:26.420
So for the image file, it is still referencing a.com.

03:26.420 --> 03:31.400
So there is an image file over here and the user visits

03:31.670 --> 03:40.850
b.com and gets the response back with the text and image.

03:42.290 --> 03:47.930
Now the text portion is basically from the server itself.

03:47.930 --> 03:50.470
So this is where the text information is there.

03:50.480 --> 03:56.000
However, the image contents are basically retrieved from the originating server.

03:58.380 --> 04:06.480
Now, whenever the browser loading b.com retrieves an image from the originating server, the request carries a header called

04:06.480 --> 04:07.440
Referer.

04:09.030 --> 04:17.130
And in this header there will be the name of the website from which the request is coming.

04:17.130 --> 04:18.870
So it will be b.com.

04:21.090 --> 04:32.010
So basically this HTTP header will help a.com know that the request is actually coming from b.com.

04:32.700 --> 04:37.260
So let me give you an example of how exactly this works.

04:42.580 --> 04:49.240
So I'll open the log files and let's see what

04:50.980 --> 04:53.800
it means. So I'll open server a.com.

04:58.140 --> 05:00.090
And this is a simple website.

05:00.720 --> 05:06.810
So here we have the text information and there is an image.

05:07.770 --> 05:16.320
So if we go to view source, we can see that the text information is stored on the server itself.

05:16.350 --> 05:25.710
However, for the image, it is referencing another website, which is server b.com/seeds.jpeg.

05:26.190 --> 05:34.800
So this image is not stored on server A; it is retrieved from another server.

05:35.820 --> 05:39.330
So if we look into the log file.

05:44.000 --> 05:48.950
Let me close this and let's refresh this.

05:50.980 --> 05:51.430
Okay.

05:53.110 --> 05:54.640
Let's go to the log file.

05:54.640 --> 05:59.950
And here you see GET /seeds.jpeg.

05:59.950 --> 06:03.610
And this is the Referer field, which is server a.com.

06:03.880 --> 06:14.290
So now the nginx on server B knows that server a.com is asking for, or is fetching, the seeds.jpeg file.

06:15.370 --> 06:20.920
So there are two important reasons why the Referer field is important.

06:21.010 --> 06:31.150
One is that a system administrator can know where the request is coming from, so you can also

06:31.150 --> 06:36.010
determine where the clients are coming from.

06:36.010 --> 06:43.090
So let's say if you post your website link on YouTube and if you want to see how many people are coming

06:43.120 --> 06:49.660
to your website from YouTube.com, you can search based on the header field and you can determine the

06:49.660 --> 06:52.790
exact number of clients that are coming.

06:55.190 --> 07:02.720
Now, a lot of the time it happens that people copy and paste the contents of your website onto their own

07:02.720 --> 07:14.480
blogs and you do not want that, because let's say if server a.com gets 1000 requests, server b.com,

07:14.480 --> 07:21.590
which is hosting this particular image, will also get 1000 requests because the image has to be loaded.

07:21.680 --> 07:27.770
So basically this is something like stealing of bandwidth as well as stealing of resources.

07:27.770 --> 07:33.860
So many times a lot of bloggers or system administrators do not allow this referencing.

07:33.980 --> 07:36.380
So let's see how we can do that.

07:37.280 --> 07:40.160
This referencing is also called image hotlinking.

07:40.430 --> 07:43.130
So I'll show you how it's done.

07:43.880 --> 07:45.680
Then it will be much clearer.

07:45.680 --> 07:52.820
So I had commented out the image hotlinking protection; let me uncomment it.

07:58.580 --> 07:59.600
Let's save.

08:01.310 --> 08:03.110
Let's reload nginx.

08:05.170 --> 08:06.580
And let's open the tail again.

08:09.140 --> 08:10.040
Now.

08:11.910 --> 08:15.990
What basically has happened is I have enabled image hotlink protection.

08:15.990 --> 08:28.110
So that means that the server will not respond to any request which comes from server a.com as far as

08:28.110 --> 08:29.700
image files are concerned.

08:29.730 --> 08:34.470
So basically if I refresh now, the image would not come.

08:34.950 --> 08:36.570
So let me refresh it now.

08:39.550 --> 08:41.890
And you see the image file is blocked.

08:43.180 --> 08:54.850
So if I go to the log files, you see server b.com got a request, which was GET /seeds.jpeg, from

08:54.850 --> 08:58.090
server A and it gave a 403.

08:58.450 --> 09:01.240
So let's see why exactly it is doing so.

09:02.770 --> 09:08.050
So opening the configuration file again.

09:08.800 --> 09:13.450
So these are the directives which are actually blocking it.

09:13.630 --> 09:17.570
So the location is JPEG, PNG or GIF.

09:17.590 --> 09:22.200
So for any image file, there is valid_referers.

09:22.210 --> 09:23.980
So this is a directive.

09:24.400 --> 09:34.840
If the Referer matches the valid_referers directive, which lists none, blocked, server b.com, or any subdomain within server b.com, then allow;

09:35.500 --> 09:37.820
otherwise return 403.

09:39.380 --> 09:41.000
So this means

09:43.850 --> 09:46.430
let me open the log file again.

09:47.930 --> 09:53.490
So this means, and this is the valid_referers directive,

09:53.510 --> 10:04.190
if the Referer matches the valid_referers list, that is none, blocked, or server b.com, then you can allow this particular

10:04.190 --> 10:04.940
request.

10:05.060 --> 10:08.750
Otherwise give a 403 back.
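
NOTE
A minimal sketch of the kind of location block described above, not the exact
configuration shown on screen. The domain example.com is a placeholder
assumption standing in for the server that hosts the images.
```nginx
# Goes inside the server block of the image host.
# Match common image extensions, case-insensitively.
location ~* \.(jpe?g|png|gif)$ {
    # "none" allows requests that carry no Referer header at all.
    # "blocked" allows a Referer whose value was stripped by a proxy or firewall.
    valid_referers none blocked example.com *.example.com;
    # nginx sets $invalid_referer to "1" when the Referer matches none of the entries.
    if ($invalid_referer) {
        return 403;
    }
}
```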

10:10.710 --> 10:19.950
So this is one of the ways in which you can actually protect your resources, or your image

10:19.950 --> 10:23.580
files, from being hotlinked from your server by other servers.

10:24.390 --> 10:31.890
Now, also, remember, if you have this kind of a configuration, then Google or Bing or any search

10:31.890 --> 10:37.140
engine site which tries to retrieve the images will not be able to retrieve them.

10:37.170 --> 10:45.180
So one ideal entry that you can add, depending upon the requirements that you have:

10:45.210 --> 10:55.590
you should also add maybe google.com or maybe bing.com, which should be allowed to open the images or

10:55.590 --> 10:57.570
give a reference to your images.
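
NOTE
One possible way to allow search engines, sketched under the same assumption
that example.com is a placeholder for the image host. valid_referers accepts
regular expressions prefixed with "~"; the two patterns here are illustrative
and should be adapted to the sites you actually want to permit.
```nginx
# Goes inside the image location block, replacing the earlier valid_referers line.
# Entries starting with "~" are regular expressions matched against the
# Referer text that follows "http://" or "https://".
valid_referers none blocked example.com *.example.com ~\.google\. ~\.bing\.;
```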

10:59.040 --> 11:05.880
So this is it about the HTTP Referer header and image hotlinking.

11:07.050 --> 11:14.380
And since this video is a bit short, let me just explain one more thing to you.

11:14.590 --> 11:20.530
So here you see it is actually showing the referrer information.

11:20.530 --> 11:27.760
So if I go to the log configuration file of nginx,

11:30.550 --> 11:31.900
in the log format,

11:31.900 --> 11:36.360
you see there is an $http_referer variable.

11:36.370 --> 11:42.670
So this is one of the reasons why you can see the HTTP Referer in the log file.

11:43.990 --> 11:51.010
So if you remove this, the log file will basically not contain the HTTP Referer information.
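
NOTE
A sketch of a log format that records the Referer, assuming a layout that
mirrors nginx's predefined "combined" format. The format name "with_referer"
and the log path are placeholder assumptions, not taken from the video.
```nginx
# Goes inside the http block.
# Dropping "$http_referer" from this format removes the Referer from each log line.
log_format with_referer '$remote_addr - $remote_user [$time_local] '
                        '"$request" $status $body_bytes_sent '
                        '"$http_referer" "$http_user_agent"';
access_log /var/log/nginx/access.log with_referer;
```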

11:51.010 --> 11:53.280
So this is it about this video.

11:53.290 --> 11:57.310
I hope this has been informative for you and I'd like to thank you for viewing.
