Killing ads with the LAN-level Privoxy web proxy

Interceptor

Article from Issue 232/2020

Add-on ad blockers can help mitigate the degradation of your browsing experience, but sometimes you need to bring stronger weapons. A filtering web proxy can scrub web traffic to eliminate unwanted ads and scripts.

Most people see no ethical issue with websites displaying advertisements. Unfortunately, for a number of technical reasons, advertisements are often undesirable, to the point that removing them is sometimes necessary for a satisfactory surfing experience. (See the box entitled "Why Remove the Advertisements?")

Why Remove the Advertisements?

If advertisements are necessary for the sustainability of free websites, why remove them?

The first reason is that many sites are overloaded with advertisements to the point that they are barely usable. Advertisements mean longer page load times and higher bandwidth consumption. Some advertisements on websites are so invasive that users spend more time closing ads and pop-ups than reading what they came to read in the first place. The user is often confronted with the choice between a bad user experience and installing an ad blocker.

On top of that, some people take issue with the fact that they pay an ISP for a data plan, only for somebody else to fill a large share of that bandwidth with advertisements. Users are literally paying for the data channels that serve advertisements to them.

One of the biggest worries regarding advertising is the surveillance capability it gives advertisers. Advertising agencies commonly track and record which sites you visit, for how long, and other details that help them decide which advertisements to show you. Although trying to find out what you like in order to offer it to you is not extremely evil behavior, it can lead to the advertiser having too much information about you. For example, by analyzing the websites you visit, advertisers can infer details about your political affiliations or health issues.

The biggest concern is, however, security. Advertisements include pieces of code that are forced into your web browser and executed without your permission. Some advertising systems have been compromised by malicious actors and corrupted into serving malware instead of regular advertising. Since advertising networks serve a very large number of ads to an enormous number of users, a single compromised advertiser can distribute an astounding amount of malware. An ad blocker protects you from the dangers associated with downloading and executing this malicious code.

Many users opt to use an ad blocker such as uBlock Origin [1] or Adblock Plus [2] in their web browsers. These blockers, which are installed as browser add-ons, are easy to set up and offer high-quality advertisement filtering. However, they only work in the browser they are installed in, they aren't intended to work with smartphones and tablets, and they require significant duplication of labor if you need to support multiple computers on a local area network (LAN).

An earlier article in this issue described one possible network-based ad-blocking solution: the Pi-hole DNS tool. This article shows how to block ads using the Privoxy proxy server.

A proxy server is a tool that makes connections on behalf of other programs. Under regular operation, you would set up a proxy server in your LAN and configure every web browser in your network to use it. When a proxy-aware browser attempts to access a website, it connects to the proxy and asks the proxy to access the site. The proxy then establishes the connection, downloads the web data, and hands it to the browser that made the request. (Figure 1 shows a simple proxy setup.)

Figure 1: 1) The workstation with IP address 192.168.1.100 connects to the server running a web proxy at 192.168.1.200. 2) The proxy attempts to fetch a website in the name of the workstation. 3) The gateway, a router with internal IP 192.168.1.1, forwards the connection out of the LAN and into the Internet.

Proxies are useful for content filtering, such as detecting advertisements and sending an ad-free version of the site to the client browser (Figure 2). The most common technique is for the proxy to return a 404 HTTP error (Not Found) to the client when the browser attempts to download an advertisement. This approach is beneficial, because it prevents the advertisement from being downloaded and wasting your bandwidth.

Figure 2: The website (left) comes up faster and looks better when processed by an ad blocker (right).

The drawback of regular proxy operation is that you still have to configure every device in your network to use the proxy. A proxy lets you maintain all your advertisement filter configuration in one place, which makes administration easier, but it is not a silver bullet that solves all your problems. However, if your router has decent packet filtering capabilities, you can pull quite a neat trick: Use the proxy in interception mode.

Interception mode means the router detects web traffic that is trying to bypass the proxy and hands it to the proxy anyway. When your aunt accesses page-full-of-ads.eu with a computer that has no ad blocker and is not configured to use a proxy, the router takes the connection that would be sent to that website and instead sends it to the proxy. The proxy then knows that your aunt wants to visit that site, fetches it, and serves your aunt a version of the site that is advertisement free. Your aunt's computer never learns that there is a proxy anywhere in the LAN. It is effectively a benevolent man-in-the-middle (MITM) attack. Figure 3 displays how interception mode works.

Figure 3: 1) Interception mode: The workstation with IP address 192.168.1.100 attempts to connect to a website located somewhere in the Internet, using no proxy. 2) The router that acts as a gateway for the LAN intercepts the connection and redirects it to the proxy server at 192.168.1.200. 3) The proxy attempts to retrieve the website from the Internet in the name of the workstation. 4) The router forwards the connection to the Internet. Note that the workstation never connects to the external website directly even though it isn't configured to use the proxy.

Installation and Configuration

Privoxy [3], which is a derivative of the discontinued Internet Junkbuster, has the ability to block unwanted traffic and also modify traffic that passes through in order to make it safe (for example, disable canvas fingerprinting and other tracking methods in JavaScript code).

In order to get started with Privoxy, you need to install it first. Installation in the Devuan distribution is as simple as:

$ su -
[password]
# apt-get update
# apt-get install privoxy

By default, Devuan launches services that have just been installed. Since Privoxy has not been configured yet, disabling it is advisable:

# /etc/init.d/privoxy stop

The configuration of the proxy program is stored in /etc/privoxy. Figure 4 shows a list of the configuration files. The user is strongly encouraged to modify only config, user.filter, and user.action and leave the rest of the files alone. The files default.action and default.filter include the default filters and blocklists and are updated every time Privoxy itself is updated. Any modifications made to these files will be lost when Privoxy is bumped to a new version.

Figure 4: Contents of /etc/privoxy.

The config file is long and contains the main configuration for the proxy. Fortunately, defaults are good enough for most users. In order to get started, the administrator just has to locate a line that reads

listen-address  127.0.0.1:8118

and replace it with:

listen-address  0.0.0.0:8118

The proxy service will now listen on every network interface of the server at port 8118. Some administrators might prefer to use port 8080, which is conventionally used for HTTP proxies. It is very important that this port is not accessible from the Internet. Most home routers act as firewalls that prevent Internet access to this port, so this is usually not an issue. The config file is very well documented, and you can make additional changes if needed. For a quick summary of the configuration, go to Privoxy's information panel (Figure 5).
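The same edit can be scripted. The following is a minimal sketch, assuming the config file still contains the default listen-address line exactly as shown above. The helper name set_listen_address is hypothetical, and the function edits whatever path you pass to it, so you can try it on a copy first.

```shell
#!/bin/sh
# Hypothetical helper: rewrite the default listen-address line so that
# Privoxy listens on all interfaces. Argument: path of the config file.
set_listen_address() {
    config_file=$1
    tmp_file="${config_file}.tmp"
    # Replace only the default loopback entry; every other line is kept.
    # Writing back with cat preserves the owner and mode of the file.
    sed 's/^listen-address[[:space:]]\{1,\}127\.0\.0\.1:8118[[:space:]]*$/listen-address  0.0.0.0:8118/' \
        "$config_file" > "$tmp_file" \
        && cat "$tmp_file" > "$config_file" \
        && rm -f "$tmp_file"
}

# Example (as root, after making a backup):
#   cp /etc/privoxy/config /etc/privoxy/config.bak
#   set_listen_address /etc/privoxy/config
```

If the line in your config has been changed already, the function leaves the file as it is.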

Figure 5: If Privoxy is working, and your browser is configured to use it in direct mode, visiting the special address http://p.p will take you to Privoxy's information panel.

The default blacklist used by Privoxy is functional but not great. The user may add custom items to the blacklist by appending them to user.action. The file is well commented and contains examples, and many macros are included. The most basic usage is to append sites and advertisement providers that you don't want to see but aren't included in the default filters. Listing 1 shows some example rules for blocking content. Advertisements that are very graphic in nature can be handled as images. Advertisements that are handled as images are replaced by a harmless, placeholder pattern.

Listing 1

Custom Rules
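As a minimal sketch of rules of this kind (illustrative only, not the author's own rule set), the script below writes two example sections to a scratch file that you can review and then append to /etc/privoxy/user.action. The host names are placeholders, and the scratch file name is an assumption.

```shell
#!/bin/sh
# Write example blocking rules to a scratch file for review.
# All host names below are placeholders, not real advertisement providers.
rules_file=${RULES_FILE:-./user.action.example}

cat > "$rules_file" <<'EOF'
# Block these hosts (and their subdomains) outright.
{ +block{Custom ad server} }
.ads.example.com
.tracker.example.net

# Treat blocked requests to this host as images, so the browser gets
# the built-in placeholder pattern instead of an error page.
{ +block{Graphic advertisement} +handle-as-image +set-image-blocker{pattern} }
.banners.example.org
EOF

echo "Wrote example rules to $rules_file"
```

A leading dot in a host pattern makes the rule match the domain and all of its subdomains.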

 

The user.filter file is used to define custom filters. Filters are special instructions that replace content in the web pages that the proxy serves to the user. The filtering feature is quite advanced but also very useful, as it allows you to rewrite dangerous JavaScript code before it is delivered to the user. Filters are written using Perl replacement syntax. For example, the filter in Listing 2 deletes any script present in any HTML document that goes through the proxy.

Listing 2

user.filter Example

 

A list with the sites I want to subject to this stern filtering policy must be defined with a custom rule in the user.action file (see the box entitled "Populating a Blacklist").
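The two pieces fit together roughly as follows. This is a sketch under stated assumptions: the filter name strip-scripts, the host name, and the scratch file names are all hypothetical. The first file holds a filter definition suitable for user.filter, and the second holds the matching user.action section that applies the filter to selected sites.

```shell
#!/bin/sh
# Write an example filter and the action that enables it to scratch files.
# The filter name "strip-scripts" and the host name are hypothetical.
filter_file=${FILTER_FILE:-./user.filter.example}
action_file=${ACTION_FILE:-./user.action.filter-example}

# Filter definition: a name, a description, and one substitution job.
# Flags: s = dot matches newlines, i = ignore case, g = replace all
# matches, U = ungreedy matching (stop at the first closing tag).
cat > "$filter_file" <<'EOF'
FILTER: strip-scripts Remove every script element from HTML documents
s|<script.*</script>||sigU
EOF

# Action section: apply the filter only to the sites listed below it.
cat > "$action_file" <<'EOF'
{ +filter{strip-scripts} }
.untrusted.example.com
EOF

echo "Wrote $filter_file and $action_file"
```

For the filter to be available, config needs a filterfile line that loads user.filter, so it is worth checking that the line is present and not commented out.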

Populating a Blacklist

Many blacklists are available for blocking unwanted sites and content. In addition to stopping advertisers, some blacklists also prevent malware and phishing sites. These blacklists are not published in a Privoxy-friendly format but may still be used with some creativity.

In a previous article [4], I made use of the StevenBlack blacklist [5]. You can incorporate the StevenBlack list into Privoxy by issuing the commands shown in Listing 3. Listing 4 describes how to incorporate the alternative EasyList [6] blacklist.

Listing 3

Incorporating the StevenBlack Blacklist
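One way to make a hosts-format list usable is to turn each entry into a line of a Privoxy action file. The sketch below is not necessarily the procedure the author used. It assumes the list has already been downloaded and that it uses "0.0.0.0 hostname" entries; the helper name hosts_to_action and the output file name in the example are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: convert a hosts-format blocklist read from standard
# input into a Privoxy action file written to standard output.
hosts_to_action() {
    # One action header, followed by one host name per line.
    echo '{ +block{Listed in hosts-format blocklist} }'
    awk '
        # Strip a trailing carriage return in case of DOS line endings.
        { sub(/\r$/, "") }
        # Keep "0.0.0.0 <host>" entries; skip comments, loopback entries,
        # and the entry that maps 0.0.0.0 to itself.
        $1 == "0.0.0.0" && $2 != "" && $2 != "0.0.0.0" { print $2 }
    '
}

# Example (as root):
#   hosts_to_action < hosts > /etc/privoxy/blocklist.action
```

To have Privoxy load the result, add a line such as "actionsfile blocklist.action" to /etc/privoxy/config. Lists in Adblock syntax, such as EasyList, need a different conversion.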

 

Listing 4

Incorporating the EasyList Blacklist

 

Once the configuration is done, you can start the service. The following command will make your proxy available in the LAN:

# /etc/init.d/privoxy start

Client Configuration

The easiest way to get a web browser to use a proxy server that is running in your LAN is through the browser's preference options. Most operating systems allow you to configure a preferred proxy that programs will be forced to use [7][8]. On Devuan, you can configure a workstation to use the proxy by issuing the commands shown in Listing 5.

Listing 5

Configuring a Proxy
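A common way to point programs at a proxy is through environment variables. The sketch below is an assumption about what such a setup can look like, not necessarily the exact commands the author used. It applies to the current shell session only and uses the example address and port from this article.

```shell
#!/bin/sh
# Example values from this article; change them to match your network.
proxy_host=192.168.1.200
proxy_port=8118

# Many command-line and desktop programs honor these variables.
http_proxy="http://${proxy_host}:${proxy_port}"
https_proxy="$http_proxy"
# Hosts that should be reached directly, without the proxy.
no_proxy="localhost,127.0.0.1"
export http_proxy https_proxy no_proxy

echo "Proxy set to $http_proxy"
```

To make the setting permanent for every login shell, the same assignments could be placed in a profile script, for example a file under /etc/profile.d.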

 

These commands will make most programs that are installed in the workstation and are capable of using a web proxy take advantage of the Privoxy instance. These commands assume that the proxy server is located at address 192.168.1.200 and that Privoxy listens on port 8118. Change these settings to suit your needs.

Enable Interception Mode

You can enable interception mode by searching for the accept-intercepted-requests parameter in /etc/privoxy/config and changing its value from 0 to 1. This step allows the proxy to process HTTP requests that are intercepted with the help of the router in the network.
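If you prefer to script the change, the following minimal sketch flips the parameter in place. It assumes the config file contains an uncommented "accept-intercepted-requests 0" line; the helper name enable_interception is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: switch accept-intercepted-requests from 0 to 1.
# Argument: path of the Privoxy config file.
enable_interception() {
    config_file=$1
    tmp_file="${config_file}.tmp"
    # Only an uncommented line whose value is 0 is rewritten.
    sed 's/^accept-intercepted-requests[[:space:]]\{1,\}0[[:space:]]*$/accept-intercepted-requests 1/' \
        "$config_file" > "$tmp_file" \
        && cat "$tmp_file" > "$config_file" \
        && rm -f "$tmp_file"
}

# Example (as root), followed by a service restart:
#   enable_interception /etc/privoxy/config
#   /etc/init.d/privoxy restart
```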

Admittedly, most consumer-grade routers lack the ability to perform MITM attacks against devices located in the LAN at the command of the system administrator. There are many ways to intercept a connection headed to one place and hijack it into moving somewhere else (such as your intercepting proxy). I am covering the most natural approach here, which consists of using the router to perform Network Address Translation on the connection. In the following example, it is assumed that your router has a netfilter stack with iptables, which is the case for most common Linux distributions. If your router is performing classical masquerading (as most home routers are configured to do), then adding a single DNAT rule will suffice. Listing 6 shows a simple example configuration. The DNAT rule will instruct the router to catch any web request that does not come from the proxy server and force it to go through Privoxy. You may need to change the values of the network interfaces and the IP address.

Listing 6

Example Firewall Rules for an Intercepting Router
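A DNAT rule of the kind described above could look like the output of the sketch below. It is an illustration under assumptions rather than a complete firewall: eth1 is assumed to be the router's LAN-facing interface, the proxy address and port are the example values from this article, and the helper name print_intercept_rule is hypothetical. The script only prints the command so that you can review it before running it as root on the router.

```shell
#!/bin/sh
# Hypothetical helper: print (not run) a DNAT rule that redirects LAN web
# traffic to the proxy. Arguments: LAN interface, proxy IP, proxy port.
print_intercept_rule() {
    lan_if=$1
    proxy_ip=$2
    proxy_port=$3
    # Match TCP port 80 arriving on the LAN interface from any host except
    # the proxy itself, and rewrite the destination to the proxy.
    echo "iptables -t nat -A PREROUTING -i $lan_if ! -s $proxy_ip" \
         "-p tcp --dport 80 -j DNAT --to-destination $proxy_ip:$proxy_port"
}

# Print the rule for the example network used in this article.
print_intercept_rule eth1 192.168.1.200 8118
```

The exclusion of the proxy's own address matters: without it, the proxy's outgoing requests would be redirected back to the proxy. Because the proxy sits on the same LAN as the clients, the router must also rewrite the source address of the redirected connections; if masquerading on your router only covers traffic leaving through the WAN interface, an additional SNAT or MASQUERADE rule for traffic headed to the proxy may be needed.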

 
