Using Squid to filter Internet access
Content Control
Kurt describes how to use Squid's ACLs and ICAP when you want to limit Internet access, for whatever reason.
The last time I talked about Squid [1], I looked at using Squid [2] to intercept HTTPS sessions, which has become both more necessary (Wikipedia, for example, redirects all logged-in users to HTTPS) and more difficult (because sites like Google don't want HTTPS connections to be intercepted).
I did not, however, actually talk about the filtering aspects of Squid [3], and this has suddenly become a very important topic to me. My kids are getting older and will soon be using the Internet to surf the web, play games, Skype with relatives, and so forth. That means I want to give them access to cool sites, but I also really don't want to give them unfettered access to the Internet. Even with supervision, I don't trust all the third-party ads and junk served off many popular websites.
So, I'm going to assume you have Squid set up and running, either as a transparent intercepting proxy or for devices configured to use it. Squid provides two main mechanisms for controlling content: The first is internal ACLs you can configure, and the second is ICAP.
Identifying Users and Devices
If multiple users and devices that require different levels of access share a single network or proxy server, you'll need some way to identify users and/or devices. Segregating access based on user authentication is probably best, because you may want to share devices with your kids or other people.
Squid supports authentication and a wide variety of authentication back ends (database, password file, LDAP, PAM, SASL, SMB, etc.). If you don't have an existing authentication service, then using something like PAM or SASL is generally the simplest. Alternatively, you can identify devices by their IP addresses. This approach, however, is not infallible, because a device's MAC address and IP address are easy to clone unless something like 802.1X is in use to control network access.
One advantage of controlling access based on the MAC or IP address of a client is that, if you do share devices, you don't have to worry about content getting cached on them because the address will be blocked by the proxy. If you do enforce access based on MAC, you can use Squid's built-in capabilities:
acl macaddress arp 01:23:45:67:89:AB
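An acl line on its own only defines a name; nothing is blocked until an http_access rule refers to it. The following is a minimal sketch of how device-based rules might look, assuming a hypothetical tablet with the example MAC address above and a hypothetical laptop at 192.168.1.50 (the ACL names and the IP address are placeholders, not values from a real setup):

```
# Hypothetical device ACLs; the names and the IP address are placeholders
acl kids_tablet arp 01:23:45:67:89:AB
acl kids_laptop src 192.168.1.50

# http_access rules are checked in order and the first match wins,
# so these deny lines must appear before any broader allow rule
http_access deny kids_tablet
http_access deny kids_laptop
```

Keep in mind that matching on MAC addresses can only work for clients on the same subnet as the proxy, because MAC addresses are not carried across routers, and that the Squid build must include support for this ACL type.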
Squid Authentication
Another option is to require user authentication. Note that, if you have devices that can take proxy configurations and store them, and you save the username and password, anyone who can access that device can access the web proxy using those credentials. Setting up a captive portal page can help. If you've used public WiFi, you've probably seen pages like this; the benefit of the captive portal is that you can provide a more helpful error if the user fails to log in (e.g., who to contact for credentials). Setup is simple: You configure an authentication helper in squid.conf, which an access control list can then use to require authentication:
auth_param basic program /usr/lib64/squid/basic_ncsa_auth /etc/squid/passwd
Then, you create accounts in /etc/squid/passwd using htpasswd. Alternatively, you can use the local system accounts via PAM:
auth_param basic program /usr/lib64/squid/basic_pam_auth
The benefit here, of course, is that this approach uses the system accounts, which are often easier to administer than a custom password file.
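The auth_param line only tells Squid how to check credentials; an ACL of type proxy_auth plus an http_access rule is what actually demands a login. As a rough sketch, assuming the password-file helper shown earlier (the realm text, the TTL, and the ACL name are illustrative choices, not requirements):

```
# Helper and password file as in the text above
auth_param basic program /usr/lib64/squid/basic_ncsa_auth /etc/squid/passwd
# Text shown in the browser's login prompt (placeholder)
auth_param basic realm Home proxy
# How long Squid trusts a checked username/password pair (illustrative value)
auth_param basic credentialsttl 2 hours

# Matches any user who presents valid credentials
acl authenticated proxy_auth REQUIRED

http_access allow authenticated
http_access deny all
```

One caveat: proxy authentication generally works only for clients that are explicitly configured to use the proxy. Traffic that is transparently intercepted cannot be challenged for proxy credentials, so an intercepting setup has to fall back on IP or MAC rules.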
Squid ACLs
Saying that Squid has really rich ACLs is a severe understatement. You can block requests based on the URL, the size of the request or reply, IP addresses, and time of day. You can require user authentication and assign different ACLs for different users. So, if you have little kids, you can have a restrictive default policy for non-logged-in users and allow more access for authenticated users (i.e., adults). In general, I recommend against requiring four-year-olds to log in to access their favorite game sites, so you need to save the credentials on the device or make the default access restricted.
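To give a feel for these ACL types, here is a small sketch that combines a few of them. All names, domains, addresses, times, and sizes are made-up placeholders chosen for illustration:

```
# Destination domain (the leading dot also matches subdomains)
acl game_sites dstdomain .example-games.com
# Case-insensitive regular expressions matched against the full URL
acl ad_urls url_regex -i /ads/ banner
# Monday to Friday (M T W H F), 16:00 to 19:00
acl after_school time MTWHF 16:00-19:00
# Clients on the home network
acl home_net src 192.168.1.0/24

# Block anything that looks like an ad URL
http_access deny ad_urls
# ACLs on one http_access line are ANDed together:
# home clients may reach the game sites, but only after school
http_access allow home_net game_sites after_school

# Cap the size of replies delivered to home clients
reply_body_max_size 50 MB home_net
```

The rules are evaluated from top to bottom, so the ad filter applies even to requests that the later allow rule would otherwise let through.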
The biggest decision you need to make is whether to default block or default allow. In other words, do you whitelist (i.e., only allow certain things and block everything else), or do you blacklist (i.e., block certain things and allow the rest by default)? The good news is that you can have your cake and eat it, too; for example, you can use a default (non-authenticated) access list that applies to public devices and allows specific things (e.g., Wikipedia) with a deny rule to block everything else. Then, you can have a second access list that requires authentication and blocks certain things and then has a default allow rule.
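One way such a combined policy might be laid out is sketched below. It assumes an authentication helper has already been configured with auth_param; the ACL names and the blocked domain are hypothetical:

```
# Whitelist for everyone, logged in or not
acl kid_sites dstdomain .wikipedia.org
# Blacklist that applies even to authenticated users (placeholder domain)
acl blocked_sites dstdomain .ads.example.com
# Any user with valid credentials
acl adults proxy_auth REQUIRED

# 1. Whitelisted sites need no login
http_access allow kid_sites
# 2. Blacklisted sites are refused for everybody
http_access deny blocked_sites
# 3. Anything else triggers a login prompt; valid users are allowed
http_access allow adults
# 4. Everyone else is blocked
http_access deny all
```

The order matters here: because the whitelist rule comes before the first proxy_auth ACL, requests to whitelisted sites never trigger a password prompt, which keeps the shared devices usable for small children.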