Internet Filters Are a Fact of Life, but Some Are Worse Than Others
Filtering and Filter Software, by Lori Bowen Ayre (American Library Association Library Technology Reports, 2004).
Have Internet filters gotten any more accurate since the late 1990s, when they frequently denied access to material about breast cancer, pussy willows, Congressman Dick Armey, and much other nonpornographic information? How has the Supreme Court's 2003 decision upholding the "Children's Internet Protection Act" (CIPA),1 which requires filtering in most American libraries, affected this burgeoning industry - not to mention the library system and the reading public?
A guide written by Lori Bowen Ayre and published by the American Library Association answers with a qualified "yes" on the question of accuracy for at least some filtering software. But Ayre's basic message is that libraries that are subject to CIPA must choose carefully. They should use products that block as little as possible beyond the "visual depictions" of obscenity, child pornography, and "harmful to minors" material that CIPA explicitly targets. And they should adopt systems that can be easily disabled, in accordance with the Supreme Court's opinion that CIPA doesn't violate the First Amendment in large part because it authorizes librarians to disable filters on the request of any adult.2
To fill in the background, for those who have not followed the Internet filtering debate: software products designed to block sexual or other controversial content were developed in the 1990s as voluntary "parental empowerment" tools that could help parents shield their children from online pornography. They were widely promoted (by the Clinton/Gore Administration, among others) after the Supreme Court struck down Congress' first Internet censorship law, the "Communications Decency Act" (the CDA), in 1997.
The CDA was an extremely broad law that essentially criminalized the display of any "indecent" content online.3 Anxious about the proliferation of online porn, many saw filters as a welcome remedy after the CDA was ruled unconstitutional. But there was one big problem: filters' mechanistic, keyword-based software blocked not only large quantities of valuable nonpornographic material, but material that had no conceivable relation to sex or other hot-button subjects.4 As Ayre recounts, some filters simply
"disappeared" words from the content of a page, resulting in pages that made no sense or that stated something quite different from the author's intent. One such incident reported by Peacefire resulted in a filter changing a sentence on a website from "the Catholic Church opposes homosexual marriage" to "the Catholic Church opposes marriage."5
Despite such worrisome (and often ludicrous) effects, it was not long before these voluntarily adopted systems became not-so-voluntary. Political pressures from groups such as "Family Friendly Libraries" caused some libraries to install filters, and in late 2000, Congress passed CIPA, which turns voluntary "parental empowerment" into mandatory censorship for the great majority of American libraries.
Unlike Congress' first two Internet censorship laws, the CDA and the "Child Online Protection Act" (or COPA), passed in 1998,6 CIPA does not censor directly, through a criminal prohibition, but indirectly - and more pervasively - by imposing a condition on the receipt of government funds or benefits. CIPA requires all schools and libraries that receive federal aid or e-rate discounts for Internet connections to install a "technology protection measure" on all computers, in order to block images that are obscene, child pornography, or (for users under 17) "harmful to minors" (which means, essentially, sexually explicit content that is prurient and lacks "serious value" for minors).7
But since no filter can distinguish obscenity, child pornography, or "harmful to minors" material from other, legal Internet content, CIPA effectively requires libraries to block a wide range of constitutionally protected expression.8 How wide a range depends, as Ayre explains, on what filtering system the many libraries that are subject to CIPA choose to install.
Ayre's short book is a valuable resource in guiding libraries - and other institutions that must or might choose Internet filtering - through the maze of products now on the market, keeping in mind the traditional library values of information access and intellectual freedom, and the relatively limited scope of material that CIPA actually forces them to block. Ayre emphasizes throughout that there are enormous differences among filtering products. Some, like CyberPatrol, block entire Web sites such as salon.com as "sexually explicit" based on just one page of erotica. Others go even further and block everything sharing the same IP address based on only one instance of content that the manufacturers - or their Web searching algorithms - have classified as sexual or pornographic. Some have religious-right client bases and motivating ideologies that drive their blocking categories and decisions; as Ayre points out, these products are uniquely inappropriate for libraries, which by definition serve people from diverse backgrounds and cultures.9
Ayre describes possible ways of complying with CIPA while minimizing the problem of overblocking and also not spending a fortune on filtering software. (One of the ironies of CIPA is that it imposes these extra costs on precisely those libraries that can least afford them.) For example, open source filtering products are now available that are either free or much cheaper than most commercial products, and that are, as she says, "infinitely customizable" - provided that somebody on the library staff has the technical know-how.
Ayre also discusses the free, low-maintenance "KanGuard" system for Kansas libraries, which targets only Web sites that are considered obscene, child pornography, or "harmful to minors" within the meaning of state and federal law - thus, presumably, avoiding the massive inaccuracies and overblocking of most commercial filters. Librarians developed the KanGuard blocklist, which now contains about 100,000 URLs (updated weekly). But that's an awful lot of pornography to be viewing and evaluating individually, so presumably KanGuard also uses some mechanical form of Web searching.
KanGuard says it does not do keyword filtering,10 but, as Ayre's book makes clear, "keyword" has acquired a very narrow meaning within the industry, limited to blocking based exclusively on lists of forbidden words. Today's more sophisticated systems for trawling the Web's more than a billion sites (many of them changing daily) for material that fits one or more of the filtering companies' largely subjective categories also rely, essentially, on keywords and phrases, whether the companies like to admit it or not.
As Ayre describes these forms of "artificial content recognition," most software now on the market works either through "URL filters" or "content filters." URL filters create a database of disapproved URLs by using a search engine and search term such as "shocking sex acts," then removing from the resulting list any sites with .edu or .gov suffixes, blacklisting the top 500 hits, and "spot checking" for errors.
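The harvest-and-winnow process just described can be sketched in a few lines. This is a simplified illustration under the assumptions stated in Ayre's account (search, remove .edu/.gov sites, blacklist the top 500); the function name and inputs are hypothetical:

```python
from urllib.parse import urlparse

def build_blocklist(search_hits, limit=500):
    """Winnow raw search-engine hits into a URL blocklist.

    search_hits: list of URLs returned for a seed query
    (e.g. "shocking sex acts", per the account above).
    """
    winnowed = []
    for url in search_hits:
        host = urlparse(url).hostname or ""
        # Remove sites with .edu or .gov suffixes before blacklisting
        if host.endswith(".edu") or host.endswith(".gov"):
            continue
        winnowed.append(url)
    # Blacklist only the top hits; "spot checking" would happen
    # manually afterward, which is why errors slip through at scale.
    return winnowed[:limit]
```

Notice that nothing in this pipeline ever examines what a page actually says; the seed query and the top-500 cutoff do all the work, which helps explain the overblocking the lower court documented.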
Ayre explains that "content filters," by contrast, analyze the content of a Web page "on the fly" instead of precategorizing URLs. Although keyword analysis is part of the process, other factors are involved in the software's evaluation of the page, such as background color, links, banner ads, number of images and words, and average number of letters in a word. From this information, Ayre says, an "artificial intelligence algorithm" creates a "processed data vector" which "finds parameters that are useful in classifying the Web page."11
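To make the "processed data vector" idea concrete, here is a toy version of such a content filter. The feature names follow the factors Ayre lists (word count, images, banner ads, average word length), but the scoring rule and thresholds are invented for illustration and do not reflect any vendor's actual algorithm:

```python
def page_features(page_text, num_images, num_banner_ads):
    """Build a crude 'data vector' of page attributes."""
    words = page_text.split()
    avg_word_len = (sum(len(w) for w in words) / len(words)) if words else 0.0
    return {
        "num_words": len(words),
        "num_images": num_images,
        "num_banner_ads": num_banner_ads,
        "avg_word_len": avg_word_len,
    }

def classify_page(features, keyword_hits):
    """Toy stand-in for the vendor's classification algorithm."""
    score = 2 * keyword_hits + features["num_banner_ads"]
    # Image-heavy, text-light pages score higher in this sketch
    if features["num_images"] > 20 and features["num_words"] < 100:
        score += 3
    return "blocked" if score >= 5 else "allowed"
```

Even this caricature shows the underlying problem: the classifier sees only surface statistics and keyword counts, never meaning, so a medical page and a pornographic page with similar surface features are indistinguishable to it.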
If this sounds vague, it's probably because, beneath filter companies' claims to "artificial intelligence," their systems are still, necessarily (given the size of the Internet), mechanical attempts to classify that complex and infinitely variable phenomenon known as human expression. The inevitable result, as the lower court that struck down CIPA (before being reversed by the U.S. Supreme Court) found, is that filtering software's "harvesting" and "winnowing" processes continue, erroneously, to block tens of thousands of nonpornographic sites. Among the many examples given by the lower court was a Knights of Columbus site, misidentified by CyberPatrol as "adult/sexually explicit," and a site on fly fishing, misidentified by Bess as "pornography."12
Acknowledging these problems, Ayre reiterates that CIPA only requires the censoring of "visual depictions," and points to the recent development of software that would allow libraries to block only image files (.jpg, .gif, etc.) within a given category. Although this technology would not eliminate the errors of misidentified sites - Bess, for example, would still block images of fly fishing as "pornography" - at least it would not censor reams of constitutionally protected text.
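The image-only approach amounts to adding a second test before blocking anything. A minimal sketch, assuming the category lookup happens elsewhere (the function name and extension list are illustrative):

```python
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".gif", ".png", ".bmp"}

def should_block(url, category_is_blocked):
    """Block a request only if (a) its site falls in a blocked
    category AND (b) the requested file is an image - so text on
    a misclassified site still gets through."""
    if not category_is_blocked:
        return False
    path = url.split("?", 1)[0].lower()  # ignore query strings
    return any(path.endswith(ext) for ext in IMAGE_EXTENSIONS)
```

Under this scheme a misclassified fly-fishing site would lose its photographs but keep its articles, which is the trade-off Ayre highlights: the classification errors remain, but the constitutional harm shrinks from whole sites to images alone.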
Ayre also emphasizes the importance of the "default" page - what the library user sees when a URL is blocked. Whatever filtering system the library adopts should allow configuration of the default page to educate the user on how the filter works and how to request disabling. Similarly, she stresses ease of disabling as the key both to compliance with the Supreme Court's interpretation of CIPA and to avoiding lawsuits from patrons who are prevented from accessing needed resources by a filter that cannot readily be turned off.
Finally, Ayre stresses that although no filter can make the legal distinctions that CIPA requires (indeed, judges and juries are hard put to do so), libraries need only select the narrowest filtering category in order to comply with the law. Filtering companies usually call this category "sexually explicit," "pornography," or "adult." If libraries choose additional categories, such as Internet chat, e-mail, games, or gambling, it is not because of CIPA but because the library administration has decided that these are not justifiable uses of much-in-demand computer time.
Despite the great value of Ayre's guide, there are some errors that could be cured if and when the ALA publishes a second edition. The footnotes definitely need proofreading. (One footnote, number 33, cannot be found at all.) It was President Clinton, not Bush, who signed CIPA in 2000.
A more substantive criticism is that Ayre seems to assume children should be blocked from access to sex education sites. Given the parlous state of sex education in our public schools, precisely the opposite should be true: curious youngsters should be able to find accurate information at their local libraries.
Thanks to Congress and the Supreme Court, Internet filtering is now a huge industry. Ayre's final pages discuss possibilities for easing some of the more burdensome requirements of CIPA, even though it is unlikely that the law will be repealed any time soon. She urges that libraries and their communities come together to persuade Congress to amend CIPA to allow staff computers and some adult computers to be unfiltered.13
Update: Author Lori Ayre wrote to FEPP in Sept. 2007 to clarify that she does not think minors should be blocked from sex education websites.
1. United States v. American Library Association, 123 S.Ct. 2297 (2003).
2. United States v. American Library Association, 123 S.Ct. 2297 (plurality opinion of Chief Justice Rehnquist). The concurring opinions of Justices Kennedy and Breyer especially relied on CIPA's disabling provisions as a way to save the law from unconstitutionality. Id. at 2309-12. These justices relied on the Solicitor General's representations at oral argument that CIPA requires librarians to disable the filters on request from an adult patron. CIPA's actual terms do not require disabling but simply allow it, at librarians' discretion, for "bona fide research or other lawful purposes." The justices also did not mention that although the e-rate portion of CIPA only allows adults to request unblocking, its sections dealing with funding under the Library Services and Technology Act allow both adults and minors to make the request.
3. For a description of the CDA litigation and the early Internet filters, see Marjorie Heins, Not in Front of the Children: "Indecency," Censorship, and the Innocence of Youth (Hill & Wang, 2001), pp. 156-200.
4. See, e.g., Internet Filters: A Public Policy Report (summarizing over 70 studies of the consequences of overblocking); Filters and Freedom: Free Speech Perspectives on Internet Content Controls (David Sobel, ed.) (Washington DC: EPIC, 1999); and the lower court decision in the CIPA case, American Library Association v. United States, 201 F. Supp.2d 401, 431-48 (E.D. Pa. 2002).
5. Lori Bowen Ayre, Filtering and Filter Software (American Library Association, 2004), p. 8.
6. COPA made it a crime to distribute "harmful to minors" material - essentially, "obscenity lite" - on the World Wide Web. It was struck down by the Supreme Court in June 2004. See The Right Result; the Wrong Reason. For the legal definition of "harmful to minors," and more on the fortunes of COPA, see also "Our Children's Hearts, Minds, and Libidos" - What's at Stake in the COPA Case.
7. The school provisions of CIPA were not challenged in United States v. American Library Association - in part because filters were already a fact of life in most public schools, and in part because as a matter of First Amendment doctrine, schools have much greater authority to censor what students may access and read on their premises than libraries have to limit the reading of their patrons.
9. Ayre, pp. 10, 36, 46.
10. "Kanguard," http://skyways.lib.ks.us/KSL/libtech/kanguard (accessed 9/1/04).
11. Ayre, pp. 15-17.
12. American Library Association v. United States, 201 F. Supp.2d at 431-48.
13. Ayre, p. 63.