The Yale Law Journal

Volume 127 (2017-2018), Forum

Wikipedia and Intermediary Immunity: Supporting Sturdy Crowd Systems for Producing Reliable Information

09 Oct 2017
Jacob Rogers

Abstract. The problem of fake news impacts a massive online ecosystem of individuals and organizations creating, sharing, and disseminating content around the world. One effective approach to addressing false information lies in monitoring such information through an active, engaged volunteer community. Wikipedia, as one of the largest online volunteer contributor communities, presents one example of this approach. This Essay argues that the existing legal framework protecting intermediary companies in the United States empowers the Wikipedia community to ensure that information is accurate and well-sourced. The Essay further argues that current legal efforts to weaken these protections, in response to the “fake news” problem, are likely to create perverse incentives that will harm volunteer engagement and confuse the public. Finally, the Essay offers suggestions for other intermediaries beyond Wikipedia to help monitor their content through user community engagement.

Introduction

Wikipedia is well known as a free online encyclopedia that covers nearly any topic, including both the popular and the incredibly obscure. It is also an encyclopedia that anyone can edit, and it is one of the largest crowd-sourced, user-generated content websites in the world. This user-generated model is supported by the Wikimedia Foundation, which relies on the robust intermediary liability immunity framework of U.S. law to allow the volunteer editor community to work independently. Volunteer engagement on Wikipedia provides an effective framework for combating fake news and false information.

It is perhaps surprising that a project open to public editing could be highly reliable. Nevertheless, Wikipedia has historically performed similarly to traditional encyclopedias in terms of overall errors.1 The volunteer community has maintained consistently reliable, neutral, high-quality articles even on topics known to be controversial.2 In the course of dealing with the issue of “fake news,” which this Essay takes to include both intentionally and accidentally inaccurate factual reporting, Wikipedia has been robust in its efforts to present the public with high-quality information based on reliable sources.

The Wikimedia movement’s structure aids its ability to stave off the threat of fake news. The Wikimedia volunteer editor communities are self-motivated and self-governing. Editors would not contribute efforts if they did not feel a sense of ownership, control, and responsibility over the content of Wikipedia. In turn, the Wikimedia Foundation, the nonprofit that hosts the encyclopedia, relies on protections from intermediary liability, such as those enshrined in Section 230 of the Communications Decency Act (CDA 230) and the Digital Millennium Copyright Act (DMCA), to provide the volunteer editor community with the independence and individual agency needed to support this decentralized editing model. These protections keep Wikipedia accurate and robust, even in the face of fake news.

Part I of this Essay provides an overview of Wikipedia and its volunteer editing community as the primary and best-studied example of a user-generated content community, and discusses how Wikipedia’s practices prevent and redress false information. Part II details how the intermediary immunity legal regime supports communities like Wikipedia’s in addressing fake news and why weakening that regime would likely disrupt the effectiveness of the existing model, therefore making it more difficult to address fake news online. Part III suggests focusing regulation efforts more directly on fake news, and offers recommendations for other intermediaries to effectively implement a model like Wikipedia’s on their websites.

I. A Brief Overview of Wikipedia and How It Deals with Misinformation

Wikipedia is a free online encyclopedia, edited by a global community of volunteer users. Any member of the public can edit, even without creating an account, by clicking “Edit” at the top of the page on (almost) every article. In practice, many editors become highly engaged with Wikipedia; more than 7,500 editors have each made over 10,000 edits.3

The composition and motivations of the editor community in turn make it effective at dealing with false information, including fake news. Wikipedians are particularly adept at focusing on sources and presenting accurate facts, even when someone else tries to introduce false information into an article. For example, one of the Foundation’s recent transparency reports highlighted an actor’s attempt to use fake documents to change their birthdate as listed on Wikipedia. Wikipedia’s procedures for reviewing the validity of supporting documentation, however, prevented the desired (but erroneous) change.4 This approach of presenting accurate information and finding sources through engaged volunteers tends to work better than merely fact-checking the accuracy of a statement.5 The emphasis on community engagement and collaboration results in content that editors feel invested in, and users rely on each other to act in good faith. A reader who takes the time to look into the editor community can understand both why editors do their work and how independent those editors are, two major factors in creating trust in media.6

These editors (as well as some automated “bots” trained through tens or hundreds of thousands of real-world examples provided by editors7) revert the majority of misinformation quite quickly. Cluebot NG, one of the most prominent programmed bots trained through over 60,000 human edits,8 is currently set to catch approximately forty percent9 of all improper edits on Wikipedia and fix them within a few minutes.10 Other examples, such as the defacement of articles suggested by comedian Stephen Colbert,11 were caught preemptively. Those articles were temporarily locked by volunteer administrators to prevent inaccurate changes.12 Perhaps most significantly, in Wikipedia’s nearly sixteen-year history, the notable examples of inaccurate information that did last for some time are scarce and number only about a dozen.13

Wikipedia is successful due to an understanding captured in “Linus’s Law”: with many eyes, all bugs are shallow.14 As many people work to improve an article, the article trends toward higher quality over time, even if an individual edit is flawed. This is true even in the face of deliberate attempts to insert false information, since those are usually found by others and changed to match available high-quality, reliable sources.15

Linus’s Law itself originates more broadly in the Free and Open Source Software (FOSS) movement, which uses a fundamentally different model than the traditional top-down hierarchy of most organizations.16 The values of the FOSS movement are worth noting because they both pre-date and inform Wikipedia, suggesting that the model of volunteer contributors works effectively in general, not merely for one or two unique projects. The difference between regular software creation and FOSS has been likened to the distinction between a cathedral and a bazaar.17 The traditional method of development is understood as a cathedral, in which work must be done carefully over long periods of time to ensure proper completion. By contrast, the FOSS model, like a bazaar, is open to many people cooperating to find mistakes and fix them through repeated and rapid iteration.18

Further, Wikipedia policies are themselves written by the volunteer editor community and adopted through community consensus. Thus, standards such as notability and verifiability (the policies by which a topic qualifies for inclusion in Wikipedia and by which an article should be sourced, respectively) are actually created, adopted, and enforced by the volunteer community, rather than by employees of the Wikimedia Foundation.19 These policies establish clear criteria for appropriate content for Wikipedia and provide some of the most effective ways to identify inaccurate information. For example, if a statement is added without a source in violation of the verifiability policy, an editor or even a bot checking it can flag that edit as likely troublesome.

Why do editors do all of this work? Studies of the English Wikipedia editor community have revealed a few primary reasons. First, editors are invested in the mission of Wikipedia to contribute to freely available, accurate knowledge for everyone.20 Second, the involvement in a community and the reward of doing something well for that community motivate many people.21 Third, ownership of and control over their work underlie the motivations that many people have expressed for creating a good product.22 No other user-generated content website has been studied as extensively as Wikipedia. Still, some studies have been carried out with enough formality to suggest that the above motivations, particularly those related to group dynamics and belonging, are likely to apply more broadly.23

These motivations contribute directly to the way editors work by leading them to find particular niches where they can succeed, creating a strong overall community with many people working together to make high-quality articles. For example, some editors are particular subject matter experts, while many are passionate, skilled researchers who are able to learn and then write about a variety of topics in order to help the spread of free knowledge.24 In combination, the effect of all these motivated contributors working on different things is the creation of a high-quality encyclopedia.

II. How Intermediary Immunity Protects a Community-Centric Model and Combats Fake News

The legal framework for intermediary liability in the United States and many countries around the world protects intermediaries from liability for statements made on a website by their users, as well as for certain copyright infringement in works posted on the website. This framework varies somewhat by country, but centers on the idea that it is neither possible nor wise for a website that allows millions of individuals to contribute writing, pictures, or other media to attempt to screen each contribution to ensure its quality and legality. This remains true today, even with still-improving technical measures, because such measures can only filter for specific characteristics and are not always accurate.25 For example, copyright filtering is almost always overbroad and does not account for fair use or de minimis exceptions in U.S. law.26

The legal framework for intermediary liability and immunity is critical to supporting the decentralized FOSS editing model and creating the independence, motivation, and feeling of ownership that lead volunteers such as those in the Wikipedia community to commit to fact-checking. If intermediaries were forced to take more direct control of the websites they own due to legal risks and incentives, it would undercut the FOSS model and the complex web of motivation and engagement that leads to this fact-checking, therefore likely worsening the problem of fake news. This Part discusses the existing U.S. framework, court decisions that potentially weaken that framework, and the implications they may have for a decentralized fact-checking model like Wikipedia’s.

A. Summary of the U.S. Framework

The intermediary liability immunity framework in the U.S. consists of two primary laws: Section 230 of the Communications Decency Act (CDA 230)27 and Section 512 of the Digital Millennium Copyright Act (DMCA).28 Together, these two laws prevent a hosting company from (1) being treated as the publisher of material that is written by someone else, such as a user; and (2) being held liable for copyrighted material on the website unless the host has actual knowledge of infringement or receives a properly formatted removal request meeting all the notice requirements under the statute.29

Briefly, CDA 230 uses technical language to define what has come to be understood as a “hosting provider,” an individual or company that provides some kind of web platform where users create and upload their own content instead of the company exercising editorial control over the content (e.g., Wikipedia, YouTube, Facebook, or similar platforms).30 Under CDA 230, the hosting provider is not considered the publisher of what a user writes, meaning that the host is not liable under defamation laws or other civil causes of action (though this does not extend to intellectual property laws).31 The DMCA covers copyright, one of the primary types of intellectual property issues concerning most websites that host user-generated content. Under the DMCA, hosting providers have a “safe harbor” in which they are immune from copyright liability if they act reasonably to remove copyrighted content after receiving proper notice or knowledge of the content.32 This typically comes from a DMCA notice sent by the copyright owner, which sets forth statutory elements in order to allow a hosting provider to find and assess copyrighted content in order to make a removal.33

B. Challenges to Existing Protections

Two recent cases, Cross v. Facebook, Inc.34 and Mavrix Photographs, LLC v. LiveJournal, Inc.,35 present possible exceptions to CDA 230 and to the DMCA, respectively. Both may lead to unintended results, including censoring truthful information, demoralizing volunteer contributors who help patrol for false information, or confusing readers. In turn, these sorts of results are likely to harm the ability of volunteer communities on user-generated content websites to combat fake news.

In Cross v. Facebook, the plaintiff Cross filed a claim for misuse of publicity rights (rights that allow individuals to prevent commercial use of their image without permission) against Facebook for user-generated content posted to Facebook. The content criticized him for mistreatment of independent contractors at his recording label that allegedly led to two of the contractors falling asleep at the wheel while driving and suffering car crashes leading to significant injuries and death.36 The California Superior Court found that because rights of publicity are considered intellectual property claims under California law, such claims against the hosting site were not barred under CDA 230.37 The trial court’s reasoning potentially allows state claims for misuse of publicity rights to proceed for any speech on social media that focuses on an actual person and is published on a website that runs advertisements (as the advertisements could be seen as commercialization prohibited by the publicity right). This could allow someone to claim that accurate but negative reporting was “fake news” designed to profit off their image, potentially leading to a risk-averse content takedown, or at least to a confusing narrative that could mislead the public and make it difficult for editors to participate in fact-checking under the “bazaar” rather than the “cathedral” model. For example, if a famous actor were to endorse an unpopular position like racial hatred, the actor could use reasoning similar to that followed by the court in Cross to attempt to remove or confuse accurate reporting on the issue on any website that had advertising accompanying the page, which in turn could chill volunteers from contributing to improve the page.

In another troubling case, Mavrix v. LiveJournal, Mavrix, a photography company, submitted takedown requests for photos it owned that were posted on a LiveJournal site.38 LiveJournal invoked the DMCA safe harbors in response to these takedown requests.39 The Ninth Circuit noted that LiveJournal’s volunteer moderators could be considered “agents” of LiveJournal, in that they shaped the content on LiveJournal such that the content was not “at the direction of the user,” as the DMCA requires.40 Additionally, those volunteer moderators may have acquired actual or red flag knowledge of infringement, which could be attributed to LiveJournal.41 This case creates a perverse incentive for intermediaries to avoid efforts to moderate content. While this is presented in a copyright context, volunteer moderation efforts may cover many topics, including assisting with user complaints about inaccurate information that can help combat fake news. Incentivizing the removal or regulation of such moderation systems may allow false information to flourish. For example, had Mavrix been widespread doctrine when Stephen Colbert urged people to insert falsehoods online,42 his effort might not have been preempted by proactive moderation on the part of Wikipedia editors, and misinformation would have been readily available to the public because no moderator would have been in a position to review it.

These cases are merely the most recent in a broader trend of cases questioning the limits of intermediary immunity both in the United States and elsewhere in the world.43 They demonstrate how even small erosions to these protections can create a great deal of uncertainty and a strong incentive to over-censor to avoid risks. Given the location of many intermediaries, such as the Wikimedia Foundation, Google, Facebook, Twitter, and a myriad of smaller companies, the laws of California and the Ninth Circuit as outlined here are likely to govern a disproportionately large amount of the Internet.

C. How Intermediary Immunity Combats Fake News on Wikipedia

The intermediary framework supports and empowers the volunteer community on a platform such as Wikipedia in a few critical ways. First, by allowing the host company to avoid controlling content, it enables the volunteer editor community to set their own policies, rules, and governance mechanisms. This gives volunteers the sense of ownership critical for motivating them to improve the website and allows the bazaar model of editing to flourish, advancing the effectiveness of Linus’s Law. These factors are necessary for stopping the influence of fake news on the project. If the owner of a web domain were forced to take over significant editorial control (or worse, curtail some operations due to concerns over liability for statements made by volunteers), this volunteer motivation would be significantly harmed and the open environment that enables free volunteer engagement would be damaged. A project in such an environment would likely become more susceptible to the insertion of false information, and, to return to the example of Wikipedia, the encyclopedia itself could stagnate in such a situation. Accurate information on a range of topics would then simply not be available.

Wikipedia’s history shows that a return to a more centrally controlled cathedral model would not work as effectively as the FOSS bazaar model.44 Wikipedia originated as a side project that was intended to act as a draft space for Jimmy Wales and Larry Sanger’s original project, Nupedia, which would be controlled centrally and edited only by vetted experts.45 However, Nupedia grew slowly, and Wikipedia’s creators and community found that Wikipedia grew much faster while maintaining high quality and thorough treatment of its subjects.46 Fifteen years later, Wikipedia is of higher quality and of both greater breadth and depth than would have been achieved by a project with a single individual or entity acting as central editor.47

Further, reducing intermediary protections, through the cases outlined above or through other exceptions to CDA 230 or the DMCA, could force websites like Wikipedia to implement direct filtering systems to reduce liability. This would create several problems for confronting fake news. Because producers of fake news are willing to game the system to spread their content, transparency about filtering practices would allow bad actors to exploit loopholes for their own gain. Secrecy around filtering methods, however, can lead to further accusations of bias, and can make it even harder for people to distinguish between accurate and inaccurate information.48

The Wikimedia Foundation’s structure demonstrates how the intermediary protections are core not only to its operations, but also to the independence of the volunteer movement. Since inception, the Foundation has been committed to running Wikipedia through donations and avoiding commercial advertising or a small number of overly large sponsors that might create the appearance of control or bias.49 The intermediary immunity regime directly enables this model by allowing the Wikimedia Foundation to function as a nonprofit, donation-funded organization rather than an organization with considerable bureaucratic oversight. If the Foundation were responsible for monitoring and reviewing all of the content on Wikipedia, this would also harm the ability of the encyclopedia projects to grow. For example, Wikipedia is currently available in roughly 300 languages, far more than are spoken by Foundation staff members.50 If the Foundation were forced to do such monitoring, it would not be able to rely on volunteers and independent affiliate organizations around the world to support the movement. Those additional limitations, in turn, would pose a threat to the free knowledge mission, including the dissemination of accurate information and combating of fake news.

III. Options for Addressing Fake News Going Forward

In lieu of attempting to hold websites liable and forcing them to police fake news, strong protections for voluntary and community-led moderation efforts may generally be effective. If efforts by volunteers and hosts to work together to moderate content do not create host liability, intermediaries will be free to try different cooperative methods to improve the quality of their content. This will allow for greater transparency and community engagement, while still making resources available from the intermediary organization.

Consider, for example, voluntary moderation efforts to curtail extremist content such as terrorist propaganda.51 If the host is liable for such content, it will likely over-censor and take control away from users, making it difficult for volunteers to introduce factually accurate content. Members of the public would in turn be less likely to find accurate information, which contributes to bias, confusion, and misinformation.52 If the host is instead protected from liability, it will be free to work with users without risk, allowing for the development of transparent, community-embraced policies that take into account multiple viewpoints and are neutral in application to the extent possible.

Further, to the extent that fake news is spread through malicious intent, it does not come from intermediaries or the vast majority of the users on those intermediary platforms.53 Even if intermediaries became increasingly liable for false or misleading information on their platforms, the original creators would remain unpunished. The creators of fake news would be able to move on to new publishing platforms or encourage their followers to share false information via email or direct messaging. Placing the liability on intermediaries in this situation could therefore make it harder to discover the sources and track the spread of fake news without fully addressing the problem. On the other hand, strong intermediary protections that enable intermediaries and their users to safely engage in monitoring and moderation efforts could perhaps allow for a better understanding of these original malicious sources and enable them to be located (and corrected or removed as appropriate).

The example of Wikipedia demonstrates how a large volunteer user community can be incredibly effective at addressing fake news and false information if the users are empowered and engaged. Volunteers acting on the ground to identify and remove inaccurate information are able to cover a much wider space than a company trying to monitor information only with its own employees and programs. Further, volunteers may invent creative or surprising techniques (such as the Cluebot program mentioned earlier54) that effectively remove false information, but only when they can operate as part of a transparent, user-controlled system rather than as part of a proprietary platform.

While this has not been tested, I suspect that an effective means for users to assist in this manner is to find common policy ground among users of different backgrounds and beliefs. Wikipedians have agreed on both the need for an encyclopedia to be neutral55 and the need for common standards, including citations and source reliability.56 It is likely that agreement on the qualities and procedures that identify good sources, independent of the actual content within the source, allows users to cooperate effectively.

Support for engaging community members and understanding their values could allow websites to identify the broad principles on which their users agree (such as neutrality and source quality, in Wikipedia’s case). If a more expansive network such as Facebook or Twitter were to look into this solution, a likely first step could take the form of funding and expert advice to identify important stakeholders in their communities. Such experts could determine how to incorporate commentary and meaningful participation from stakeholders in crafting policies. The next step would be to host such a discussion with appropriate public notice and to have staff respond to questions, concerns, and suggestions. Finally, the company could analyze the discussion to identify the points of consensus, and then work to create or adapt technical tools that would empower engaged community volunteers to participate in curating content. If done correctly, this sort of solution would be much more likely to have a broad consensus, buy-in, and enforcement assistance from relevant community members. In turn, this could lead to a more effective regime of fact-checking on the part of such networks’ members.

Conclusion

This Essay has explored the structure of Wikipedia and how a large community of engaged, motivated volunteers who feel ownership in the quality of the work that they do can effectively address fake news and other false information. The intermediary immunity regime supports this kind of online community by allowing volunteer contributors to maintain their own independence and agency. Further, efforts to reduce intermediary protections would likely lead to over-censoring, accusations of bias, and confusion about accuracy. Instead, the most effective efforts to combat false information should focus on protecting the collaborative efforts of intermediaries and volunteers to improve quality, and on providing support for other intermediaries to understand their community’s values and create fair policies that are broadly supported by community consensus. Overall, fake news is a serious problem, but it is not one that will be solved by weakening the frameworks that support communities like Wikipedia in their efforts to create high-quality, accurate information on a range of topics.

Jacob Rogers is Legal Counsel for the Wikimedia Foundation. He manages the Foundation’s worldwide defensive litigation and copyright compliance practices. He also works regularly with outside parties to help them understand how Wikipedia works and resolve complaints. Special thanks to Ana Maria Acosta, former Legal Fellow at the Wikimedia Foundation, for her assistance in researching initial sources for this paper. Special thanks also to Leighanna Mixter, Legal Fellow at the Wikimedia Foundation, for assistance with editing and with drafting the legal background sections.

Preferred Citation: Jacob Rogers, Wikipedia and Intermediary Immunity: Supporting Sturdy Crowd Systems for Producing Reliable Information, 127 Yale L.J. F. 358 (2017), http://www.yalelawjournal.org/forum/wikipedia-and-intermediary-immunity.