Link 17 Page Essay Reddit Real Girls

Reddit, known as “the front page of the Internet,” has been one of the most widely visited Web sites since its inception in 2005. As a social networking site it is unique in that the personal relationships between its users are considered secondary to its content, which includes both original, user-generated content and links to outside sources. Although previous research has investigated other social networking platforms in depth, relatively little has been written on Reddit. The present research considers a variety of indicators, including text readability, emoticon usage, and domain linkage. It was found that the most popular communities on Reddit behave very differently from each other, in terms of language sophistication, sentiment, and topicality (as measured by top-level links to outside sources). The results can be used to inform future investigations of online discourse spaces, particularly those in the contemporary social media sphere.


Introduction




The online community Reddit, known as “the front page of the Internet,” is one of the most popular sites in cyberspace. As of September 2016 it is the ninth most-visited Web site in the United States and the 25th most-visited Web site worldwide (Alexa, 2016). It has been estimated that six percent of Internet-using adults visit the site (Duggan and Smith, 2013). A more recent study claims that “seven percent of U.S. adults report using the site,” with more than three-quarters of these using Reddit as a news source (Barthel, et al., 2016). Reddit, originally founded as a simple link-sharing platform in 2005, now describes itself as “a platform for communities to discuss, connect, and share in an open environment, home to some of the most authentic content anywhere online” (Reddit, 2016). These “communities” are more commonly known as subreddits, each of which has its own unique appearance, guidelines, team of moderators, and, of course, contributors (called “Redditors”). In this paper, subreddits are referred to in the following form: r/subredditName. This is a reference to the URL path at which a subreddit can be visited: for example, the “news” subreddit is referred to as r/news. This also follows a convention on Reddit, wherein references to a subreddit within a comment are expressed in the “r/subredditName” format.

Although Redditors are allowed to “subscribe” to subreddits (which means that threads from these communities show up on the user’s front page), in most instances it is not necessary to subscribe to a subreddit in order to contribute to the discourse. There are a variety of means by which Redditors can participate in a community, including posting comments on a thread, voting on the quality of threads and comments, and posting original threads, though in the latter case there are sometimes constraints on what can be posted. For example, r/politics (a community devoted to political matters in the United States) restricts users to posting links to outside articles (this only applies to users seeking to start a thread; user-generated content is encouraged in the comments section for any given thread). However, most subreddits allow for users to post original thoughts, ideas, and questions in top-level posts (many, such as r/explainlikeimfive [1] and r/AskReddit [2], are entirely comprised of original content).

Though it may be tempting to view Reddit as a monolithic enterprise, the nature of its subreddit system casts it instead as a loosely joined collection of vastly disparate communities. Reddit’s structure allows its users to reject permanent virtual identities (Bergstrom, 2011), and Becker (2013) observed that “each subreddit has its own theme and ‘personality,’ which cater to its online community of readers.” Moreover, subreddits allow for users to tailor the Reddit experience to their own interests (Mills, 2015).

Although previous research has analyzed social interaction across a variety of subreddits (e.g., Choi, et al., 2015), and subreddits have been discussed in a networked context with users as links and the various communities as nodes (Olson and Neal, 2015; see also Olson, 2013, for a visualization of community ties), there is a dearth of research on the actual content of comment threads. Accordingly, this study seeks to address this gap by adopting a content analysis approach in order to analyze readability and, to a lesser extent, user interaction sentiment across different subreddits (the subreddits under analysis were curated from a list of the most popular subreddits in terms of subscribers, as listed by the “reddit metrics” site). Specifically, emoticon analysis (as a measure of sentiment) was used in conjunction with a battery of readability tests (Flesch-Kincaid, Gunning fog, and SMOG). These readability tests take into account average sentence length, syllable count, etc. in order to estimate the grade level (i.e., years of formal education) required to understand a text. Finally, links provided by original posters (that is, the users who create the threads under analysis) were analyzed and classified, in order to determine which types of sites are linked to via the most popular threads on Reddit. This is of particular interest when one considers the “Reddit hug of death,” which occurs when a highly visible thread calls attention to a little-known site without a robust server; the resulting traffic often crashes the site, necessitating the posting of mirrors so that Redditors can view the linked content. A somewhat less dramatic instance of the same phenomenon can be found by looking at Wikipedia page views, wherein pages related to a popular Reddit thread often experience temporary increases in viewcounts (Moyer, et al., 2015).

In sum, the research was guided by the following questions:

RQ1: To what degree do different subreddits (and subreddit categories) differ in terms of readability?

RQ2: To what degree do different subreddits (and subreddit categories) use emoticons (both in terms of frequency and in terms of sentiment)?

RQ3: What types of sites do different subreddits (and subreddit categories) link to, taking into consideration only links that are found in original posts (OPs)?


Related work

The primary metric of the popularity of any given thread or comment on Reddit is its score. This may be compared to Facebook’s “like” system or Twitter's retweets, although Reddit allows users to downvote items (expressing dislike). Accordingly, assessing the popularity of a thread or comment is not quite as straightforward on Reddit as it is on other social networking sites, a state of affairs that results in Reddit offering several methods of “sorting” comments within a thread or threads within a subreddit. These different methods take into account variables such as time (“hot”), raw upvote scores (“top”), and upvote/downvote ratios (“best”).

Reddit’s voting system has been criticized, with Gilbert (2013) claiming that it does not consistently identify “potentially popular links” [3], thus defeating the purpose of the site. Another criticism is that it leads to “Karma whoring” [4], wherein “users address the lowest common denominator and usually extend already popular topics” (Richterich, 2014) in the hopes of obtaining upvotes and thus a higher karma score. However, it has also been found that Reddit’s voting system was effective at identifying high-quality “quotable phrases” (Bendersky and Smith, 2012). In addition, Turcotte, et al. (2015) have observed that “social media recommendations improve levels of media trust, and also make people want to follow more news from that particular media outlet in the future,” which indicates that it is likely that Reddit’s content delivery system has an effect on frequent visitors. Although it has been noted that “consumption of news from information/news Web sites is positively associated with higher trust, while access to information available on social media is linked with lower trust” (Ceron, 2015), the fact that Reddit by its very nature actually links to outside information sources suggests that it captures the best of both worlds: the experience of social media coupled with the authority granted by “official” news sites (see also Johnson and Kaye [2014], wherein it was found that “reliance on other online sources is linked to perceptions of high credibility of SNS”).

Readability scores

Three of the most commonly used readability tests are Flesch-Kincaid, Gunning fog, and SMOG. All of these tests return an estimated grade level — that is, the amount of formal education required to comprehend any given text. Whereas the Flesch-Kincaid formula (hereafter referred to simply as “Flesch”) takes into account the total number of syllables in a text (Kincaid, et al., 1975), the Gunning fog index only considers the total number of “complex words” (defined as those with at least three syllables) (Bogert, 1985). Both tests take into account the total number of words and sentences in the text under analysis. The SMOG index, consciously designed as a simpler yet more accurate version of Gunning fog, only considers the total number of sentences and the total number of “complex words” (Fitzsimmons, et al., 2010).
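As a rough illustration of how the three grade-level formulas differ in their inputs, the following Python sketch implements the standard published forms of each. The syllable counter is a crude vowel-group heuristic introduced here purely for illustration; it is an assumption, not the method used by any particular library, and the function names are hypothetical.

```python
import math
import re


def count_syllables(word: str) -> int:
    """Crude heuristic (an assumption): one syllable per run of vowels."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def text_counts(text: str) -> dict:
    """Collect the raw counts that the three formulas draw on."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    return {
        "sentences": max(1, len(sentences)),
        "words": max(1, len(words)),
        "syllables": sum(count_syllables(w) for w in words),
        # "Complex" words are those with at least three syllables.
        "complex": sum(1 for w in words if count_syllables(w) >= 3),
    }


def flesch_kincaid_grade(c: dict) -> float:
    """Uses words per sentence and syllables per word."""
    return 0.39 * (c["words"] / c["sentences"]) + 11.8 * (c["syllables"] / c["words"]) - 15.59


def gunning_fog(c: dict) -> float:
    """Uses words per sentence and the share of complex words."""
    return 0.4 * ((c["words"] / c["sentences"]) + 100 * (c["complex"] / c["words"]))


def smog(c: dict) -> float:
    """Uses only the sentence count and the complex-word count."""
    return 1.0430 * math.sqrt(c["complex"] * 30 / c["sentences"]) + 3.1291


if __name__ == "__main__":
    counts = text_counts("The committee deliberated extensively. Nobody agreed.")
    print(flesch_kincaid_grade(counts), gunning_fog(counts), smog(counts))
```

Because SMOG ignores total word count, a text with no complex words always receives the formula's constant floor of roughly 3.13, regardless of sentence length.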

Readability scores have been used across a variety of contexts in order to ascertain the “difficulty” of reading a document, including articles published in medical journals (Weeks and Wallace, 2002), business reports (Clatworthy and Jones, 2001; Jones, 1996; Smith and Taffler, 1992), mission statements (Busch and Folaron, 2005), and college textbooks (McConnell, 1982). Flesch scores have been used to detect deceptive language (Burgoon, et al., 2003) and analyze the degree of community present in a classroom setting (Rovai, 2002). In computer-mediated contexts, Flesch scores have been used to analyze online news articles (Knobloch-Westerwick and Johnson, 2014) and medical Web sites (Whitten, et al., 2008). All three scores were used to determine that a job analysis questionnaire was considered to be at the college-level in terms of readability (Ash and Edgell, 1975).

Perhaps the most frequent usage of readability tests in the academic literature is in regard to health care materials. The Gunning fog metric has been used to establish that health care literature provided to patients was too advanced for the intended audience (Gazmararian, et al., 1999), just as the same index (taken in conjunction with Flesch) indicated that consent forms regarding oncology protocols were also too advanced for lay patients (Grossman, et al., 1994). Both SMOG and Flesch were used by Beaver and Luker (1997) to determine that breast cancer booklets distributed in the U.K. were too advanced for most readers, while SMOG on its own has been used to argue that materials relating to other health-care topics are too advanced for the general public, including HIV/AIDS (Wells, 1994), strokes (Sullivan and O’Conor, 2001), and dentistry (Jayaratne, et al., 2013). Finally, SMOG scores have been used to argue that many online materials relating to health are written above the average reader’s comprehension level (Aliu and Chung, 2010).

In the field of scholarly communication, Gunning fog was used to discover that the peer-review process improves readability, though the end results are still too advanced for many readers (Roberts, et al., 1994). It may be that editors and reviewers either consciously or unconsciously shy away from simplifying submissions too much, as more “difficult” texts are seen as containing better research (Armstrong, 1980).

Applying these scores to a computer-mediated environment is somewhat more difficult, as noted by Sallis and Kassabova (2000), although with proper adjustments (e.g., filtering out unparseable elements such as URLs) they can be quite powerful. The SMOG index in particular has been found to be an accurate measure of the readability of online documents (Gottron and Martin, 2009). In analyzing online messages, Walther (2007) used the Flesch index to determine that writers used more complex language when they believed they were speaking with a university professor, while using less sophisticated language when they were under the impression that they were speaking with a high schooler, indicating a degree of “language accommodation” in this context. Flesch scores have also been used to analyze e-mail correspondence in newsgroups (Sallis and Kassabova, 2000). Finally, Flesch was used as a means of ensuring that texts were written at a fifth-grade reading level (that is, the texts should be readable by a 10–11 year old), in order to facilitate the demands of a usability study concerning a computer-mediated communication health application (Lin, et al., 2009).

Given that there is something of a pervasive trend of “official” materials being too advanced for the general public, one would expect content generated by lay people, such as that found on Reddit, to be far more comprehensible in terms of language sophistication. The question, then, is how much “simpler” the discourse on Reddit is, and how it varies across communities and topic types. A cursory observation of the site’s most popular subreddits (even in pure terms of simple comment length) indicates that comments in “entertainment”-based subreddits seem to be relatively short, and comment chains extend deeper into the tree most often when there is a recurring joke that multiple members are exploiting. Conversely, subreddits about “serious” topics (e.g., those classified as “news,” “politics/history,” and “science”) tend to encourage discussion (as opposed to mere reactions to an outside link or original story), which carries with it a greater attention to detail and information-seeking, which in turn would be expected to lead to a higher level of discourse in the comments section. Hence the following hypothesis:

H1: “Entertainment” subreddits will be rated as more accessible via readability tests when compared to “serious” subreddits.


In previous research, Dresner and Herring (2010) have described:

... three functions of emoticons ... (a) emotion, mapped directly onto facial expression (e.g., happy or sad); (b) nonemotional meaning, mapped conventionally onto facial expression (e.g., a wink as indicating joking intent; an anxious smile); and (c) illocutionary force indicators that do not map conventionally onto facial expression (e.g., a smile as downgrading a complaint to a simple assertion). [5]

All of these can be found in the Reddit environment. Although the impacts of emoticons on readers have been claimed to be minimal (Walther and D’Addario, 2001), emoticons do indicate intended sentiment, and indeed, Derks, et al. (2007a) found that emoticons “have an impact on message interpretation” and “are useful in strengthening the intensity of a verbal message.” Emoticons have also been used to aid machine learning efforts (Read, 2005). From another perspective, emoticons serve a useful purpose in professional e-mail messages (Skovholt, et al., 2014). Along the same lines, emoticons allow for non-verbal information to be communicated in a computer-mediated context (Lo, 2008), just as they allow users to clarify the intended meanings of their missives (Thompson and Filik, 2016). There does seem to be something of a generation gap in terms of using emoticons as non-verbal indicators (Krohn, 2004), but this is somewhat less relevant in the Reddit environment, given that Reddit users tend to be younger than Internet users as a whole (Barthel, et al., 2016). Finally, previous research has found that people “used more emoticons in socio-emotional than in task-oriented social contexts” [6], which leads to the second hypothesis:

H2: “Entertainment” focused subreddits will be more likely to use emoticons, while more “serious” subreddits will be less likely to use emoticons.

The results of this research can be used to gain insights into the manner in which Reddit as a whole operates, as well as to inform future research that seeks to analyze emoticon use, topicality, and readability in online communities.



A total of 204 subreddits were chosen for analysis; these were the subreddits with the largest number of subscribers in February 2016. Two separate samples, each consisting of the top 200 subreddits, were gathered during this time; as there were slight differences between the two lists, the total number of subreddits adds up to 204. These subreddits were identified via the “reddit metrics” site, which provides a wealth of information about Reddit, including the “fastest growing” subreddits, the “top new reddits,” and, most relevantly for the current research, a ranked list of subreddits (“Top subreddits,” 2016). All popularity metrics take the number of subscribers into account.

From each of the 204 subreddits, the top 75 threads of all time (based on Reddit’s “top” sorting feature) were chosen, and up to 500 comments were selected from each thread (when a thread contained more than 500 comments, the Reddit sorting algorithm was deferred to; that is, the 500 “best” comments were selected). In addition, the “top” threads in a 24-hour period were sampled from each subreddit (up to a maximum of 25 threads per subreddit), and on two separate occasions the “hot” threads in a 24-hour period were also sampled from each subreddit (again, for each of these samples, up to a maximum of 25 threads per subreddit were harvested) [7]. In short, the final sample consisted of the top all-time threads from the most popular subreddits, as well as a series of “snapshots” of these subreddits. Thus, this research is not intended to be a description of Reddit as a whole; rather, it seeks to consider what the most popular threads on the most popular subreddits are discussing, and the level of discourse evidenced in the most visible sections of the community.

Each of the subreddits was manually classified into one of 27 exclusive topical categories. While it is true that the comment trees in any given subreddit often drift away from the topic of the original post, the different subreddit categories effectively establish something of a baseline prompt, which makes the topical groupings of the subreddits meaningful. The classifications provided in the ModeratorDuck subreddit were used as a starting point (“Categorization of all subreddits,” 2014), although many subreddits were not included in this list, and several amendments needed to be made. A list of these categories, along with brief descriptions and member subreddits, can be found in Table A in the Appendix.

The study consists of three distinct sections: readability scores, sentiment (as measured by emoticon usage), and domain analyses. Each of these will be discussed in turn.

Readability scores


The Text-Statistics PHP library was used to facilitate large-scale computations of readability across the corpus (Childs, 2016). Individual scores were calculated for each comment, which were then averaged together to calculate the mean scores for each readability test for each subreddit. Finally, the mean scores of the three different readability tests (Flesch, Gunning fog, and SMOG) were averaged together in order to determine a mean readability score for each subreddit. These scores are expressed as a number that indicates the estimated grade level of education required to understand a text; thus, a score of 8 would indicate that a text is written at an eighth-grade reading level (that is, a 13-year-old student should be able to read the text without any significant difficulties).
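The two-stage averaging described above (per-comment scores averaged within each test, then the three test means averaged per subreddit) can be sketched as follows. This is an illustrative Python reconstruction, not the study's PHP code; the function name, the input layout, and the test labels are assumptions.

```python
from collections import defaultdict
from statistics import mean


def subreddit_readability(comment_scores):
    """Aggregate per-comment scores into per-subreddit means.

    comment_scores: iterable of (subreddit, {test_name: grade_level}) pairs,
    one pair per comment (hypothetical input layout).
    """
    # subreddit -> test name -> list of per-comment scores
    collected = defaultdict(lambda: defaultdict(list))
    for subreddit, scores in comment_scores:
        for test_name, value in scores.items():
            collected[subreddit][test_name].append(value)

    results = {}
    for subreddit, tests in collected.items():
        # Stage 1: mean score per readability test.
        per_test = {name: mean(values) for name, values in tests.items()}
        # Stage 2: mean of the per-test means.
        results[subreddit] = {
            "per_test": per_test,
            "overall": mean(list(per_test.values())),
        }
    return results


if __name__ == "__main__":
    sample = [
        ("r/example", {"flesch": 4.0, "fog": 6.0, "smog": 8.0}),
        ("r/example", {"flesch": 6.0, "fog": 8.0, "smog": 10.0}),
    ]
    print(subreddit_readability(sample))
```

Averaging the test means rather than pooling all scores gives each test equal weight, even if some comments were excluded from a single test.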

Although the Text-Statistics library provided two further readability tests (Coleman-Liau and Automated Readability Index), these were found to be unsatisfactory for computer-mediated environments (particularly when a subreddit contained a high proportion of posts along the lines of “HAHAHAHAHAHAHAHAHAHAHAHAHA”), and thus these scores were not taken into consideration for this project (though it should be noted that the Automated Readability Index has been used in relation to product reviews, wherein language is somewhat more regulated, e.g., Hu, et al., 2012). In addition, any individual comments that were outside the normal Flesch range (0–100) were excluded from analysis for this particular test. A full list of results can be found in Table B in the Appendix.
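To see why character-based formulas struggle with such posts, consider the standard published forms of the Automated Readability Index and Coleman-Liau, both of which scale with the number of letters per word. The Python sketch below is illustrative only; its tokenization is an assumption and may differ from that of the Text-Statistics library.

```python
import re


def char_based_counts(text: str):
    """Return (letters, words, sentences) using simple regex tokenization."""
    words = re.findall(r"[A-Za-z0-9]+", text)
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    letters = sum(len(w) for w in words)
    return letters, max(1, len(words)), max(1, len(sentences))


def automated_readability_index(text: str) -> float:
    letters, words, sentences = char_based_counts(text)
    return 4.71 * (letters / words) + 0.5 * (words / sentences) - 21.43


def coleman_liau(text: str) -> float:
    letters, words, sentences = char_based_counts(text)
    letters_per_100 = letters / words * 100
    sentences_per_100 = sentences / words * 100
    return 0.0588 * letters_per_100 - 0.296 * sentences_per_100 - 15.8


if __name__ == "__main__":
    laugh = "HA" * 13  # a single 26-letter "word"
    print(automated_readability_index(laugh))  # far above any real grade level
    print(coleman_liau(laugh))
    print(automated_readability_index("The cat sat on the mat."))
```

A single 26-letter token yields a nominal grade level above 100 under both formulas, which illustrates how a handful of such comments can distort a subreddit's mean.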

Sentiment [Emoticons]

The emoticon list was drawn from the EmoticonLookupTable.txt file included in the SentiStrength download (see Thelwall, et al., 2010; Thelwall, et al., 2012). This file includes a list of emoticons mapped to their perceived sentiment (e.g., a smiley face — :) — is given a score of “1,” whereas a frowny face — :( — is given a score of “-1”). However, this list was slightly adapted in order to facilitate analysis. Specifically, the “:/” emoticon was removed from the list, as it generated a large number of false positives due to the presence of URLs in the comments (http:// being the most common violator). Emoji and other non-textual emoticons were not considered, as the Reddit platform only permits textual characters in the comments section.

Two separate analyses were carried out: the percentage of comments per subreddit that used at least one emoticon, and the average sentiment of all emoticons used across a given subreddit’s sampled comments. The results can be found in Table C in the Appendix.
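The two measures can be sketched in Python as follows. The lookup table here contains only the two emoticons named in the text; the real SentiStrength file is much larger. Substring matching is assumed because it reproduces the URL false positives described above, and the score attached to “:/” in the demonstration is a hypothetical placeholder.

```python
def emoticon_hits(comment: str, lookup: dict) -> list:
    """Return one sentiment score per emoticon occurrence in the comment."""
    hits = []
    for emoticon, score in lookup.items():
        hits.extend([score] * comment.count(emoticon))
    return hits


def subreddit_emoticon_stats(comments: list, lookup: dict):
    """Return (percent of comments with >= 1 emoticon, mean emoticon score)."""
    all_scores = []
    comments_with_emoticon = 0
    for comment in comments:
        hits = emoticon_hits(comment, lookup)
        if hits:
            comments_with_emoticon += 1
        all_scores.extend(hits)
    percent = 100.0 * comments_with_emoticon / len(comments) if comments else 0.0
    mean_score = sum(all_scores) / len(all_scores) if all_scores else 0.0
    return percent, mean_score


if __name__ == "__main__":
    lookup = {":)": 1, ":(": -1}  # stand-in for the full lookup table
    comments = ["great job :)", "see http://example.com", "ugh :(", "nice :) :)"]
    print(subreddit_emoticon_stats(comments, lookup))

    # Adding ":/" (placeholder score) makes the URL-only comment count as a hit.
    with_slash = dict(lookup)
    with_slash[":/"] = -1
    print(subreddit_emoticon_stats(comments, with_slash))
```

In the second call the comment containing only a URL is counted as using an emoticon, which is the kind of false positive that motivated dropping “:/” from the list.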

Domain analysis

Whereas some subreddits prohibit the posting of links in top-level posts (often because the nature of the subreddit, such as r/explainlikeimfive and r/showerthoughts, stipulates that top-level posts should only consist of text, often in the form of a question or statement), others prohibit the posting of anything but links (for example, r/politics, wherein OPs must link to an outside source, and the title of the post must be drawn from said source). Keeping this in mind, it is instructive to consider all OP links across the entire sample, as these represent the outside domains that were linked to most often in the most popular posts in the most popular subreddits (obviously, subreddits that prohibit outside links in OPs are not represented in this analysis). A list of Web sites that were linked to at least 10 times across the entire sample can be found in Table D in the Appendix. These Web sites (n = 21,797) represent 81.1 percent of the links that could be found in the OPs across the sample.
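A tally of this kind might be produced along the following lines. This Python sketch is an assumption about the procedure rather than the study's code: it treats the URL hostname (minus a leading “www.”) as the “Web site,” whereas the study may have grouped subdomains differently, and the function names are hypothetical.

```python
from collections import Counter
from urllib.parse import urlparse


def link_domain(url: str) -> str:
    """Reduce a URL to a lowercase hostname without a leading 'www.'."""
    host = urlparse(url).hostname or ""
    if host.startswith("www."):
        host = host[4:]
    return host


def frequent_domains(op_links: list, min_count: int = 10):
    """Return domains linked at least min_count times, plus their coverage.

    Coverage is the percentage of all OP links that point to a kept domain.
    """
    counts = Counter(link_domain(url) for url in op_links)
    kept = {domain: n for domain, n in counts.items() if n >= min_count}
    coverage = 100.0 * sum(kept.values()) / len(op_links) if op_links else 0.0
    return kept, coverage


if __name__ == "__main__":
    links = (
        ["https://www.imgur.com/a"] * 10
        + ["http://imgur.com/b"] * 2
        + ["https://example.org/x"] * 3
    )
    print(frequent_domains(links))
```

The coverage figure corresponds to the reported share of OP links accounted for by sites meeting the 10-link threshold.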

The various Web sites were classified into one of five categories. A list of these categories, along with example sites and brief descriptions, can be found in Table 1 (the precise categories can be found in Table D in the Appendix).


Table 1: Site classifications.
Category | Description
GIFs | Sites that host silent videos, animations, clips, [...]
Images | Sites that host static [...]
News | Sites that provide current [...]
User-generated content | Sites that rely on original user-generated [...]
Videos | Sites that host videos with [...]




For each of the dependent variables, a one-way ANOVA was calculated to predict the dependent variable based on the subreddit category variable. A significant finding indicates that the dependent variable is influenced by the subreddit category. All of these relationships were found to be significant at p < .0001, and all had a moderate effect size (η² > .25), per Ferguson (2009) (Table 2). Accordingly, we can say that a subreddit’s category has a predictive effect on all of the variables under analysis: the average readability score of a subreddit is dependent on the subreddit’s category, the category of a subreddit is a reliable predictor of emoticon usage within the subreddit, etc.
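For readers who wish to check the reported statistics, the following Python sketch computes a one-way ANOVA F-statistic and η² from grouped values, and also recovers η² from a reported F and its degrees of freedom. With 27 categories and 204 subreddits, the degrees of freedom are 27 - 1 = 26 and 204 - 27 = 177, matching the F(26, 177) values in Table 2. The function names are hypothetical, and no p-value is computed here.

```python
def one_way_anova(groups: list):
    """Return (F, df_between, df_within, eta_squared) for lists of values."""
    all_values = [v for group in groups for v in group]
    grand_mean = sum(all_values) / len(all_values)
    group_means = [sum(group) / len(group) for group in groups]

    # Variation of group means around the grand mean, weighted by group size.
    ss_between = sum(
        len(group) * (m - grand_mean) ** 2 for group, m in zip(groups, group_means)
    )
    # Variation of individual values around their own group mean.
    ss_within = sum(
        (v - m) ** 2 for group, m in zip(groups, group_means) for v in group
    )

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    eta_squared = ss_between / (ss_between + ss_within)
    return f_stat, df_between, df_within, eta_squared


def eta_squared_from_f(f_stat: float, df_between: int, df_within: int) -> float:
    """Recover eta squared from a reported one-way ANOVA F-statistic."""
    return f_stat * df_between / (f_stat * df_between + df_within)


if __name__ == "__main__":
    print(one_way_anova([[1, 2, 3], [2, 3, 4], [5, 6, 7]]))
    # Mean readability row of Table 2: F(26, 177) = 9.901
    print(round(eta_squared_from_f(9.901, 26, 177), 3))
```

Applying the second function to each row of Table 2 is a quick consistency check between the reported F-statistics and effect sizes.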


Table 2: ANOVA results for dependent variables.
Dependent variable | F-statistic | η²
Emoticon score | F(26, 177) = 4.375 | 0.391
Emoticon percentage | F(26, 177) = 4.586 | 0.403
GIFs | F(26, 177) = 3.562 | 0.344
Images | F(26, 177) = 9.042 | 0.57
News | F(26, 177) = 7.836 | 0.535
Readability scores (mean) | F(26, 177) = 9.901 | 0.593
Videos | F(26, 177) = 9.958 | 0.594
User-generated content | F(26, 177) = 3.66 | 0.35

Readability scores



The mean readability scores across the 27 subreddit categories ranged from 4.6 (indicating a fourth- to fifth-grade reading level) to 7.8 (indicating a seventh- to eighth-grade reading level). Subreddits that were classified as “porn,” “GIFs,” “videos,” and “images” (which, taken together, might be considered a “multimedia” macro category) were found at the lower end of the spectrum, while subreddits classified as “philosophy/religion,” “business/finance,” and “science” (all of which could be considered more “academic,” or at least more likely to spark intricate discourse) were found at the upper end of the spectrum. The full results can be seen in Figure 1 (the y-axis numbers have been selected to emphasize the distinctions between the subreddit categories).


Figure 1: Mean readability scores by subreddit type.


In terms of the Tukey tests, the subreddits classified as “porn” and “videos” (the latter not containing any pornographic material) were consistently rated as having a less-sophisticated discourse style than other subreddits, particularly compared to those categorized as “philosophy/religion” and, to a lesser degree, “business/finance.” In addition, “sports” subreddits tended to rank on the lower end of the readability spectrum. This suggests that the “philosophy/religion” and “business/finance” subreddits contain in-depth discourse (perhaps with the usage of terms that are sufficiently “sophisticated” to register highly on the various readability tests), along with relatively intricate sentence construction. Conversely, the “porn,” “videos,” and “sports” subreddits tend towards comments that consist of simple language with little technical jargon. Moreover, it appears that more actual “discussion” goes on in subreddits that ranked higher on the readability tests (as the resulting back-and-forth between members engenders increasingly sophisticated discourse), whereas the lower-ranked subreddits consist more of simple opinions or arguments (“You’re wrong,” etc.).

Sentiment [Emoticons]

Subreddits classified as “sports,” “random/assorted,” and “humor” had the lowest mean emoticon score (indicating that they had a tendency to use negative emoticons, a tendency to avoid positive emoticons, or a combination of both), while subreddits classified as “relationships” and “health/food” had the highest mean emoticon scores (Figure 2). In terms of emoticon use frequency, subreddits classified as “politics/history” and “news” were the least likely to contain comments that used emoticons, while subreddits classified as “health/food” were the most likely to contain comments that used emoticons (Figure 3).


Figure 2: Mean emoticon scores by subreddit type.



Figure 3: Mean emoticon percentage by subreddit type.


Domain analysis

The types of sites that were linked to by OPs were heavily dependent on the subreddit classification. “Porn” subreddits consistently linked to “GIF” sites at a much higher rate than other subreddit types. Unsurprisingly, “images” subreddits, along with “GIFs,” “photography,” and “porn” subreddits were most likely to link to “images” sites. Similarly, “news” and “business/finance” subreddits were most likely to link to “news” sites (as were “politics/history” subreddits, albeit to a somewhat lesser degree). Finally, the “meta” and “sports” subreddits were the most likely to link to sites containing “user-generated content.”



The various subreddits exhibited a number of differences in posting style and content. It was perhaps the readability analysis that exhibited the starkest differences, as the topical focus of a subreddit was a reliable predictor of the complexity of its discourse. Subreddits that aim at answering questions or encouraging discussion (e.g., r/science, r/philosophy, r/AskHistorians) possessed the most linguistically advanced discourses, whereas subreddits such as r/ass, r/milf, r/gonewild, and r/Amateur (the “porn” subreddits) were consistently ranked at the bottom (the conversations on the latter subreddits were generally conducted at no higher than a fourth grade reading level). The latter is hardly surprising, as the words used most often across subreddits such as r/RealGirls (with stopwords ignored) consisted almost exclusively of expletives, obscenities, and terms such as “love,” “yeah,” “fake,” “face,” and “hot.” Conversely, the subreddits that ranked highest in terms of linguistic sophistication used words such as “moral,” “human,” “access,” “articles,” “research,” “science,” “question,” and “answer,” indicating concepts that lend themselves to a deeper level of discourse (as well as an environment in which a question/answer dynamic is frequent, suggesting that requests for more information may lead to more erudite discussions).

Four subreddits — r/AskHistorians, r/philosophy, r/askscience, and r/changemyview — averaged an eighth grade reading level, which was the highest average level observed across the sample, with the exception of two notable outliers: r/rickandmorty and r/circlejerk. These subreddits scored abnormally highly on at least one readability test, illustrating the imperfections inherent in applying these tools to a computer-mediated context. The latter subreddit is easy to explain, as a single post that scores abnormally highly on one or more of the readability tests (e.g., “hahahahahahahahahaha”) will often be copied by many subsequent users, many of whom may add their own variations, most of which will fall outside of the “normal” realm of discourse expected by the readability tests. This same effect can be seen in r/rickandmorty, wherein the results were heavily skewed by the presence of one thread wherein more than 60 posts simply consisted of a long string of capital Hs. When these results were removed, the subreddit ranked near the bottom in terms of linguistic sophistication. As a final note, it is worth mentioning that r/DepthHub, which by its own description “gathers the best in-depth submissions and discussion on Reddit,” ranked highly in the readability tests, simultaneously validating the success of this subreddit and the applicability of selected readability tests for a computer-mediated environment.

The emoticon analysis was not quite as revealing (nor were the subreddits at either end of the spectrum as easily classified), but there were still some interesting findings. Of the six subreddits with a negative score (indicating a greater proportion of emoticons classified as negative by SentiStrength’s dictionary), two are sports-related (r/nba and r/nfl), possibly because sports discussions tend to involve negativity towards players, teams, etc. However, it is important to emphasize that, on the whole, “sports” subreddits still had a positive emoticon score, although this was the lowest score witnessed across the 27 subreddit categories.

In terms of raw emoticon counts (not taking positivity/negativity into account), subreddits classified as “health/food” tended to contain more emoticons, whereas subreddits classified as “politics/history” and “news” tended to contain fewer emoticons. This appears to be due to the fact that many “health/food” subreddits involve dieting, wherein emoticons may be used as encouragement (or may be used as reflections of a poster’s individual experiences). Specifically, the two subreddits with the highest percentage of comments containing emoticons, r/loseit (10.43 percent) and r/MakeupAddiction (13.18 percent), are both lifestyle subreddits wherein motivational statements are highly valued. Subreddits such as r/SkincareAddiction and r/bodyweightfitness are not much further down the list. It is also important to note that these “health/food” subreddits ranked highest in terms of emoticon sentiment, further lending credence to the idea that contributors to these subreddits use positive emoticons as a means of encouraging others and solidifying the community.

Conversely, subreddits such as r/liberal and r/conservative (two of the three subreddits with the lowest percentage of comments containing emoticons, with 1.11 percent and 1.26 percent, respectively), r/news (1.35 percent), and r/politics (1.61 percent) seem to shy away from emoticon use, possibly because factual discussion is prioritized in these communities over personal opinions (or, alternately, because proffered opinions are expected to be “straight,” without any emoticon embellishments or other niceties). Interestingly, “humor” subreddits rank in the lower third of subreddit categories in terms of emoticon usage, suggesting that it may be considered somewhat gauche to use emoticons in these subreddits. A possible explanation is that, in these subreddits, images, videos, GIFs, and plain text are used for humorous effect, and thus emoticons would be the Internet equivalent of a laugh track — leading, distracting, and frowned upon by many in the community. Yet another possible explanation is that the inclusion of r/4chan in the “humor” category may have contributed greatly to this score, considering that 4chan is associated with a rather venomous type of humor.

Finally, with regard to the domain analysis, the vast majority of sites linked to by the top Reddit threads are either mainstream news organizations (e.g., Guardian, New York Times) or social media (e.g., YouTube, Twitter, Imgur). This simultaneously supports and argues against Reddit’s claim to being “the front page of the Internet”: whereas the most popular threads on the most popular subreddits clearly link to popular sites, it is also true that the front page of Reddit is not necessarily the best place to seek out information that is available only in less widely known venues. The most visible segments of Reddit, then, appear to reflect the most visible segments of the Internet, from Wikipedia to Imgur/YouTube to English-speaking news sites (both in the U.K. and in the U.S., which is hardly surprising for an English-based Web site). Of course, the long tail evidenced in regard to linked domains (accounting for 19.1 percent of domains) indicates that Reddit does manage to highlight lesser-known venues, although these venues are (rather predictably) not as prominent as more mainstream sources.
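
A domain analysis of this kind can be sketched in a few lines of Python. The example below tallies the host of each linked URL and reports how much of the total falls outside the most-linked domains. The function names, the choice to strip a leading "www.", and the definition of the long tail as the share of links outside the top `top_n` domains are all assumptions made for illustration; they may differ from how the 19.1 percent figure above was derived.

```python
from collections import Counter
from urllib.parse import urlparse


def domain_counts(urls):
    """Count links per host, dropping a leading 'www.' (illustrative choice)."""
    counts = Counter()
    for url in urls:
        host = urlparse(url).netloc.lower()
        if host.startswith("www."):
            host = host[4:]
        if host:  # skip strings that do not parse to a host
            counts[host] += 1
    return counts


def long_tail_share(counts, top_n):
    """Percentage of links that fall outside the top_n most-linked domains."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    head = sum(n for _, n in counts.most_common(top_n))
    return 100.0 * (total - head) / total


if __name__ == "__main__":
    links = [
        "https://www.youtube.com/watch?v=1",
        "https://youtube.com/x",
        "https://imgur.com/a",
        "https://example.org/post",
    ]
    tallies = domain_counts(links)
    print(tallies.most_common(2), long_tail_share(tallies, 1))
```

Note that subdomains (for example, an image host and its main site) are counted separately here; whether to merge them is a design decision that can noticeably change the apparent size of the tail.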



The topical category of any given subreddit is a reliable predictor of the content within the subreddit. While some consequences are expected (e.g., certain subreddits prohibit links in OPs, whereas others require a link to a site such as Imgur; in these situations, it is not surprising that there are systematic differences in OP link domains), others are more surprising. Different subreddits exhibit vastly varying levels of discourse sentiment and sophistication, with “porn” subreddits using more basic vocabularies (posts often consist simply of crass statements such as “hot girl”) and “philosophy/religion” subreddits using the most sophisticated vocabularies. The “health/food” subreddits simultaneously use emoticons most frequently and use them the most positively, indicating that encouraging and motivating others via emoticons is an integral part of being a member of these communities.

Future research in this area could undertake a more comprehensive sentiment analysis based on actual language patterns, although this would most likely need to be conducted manually. Similarly, a robust topical analysis of OPs and their associated comments would go a long way towards determining what, precisely, people are talking about on Reddit. Finally, this study only considered the most popular comments on the most popular threads in the most popular subreddits. The reasoning behind this was that it was considered desirable to analyze what the average Reddit user sees. However, it would be interesting to see if this study’s findings hold up across the whole of Reddit. What is certain, however, is that differences will continue to be found, as Reddit is not a monolithic enterprise. Rather, it is effectively a collection of very different communities, with different participants, different goals, and different norms, which share a common platform but very little else. This may well be the reason for its continued popularity; given that users can locate, join, and even create communities with ease, and given the wide variation in topicality, language use, and user interaction across the site, a large variety of audiences can find much of worth on “the front page of the Internet.”


About the author

Andrew Tsou is a Ph.D. student in the Department of Information & Library Science at Indiana University Bloomington. His research interests include computer-mediated communication and discourse patterns across social media platforms.
E-mail: iatsou [at] umail [dot] iu [dot] edu



The author would like to thank Patrick Shih for his assistance in preparing this manuscript.



1. Often referred to as “ELI5” for short, this subreddit allows users to ask questions about a variety of topics, with the understanding that responses should be written as simply as possible (despite the subreddit’s name, responses are not expected to be comprehensible by actual five-year-olds). Analogies are often used to make complicated points.

2. “AskReddit” is somewhat more informal than “explainlikeimfive,” in that many questions involve asking the Reddit community about their opinions/suggestions on a variety of topics.

3. Gilbert, 2013, p. 803.

4. “Karma” refers to the points that a user accrues by posting popular comments. It is something of a status symbol on Reddit, similar to retweets on Twitter and “likes” on Facebook.

5. Dresner and Herring, 2010, p. 263.

6. Derks, et al., 2007b, p. 842.

7. For more information about Reddit’s sorting system, see


References

Alexa, 2016. “Reddit,” at, accessed 30 September 2016.

Oluseyi Aliu and Kevin C. Chung, 2010. “Readability of ASPS and ASAPS educational websites: An analysis of consumer impact,” Plastic and Reconstructive Surgery, volume 125, number 4, pp. 1,271–1,278.
doi:, accessed 25 October 2016.

J. Scott Armstrong, 1980. “Unintelligible management research and academic prestige,” Interfaces, volume 10, number 2, pp. 80–86.
doi:, accessed 25 October 2016.

Ronald A. Ash and Steven L. Edgell, 1975. “A note on the readability on the Position Analysis Questionnaire (PAQ),” Journal of Applied Psychology, volume 60, number 6, pp. 765–766.
doi:, accessed 25 October 2015.

Michael Barthel, Galen Stocking, Jesse Holcomb, and Amy Mitchell, 2016. “Nearly eight-in-ten Reddit users get news on the site,” Pew Research Center (25 February), at, accessed 30 September 2016.

Kinta Beaver and Karen Luker, 1997. “Readability of patient information booklets for women with breast cancer,” Patient Education and Counseling, volume 31, number 2, pp. 95–102.
doi:, accessed 25 October 2016.

Bernd Becker, 2013. “Learning analytics: Insights into the natural learning behavior of our students,” Behavioral & Social Sciences Librarian, volume 32, number 1, pp. 63–67.
doi:, accessed 25 October 2016.

Michael Bendersky and David A. Smith, 2012. “A dictionary of wisdom and wit: Learning to extract quotable phrases,” Proceedings of the Workshop on Computational Linguistics for Literature, co-located with the 2012 Conference of the North American Chapter of the Association for Computational Linguistics, pp. 69–77; version at, accessed 25 October 2016.

Kelly Bergstrom, 2011. “‘Don’t feed the troll’: Shutting down debate about community expectations on Reddit.com,” First Monday, volume 16, number 8, at, accessed 25 October 2016.
doi:, accessed 25 October 2016.

Judith Bogert, 1985. “In defense of the Fog Index,” Bulletin of the Association for Business Communication, volume 48, number 2, pp. 9–11.
doi:, accessed 25 October 2016.

Judee K. Burgoon, J.P. Blair, Tiantian Qin, and Jay F. Nunamaker Jr., 2003. “Detecting deception through linguistic analysis,” In: Hsinchun Chen, Richard Miranda, Daniel D. Zeng, Chris Demchak, Jenny Schroeder, and Therani Madhusudan (editors). Intelligence and security informatics: Proceedings of the First NSF/NIJ Symposium, ISI 2003, Tucson, AZ, USA, June 2–3, 2003. Lecture Notes in Computer Science, volume 2665. Berlin: Springer, pp. 91–101.
doi:, accessed 25 October 2016.

Monique Busch and Gail Folaron, 2005. “Accessibility and clarity of state child welfare agency mission statements,” Child Welfare, volume 84, number 3, pp. 415–430.

“Categorization of all subreddits,” 2014. At, accessed 22 October 2016.

Andrea Ceron, 2015. “Internet, news, and political trust: The difference between social media and online media outlets,” Journal of Computer–Mediated Communication, volume 20, number 5, pp. 487–503.
doi:, accessed 25 October 2016.

Dave Childs, 2016. “Text-Statistics” (PHP library), at, accessed 22 October 2016.

Daejin Choi, Jinyoung Han, Taejoong Chung, Yong-Yeol Ahn, Byung-Gon Chun, and Ted Taekyoung Kwon, 2015. “Characterizing conversation patterns in Reddit: From the perspectives of content properties and user participation behaviors,” COSN ’15: Proceedings of the 2015 ACM on Conference on Online Social Networks, pp. 233–243.
doi:, accessed 25 October 2016.

Mark Clatworthy and Michael John Jones, 2001. “The effect of thematic structure on the variability of annual report readability,” Accounting, Auditing & Accountability Journal, volume 14, number 3, pp. 311–326.
doi:, accessed 25 October 2016.

Daantje Derks, Arjan E.R. Bos, and Jasper von Grumbkow, 2007a. “Emoticons and online message interpretation,” Social Science Computer Review, volume 26, number 3, pp. 379–388.
doi:, accessed 25 October 2016.

Daantje Derks, Arjan E.R. Bos, and Jasper von Grumbkow, 2007b. “Emoticons and social interaction on the Internet: The importance of social context,” Computers in Human Behavior, volume 23, number 1, pp. 842–849.
doi:, accessed 25 October 2016.

Eli Dresner and Susan C. Herring, 2010. “Functions of the nonverbal in CMC: Emoticons and illocutionary force,” Communication Theory, volume 20, number 3, pp. 249–268.
doi:, accessed 25 October 2016.

Maeve Duggan and Aaron Smith, 2013. “6% of online adults are Reddit users,” Pew Research Center (3 July), at, accessed 25 October 2016.

Christopher J. Ferguson, 2009. “An effect size primer: A guide for clinicians and researchers,” Professional Psychology: Research and Practice, volume 40, number 5, pp. 532–538.
doi:, accessed 25 October 2016.

P.R. Fitzsimmons, B.D. Michael, J.L. Hulley, and G.O. Scott, 2010. “A readability assessment of online Parkinson’s disease information,” Journal of the Royal College of Physicians of Edinburgh, volume 40, number 4, pp. 292–296.
doi:, accessed 25 October 2016.

Julie A. Gazmararian, David W. Baker, Mark V. Williams, Ruth M. Parker, Tracy L. Scott, Diane C. Green, S. Nicole Fehrenbach, Junling Ren, and Jeffrey P. Koplan, 1999. “Health literacy among Medicare enrollees in a managed care organization,” Journal of the American Medical Association, volume 281, number 6, pp. 545–551.
doi:, accessed 25 October 2016.

Eric Gilbert, 2013. “Widespread underprovision on Reddit,” CSCW ’13: Proceedings of the 2013 Conference on Computer Supported Cooperative Work, pp. 803–808.
doi:, accessed 25 October 2016.

Thomas Gottron and Ludger Martin, 2009. “Estimating Web site readability using content extraction,” WWW ’09: Proceedings of the 18th International Conference on World Wide Web, pp. 1,169–1,170.
doi:, accessed 25 October 2016.

Stuart A. Grossman, Steven Piantadosi, and Charles Covahey, 1994. “Are informed consent forms that describe clinical oncology research protocols readable by most patients and their families?” Journal of Clinical Oncology, volume 12, number 10, pp. 2,211–2,215.

Nan Hu, Indranil Bose, Noi Sian Koh, and Ling Liu, 2012. “Manipulation of online reviews: An analysis of ratings, readability, and sentiments,” Decision Support Systems, volume 52, number 3, pp. 674–684.
doi:, accessed 25 October 2016.

Yasas S.N. Jayaratne, Nina K. Anderson, and Roger A. Zwahlen, 2014. “Readability of websites containing information on dental implants,” Clinical Oral Implants Research, volume 25, number 12, pp. 1,319–1,324.
doi:, accessed 25 October 2016.

Thomas J. Johnson and Barbara K. Kaye, 2014. “Credibility of social network sites for political information among politically interested Internet users,” Journal of Computer–Mediated Communication, volume 19, number 4, pp. 957–974.
doi:, accessed 25 October 2016.

Michael John Jones, 1996. “Readability of annual reports: Western versus Asian evidence — A comment to contexualize,” Accounting, Auditing & Accountability Journal, volume 9, number 2, pp. 86–91.
doi:, accessed 25 October 2016.

J. Peter Kincaid, Robert P. Fishburne Jr., Richard L. Rogers, and Brad S. Chissom, 1975. “Derivation of new readability formulas (automated readability index, fog count and Flesch reading ease formula) for Navy enlisted personnel,” Research Branch Report, 8–75. Millington, Tenn.: Naval Technical Training Command; version at, accessed 25 October 2016.

Silvia Knobloch–Westerwick and Benjamin K. Johnson, 2014. “Selective exposure for better or worse: Its mediating role for online news’ impact on political participation,” Journal of Computer–Mediated Communication, volume 19, number 2, pp. 184–196.
doi:, accessed 25 October 2016.

Franklin B. Krohn, 2004. “A generational approach to using emoticons as nonverbal communication,” Journal of Technical Writing and Communication, volume 34, number 4, pp. 321–328.
doi:, accessed 25 October 2016.

Carolyn A. Lin, Patricia J. Neafsey, and Zoe Strickler, 2009. “Usability testing by older adults of a computer-mediated health communication program,” Journal of Health Communication, volume 14, number 2, pp. 102–118.
doi:, accessed 25 October 2016.

Shao-Kang Lo, 2008. “The nonverbal communication functions of emoticons in computer-mediated communication,” CyberPsychology & Behavior, volume 11, number 5, pp. 595–597.
doi:, accessed 25 October 2016.

Campbell R. McConnell, 1982. “Readability formulas as applied to college economics textbooks,” Journal of Reading, volume 26, number 1, pp. 14–17.

Richard A. Mills, 2015. “Reddit.com: A census of subreddits,” WebSci ’15: Proceedings of the ACM Web Science Conference, article number 49.
doi:, accessed 25 October 2016.

Daniel Moyer, Samuel L. Carson, Thayne Keegan Dye, Richard T. Carson, and David Goldbaum, 2015. “Determining the influence of Reddit posts on Wikipedia pageviews,” Proceedings of the Ninth International AAAI Conference on Web and Social Media, at, accessed 25 October 2016.

Randal S. Olson, 2013. “redditviz — reddit interest network,” at, accessed 25 October 2016.

Randal S. Olson and Zachary P. Neal, 2015. “Navigating the massive world of Reddit: Using backbone networks to map user interests in social media,” PeerJ Computer Science, volume 1, article e4, at, accessed 25 October 2016.
doi:, accessed 25 October 2016.

Jonathon Read, 2005. “Using emoticons to reduce dependency in machine learning techniques for sentiment classification,” ACLstudent ’05: Proceedings of the ACL Student Research Workshop, pp. 43–48.

Reddit, 2016. “Reddit content policy,” at, accessed 30 May 2016.

Annika Richterich, 2014. “‘Karma, precious karma!’ Karmawhoring on Reddit and the Front Page’s econometrisation,” Journal of Peer Production, number 4, at, accessed 25 October 2016.

John C. Roberts, Robert H. Fletcher, and Suzanne W. Fletcher, 1994. “Effects of peer review and editing on the readability of articles published in Annals of Internal Medicine,” Journal of the American Medical Association, volume 272, number 2, pp. 119–121.

Alfred P. Rovai, 2002. “Development of an instrument to measure classroom community,” Internet and Higher Education, volume 5, number 3, pp. 197–211.
doi:, accessed 25 October 2016.

Philip Sallis and Diana Kassabova, 2000. “Computer-mediated communication: Experiments with e-mail readability,” Information Sciences, volume 123, numbers 1–2, pp. 43–53.
doi:, accessed 25 October 2016.

Karianne Skovholt, Anette Grønning, and Anne Kankaanranta, 2014. “The communicative functions of emoticons in workplace e-mails: :-),” Journal of Computer–Mediated Communication, volume 19, number 4, pp. 780–797.
doi:, accessed 25 October 2016.

Malcolm Smith and Richard Taffler, 1992. “The chairman’s statement and corporate financial performance,” Accounting & Finance, volume 32, number 2, pp. 75–90.
doi:, accessed 25 October 2016.

Karen Sullivan and Frances O’Conor, 2001. “A readability analysis of Australian stroke information,” Topics in Stroke Rehabilitation, volume 7, number 4, pp. 52–60.
doi:, accessed 25 October 2016.

Mike Thelwall, Kevan Buckley, and Georgios Paltoglou, 2012. “Sentiment strength detection for the social Web,” Journal of the American Society for Information Science and Technology, volume 63, number 1, pp. 163–173.
doi:, accessed 25 October 2016.

Mike Thelwall, Kevan Buckley, Georgios Paltoglou, Di Cai, and Arvid Kappas, 2010. “Sentiment strength detection in short informal text,” Journal of the American Society for Information Science and Technology, volume 61, number 12, pp. 2,544–2,558.
doi:, accessed 25 October 2016.

Dominic Thompson and Ruth Filik, 2016. “Sarcasm in written communication: Emoticons are efficient markers of intention,” Journal of Computer–Mediated Communication, volume 21, number 2, pp. 105–120.
doi:, accessed 25 October 2016.

“Top subreddits,” 2016. At, accessed February 2016.

Jason Turcotte, Chance York, Jacob Irving, Rosanne M. Scholl, and Raymond J. Pingree, 2015. “News recommendations from social media opinion leaders: Effects on media trust and information seeking,” Journal of Computer–Mediated Communication, volume 20, number 5, pp. 520–535.
doi:, accessed 25 October 2016.

Joseph B. Walther, 2007. “Selective self-presentation in computer-mediated communication: Hyperpersonal dimensions of technology, language, and cognition,” Computers in Human Behavior, volume 23, number 5, pp. 2,538–2,557.
doi:, accessed 25 October 2016.

Joseph B. Walther and Kyle P. D’Addario, 2001. “The impacts of emoticons on message interpretation in computer-mediated communication,” Social Science Computer Review, volume 19, number 3, pp. 324–347.
doi:, accessed 25 October 2016.

William B. Weeks and Amy E. Wallace, 2002. “Readability of British and American medical prose at the start of the 21st century,” British Medical Journal, volume 325, number 7378, pp. 1,451–1,452.
doi:, accessed 25 October 2016.

James A. Wells, 1994. “Readability of HIV/AIDS educational materials: The role of the medium of communication, target audience, and producer characteristics,” Patient Education and Counseling, volume 24, number 3, pp. 249–259.
doi:, accessed 25 October 2016.

Pamela Whitten, Sandi Smith, Samantha Munday, and Carolyn LaPlante, 2008. “Communication assessment of the most frequented breast cancer websites: Evaluation of design and theoretical criteria,” Journal of Computer–Mediated Communication, volume 13, number 4, pp. 880–911.
doi:, accessed 25 October 2016.




Table A: Subreddit classifications and justifications.
Classification | Justification | Subreddits
Business/Finance | Subreddits concerned with finances or business practices | business; Economics; personalfinance; shutupandtakemymoney
Drugs | Subreddits dedicated to alcohol, drugs, etc. | Drugs; trees; woahdude
Gaming | Subreddits about specific games or the gaming experience in general | DestinyTheGame; DotA2; Fallout; GameDeals; Games; gaming; GlobalOffensive; hearthstone; leagueoflegends; Minecraft; pokemon; PS4; skyrim; smashbros; Steam; wow
General information | Subreddits containing general information that does not fit into any other category | BuyItForLife; DIY; everymanshouldknow; freebies; GetMotivated; LearnUselessTalents; lifehacks; LifeProTips; malefashionadvice; travel; TwoXChromosomes; YouShouldKnow
GIFs | Subreddits containing links to silent, animated images | aww; creepy; gifs; holdmybeer; reactiongifs; Unexpected; Whatcouldgowrong; wheredidthesodago
Health/Food | Subreddits about exercise, food, etc. | bodyweightfitness; Cooking; EatCheapAndHealthy; Fitness; food; loseit; MakeupAddiction; SkincareAddiction
Humor | Subreddits containing jokes, humor, memes, etc. | 4chan; AdviceAnimals; AnimalsBeingJerks; BlackPeopleTwitter; cats; cringepics; dadjokes; facepalm; funny; humor; ImGoingToHellForThis; Jokes
Images | Subreddits containing images and other static visual arts (includes discussions of such as well); photographs are in a separate category | Art; comics; CrappyDesign; dataisbeautiful; fffffffuuuuuuuuuuuu; meirl; oddlysatisfying; OldSchoolCool; pics; QuotesPorn; tattoos; TumblrInAction; wallpapers; youdontsurf
Meta | Subreddits that refer to the Reddit experience | announcements; bestof; blog; circlejerk; DepthHub;; SubredditDrama; TrueReddit
Movies/Television | Subreddits about movies and television | anime; breakingbad; doctorwho; Documentaries; fullmoviesonyoutube; gameofthrones; movies; NetflixBestOf; rickandmorty; scifi; StarWars; television; thewalkingdead
Music | Subreddits about music | hiphopheads; listentothis; Music
News | Subreddits focused on contemporary events | news; nottheonion; offbeat; UpliftingNews; worldnews
Non-subreddit | Special category for links harvested from the front page and from r/all | all; frontPage
Philosophy/religion | Subreddits about religious or philosophical topics | atheism; Futurology; philosophy
Photography | Subreddits containing links to photographs (or discussion of such); this category takes precedence over “Images” | AbandonedPorn; EarthPorn; FoodPorn; HistoryPorn; MapPorn; photography; photoshopbattles; RoomPorn
Politics/history | Subreddits about contemporary political topics or historical topics | conservative; conspiracy; firstworldanarchists; guns; history; liberal; politics
Porn | Subreddits containing NSFW content (note that not all subreddits with the word “Porn” in their names are actually pornography in the traditional sense) | Amateur; ass; Boobies; BustyPetite; cumsluts; FiftyFifty; gentlemanboners; gonewild; holdthemoan; milf; nsfw; nsfwgifs; RealGirls
Questions | Subreddits wherein participants can ask questions about various topics | AskHistorians; AskMen; AskReddit; askscience; AskWomen; changemyview; DoesAnybodyElse; explainlikeimfive; iama; OutOfTheLoop; shittyaskscience
Random/assorted | Subreddits with content that does not fit into any other category | geek; interestingasfuck; InternetIsBeautiful; mildlyinfuriating; mildlyinteresting; MorbidReality; nonononoyes; WTF
Reading and Writing | Subreddits about specific books, writing prompts, or reading/writing in general | asoiaf; books; harrypotter; nosleep; WritingPrompts
Regions | Subreddits dedicated to specific locations/countries | canada; europe
Relationships | Subreddits about interpersonal relationships | relationships; seduction; sex
Science | Subreddits dedicated to scientific topics | psychology; scholar; science; space; spaceporn
Sports | Subreddits about sports | hockey; nba; nfl; soccer; sports; worldcup
Stories | Subreddits wherein users share textual stories (often original thoughts or of the "this happened to me" variety) | cringe; Frugal; JusticePorn; offmychest; PerfectTiming; Showerthoughts; TalesFromRetail; talesfromtechsupport; thatHappened; tifu; todayilearned
Technology | Subreddits about specific technologies (phones, etc.) or technology in general (e.g., programming) | AlienBlue; Android; apple; baconreader; buildapc; gadgets; learnprogramming; linux; pcmasterrace; programming; technology
Videos | Subreddits containing links to videos | UnexpectedThugLife; videos; youtubehaiku



Table B: Readability scores by subreddit.
Subreddit | Subreddit classification | Flesch | Gunning fog | SMOG | Mean
asoiaf | Reading and Writing | 6.278979 | 7.332256 | 5.407599 | 6.339611
books | Reading and Writing | 6.389078 | 7.832435 | 5.800451 | 6.673988
BuyItForLife | General information | 5.76634 | 7.301313 | 5.26382 | 6.110491
DIY | General information | 5.410893 | 6.809351 | 4.845638 | 5.688627
everymanshouldknow | General information | 5.995496 | 7.385413 | 5.276011 | 6.218973
freebies | General information | 5.241357 | 5.980361 | 4.568176 | 5.263298
GetMotivated | General information | 5.862264 | 6.676124 | 4.89901 | 5.812466
harrypotter | Reading and Writing | 6.045412 | 6.840469 | 5.221772 | 6.035885
LearnUselessTalents | General information | 5.572609 | 6.45065 | 4.695851 | 5.573037
lifehacks | General information | 5.514462 | 6.562441 | 4.686507 | 5.587803
LifeProTips | General information | 5.904225 | 7.189206 | 5.141292 | 6.078241
malefashionadvice | General information | 5.529903 | 6.655028 | 4.869444 | 5.684792
nosleep | Reading and Writing | 5.53151 | 6.234105 | 4.569758 | 5.445124
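
The Mean column above appears to be the arithmetic mean of the three scores in each row; for r/asoiaf, (6.278979 + 7.332256 + 5.407599) / 3 is approximately 6.339611. The Python sketch below shows that calculation alongside the standard published formulas for the Flesch-Kincaid grade level, the Gunning fog index, and the SMOG grade. It assumes that the "Flesch" column reports the Flesch-Kincaid grade level rather than Flesch reading ease, which the table itself does not state. Word, sentence, and syllable counts are taken as inputs because counting them (syllables in particular) is implementation-specific. The function names are illustrative, and this is not the code of the Text-Statistics library cited in the references.

```python
import math


def flesch_kincaid_grade(words, sentences, syllables):
    """Flesch-Kincaid grade level from raw counts."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


def gunning_fog(words, sentences, complex_words):
    """Gunning fog index; complex_words are words of three or more syllables."""
    return 0.4 * ((words / sentences) + 100.0 * (complex_words / words))


def smog(sentences, polysyllables):
    """SMOG grade; polysyllables are words of three or more syllables."""
    return 1.0430 * math.sqrt(polysyllables * (30.0 / sentences)) + 3.1291


def mean_grade(*scores):
    """Arithmetic mean of several grade-level scores."""
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Scores taken from the r/asoiaf row of Table B.
    print(round(mean_grade(6.278979, 7.332256, 5.407599), 6))
```

Averaging the three indices smooths over their different sensitivities (sentence length versus polysyllabic vocabulary), at the cost of a figure that no longer corresponds to any single published scale.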

Last Wednesday afternoon I called Michael Brutsch. He was at the office of the Texas financial services company where he works as a programmer and he was having a bad day. I had just told him, on Gchat, that I had uncovered his identity as the notorious internet troll Violentacrez (pronounced Violent-Acres).

"It's amazing how much you can sweat in a 60 degree office," he said with a nervous laugh.

Judging from his internet footprint, Brutsch, 49, has a lot to sweat over. If you are capable of being offended, Brutsch has almost certainly done something that would offend you, and then done his best to rub your face in it. His speciality is distributing images of scantily-clad underage girls, but as Violentacrez he also issued an unending fountain of racism, porn, gore, misogyny, incest, and exotic abominations yet unnamed, all on the sprawling online community Reddit. At the time I called Brutsch, his latest project was moderating a new section of Reddit where users posted covert photos they had taken of women in public, usually close-ups of their asses or breasts, for a voyeuristic sexual thrill. It was called "Creepshots." Now Brutsch was the one feeling exposed and it didn't suit him very well.

But Michael Brutsch is more than a monster. Online, Violentacrez has been one of Reddit's most reviled characters but also one of its most beloved users. The self-described "creepy uncle of Reddit" has played a little-known but crucial role in Reddit's development into the online juggernaut it is today. In real life, Brutsch is a military father and cat-lover. He lives with his wife in the Dallas suburb of Arlington, Texas. There are many sides to Violentacrez, and now that I had Michael Brutsch on the phone I hoped to find out where the troll ended and the real person began.


I first became aware of Violentacrez last year, when controversy erupted over a section, called "Jailbait," that Violentacrez had created on Reddit dedicated to sexualized images of underaged girls. (Brutsch adapted the name from "Violent Acres," a popular anonymous blogger he was fond of in the mid-2000s.) Reddit, for the uninitiated, is essentially a social news site; with a free username, anyone can submit and vote on content and can do so anonymously. And anyone can start a forum on Reddit dedicated to their interests, known as a subreddit. Today, there are about 10,000 active subreddits out of nearly 100,000 total, spanning a dizzying array of topics from funny pictures, to Power Rangers, to pooping. If a post gets enough "upvotes," as they're called, it can be propelled to the front page of Reddit and a massive audience.

The breadth of topics and dedication of users has made Reddit, which calls itself the "front page of the internet," the single dominant force in internet culture today, boasting over 3.4 billion pageviews this August. It reached a new level of legitimacy last month, when President Obama held a Q & A on Reddit. These days, Reddit is mentioned in the same breath as Twitter and Facebook by pundits expounding on the power of social media.

But Reddit's laissez-faire attitude towards offensive speech has led to a vast underbelly that rivals anything on the notorious cesspool 4chan. And with Jailbait, Violentacrez decided to create a safe space for people sexually attracted to underage girls to share their photo stashes. I would call these people pedophiles; the Jailbait subreddit called them "ephebophiles." Jailbait was the online equivalent of systematized street harassment. Users posted snapshots of tween and teenage girls, often in bikinis and skirts. Many of these were lifted from their Facebook accounts and thrown in front of Jailbait's 20,000 horny subscribers.

Violentacrez and his fellow moderators worked hard to make sure every girl on jailbait was underage, diligently deleting any photos whose subjects seemed older than 16 or 17. Violentacrez himself posted hundreds of photos. Jailbait became one of Reddit's most popular subreddits, generating millions of pageviews a month. "Jailbait" was for a time the second biggest search term bringing traffic to Reddit, after "Reddit." Eventually, Jailbait landed on CNN, where Anderson Cooper called out Reddit for hosting it, and Violentacrez for creating it. The ensuing outcry led Reddit administrators to reluctantly ban Jailbait, and all sexually suggestive content featuring minors.

Michael Brutsch at a Reddit meet-up.

On the phone, Michael Brutsch insisted he is not a pedophile but was unapologetic about Jailbait. He compared the photos of underage girls he posted to Britney Spears' sultry "Hit Me Baby One More Time" video. She was 16 at the time, he said—how was that different than what he was doing? Brutsch said he only reposted photos that he'd found elsewhere, mostly on 4chan, and that he promptly removed any outright child porn that was posted.

"I've always been upfront about the sorts of things that I find attractive," he said. Brutsch didn't create the Creepshots subreddit, which was launched earlier this year. But when it started to get heat after a teacher in Georgia was fired at the end of September for allegedly posting covert pictures of his underage students, it only made sense that the section's moderators would bring Violentacrez on to help deal with the newfound attention. He was a moderator until Creepshots was banned this week amid increasing controversy. (The circumstances surrounding Creepshots' ban are unclear, as Reddit's General Manager had told Buzzfeed they would not ban the subreddit because it wasn't breaking Reddit's rules.)

Having his screenname mangled by Anderson Cooper on CNN for Jailbait was Violentacrez's biggest moment as a troll, but it wasn't his first time in the spotlight. Since Brutsch stumbled on Reddit from a link on the internet culture blog Boing Boing in 2007, he has pushed the boundaries of Reddit's free-speech culture. He has done this mostly through creating offensive subreddits to troll sensitive users. Some of the sections Violentacrez created or moderated were called:

  • Chokeabitch
  • Niggerjailbait
  • Rapebait
  • Hitler
  • Jewmerica
  • Misogyny
  • Incest

You can look those up on Reddit and visit them if you'd like to ruin your day, but the content is self-explanatory.

Unlike Jailbait, which apparently sprang from a sincere interest, many of Violentacrez's most offensive subreddits were created just to enrage other Reddit users. At this they were very effective. What happened was, some do-gooder would stumble upon one of his offensive subreddits and expose it to the rest of Reddit in an outraged post. Then thousands more would vote the thing to the front page of Reddit. Cries to censor it would sound out, to be almost inevitably beaten back by cries of "free speech!" The idea of free speech is sacred to many Reddit users, a product of the free-wheeling online message board culture from which Reddit springs. If you criticize someone else for posting something you don't like, you are a whiny fascist.

Violentacrez explained his trolling philosophy to the internet culture website the Daily Dot in August of 2011. He had sparked yet another controversy by posting a graphic image of a partially clothed woman being brutally beaten by a large man, in "beatingwomen," a subreddit dedicated to glorifying violence against women. A Redditor had called out the picture in a post, and it was voted to the front page.

"People take things way too seriously around here," Violentacrez said. "I was not surprised by the outrage of the person who made the post, because I see it all the time. What was surprising was the community support for it. Most posts that complain about these things never do very well, and are quickly buried or deleted. I think it's interesting how many people defend my right to act the way I do, while decrying my posts themselves."

A troll exploits social dynamics like computer hackers exploit security loopholes, and Violentacrez calmly exploited the Reddit hive mind's powerful outrage machine and free speech values at the same time.

It was this pattern, repeated to various degrees dozens of times, that made Violentacrez an unlikely hero to many of the white male geeks who make up Reddit's hard core. They saw Violentacrez as a champion in the fight against the oppressive schoolmarms: "He upheld a certain amount of freedom for the worst of us to ensure freedom for all of us," wrote one user in a post mourning his departure. Fans followed him wherever he went on the site.

As his fame grew, Brutsch began selling T-shirts with an illustration of a zombified version of Reddit's alien logo, designed by a professional illustrator, that he had adopted as Violentacrez's logo. He created a subreddit called Violentacrez, dedicated to news and posts about himself. Last year, the Daily Dot named him the most important Redditor of the year. Violentacrez was the most influential user of one of the most influential websites on the internet.

All the while, Violentacrez's critics cried out the same refrain: "How does he get away with this?" One reason Violentacrez continued to occupy such a high-profile position on Reddit was of course his free speech rhetoric. But Violentacrez has historically had a close relationship with Reddit's staff, a fact far less well-known than his controversial behavior. Violentacrez was a troll, but he was a well-connected troll. He told me he was close with a number of early Reddit employees—many of whom have now moved on—chatting with them on IRC or sometimes even on the phone. A few years ago, while Jailbait was still going strong, Reddit's administrators gave him a special one-of-a-kind "pimp hat" badge to honor his contributions to the site, which he proudly displayed on his profile. Brutsch said he was even in the final running for a job as a customer support representative at Reddit last year.

During the Jailbait controversy, Erik Martin, the site's General Manager, reached out to Violentacrez beforehand to warn him that they were going to have to shut down his prized possession, according to a chat conversation Violentacrez leaked at the time.

"Want to give you a heads up," Martin wrote. "We're making a policy change regarding jailbait type content. Don't really have a choice."

(Martin did not respond to requests for comment.)

Violentacrez's privileged position came from the fact that for years he had helped administrators deal with the massive seedy side of Reddit, acting almost as an unpaid staff member. Reddit administrators essentially handed off the oversight of the site's NSFW side to Violentacrez, according to former Reddit lead programmer Chris Slowe (a.k.a. Keysersosa), who worked at Reddit from 2005 to the end of 2010. When Violentacrez first joined the site and started filling it with filth, administrators were wary and they often clashed. But eventually administrators and Violentacrez came to an uneasy truce, according to Slowe. For all his unpleasantness, they realized that Violentacrez was an excellent community moderator and could be counted on to keep the administrators abreast of any illegal content he came across.

"Once we came to terms he was actually pretty helpful. He would come to us with things that we hadn't noticed," said Slowe. "At the time there was only four of us working so that was a great resource for us to have."

Administrators realized it was easier to outsource the policing of questionable content to Violentacrez than to dirty their hands themselves, or ostracize him and risk even worse things happening without their knowledge. The devil you know. So even as Jailbait flourished and became an ever-more-integral part of Reddit's traffic and culture—in 2008 it won the most votes in a "subreddit of the year" poll—administrators looked the other way. "We just stayed out of there and let him do his thing and we knew at least he was getting rid of a lot of stuff that wasn't particularly legal," Slowe said. "I know I didn't want it to be my job."

Violentacrez's close relationship to administrators made him an elite member of Reddit's army of moderators, known as "mods" on the site. Though much is made of the millions of users who submit content to Reddit, it's Reddit's over 20,000 volunteer mods who are the real secret behind its success. They act as janitors and editors, keeping their subreddits clean and well-stocked with content. Reddit's main innovation has been to move these users up the food chain, from simple content-generators to management positions. This allows Reddit's mind-boggling breadth of content and users to be overseen by just a few paid employees. The downside is that it requires Reddit's official management to enter into uneasy symbiotic relationships with sketchy but effective moderators like Violentacrez.

And sometimes those relationships become more trouble than they're worth. After the Jailbait controversy, Violentacrez claimed repeatedly on Reddit, he was cut off from administrators who had been burned by the controversy. In fact, when I spoke to him, Brutsch said Reddit admins had been keeping their distance for a while. He suggested that the site wasn't what it used to be. In recent days, he has been posting less, stirring up less drama.

When it comes to mods, the political model of Reddit is not so much a vast digital democracy, as it's often framed by fans and users, as online feudalism. Moderators like Violentacrez are given absolute control over their turf in exchange for keeping the kingdom of Reddit strong. Moderators become more or less powerful in direct relation to the number and popularity of the subreddits they moderate, so they try to take over other subreddits to boost their profile in the community. Inevitably, Reddit's administrators develop relationships with the most influential moderators. Like feuding medieval lords vying for the king's favor, moderators form alliances or wage epic flame wars over power struggles.

This is how Violentacrez, Reddit's creepiest user, also became its most powerful. Sure, he was responsible for the absolute worst stuff on Reddit, and by extension, some of the worst stuff on the internet. But Violentacrez was also seen to be, as Chris Slowe put it to me, "a trustworthy and a positive member of the community." He moderated more than 400 subreddits and had many high-profile friends, amassed over many years. His stable at times included hundreds of popular mainstream subreddits, like Funny and WTF, that reach audiences of millions. Violentacrez further solidified his reach by becoming a mentor to other moderators. He created the first FAQ for Reddit's rather unintuitive moderator interface. He also helmed a number of subreddits dedicated to providing guidance and camaraderie for other moderators, including the essential modhelp.

So it was no surprise that when news got out earlier this week that I was working on a story that would expose Violentacrez's real identity, other moderators on Reddit rallied to defend him. The popular politics subreddit led the charge, by banning all Gawker links.

"As moderators, we feel that this type of behavior is completely intolerable," they wrote. "We volunteer our time on Reddit to make it a better place for the users, and should not be harassed and threatened for that. We should all be afraid of the threat of having our personal information investigated and spread around the internet if someone disagrees with you."

Some have taken this as an expression of Reddit users' fondness for Violentacrez's pornographic generosity. In fact the ban was probably more an expression of friendship by the Politics subreddit moderators. Violentacrez probably trained some of them. They were mad that their buddy was going to be outed for simply, in their mind, exercising his free speech—his unalienable right to anonymously post stalker shots of women.


When I called Brutsch that Wednesday afternoon and told him I knew who he was, I was a little taken aback by how calm he remained during our intense but civil hour-long conversation. I had figured that a man whose hobby was saying horrible shit just to screw with people online would rise to some new horrible level when conditions on the ground actually called for it. Instead he pleaded with me in an affectless monotone not to reveal his name.

"My wife is disabled. I got a home and a mortgage, and if this hits the fan, I believe this will affect negatively on my employment," he said. "I do my job, go home watch TV, and go on the internet. I just like riling people up in my spare time."

I asked if he regretted anything he had posted, now that he'd be found out. No, he said. "I would stand by exactly what I've done." The problem was, he explained, that if his identity got out, his many enemies would start attaching lies to his name because they simply don't like his views. They would say he was a child pornographer, when all he had done was spearhead the distribution of thousands of legal photos of underage girls. They would say the fact that he created a subreddit dedicated to Hitler meant he was anti-Semitic, when really it was just trolling. (Brutsch says he's got Jewish blood himself: "If you see a picture of me, I'm about as Jewish looking as they get.") They would Google-bomb his name and the word "pedophile" along with his publicly-traded company's name.

He needed to keep his anonymity to protect his ability to express things many people think but hardly anyone says. With Violentacrez, "I got the freedom to talk about my personal life, my personal feelings... I'm sure there's more than one person in this building who's a pervert," he said, referring to his office building.

He asked a number of times if there was anything he could do to keep me from outing him. He offered to act as a mole for me, to be my "sockpuppet" on Reddit. "I'm like the spy who's found out," he said. "I'll do anything. If you want me to stop posting, delete whatever I posted, whatever. I am at your mercy because I really can't think of anything worse that could possibly happen. It's not like I do anything illegal."

I told him it wasn't my place to tell him what to do, that I was just reporting on what he'd already done, but this did shake me a bit. It didn't help that our phone call had been unplanned and I hadn't properly steeled myself for a tough conversation. In the beginning it was just supposed to be a friendly Gchat conversation with Violentacrez, not a confrontation with Brutsch himself.

I had initially told Violentacrez I was interested in profiling him in light of the new controversy surrounding creepshots. I arranged the Gchat interview without hinting that a former online friend had tipped me off to his real identity during the Jailbait scandal, after the friend had become disgusted with his obsession with underage girls. Since then, Violentacrez had recorded the geek podcast The Drill Down with other high-profile Reddit moderators, outing his voice. All I had to do was call up Michael Brutsch and match his voice to Violentacrez's. My plan that Wednesday was to have the chat with Violentacrez before calling Brutsch. I didn't want to risk calling Brutsch first, only to have him shut down completely once he realized he was outed.

Unfortunately, I've never been good at keeping secrets. My poker face is so bad it can be read even through a computer screen, apparently. In our Gchat, I pressed Violentacrez about his anonymity enough that he grew suspicious. We were chatting about why he felt comfortable attending IRL meet-ups of Redditors if his anonymity was so important to him when he caught on.

me: it seems like you're not super careful about keeping your identity under wraps, if you meet people in real life. A lot of trolls I've talked to would never do that or give out as many details about themselves as they do.

violentacrez: have you been given my real name?

me: yeah

violentacrez: that's not good

me: it seems like you've told a lot of people. Are you surprised it would get out?

violentacrez: yes, I thought I could trust those who know. Are you going to out me?

Panicking a bit, I quickly picked up the phone and dialed the number I had found on Brutsch's online resume so I could hear Brutsch's voice to see if it matched Violentacrez's. It did.

"So, are you going to out me?" he said.


One thing that Brutsch wasn't worried about when I talked to him on the phone was his immediate family finding out about his online habits.

"He won't really care," said Brutsch of his teenage son, the one about to join the Marines. "He thinks I'm creepy as it is."

The Violentacrez clan seems to have walked out of a Todd Solondz movie, and a significant part of Violentacrez's mythos on Reddit comes from the details he's shared about his family. In 2010, Violentacrez hosted a legendary "Ask Me Anything" thread—the same Q & A feature Barack Obama took part in last month. He was asked what was the creepiest thing he'd done "IRL" and delighted readers with a tale ripped out of Penthouse letters. "That'd be a tough call," Violentacrez wrote, "Perhaps oral sex with my 19-year-old stepdaughter." It was completely consensual, he claimed in the post, and went on to brag about how awesome it had been in graphic detail.

This happened over ten years ago, Violentacrez claimed. When his then-wife, the girl's mother, found out, she "got mad, then got over it," Violentacrez wrote. He says they were married for ten more years.

His current wife is similarly accepting of Brutsch's unsavory side, according to Brutsch. She is not only aware of his online habits, she's also a prolific Redditor under the handle not_so_violentacrez. She is a founder of the Fibromyalgia subreddit. She has diabetes and plays the online game Kingdom of Camelot. Violentacrez said that at home, the two would lie in bed together with their laptops, both on Reddit, him posting his porn, she posting cute animal videos and pictures of dolphins.

About a year ago, Violentacrez's teenage son did his own Ask Me Anything thread. His son uses the handle Spawn_of_VA and he is dad's biggest fan. Interspersed among talk of family game night, Spawn_of_VA regaled readers with more weird tidbits about his father, including the fact that he has a "suitcase full of dildos in his closet" and a "roller type thing with spikes on it, he uses that to roll on his balls."

When I first read Violentacrez's and his son's AMAs I, like many other readers, figured this was just some next-level trolling. Violentacrez's wife and his son were probably just sockpuppets, right? But on the phone, I asked Brutsch if Spawn_of_VA was really his son. He is, Brutsch said. I asked if everything he and his son had said in their AMAs was real. As far as he could remember, he said, it was.


The extent to which trolls separate, or fail to separate, their online and IRL lives is as varied as people themselves. There's an idea of the troll as an information age Jekyll & Hyde, with the anonymity provided by the internet playing the role of Hyde's serum that transforms the mild-mannered geek into a monster. Observers often cite the psychological theory called deindividuation, which argues people literally lose themselves when granted anonymity.

But Violentacrez/Michael Brutsch upset this idea by blurring his online and offline lives. Brutsch adopted a new name for trolling, but he built his horrible character on many details from his real life. In real life, Brutsch is an unabashedly creepy old man with seven cats and two dogs and a disabled wife and a teenage son about to join the Marines. He was all of that online, too—only he was famous for it.

Both offline and online he could be either a creepy uncle, or a loyal friend and helpful guide. Violentacrez had a surprising number of friends on Reddit, for someone who once created an entire subreddit dedicated to pictures of dead teenage girls (Picsofdeadjailbait). He helped organize IRL meet-ups, where he showed up in a t-shirt with his zombie logo on it, and told everyone there to call him "VA." Attendees agreed to blur his face in any resulting pictures before posting them to Reddit. Brutsch is an internet minister, and he said he once married a pair of Redditors in real life, though they only knew him by his "clean" handle: mbrutsch.

One longtime Redditor I spoke to talked about Violentacrez with the warmth of an old college friend.

"He's really a good guy," she said. This user was in the Arlington area for business once, and she stopped by Brutsch's house for lunch. "He has the manners of a Southern gentleman," she said. "A bunch of neighborhood kids were over playing at his house."

The only thing missing was joining the name Violentacrez to the name Michael Brutsch, and even that information he had given to many of his online friends. Reddit administrators have long known his real identity, Brutsch said, which he gave them in order to prove that he had nothing to hide. But Brutsch was still anonymous to the people he wanted to be, mainly his employers, and by unmasking him I am sure to get criticism for supposedly violating his privacy.

Even before I published this article, Reddit had already exploded in outrage. (Gawker sites are now banned from over 60 subreddits, and some pissed-off user has signed me up for approximately two dozen mailing lists.) The irony of being upset that a noted custodian of "creepshots" is getting some unwanted attention himself is obvious. Jailbait defenders would often argue that if 14-year-olds didn't want their bikini pictures to be posted to Reddit, they should not have taken them and uploaded them to their Facebook accounts in the first place. If Brutsch did not want his employers to know that he had become a minor internet celebrity through spending hours every day posting photos of 14-year-olds in bikinis to thousands of people on the internet, he should have stuck to posting cat videos.

But for Reddit, the stakes are higher than just one man having to answer for things he's done online. To them, the "doxing" of Violentacrez—"doxing" is hacker slang for publishing someone's personal information in order to intimidate or punish them—is an assault on the very structure of Reddit itself. The Daily Dot sums up their logic:

At Web communities like Reddit, which thrive because users are free to say and do anything they want, doxing is a severe crime, both to users and the site's staff. It's far worse than offensive speech like racism and homophobia or, yes, even posting surreptitiously snapped photos of innocent women for creeps to perv over. Why? Because doxing undermines the community's structural integrity: Reddit simply would not exist as we know it if users weren't operating under the freedom of a flexible identity. So redditors aren't banning Gawker to protect violentacrez, they're doing it to protect themselves.

Under Reddit logic, outing Violentacrez is worse than anonymously posting creepshots of innocent women, because doing so would undermine Reddit's role as a safe place for people to anonymously post creepshots of innocent women.

I am OK with that.


Brutsch shut down the Violentacrez account abruptly this past Tuesday, six days after we spoke. When I Gchatted him that night, Brutsch told me, "I guess I just got tired of all the hassle." He said he was done with Reddit for good. "Reddit ceased being fun a while ago," he said.

Now he's going to spend the hours he used to lose to Reddit on work. "Oh, and possibly looking for a job will obviously keep me busy ;)"

I asked what he'll miss most about Reddit. "The people," he said. "Reddit is nothing without the community. I've already gotten a few cheery goodbyes from people: 'Keep in touch. You are still a good friend.'"

But he didn't stay away long. In the past couple of days, he apparently popped up in a private subreddit called "modtalk," where moderators and administrators talk, to clear up some misconceptions about why he'd left. (Including one rumor that I had somehow "blackmailed" him into quitting.) In the ensuing discussion, a user named themanwithnoname wrote, "VA, I don't know you personally, but I've appreciated some of your comments over the years. I hope your life rocks from here out."

To which Violentacrez replied, under the handle VA_11102012: "I miss posting porn."

Update, Monday 10/15: Reddit's Biggest Troll Fired From His Real-World Job; Reddit Continues to Censor Gawker Articles

