
Spam and Online Fraud #

Nelly Agbogu couldn’t believe how popular her Instagram page had become. After building her own successful healthy food brand, Nellies Nigeria, she had decided to start Naija Brand Chick, a platform to help other women in Nigeria grow their online brands. With Instagram as her primary outlet, she posted tips on how small business owners could use Instagram to expand their businesses, and her page quickly grew to over 200,000 followers. The hashtag #NaijaBrandChick became one of the three most popular hashtags in Nigeria on Instagram, with over 6.6 million posts.

Toward the end of 2019, Nelly began to notice people reporting her Instagram account. A couple of people had told her that there were accounts impersonating her in order to scam Nigerians into making fake investments, but she didn’t worry about it too much because the incidents were infrequent. Over the following year, however, the impersonations exploded across several platforms. Nelly frequently saw stolen images of herself on Instagram, WhatsApp, Facebook, and LinkedIn. To make the impersonations more convincing, the scammers had even stolen photographs of her husband and her children. “It really traumatized me,” she said of first encountering the fake accounts.

Nelly assumed the scammers had targeted her account because she used her platform to warn her followers about fraud online: “I call out when someone scams someone on Instagram. I tell people how not to get scammed on Instagram, how to keep your money safe online.” Because she posted so frequently about how Nigerians could protect themselves online, people trusted her. Some would even ask her if a product was legitimate before buying it online. The trust she built with her followers made it easy for scammers to use images of her to direct users to fake products or trick victims into sending them money. By adopting her communication style and imitating her posts, the fake accounts were so convincing that they successfully scammed a member of Nelly’s own family. As the accounts multiplied, she even started receiving death threats from victims of the scams.

Nelly tried desperately to get help from Facebook customer support in taking down the fake accounts. But as soon as she reported one account, new ones would crop up. Knowing she would report their content, the scam accounts blocked her, so she had to create another account to view and report their posts. A Facebook representative told her the accounts were under investigation. Yet of the 50 accounts Nelly reported to Facebook, 28 remained active in June 2020.1

The damage to her business has been significant. While her Naija Brand Chick account previously gained more than 10,000 followers each month, it now attracts barely an additional 500. According to Nelly, her page is no longer featured in Instagram’s Explore section, and people tell her they can’t find her page when they search for it on Instagram. “It is now crippling my own business,” she insists. “I don’t get the same visibility anymore.” The scam has also had consequences well beyond her business. She fears for her safety and doesn’t dare post photos of her family anymore.

Nelly feels defeated. “If I continue to complain, people will ask, ‘You’re supposed to be the Instagram guru?’ It doesn’t speak well for my brand.” She has since tried to be less vocal on Instagram about the fake accounts: “I’m not going to win this fight. The people that are supposed to help are not listening…I have lost hope.”2

Figure 1: Posts from Nelly encouraging followers to report and be wary of scam accounts assuming her identity. (Source: Instagram.)

Introduction to Spam and Online Fraud #

For as long as people have traded goods and services, there have been pernicious actors who have taken advantage of the good nature of others. Yet for much of human history, the scope of the problem remained relatively limited. Prior to the advent of cheap, reliable mail service and reliable remote payment mechanisms (such as physical checks), scammers were constrained by the fact that they operated in the same physical vicinity and legal jurisdiction as their victims. Even after the invention of the telegraph and telephone, the relatively high cost of conducting scams placed some limit on the effectiveness of professional scammers. The widespread adoption of the internet removed many of these limits, allowing small-scale crimes to be committed at scale, and unwanted commercial communication, or spam, has become the most prevalent and universal abuse on the internet.

While spam is generally defined as unwanted communications, any more precise definition will depend on the particular platform. For instance, YouTube’s spam policy explicitly prohibits “incentivization spam,” or content that sells engagement metrics such as views or likes. Yahoo, whose mail service has long been a prime target for spammers, defines spam as any message sent to multiple recipients who have not specifically requested it. Generally, the primary purpose of spam is to deliver a message that contains a payload–the information in a transmitted message–which can range from benign advertising to incredibly destructive malware.3

We begin this book with a discussion of spam for three reasons. First, its history is closely tied to the history and evolution of the internet itself.4 In many ways, spam was the original trust and safety problem, and many of the approaches to safety discussed throughout this book are derived from the initial fight against spam and fraud. Spam was, for instance, the first abuse that users could report, and it prompted the first use of machine learning to counteract abuse. Second, spam is the primary vehicle for a great deal of cybercrime.5 As such, spam has led to incredible technological advancements on both the spam and anti-spam sides.

And third, spam demonstrates an important lesson about trust and safety work: when any type of abuse can be monetized, the number of people attempting it will increase by orders of magnitude. The massive news coverage of a handful of controversial content moderation topics belies the fact that spam makes up the vast majority of content removed by intermediaries. Of the hundreds of billions of emails sent each day, 45.6% are estimated to be spam.6 During one three-month period of 2024, Facebook removed 730 million pieces of spam content, far outpacing the 10.5 million pieces of terrorist content.7 Unless a company is able to combat spam at scale, the avalanche of spam can quickly make a product unusable.

This chapter will begin by providing examples of spam and fraud to indicate the kind of harm that these problems have inflicted in the internet age. It will then examine how spam and fraud exploit problems related to online identity and offer a brief history of spam and fraud in the internet era. Finally, it will discuss how spam and fraud can be addressed through policy, operational, product, and technical responses.

Fraud is powered by spam. But unlike spam, fraud involves deception. Generally speaking, fraud occurs when individuals intentionally and knowingly deceive someone by misrepresenting, concealing, or omitting facts about services for financial gain. While fraudulent schemes come in endless variations, we can grasp the general flavor of fraud on the internet today by briefly discussing a few of the most prevalent examples.8

  • Pig Butchering Scam - Over many months a scammer builds an online relationship, often romantic, to gain a victim’s trust. The scammer offers a small investment opportunity where the victim sees fake profits. The scammer then offers larger investment opportunities, only to disappear once the victim has transferred significant funds. This is called a pig butchering scam as the scammer “fattens” the victim with a long-term relationship, only to “butcher” them with large financial losses.9

  • Advance Fee Fraud - An updated version of what is known as the Spanish Prisoner fraud (discussed below), this email scam began circulating internationally in the late 1990s.10 A self-proclaimed government official or wealthy individual (often a Nigerian Prince) would offer the recipient of the email the opportunity to split millions of dollars that the author was trying to transfer out of the country, in exchange for an upfront fee. The scammer would often claim that they were unable to access the money without the help of the recipient due to political turmoil, such as a civil war or a coup. The scam, which is sometimes referred to as a 419 scam (the number “419” refers to the section of the Nigerian Criminal Code addressing fraud), has left many victims penniless and has driven some to suicide or murder.11

Figure 2: A 2020 version of the 419 scam. (Source: Brian Krebs12)

  • Romance scams - One of the most shockingly effective modern scams is the romance scam, which involves the use of a fake online persona to seduce and build relationships with individuals in order to solicit funds. Despite the individualized nature of romance scams, the con artists who run them are often part of an underground network of criminal organizations that work together.13 Median losses due to romance scams in the U.S. in 2023 were $2,000, higher than for any other imposter scam tracked by the FTC.14

    Romance scams often involve the impersonation of military personnel. The U.S. Army Criminal Investigation Command receives hundreds of allegations each month from victims who claim to have been duped into an online relationship with an individual claiming to be a U.S. soldier on a legitimate dating website or other social media website.

    Indeed, when I was at Facebook, we were once contacted by U.S. Cyber Command, the combatant command of the Department of Defense that handles cyber attacks, which informed us that it was sending a one-star general to Silicon Valley to meet with us. We assumed the meeting would address North Korean hackers trying to break into the electrical grid, or perhaps the work we did around the GRU and the 2016 election. However, when the general arrived and we asked what we could do to help, he asked if we could take down a fake romance profile impersonating his boss, whose online doppelgänger had been courting single older women on Facebook, prompting an uncomfortable conversation with his wife.

    Figure 3: Screenshot of military romance scammer’s Facebook profile. (Source: Facebook15)

  • Shopping scams - A number of scams focus on online shopping, often involving imitation products. These scams arrive via email or social media, directing users to a third-party eCommerce store offering luxury items at a low price. Third-party sellers on Amazon have, for instance, listed thousands of banned, unsafe, or mislabeled products, including expired food.16 Some consumers have attempted to buy expensive hard drives only to receive a box with a paperweight inside.17 Since there are rarely consequences for sellers that peddle fake items beyond being deplatformed, sellers can simply reestablish their presence under a different brand or name.18

Figure 4: A counterfeit Amazon child harness falsely claims it is FAA approved. It looks identical to the real harness and sells for $50 less. (Source: New York Times19)

  • Fake Antivirus software - Cyber-criminals know very well how anxious users are to keep their devices secure. Rogue antivirus software, often advertised via a pop-up claiming that the user’s device may be infected, directs users to a website where they can purchase the fake software. Upon installation, the fake antivirus software installs malware on the user’s computer, and the scammers proceed to exploit the financial information the user provided to complete the purchase.20

  • Fake charities - Scammers will often impersonate genuine charities, via electronic communications or fake social media accounts, to take advantage of people’s compassion and generosity. They ask for donations, particularly during the holidays and after natural disasters or major events, diverting much-needed funding from legitimate charities.

Figure 5: A scam email soliciting donations for the World Health Organization during COVID-19. (Source: identityforce.com21)

Fraud and Online Identity #

The scams discussed in the previous section point to a fundamental issue: how do individuals prove their identity online? The internet is a network of networks: a loose confederation of systems, operated both for profit and for the benefit of society, by organizations that agree to a shared set of rules and operational procedures. The fact that it works at all is a minor miracle, one which relies upon the good will of individuals, the pervasive use of the end-to-end argument22, and the flexibility granted to individual networks to provision and operate services as they see fit.

The internet has no fundamental conception of identity, other than the identity of the participant networks.23 Yet as the internet has become a hub of communication and commerce, we have bootstrapped new ideas of identity on top of it. The violation of these different identities is a key component of many of the harmful activities described in this book, including spam and fraud. Broadly speaking, three major identity mappings are relevant to this discussion: the relationship between organizations and infrastructure, the relationship between users and accounts, and the relationship between accounts and real humans. In the following sections, we examine how fraudsters take advantage of each of these three levels.

| ID mapping | Example |
| --- | --- |
| Organization ↔ Infrastructure | Who runs this computer I’m sending my information to? |
| User ↔ Account | Is this electronic identity under the control of the same person or persons who initially created it? |
| Account ↔ Real Human | To whom does this electronic identity belong? |

Table 1: Three levels of online identity mapping.

Figure 6: Examples of the three levels of identity mapping.

Organization ↔ Infrastructure Vulnerabilities #

The organization ↔ infrastructure relationship involves mapping an organization to the infrastructure that hosts it. If you are interacting with a system online–whether that system is a web app, a mobile app on your phone, a web service, or email–as a user, you want to ensure that the internet infrastructure you are interacting with actually belongs to the organization you are trying to reach. Scammers can violate this mapping by building infrastructure that imitates the infrastructure of a legitimate organization. In doing so, their goal is to obtain authentication or personal information by fooling users into connecting to the scammer’s counterfeit infrastructure.

Such attacks may be targeted against specific victims or aimed at a broad swath of potential victims. Five common techniques include:

  • Phishing - Phishing is the biggest scourge facing online identity today. Phishing involves sending users links to websites that imitate legitimate sites, in order to induce users to enter their login credentials into the fake site. For example, a scammer might send a fake Gmail password reset email, which sends users to a fake password reset site at qoogle.com instead of google.com. Users who don’t carefully examine the sending address and URL may think that the email is a legitimate request from Google and enter their real username and password into the fake site in order to “reset their password”. The attacker can then collect this information and log into the user’s real Gmail account. If the user has enabled two-factor authentication, the attacker need only redirect the user to a second fake page that prompts the user to enter the real code Google sends once the scammer logs into the victim’s actual account.

Figure 7: Example of a site that could be used for a phishing campaign.

In the example above, the domain shown in the browser’s address bar (fakebook.hacklab.info) is an indication that, while all the content may look exactly like the real Facebook login page, the page doesn’t belong to Facebook. It’s far easier than you might think to create identical copies of the web page content.

  • Meddler-In-The-Middle (MITM) Attacks - Meddler-in-the-middle attacks occur when an attacker intercepts what is thought to be a secure conversation between two entities in order to steal or alter information that is passed between them. One way to execute MITM attacks is via phishing, described above. Another way is by setting up a false cellular or WiFi network that sits between a user’s device and the real network that the user wants to connect to. Users will then connect to the false network instead of the legitimate one, allowing scammers to view any content sent over that network, including non-encrypted passwords and financial information. Compared with standard phishing attacks, MITM attacks are much more difficult to execute and so less common.

  • Typosquatting - Typosquatting involves the registration of domain names that are extremely similar to legitimate ones, but with common typos (e.g., whitehouse.com instead of whitehouse.gov). Users who unwittingly make one of the typos associated with these alternate domain names may be taken to the attacker’s false website, where they could be exposed to unwanted content, acquire a computer virus, or enter their login credentials into the untrustworthy site. Often, typosquatting is used to perpetrate low-level scams, like search engine optimization scams, in which malicious actors create sites that generate traffic to raise their ranking in Google.

  • Mismatched domains - Some security holes allow hackers to compromise the DNS settings so that when a user clicks on a link, the DNS redirects them to the hacker’s own website. Additionally, hackers may disguise malicious links behind seemingly benign URL names in emails; for example, the link may say www.gmail.com but in fact redirect to a completely different site. This technique may be especially difficult to identify when the malicious URL is further disguised using an IDN Homograph attack (see below).

  • Internationalized Domain Name (IDN) Homograph Attack - Similar to typosquatting, IDN homograph attacks involve creating fake links using characters that are very difficult to distinguish from those in the real link. For example, the lowercase letter “l” could be replaced by the uppercase letter “I” or the number “1”, or the Latin letter “e” could be replaced by the Cyrillic letter “е” (which looks almost exactly the same to the naked eye). Unicode–the standard that handles the encoding and representation of written text–is spectacularly complex, with different blocks that collectively support most of the world’s languages. Several languages have characters that look almost identical but are encoded differently. IDN homograph attacks differ from typosquatting in that they don’t depend on users making typos; rather, attackers intentionally design links to be visually indistinguishable from legitimate ones, and then use these false links to lure users to false websites.
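To make the last two techniques concrete, here is a minimal sketch of how a platform might flag typosquatted or homograph domains: check whether a domain name mixes Unicode scripts, or sits within one character edit of a protected brand domain. The watchlist, helper names, and thresholds are invented for illustration; production systems use far more sophisticated logic.

```python
# Sketch: flag domains that mix Unicode scripts (possible homograph attack)
# or sit within one edit of a protected domain (possible typosquat).
import unicodedata

PROTECTED = {"google.com", "facebook.com", "paypal.com"}  # hypothetical watchlist

def scripts_used(domain: str) -> set:
    """Approximate the set of Unicode scripts used in a domain name."""
    scripts = set()
    for ch in domain:
        if ch in ".-0123456789":
            continue  # separators and digits are script-neutral
        name = unicodedata.name(ch, "")
        if name:
            scripts.add(name.split()[0])  # e.g. "LATIN", "CYRILLIC", "GREEK"
    return scripts

def within_one_edit(a: str, b: str) -> bool:
    """True if a and b differ by at most one insertion, deletion, or substitution."""
    if a == b:
        return True
    if abs(len(a) - len(b)) > 1:
        return False
    i = j = 0
    edited = False
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            i, j = i + 1, j + 1
            continue
        if edited:
            return False
        edited = True
        if len(a) > len(b):
            i += 1                   # treat as a deletion
        elif len(a) < len(b):
            j += 1                   # treat as an insertion
        else:
            i, j = i + 1, j + 1      # treat as a substitution
    return True  # any single trailing character is the one allowed edit

def looks_suspicious(domain: str) -> bool:
    mixes_scripts = len(scripts_used(domain)) > 1  # e.g., Cyrillic "е" in a Latin name
    near_miss = any(domain != p and within_one_edit(domain, p) for p in PROTECTED)
    return mixes_scripts or near_miss

print(looks_suspicious("qoogle.com"))    # True: one substitution from google.com
print(looks_suspicious("facebook.com"))  # False: the genuine domain
print(looks_suspicious("faсebook.com"))  # True: the "с" here is actually Cyrillic
```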

Part of the reason that these issues apply to email, even today, is that the basic protocol for exchanging mail between domains, the Simple Mail Transfer Protocol (SMTP), was defined in 1982 with almost no security features in mind. Most people don’t understand that email is fundamentally built upon trust. For instance, the “From:” line in a message is defined by the sender and was never meant to be something on which recipients could rely for security purposes. The explosion of spam and other abuses since 1982 has led to the creation of a variety of email authentication protocols, all of which are still optional for both senders and receivers to use. These protocols include DMARC (Domain-based Message Authentication, Reporting & Conformance), Sender Policy Framework (SPF), and DomainKeys Identified Mail (DKIM), the correct use of which is beyond the scope of this chapter.24
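As a rough illustration of what deploying these protocols involves, the DNS TXT records below use the reserved domain example.com, an invented selector, and a truncated key; real records are published in a domain’s DNS zone and interpreted by receiving mail servers that choose to check them.

```
; SPF: lists the servers allowed to send mail claiming to be from example.com
example.com.                      IN TXT "v=spf1 ip4:192.0.2.0/24 include:_spf.example.net ~all"

; DKIM: public key receivers use to verify cryptographic signatures on messages
selector1._domainkey.example.com. IN TXT "v=DKIM1; k=rsa; p=MIIBIjANBg...AQAB"

; DMARC: what receivers should do when SPF/DKIM fail, and where to send reports
_dmarc.example.com.               IN TXT "v=DMARC1; p=quarantine; rua=mailto:dmarc-reports@example.com"
```

When receivers honor these records, a message claiming to come from example.com can be checked against infrastructure the domain owner actually authorized, restoring some of the organization ↔ infrastructure mapping that SMTP alone never provided.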

User ↔ Account Vulnerabilities #

Figure 8: Screenshot from 2020 Twitter hack that compromised user <-> account mapping to circulate a cryptocurrency scam. (Source: New York Times25)

Authentication is critical to establishing and protecting a user’s online identity in order to ensure that the user accessing the account is its legitimate owner.26 Authentication verifies who a user claims to be by mapping one entity to another. Logging into email, bank accounts, and social media using a username and password is the most familiar form of authentication. However, the username and password paradigm, which dates back to the 1970s, was developed when authentication only occurred locally. University or business employees accessed a nearby mainframe via usernames and passwords, which made it easier to track which departments were using the mainframe and to bill them accordingly. The username and password system was never meant to be used for remote authentication, yet we still live with this paradigm today.27

Beyond the username-password paradigm, accounts can have additional layers of protection by requiring the user to complete multiple challenges. For instance, corporate accounts often require multi-factor authentication (such as receiving a text with a security code that must be entered before logging in). Other accounts, such as bank accounts, will time out after a relatively short period of inactivity, forcing the user to re-enter their credentials. Although these measures may seem simple, at times even irksome, they are crucial to maintaining the security of any system that requires authentication.

Regardless of the sophistication of a platform’s authentication scheme, all accounts have some vulnerabilities. Keeping accounts secure is a cat-and-mouse game, and bad actors will always seek ways into even the most robust systems.28 The first crucial relationship that may be exploited by spammers is that between a legitimate platform user and their account. If I create a pseudonymous account on a gaming website, and I call myself n00byd00, I want to be able to maintain that pseudonymous identity that only I have control over. As my account interacts with people, pwns opponents, and builds an identity on the platform, people expect that the same human is behind that account. They may not know that I am stamos1979, but they know it’s the same person with whom they’ve been interacting.

If the User ↔ Account relationship holds, only the user will be able to access their account: the account is secured using a password or other authentication scheme to ensure that no other individuals have access to it. If a scammer can break into a legitimate user’s account, the integrity of the User ↔ Account relationship is compromised. Once inside, the scammer can use the account for a number of malign activities, including sending spam to other users, stealing the user’s information (which could then be used to impersonate the user), breaking into other accounts belonging to the user, or transferring the user’s money to the scammer’s bank account.

Scammers may break into secure accounts in multiple ways:

  • Stealing Passwords. Recent spam has adapted to the ubiquity of online scams and the accompanying paranoia that most individuals feel about their online identity. Among the most successful tactics to compromise a user’s account is to tell them it’s already been hacked.29 Bad actors may also steal passwords using techniques like keyloggers, phishing, and coercion (for example, an abusive partner who forces their spouse to share their passwords).

    Figure 9: Screenshot of hacked accounts being sold on Dark Web in January 2021.

  • Taking Advantage of Information from Data Breaches. Data breaches are gold mines for hackers looking to gain login information. In the 2017 Equifax breach, for instance, the sensitive personal information of 147 million Americans was exposed.30 In the screenshot provided above, hacked accounts are being sold on the black market, where purchasers can attempt to redeem the money in the account. In addition to allowing hackers to infiltrate accounts, data breaches can provide hackers with sensitive information that can also be used for identity theft.

  • Credential Stuffing. Hackers may try common passwords (like “qwerty”, “123456”, and “admin”) on many accounts, trying their luck until they find one that works. Alternatively, they may take a trove of login information from a data breach and attempt to “stuff” those username and password combinations into other services until some work. Since many users reuse usernames and passwords across accounts, a hacker who cracks one account may easily gain access to several. This method is particularly attractive to spammers who are not seeking to steal information from any particular user but instead hope to gain control of authentic accounts from which to spread bad information. For example, a foreign state actor may compromise the accounts of real U.S. users and use those accounts to spread election disinformation. To execute credential stuffing effectively, attackers must use automated software that attempts logins on thousands of accounts using different combinations of possible credentials. Systems can protect against this technique with rate limiting (capping the number of queries and login attempts allowed in a given period of time; a minimal sketch follows this list), IP reputation (checking whether an IP address is trustworthy based on its history), and CAPTCHAs (puzzles required at login that are designed to distinguish humans from computers).

  • Malware. Malware can be used to infect a user’s computer with a virus that steals their information, takes over their account or device, or even sends spam messages to their contacts. Malware can be delivered in multiple ways, including a malicious email attachment that executes when opened, a link in the body of a message that executes when the user clicks on it, or an online ad that installs malware if clicked.31 All of these examples are known as Trojans, a term for malware disguised as something legitimate (a link, attachment, advertisement, or software application) in order to entice users to open it. Once opened, the malware concealed inside a Trojan can wreak havoc on a user’s system. The best defense against Trojans is constant vigilance: users should double-check URLs embedded within emails, double-check executable attachments before opening them, and never click on links or attachments from unknown senders. Of course, nothing will make a system impervious to malware attacks, but exercising caution is a simple step that can significantly improve security.
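Here is the rate-limiting sketch promised above: a minimal, in-memory illustration of capping failed logins per account, one defense against credential stuffing. The function names and thresholds are invented for the example; a production system would use a shared datastore and combine this with IP reputation checks and CAPTCHAs.

```python
# Sketch: per-account limit on failed login attempts within a sliding window.
import time
from collections import defaultdict, deque

WINDOW_SECONDS = 300  # look at the last five minutes
MAX_FAILURES = 5      # allow at most five failures per account in that window

_failures = defaultdict(deque)  # username -> timestamps of recent failures

def allow_login_attempt(username: str) -> bool:
    """Return False if this account has seen too many recent failures."""
    now = time.monotonic()
    attempts = _failures[username]
    while attempts and now - attempts[0] > WINDOW_SECONDS:
        attempts.popleft()  # discard failures that fell out of the window
    return len(attempts) < MAX_FAILURES

def record_failed_login(username: str) -> None:
    _failures[username].append(time.monotonic())

# Usage: consult the limiter before checking the password; record each failure.
if allow_login_attempt("n00byd00"):
    password_ok = False  # ...verify the submitted password here...
    if not password_ok:
        record_failed_login("n00byd00")
else:
    print("Too many attempts; require a CAPTCHA or a temporary lockout.")
```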

Account ↔ Real Human Vulnerabilities #

A final relationship that scammers may exploit is that between accounts and the real humans behind them. In some circumstances, a product may require users to attach their account to their offline identity. In other cases, tying an account to a real human may be an optional feature. For example, your Facebook identity is meant to be tied to your offline identity. Facebook attempts to require users to attach their real identity to their Facebook account, and users are only meant to have one Facebook account. X, on the other hand, supports a range of identity options, from the pseudonymous, like the “Fake Steve Jobs” account, to a real identity-verified, blue checkmark account.

Fraud at this level includes identity theft and account impersonation. If, after breaching a user’s account, a hacker assumes the user’s real human identity in order to exploit other accounts and users, the hacker has violated the integrity of the relationship binding the account to a person’s identity in the real world. This can have hugely destructive consequences for the victim. In other cases, scammers convince a target that they have acquired compromising personal information about them and demand a ransom to keep it private. This kind of scam often includes the target’s email address and password, usually obtained from a data breach, to convince the victim that the scammer has hacked their computer.

Figure 10: Example of a threatening message that aims to deceive the recipient into thinking their computer was hacked.

A Brief History of Spam and Online Fraud #

Figure 11: 1898 New York Times article on the Spanish Prisoner scam. (Source: New York Times32)

Fraud has existed in some form since the dawn of commerce. The first known instance dates back to 300 BC, when a Greek merchant, Hegestratos, took out a significant insurance policy on a boat that he intended to sink.33 However, when the crew caught wind of his plan and confronted him, Hegestratos jumped overboard and drowned.

While Hegestratos’s fraud scheme failed, others were incredibly successful. A 19th-century “Spanish Prisoner” scam was so effective that the New York Times published an article warning readers about “one of the oldest and most attractive and probably most successful swindles.”34 The premise was straightforward: an unsuspecting victim received a letter from a Spanish prisoner who had a significant sum of money hidden away, but had been imprisoned for a political offense. The prisoner knew of the good character of the recipient through a mutual friend and promised to give some portion of the fortune to the letter recipient, should the recipient use what was left to take care of his young and helpless daughter. Upon receiving a sympathetic response from the victim, the Spanish prisoner would request money to pay for the daughter’s travel expenses.35 As discussed above, versions of the Spanish prisoner scam would later appear as the “419” or “Advance Fee” email scam, where scammers request help moving money out of a country in exchange for a commission.

Spam Spam, Spammity, Spam: Community Forums (ARPANET, Usenet) #

Like fraud, spam circulated before the proliferation of the World Wide Web; however, as far as we know, spam has no Hegestratos. It took hold roughly 2,300 years later, with the advent of early community forums. The term spam was first coined on the U.S. Department of Defense’s Advanced Research Projects Agency Network (ARPANET), the precursor to the internet. In the bandwidth-constrained, text-only world of ARPANET, the up-arrow key, which allowed users to repeat a word, was a powerful tool. Many users who didn’t know how best to create conversation on this new network would type favorite song lyrics or quotes. The Monty Python routine, “Spam, spam, spam, spammity, spam,” was a favorite because users could type “spam” once and use the up-arrow key to repeat it. “Spamming” ultimately became synonymous with flooding a chat room with comments like the Monty Python “Spam” routine. In these early days of the internet, counter-spamming efforts were primarily concerned with maintaining communities and managing scarce resources on networked computers.36 As will be a recurring theme in the story of spam, the technology, in large part, informed the tactics available to spammers.

Even on ARPANET, where participants came from the upper echelons of elite academic or military institutions, users of the network never reached consensus on how the net was to be governed.37 When Digital Equipment Corporation marketer Gary Thuerk, using the account of his colleague Carl Gartley, posted a message in May 1978 publicizing the company’s new DEC-20 computers, users of the network were livid. One user deemed the post “A FLAGRANT VIOLATION OF THE USE OF ARPANET.”38 This first instance of commercial spam (spam for financial gain as opposed to entertainment) prompted intense debate about advertising on the network, what constituted appropriate content, and censorship on the net. John McCarthy, the Stanford computer scientist who had coined the term “artificial intelligence,” posed a question that continues to vex security professionals and researchers today: “Leaving questions peculiar to ARPANET aside, how should advertising be handled in electronic mail systems?”39

As Usenet, the first text-based, community-centric application of the internet, grew popular during the 1980s, adjudicating what behavior was and was not acceptable on the forum became even more complicated. Compared to ARPANET, the barrier to entry for Usenet was much lower. The network was available to anyone who had access to the Unix operating system. Usenet created an imbalance that still constitutes the core economic incentive for spam distribution: it cost nothing for an individual to post a message, but it cost other users “something to receive it, in money, in disk space, in opportunity cost, and in attention.”40

By 1994, the internet had undergone a notable shift away from computer science professionals to the general population, and away from the noncommercial order that had previously defined the net.41 Laurence Canter and Martha Siegel, two immigration lawyers who ran a husband-and-wife law firm, were eager to capitalize on this transition. Congress had recently devised the Green Card Lottery program to diversify immigration to the US, and less scrupulous lawyers like Canter and Siegel charged high fees in exchange for filing lottery entries for immigrants. On April 12, 1994, Canter and Siegel flooded Usenet newsgroups with messages promoting their services. They had paid a programmer to write a basic script that pulled the names of all the newsgroups on a particular server and sent the message to each of them. It took one to two hours for the script to run its course, posting Canter and Siegel’s message on nearly every newsgroup on Usenet.

Figure 12: Canter and Siegel Green Card spam message on Usenet. (Source: Trust and Safety Foundation42)

Canter and Siegel opened Pandora’s box by demonstrating the effectiveness of exploiting the internet for financial gain. After the success of the Green Card spam, Canter remarked, “We probably made somewhere between $100,000 to $200,000…which wasn’t remarkable in itself, except that the cost of doing it was negligible.” Copycats noticed Canter and Siegel’s success and began using the internet to spread commercial spam.43 Thus began the rise of spam as a business and the end of an era of “netiquette.”

Professionalization of Spam: Spam Kings and Queens #

As advertising and online marketing became cornerstones of operating businesses online in the mid-90s, questions about appropriate commercial uses of the internet remained unanswered. With the beginning of e-commerce in 1994 and the dotcom boom of the late 90s, the opportunities to defraud individuals via unsolicited commercial email were endless. From 1994 through 2003, spam grew at an exponential rate until it made up the vast majority of email messages sent worldwide and diversified into a huge range of methods and markets.

Spam’s boom years led to the rise of different business models: spamming to sell products of your own, or sending spam on behalf of clients. Some spammers presented themselves as legitimate marketers, working on behalf of small businesses that did not have direct access to marketing over the internet. Others worked covertly, scamming individuals or selling counterfeit products. Spam campaigns ran the gamut, selling everything from pharmaceutical products, like the infamous Viagra emails,44 to mortgages and debt consolidation. Absent any regulation, email marketers took advantage of a digital Wild West to market products at scale, generating about $200 million in annual gross revenue worldwide by 2011.45,46

Spamming at such a massive scale required significant resources, including bandwidth, addresses from which to send spam, relay servers to disguise the source of messages, and website hosts for the pages where people could fill out forms.47 Consequently, the spam ecosystem was dominated by a handful of individuals who ran their own spamming businesses. Businesses like Amazing Internet Products, run by the former neo-Nazi turned professional spammer Davis Hawke, used email to market a bevy of questionable products, including penis-enlargement pills, human growth hormones, and inkjet printer refills. Others, like Premier Services, run by Rodona Garst, did for-hire work, charging $500 a day for sending 300,000 emails. Garst’s company was also responsible for popularizing “pump-and-dump” stock spam: artificially inflating the price of a stock through false, misleading, or exaggerated recommendations about its value. Garst would send millions of emails with fake press releases about microcap companies. Her colleagues would then sell large chunks of the stocks, profiting from the price increases the emails generated.48

“A Plan for Spam” #

As spamming grew into a business, anti-spam activists worked diligently to publish spam blacklists, file complaints with Internet Service Providers (ISPs) that hosted sites operated by spammers, close off networks to the infringers, and publicly shame spammers on message boards. One anti-spam activist went so far as to hack into Premier Services’ network and publish partially nude photos of Spam Queen Rodona Garst. To the great advantage of spammers, spam filters were still blunt tools, miscategorizing emails as often as not. And with every advance in spam filters, spammers adopted new techniques to circumvent content-based filters, such as image spam (embedding the sales pitch in an image that text filters cannot read) or lit spam (padding messages with passages of literature to confuse statistical filters). Spammers also found ways to evade IP-based blacklists by changing the sending IP address.49

Figure 13: Techniques used to defeat anti-spam filters in 2007 (Source: Wang et al 200750)

By the early aughts, the situation with email spam was dire. Deeply committed anti-spam activists were largely fighting a losing battle. However, two 1998 papers arguing for a filtering system based on naive Bayesian statistical analysis proved incredibly influential in the fight against spam.51 One of these papers was written by Stanford computer scientist Mehran Sahami, whose naive Bayesian classifiers were deployed in commercial spam filters. The technique was later refined by Paul Graham, the computer scientist who went on to found the startup accelerator Y Combinator, in his influential 2002 essay “A Plan for Spam.”52 Graham argued for content-based filtering for email spam, writing, “If we can write software that recognizes their messages, there is no way they can get around that.”53

Such an approach proved incredibly effective at classifying spam with few false positives. The Bayesian filtering technique for which Graham and others advocated, which shifted the burden of classifying spam from humans to machines, helped to destroy email spamming as it existed in the early 2000s.54
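To give a flavor of the idea, here is a toy naive Bayes spam filter in the spirit of that work: it scores a message by comparing per-word frequencies in labeled spam and non-spam (“ham”) training mail. The tiny corpus and function names are invented for illustration; real filters train on large corpora with more careful tokenization and smoothing.

```python
# Toy naive Bayes spam filter: estimate P(spam | words) from labeled messages.
import math
from collections import Counter

def tokenize(text: str) -> list:
    return text.lower().split()

def train(spam_msgs, ham_msgs):
    spam_counts = Counter(w for m in spam_msgs for w in tokenize(m))
    ham_counts = Counter(w for m in ham_msgs for w in tokenize(m))
    vocab = set(spam_counts) | set(ham_counts)
    return spam_counts, ham_counts, len(spam_msgs), len(ham_msgs), vocab

def spam_probability(msg, spam_counts, ham_counts, n_spam, n_ham, vocab):
    # Work in log space to avoid underflow; add-one smoothing for unseen words.
    log_spam = math.log(n_spam / (n_spam + n_ham))
    log_ham = math.log(n_ham / (n_spam + n_ham))
    spam_total = sum(spam_counts.values()) + len(vocab)
    ham_total = sum(ham_counts.values()) + len(vocab)
    for w in tokenize(msg):
        log_spam += math.log((spam_counts[w] + 1) / spam_total)
        log_ham += math.log((ham_counts[w] + 1) / ham_total)
    return 1 / (1 + math.exp(log_ham - log_spam))  # P(spam | msg)

spam = ["cheap pills buy now", "you won a free prize claim now"]
ham = ["meeting notes attached for review", "lunch tomorrow with the team"]
model = train(spam, ham)
print(spam_probability("claim your free prize now", *model))    # close to 1
print(spam_probability("notes from the team meeting", *model))  # close to 0
```

The insight Graham emphasized is that the filter’s evidence comes from the spammer’s own message: the very words that make the pitch effective also make it recognizable.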

Spam as specialized criminal enterprise #

The passage of the CAN-SPAM Act (2003), which created the first national standards for sending commercial email, along with the arrest of several high-profile spam kingpins, forced spam underground. As a result, the amount of spam advertising legitimate goods and services began to decrease,55 while a new arsenal of deceptive tactics emerged in the form of phishing, identity theft, and virus and malware distribution.

The consolidation of criminal spam gangs and the increase in internet users meant that the overall volume of spam continued to increase, even as machine learning methods improved our ability to classify it. By 2010, an estimated 88% of email traffic was spam.56 Spam was so bad at Yahoo that the NSA PRISM slides leaked by Edward Snowden included a complaint about how much spam agency analysts had to sift through when examining Yahoo data.

Figure 14: 22 percent of Yahoo emails were spam in 2013. (Source: Krebs on Security57)

The NSA was not alone in their frustration. In 2013, email spam cost businesses around $20.5 billion.58 The fundamental spam imbalance that began with Usenet continued at enormous cost to companies and individuals.

As criminals came to recognize the vast potential of combining malware and spam, they began creating malicious botnets: collections of computers infected with malware that allows them to be controlled through a command-and-control server and deployed to perform automated tasks. Botnets can be used to steal passwords, personal data, and identities, as well as to engage in bank fraud and click fraud, the practice of deceptively clicking on search ads with the intention of either increasing third-party website revenues or exhausting an advertiser’s budget.59

There is a vast market for botnets in the thriving underground economy of cybercrime. On the Dark Web, potential customers can rent a botnet for 24 hours to send millions of spam messages ($67)60 or assemble their own botnet by paying 2 to 10 cents per infected device.61

While the cost to build botnets may seem exorbitant (researchers estimate that creating a botnet linked to 10 million devices costs around $16 million),62 the returns can significantly outweigh the cost. Spam advertising with 10,000 bots can generate around $300,000 a month, and bank fraud carried out by 30,000 bots can generate over $18 million per month. The most profitable undertaking is click fraud, which can generate well over $20 million a month of profit.63 Fraudsters often use botnets to conduct ad fraud, selling bot traffic to publishers looking to boost audience engagement or spoof domains so that it appears to advertisers that their ad was served on a premium publisher’s site, like the Washington Post, instead of on an empty website.

With the potential to make money off each compromised account, spam began to attract more skilled talent, on both the business and programming sides.64 Today, there are markets specific to stolen data, easy-to-use malware, phishing kits, and hackers for hire. There are teams entirely devoted to research and development on malware. These teams find exploits, write malware that evades antivirus detection, and sell the malware to a development team. That team will take the malware and customize it, which makes it harder to detect using antivirus software.

Figure 15: Screenshots from the Dark Web (January 2021) offering access to hacked accounts and hacker-for-hire services

Web 2.0 and New Forms of Spam #

With the birth of new platforms offering a diverse array of services, spamming took new forms, expanding beyond email into an activity that cut across platforms. When search engines became ubiquitous, search engine optimization spam, or spamdexing, rose along with them. By keyword stuffing,65 using invisible text on web pages, or using links to or from web pages to artificially elevate a page’s reputation, spammers manipulated search engine rankings to lure traffic to a scam site, which they could then monetize. Google’s success as a search engine came in part from its ability to tune its algorithms and PageRank system to resist the spamdexing that had made most other search engines useless.66

As online marketplaces like eBay and Amazon became popular, payment and physical goods fraud also ran rampant. With the rise of social media, criminals could more easily impersonate others to target victims, as in the example at the beginning of the chapter.67 As a result, spamming has become more personalized and tailored, as scammers seek to extract the maximum profit from the individuals they target.68 Fake accounts are frequently used for this purpose, as are compromised authentic accounts.

One particularly common use of fake accounts involves impersonating celebrities and executives–anyone from Mark Zuckerberg to Taylor Swift–to scam Facebook users out of cash. Perhaps the most impersonated person on the internet at the moment is Elon Musk, who is so rich and eccentric that people believe he might actually send them Bitcoin. As a result, a huge number of Elon Musk impersonators across social media platforms ask for bitcoin donations and offer to send a large sum of money in return.69 I have a tiny fraction of Musk’s impersonators, but it does happen, and it happened on Facebook too. Most of my impersonators encouraged people to buy tickets to a fake Facebook sweepstakes; victims would then complain to me that Facebook never paid out.

Figure 16: Scammers leveraged Elon Musk’s appearance on SNL to run cryptocurrency giveaway scams. (Source: TRM Labs70)

Figure 17: Fake Alex Stamos X accounts. (Source: X71)

Financially motivated spammers have also capitalized on humanitarian crises and conflicts to cajole users into buying merchandise or visiting third-party websites displaying ads. Following Russia’s invasion of Ukraine, Meta removed thousands of accounts, Pages, and Groups “streaming live-gaming videos and reposting popular content including other people’s videos from Ukraine as a way to pose as sharing live updates”72 in an attempt to generate profits.73

Figure 18: Scammers capitalizing on the war in Ukraine. (Source: Input Magazine74)

Responses #

Virtually no popular online medium has been left untouched by spam. As a result, it is critical for every company whose product carries any kind of free communication to prepare for the possibility that it will be used to transmit unsolicited communications and scams. Below are some guidelines for preemptively dealing with spam and for addressing spam once it emerges on platforms.

Policy and Operational #

Create and enforce identity policies #

If a platform does not have clear rules around identity and commercialization, people will immediately try to make money by sending spam or exploiting users. First, platforms need to create rules around how individuals can maintain an identity on their platform. In doing so, companies should recognize that ideas of identity differ both between and within cultures. Products built in a Western context often have a stricter conception of identity that does not match local contexts when applied elsewhere. In some countries, such as Zimbabwe, people frequently use pseudonyms online, even though their friends and family still know the true identity behind the account. Similarly, activists have long argued that policies requiring the use of one’s legal identity on online platforms discriminate against trans users who do not identify with their legal name.

Companies must also recognize that stricter identity mapping policies carry their own risks. Mapping a real human to an account, for instance, makes a company responsible for safeguarding the information used to establish that mapping. The social networking site Parler75 required users to provide a photocopy of identification (typically a state driver’s license) in order to be “verified” on the site. When the site was hacked in January of 2021, that highly sensitive data was exposed.

Similarly, if companies allow users to tie their real name to their account, then they will need to ensure that the user continues to use their real identity on the platform. After receiving a blue checkmark, journalist Sarah Jeong was able to change her Twitter handle to “a literal psyduck.” Companies will also need to ensure that verified accounts remain in the original user’s hands. Verified accounts are often hijacked by scammers, because users tend to view information from verified accounts as more trustworthy.76 Scammers can also imitate verification to gain the confidence of unsuspecting users. For instance, X allows Unicode characters in a user’s name, which is important for internationalization, but has also given scammers the capability to make it look like they have a checkmark associated with their account.77

Build policy around specific abuse types #

As discussed above, spammers and scammers adapt to the unique features of each platform, so companies must implement policies targeting the specific types of abuse perpetrated on their platforms. More generally, commercialization works differently on each platform, so each company will need to build policies around what is and isn’t appropriate commercial activity on its platform to combat the presence of spam. But first, they will need to understand how spam might manifest on their platform and define it in ways that will prevent its proliferation. The table below outlines how various companies define spam on their own platforms.

| Source | Definition of spam |
| --- | --- |
| NIST | “Electronic junk mail or the abuse of electronic messaging systems to indiscriminately send unsolicited bulk messages.” |
| Meta | “[C]ontent that is designed to deceive, mislead, or overwhelm users in order to artificially increase viewership” |
| Google | “Spam includes, but is not limited to, unwanted promotional, commercial, or manipulative content that adds little value for users or to Google.” |
| YouTube | Video spam: “Content that is excessively posted, repetitive, or untargeted and does one or more of the following: promises viewers they’ll see something but instead directs them off site; gets clicks, views, or traffic off YouTube by promising viewers that they’ll make money fast; sends audiences to sites that spread harmful software, try to gather personal info, or other sites that have a negative impact.” Incentivization spam: “Content that sells engagement metrics such as views, likes, comments, or any other metric on YouTube. This type of spam can also include content where the only purpose is to boost subscribers, views, or other metrics. For example, offering to subscribe to another creator’s channel solely in exchange for them subscribing to your channel, also known as ‘Sub4Sub’ content.” Comments spam: “Comments where the sole purpose is to gather personal information from viewers, misleadingly drive viewers off YouTube, or perform any of the prohibited behaviors noted above.” Repetitive comments: “Leaving large amounts of identical, untargeted or repetitive comments.” |
| X | Content spam: “You may not share or post content in a bulk, duplicative, irrelevant or unsolicited manner that disrupts people’s experience.” Engagement spam: “We prohibit inauthentic use of X engagement features to artificially impact traffic or disrupt people’s experience.” |
| Reddit | “Repeated, unwanted, and/or unsolicited actions, whether automated or manual, that negatively affect Reddit users, Reddit communities, and/or Reddit itself.” |
| TikTok | “We do not allow the use of accounts to engage in platform manipulation. This includes the use of automation to register or operate accounts in bulk, distribute high-volume commercial content, artificially increase engagement signals, and circumvent enforcement of our guidelines.” |
| WhatsApp | Rules disallow “sending illegal or impermissible communications such as bulk messaging, auto-messaging, auto-dialing, and the like” |

Table 2: Spam definitions

Join industry working groups #

The shifting economies of scale of cybercrime have meant that the business of email spamming, fraud, and scamming over the internet has become far more centralized. Only a few hundred groups are now responsible for more than 80 percent of spam.78 Similarly, the vast majority of cybercrime depends on a select few “bulletproof” hosting services that are particularly lenient about the material they allow customers to upload and distribute, as long as those customers continue to pay expensive hosting fees. Hosting providers that are oblivious to, or willfully ignorant of, the abuse taking place on their servers often do not act on abuse complaints and ignore subpoenas from law enforcement. Spammers, phishers, botnet operators, and malware distributors choose to host their infrastructure on such services.

Given the scale and centralization of the problem, cross-industry collaboration prompted by industry working groups can be hugely helpful in combating spam and fraud. Working groups are essential, in part, because dismantling globally distributed botnets or disrupting “bulletproof” hosting services requires collaboration to detect abuse, mitigate it, and bring in law enforcement. Similarly, fraud schemes often target the entire advertising ecosystem rather than a single player or system. To combat such schemes, companies may need to partner with cybersecurity firms or other industry and information security organizations.

Adopt industry standards and build on commonly used tools #

Several of the largest tech companies have created standards and tools to prevent spam. Yahoo, for instance, has been the driver behind most email authentication technologies because it was the first company to suffer significant economic repercussions from spammers. Many companies will benefit from using existing commercial email providers (e.g., Gmail, Yahoo Mail, Microsoft Outlook, etc.), which come with sophisticated built-in spam filters. However, even companies that need to build their own email security should look to pre-existing standards and to commonly used tools to build out their infrastructure.

One example of a commonly used tool that companies can download and emulate is Apache SpamAssassin. SpamAssassin is an open-source tool that provides configurable spam filtering capabilities. It can be used both by administrators who run email servers and by those who simply have an email account on a server. SpamAssassin is useful because it comes with built-in spam detection capabilities that users can adjust to meet their needs by, for example, deciding what email to mark as spam and what to do with a message after it is identified. Broadly speaking, burgeoning companies should use open-source tools like these to build out their own spam filtering capabilities.
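As a taste of what that configuration looks like, the snippet below sketches a SpamAssassin local.cf file. The directives shown are real, but the threshold, weights, and addresses are illustrative rather than recommendations; consult the project’s documentation for current options and defaults.

```
# Illustrative SpamAssassin local.cf (values are examples, not recommendations)

required_score   5.0            # total rule score at which mail is marked spam
report_safe      1              # deliver detected spam wrapped in a report message

use_bayes        1              # enable the Bayesian classifier discussed earlier
bayes_auto_learn 1              # let clearly good/bad mail train the filter

score EXAMPLE_RULE 3.5          # adjust the weight of a rule (name illustrative)
whitelist_from   *@example.com  # never mark mail from a trusted domain as spam
```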

Seek outside counsel #

There are a number of U.S. laws that govern financially motivated criminal activity on the internet. Federal statutes include the Computer Fraud and Abuse Act (18 USC § 1030), the Wire Fraud Act (18 USC § 1343), the CAN-SPAM Act, and others, including laws that pertain to ID fraud and ID theft (18 USC §§ 1028, 1028A) and “access device” fraud (18 USC § 1029). These laws are specific and lengthy, and companies should seek outside counsel when building out their spam policies to understand their responsibilities to users and their options for redress should spam become a problem on their platforms.

The most notable federal statute governing computer trespass is the Computer Fraud and Abuse Act (CFAA). The CFAA criminalizes accessing a computer without authorization or in a manner that “exceeds authorization.” Phishing, distribution of malware, hacking into computers, and trafficking in passwords are just a few of the federal criminal offenses punishable under the CFAA. Facebook has sued spammers and companies that harvested Facebook user information under the CFAA and the CAN-SPAM Act. For example, Sanford “Spam King” Wallace sent more than 27 million spam messages to 500,000 Facebook users between November 2008 and March 2009. Facebook sued him for violating the CAN-SPAM Act and the CFAA, resulting in an order barring Wallace from accessing any Facebook network (which he then violated). He was sentenced to 30 months in prison and ordered to pay more than $300,000 in restitution.79

Recently, the Supreme Court has narrowed the scope of the CFAA. In Van Buren v. United States, the Court ruled that CFAA does not make it a crime to break a promise online or to violate a company’s terms of service.80 It still may be possible, however, to utilize the CFAA in situations where a spammer or fraudster goes well beyond the normal use of the product to sell their wares. For example, when I was at Facebook, there was a contractor in our Austin office who had attended high school with the man who was destined to become the Spam King of Texas. This contractor had been able to get himself assigned to the ads integrity team, where he had access to the technical details of Facebook’s anti-spam tooling, as well as information about accounts that had been actioned. The contractor provided the Spam King and his associates with hundreds of documents, and eventually installed remote control software on his Facebook-provided device to allow the spammers direct access to Facebook’s systems even during off-work hours. This behavior triggered several alarms and an investigation by our internal threat team. Upon discovering that this contractor had violated our policies for the benefit of his spammer high school buddy, we involved the FBI, who was able to turn the patsy against his friends, resulting in the prosecution of several of the spammers for violations of the CFAA and wire fraud statutes.

While this is an extreme example, there are still situations where federal law might be usable against financially motivated abusers, even post-Van Buren. The law in this area is complex and ever-changing, so companies should always involve expert outside counsel well before attempting any civil actions or criminal referrals.

Product and Technical #

In addition to implementing policies and operational mechanisms to combat spam and fraud, companies can also harden their products and technical stacks against both.

Deploy multi-factor authentication solutions #

Multi-factor authentication is one of the best ways to ward off attackers who seek to break into users’ accounts to steal their information or their identities. A good multi-factor authentication scheme requires a user to enter their username and password plus at least one additional factor drawn from the following categories:

  • Something only they know - Users can be asked for a piece of information that should be available only to the real user, such as the answer to a security question. Security questions are famously weak because they often involve information that a bad actor can easily dig up. Celebrities have had their accounts taken over because the answers to their security questions were sitting on their public Wikipedia pages, and even for the non-famous, this information (e.g., mother’s maiden name, the street on which you grew up) is often discoverable with only modest research.

  • Something only they have - Users can prove possession of a physical object, such as a phone that receives a one-time code, a security key (e.g., a YubiKey or other USB-like device containing a secure token), or an authenticator app that adds an additional layer of verification.81

  • Something only they are - Users can be asked to provide biometric information, such as a fingerprint, a retina scan, or Face ID. Biometrics, however, are not the panacea that people often assume. They raise obvious privacy issues, and they protect user privacy only when used locally. When your phone uses Face ID to unlock the device, the computational model of your face never leaves the phone; that is good for privacy, but it also means the biometric cannot by itself serve as a credential for a remote service like Google or Apple.

Multi-factor authentication is difficult for fraudsters to defeat because it forces an attacker to compromise more than one independent channel. Even if an attacker successfully steals a user’s password via a phishing link or a data breach, they will be unable to log into the user’s account without the additional factor, such as a code texted to the user or a YubiKey token. (A sketch of the time-based codes used by authenticator apps appears at the end of this subsection.)

Figure 19: Multi-factor authentication example

When designing security systems, companies should keep in mind that their users likely have no sense of how they are going to get hacked. Most users believe that antivirus software and strong passwords alone will keep them safe online. Giving users different authentication options may make it easier for the company to manage spam risk, but it also creates the risk that people will choose an option too weak for their threat level.

Let’s use Google as an example (see above). If a company has enabled multi-factor authentication, users are offered a number of pre-configured options for accessing their account after entering their password. The security of these options, however, varies considerably. A user who controls a phone or tablet with a signed-in Google application that has already verified them will be reasonably secure against account takeover. Receiving a verification code over SMS, by contrast, is much less secure: there are a number of ways to compromise the SMS message or an individual’s SIM or phone directly, and tech companies currently have no reliable way to find out that somebody’s phone number has changed.

Multi-factor authentication also poses a number of backup issues. Just as usernames and passwords come with the standard “Forgot your password” recovery flow, every additional factor needs an equivalent recovery path, and each recovery path added to the system is another point of risk that companies must take care to monitor. Nevertheless, a multi-factor solution is still almost always the most secure feasible approach to preventing attackers from breaking into user accounts.
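To make the “something only they have” factor concrete, here is a minimal sketch of the time-based one-time password (TOTP) scheme behind most authenticator apps, following RFC 6238 and using only the Python standard library. The secret, period, and digit count shown are illustrative defaults rather than a prescription for any particular product.

```python
import base64
import hashlib
import hmac
import struct
import time

def totp(secret_b32: str, period: int = 30, digits: int = 6) -> str:
    """Derive the current RFC 6238 time-based one-time password."""
    key = base64.b32decode(secret_b32, casefold=True)
    # Client and server derive the same counter from wall-clock time.
    counter = struct.pack(">Q", int(time.time()) // period)
    digest = hmac.new(key, counter, hashlib.sha1).digest()
    # Dynamic truncation per RFC 4226: 4 bytes at a digest-derived offset.
    offset = digest[-1] & 0x0F
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)

def verify_totp(secret_b32: str, submitted: str) -> bool:
    # Constant-time comparison avoids leaking digits through timing.
    return hmac.compare_digest(totp(secret_b32), submitted)
```

Because the code is derived from a shared secret and the current time, a stolen password alone is useless without the device holding the secret; real servers typically also accept codes from adjacent time windows to tolerate clock drift.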

Deploy risk-based conditional access mechanisms #

In addition to multi-factor authentication, companies can implement risk-based authentication, requiring more levels of verification when a login attempt is flagged as having a higher probability of being fraudulent. That probability can be estimated from contextual information about the login, which may include the following:

  • Whether or not the device is registered

  • Whether the login attempt is coming from a secure/recognized network or an unknown network

  • The time and location of the login attempt

  • Whether or not the user whose account is being accessed has fallen victim to phishing in the past (either real phishing attacks or a practice phishing test sent out by the security team)

  • The number of login attempts that have been made

  • The type of information being accessed (for instance, personal emails and social media accounts usually require less security than financial and corporate accounts)

How to weigh the importance of these factors will be context-dependent. In general, however, the higher the estimated probability that a login attempt is fraudulent, the more authentication steps should be required of the user. A login to a personal email account at a time and location from which the user frequently logs in would be considered low-risk: the user might only be prompted to enter a code sent to them via text. A login to a brokerage account the user hasn’t accessed in several months might prompt additional steps, like answering security questions. Companies that give employees access to sensitive information should issue those employees a hardware key in addition to login codes and security questions, as codes and questions may not be robust to sophisticated phishing and social engineering attacks (which fool a user into willingly handing over their access codes, passwords, and other credentials to bad actors). Extremely sensitive data may benefit from an additional layer of biometric security.
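A minimal sketch of such a scoring policy is shown below. The signal names, weights, and thresholds are hypothetical; a real system would tune them against historical fraud data rather than hand-pick them.

```python
# Hypothetical weights for the contextual signals listed above.
RISK_WEIGHTS = {
    "unregistered_device": 0.30,
    "unknown_network": 0.20,
    "unusual_time_or_location": 0.20,
    "previously_phished_user": 0.15,
    "many_recent_attempts": 0.25,
    "sensitive_account_type": 0.20,
}

def required_factors(signals: set[str]) -> list[str]:
    """Map the contextual risk signals present on a login to step-up
    authentication: the riskier the login looks, the more we demand."""
    score = sum(RISK_WEIGHTS.get(s, 0.0) for s in signals)
    if score < 0.2:
        return ["password"]
    if score < 0.5:
        return ["password", "one_time_code"]
    return ["password", "one_time_code", "hardware_key"]

# A login from an unknown network at an odd hour triggers a second factor.
print(required_factors({"unknown_network", "unusual_time_or_location"}))
```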

Leverage machine learning for spam and fake account detection #

Machine learning (ML) can be used to flag accounts that exhibit inauthentic characteristics. For example, accounts that post at odd times of day, clusters of accounts all created at the same time, and accounts that make frequent grammatical errors characteristic of non-native speakers can all be signs of impersonation or a botnet. Training ML models to recognize such accounts and surface them to engineers on the trust and safety team can help curb their spread. ML can likewise be used to identify and remove spam emails based on characteristics like the words contained in the email, the IP address of the sender, and the email address of the sender. For more on machine learning and how it can be used to detect spam, please see the sections on introduction to ML and adversarial ML.
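As a toy illustration of the text side of this, the sketch below trains a naive Bayes classifier on message bodies with scikit-learn, the classic starting point for content-based spam filtering. The four-message corpus is fabricated for illustration; a production pipeline would train on millions of labeled messages and fold in non-text signals such as sender IP and sending volume.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Tiny fabricated training set: two spam messages, two legitimate ones.
messages = [
    "Congratulations, you won! Click here to claim your prize",
    "Cheap meds, no prescription needed, order now",
    "Lunch tomorrow to go over the quarterly report?",
    "Here are the meeting notes from this morning",
]
labels = ["spam", "spam", "ham", "ham"]

# Bag-of-words features feeding a multinomial naive Bayes classifier.
model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(messages, labels)

# With this toy data, a prize-and-click message should score as spam.
print(model.predict(["Claim your free prize now, click here"]))
```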

Implement verification checkmarks #

Prominent and known users should have a verification checkmark displayed by their account name to indicate that their identity is legitimate (i.e., they are who they say they are).

Figure 20: Examples of verification checkmarks. (Sources: X and Facebook82, 83)

Build reporting mechanisms #

Platforms should also develop mechanisms for users to report fake accounts directly to the trust and safety team. The logic is simple: if reporting spam is difficult, fewer users will report it; if reporting is easy, users will provide large amounts of data that make spam easier to counteract. Unlike other abuse types, the economics of spam require that it be sent to many people in order to be profitable. As a result, any specific instance of spam tends to be reported by a large number of independent users, which makes user reports particularly useful and reliable and blunts the effect of adversarial reporting.
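As a rough sketch of how a platform might exploit that property, the snippet below counts distinct reporters per piece of content and escalates once a threshold is crossed. The threshold of ten and the identifier scheme are hypothetical.

```python
from collections import defaultdict

# Map each piece of content to the set of distinct users who reported it.
reporters_by_content: dict[str, set[str]] = defaultdict(set)

def record_report(content_id: str, reporter_id: str,
                  threshold: int = 10) -> bool:
    """Record a report; return True once the content has been reported
    by `threshold` distinct users.

    Counting distinct reporters, rather than raw reports, exploits spam's
    broadcast economics while blunting a single adversary who files many
    reports against a legitimate account.
    """
    reporters_by_content[content_id].add(reporter_id)
    return len(reporters_by_content[content_id]) >= threshold

record_report("post:123", "alice")
if record_report("post:123", "bob", threshold=2):
    print("escalate post:123 to the trust and safety queue")
```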

Reporting mechanisms, however, can be gamed. The audio app Clubhouse built a blocking tool more consequential than those of other platforms. Once blocked, a user can’t join conversations started by the blocker, and when enough users block someone, the blocked user’s profile appears to others with a black shield (an icon the blocked user cannot see). While the appeal of such a feature may seem obvious (crowdsourcing the identification of nefarious actors), the block feature was weaponized to silence speech and to target several vulnerable groups on the app.84

Figure 21: Clubhouse’s black shield indicating a user has been blocked by many other users.

Run operational checks on well-known targets #

Trust and safety teams should proactively check for fake accounts impersonating well-known targets, such as the Pentagon, Elon Musk, and military personnel (who are frequently impersonated as part of romance scams).

Looking Ahead #

One of the fundamental problems in countering spam is that it is unconstrained by borders, while the legislation that governs it remains limited to the jurisdiction of individual countries.85 An advertisement for dodgy products targeting Americans may be distributed from an infected computer in Iran with the help of an ISP in China. The definitional challenges inherent in spam, the difficulty of investigating and proving spammers’ guilt, and a failure on the part of authorities to view spam as a serious threat have meant that legal remedies and enforcement efforts against spam have taken a backseat to computer fraud and identity theft online.86 Since spam is one of the main drivers of fraud and identity theft, this narrow focus is shortsighted.

Even the anti-spam lawsuits and criminal prosecutions that have succeeded haven’t stemmed the tide.87 SecureList estimates that spam made up 56.51% of mail traffic in 2019,88 only somewhat lower than the 70% reported in 2004.89 The United States still tops Spamhaus’s World’s Worst Spam Enabling Countries list.90 While CAN-SPAM was an important first step in the FTC’s attempt to combat spam, it does not seem to have had much impact on the volume of spam or the behavior of spammers.

There are also reasons to worry that spam will soon become much more difficult to detect. Artificial intelligence provides ever more powerful tools for automatically generating deceptive content. Although the full implications are impossible to predict, the history outlined above demonstrates how new technological capabilities tend to expand the opportunities for abuse. Machine learning can be used to write messages that are more personalized and more believable to each target, without any human interaction. Researchers also worry about the use of “deep fakes” to produce personalized spam using information, like photos or videos, about an individual’s friends.91

Spam is largely perceived as an inevitable nuisance of life online. The world at large needs to recognize that it is a serious problem. Given the international scope of most cybercriminal operations, more cooperation between countries is necessary, along with a global, systematic effort to address the mass of outdated, unfixable, and unprotected systems.92


  1. Nelly is a real person and this story is real. Our team wrote about this issue in more detail here: https://cyber.fsi.stanford.edu/io/news/binomo-trading-scam ↩︎

  2. Source: Nelly Agbogu ↩︎

  3. https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.178.4052&rep=rep1&type=pdf ↩︎

  4. https://www.internetsociety.org/wp-content/uploads/2017/08/History20of20Spam.pdf ↩︎

  5. https://www.amazon.com/Spam-Nation-Organized-Cybercrime-Epidemic/dp/1492603236 ↩︎

  6. https://www.statista.com/statistics/420400/spam-email-traffic-share-annual ↩︎

  7. https://transparency.meta.com/reports/community-standards-enforcement/spam/facebook/ and https://transparency.meta.com/reports/community-standards-enforcement/dangerous-organizations/facebook/ ↩︎

  8. For other common scams, see https://www.fbi.gov/how-we-can-help-you/scams-and-safety/common-frauds-and-scams ↩︎

  9. https://www.wsj.com/world/asia/how-a-young-mayor-turned-her-town-into-a-hub-for-pig-butchering-scammers-da89e2a5 ↩︎

  10. The Advance Fee Fraud existed as a paper-based business in letter form before U.S. postal officials cracked down in 1998 (Brunton, 149) ↩︎

  11. https://www.wired.com/2006/08/baiters-teach-scammers-a-lesson/ ↩︎

  12. https://twitter.com/briankrebs/status/1326896690524250113/photo/1 ↩︎

  13. https://www.fbi.gov/scams-and-safety/common-scams-and-crimes/romance-scams ↩︎

  14. https://www.ftc.gov/business-guidance/blog/2024/02/love-stinks-when-scammer-involved ↩︎

  15. https://www.facebook.com/MilitaryRomances/photos/a.838119999534073/2133670913312302/ ↩︎

  16. https://www.nytimes.com/wirecutter/blog/amazon-counterfeit-fake-products/; https://www.wsj.com/articles/amazon-has-ceded-control-of-its-site-the-result-thousands-of-banned-unsafe-or-mislabeled-products-11566564990 ↩︎

  17. Scammers match the weight of the fake item with that of the real item so that the tracking metadata sent to Amazon is not flagged by an automatic detection algorithm. ↩︎

  18. https://www.nytimes.com/wirecutter/blog/amazon-counterfeit-fake-products/ ↩︎

  19. https://www.nytimes.com/wirecutter/blog/amazon-counterfeit-fake-products/ ↩︎

  20. https://www.mcafee.com/blogs/consumer/consumer-threat-reports/fake-antivirus-software/ ↩︎

  21. https://web.archive.org/web/20200412074723/https://www.identityforce.com/identity-theft/coronavirus-scams ↩︎

  22. Not to be confused with end-to-end encryption, end-to-end argument pushes as much complexity as possible to the endpoints of a network rather than to the systems in the middle (such as routers in the internet example). This idea, which is the basis of what made the internet so successful, is under constant attack by parties who benefit from granting powers to those systems that make the internet work. A good early formulation of this argument is here: http://web.mit.edu/Saltzer/www/publications/endtoend/endtoend.pdf ↩︎

  23. These are called autonomous systems (AS), which are assigned numbers and blocks of IP addresses by the Internet Assigned Numbers Authority. How those IP addresses are used is up to each AS. ↩︎

  24. https://returnpath.com/downloads/authenticating-email-dmarc-spf-dkim-quick-start-guide/#:~:text=SPF%20(Sender%20Policy%20Framework)%20and,make%20up%20the%20DMARC%20process.&text=DMARC%20allows%20senders%20to%20instruct,messages%20that%20fail%20DMARC%20authentication. ↩︎

  25. https://www.nytimes.com/2020/08/02/technology/florida-teenager-twitter-hack.html ↩︎

  26. Authorization (authz) is often confused with authentication (authn). Authorization involves what users are allowed to access once they are authenticated. There are a host of problems caused by violations of authorization. Generally speaking, those are infosec issues, and we will not discuss them here. ↩︎

  27. It is quite difficult to replace passwords despite all their flaws for a number of reasons, including user reluctance and usability issues, individual control of end-user platforms, and the fact that no single organization can impose a single solution: https://link-springer-com.stanford.idm.oclc.org/content/pdf/10.1007%2F978-3-642-03549-4.pdf ↩︎

  28. Hollywood action films are filled with illustrations of this, some of which can be surprisingly realistic. Take the 2008 movie Get Smart, where main character Maxwell Smart (played by Steve Carell) escapes capture by knocking out his attacker, heaving him over to a biometric scanner, and forcing his eye open to unlock a door using his retina scan. This might seem far-fetched, but it shows how even crude, brute-force tactics can be effective at getting past sophisticated authentication systems. ↩︎

  29. https://www.nytimes.com/2019/08/20/style/spam-email.html ↩︎

  30. https://www.ftc.gov/enforcement/cases-proceedings/refunds/equifax-data-breach-settlement ↩︎

  31. https://www.comodo.com/business-security/email-security/email-virus.php, https://www.malwarebytes.com/spam/ ↩︎

  32. https://timesmachine.nytimes.com/timesmachine/1898/03/20/102108294.html?pageNumber=12 ↩︎

  33. https://www.oreilly.com/library/view/bank-fraud-using/9780470494394/08_chapter-01.html ↩︎

  34. NYT Spanish Prisoner Scam 1898.pdf ↩︎

  35. https://historyhouse.co.uk/articles/spanish_prisoner_swindle.html ↩︎

  36. Brunton, F. (2013). Spam: A shadow history of the internet. https://mitpress.mit.edu/9780262527576/spam/ page 97 ↩︎

  37. Brunton, page 48 ↩︎

  38. https://www.templetons.com/brad/spamreact.html#msg ↩︎

  39. Brunton, page 60 ↩︎

  40. https://mitpress.mit.edu/9780262527576/spam/ page 64 ↩︎

  41. https://mitpress.mit.edu/9780262527576/spam/ page 84. ↩︎

  42. https://www.tsf.foundation/blog/usenet-has-to-figure-out-how-to-deal-with-spam-april-1994 ↩︎

  43. https://www.cnet.com/news/the-father-of-modern-spam-speaks/ ↩︎

  44. https://www.politico.com/magazine/story/2014/12/pharma-spam-113562/ ↩︎

  45. https://mitpress.mit.edu/9780262527576/spam/ page 104 ↩︎

  46. https://pubs.aeaweb.org/doi/pdf/10.1257/jep.26.3.87 ↩︎

  47. https://mitpress.mit.edu/9780262527576/spam/ page 110 ↩︎

  48. https://www.oreilly.com/library/view/spam-kings/9781491916124/ page 65 ↩︎

  49. https://firstmonday.org/article/view/2793/2431 ↩︎

  50. https://www.cs.princeton.edu/cass/papers/spam_ceas07.pdf ↩︎

  51. Patrick Pantel and Dekang Lin. “SpamCop – A Spam Classification & Organization Program.” Proceedings of AAAI-98 Workshop on Learning for Text Categorization; Mehran Sahami, Susan Dumais, David Heckerman, and Eric Horvitz. “A Bayesian Approach to Filtering Junk E-Mail.” Proceedings of AAAI-98 Workshop on Learning for Text Categorization. ↩︎

  52. See Machine Learning chapter; http://paulgraham.com/better.html ↩︎

  53. http://www.paulgraham.com/spam.html ↩︎

  54. https://mitpress.mit.edu/9780262527576/spam/ page 180 ↩︎

  55. https://securelist.com/kaspersky-security-bulletin-spam-evolution-2013/58274/#:~:text=The%20percentage%20of%20spam%20in%20total%20email%20traffic%20decreased%20by,year%20and%20came%20to%2069.6%25 ↩︎

  56. https://www.internetsociety.org/wp-content/uploads/2017/08/History20of20Spam.pdf ↩︎

  57. https://krebsonsecurity.com/2013/01/spam-volumes-past-present-global-local/ ↩︎

  58. https://www.propellercrm.com/blog/email-spam-statistics#:~:text=Spam%20costs%20businesses%20a%20whopping%20%2420.5%20billion%20every%20year ↩︎

  59. https://pubsonline.informs.org/doi/10.1287/mksc.1080.0397 ↩︎

  60. https://www.zdnet.com/article/study-finds-the-average-price-for-renting-a-botnet/ ↩︎

  61. https://www.technologyreview.com/2018/05/14/142895/inside-the-business-model-for-botnets/ ↩︎

  62. https://www.technologyreview.com/2018/05/14/142895/inside-the-business-model-for-botnets/ ↩︎

  63. https://www.technologyreview.com/2018/05/14/142895/inside-the-business-model-for-botnets/ ↩︎

  64. https://mitpress.mit.edu/9780262527576/spam/ page 206 ↩︎

  65. Keyword stuffing refers to the practice of overloading popular keywords onto a Web page so that search engines will read the page as being relevant in a Web search. Source: https://www.webopedia.com/TERM/K/keyword_stuffing.html ↩︎

  66. https://resources.infosecinstitute.com/spamdexing-seo-spam-malware/ ↩︎

  67. https://www.nytimes.com/2019/07/26/the-weekly/facebook-scams.html?action=click&module=RelatedLinks&pgtype=Article ↩︎

  68. https://www.nytimes.com/2019/08/20/style/spam-email.html ↩︎

  69. https://www.wired.co.uk/article/elon-musk-bitcoin-scam-twitter ↩︎

  70. https://blog.trmlabs.com/post/dogecoin-elon-musk-snl-scam ↩︎

  71. https://twitter.com/nessaweir; https://twitter.com/olamide90328153 ↩︎

  72. https://about.fb.com/wp-content/uploads/2022/04/Meta-Quarterly-Adversarial-Threat-Report_Q1-2022.pdf ↩︎

  73. https://about.fb.com/wp-content/uploads/2022/04/Meta-Quarterly-Adversarial-Threat-Report_Q1-2022.pdf ↩︎

  74. https://www.inputmag.com/culture/ukraine-russia-war-pages-instagram-meme-scams ↩︎

  75. Parler is a social networking platform with a significant user base of Donald Trump supporters, conservatives, conspiracy theorists, and far-right extremists. ↩︎

  76. https://www.buzzfeednews.com/article/charliewarzel/twitter-allowed-cryptocurrency-scammers-to-hijack-verified ↩︎

  77. https://www.buzzfeednews.com/article/charliewarzel/twitter-allowed-cryptocurrency-scammers-to-hijack-verified ↩︎

  78. https://mitpress.mit.edu/9780262527576/spam/ page 250 ↩︎

  79. https://www.justice.gov/usao-ndca/pr/sanford-spam-king-wallace-sentenced-two-and-half-years-custody-spamming-facebook-users ↩︎

  80. https://www.lawfareblog.com/supreme-court-reins-cfaa-van-buren ↩︎

  81. See description at https://www.yubico.com/ ↩︎

  82. https://twitter.com/CDCgov ↩︎

  83. https://www.facebook.com/POTUS/ ↩︎

  84. https://www.theatlantic.com/technology/archive/2021/05/clubhouse-has-blocking-problem/618867/ ↩︎

  85. https://securelist.com/spam-and-the-law/36301/ ↩︎

  86. https://www.everycrsreport.com/reports/97-1025.html#_Toc401151139 ↩︎

  87. https://krebsonsecurity.com/2017/07/is-it-time-to-can-the-can-spam-act/ ↩︎

  88. https://securelist.com/spam-report-2019/96527/ ↩︎

  89. https://krebsonsecurity.com/2017/07/is-it-time-to-can-the-can-spam-act/ ↩︎

  90. https://www.spamhaus.org/statistics/countries/ ↩︎

  91. https://www.itpro.co.uk/security/34784/the-future-of-spam-is-scary ↩︎

  92. https://icdt.osu.edu/spam-nation ↩︎