The Unveiling of an AI-Generated Clickbait Factory: A Small Iowa Newspaper’s Website Saga

Author: Kate Knibbs

In his spare time, Tony Eastin likes to dabble in the stock market. One day last year, he Googled a pharmaceutical company that seemed like a promising investment. One of the first search results Google served up on its news tab was listed as coming from the Clayton County Register, a newspaper in northeastern Iowa. He clicked, and read. The story was garbled and devoid of useful information—and so were all the other finance-themed posts filling the site, which had absolutely nothing to do with northeastern Iowa. “I knew right away there was something off,” he says. There’s plenty of junk on the internet, but this struck Eastin as strange: Why would a small midwestern paper churn out crappy blog posts about retail investing?

Eastin was primed to find online mysteries irresistible. After years in the US Air Force working on psychological warfare campaigns, he joined Meta, where he investigated issues like child abuse and political influence operations. Now he was between jobs, and a new mission was welcome. Eastin reached out to Sandeep Abraham, a friend and former Meta colleague who previously worked in Army intelligence and for the NSA, and they began digging.

What they discovered provides a snapshot of how generative AI is enabling new, deceptive online business models. Networks of websites filled with AI-generated clickbait are built by exploiting the reputations of established media outlets and brands. These outlets flourish by confusing and misleading both audiences and advertisers. They’re “domain squatting” on URLs that previously belonged to more reputable organizations. The suspicious site Eastin was referred to no longer belonged to the newspaper, but it still exploited its name.

Although Eastin and Abraham suspect that the network the Register’s old site now belongs to was created with straightforward moneymaking goals, they fear that more malicious actors could use the same tactics to push misinformation and propaganda into search results. “This poses a huge risk,” warns Abraham. “We aim to sound a warning.” To that end, they have published a report detailing their discoveries and intend to share more as they delve into the realm of AI clickbait. They hope their part-time efforts will raise awareness among the general public and lawmakers.

The Clayton County Register was established in 1926 and covered the small town of Elkader, Iowa, and the wider Clayton County, which borders the Mississippi River in the northeastern corner of the state. “It was a widely read paper,” recalls former coeditor Bryce Durbin, who is “repulsed” by the content now published at its old web address, claytoncountyregister.com. (The real Clayton County Register merged with The North Iowa Times in 2020 to become the Times-Register, which operates from a different site. It remains unknown how the newspaper lost ownership of its web domain, and the Times-Register did not respond to requests for comment.)

As Eastin found when researching his pharmaceutical stock, the site still identifies itself as the Clayton County Register, though it no longer provides local news coverage and has morphed into a financial news feed. It publishes what seem to be AI-generated articles on the stock prices of public utility companies and Web3 startups, accompanied by images that also appear to have been produced by AI.

“Not only were the articles that we assessed AI-generated, but the included images in each article were all created using diffusion models,” states Ben Colman, CEO of deepfake detection startup Reality Defender, which analyzed several articles at WIRED’s request. In addition to this finding, Abraham and Eastin observed that some articles included text acknowledging their artificial origins. Some carried the line “It’s important to note that this information was auto-generated by Automated Insights,” a reference to a company that provides language-generation technology.

When Eastin and Abraham examined the bylines on the Register’s former site, they found evidence that the authors were not actual journalists, and probably not even real people. The duo’s report notes that many writers listed on the site shared names with well-known people from other fields and had unrealistically high output.

One Emmanuel Ellerbee, credited on recent posts about Bitcoin and banking stocks, shares a name with a former professional football player. When Eastin and Abraham started their investigation in November 2023, the journalist database Muck Rack showed that he had bylined an amazing 14,882 separate news articles in his “career,” including 50 published the day they checked. By last week, the Muck Rack profile for Ellerbee showed that his output had continued apace: He is now credited with publishing 30,845 articles. Muck Rack’s CEO Gregory Galant says the company “is developing more ways to help our users discern between human-written and AI-generated content.” He points out that Ellerbee’s profile is not included in Muck Rack’s human-curated database of verified profiles.

The Register’s domain appears to have changed hands in August 2023, data shows, around the time it began to host its current financial news churn. Eastin and Abraham used tools to confirm that the site was attracting most of its readership through SEO, targeting search keywords about stock purchasing to lure clicks. Its most notable referrals from social media came from crypto-news forums on Reddit where people swap investment tips.

The whole scheme appears aimed at winning ad revenue from the page views of people who unwittingly land on the site’s garbled content. The algorithmic posts are garnished with ads served by Google’s ad platform. Sometimes those ads are themed on financial trading, in line with the content, but at other times they are unrelated, such as ads for the AARP. Using Google’s ad network on AI-generated posts with fake bylines could fall foul of the company’s publisher policies, which forbid content that “misrepresents, misstates, or conceals” information about the creator of content. Some other sites, including a financial brokerage service and an online ad network, also received direct traffic from the Clayton County Register domain, suggesting its operators may have struck up other types of advertising deals.

Eastin and Abraham’s attempts to discover who now owns the Clayton County Register’s former domain were inconclusive—as were WIRED’s—but they have their suspicions. The pair found that records of its old security certificates linked the domain to a Linux server in Germany. Using the internet device search engine Shodan.io, they found that a Polish website that formerly advertised IT services appeared associated with the Clayton County Register and several other domains. All were hosted on the same German server and published strikingly similar, apparently AI-generated content. An email previously listed on the Polish site was no longer functional and WIRED’s LinkedIn messages to a man claiming to be its CEO got no reply.

One of the other sites within this wider network was Aboutxinjiang.com. When Eastin and Abraham began their investigation at the end of 2023, it was filled with generic, seemingly AI-generated financial news posts, including several about the use of AI in investing. The Internet Archive showed that it had previously served a very different purpose. Originally, the site had been operated by a Chinese outfit called “the Propaganda Department of the Party Committee of the Xinjiang Uygur Autonomous Region,” and hosted information about universities in the country’s northwest. In 2014, though, it shuttered, and it sat dormant until 2022, when its archives were replaced with Polish-language content, which was later replaced with apparently automated clickbait in English. Since Eastin and Abraham first identified the site, it has gone through another transformation. Early this month it began redirecting to a page with information about Polish real estate.

In all, Eastin and Abraham identified nine different websites connected to the Polish IT firm that appear to be part of an AI clickbait network. The sites seem to have been chosen because each had a previously established reputation with Google, which could help them rank prominently in search results and so garner more clicks.

Google asserts that it has mechanisms to confront attempts to manipulate search rankings via the acquisition of expired domains. Furthermore, the company states that using AI to generate articles with the primary goal of achieving high rankings is considered spam. “The strategies used by these sites largely contravene Search’s spam policies,” states spokesperson Jennifer Kutz. Websites found to be in violation of these policies could face penalties in their search ranking or could be completely removed from Google’s listing.

Still, this type of network has become more prominent since the advent of generative AI tools. McKenzie Sadeghi, a researcher at online misinformation tracking company NewsGuard, says her team has seen an increase of more than 1,000 percent in AI-generated content farms within the past year.

WIRED recently reported on a separate network of AI-generated clickbait farms, run by Serbian DJ Nebojša Vujinović Vujo. While he was forthcoming about his motivations, Vujo did not provide granular details about how his network—which also includes former US-based local news outlets—operates. Eastin and Abraham’s work fills in some of the blanks about what this type of operation looks like, and how difficult it can be to identify who runs these money-making gambits. “For the most part, these are anonymously run,” Sadeghi says. “They use special services when they register domains to hide their identity.”

That’s something Abraham and Eastin want to change. They have hopes that their work might help regular people think critically about how the news they see is sourced, and that it may be instructive for lawmakers thinking about what kinds of guardrails might improve our information ecosystem. In addition to looking into the origins of the Clayton County Register’s strange transformation, the pair have been investigating additional instances of AI-generated content mills, and are already working on their next report. “I think it’s very important that we have a reality we all agree on, that we know who is behind what we’re reading,” Abraham says. “And we want to bring attention to the amount of work we’ve done just to get this much information.”

Other researchers agree. “This sort of work is of great interest to me, because it’s demystifying actual use cases of generative AI,” says Emerson Brooking, a resident fellow at the Atlantic Council’s Digital Forensic Research Lab. While there’s valid concern about how AI might be used as a tool to spread political misinformation, this network demonstrates how content mills are likely to focus on uncontroversial topics when their primary aim is generating traffic-based income. “This report feels like it is an accurate snapshot of how AI is actually changing our society so far—making everything a little bit more annoying.”
