You get the internet you deserve

Comment Let’s first follow the trail of debris to see where this garbage truck went.

It’s hard to trace back, but the effects of internet content mills that flourished as far back as 2010 are still readily apparent.

The net effect of content being generated at the rate of ten to thirty pieces a day on specialized topics, all by non-specialists with Google Trends as their guiding hand, was an internet saturated by 2010 with fluffy (if not nonsensical), keyword-stuffed articles that offer little in the way of usable information and, in many cases, a lot of advice and information that is just plain wrong.

Since content mills naturally beget more content mills (and why not, when the worst offender, Associated Content, was sold to Yahoo! for 100 million dollars), what happened next was inevitable. These new content companies simply grabbed what they found in the larger content mills, using the internet of the time as a training set. A series of bad articles with little detail or, worse, inaccurate detail was repeated over and over again until it became difficult to distinguish one article from another unless it was found on one of the few edited, reputable sites.

The name of the game for these early content companies was sheer volume. Ad network (Google Adsense, etc.) revenues were already declining by 2005, but with thousands, if not millions, of articles each generating three cents a day, the money wasn’t bad. For a content mill with 200,000 articles, that was a tidy $2 million a year with very low overhead. Hosting wasn’t that expensive, web design was easy with open source CMS tools like WordPress, Drupal and others, and most importantly (and ultimately most disastrously) bulk content could be bought from offshore shops for just cents per article.

This model meant that the internet was quickly filled with poorly written nonsense, much of it still searchable in its original form or, worse, altered. Google had to start upping its game to filter around this and learn how to deliver quality content rather than reward the magic keyword mix that content mills relied on.

The problems with content mills are obvious, especially after all these years, but it was all on a human scale with the limitations of “slow” writers and keyword stuffers. The future presents us with a new challenge – one that could disrupt how we use the internet for good.

Let’s do some math

Let’s say it’s 2006 and you’re in the content milling business. You are at the top of your game. You have a team of 100 writers in India, each earning the equivalent of $10 a day to write and publish twenty 400-word pieces (on topics expertly dictated by Google keyword trending data, etc.).

Your daily expenses for salary are about $1,000. Your content mill publishes 2,000 pieces of “unique” content every day, 365 days a year, and each of those articles will generate three cents per day, assuming good search engine rankings (which were easily achieved with keyword tricks back then).

Using nice, rounded numbers for convenience, consider these annual figures (annualized because you only have to run this business for one year; the Adsense money keeps coming in, at least for a while):

Writers who create, post and tag 20 articles a day cost you $365,000 a year. They generate 730,000 pieces of content per year, each worth $10.95 (assuming three cents per day for 365 days). And all of that accrues to you, the Western Content Lord, which means you’ve got a business that makes about $8 million a year.
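To check the arithmetic, here is a minimal Python sketch of the numbers above. The function and parameter names are made up for illustration, and it follows the article's rounded simplification of crediting every article with a full 365 days of ad revenue, whenever in the year it was published.

```python
# Back-of-the-envelope model of the 2006 content mill described above.
# All names are illustrative; money is tracked in whole cents to avoid
# floating-point rounding surprises.

def annual_mill_economics(
    writers=100,
    daily_wage_usd=10,
    articles_per_writer_per_day=20,
    revenue_cents_per_article_per_day=3,
    days_per_year=365,
):
    """Return the annualized figures for the hypothetical content mill."""
    writer_cost_usd = writers * daily_wage_usd * days_per_year
    articles_per_year = writers * articles_per_writer_per_day * days_per_year
    # Simplification taken from the article: every piece earns for 365 days.
    revenue_cents_per_article = revenue_cents_per_article_per_day * days_per_year
    gross_revenue_cents = articles_per_year * revenue_cents_per_article
    return {
        "writer_cost_usd": writer_cost_usd,
        "articles_per_year": articles_per_year,
        "revenue_per_article_usd": revenue_cents_per_article / 100,
        "gross_revenue_usd": gross_revenue_cents / 100,
    }


figures = annual_mill_economics()
for name, value in figures.items():
    print(f"{name}: {value:,}")
```

Under these assumptions the gross comes out just shy of the "about $8 million" quoted above, with the writers as by far the largest cost.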

Oh, but you need to subtract hosting and so on. Let’s call it five grand. The big ugly expense? All those “expensive” writers. And you think to yourself, who needs them?

Well, no one.

Because oh boy, there’s a new business model for content mills. While its predecessors from the early 2000s filled the internet with annoying junk articles that hit keyword and word count targets without saying a thing, this one is disruptive enough to turn the internet into complete trash. And not only in terms of content, but also in terms of how internet business works.

Deploying Si to IoS

This new business model is already spreading. You have probably read many articles generated by GPT or similar AI models. The reason you probably didn’t notice is that they aren’t bad. Well, you think they’re not bad, but that’s because you’ve been weaned on the Internet of Shit (IoS) brought on by content mills that have trained us to lower our expectations when it comes to information consumption.

The problem is where these AI models are supposed to get enough data to churn out new clones of information dressed up in slightly more fluent language. And where do AI training algorithms get all this? From the IoS, of course.

Doing more math, let’s assume that 10 percent of the training data from the IoS contains actual errors. As the AI trains, and then retrains, and retrains, these errors increase. And mount. And multiply, and after ten years of retraining on bad, weird, weirdly worded, and increasingly incomprehensible data, we’re left with a true IoS.
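A toy model makes the compounding visible. The sketch below is purely illustrative: only the 10 percent starting error rate comes from the text, while the one-retraining-per-year cadence and the `corruption_per_cycle` value (the share of still-correct content that picks up an error each cycle) are assumptions invented for this example, not measurements.

```python
# Toy model of errors compounding across retraining cycles.
# ASSUMPTIONS (not from any real data): one retraining per year, and a
# fixed share of the still-correct corpus gets corrupted in each cycle.

def simulate_error_rate(initial_error_rate, corruption_per_cycle, cycles):
    """Return the corpus error rate before training and after each cycle."""
    rates = [initial_error_rate]
    for _ in range(cycles):
        current = rates[-1]
        # Existing errors persist; a slice of the clean remainder goes bad.
        rates.append(current + corruption_per_cycle * (1.0 - current))
    return rates


rates = simulate_error_rate(
    initial_error_rate=0.10,    # the article's 10 percent starting point
    corruption_per_cycle=0.10,  # hypothetical value, chosen for illustration
    cycles=10,                  # ten years at one retraining per year
)
for year, rate in enumerate(rates):
    print(f"year {year}: {rate:.1%} of the corpus contains errors")
```

In this simplified form the clean share of the corpus shrinks geometrically with every cycle, so the error rate climbs toward saturation instead of growing without bound. Different assumed parameters change the speed, not the direction.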

Math is still very important – so is volume.

For example, a single content mill operator on the scale of our Western Content Lord can use free tools to create content as fast as human operators can type in a simple one-sentence prompt. The same team of 100 workers can each enter 300 prompts per day.
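For a sense of scale, the short sketch below compares the two operations. It assumes the 300-per-day figure applies to each of the 100 workers and that every prompt yields one published piece; both are readings of the text, not stated facts, and the variable names are illustrative.

```python
# Volume comparison: hand-written mill versus prompt-driven mill.
# ASSUMPTIONS: 300 prompts per worker per day, one article per prompt.

WORKERS = 100
DAYS_PER_YEAR = 365
HAND_WRITTEN_PER_WORKER_PER_DAY = 20   # from the 2006 scenario
PROMPTED_PER_WORKER_PER_DAY = 300      # assumed to be a per-worker figure

hand_written_per_year = WORKERS * HAND_WRITTEN_PER_WORKER_PER_DAY * DAYS_PER_YEAR
prompted_per_year = WORKERS * PROMPTED_PER_WORKER_PER_DAY * DAYS_PER_YEAR
scale_up = prompted_per_year / hand_written_per_year

print(f"hand-written articles per year: {hand_written_per_year:,}")
print(f"prompted articles per year:     {prompted_per_year:,}")
print(f"scale-up factor:                {scale_up:.0f}x")
```

The same headcount produces roughly an order of magnitude more content, which is the volume problem the rest of the argument rests on.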

They don’t write, they just ask ChatGPT. They can ask it to stuff keywords like a mofo and to generate the keywords, for that matter, too. Finally, ChatGPT (as one of many examples) will have API hooks to publish the output directly to WordPress or whichever CMS the Content Lord chooses.

When the AI-to-CMS platform integration is complete, the circle is complete: the Internet just talks to itself.

Race to the bottom

What the Western Content Lord and its competitors don’t realize is how fast the race to the bottom will be, and how soon it is about to begin.

Google Adsense and every other ad network on the planet will recognize the flood and drop the amount they pay per click or view to almost nothing. And then to nothing, but not before Google and others blacklist the first known AI content mills. But more of them will pop up very quickly. It will be easier for Google, for example, to create a safe list of well-known publishers staffed by smart people.

Great, you think, balance restored! Not quite.

Keeping up with all the search innovations needed to push these IoS results down for you will cost Google money: billions of dollars in AI training and significant, frequent retraining on the internet corpus. That corpus will be infected fast and furiously, and how do the search giants pay for all these search innovations? Through advertising revenue.

Search advertising giants like Google can hold their noses and serve up content mill results because it is in their economic interest to do so. But what if the pool of “acceptable” content drops by 95 percent?

The exponential rate of internet degradation

We return to the subject of math and volume to address the most important point: the information threat is an exponential problem. A series of mistakes created, then repeated by content mills over a decade, means that these errors are trained and reinforced from the internet corpus into the underlying AI language models.

It’s one thing to live in an age of fake news, partly because it’s obviously fake to most thinking people. It’s another when the internet repeats a mistake often enough that it becomes a truth, and this is the most insidious consequence of all.

Personally, I would have preferred to end this piece with some kind of “fight the power” message, but honestly, the cat’s out of the bag at this point. Content mills can be content with per-article income measured over five-year horizons, which can be as little as .05 cents over time. But who cares, right? It’s free money. Hosting is cheap, the CMS is free and passive income is worth the effort as long as there is advertising money.

This is probably the internet you deserve. ®
