Why Google cannot avoid Doom (but Open Source can)
If you’ve been reading the same news as me, then you already know that:
- Google Search Is Dying
- Google doesn’t work anymore for exact matches
- Reddit can’t build a better search engine
- Google is returning ‘Untitled’ results that redirect to malware/spam
Why is that?
Money!
It’s actually quite simple. Google became a mind-bogglingly valuable company by first organizing the world’s information and then slowly replacing more and more of it with ads.
Google == Ads + tolerated SEO Spam + Content on Page 7
“In 2020, Google was the largest media company worldwide, with advertising revenues of nearly 132 billion U.S. dollars.” (source)
You probably already knew that, though. Google’s most profitable customers are large companies that need a lot of attention from us users. They also tend to be the companies that invest the most in SEO and in gaming algorithms, because they stand to gain the most financially from getting even more attention. In fact, no matter how shady the method, more attention can almost always be converted into more revenue. That’s why search results keep getting shadier - and profits go up.
That is, BTW, also what is plaguing Reddit recently. If users trust the opinions they read on Reddit, then it makes sense for companies to spend money to manipulate Reddit, so you see an increase in Reddit spam bots.
My prediction is that both Google and, to a lesser degree, Reddit are going to milk their users’ trust for money until they crash hard into the wall. That will happen when enough people recognize that Google is effectively selling their attention to the highest bidder, no matter whether that helps you (the user) find the stuff that you want or not.
Google can’t change course
I predict that there is nothing that Google can do to avoid this. And it will be certain doom for the company.
For the people working in Google’s management, it would be epic career suicide to announce that they will reduce advertisements. Or that they will voluntarily kick out highly profitable customers because their SEO spam makes the internet horrible for users. Google is so invested in advertising that any solution which reduces ads is simply unacceptable for them.
But making web search spam-free might be the only way to make it useful again. And spam-free naturally also includes ad-free.
SPAM-free == Useful :)
Some of us still remember the good old days when the internet was full of useful knowledge, authentic product reviews, and diverse opinions. All of that really happened, but it was before companies noticed just how profitable SEO spam can be.
If we can sanitize our search results by removing all that commercial bot spam, the internet might feel like that again.
Google can never even attempt it, as it would undermine the foundation of their ad revenue. Also, if they kicked out too many companies just for being assholes, they would likely face legal action and people would call them out for abusing their monopoly power. But we users, the people who use a search engine to get stuff done - and not to get distracted for money - could delete horrible domains from our search results with impunity. And it is in our best interest to do so.
Friends of Friends - The Party Solution
Imagine you’re having a party and someone shits on the floor to get everyone’s attention and then starts drowning out any conversation by loudly shouting brand name slogans. I.e. imagine the real-life equivalent of 90% of SEO spam.
In the real world, you can easily make sure this never happens - or at least never again. The people who attend your party are your friends, and those that your friends trust enough to bring them along. If someone doesn’t behave, you (and/or your friends) will cut ties with them and they don’t get invited to the next party. This is how we should run internet search!
If I trust a website to give me reasonable content, then I want them to show up in my search results. If they throw shit at me, then I want to exclude them from my search results. With a bit of effort (and by analyzing my browsing history) I can narrow the internet down to roughly 200 domains that I want to show up in my search results. A few newspapers, a few forums, a few readmes, and a few blogs. I’m pretty sure you can do the same. And it would produce much better search results, because we can kick out all the ads and the spam.
Of course, maintaining the whole list of 200 domains is quite a lot of effort.
Inviting 100 people to your party is also quite a lot of effort.
So you invite your friends and tell them to bring a few of their friends, too.
We can do the same with internet search.
I maintain my own list of website domains that I trust.
My friends maintain their lists of website domains that they trust.
And with the help of technology, I can #include the trust lists of my friends into my own trust list.
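To make the #include idea concrete, here is a minimal sketch of how such lists could be merged. Everything in it is an assumption for illustration: the dictionary layout with "domains", "include" and "exclude" keys, the function name, and the example domains are made up, not an existing format. The `seen` set keeps friends who include each other from causing an endless loop.

```python
from typing import Dict, List, Optional, Set

# Hypothetical trust list format: each person publishes their own domains,
# the names of other lists they include, and optional personal exclusions.
LISTS: Dict[str, Dict[str, List[str]]] = {
    "me": {
        "domains": ["example.org"],
        "include": ["alice"],
        "exclude": ["spam.example"],
    },
    "alice": {
        "domains": ["news.example", "spam.example"],
        "include": ["me", "bob"],  # circular include, handled via `seen`
    },
    "bob": {"domains": ["forum.example"]},
}


def resolve_trust_list(
    name: str,
    lists: Dict[str, Dict[str, List[str]]],
    seen: Optional[Set[str]] = None,
) -> Set[str]:
    """Return all domains trusted by `name`, following includes transitively."""
    if seen is None:
        seen = set()
    # Skip lists we already visited (cycles) and lists that do not exist.
    if name in seen or name not in lists:
        return set()
    seen.add(name)

    entry = lists[name]
    domains = set(entry.get("domains", []))
    for included in entry.get("include", []):
        domains |= resolve_trust_list(included, lists, seen)
    # My own exclusions win over whatever my friends brought along.
    domains -= set(entry.get("exclude", []))
    return domains


if __name__ == "__main__":
    print(sorted(resolve_trust_list("me", LISTS)))
```

In this sketch, my own exclusions are applied last, so I can still kick out a domain that one of my friends happens to trust.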
For more mainstream use, someone could maintain public trust lists and run a Patreon to support their work. I mean, we’re already doing the same for ad blocking filter lists. And it works well. It’ll probably work equally well for ad-free inclusion lists.
I propose to call this way of managing which domains show up in search results “Federated Search”.
“Federated Search” is impossible for Google
The core idea of federated search is that the user has full control over what to include and exclude in their search results. (Read the preceding section for more details.)
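As a rough illustration of what "full control" could look like on the client side, the sketch below filters a list of result URLs against a set of trusted and blocked domains. The function names and the example URLs are hypothetical, and matching subdomains by suffix is one possible design choice, not a requirement of the idea.

```python
from typing import Iterable, List
from urllib.parse import urlsplit


def is_trusted(url: str, trusted: Iterable[str], blocked: Iterable[str] = ()) -> bool:
    """True if the URL's host is on the trusted list and not on the blocked list."""
    host = (urlsplit(url).hostname or "").lower()

    def matches(domains: Iterable[str]) -> bool:
        # Match the domain itself or any subdomain of it,
        # but not lookalikes such as "notexample.org".
        return any(host == d or host.endswith("." + d) for d in domains)

    return matches(trusted) and not matches(blocked)


def filter_results(urls: Iterable[str], trusted: Iterable[str], blocked: Iterable[str] = ()) -> List[str]:
    """Keep only the results that come from domains the user trusts."""
    trusted, blocked = set(trusted), set(blocked)
    return [u for u in urls if is_trusted(u, trusted, blocked)]


if __name__ == "__main__":
    results = [
        "https://blog.example.org/post",
        "https://spam.example/buy-now",
        "https://notexample.org/lookalike",
    ]
    print(filter_results(results, trusted={"example.org"}))
```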
That also means that the user has full control over the index data that is being used to create the search results. Creating this index data is a lot of work and very expensive. And once you have this data, there’s no reason for you to use an ad-ridden interface for using it. So an ad-financed company like Google can never ever allow outsiders access to this data. Doing so would kill their business model.
But for the Open Source community, it would be entirely feasible to create an open source file format, to build open source tools and clients to work with the data, and to freely share the data once it exists. Imagine if you could legally BitTorrent the data for running your own Google clone, but one that still has the "" operator and one that only searches domains that you want, and not all that other crap trying to game the rankings with SEO.
But even Open Source needs the (paid) data
Imagine you had all those configuration options I talked about before and you could use all of it with Open Source tools. Then you would still need to get the actual search index data somehow. And while the finished search index will likely be in the range of a few TB of data, producing it will require very beefy server clusters to store all that intermediate data. So crowdsourcing this isn’t really possible if you want to operate at web scale and at a competitive quality level.
But once the data has been created, it can be freely used and shared until it becomes too old to be useful. (You probably need to update the StackOverflow index once per week.) That means it would be feasible for people to team up, pay for creating the search index data, and then split the costs among everyone who will use it.
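The cost splitting is simple arithmetic, but a tiny sketch shows how quickly the per-person price drops. The function name is made up, and the dollar figures are only the hypothetical prices floated in the discussion questions, not real quotes. Amounts are kept in cents to avoid rounding surprises.

```python
def share_per_user_cents(monthly_cost_cents: int, users: int) -> int:
    """Split a monthly index cost evenly, rounding up so the bill is always covered."""
    if users <= 0:
        raise ValueError("need at least one paying user")
    return -(-monthly_cost_cents // users)  # ceiling division on integers


if __name__ == "__main__":
    # Hypothetical: a $100/month index for 200 domains, shared by 20 people.
    print(share_per_user_cents(100_00, 20))  # cents per person
    # Hypothetical: 10000 subscribers paying $1 each give this monthly budget in cents.
    print(10_000 * 1_00)
```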
It’s Discussion Time!
- If you could create your own search index (which you can redistribute or resell as you please) for, let’s say, up to 200 domains, updated weekly, would you pay $100 monthly for that?
- Do you think someone could run a 10000 subscribers x $1 monthly Patreon to maintain a search index of mainstream news websites?
- Would you rent and set up a $20 monthly VPS to host your own ad-free private federated search engine for yourself and your friends and family?
- Would you consider hosting a public search engine with ethical ads if you had free access to the necessary software and an affordable way to subscribe to the data?
- Do you think it is possible to set up a company which will be financially self-sufficient (i.e. no dependence on ads) and produce the necessary data?
- If the opportunity were there, would you set up such a company to populate an emerging market of ethically uncompromised search data providers?
- This seems like it would benefit the greater good. Are there any government subsidies for this kind of stuff?