The man who killed Google Search? (wheresyoured.at)
1879 points by elorant 10 days ago | 877 comments





Ex-Google search engineer here (2019-2023). I know a lot of the veteran engineers were upset when Ben Gomes got shunted off. Probably the bigger change, from what I've heard, was losing Amit Singhal who led Search until 2016. Amit fought against creeping complexity. There is a semi-famous internal document he wrote where he argued against the other search leads that Google should use less machine-learning, or at least contain it as much as possible, so that ranking stays debuggable and understandable by human search engineers. My impression is that since he left complexity exploded, with every team launching as many deep learning projects as they can (just like every other large tech company has).

The problem though, is the older systems had obvious problems, while the newer systems have hidden bugs and conceptual issues which often don't show up in the metrics, and which compound over time as more complexity is layered on. For example: I found an off by 1 error deep in a formula from an old launch that has been reordering top results for 15% of queries since 2015. I handed it off when I left but have no idea whether anyone actually fixed it or not.

I wrote up all of the search bugs I was aware of in an internal document called "second page navboost", so if anyone working on search at Google reads this and needs a launch go check it out.


Machine learning or not, seo spam sort of killed search. It’s more or less impossible to find real sites by interesting humans these days. Almost all results are Reddit, YouTube, content marketing, or seo spam. And google’s failure here killed the old school blogosphere (medium and substack only slightly count), personal websites, and forums

Same is happening to YouTube as well. Feels like it’s nothing but promoters pushing content to gain followers to sell ads or other stuff because nobody else’s videos ever surface. Just a million people gaming the algorithm and the only winners are the people who devote the most time to it. And by the way, would I like to sign up for their patreon and maybe one of their online courses?


I think a case can be made that the spam problem can be traced all the way back to Google buying Doubleclick.

It's really easy to spot the crap websites that are scraping content-creating websites ... because they monetize by adding ads.

If Google was _only_ selling ads on the search results page, then it could promote websites that are sans ads.

Instead, it is incentivised to push users to websites that contain ads, because it also makes money there.

And that means scraping other sites to slap your ads onto them can be very profitable for the scammers.


They hired people who introduce Jack Welch methods.

This is like in that Steve Jobs video about product people being kicked out and replaced by ones who don't care about product:

https://m.youtube.com/watch?v=P4VBqTViEx4

They will not make good search. That is not their priority.


We need a Reverse Google search that will weed out the garbage.

https://kagi.com/ de-prioritizes SEO ad sites and also lets you blacklist sites from your search results. Never going back to google after trying it

Doesn't seem to be doing great? The example search I got on their home page was 'best headphones' which pretty immediately surfaces http://www.quietheadphones.com/ - which is openly for sale, and also covered in affiliate links.

A bit farther down the page is a 'best headphones for 2020' article.

And this is the example result set they push on the home page to a potential buyer.

You guys pay for this thing?


What are you comparing it against? Do you actually have a better alternative or just having a bad day?

The fact that you tried to pick on 2 of the results for such a generic keyword shows that it's miles ahead of mainstream search engines, which are filled with SEO spam.

I tried that same search on Google, duckduckgo, bing, brave, yandex, even yahoo and needless to say the results were pretty much all SEO spam, list-style keyword farming from generic websites such as NYTimes (how tf is NYTimes an authoritative source on purchasing headphones?). Whereas in Kagi you get a wide range of helpful results focused around reviews/enthusiasts/forums, here are some of the results: youtube video reviews, reddit discussion, discussions on sound design forums, a Quora question, the headphones page on best buy, amazon, walmart, etc.

And as the other comment said, Kagi also has life-saving features that empower the user to have control over the search results [0]. As far as I know the only weak point in Kagi (at the moment) is doing more local-focused searches.

Regardless of the quality of results (which mind you, are already quite superior), it'd be still worth paying for if only to support its ad-less search model and help nurture it. Prove that it's a viable model for the sake of the web. For everyone's sake. It's a great effort for that alone. Combine both the model and high-quality results and it's the best in class with no one even close.

[0] https://help.kagi.com/kagi/features/website-info-personalize...


Google, with blacklisted domains. I wish an actual better alt existed.

I didn't 'try to pick on' - I pointed out two garbage results in a query that they literally push you to from the home page as examples for potential customers. If those results aren't doing what people claim (not highlighting seo spam) then I'm not really left with any faith that the queries they don't elevate to their home page will be better.


> how tf is NYTimes an authoritative source on purchasing headphones?

Acqui-hire. So what happened was in around 2010 or so a voice-over artist named Lauren Dragan who I think was already dabbling in professional tech journalism, wanted to write about headphones and microphones since she was getting really opinionated about them in her VO work.

So she contributed an article to “The Wirecutter,” which was trying to be like Tom’s and Engadget (I think they then dropped “the” from their name? Which makes one want to abbreviate as WC which is just tragic). I think it was just a freelance article on “audiophile headphones”...?

Well, the audiophile community online was growing etc. and this proved to be remarkably successful because it gave the audiophiles some professional validation, right? “I work in audio booths, I have to listen super closely, I know what I am talking about.” So it made money for The Wirecutter and they pitched her on “if we just bought you dozens of headphones online would you take notes and make a rec” and she's been doing stuff like that for them ever since.

Wirecutter broadened its focus to a lot of other topics, usually not with the same reliability—it really depends on the reviewer’s biases and such, and Lauren’s VO/audiophile bias of “I want my headphones to have a very flat EQ to match what's on the track, it's more important that they don't croak at higher volumes...” was something she could communicate well about in terms of sibilant highs or feeling too much or too little bass. Vs “we looked at air purifiers and, uh, they purify air!” ...

Meanwhile NYT was trying to grow their online presence as newspaper sales die... So they bought up Wirecutter, as a sort of “new journalism,” a “we wanted to get into this anyway, and it's easier if we don't try to build up the network effects ourselves but just take a site’s traffic who is already successful.” So yeah, they acqui-hired Wirecutter and put all their stuff on their domain and it kinda sucks now, but some of those were trends that were already beginning before they were acquired and there's still usually some decent data hiding in the “the competition” section of every “WC” article.


I've also been using (and paying for) Kagi for a few months now. It's fantastic.

Feels a bit silly to ask such an anecdotal question to somebody I don't know, but is it really better than Google? If you don't consider all the privacy yadda-yadda issues. I mean more like the size of the index, how quickly it updates things, how good is it at actual searching (like finding an almost exact quote which happens to exist on only one obscure site on the internet), stuff like that. I could also mention stuff like blacklisting doorways, but honestly it's less interesting, and I totally believe that it does it better than Google.

Personally, I use DDG on a daily basis, and it's mostly ok, but very-very far from perfect. More so, at least once in several days I have to switch to Google, because it is seriously better at updating the index, and DDG often fails to find something on some obscure forum, even if I know it's there (because I was a part of discussion myself!) and try to assist it with finding it as much as I can. Also, Google is immensely better at knowing local shops and finding products.

Also, Google search, bad as it is, is still the only thing I find usable on mobile. First off, it's faster, it is integrated nicely into Pixel UI, and it's somewhat good at all these "more than just a search" type of things, like converting a timezone for me, showing wikipedia summary, flight schedule, etc. Also, integration with Google Maps, working hours and venue locations, it is actually far more reliable than, say, Tripadvisor.

Still, I feel reluctant to vendor-lock myself into a paid service unless it's actually far better than everything else and can replace DDG and Google completely.


> Also, Google is immensely better at knowing local shops and finding products.

Tangential, but this is precisely the "problem" with Google search. Whatever the internal decision-making process was, Google search at some point embraced race to the bottom incentivizing outspending others, either by paying for ads or showing ads. This race is ultimately won by content scrapers/generators slapping ads on top and businesses selling stuff.

Anecdotally, there is a pet supply store near me. It's nearly impossible to find on Google maps. If I zoom over the shopping mall this particular store does not appear, if I search for "pet store" it does not appear. Only if I do search for "petstore inc." it appears in results and map. So Google knows about the store, but actively tries to hide it, presumably because Google does not make money off it.

> I have to switch to Google, because it is seriously better at updating the index

On one hand yes, Google is in some cases really quick at updating the index with new entries. However, at the same time it is equally good at updating the index with removals making old content very hard to find.


I'm a paying subscriber.

It's not "that much" better for some definitions of "that much".

But they're working on making the best search engine for their customers, and it does have a lot of features for helping make your search better and less ad-driven.

I was trying to find the age of an obscure local lava flow. Google was useless for it. Kagi had it on the third hit. So sometimes it's brilliantly better.

But what I like the most is that their incentives are aligned with mine (because I'm paying them to be).

Google is going to maximize revenue which means making it as shitty as possible without you leaving. How many ads can I cram down their throats before they split? Kagi is also maximizing revenue, but they want to make it as great as possible so you don't leave.

Are the results worth it? It's up to you, really. Try it for free--if you don't miss it after you run out of free searches, then it's not for you.


> Privacy yadda yadda

?


I’ve been toggling between Kagi and Perplexity, can honestly say I don’t miss google search (still use maps though)

Reverse of Google Search is also Google Search, due to how the ranking works.

A bit chicken-and-egg. Another perspective: Google’s system incentivizes SEO spam.

Search for a while hasn’t been about searching the web as much as it has been about commerce. It taps commercial intent and serves ads. It is now an ad engine; no longer a search engine.


Best exercise bike articles, and such, are what lots of people actually search for. There is no incentive to provide quality work which answers these queries, hence the abundance of spam and ads.

If you want to purchase consumer products at your own expense and offer an impartial opinion on each of them then you will have no problem getting ranked highly on google. You will lose a lot of money doing so, however, and will also be plagiarized to death in a month. The sites you want to be rid of will outrank you for your own content, I have been there and have the t-shirt.


> Best exercise bike articles, and such, are what lots of people actually search for

Google doesn’t have to return the SEO-optimized page. Google has other options:

- Return 10 results of the 10 top products,

- Derank any site that seems SEO-optimized,

- Derank any commercial site,

- Derank any site with a cookie banner (implying the user is tracked and the writer is trying to write what the user wants to read) or the infamous mailing list popup,

- Prioritize comparisons from brick-and-mortar journals, or give credentials to other vectors of trust,

- Act as a paid directory, where only paid answers appear,

- Return individual positive and negative comments about products, extracted from review pages, maybe even in a graph (“Good for USB-C according to 95% of the reviews, provides an electric shock according to 7% of non-affiliated comments”).

There WERE many options. Google CHOSE to rank awful sites that provide decreased value, and worse than that, it chose that all other sites won’t be viable, killing them. Google chose the face of the internet today.


Absolutely this. I don't think many people consider how odd it is that the largest internet advertising company in the world and the largest search engine company in the world are one and the same, and just how overt a conflict of interest that is, so far as providing quality service goes. It would be akin to if the largest telephone service company in the world was also the largest phone maker in the world. Oh wait, that did happen [1] - and we broke them up because it's obviously extremely detrimental to the functioning of a healthy market.

[1] - https://en.wikipedia.org/wiki/Breakup_of_the_Bell_System


For me what killed search was 2016, after that year if some search term is "hot news" it becomes impossible to learn anything about it that wasn't published in the last week and you just get the same headline repeated 20 times in slightly different wording about it.

After that I only use search for technical problems, and word of mouth or specific authors for everything else.


Yes, this is a thing I find really frustrating about Google. Especially as I often search for old news stories to find out what people were saying on a topic a few years ago in order to give some context to more recent stories.

Most of the problems I complain about are not related to SEO spam but to Google including sites that do not contain my search terms anywhere despite my use of doublequotes and the verbatim operator.

As for SEO spam a huge chunk of it would have disappeared I think if Google had created the much requested personal blacklist that we used to ask them for.

It was always "actually much harder than any of you who don't work here can imagine for reasons we cannot tell or you cannot understand" or something like that problem, but bootstrapped Kagi managed to do it - and their results are so much better that I don't usually need it.
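For what it's worth, the client-side version of a personal blocklist is not much code. Below is a minimal sketch, not how Kagi or Google actually do it: the function names, the example URLs, and the rule "block a domain and all of its subdomains" are all my own assumptions for illustration.

```python
from urllib.parse import urlparse

def domain_of(url: str) -> str:
    """Return the lowercased host of a URL, without a leading 'www.'."""
    host = (urlparse(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host

def is_blocked(url: str, blocklist: set[str]) -> bool:
    """True if the host equals a blocked domain or is a subdomain of one."""
    host = domain_of(url)
    return any(host == d or host.endswith("." + d) for d in blocklist)

def apply_blocklist(results: list[str], blocklist: set[str]) -> list[str]:
    """Drop blocked results while preserving the engine's original order."""
    return [url for url in results if not is_blocked(url, blocklist)]

# Hypothetical result list and blocklist, for illustration only.
results = [
    "https://www.spam-example.com/best-headphones-2020",
    "https://forum.example.org/t/headphone-recommendations",
    "https://shop.spam-example.com/deals",
]
blocked = {"spam-example.com"}
filtered = apply_blocklist(results, blocked)
print(filtered)
```

The hard part at Google's scale is presumably storing and applying one of these per user on every query, not the filtering logic itself.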


I've heard this argument again and again, but I never see any explanation as to why SEO is suddenly in the lead in this cat-and-mouse game. They were trying ever since Google got 90%+ market share.

I think it's more likely that Google stopped really caring.


Well yeah, it's in the article - at some point, they switched completely to metrics (i.e. revenue) driven management and forgot that it's the quality of results that actually made Google what it is. And, with a largely captive audience (Google being the default-search-engine-that-most-people-don't-bother-or-don't-know-how-to-change in Chrome, Android, on Chromebooks etc.), they arguably don't have to care anymore...

Well, it's in the name. SEO is a fancy name for trying to game whatever heuristics Google employs to form their SERPs. It's just that at some point those heuristics shifted from rewarding "quality content" as defined by the disgruntled towards enshittification.

There are various kinds of SEO - internal: technical, on-page and external. A long time ago Google had an epiphany that instead of trying to make sense out of sites themselves they could offload that effort to website administrators and started ranking pages by how well they implement technical elements helping Google index the web. For a very long time that was synonymous with white-hat SEO. Since Google search was in part based on web-of-links, various shady tactics emerged to inflate the number of indexed backlinks and boost rankings. That was black-hat SEO.

These days Google search puts tremendous focus on on-page SEO. So much that as long as the internal structure of a site is indexable (no dead links, internal backlinks, meta info) it is typically better to hire copywriters spitting out LLM-like robotic mumblings than to try and optimize further.


Massive media companies finally caught on and started churning out utter shit because it's wildly profitable.

When the 'trusted websites' caught on and embraced the game, Google was apparently helpless to stop it.


I don't know, but Youtube seems to have a more solid algorithm. I'm typically not subscribed to any channel, yet the content I want to watch does find me reasonably well. Of course, heavily promoted material also, but I just click "not interested in channel" and it disappears for a while. And I still get some meaningful recommendations if I watch a video in a certain topic. Youtube has its problems, of course, but in the end I can't complain.

I don't think youtube is trying that hard to desperately sell stuff to you via home screen recommendation algorithm. And I agree it's bearable and what you describe works reasonably well, although, for example, I am still trying to get rid of anything related to Jordan Peterson, whom I liked before and detest now after his drug addiction / mental breakdown; it just keeps popping back from various sources, literal whack-a-mole.

I wish there was some way to tell "please ignore all videos that contain these strings, and I don't mean only for next 2 weeks".

Youtube gets their ads revenue from before/during video, so they can be nicer to users.


These search companies should have hired moderators to manually browse results and tag them based on keywords instead of leaving tagging up to content and info creators. The entire results game became so fixated on trending topics and SEO spam that it became a game of insider trick trading, that's what makes results everywhere so terrible now.

In a bid for attention, only the fraudsters are winning, well, the platforms are winning lots of money from selling advertising, I guess that's why they're perfectly fine with not fixing results and ranking for many years now. I'm not sure there is a way back to real relevance now, there's no incentive for these large companies to fix things, and the public has already become too used to the gamified system to go back to behaving themselves.


What I don't understand about this explanation is that Google's results are abysmal compared to e.g. DuckDuckGo or even Brave search. (I haven't tried Kagi, but people here rave about it as well.) Sure, all the SEO is targeting googlebot, but Google has by far more resources to mitigate SEO spam than just about anyone else. If this is the full explanation, couldn't Google just copy the strategies the (much) smaller rivals are using?

Have you read the article this thread is about?

To summarize it: Google reverted an algorithm that detected SEO spam in 2019.

(Note that I have never worked for Google and I don't know whether it's true or not. It's just what this article says.)


I wasn't responding to the article; I was responding to the claim that Google's results are bad because of all the SEO. It's a claim I've heard from Google apologists including some people I know at Google. I think it's nonsense both for the reasons I stated and for the reasons enumerated in the article.

You are totally correct I think.

This isn't about what is possible.

It is about Google not wanting to say goodbye to the sweet dollars from spammy sites.

Otherwise making the probably number one requested feature, a personal block list, wouldn't have been impossible for a company with so many bright minds.

I mean: little bootstrapped Kagi had it either from the beginning or at least since shortly after they launched.

People always think they lost against SEO spam. But my main reason for quitting as soon as an alternative showed up was because they started to overrule my searches and search for what they thought I wanted to search for.

For a while I kept it at bay by using doublequotes and verbatim but none of those have worked reliably for a decade now.

That isn't SEO spam. That is poor engineering or "we know better than you" attitude.
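To make concrete what "respect my doublequotes" would mean, here is a minimal sketch of a strict post-filter: a page only qualifies if every quoted phrase from the query literally appears in its text. The function names and the case-insensitive, whitespace-normalized matching are my assumptions; this says nothing about how Google's verbatim mode is actually implemented.

```python
import re

def quoted_phrases(query: str) -> list[str]:
    """Extract the phrases the user wrapped in double quotes."""
    return re.findall(r'"([^"]+)"', query)

def normalize(text: str) -> str:
    """Lowercase and collapse runs of whitespace to single spaces."""
    return " ".join(text.lower().split())

def satisfies_verbatim(query: str, page_text: str) -> bool:
    """True only if every quoted phrase occurs literally in the page text.

    Plain substring matching: no stemming, no synonyms, no guessing
    what the user "really" meant. A query without quotes always passes.
    """
    text = normalize(page_text)
    return all(normalize(p) in text for p in quoted_phrases(query))

# Hypothetical query and pages, for illustration only.
query = '"lava flow" age'
print(satisfies_verbatim(query, "The Lava   Flow is 2,000 years old"))
print(satisfies_verbatim(query, "Flowing lava facts"))
```

Substring matching is deliberately dumb here; the point is that the filter never substitutes a different term for the one that was quoted.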


Google's search results are just bad. For example, search: "Does Quebec have an NHL team?"

The results suggest that Quebec does not have an NHL team, because it confuses the province of Quebec with Quebec City. Montreal, in Quebec, has the Montreal Canadiens and this isn't mentioned in the search results at all.


When a large search engine deranks spam websites, the spam websites complain! Loudly! With Google they have a big juicy target with lots of competing ventures for an antitrust case; no such luck for Kagi or DDG.

This is an interesting theory. Is there evidence that it's happening? Is Big SEO unreasonably effective at lobbying the Justice Department?

It’s definitely a concern where I work (not Google). Deranking anybody who happens to share a vertical we’re in is colorable as an anticompetitive action[0], and due to our dominance in another sector (not search), effectively any anticompetitive action anywhere is a no-go. And since we don’t have time to review whether a particular competitor also competes in one of our verticals and run everything by legal, nothing gets de-ranked manually.

0: for context, the US DOJ does not take antitrust action against companies simply for market dominance; it requires market dominance plus an anticompetitive action. However, they don’t like monopolies, so effectively any pretext can be used; see the Apple lawsuit or the 90s MS lawsuits for how little it takes.


The EU fined Google for prioritising Google Shopping results after complaints by other shopping/price-comparison websites.

https://en.m.wikipedia.org/wiki/Antitrust_cases_against_Goog...


I've been using Kagi for a while, and I find that it delivers better results in a cleaner presentation.

Spam didn't kill search. Google's willingness to promote spam for ads killed Google. Google is not search.

Machine learning is probably as much or even more susceptible to SEO spam.

Problem is that the rules of search engines created the dubious field of SEO in the first place. They are not entirely the innocent victim here.

Arcane and intransparent measures get you ahead. So arcane that you instantly see that it does not correspond with quality content at all, which evidently leads to a poor result.

I wish there was an option to hide every commercial news or entertainment outlet completely. Those are of course in on SEO for financial reasons.


>I wish there was an option to hide every commercial news or entertainment outlet completely.

There's always plugins or you can subscribe to Kagi, although I don't think there's any blocklist preconfigured for "all commercial news websites"


Hard disagree. As another reply mentions, just compare the alternatives such as Kagi that aren’t breaking search by pursuing ad growth.

Kagi isn't amazing, it's just not bad and it really makes plain how badly Google has degraded into an ad engine. All it takes to beat Google is giving okay quality search results.

This explodes for search terms dealing with questions related to bugs or issues or how to dos. Almost all top results are YT videos, each of which will follow the same pattern. First 10 secs garbage followed by request for subscribe and/or sponsorship content then followed by what you want.

SEO Spam didn't kill search so much as Google failed to retain Matt Cutts or replicate his community involvement https://www.searchenginejournal.com/matt-cutts-resigns-googl...

What did he use to do? Your comment seems contradictory: Cutts seems to have been on anti-spam, but your comment implies SEO did not kill search. Is SEO not part of spam?

Even when matt_cutts used to be here it was still impossible to get him (or anyone else) to care about search results including lots of results I never asked for.

Not low quality pages that spammed high ranking words but pages that simply weren't related to the query at all as evidenced by the fact that they didn't contain the keywords I searched for at all!


It's like "Do some SEO magic and Tada!"

And who forgot the recent Reddit story.


Could you link it please? I have unfortunately no idea what you are referencing

Much agreed, and this is prompting me to experiment with other search engines to see if they also cut off the interesting human sites. With today's Google I feel herded.



This is the correct insight. Google has enough machine learning prowess that they could absolutely offload, with minimal manhours, the creation of a list ranking a bunch of blogspam sites and give them a reverse score by how much they both spam articles or how much they spread the content over the page. Then apply that score to their search result weights.

And I know they could because someone did make that list and posted it here last year.
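As a rough illustration of "apply that score to their search result weights", here is a minimal sketch. Everything in it is assumed: the `Result` fields, the per-domain `spam_score` in [0, 1], and the multiplicative penalty are just one plausible way to do it, not a description of any real ranker.

```python
from dataclasses import dataclass

@dataclass
class Result:
    url: str
    domain: str
    relevance: float  # base relevance from the ranker (hypothetical scale)

def rerank(results, spam_score, strength=1.0):
    """Sort results by relevance scaled down by the domain's spam score.

    spam_score maps domain -> value in [0, 1]; unknown domains count as 0.
    strength in [0, 1] controls how hard the penalty bites (0 disables it).
    """
    def adjusted(r):
        penalty = min(max(spam_score.get(r.domain, 0.0), 0.0), 1.0)
        return r.relevance * (1.0 - strength * penalty)
    return sorted(results, key=adjusted, reverse=True)

# Invented example data, for illustration only.
results = [
    Result("https://blogspam.example/top-10-bikes", "blogspam.example", 0.90),
    Result("https://enthusiast.example/bike-review", "enthusiast.example", 0.70),
]
spam_score = {"blogspam.example": 0.8}

ranked = rerank(results, spam_score)
print([r.domain for r in ranked])
```

With these made-up numbers the blogspam page drops from 0.90 to an adjusted 0.18 and falls below the enthusiast page; producing a trustworthy `spam_score` is of course the actual hard part.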


I'm waiting for folks to implement a Reverse Google Search.

> where he argued against the other search leads that Google should use less machine-learning

This better echoes my personal experience with the decline of Google search than TFA: it seems to be connected to the increasing use of ML in that the more of it Google put in, the worse the results I got were.


It's also a good lesson for the new AI cycle we're in now. Often inserting ML subsystems into your broader system just makes it go from "deterministically but fixably bad" to "mysteriously and unfixably bad".

I think that’ll define the industry for the coming decades. I used to work in machine translation and it was the same. The older rules-based engines that were carefully crafted by humans worked well on the test suite and if a new case was found, a human could fix it. When machine learning came on the scene, more “impressive” models that were built quicker came out - but when a translation was bad no one knew how to fix it other than retraining and crossing one’s fingers.

As someone who worked in rules-based ML before the recent transformers (and unsupervised learning in general) hype, rules-based approaches were laughably bad. Only now are nondeterministic approaches to ML surpassing human level tasks, something which would not have been feasible, perhaps not even possible in a finite amount of human development time, via human-created rules.

The thing is that AI is completely unpredictable without human curated results. Stable diffusion made me relent and admit that AI is here now for real, but I no longer think so. It's more like artificial schizophrenia. It does have some results, often plausible seeming results, but it's not real.

Yes, but I think the other lesson might be that those black box machine translations have ended up being more valuable? It sucks when things don't always work, but that is also kind of life and if the AI version worked more often that is usually ok (as long as the occasional failures aren't so catastrophic as to ruin everything)

> Yes, but I think the other lesson might be that those black box machine translations have ended up being more valuable?

The key difference is how tolerant the specific use case is of a probably-correct answer.

The things recent-AI excels at now (generative, translation, etc.) are very tolerant of "usually correct." If a model can do more, and is right most of the time, then it's more valuable.

There are many other types of use cases, though.


A case in point is the ubiquity of Pleco in the Chinese/English space. It’s a dictionary, not a translator, and pretty much every non-native speaker who learns or needs to speak Chinese uses it. It has no ML features and hasn’t changed much in the past decade (or even two). People love it because it does one specific task extremely well.

On the other hand ML has absolutely revolutionised translation (of longer text), where having a model containing prior knowledge about the world is essential.


Can’t help but read that and think of Tesla’s Autopilot and “Full Self Driving”. For some comparisons they claim to be safer per mile than human drivers … just don’t think too much about the error modes where the occasional stationary object isn’t detected and you plow into it at highway speed.

relevant to the grandparent’s point: I am demoing FSD in my Tesla and what I find really annoying is that the old Autopilot allowed you to select a maximum speed that the car will drive. Well, on “FSD” apparently you have no choice but to hand full longitudinal control over to the model.

I am probably the 0.01% of Tesla drivers who have the computer chime when I exceed the speed limit by some offset. Very regularly, even when FSD is in “chill” mode, the model will speed by +7-9 mph on most roads. (I gotta think that the young 20 somethings who make up Tesla's audience also contributed their poor driving habits to Tesla's training data set) This results in constant beeps, even as the FSD software violates my own criteria for speed warning.

So somehow the FSD feature becomes "more capable" while becoming much less legible to the human controller. I think this is a bad thing generally but it seems to be the fad today.


I have no experience with Tesla and their self-driving features. When you wrote "chill" mode, I assume it means the lowest level of aggressiveness. Did you contact Tesla to complain the car is still too aggressive? There should be a mode that tries to drive exactly the speed limit, where reasonable -- not over or under.

Yes there is a “chill” mode that refers to maximum allowed acceleration and “chill mode” that refers to the level of aggressiveness with autopilot. With both turned on the car still exceeds the speed limit by quite a bit. I am sure Tesla is aware.

> For some comparisons they claim to be safer per mile than human drivers

They are lying with statistics: in the more challenging locations and conditions the AI will give up and let the human take over, or the human notices something bad and takes over. So Tesla miles are cherry-picked, and their data is not open, so a third party can't make real statistics and compare apples to apples.
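A quick worked example of how that kind of mileage selection can flip a headline number. Every figure below is invented purely for illustration (none of it is Tesla or government data); the only point is the arithmetic of weighting per-condition crash rates by where the miles are actually driven.

```python
def blended_rate(rates, mix):
    """Overall crashes per million miles: per-condition rates weighted by mileage share."""
    if abs(sum(mix.values()) - 1.0) > 1e-9:
        raise ValueError("mileage shares must sum to 1")
    return sum(rates[c] * mix[c] for c in rates)

# Hypothetical crash rates per million miles in easy vs. hard conditions.
human_rates = {"easy": 1.0, "hard": 5.0}
system_rates = {"easy": 1.2, "hard": 6.0}   # worse than humans in BOTH conditions

# Hypothetical mileage shares: humans drive everything, while the system
# hands most of the hard miles back to the human.
human_mix = {"easy": 0.80, "hard": 0.20}
system_mix = {"easy": 0.98, "hard": 0.02}

human_overall = blended_rate(human_rates, human_mix)
system_overall = blended_rate(system_rates, system_mix)
print(f"human: {human_overall:.3f}, system: {system_overall:.3f}")
```

With these made-up inputs the system comes out at roughly 1.3 crashes per million miles against roughly 1.8 for humans, despite being worse in each condition taken separately. Without the per-condition breakdown, an outsider can't tell which situation the real numbers are in.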


Or in some cases, the Tesla slows down, then changes its mind and starts accelerating again to run over child-like obstructions.

Ex: https://www.youtube.com/watch?v=URpTJ1Xpjuk&t=293s


Tesla's driver assist, from the very beginning until now, seems not to possess object/decision permanence.

Here you can see it detected an obstacle (as evidenced by info on screen), made a decision to stop, however it failed to detect existence of the object right in front of the car, promptly forgot about the object and decision to stop and happily accelerated over the obstacle. When tackling a more complex intersection it can happily change its mind with regards to exit lane multiple times, e.g. it will plan to exit on one side of a divider, replan to exit into oncoming traffic, replan again.


Well Tesla might be the single worst actor in the entire AI space, but I do somewhat understand your point. The lack of predictable failures is a huge problem with AI; I'm not sure that understandability is, by itself. I will never understand the brain of an Uber driver, for example.

yes, who exactly looked at the 70% accuracy of "live automatic closed captioning" and decided Great! ship it boys!

My guess: They are hoping user feedback will help them to fix the bugs later -- iterate to 99%. Plus, they are probably under unrealistic deadlines to deliver _something_.

But rule-based machine translation, from what I've seen, is just so bad. ChatGPT (and other LLMs) are miles ahead. After seeing what ChatGPT does, I can't even call rule-based machine translation "translation".

*Disclaimer: I'm not an AI researcher, but I have done quite a bit of human translation work.


I think NM translation was broken all along. Not in the neural network part but in choosing the right answer. https://aclanthology.org/2020.coling-main.398.pdf

Since LLMs are loosely based on NM models, it seems research on newer sampling methods like Mirostat might help here.

Perhaps using ML to craft the deterministic rules and then having a human go over them is the sweet spot.

Rules could never work for translation unless the incoming text was formatted in a specific way. Eg, you just couldn't translate a conversation transcript in a pro-drop language like Japanese into English sentence-by-sentence, because the original text just wouldn't have sentences in it. So you need some "intelligence" to know who is saying what.

I've heard AI described as the payday loan (or "high-interest credit card") of technical debt.

I think - I hope, rather - that technically minded people who are advocating for the use of ML understand the shortcomings and hallucinations... but we need to be frank about the fact that the business layer above us (with a few rare exceptions) absolutely does not understand the limitations of AI and views it as a magic box where they type in "Write me a story about a bunny" and get twelve paragraphs of text out. As someone working in a healthcare adjacent field I've seen the glint in executives' eyes when talking about AI, and it can provide real benefits in data summarization and annotation assistance... but there are limits to what you should trust it with, and if it's something big-i Important then you'll always want to have a human vetting step.

> I hope, rather - that technically minded people who are advocating for the use of ML understand the shortcomings and hallucinations.

The people I see who are most excited about ML are business types who just see it as a black box that makes stock valuation go vroom.

The people that deeply love building things, really enjoy the process of making itself, are profoundly sceptical.

I look at generative AI as sort of like an army of free interns. If your idea of a fun way to make a thing is to dictate orders to a horde of well-meaning but untrained, highly caffeinated interns, then using generative AI to make your thing is probably thrilling. You get to feel like an executive producer who can make a lot of stuff happen by simply prompting someone/something to do your bidding.

But if you actually care about the grit and texture of actual creation, then that workflow isn't exactly appealing.


They wouldn’t think this way if stock investors weren’t so often such naive lemmings ready to jump off yet another cliff with each other.

We get it, you're skeptical of the current hype bubble. But that's one helluva no true Scotsman you've got going on there. Because a true builder, one that deeply loves building things wouldn't want to use text to create an image. Anyone who does is a business type or an executive producer. A true builder wouldn't think about what they want to do in such nasty thing as words. Creation comes from the soul, which we all know machines, and business people, don't have.

Using English, instead of C, to get a computer to do something doesn't turn you into a bureaucrat any more than using Python or Javascript instead does.

Only a person that truly loves building things, far deeper than you'll ever know, someone that's never programmed in a compiled language, would get that.


Getting drunk off that AI kool-aid aren't ya

the othering of creators because they use a different paintbrush was bothering me.

I can relate, AI is a tool, and if I want to write my code by LEGOing a bunch of AI-generated functions together, I should be able to.

please go other yourself somewhere else

Hit a nerve, it seems. Apologies.

> Using English, instead of C, to get a computer to do something doesn't turn you into a bureaucrat any more than using Python or Javascript instead does.

If one uses English in as precise a way as one crafts code, sure.

Most people do not (cannot?) use English that precisely.

There's little technical difference between using English and using code to create...

... but there is a huge difference on the other side of the keyboard, as lots of people know English, including people who aren't used to fully thinking through a problem and tackling all the corner cases.


> Most people do not (cannot?) use English that precisely.

No one can, which is why any place human interaction needs anything close to the determinacy of code, normal natural language is abandoned for domain-specific constructed languages, built from pieces of natural language with meanings crafted especially for the particular domain, as the interface language between the people (and often formalized domain-specific human-to-human communication protocols with specs as detailed as you’d see from the IETF).


I gotta say, I love how you use english to perfectly demonstrate how imprecise english is without pre-understood context to disambiguate meaning.

Using English has been tried many times in the history of computing; COBOL and SQL, just to name a very few.

Still needed domain experts back then, and, IMHO, in years/decades to come


Or you can draw pretty pictures in LabVIEW lol

Was it intentional to reply with another no true Scotsman in turn here?

Yeah, I was also reading their response and was confused. "Creation comes from the soul, which we all know machines, and business people, don't have" ... "far deeper than you'll ever know", I mean, come on.

If you have to ask, then you missed it

I’m not optimistic on that point: the executive class is very openly salivating at the prospect of mass layoffs, and that means a lot of technical staff aren’t quick to inject some reality – if Gartner is saying it’s rainbows and unicorns, saying they’re exaggerating can be taken as volunteering to be laid off first even if you’re right.

Yeah but what comes after the mass layoffs? Getting hired to clean up the mess that AI eventually creates? Depending on the business it could end up becoming more expensive than if they had never adopted GenAI at all. Think about how many companies hopped on the Big Data Bandwagon when they had nothing even coming close to what "Big Data" actually meant. That wasn't as catastrophic as what AI would do but it still was throwing money in the wrong direction.

I’m sure we’re going to see plenty of that but from the perspective of a person who isn’t rich enough to laugh off unemployment, how does that help? If speaking up got you fired, you won’t get your old job back or compensation for the stress of looking in a bad market. If you stick around, you’re under more pressure to bail out the business from the added stress of those bad calls and you’re far more likely to see retribution than thanks for having disagreed with your CEO: it takes a very rare person to appreciate criticism and the people who don’t aren’t going to get in the situation of making such a huge bet on a fad to begin with – they’d have been more careful to find something it’s actually good for.

> technically minded people who are advocating for the use of ML understand the shortcomings and hallucinations

Really, my impression is the opposite. They are driven by doing cool tech things and building fresh product, while getting rid of "antiquated, old" product. Very little thought is given to the long-term impact of their work. Criticism of the use cases is often hand-waved away because you are messing with their bread and butter.


> but we need to be frank about the fact that the business layer above us (with a few rare exceptions) absolutely does not understand the limitations of AI and views it as a magic box where they type in

I think we also need to be aware that this business layer above us often sees __computers__ as a magic box where they type in. There's definitely a large spectrum of how magical this seems to that layer, but the issue remains that there are subtleties that are often important but difficult to explain without detailed technical knowledge. I think there's a lot of good ML can do (being an ML researcher myself), but I often find it ham-fisted into projects simply to say that the project has ML. I think the clearest flag to any engineer that this layer above them has limited domain knowledge is how much importance they place on KPIs/metrics. Are they targets or are they guides? Because I can assure you, all metrics are flawed -- but some metrics are less flawed than others (and benchmark hacking is unfortunately the norm in ML research[0]).

[0] There's just too much happening so fast and too many papers to reasonably review in a timely manner. It's a competitive environment, where gatekeepers are competitors, and where everyone is absolutely crunched for time and pressured to feel like they need to move even faster. You bet reviews get lazy. The problems aren't "posting preprints on twitter" or "LLMs giving summaries", it's that the traditional peer review system (especially in conference settings) poorly scales and is significantly affected by hype. Unfortunately I think this ends up railroading us in research directions and makes it significantly challenging for graduate students to publish without being connected to big labs (aka, requiring big compute) (tuning is another common way to escape compute constraints, but that falls under "railroading"). There's still some pretty big and fundamental questions that need to be chipped away at but are difficult to publish given the environment. /rant


This is why hallucinations will never be fixed in language models. That's just how they work.

mysteriously with a helping of random too!

that's not something ML people would like to hear

Is ML the new SOAP? Looks like a silver bullet and 5 years later you're drowning in complexity for no discernible reason?

> SOAP

Argh. My PTSD from writing ONVIF drivers just kicked in.


Been there, Done that. Slides over a bottle of single malt.

Horrifying memories of Microsoft Biztalk

ML is somewhere between the new SOAP and the new cryptocurrency.

Dear sir, may I interest you in the initial coin airdrop of WSDLCoin? It is going straight to the moon.

Well that's grim

So... obviously SOAP was dumb[1], and lots of people saw that at the time. But SOAP was dumb in obvious ways, and it failed for obvious reasons, and really no one was surprised at all.

ML isn't like that. It's new. It's different. It may not succeed in the ways we expect; it may even look dumb in hindsight. But it absolutely represents a genuinely new paradigm for computing and is worth studying and understanding on that basis. We look back to SOAP and see something that might as well be forgotten. We'll never look back to the dawn of AI and forget what it was about.

[1] For anyone who missed that particular long-sunken boat, SOAP was an RPC protocol like any other. Yes, that's really all it was. It did nothing special, or well, or that you couldn't do via trivially accessible alternative means. All it had was the right adjective ("XML" in this case) for the moment. It's otherwise forgettable, and forgotten.


ML has already succeeded to the point that it is ubiquitous and taken for granted. OCR, voice recognition, spam filters, and many other now boring technologies are all based on ML.

Anyone claiming it’s some sort of snake oil shouldn’t be taken seriously. Certainly the current hype around it has given rise to many inappropriate applications, but it’s a wildly successful and ubiquitous technology class that has no replacement.


That ML I have no problem with.

This new ML that's supposed to be the basis for an entire new economic wave, that I mostly dislike.

But I guess that's how we build new things... We explore and throw away 80% of what we've built.


Call me back when you have voice recognition that doesn't constantly fail spectacularly.

Voice recognition will never be rule based.

Thank you for this.

Reading these comments I thought I stepped into some alternate timeline when we don't already have widespread ML all over the place.

Like, nobody does rules-based image recognition for a decade now already!


Yeah, I'm staring at my use of chatgpt to write a 50 line python program that connected to a local sqlite db and ran a query; for each element returned, made an api call or ran a query against a remote postgres db; depending on the results of that api call, made another api call; saved the results to a file; and presented results in a table.

Chatgpt generated the entirety of the above w/ me tweaking one line of code and putting creds in. I could have written all of the above, but it probably would have taken 20-30 minutes. With chatgpt I banged it out in under a minute, helped a colleague out, and went on my way.

Chatgpt absolutely is a real advancement. Before they released gpt4, there was no tech in the world that could do what it did.


Don't forget about that expensive GPU infrastructure you invested in.

and the power bill

and how difficult it is to program those GPU to do ML


ML is quite a well-adopted technology. iPhones have had ML built in since about 2017. It has been more than 5 years.

Well, it depends on the ML person. I work on industrial ML and DL systems every day and I'm the one who made that comment.

Same here with YouTube, assuming they use ML, which is likely.

They routinely give me brain-dead suggestions such as to watch a video I just watched today or yesterday, among other absurdities.


For what it's worth, I do not remember a time when YouTube's suggestions or search results were good. Absurdities like that happened 10 and 15 years ago as well.

These days my biggest gripe is that they put unrelated ragebait or clickbait videos in search results that I very clearly did not search for - often about American politics.


15 years ago, I used to keep many tabs of youtube videos open just because the "related" section was full of interesting videos. Then each of those videos had interesting relations. There was so much to explore before hitting a dead-end and starting somewhere else.

Now the "related" section is gone in favor of "recommended" samey clickbait garbage. The relations between human interests are too esoteric for current ML classifiers to understand. The old Markov-chain style works with the human, and lets them recognize what kind of space they've gotten themselves into, and make intelligent decisions, which ultimately benefit the system.

If you judge the system by the presence of negative outliers, rather than positive, then I can understand seeing no difference.
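For anyone wondering what "Markov-chain style" could mean in practice, here is a minimal sketch. It assumes the old "related" list was something like first-order transition counts between videos watched back to back; that is a guess about the general technique, not a description of what YouTube actually ran. The function names and the sample sessions are hypothetical.

```python
from collections import Counter, defaultdict


def build_transitions(sessions):
    """Count how often video B was watched immediately after video A."""
    counts = defaultdict(Counter)
    for session in sessions:
        for current, nxt in zip(session, session[1:]):
            if current != nxt:  # ignore immediate rewatches
                counts[current][nxt] += 1
    return counts


def related(counts, video, k=3):
    """Top-k next videos for `video`, by raw transition count."""
    return [v for v, _ in counts.get(video, Counter()).most_common(k)]


# Hypothetical viewing sessions, each an ordered list of video ids.
sessions = [
    ["lathe", "milling", "welding"],
    ["lathe", "milling"],
    ["lathe", "casting"],
]

counts = build_transitions(sessions)
print(related(counts, "lathe", k=2))
```

The appeal of something this simple is legibility: the list depends only on the video you are looking at, so you can tell what kind of neighbourhood you have wandered into and steer from there.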


>The relations between human interests are too esoteric for current ML classifiers to understand.

I would go further and say that it is impossible. Human interests are contextual and change over time, sometimes in the span of minutes.

Imagine that all the videos on the internet would be on one big video website. You would watch car videos, movie trailers, listen to music, and watch porn in one place. Could the algorithm correctly predict when you're in the mood for porn and when you aren't? No, it couldn't.

The website might know what kind of cars, what kind of music, and what kind of porn you like, but it wouldn't be able to tell which of these categories you would currently be interested in.

I think current YouTube (and other recommendation-heavy services) have this problem. Sometimes I want to watch videos about programming, but sometimes I don't. But the algorithm doesn't know that. It can't know that without being able to track me outside of the website.


>I would go further and say that it is impossible. Human interests are contextual and change over time, sometimes in the span of minutes.

There's a general problem in the tech world where people seem to inexplicably disregard the issue of non-reducibility. The point about the algorithm lacking access to necessary external information is good.

A dictionary app obviously can't predict what word I want to look up without simulating my mind-state. A set of probabilistic state transitions is at least a tangible shadow of typical human mind-states who make those transitions.


I think there are things they could do and that ML could maybe help?

* They could let me directly enter my interests instead of guessing

* They could classify videos by expertise (tags or ML) and stop recommending beginner videos to someone who expresses an interest in expert videos.

* They could let me opt out of recommending videos I've already watched

* They could separate sites into larger categories and stop recommending things not in that category. For me personally, when I go to youtube.com I don't want music, but 30-70% of the recommendations are for music. If they split it into 2 categories (videos.youtube.com - no music) and (music.youtube.com - only music) they'd end up recommending far more to me that I'm actually interested in at the time. They could add other broad categories (gaming.youtube.com, documentaries.youtube.com, science.youtube.com, cooking.youtube.com, ...., as deep as they want). Classifying a video could be ML or creator decided. If you're only allowed one category there would be an incentive to not mis-classify. If they need more incentive they could dis-recommend your videos if you mis-classify too many/too often.

* They could let me mark videos as watched and actually track that the same as read/unread email. As it is, if you click "not interested -> already watched" they don't mark the video as visibly watched (the red bar under the video). Further, if you start watching again you lose the red bar (it gets reset to your current position). I get that tracking where you are in a video is something that's different for email vs video, but at the same time, if I made it 90% of the way through then for me at least, that's "watched" - same as "read" for email - and I'd like it "archived" (don't recommend this to me again) even if I start watching it again (same as reading an email marked as "read").
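The last bullet is basically asking for a sticky "watched" flag kept separate from the playback position. A rough sketch of that idea follows; the 90% threshold is taken from the comment, and everything else (the `WatchState` class, the `recommendable` helper, the sample ids) is hypothetical and not how YouTube stores anything.

```python
from dataclasses import dataclass

WATCHED_THRESHOLD = 0.9  # the "90% of the way through" from the comment above


@dataclass
class WatchState:
    duration_s: float
    position_s: float = 0.0  # where playback resumes; free to move backwards
    watched: bool = False    # sticky, like "read" on an email

    def update_position(self, position_s: float) -> None:
        self.position_s = position_s
        if position_s >= WATCHED_THRESHOLD * self.duration_s:
            self.watched = True  # set once, never cleared by rewatching

    def mark_watched(self) -> None:
        """Explicit 'already watched' click."""
        self.watched = True


def recommendable(states):
    """Video ids that have not been archived as watched."""
    return [vid for vid, state in states.items() if not state.watched]


states = {"a": WatchState(600), "b": WatchState(300), "c": WatchState(120)}
states["a"].update_position(560)  # past 90%, so it counts as watched
states["a"].update_position(15)   # rewatching moves the position only
states["b"].mark_watched()        # manual "already watched"
print(recommendable(states))
```

Keeping the flag and the position as two fields is the whole trick: restarting a video changes where it resumes but does not un-archive it.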


Those are some good suggestions, particularly the first one:

>let me directly enter my interests


YouTube has this feature

you can click one of the ML-selected categories at the top of your homepage to tell it what you'd like to see today

They probably optimize your engagement NOW - with clickbaity videos. So their KPIs show big increases. But in the long term you realize that what you watch is garbage and stop watching altogether.

Someone probably changed the engine that shows videos for you - exactly as with search.


I have to say, all my YouTube recommendations are good and they're rarely clickbait. If you sign out they're pretty bad though.

I do remember when Youtube would show more than 2 search results per page on my 23" display.

Or when they would show more than 3 results before spamming irrelevant videos.

Or when they didn't show 3 unskippable ads in a 5 minute video.

Or when they had a dislike button so you would know to avoid wasting time on low quality videos.


> Or when they didn't show 3 unskippable ads in a 5 minute video.

On desktop Chrome, a modern ad-blocking browser extension will block 100% of YouTube adverts. I haven't watched one, literally, in years. I don't watch YouTube from a mobile phone, but I think the situation is different. (Can anyone else comment about the mobile experience?)

On Android devices I use the app PipePipe to avoid the YouTube ad hell. I recommend it.

I also use Firefox for Android, which has add-on support. uBlock Origin works on the phone and disables a lot of the ad horror.


>PipePipe

It feels a bit funny asking this, since we're talking about Google (i.e. YouTube), but did you mean ;) PipeTube? I know there is a PeerTube too.


I don't know PipeTube. I meant PipePipe, which works well for me: https://github.com/InfinityLoop1308/PipePipe

> I do remember when Youtube would show more than 2 search results per page on my 23" display.

Wait what?! You "Consume Content" on a COMPUTER? What are you some kinda grandpa? Why aren't you consuming content from your phone like everyone else? Or casting it from your phone to your SMART TV! Great way to CONSUME CONTENT!

CONSUME CONTENT CONSUME CONTENT CONSUME CONTENT CONSUME CONTENT CONSUME CONTENT CONSUME CONTENT CONSUME CONTENT CONSUME CONTENT CONSUME CONTENT


Lol, Youtube on Apple TV is great. Mostly because I either need to find something fast or I switch it off because the remote is not conducive to skipping. But the only time I watch Youtube on my computer is for a specific video. The waste of space is horrendous. Same with Twitter (rarely visited), just a 3/4 inches wide column of posts on my 24 inch screen.

I'm not consuming the content on my phone, because the user experience of using these services on my phone sucks. Just the app vs website difference with urls is a difference in behavior I hate let alone all the UI differences that make the mobile experience awkward.

I don't know about the TV though.


YouTube seems to treat popular videos as their own interest category and it’s very aggressive about recommending them if you show any interest at all. If you watch even one or two popular videos (like in the millions of views), suddenly the quality of the recommendations drops off a cliff, since it is suggesting things that aren’t relevant to your interest categories, it’s just suggesting popular things.

If I entirely avoid watching any popular videos, the recommendations are quite good and don’t seem to include anything like what you are seeing. If I don’t entirely avoid them, then I do get what you are seeing (among other nonsense).


A long, long time ago, YouTube "staff" would manually put certain videos at the top of the front page when they started. I'm sure there were biases and prioritization of marketing dollars, but at least there was a human recommending it, compared to poorly recorded early Family Guy clips. I don't know when they stopped manually adding "editors/staff" choice videos, but I recall some of my favorite early YouTubers like CGP Grey claiming that recommendation built their career.

See this >15-year-old video "How to get featured on YouTube" - https://www.youtube.com/watch?v=-uzXeP4g_qA, which I remember as being originally uploaded to the official Youtube channel but looks like it's been removed now, this reupload is from October 2008.

It all depends on your use case but a lot of people seem to be in agreement it fell off in the mid to late 10s and the suggestions became noticeably worse.

YT Shorts recommendations are a joke. I'm an atheist and very rarely watch anything related to religion, and even so Shorts put me in 3 or 4 live prayers/scams (not sure) the last few months.

Similarly, Google News. The "For You" section shows me articles about astrology because I'm interested in astronomy. I get suggestions for articles about I-80 because I search for I-80 traffic cams to get traffic cam info for Tahoe, but it shows me I-80 news all the way across the country, suggestions about Mountain View because I worked there (for google!) over 3 years ago, commanders being fired from the Navy (because I read a couple articles once), it goes on and on. From what I can tell, there are no News Quality people actually paying attention to their recommendations (and "Show Fewer" doesn't actually work. I filed a bug and was told that while the desktop version of the site shows Show Fewer for Google News, it doesn't actually have an effect).

Part of the reason I switched from google to duckduckgo for searching was I didn't WANT "personalization" I want my search results to be deterministic. If I am in Seattle and search for "ducks" I want the exact fucking same search results as if I travel to Rio de Janeiro and search for "ducks".

Honestly, I'd prefer my voice assistant (siri mostly) to be like that as well. It was at first, and I think everyone hated that lol.


YT Shorts itself is kind of a mystery to me. It's an objective degradation of the interface; why on earth would I want to use it? It doesn't even allow adjustment of the playback speed or scrubbing!

So, there's a few ways to explain it. From a business strategy level, TikTok exists, and is a threat to YouTube, so we need to compete with it.

From a user perspective, Shorts highlights a specific format of YouTube that happened to have been around for a lot longer than people realize. TikTok isn't anything new; Vine was doing exactly the same thing a decade prior. It was shut down for what I can only assume were really dumb reasons. A lot of Viners moved to YouTube, but they had to change their creative process to fit what the YouTube algorithm valued at the time: longer videos.

Pre-Shorts, there really wasn't a good place on YouTube for short videos. Animators were getting screwed by the algorithm because you really can't do daily uploads of animation[0] and whatever you upload is going to be a few minutes max. A video essayist can rack up hundreds of thousands of hours of watch time while you get maybe a thousand.

(Fun fact: YouTube Shorts status was applied retroactively to old short videos, so there's actually Shorts that are decades old. AFAIK, some of the Petscop creator's old videos are Shorts now.)

But that's why users or creators would want to use Shorts. A lot of the UX problems with Shorts boil down to YouTube building TikTok inside of YouTube out of sheer corporate envy. To be clear, they could have used the existing player and added short-video features on top (e.g. swipe-to-skip). In fact, any Short can be opened in the standard player by just changing the URL! There's literally no difference other than a worse UI because SOMEONE wanted "launched a new YouTube vertical" on their promo packet!

FWIW the Shorts player is gradually getting its missing features back but it's still got several pain points for me. One in particular that I think exemplifies Shorts: if I watch Shorts on a portrait 1080p monitor - i.e. the perfect thing to watch vertical video on - you can't see comments. When you open the comments drawer it doesn't move over enough and the comments get cut off. The desktop experience is also really bad; occasionally scrolling just stops working, or it skips two videos per mousewheel event, or one video will just never play no matter how much I scroll back and forth.

[0] Vtubers don't count


You can scrub on the mobile player; that's what makes it so frustrating, because you can't do that on desktop

What does scrubbing mean in this context? Blocking the Shorts?

Seeking to a certain part of the video. On mobile, you can do it by dragging the progress bar at the bottom of the screen.

Scrubbing means quickly moving the current playback position back and forward

I think there is a large demo of people now who actually prefer to watch videos in portrait.

If you’re watching a single-subject-of-interest video on your phone (TikTok type of content), it’s great. But landscape videos are more pleasant, and there’s a reason we moved away from 4:3 for media. That assumes actually watching the videos, though, and what I see is a lot of skipping.

Landscape videos were more pleasant on landscape screens, which are rarely used now, so they aren't more pleasant now.

I don't mind portrait. I mind the inability to jump forward in the video.

Solid point. Not to mention that Shorts content is mainly linkbait and/or garbage.

I imagine my blocked channels list is stress testing YouTube at this point from the amount of shit Shorts results it's fed me after 2 years. Lol

Besides the religious crap, I'll randomly get shit from India in hindu, having not watched anything Indian or even remotely Indian.


I only get those when it's new content with <20 likes and they are testing it out. Doesn't bother me, I like to receive some untested content - even though 99% of it is pure crap (like some random non-sense film with a trendy music on top).

>in hindu

Hindi is the word for the language, bro.


I knew I could count on you.

You bet. Think nought of it. We gave the world zero, after all. Even computers owe us. ;)

https://en.m.wikipedia.org/wiki/0


Just because you're an atheist doesn't mean you won't engage with religious content though. YT rewards all kinds of engagement not just positive ones. I.e. if you leave a snide remark or just a dislike on a religious short that still counts as engagement.

Yes I know, not the case, and before you ask, I also don't engage with atheist videos. But that's only one example: the recommendations are really bad in a lot of ways for me.

Prayers for the unbelievers makes some sense.

But I associate YouTube promotions with garbage any how. The few things I might buy like Tide laundry detergent are entirely despite occasional YouTube promotion.


Lmao. I'm very positive that the conversion rate for placing an atheist in a live mass out of the blue is very very very low. Because I never stayed for more than 3 seconds, I'm not sure if it's real religious content or a scam, though - and they don't even let me report live shorts :(

“Conversion rate”. I’m not sure if you intended that pun but it’s pretty good.


That's a feature, not a bug.

I think it's probably pushing pattern it sees in other users.

There's videos I'll watch multiple times, music videos are the obvious kind, but for some others I'm just not watching/understanding it the first time and will go back and rewatch later.

But I guess youtube has no way to understand which one I'll rewatch and which other I don't want to see ever again, and if my behavior is used as training data for the other users like you, they're probably screwed.


A simple "rewatch?" line along the top would make this problem not so brain dead bad, imho. Without it you just think the algorithm is bad (although maybe it is? I don't know).

This is happening to me too, but from the kind of videos it's suggested for I suspect that people actually do tend to rewatch those particular videos, hence the recommendation.

Install "Unhook" chrome extension. That changed my life.

Thanks for writing this insightful piece.

The pathologies of big companies that fail to break themselves up into smaller non-siloed entities like Virgin Group does. Maintaining the successful growing startup ways and fighting against politics, bureaucracy, fiefdoms, and burgeoning codebases is difficult but is a better way than chasing short-term profits, massive codebases, institutional inertia, dealing with corporate bullshit that gets in the way of the customer experience and pushes out solid technical ICs and leaders.

I'm surprised there aren't more people on here who decide "F-it, MAANG megacorps are too risky and backwards not representative of their roots" and form worker-owned co-ops to do what MAANGs are doing, only better, and with long-term business sustainability, long tenure, employee perks like the startup days, and positive civil culture as their central mission.


What's odd to me is how everything is so metricized. Clearly over-metricization is the downfall of any system that looks meritocratic, due to the limitations of metrics and how they are often far easier to game than to reach through the intended means.

An example of this I see is how new leaders come in and hit hard to cut costs. But the previous leader did this (and the one before them) so the system/group/company is fairly lean already. So to get anywhere near similar reductions or cost savings it typically means cutting more than fat. And it's clear that many big corps are not running with enough fat in the first place (you want some fat! You just don't want to be obese!). This seems to create a pattern that ends up being indistinguishable from "That worked! Let's not do that anymore."


Agree you have to mix qualitative with the quantitative, but the best metrics systems don't just measure one quantity metric. They should be paired with a quality metric.

Example: User Growth & Customer Engagement

Have to have user growth and retention. If you looked at just one or the other, you'd be missing half the equation.


I think that a good portion of the problem is that there are three groups involved in metrics:

1) People setting the metrics

2) People implementing/calculating the metrics

3) People working on improving the metrics (ie product work)

2 is especially complicated for a lot of software products because it can sometimes be really hard to measure and can be tweaked/manipulated. For example, the Twitter MAU figures from the buyout that Musk keeps complaining about, or Blizzard constantly switching their MAU definition.

Often 2 and 3 are the same people and 1 is almost always upper management. I argue that 1 and 2 should be a single group of people (that doesn't work on the product at all) and not directly subject to upper management and not tracked by the same metrics they implement (or tracked by any metrics at all).


Absurdity, unfairness, and failure often result from selective blindness to reality, whether willful or unintentional. Hyperlogical people sometimes lack empathy or an ability to conceive of, to understand, or prefer to trivialize ambiguous situations, politics, biases, human factors, or nonfunctional requirements. Always keep looking for one's own and organizational blind spots.

Oh god. The blind faith in reductive, objectivist, rationalist meritocracy that somehow "everything can be measured perfectly" and "whatever happens is completely unbiased as prescribed by a black-and-white, mechanical formula". No, sorry, that's insufficiently holistic in accounting for intangibles and supportive effort, and more of a throwback ideology that should've died in the 1920's. Some degree of discretion is needed because there is no shortcut to "measuring" performance.

The hard part about starting worker owned co-ops is financing. We need good financing systems for them. People/firms who are willing to give loans for a reasonable interest, but on the scale of equity investment in tech start ups.

The problem is risk: most new businesses will go under. Who’s going to take on that unreasonable risk without commensurate reward (a high-interest loan rate, if any, or equity)?

Co-ops could go the angel/VC route for funding if they don’t give up a controlling share.


I formed a worker co-op - but it's just me! And I do CAD reverse engineering, nothing really life-giving.

I would love to join a co-op producing real human survival values in an open source way. Where would you suggest that I look for leads on that kind of organization?


Let's start with replacing Google. Count me in.

DDG, Brave, Kagi, etc. are working generously to replace Google search. The other areas that I think get less attention, and need to be targeted to successfully dismantle Google and its predatory practices, are Google Maps and Google Docs.

Maps is hard because it requires a lot of resources and money, but replacing Docs should be relatively easier.


(paid user of Kagi here)

FWIW, Kagi is built on top of Google search, so yes it's "replacing" (for you and me) a dependence on Google search, but it is categorically not a from-the-ground-up replacement for Google search.


Oh that's pretty smart

Using OSS to commoditize complements plays a big role in breaking up big advantages.

There is a big tech open source consortium working on maps now to commoditize it: https://siliconangle.com/2022/12/15/aws-microsoft-meta-tomto...

Not sure it'll work. I think half the advantage comes from the integration across all these tools (maps, search, etc). Have you ever tried to use duckduckgo? It surprised me what I take for granted in Google's user experience.


I wholeheartedly agree with you. The GMaps experience is vastly superior. Additionally, when I'm referring to GMaps, I think one of the critical features that I would love to replace with Open Source is Places. With due respect, I find both Google and Yelp a*holes in this area. While OpenStreetMap is really good for mapping, I'm still looking to find (or create) something that can supplement OSM with Places/Business data.

What does a zero cost / zero IP / cooperative model of a Google killer look like?

It can't have ads, and it can't hide any knowledge that exists which could help the user.. even if the knowledge is proprietary.

It must repeal copyright laws by force. It must drain all silos and know all things. And it must utilize the entirety of the library Genesis.


OpenStreetMap is pretty decent, and I find it better than Google Maps in most cases.

I would imagine GitHub and technology social media

I guess it depends on how much equity you own as to what is better (to your first paragraph), and how large the paycheck is (to the 2nd paragraph).

Problem is, worker owned co-ops would still require money to do anything even remotely competitive to existing businesses.

So... people go walk up for handouts from VCs....and the story begins lol.


> There is a semi-famous internal document he wrote where he argued against the other search leads that Google should use less machine-learning, or at least contain it as much as possible, so that ranking stays debuggable and understandable by human search engineers.

There's a lot of ML hate here, and I simply don't see the alternative.

To rank documents, you need to score them. Google uses hundreds of scoring factors (I've seen the number 200 thrown about, but it doesn't really matter if it's 5 or 1000.) The point is you need to sum these weights up into a single number to find out if a result should be above or below another result.

So, if:

  - document A is 2Kb long, has 14 misspellings, matches 2 of your keywords exactly, matches a synonym of another of your keywords, and was published 18 months ago, and

  - document B is 3Kb long, has 7 misspellings, matches 1 of your keywords exactly, matches two more keywords by synonym, and was published 5 months ago
Are there any humans out there who want to write a traditional forward-algorithm to tell me which result is better?
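
To make the question concrete, here is a minimal sketch of what such a hand-written scorer might look like for the two documents above. Everything in it is an assumption for illustration: the `Doc` fields, the `BASE_WEIGHTS` values, and the idea that a plain weighted sum is the right shape. None of it comes from Google.

```python
from dataclasses import dataclass


@dataclass
class Doc:
    name: str
    length_kb: float
    misspellings: int
    exact_matches: int
    synonym_matches: int
    age_months: int


# Hypothetical, hand-picked weights. Nothing in the thread says what
# they should be, which is the whole difficulty.
BASE_WEIGHTS = {
    "length_kb": 0.1,
    "misspellings": -0.2,
    "exact_matches": 2.0,
    "synonym_matches": 1.0,
    "age_months": -0.05,
}


def score(doc, weights=BASE_WEIGHTS):
    """Collapse all factors into one number via a weighted sum."""
    return sum(w * getattr(doc, field) for field, w in weights.items())


def rank(docs, weights=BASE_WEIGHTS):
    """Order documents from highest to lowest score."""
    return sorted(docs, key=lambda d: score(d, weights), reverse=True)


doc_a = Doc("A", length_kb=2, misspellings=14, exact_matches=2,
            synonym_matches=1, age_months=18)
doc_b = Doc("B", length_kb=3, misspellings=7, exact_matches=1,
            synonym_matches=2, age_months=5)

# Same documents, one weight changed.
tweaked = {**BASE_WEIGHTS, "exact_matches": 4.0}

print([d.name for d in rank([doc_a, doc_b])])
print([d.name for d in rank([doc_a, doc_b], tweaked)])
```

With these made-up weights B ranks first; doubling the exact-match weight puts A first. The code is trivial to read, but choosing and defending the numbers is not.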

You do not need to. Counting how many links are pointing to each document is sufficient if you know how long that link existed (spammers' link creation time distribution is widely different from natural link creation times, and there are many other details that you can use to filter out spammers).
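
A rough sketch of one way to read that idea: count only in-links that have existed for some minimum time. The function name `aged_link_count`, the 180-day threshold, and the sample dates are all hypothetical; a real spam filter would look at far more than link age.

```python
from datetime import date


def aged_link_count(link_created_dates, today, min_age_days=180):
    """Count in-links that have existed for at least min_age_days.

    min_age_days is an illustrative assumption, not a known threshold.
    """
    return sum(1 for created in link_created_dates
               if (today - created).days >= min_age_days)


today = date(2024, 5, 1)

# Hypothetical data: a few links accumulated over years...
natural_links = [date(2019, 3, 1), date(2021, 7, 15), date(2023, 1, 10)]
# ...versus fifty links that all appeared in the same recent week.
spam_burst = [date(2024, 4, 20)] * 50

print(aged_link_count(natural_links, today))
print(aged_link_count(spam_burst, today))
```

Under this assumption the three old links count and the fifty-link burst counts for nothing, at least until the spammer learns to be patient.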

> You do not need to.

Ranking means deciding which document (A or B) is better to return to the user when queried.

Not writing a traditional forward-algorithm to rank these documents implies one of the following:

- You write a "backward" algorithm (ML, regression, statistics, whatever you want to call it).

- You don't use algorithms to solve it. An army of humans chooses the rankings in real time.

- You don't rank documents at all.

> Counting how many links are pointing to each document is sufficient if you know how long that link existed

- Link-counting (e.g. PageRank) is query-independent evidence. If that's sufficient for you, you'll always return the same set of documents to each user, regardless of what they typed into the search box.

At best you've just added two more ranking factors to the mix:

  - document A
    qie:
        length: 2Kb
        misspellings: 14
        age: 18 months
      + in-links: 4
      + in-link-spamminess: 2.31E4
    qde:
        matches 2 of your keywords exactly
        matches a synonym of another of your keywords

  - document B
    qie:
        length: 3Kb
        misspellings: 7
        age: 5 months
      + in-links: 2
      + in-link-spamminess: 2.54E3
    qde:
        matches 1 of your keywords exactly
        matches 2 keywords by synonym
So I ask again:

- Which document matches your query better, A or B?

- How did you decide that, such that not only can you program a non-ML algorithm to perform the scoring, but you're certain enough of your decision that you can fix the algorithm when it disagrees with you ( >> debuggable and understandable by human search engineers )?


A few minor nitpicks. PageRank is not just link counting; who is linking to the page matters. Among the linking pages, those that are ranked higher matter more. And how does one figure out their rank? By PageRank. It may sound a bit like chicken and egg, but it's fine: it's the fixed point of the self-referential definition.
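
The fixed point can be found by simple iteration. Below is a minimal sketch of the textbook power-iteration form of PageRank, not Google's production code; the damping factor of 0.85, the handling of pages without out-links, and the toy `graph` are assumptions.

```python
def pagerank(links, damping=0.85, tol=1e-10, max_iter=200):
    """links maps each page to the list of pages it links to."""
    pages = sorted(set(links) | {t for ts in links.values() for t in ts})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}  # start from a uniform guess
    for _ in range(max_iter):
        # Rank held by pages with no out-links is spread over all pages.
        dangling = sum(rank[p] for p in pages if not links.get(p))
        new = {p: (1.0 - damping) / n + damping * dangling / n
               for p in pages}
        for src, targets in links.items():
            if targets:
                # A page passes its current rank on, split across its links,
                # so a link from a high-rank page is worth more.
                share = damping * rank[src] / len(targets)
                for t in targets:
                    new[t] += share
        converged = sum(abs(new[p] - rank[p]) for p in pages) < tol
        rank = new
        if converged:
            break
    return rank


# Hypothetical four-page web: "c" is linked from a, b and d.
graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = pagerank(graph)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 4))
```

In this toy graph "a" has a single in-link, yet it outranks "b" because that one link comes from "c", the highest-ranked page.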

PageRank-based ranking will not return the same set of pages. It's true that the ranking is global in the vanilla version of PageRank, but what gets returned in rank order is the set of qualifying pages. The set of qualifying pages is very much query sensitive. PageRank also depends on a seed set of initial pages; these may also be set in a query-dependent way.
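
A small sketch of that split, assuming the simplest possible notion of "qualifying" (every query term appears in the page). The `GLOBAL_RANK` scores and `PAGE_TEXT` contents are invented for illustration.

```python
# Hypothetical query-independent scores, e.g. a precomputed PageRank.
GLOBAL_RANK = {"p1": 0.40, "p2": 0.25, "p3": 0.20, "p4": 0.15}

# Hypothetical page contents.
PAGE_TEXT = {
    "p1": "homebrew package manager for macos",
    "p2": "knee injuries and acl tear recovery",
    "p3": "installing packages with homebrew",
    "p4": "acl tear surgery options",
}


def search(query):
    """Filter by the query, then order the survivors by global rank."""
    terms = query.lower().split()
    qualifying = [page for page, text in PAGE_TEXT.items()
                  if all(term in text.split() for term in terms)]
    return sorted(qualifying, key=GLOBAL_RANK.get, reverse=True)


print(search("homebrew"))
print(search("acl tear"))
```

The ordering signal never looks at the query, but the two searches still return different pages because the qualifying set does.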

All this is a little moot now, because PageRank, even defined in this way, stopped being useful a long time ago.


Statistical methods are debuggable. Is PageRank not debuggable? I am not sure where ML starts and statistics end.

What's qie and qde?

> spammers' link creation time distribution is widely different from natural link creation times

Yes, this is a statistical method. Guess what machine learning is and what it actually excels at?


For a few months last year, every time I searched for information about a package related to software available in homebrew, the first few results were from a site that clearly had just crawled all of the links in homebrew and templated out a site of links corresponding to each package name, and that's about it. It would have been nice if the generated pages contained any useful information, but alas they did not.

There's got to be a better way.


Amit was definitely against ML, long before "AI" had become a buzzword.

He wasn't the only one. I built a couple of systems there integrated into the accounts system and "no ML" was an explicit upfront design decision. It was never regretted and although I'm sure they put ML in it these days, last I heard as of a few years ago was that at the core were still pages and pages of hand written logic.

I got nothing against ML in principle, but if the model doesn't do the right thing then you can just end up stuck. Also, it often burns a lot of resources to learn something that was obvious to human domain experts anyway. Plus the understandability issues.


i worked on ranking during singhal's tenure, and it was definitely refreshing to see a "no black box ML ranking" stance.

simplicity is always the recipe for success, unfortunately, most engineers are drawn to complexity like moth to fire

if they were unable to do some AB testing between a ML search and a non-ML search, they deserve their failure 100%

there are not enough engineers blowing the whistle against ML


> most engineers are drawn to complexity like moth to fire

Unfortunately, Google evaluates employees by the complexity of their work. "Demonstrates complexity" is a checkbox on promo packets, from what I've heard.

Naturally, every engineer will try to over-complicate things just so they can get the raises and promos. You get what you value.


I've heard a similar critique for Google launching new products and then letting them die, where it's really driven by their policies and practice around what gets someone a promotion and what doesn't.

Yep, promo doc bs that will be immediately abandoned as soon as the promo goes through in X quarter.

I definitely think the ML search results are much worse. But complexity or not, strategically it's an advantage for the company to use ML in production over a long period of time so they can develop organizational expertise in it.

It would have been a worse outcome for Google if they had stuck to their no ML stance and then had Bing take over search because they were a generation behind in technology.


Engineers love simplicity but management hates it and won’t promote people that strive towards it. A simple system is the most complex system to design.

I was there from 2015-2023 and, although I didn't work in Search, I remember a lot of the bigger initiatives aimed at improving Search for users, like the project to add cards for the top 500 most commonly searched medical terms/conditions, using content from Mayo and custom contracted digital art (for an example, here's a sample link: https://www.google.com/search?q=acl+tear ). There were a lot of things like this going on at any point in time, and it was terrific to see. Then I discovered the manually curated internal knowledge graph, that even included many-language i18n. And then that it was possible for any googler to suggest updates/changes/additions.

Point being, there's a lot of amazing stuff that folks on the outside never would have seen, and it would be a shame for beancounters to ruin it all with decisions actively not "respecting the user".


That amazing internal knowledge graph you're talking about folks on the outside never seeing? That is very ironic because that knowledge graph used to be Freebase.com and a lot of the data came from the open data community who volunteered their efforts and expertise. Then Google bought Metaweb and shut down Freebase.

@gregw134 Thank you for sharing! I've never worked at Google, but really curious what the engineering context is when you say "needs a launch" in the last line.

Guessing: perhaps this means, if someone needs credit for shepherding an improvement to search quality into production, here is a set of known improvements waiting for someone to take ownership.

Exactly. The main way to get promoted at Google is to claim that you launched something important. Results in a lot of busywork and misaligned incentives.

I'm glad you shared this.

My priors before reading this article were that an uncritical over-reliance on ML was responsible for the enshittification of Google search (and Google as a whole). Google seemed to give ML models carte blanche, rather than using the 80-20 rule to handle the boring common cases, while leaving the hard stuff to the humans.

I now think it's possible both explanations are true. After all, what better way to mask a product's descent into garbage than more and more of the core algorithm being a black box? Managers can easily take credit for its successes and blame the opacity for failures. After all, the "code yellow" was called in the first place because search growth was apparently stagnant. Why was that? Were the analysts manufacturing a crisis, or has search already declined to some extent?


Phenomenal article, very entertaining and aligns with my experience as a prominent search "outsider" (I founded the first search intelligence service back in 2004, which was later acquired by WPP. Do I have some stories).

The engineers at Google were wonderful to work with up to 2010. It was like a switch flipped mid-2011 and they became actively hostile to any third party efforts to monitor what they were doing. To put it another way, this would be like NBC trying to sue Nielsen to stop it from gathering ratings data. Absurd.

Fortunately, the roadblocks thrown up against us were half-hearted ones and easily circumvented. Nevertheless, I had learned an important lesson about placing reliance for one's life work on a faceless mega tech corporation.

It was not long after that Google eliminated "Don't Be Evil" from the mission statement. At least they were somewhat self aware, I suppose.


I'm really glad the article came out though, it fills in some gaps that I was fairly confident about but didn't have anything other than my sense of the players and their actions to back up what I thought was going on.

I and a number of other people left in 2010. I went on to work at Blekko which was trying to 'fix' search using a mix of curation and ranking.

When I left, CPCs (the amount Google got per ad click in search) were going down (I believe mostly because of click fraud and advertisers losing faith in Google's metrics). While they were reporting it in their financial results, I had made a little spreadsheet[1] from their quarterly reports and you can see things tanking.

I've written here and elsewhere about it, and watched from the outside post 2010 and when people were saying "Google is going to steam roll everyone" I was saying, "I don't think so, I think unless they change they are dead already." There are lots of systemic reasons inside Google why it was hard for them to change and many of their processes reinforced the bad side of things rather than the good side. The question for me has always been "Will they pull their head out in time to recover?" recognizing that to do that they would have to be a lot more honest internally about their actions than they were when I was there. I was also way more pessimistic, figuring that they would be having company wide layoffs by 2015 to 2017 but they pushed that out by 5 years.

I remember pointing out to an engineering director in 2008 that Google was living in the dead husk of SGI[2] which caused them to laugh. They re-assured me that Google was here to stay. I pointed out that Wei Ting told me the same thing about SGI when they were building the campus. (SGI tried to recruit me from Sun which had a campus just down the road from where Google is currently.)

[1] https://docs.google.com/spreadsheets/d/18_y-Zyhx-5a1_kcW-x7p...

[2] Silicon Graphics -- https://www.sfchronicle.com/news/article/peninsula-high-tech...


> I was also way more pessimistic, figuring that they would be having company wide layoffs by 2015 to 2017 but they pushed that out by 5 years.

Well in 2011 Google had just over 30k employees, and now they're doing "layoffs" with 180k+ in 2024. I don't think the layoffs mean much.


Did I mention I was more pessimistic? :-) I expect that today they could lay off 150k, keep the 30K that are involved with search and enough ads that are making business and that husk would do okay for a long time. I don't suppose you watched SGI die; that happened to them. It kind of spiraled into a core that had some money making business and then lived on that.

One of my observations between "early" Google and "late" Google (and like the grandparent post I see 2010 as a pretty key point in their evolution) was employee "efficiency." I don't know if you've ever been in that situation where someone leaves a company and the company ends up hiring two or three people to replace them because of all the things they were doing. Not 10x engineers but certainly 3 - 5x engineers. Google started losing lots of those in that decade. They had gone through the "Great Repricing" in 2008 when Google lowered the strike price on thousands of share options. And people who had been there 5 to 10 years had enough wealth built up in Google stock that, at a modest level of "this isn't fun any more," they could just leave.

But aside from your observation that "they have plenty of people" it is similar to observing that a plane that has lost its engine at 36,000' has "plenty of altitude" both true and less helpful than "and here is the process we're going to use as we fall out of the sky to get the engines back on."

Google has lots of resources. If you have ever read about IBM reinventing itself in the 90's it's quite interesting to note that had IBM not owned a ton of real estate it likely would not have had the resources to restructure itself. I worked with an executive at IBM who was part of that restructuring and it really impressed on me how important "facing reality" was at a corporation, and looking at the situation more realistically. I had started trying to get Google to do that but gave up when Alan Eustace explained that he understood my argument but they weren't going to do any of the things I had recommended. At that point it's like "Okay then, have fun." Still, at some point, they could. They could figure out exactly what their "value add" is and the big E economics of their business and realign to focus on that. Their 'mission oriented' statement suggests that they are paying some attention to that idea now. But to really pull it off a lot of smart, self-interested, and low-EQ people are going to have to come to terms with being wrong about a lot of stuff. That is what I don't see happening and so I'm not really expecting them to transform. Both not enough star bits and the luma are just not hungry enough.


Are you suggesting that Google fire all the engineers who work on Cloud? That would... be a very interesting business decision, likely closing any door for them working with enterprise in the future.

Here's a few more realistic changes that Alphabet could make:

- shut down X

- shut down Verily

- sell Calico or shut it down if no buyers

- sell Fiber or shut it down if no buyers

- shut down Intrinsic, Wing, and all the other X spinoffs

- make Cloud be its own Alphabet company with Kurian as an actual CEO

That would show Wall Street that Google is really serious about not wasting money on crazy ideas. That would boost the price (along with reducing costs) giving them some runway. I think it would be a shame if Waymo was shut down but it has a long, long way to go before being highly profitable.

It looks like Alphabet wants to sell Verily or spin it out of the Alphabet family entirely (after decoupling Verily's infrastructure from Google's) but nobody wants to buy it.


I was suggesting that they fire all the engineers that work on things that don't make money. It was only last quarter that Cloud actually made a profit. That said, I think you make a reasonable restructuring case; Now you just have to figure out how to get leadership to buy in and execute on that plan. In my experience two things work against that.

1) If it isn't their idea they don't believe it will do any good and could not possibly be the "right" thing to do.

2) If they don't have a job after it happens, they will work behind the scenes to sabotage attempts to make it happen.

You can work around those, but you need "existential risk" level energy to create that sort of change in a company.


Hear, hear! Google needs to trim the fat, desperately! They need to eliminate all of their non revenue generating departments, ban all internal discussion forums and such that aren't laser focused on the job at hand. Cut 30-40% of all engineers, and get rid of the free food and other benefits. Install vending machines and charge for meals at their cafeterias, run them like any other business and make a profit. Get rid of free employee health benefits, make the employees pay for them. And for God's sake get rid of that ridiculous swimming pool! Anything that isn't directly in the service of delivering value for shareholders needs to be done away with, starting with those hare-brained cash burning crazy X projects.

But Google seems to be doing decently well compared to blekko and Watson?

That it is, but a more apt comparison would be Duck Duck Go, which was a contemporary of Blekko and definitely outperformed relative to Blekko's success. DDG is still going strong and even buying TV ads, so yeah.

That said, how Blekko and Watson ended up squandering good technology in search of something else is also an interesting learning experience/tale.


I agree with everything you are saying, but the stock price is up 6x over the last 10 years and revenue is still growing 13% a year so nobody at google is going to listen to any of this.

Basically ad dollars have continued to transition from old media to digital media over the last decade+ and that mass migration has created enough revenue to cover up all of Google's core problems.


The reality is that this firehose of money is what allowed those core problems to grow & fester. That and the practice of using TVCs for as much as possible, to the point where nearly every process that's documented is outsourced, and often no one inside really understands how things work anymore.

Looking at financials, all metrics are improving. They haven’t even started to lose altitude - they’re still gaining.

We might not like what they’ve become, but the comparison to a plane that’s lost its engine seems rather odd. Why couldn’t they keep going indefinitely, without making the changes that some would like?


Is IBM a good example? Like GE, their saving-grace restructuring was basically turning into a giant corporate leech (one through financialization, one through consulting).

ChuckMcM, I just wanted to say, I really appreciate the long view you bring to HN discussions. When you've been in tech for a few decades you start to see predictable patterns. History may not repeat, but it often rhymes.

Piggybacking on this to also express my appreciation. If/when you write a memoir someday, it would be a valuable historical record. If not, your hn comments are a wonderful corpus too :)

Thank you for sharing your experiences, Chuck!


A) I think it’s important to acknowledge that in many ways Google is actively trying to keep CPC low - what they care most about is total spend. A low CPC means an effective advertising network where interested consumers are efficiently targeted. Their position is complex thanks to their monopoly status over online advertising.

B) I don’t think it’s fair to characterize recent layoffs as some put-off collapse… criticize Google all you want for running a bad search engine, but right now they’re still dominant and search is the most effective advertising known to man. They’re raking in buckets of money: they had 54K employees on 01/01/2015, and 182K on 01/01/2024. Similarly, they made 66B in 2014, and 305B in 2023. The latest layoffs are them cleaning house and scaring their workers into compliance, not the death throes of a company in trouble — they’re barely a dent in the exponential graphs: https://www.macrotrends.net/stocks/charts/GOOG/alphabet/numb...


A) This is short-sighted. What you're suggesting is in fact a way to optimize short-term gain over long-term viability. It's pure MBA tactics.

Additionally, it's complete and total oversimplification. If you look at Google's earnings it's pretty damn clear that at least until 2020 they were not just going for maximum total spend, but for a steady, gradual raise in total spend. Not too slow, not too fast. They were NOT taking every opportunity they had, in fact they're famous for systematically refusing many opportunities (see the original founders' letter, but even after that). They were farming the ad market, the ad spend, growing it, nurturing it. Then COVID blew up the farm.

Maybe you're right now, but I do hope they're recovering their old tactics. Because if they maximize it you'd see nothing but scams ... wait a second.

B) Google was built by providing a vision, and getting out of the way of ground-up engineer efforts. "Scaring workers into compliance" IS killing the golden goose.

You can see this in AI. Every story from an AI engineer that ran away from Google is the same. They didn't run away for the money, they ran away because they were getting scared into compliance.

Now AI may make it, or not. I don't know. But this is happening EVERYWHERE in Google. Every effort. Every good idea, and every bad idea runs away, usually inside the mind of "a worker". Not to make them personally maximum money, but it's natural selection: if the idea doesn't run away, the engineer it's in is "scared into compliance", into killing the idea.

Whatever the next big thing turns out to be, it simply cannot come out of Google. And it will hit suddenly, just like it did for Yahoo.


Totally agree on the overall prognosis of Google - I am (also?) one of said engineers! Here’s a recent update from a tiny corner of the company: the rank and file is still incredibly smart and generally well-intentioned, but are following hollow simulacrums of the original culture - all-hands, dogfooding, internal feedback, and ground-up engineering priorities are all maintained in form, but they are now rendered completely functionless. I am personally convinced that the company is (or was, before ChatGPT really took off) focused on immediate short term stock value above all else. After all, if you were looking down the barrel of multiple federal and EU antitrust suits and dwindling public support for the utility you own and operate, you might do the same…

I guess I’m standing up for the simple idea that terribly inefficient organizations can prevail when they’re the incumbents, at least for significant periods if not forever. We can’t be complacent and assume they’ll fall on their own, esp when AGI threatens social calcification on an unheard of scale.


Ironically (and unsurprisingly), it's a repeat of what happened at Yahoo. ;-)

Drop your good intentions - towards Google, that is.

Work to sabotage and collapse the organization - do that for the good of humanity.

Thank you for your work, and good luck getting out without harm or reprisal <3

Hit em hard.


Why would Google's collapse be for the good of humanity? When was a power vacuum ever beneficial?

"Build a better search engine for the good of humanity", I can understand. "Kill a search engine for the good of humanity" is a reductive, childish take.


They've already killed it in essence, so that they are hurting billions of humans with it daily. But they can still run it because it creates more revenue in this harmful form than it did in its helpful form. Therefore sabotage against that revenue is justified.

Sabotaging the revenue of Google search will weaken them against honest challengers. They are currently well funded enough to kill challengers. That will start to change as they decline, aided by our boycotts and other forms of sabotage. The decline and sabotage of Google is necessary for a better search engine to have the space to succeed.

A power vacuum is often good.

Linux and open source exist in a personal and collective power vacuum that was created by proprietary knowledge and software.

Sometimes power vacuums are colonized by people with good intentions. And it's neither reductive nor immature to help create those opportunities.

I never said that someone shouldn't sabotage Google as well as create a better search engine. I myself am working on llm-driven knowledge retrieval systems, at the same time as advocating for the destruction of Google.

Good luck and do anything in your power that you think will help humanity have good search again.


Very much appreciate the sentiment and kind words! Reminds me of Yudkowsky’s line[1] about AI: “we should be willing to destroy a rogue datacenter by airstrike.” This kind of talk sounds insane in the Silicon Valley language game, but we’re talking about real people’s lives here and sometimes implied violence needs to be made explicit. And that’s what I see your suggestion as, ultimately, but that’s probably because I got an American HS education, so the Malcolm X vs. MLK Jr. debate was driven into my mind quite thoroughly.

[1] https://time.com/6266923/ai-eliezer-yudkowsky-open-letter-no...

Luckily/unluckily I left already due to factors out of my control. Regardless, for all of Google’s faults I will say that they were incredibly serious about data security and respecting consumer data protection laws with strict oversight, so I think “sabotage” in a direct sense would be incredibly hard + risky. The only solution I see is continuing to organize for government regulation. I would include worker organization within Google, but I recently learned they represent less than half a percent of the company…


> Reminds me of Yudkowsky’s line[1] about AI: “we should be willing to destroy a rogue datacenter by airstrike.”

That op-ed reminds me of some short fiction you might like:

https://www.teamten.com/lawrence/writings/coding-machines/


> You can see this in AI. Every story from an AI engineer that ran away from Google is the same. They didn't run away for the money, they ran away because they were getting scared into compliance.

Can you elaborate?


Yeah, what is scared into compliance?


> The latest layoffs are them cleaning house and scaring their workers into compliance, not the death throes of a company in trouble

Really? I have the impression Google’s other tools (I use Docs and Meet a lot) are degrading in quality quite quickly.

That is a subjective judgement, but it seems Google no longer cares.


What is the definition of dead? 15 years later they have a huge majority of traffic share and lots of revenue.

Companies this size die several years before the body hits the floor.

They're dead when everyone starts to hate them and someone says "no, look how much money they're making, they're fine." That's the fatal blow, because they think they're fine, and keep doing the things that make everyone hate them.

At that point you're just waiting for someone else to offer an alternative. Then people prefer the alternative because the incumbent has been screwing them for so long, and even if they change at that point, it's too late because nobody likes or trusts them anymore, and ships that big can't turn on a dime anyway.

You have to address the rot when customers start complaining about it, not after they've already switched to a competitor.


That sounds a lot like Kodak.

I remember running into Kodak engineers, at an event in the 1990s, and they were all complaining about the same thing.

They were digital engineers, and they were complaining that film people kept sabotaging their projects.

Kodak invented the digital camera. They should have ruled the roost (at least, until the iPhone came out). Instead, they imploded, almost overnight. The film part was highly profitable.

Until it wasn't. By then, it was too late. They had cooked the goose.


If they owned the digital camera space like they should have, who’s to say they wouldn’t have eventually released a smartphone. It probably would have been an absolutely incredible camera first, and some mobile internet and phone features second.

One can really dream up a fascinating alternate timeline of iKodak if they didn't shoot themselves in the foot.


And even if they didn’t, maybe it would be Kodak sensors in iPhones instead of Sony sensors. A lot of possibilities.

Note that Nokia was already "great camera, first smartphones".

Sony did a rather short-lived modular camera phone.

It had a magnetic mount, where you could snap on external lenses.

I'm pretty sure they still have some variant of the concept, except that it's an external camera that uses your phone as a viewfinder.


I'm not a Steve Jobs fan, but one business-quote I do like: "If you don't cannibalize yourself, someone else will."

In other words, it could have been better for Kodak as a whole if they allowed their digital-arm to compete more with their film-arm, so that as the market shifted they'd at least be riding the wave rather than under it.


I'm also not a Steve Jobs fan, and this reminds me of how Flash died[1].

The Flash Renaissance was the counter-era to the search despair era we currently find ourselves in.

In the same vein as Kodak, I wonder what the alternate timeline would look like where Adobe cannibalized native apps.

[1] https://youtu.be/65crLKNQR0E?si=mXPgXxlMRxU-xjcu&t=2472


The mistake Adobe made was in canceling Flash instead of open sourcing it. Publish a spec and then let browsers implement the client side, then you can keep selling tools to make animations without everyone having to deal with the bug-riddled proprietary player Adobe clearly had no interest in properly maintaining to begin with.

It's kind of astonishing that all these years later we still don't have something equivalent in browsers. In theory they're Turing-complete and you can do whatever you want, but where's the thing that makes it that easy?


What makes you think people want easy? /s I mean, clearly that would be best for creativity, for cultural robustness, for accessibility. Unfortunately, there are a lot of incumbents in all the spaces Flash touched who were ecstatic (if in a schadenfreude-esque sense) to see the ladder pulled up after them. When you make it difficult or impossible for the peons to create, you make it difficult or impossible for them to bypass the professionals and the gatekeepers; when they can't tell their stories, their stories get told for them. Again, the professionals and the gatekeepers (and, now, the propagandists) find this ideal.

Suffice it to say, there are a lot of people who worked very hard to make sure that the 1998-2012ish period of openness and open-access and democratization was an anomaly. You got to see a mini-echo of this with the rollout and rollback of pandemic-era accessibility.


The just-so story about Kodak is one of those things that bugs me. Kodak did own the digital camera market, stem to stern, for years. They did not ignore it. They did, however, invent all that stuff a little early, before the semiconductor manufacturing technology had matured to the point where it could be a consumer good.

The company imploded because it spent all of its time, attention, and capital trying to become a pharmaceutical factory, starting in the mid-1980s.


Yeah, lots of things happened for a perfect storm of downfall…probably starting with the antitrust breakup of the film processing division.

They did indeed have a huge patent arsenal from all their research efforts that was very valuable. They were also really good at consumer tech - so it’s a shame it didn’t amount to more.


Well, NYSE: EMN is worth $12 B.....

One of the problems was just how profitable film was. No amount of digital camera sales is going to be as profitable as being able to charge people $2 per photo (film+development).

Fujifilm survived by diversifying more into a chemical company than a consumer product company (whereas Kodak sold off those portions of the company as "not being core to consumer imaging" and focused on printers(??))

And yet even Fuji are now back to having traditional film photography being their single largest revenue generator (their instax instant film is now so popular it is chronically sold out and they are doubling factory capacity to keep up)


Any examples of this actually playing out with a company as established as Google? You can read comments like this on many companies... Microsoft (70B$ income), Meta (40B$), Oracle (8.5B$), IBM(7.5B$), SAP (6B$), yet none of them seem to ever actually enter the predicted death spiral.

And the internet isn't new anymore. There is no vast landscape of unexplored new technological possibilities, and no garage start up with an engineering mindset that will just offer a better solution.


IBM used to be bigger than MS, it's a 10th of it today.

But most importantly all the above listed companies with the exception of Meta are those that are heavily ingrained in large companies operations. IBM still provides mainframes, MS has Exchange and Windows domains and is successfully transitioning a lot of customers to Azure, Oracle has their databases and other products, SAP their ERP systems.

Once a non-IT company has its internal IT systems and some legacy working, it is going to be very, very slow in changing them out as long as they work. Companies that provide those systems and reach critical mass are going to have very, very long runways compared to regular B2C companies if a significant portion of their revenue comes from this.

Google has Chromebooks that are used in schools and some GCP usage, but could that save Google long enough if search revenue was cut to a fraction? And GCP is kind of an also-ran today; people looking at larger options usually look at AWS (no. 1) or Azure (Windows legacy).


In 2023 the revenues of Google Cloud, Youtube Ads and "Google Other" and Google Network Members Ads were 130B combined.

If they could reduce headcount and operating expenditures to 2019 levels without losing that, they would be roughly breaking even without any search. They also have 280B$ in equity to tide them over.

When Google actually sees its business failing, it will have many many many chances to turn things around.


Microsoft and Meta reinvented themselves a few times over. At this point Windows is just a legacy business unit for instance, and Meta literally changed its name to mark the turn.

Oracle, IBM and SAP have the advantage(?) of being heavily business focused from the start, and I don't see them ever dying a natural death in our lifetimes. As long as they have the money to outbribe the competition they'll be there, and it will require a small miracle to break that loop.


The one thing that has kept Microsoft afloat is their business oriented part. They are deeply entrenched in any company that needs to use Office and only ever hires Windows admins who won't even look beyond Windows. That is pretty much every non tech small to medium company. When things were shifting to the cloud they were smart enough to make sure it would be their cloud, locking customers even deeper into their own technology.

Anything else they do is a bonus.


To add to this, Microsoft is really really good at understanding businesses, in a way Google will probably never be I think.

Having on-premise hosting options for Exchange and all their core services is an example of that, even as they're also pushing for 365 in the cloud. I remember them being earlier than GCP to deal with GDPR and the in-EU requirements as well, but my memory might be failing.


They're starting to lose the thread though.

People use Windows at home and at school and then employers use the same thing because they don't want to retrain people. But the home versions of Windows are becoming so malevolent that they're losing market share. Meanwhile all the things that used to require Windows are becoming web pages and phone apps. You go to a university and it's full of Macbooks and if you see a PC in the CS department there's a good chance it has Linux on it. These are the people who will be choosing what to buy in a few years.

But who cares about the clients anymore, right? They're making money from cloud services. Except their hook is getting people to use Active Directory and Microsoft accounts, which are the things for managing Windows client devices.

It's going to be a while before anybody convinces the accountants to stop using Excel, but for large swathes of employees Windows is no longer relevant, and if you don't need Windows then why do you need Azure instead of AWS or any of the others?


> if you don't need Windows then why do you need Azure instead of AWS or any of the others?

I don't have enough insight, but there's more to it than the Windows/Microsoft services tie-up. It's clearly not the ease of use for small customers; it could be the contract making, or something else that makes it a better deal for businesses beyond just the cost bundling.

For instance I remember Apple hosting iCloud on Azure. And there's a few other big players going with Microsoft, especially retail chains who can't touch anything Amazon, and don't trust Google.


It's the ease of use for medium customers. Large customers have Linux servers with full-time staff to write custom code and do whatever they want because they have their own resources; Facebook doesn't use Azure. Small customers buy a Macbook or Chromebook or tablet and have a gmail address and host their website on WordPress or one of those awful (but easy to use) web host proprietary site builders.

Medium businesses are big enough to want to have their own email domain but not big enough to want to implement their own spam filter, so they turn to the likes of Amazon and Google and Microsoft. Then Microsoft's advantage is they can manage and integrate with your Windows devices. Otherwise they're just doing price competition with every other hosting company. People who aren't even using Active Directory start to wonder why they should pay extra for SQL Server instead of using MariaDB on Linux, and in turn why they shouldn't put that VM on AWS unless Microsoft cuts them a better deal. (Which is presumably what happened with Apple, but offering long-term discounts is not how you make a lot of money.)


Apple spends >$2b/yr with Google to host iCloud data on Google Cloud.

Moreover, it's increasingly easy each year for companies to support BYOD and let employees procure whatever they want that meets IT requirements. My current employer gave all non-tech staff $2000 to buy themselves a laptop, which was then enrolled in some fleet management systems with almost a single click.

Frankly, I see very few people choosing Windows anymore.

Also, another point to add: Microsoft's Intune fleet management system is perfectly capable of managing Macs, and you can use AD as your IDM source of truth for just about anything, including SSO for Google Workspace & ChromeOS devices.

To your last point, Windows Server is a hard requirement in many enterprises because of legacy or procured software that requires it. That is entirely separate from end user computing.

(I used to run end user computing for an F500, and I also ran the Enterprise Apps org at the same time. This was from about 2008-2015, and initiatives included mass migrations away from MS Office to Workspace, and replacing thousands of Windows laptops with Chromebooks.)


The point is, if Microsoft managed, why wouldn't Google be able to reinvent itself?

I think many of us are underestimating Microsoft because of how crappy Windows is and keeps being.

But as a business entity they've been ferocious from the start, and succeeded through sheer perseverance where Google gave up after some tepid tries.

Xbox would have been killed by Google in the first year. Exchange would have stayed in beta for a decade, and Office365 would have had no support if it was in GSuite.

If Google were to find a way, I think they'd need a radically different approach, as I don't see them ever fixing their focus problem.


I think that's a valid point. Maybe Google culturally will not be able to turn around. Not because of a crappy product, but because of a lack of focus.

That said, Google is still printing money and increasing profits and revenues. Nothing like falling profits (or even losses) to create some pressure to focus. DEC would be the example of a company that failed to do so.


Reinventing yourself because you imploded your primary market is still an own-goal. If you can capture a new market then you could have had both. And what if the primary market collapses first?

AT&T, GE, AOL, Yahoo, Sony technology (they are a media company now, but they did used to make things that weren't a game console), Time Warner, SGI, Compaq, 3DFx, DEC...

Not only that, most of the other examples are just not at the end of their death spiral yet. Take a look at Windows market share, it's down 20% over the last 10 years:

https://www.statista.com/statistics/218089/global-market-sha...

And that's just desktop. Microsoft ceded the entire mobile market, which in turn now represents the majority of devices. The majority of the company's profits no longer come from selling Windows and Office. If they hadn't pivoted into a new line of business (Azure) they'd be on a trajectory to impact with the ground.

IBM has been bleeding customers -- and business units -- for decades. Their stock is flat, not even keeping up with inflation, compared to +300% over the last decade for the overall market. And they have no obvious path to redemption.

Oracle is kind of an outlier because of the nature of their business. Their product has an extraordinarily high transition cost, so once you're locked in, they can fleece you pretty hard and still not have it cost more than the cost of paying database admins high hourly rates for many hours to transition to a different database. Then they focus their efforts on getting naive MBAs to make a one-time mistake with a long-term cost. Or just literal bribery:

https://www.cnbc.com/2022/09/27/sec-fines-oracle-23-million-...

And even with that, their database market share has been declining and they're only making up the revenue in the same way as Microsoft through cloud services.

Meta isn't a great example because people just don't hate them that much. Facebook sucks but in mostly the same ways as their major competitors, they're still run by the founder and they do things people like, like releasing LLaMA for free.


All of the companies I cited are hugely profitable. They might not be as large as they once were, or as important, but a business that has non-declining net income in the billions is not in a death spiral. IBM has shrunk a lot, but except for the financial crisis in the 90s, they have been profitable every year, and profits are roughly flat since 2017.

This is certainly a completely different picture than Yahoo for example.

And your argument for Microsoft is that they are in a death spiral because they only have 70% of market share on the desktop, and are shrinking by 2% per year, so in, uh 15 Years they might only have 50% of the market share! Also, please ignore that they successfully diversified their revenue streams to other markets (Cloud).

And your evidence is that they failed to capture the mobile market. While you also argue that Google is in a death spiral when Google is actually the company that won the mobile market.

I think you might be using the term death spiral in an unconventional way here.


> All of the companies I cited are hugely profitable.

You cited them because they are hugely profitable, ignoring the ones that are already defunct. And the entire premise is that a company can simultaneously be posting profits while doing the thing that will ultimately destroy them.

> And your argument for Microsoft is that they are in a death spiral because they only have 70% of market share on the desktop, and are shrinking by 2% per year, so in, uh 15 Years they might only have 50% of the market share!

Platforms have a network effect. They're doing so poorly that the network effect from having 90% market share isn't enough to prevent them from losing market share. But now they only have the network effect from 70% market share, which makes it even easier for customers to switch. That's how you get a death spiral.

> Also, please ignore that they successfully diversified their revenue streams to other markets (Cloud).

Which are in turn dependent on customers using Windows so they need Active Directory etc. See also:

https://news.ycombinator.com/item?id=40142351

> And your evidence is that they failed to capture the mobile market. While you also argue that Google is in a death spiral when Google is actually the company that won the mobile market.

It is unquestionably the case that Microsoft lost the mobile market, which is the larger market. Android has the most worldwide market share, but Android is free to use and generates revenue for Google only to the extent that people want their services. If people stop wanting their services and switch to e.g. another search engine, how does it save Google from this even if they're using Android?


Statista is locked behind a paywall.

Yeah, it's a pain in the butt. It often shows you the graph and then you try to show the link to someone else and it tries to get them to swipe their card as if anybody is going to do that. Meanwhile it ranks highly in Google search results instead of some other site that contains the same information without the bait and switch, because Google has completely lost the ability to produce quality search results.

Maybe it's time to switch to a competitor.


AT&T: 15B$ net income, world's largest telecom company. #13 on the Fortune 500.

GE, while a reasonable example of a company that declined severely from its peak, was still generating 9B$ in income in 2023 before being split into better-focused and profitable successors.

AOL/Yahoo were never dominant in a mature market. They were early to the Internet, but this was an uncharacteristically volatile time with an exponentially growing market.

Sony is also a leading manufacturer in several tech sectors (second largest camera, largest premium TVs). 6B$ net income and rising.

3DFx was never dominant in a mature field but, again, early in a nascent one. They collapsed quickly, not through some highly profitable extended death spiral.

Compaq was never dominant in a highly profitable field. Their market share peaked at 14%.

DEC might be a genuine example, they were never the top of the field, but they did not manage to adapt and turn things around when the world moved in a different direction. Compare to IBM who _were_ in a dominant position in the same field, and have leveraged that position into a sustainable and steady, if smaller and less groundbreaking, business.

Google might be in trouble (relatively speaking) if LLMs disrupt search, but they are not close to being in trouble from being outcompeted in search itself.


Are you really doing this?

AT&T: today is not AT&T. The name was bought. It used to be Cingular.

GE: so your point is that it is a good example.

AOL/Yahoo: A 'mature market'? Are you making up rules so you can disqualify them?

Sony today only innovates in image sensors. They are a financial and entertainment company. Who cares if they sell the most 'premium TVs', this is the company that created (off the top of my head) Betamax, CDs, DVDs, Minidiscs, Trinitrons, and made the best consumer tech in the world -- consistently.

3Dfx was the leader of an industry that is now led by nVidia. That industry wasn't as big then, but everyone knew it would be and it was theirs to lose.

Compaq was the market leader in PC sales in the 90s.

DEC: so, it is a good example.


I used the term "as established as google". In my mind this certainly meant the market has to be established. As long as an industry is brand new and rapidly developing, things are obviously different. Many early market leaders didn't make it in the internet. But in the last ten years, market leaders haven't been failing in the same way.

So no, not changing the rules, but maybe clarifying the point. Situations such as the rise of the internet in the late 90s and early 2000s are the anomaly, not the rule.

Operating systems and Internet search are roughly the same as they were ten years ago. 3D accelerator cards changed immensely in the years when 3dfx failed. Microsoft and Google are not in businesses where younger agile companies that read the changing tides better can quickly supplant them.

And that's why they get a thousand chances to turn things around while printing money with their "death spiralling" business.


Your question is effectively answerable as 'no' if you want to limit it to exactly google like market positions, because they haven't existed before. I was answering with examples of market leaders that fell due to bad top-down decisions.

Symantec comes to mind.

I know they aren’t the same scale as Google, but what you wrote really describes Atlassian for me.

While I totally agree that Atlassian products are terrible and steadily getting even worse, I'm not sure they are going anywhere anytime soon given their disconnect between users and customers. Most people who have to suffer their products have no say in the purchasing decision, and the company does a somewhat better job of appealing to the relative small group that does. Atlassian could very well have Oracle-like staying power.

That also sounds a lot like Blockbuster.

Google continues generating profits out of inertia and a lack of a better alternative.

It went from “don’t be evil” to “a necessary evil” (just until something a little better appears).


I think they have just attained median levels of evil now.

The bigger the behemoth, the slower the fall.

You know how a chess player will say something like "mate in 6" because their experience of all the options left to their opponent are both easily countered and will not prevent them from losing? Companies, and tech companies in my experience, get into death spirals due to a combination of people, culture, and organization. Pulling out of one of those is possible but requires a unique combination of factors and a strong leadership team to pull off. Something that is very hard to put into place when the existing leadership has overriding voting power. You can look at GE, IBM, and to some extent AT&T as companies that have "re-invented" themselves or at least avoided dissolution into an over marketed brand.

I have a strong memory of watching a Jacques Cousteau documentary on sharks and learning that Sharks could become mortally wounded but not realize it because of how their nervous system was structured. As a kid I thought that was funny, as an engineer watching companies in the Bay Area die it was more sobering.

If you have read the article, I think Gomes was right and saw search as a product, whereas Raghavan saw it as a tool for shoveling ads. A good friend of mine who worked there until 2020 wouldn't tell me why they left, but acknowledged that it was this that finally "ruined" Google.

Their cash cow is dying, I know from running a search engine what sort of revenue you can get from being "just one of the search engine choices" versus the 800lb gorilla. Advertisers are disillusioned, and structurally their company requires growth to support the stock price which supports their salary offerings. There is a nice supportable business for about 5,000 - 8,000 people there, but getting there from where they are?

My best guess at the moment is that when they die, "for reals" as they say, their other bets will either be spun off or folded, their search team will get bought by Apple with enough infrastructure to run it, Amazon or someone else buys a bunch of data centers, and one of the media companies buys the youtube assets.


> You know how a chess player will say something like "mate in 6" because their experience of all the options left to their opponent are both easily countered and will not prevent them from losing?

As a chess person, saying "Mate in _" means it's a calculated inevitability. There is no mathematical way out of it.

It is not nearly equivalent to the outside judgement of a company with so many factors — it's just incomparable.


I don't disagree, chess is much more algorithmic and predictable. Maybe it is more like seeing your best mate of the last 20 years getting into their fourth or fifth relationship with the same kind of partner they failed with before and thinking, "Seen this movie before, it is not gonna work out." No algorithms, just you know how your friend sabotages themselves and you also know they can't (or won't) look critically at that behavior, and so they are doomed to fail again.

But I can guarantee you that Google employees are reading these comments and saying "Wow, this guy is totally full of it, he doesn't know about anything!" and for some of them that thought will arise not from flaws in what I and others are saying, but in the uncomfortable space of "if this is accurate my future plans I'm invested in are not going to happen..., this must be wrong." I have lived in that space with an early startup I helped start, when I went back and worked on the trauma it had caused me it taught me a lot about my willingness to ignore the thinking part of my brain when it conflicted with the emotional part.

You have to do some of that to take risks, but you also have to recognize that they are risks. Painful lesson for me.


Yes, but there are other positions that do fit the comparison, like a couple of advanced passed pawns that can still be defended against with surgical precision, but most times are lethal.

Again, I think there is a misunderstanding of what the saying is used for.

In chess, it's specifically used for saying "even with the best defense possible, you will be mated no matter what in a maximum of X moves." Computers use this definition as well. If Stockfish says # in 6, that means there is an indefensible path to mate available, and with the best play by the opponent it will take 6 moves.

It's not a "Mate in X, probably."


Chuck, curious if you have ever posted here on what happened to Sun Micro. Love to read your take on it.

I don't think so. At one of the Sun Reunion events a bunch of us sat around and talked about it. I suggested someone should write a companion volume to "Sunrise: The first 10 years of Sun" called "Sunset: The last 10 years of Sun." But as far as I know nobody followed up (if they did they didn't reach out to me for my take)

Quit teasing. Give us a taste, then. [:)]

#E#M#A#I#L#I#N#B#I#O#

:-)


will do, sir.

With Google, I always feel like the side hustles (Waymo, X, etc.) really exist to be sold off in the future to prop up the ad/search business and ensure future profitability. Everything that is not ads/search is on that list, and anything shut down despite being useful isn't seen as future-sellable.

Google today is starting to smell of future financial engineering games, like when a car maker earns more through financing than selling core product.


fwiw, there are approximately 25,000 FTEs reporting up to Thomas Kurian, and I'm not sure how many thousands of TVCs. That's just for Cloud, and doesn't include the massive numbers of additional, relevant employees who directly support Cloud from within TI. Part of Google's problem is that it's so big and so broad, and has always insisted on a monorepo for internal source code, and it's outsourced to vendors as much as possible, that it's nearly impossible to disentangle any one business unit from the next. I predict that if the FTC or the EU seriously try to break up the company, this will be their argument against it.

The majority of that revenue comes from violating data protection law and regulators and litigants are slowly racking up a series of wins which will gut ads margins.

There is no Plan B, they are just going to break the law until they can’t and there’s zero clue what happens after that.

They sat back and let OpenAI kick their ass precisely because ghouls like Prabhakar call the shots and LLMs are not a good display ads fit.

The best parallel for Google is Kodak.


Dead in the same sense that IBM was dead in the late 90s, but it is not quite there yet I would say.

Number of HN complaints per day posted.

How did the slashtag feature work and what did it do? It seems like an interesting concept, but sadly the site is dead. What happened to it?

People would add sites for a particular topic (aka slashtag) to their list. That would build a virtual custom search engine within the search engine. And topic specific searches thrown at it would consistently out perform Bing and Google in terms of search quality. The meta "spam" slash tag (everyone got their own) would let you tell the engine sites you never wanted to see in your search results so if you were tired of your medical queries being spammed by quacks, add them to your spam list and they wouldn't be in your results.
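To make the mechanism concrete, here is a minimal sketch of slashtag-style filtering as described above. It is not Blekko's actual code: the function names, the dictionary layout, and the example domains are all made up for illustration, and it assumes the engine has already produced a ranked list of result URLs.

```python
from urllib.parse import urlparse


def host(url):
    """Return the lowercase hostname of a URL, without a leading 'www.'."""
    h = (urlparse(url).hostname or "").lower()
    return h[4:] if h.startswith("www.") else h


def slashtag_search(results, slashtags, query_tag=None, spam=frozenset()):
    """Filter ranked result URLs (hypothetical helper, not a real API).

    results   -- ranked list of URLs from the underlying engine
    slashtags -- dict mapping a topic name to a set of curated hostnames
    query_tag -- topic to restrict the search to, or None for no restriction
    spam      -- the user's personal set of hostnames to always hide
    """
    allowed = slashtags.get(query_tag, set()) if query_tag is not None else None
    kept = []
    for url in results:
        h = host(url)
        if h in spam:
            continue  # personal spam list always wins
        if allowed is not None and h not in allowed:
            continue  # outside the curated list for this topic
        kept.append(url)  # original ranking order is preserved
    return kept


slashtags = {"health": {"nih.gov", "mayoclinic.org"}}
spam = {"miracle-cures.example"}
results = [
    "https://miracle-cures.example/detox",
    "https://www.nih.gov/health-information",
    "https://blog.example/my-diet",
    "https://mayoclinic.org/diseases",
]

print(slashtag_search(results, slashtags, "health", spam))
print(slashtag_search(results, slashtags, None, spam))
```

With the "health" tag only the two curated sites survive; with no tag, everything except the spam-listed site is kept.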

FWIW, I've wanted things like that for so long. I'm sad that I never even heard of Blekko.

Why did it shut down?

Ultimately, lack of traffic. During Blekko's lifetime Google went from paying people less than $10M/quarter to send their search traffic to Google to over $4B/quarter to do that. If you are ad based you need traffic to serve ads to.

At some point a pay for search model might emerge that has a big enough audience to support a company but that time is not yet here.


1. Does that mean Blekko was something similar to millionshort? 2. Was Blekko capable of tackling the SEO sites and blogspam that we have today, or did it have the advantage of the low spam site count of the old web? 3. Does it have a chance of coming back, like how Yahoo has recently been hinting at a comeback? 4. A stupid question: how much would it cost to build a Blekko today?

Not sure on #1, definitely mitigated SEO sites and blogspam (on an individual basis, if you added a spammy blog to your personal spam list it stopped showing up in results). As a result on slashtag searches there was very little spam.

Would it come back? As it was? No. The folks at Bing used some of the techniques to mitigate some spam in Bing results but didn’t implement slashtags.

It would cost between $3M and $6M to go from scratch to developing a 3 billion page index with a 10 billion page crawl ‘frontier’. You can seed the crawl with Common Crawl. If you can get a $10 RPM ($10 per thousand queries) and roughly 10M queries/day (so $10k/day recurring revenue) you can run an operationally cash flow positive business. You would want to grow it organically to a 10 billion page index on a 100 billion page crawl, which would cover 90+% of English-language queries. With clever optimizations (like a news sieve to only index pages about the news that made ‘sense’) you might improve efficiencies. You would also want to focus on reference applications (people who use search to get their job done) for paid subscriber growth, and simpler commercial partnerships for managing ad lead generation on commercial search (people looking for products or services).
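
The revenue arithmetic in the paragraph above can be checked with a short sketch. RPM means revenue per thousand queries, as stated there; the function names are made up for illustration. Note that the quoted figures do not quite agree with each other: at a $10 RPM, 10M queries/day works out to $100k/day, while $10k/day corresponds to 1M queries/day, so one of the two numbers is presumably off by a factor of ten.

```python
def daily_revenue(queries_per_day: float, rpm_dollars: float) -> float:
    """Dollars per day, given revenue per thousand queries (RPM)."""
    return queries_per_day / 1000 * rpm_dollars


def queries_needed(target_daily_revenue: float, rpm_dollars: float) -> float:
    """Queries per day needed to hit a daily revenue target at a given RPM."""
    return target_daily_revenue / rpm_dollars * 1000


# Reading 1: take the 10M queries/day figure at face value.
print(daily_revenue(10_000_000, 10))   # 100000.0 dollars/day

# Reading 2: take the $10k/day figure at face value.
print(queries_needed(10_000, 10))      # 1000000.0 queries/day
```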

Also, you would need to be an advertising ‘primary’ (not taking feeds from networks on a revenue share model). So, for example, working directly with Amazon to both efficiently access their internal product index and to surface it on commercial queries. Note that companies like Amazon run their own advertising business on their own index, so you compete with that to some extent, and navigating that early is essential.

Certainly doable, but not something that your typical venture fund would go for. It would have a longer payback time (lower internal rate of return) than VCs look for.


The 2010-2013 timeline was also when the problem of ad fraud exploded. Google even acquired a company (or multiple, if I reckon correctly: https://www.ft.com/content/352c7d8e-9acc-11e3-946b-00144feab... ). You had these companies popping up left and right that were snooping on Google and the emerging programmatic advertising environment to see if the websites and the traffic delivered were legit, and there were some scary numbers flying around.

The whole problem kind of got swept under the rug with most advertising ecosystems implementing a checkbox solution for clean traffic, and the web turned mobile user first.

My impression is that ad fraud never disappeared - it just got sanitized and rolled in with the other parts of the ad stack.


Exactly.

How much of (online) advertising is legit? Does anyone know?

What would a "healthy" ad ecosystem look like? What should the FTC (and advertisers) be working towards?

Eliminate any potential conflicts of interest? Bust up vertical integration (eg search & ads must remain separate companies)? Independent verification, as best able (eg like Nielsen does for ratings)?

Or maybe we determine that (digital) ad-based biz models are irredeemable, and we figure something else out.


1. Do you know what caused it? 2. What did the hostility look like? 3. How did you circumvent them? 4. What did your search service do?

I don't know what caused it but I suspected at the time, and still do, that it was simply business people getting more involved in order to drive growth.

The hostility was simply this. One day we had a dedicated high level Google engineer helping us out and giving us guidance (and even special tags) to get the data we needed in a cost effective manner for both Google and us. The next day, he was gone and we received demands to know exactly what we were doing, why and even sensitive information about our business. After several months of such probing, we were summarily told that the access we had was revoked and that there was no recourse.

We circumvented by setting up thousands of unique IP addresses in 50+ countries throughout the world and pointing our spiders at Google through them (same as they do to everyone else). These were throttled to maintain very low usage rates and stay off the radar. We continually refilled our queues with untouched IPs in case any were ever blacklisted (which happened occasionally).

As for what we did, we sampled ads for every keyword under the sun, aggregated and analyzed them to find out what was working and what wasn't. This even led to methods for estimating advertiser budgets. At one point, we had virtually every Google advertiser and their ongoing monthly spend, keywords and ad copy in our database. Highly valuable to smart marketers who were looking for an edge.


I enjoy reading this chap's stuff. It's not the way that I would write, but he's got a much broader audience than I do, so he obviously is meeting the needs of the reading population.

I do feel that I can't argue with his stuff, although it is very dark and cynical (and, truth be told, I have a lot of dark and cynical, in me, as well, but I try not to let it come out to play, too often).


Before 2019, the year most people who had issues with Google Search gave as the last one it was decent was 2012, so that tracks.

How many companies have management consultants taken down? It's quite amazing how bad they are at anything. Peter Thiel's hatred for consultants is really legit.
