Google Is the Only Search Engine That Works on Reddit Now Thanks to AI Deal

https://www.404media.co/google-is-the-only-search-engine-that-works-on-reddit-now-thanks-to-ai-deal/

Google is now the only search engine that can surface results from Reddit, making one of the web’s most valuable repositories of user generated content exclusive to the internet’s already dominant search engine.

If you use Bing, DuckDuckGo, Mojeek, Qwant or any other alternative search engine that doesn’t rely on Google’s indexing and search Reddit by using “site:reddit.com,” you will not see any results from the last week. DuckDuckGo is currently turning up seven links when searching Reddit, but provides no data on where the links go or why, instead only saying that “We would like to show you a description here but the site won’t allow us.” Older results will still show up, but these search engines are no longer able to “crawl” Reddit, meaning that Google is the only search engine that will turn up results from Reddit going forward.

(you should register at 404media, they have great content!)

Reddit robots.txt:

# Welcome to Reddit's robots.txt
# Reddit believes in an open internet, but not the misuse of public content.
# See https://support.reddithelp.com/hc/en-us/articles/26410290525844-Public-Content-Policy Reddit's Public Content Policy for access and use restrictions to Reddit content.
# See https://www.reddit.com/r/reddit4researchers/ for details on how Reddit continues to support research and non-commercial use.
# policy: https://support.reddithelp.com/hc/en-us/articles/26410290525844-Public-Content-Policy

User-agent: *
Disallow: /

Indexing is fully blocked.

EDIT: Just to clarify for those who think that robots.txt is not a requirement:
Reddit holds the copyright for distribution of all content that is created on Reddit.
And yes, technically it’s not blocked.
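
For anyone wondering what those two rule lines actually do, here is a minimal sketch using Python's standard `urllib.robotparser`. It assumes the effective rules are just `User-agent: *` followed by `Disallow: /`; the `allowed` helper and the example URLs are only for illustration. Note that robots.txt is advisory: it tells well-behaved crawlers what they may fetch, it does not technically stop anyone.

```python
from urllib.robotparser import RobotFileParser

# The two effective rule lines from the robots.txt quoted above
# (assumption: comments stripped, nothing else in the file matters).
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(agent: str, url: str) -> bool:
    """Return True if a crawler that honors robots.txt may fetch the URL."""
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    # Illustrative user agents; the file has no per-agent sections,
    # so every compliant crawler gets the same answer.
    for agent in ("Bingbot", "DuckDuckBot", "Googlebot"):
        print(agent, allowed(agent, "https://www.reddit.com/r/python/"))
```

Going by this file alone, every compliant crawler is told to stay out, Googlebot included, so whatever lets Google through is not visible in robots.txt itself.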

Is this supposed to prevent AI training?

That’s easy, scrape google which scrapes reddit.

Fuck u/spez

“Front page of the internet google”?

This message is good: “if you are using an automated agent to access Reddit, you need to abide by our terms and policies, and you need to talk to us.” In the face of the content pillagers, the likes of Microsoft, OpenAI, and the other million AI startups that have zero regard to ethics, going from a blocklist approach to an allowlist approach is reasonable.

The implementation absolutely is not, however. If the end result is “we literally block everyone but Google” then you’re actively making things worse. This could’ve been a chance to provide a contrast between search engines that stay out of the LLM craze and those who don’t.

He [Reddit spokesperson Tim Rathschmidt] said that Reddit is blocking all crawlers unwilling to commit to not using crawl data for AI training, and that Reddit has “been in discussions with multiple search engines. We have been unable to reach agreements with all of them, since some are unable or unwilling to make enforceable promises regarding their use of Reddit content, including their use for AI.”

That’d be nice if it were true, but

  • You’re allowing fucking Google, which does fucking use crawl data for AI training; you got them to pay you for exactly that. If money can justify misuse then would fucking OpenAI be allowed if they get in a deal with you? They’ve already partnered with StackOverflow, so it’s not like they’re not willing to pay.
  • You’re protecting UGC and selling it as if it’s your possession. You’re just interested in being paid for your users’ contributions while providing nothing to users, while using the fight against misuse of public content as a fucking excuse.

Man, if only we had like a division of the government that was supposed to crack down on anti-consumer practices and big companies throwing their monopolistic weight around….

This… is an anti-trust lawsuit begging to be had. Yes sites are allowed to control that but when you publicly have a deal with Google for training AI AND limit search crawling to just them, that harms competition in search and thus the public.

More insane recipes coming soon.

On a side note Google is not even usable without Reddit anymore. The results are pure trash if you don’t add “Reddit” to your query.

Report this to your state attorney general. This shouldn’t be legal, it’s anti-competitive

This doesn’t seem like a very anti-trust action, Google

How hard does u/spez need to monetize you before you switch to Lemmy?

EDIT: For those who don’t know what Lemmy is, it’s an open source Reddit-like forum that anyone can host on a server (see its Wikipedia and GitHub pages). Since servers share a protocol (a user registered on one server can subscribe to forums on any other server, and can have discussions with users registered elsewhere) and can be hosted by anyone, it’s a truly decentralized version of Reddit that no one person/group can control.

We need reddit competitors

Bing is in shambles rn

Google is the only search engine I use but fuck monopolies.

Bing for porn.

Google for anything else.

OK, time to stop with making content for free

Qwant still works for me

Part of me wants to believe this will be the straw that finally kills reddit for good, and brings us closer to a decentralized internet like the ‘good old days’. But it is really just more monopolistic enshittification.

It’s a good time to remember that federation exists. Lemmy is a fantastic place that replicates the reddit experience, yet seamlessly spans multiple sites. Federation really is the obvious next evolution of the internet, if only it wasn’t getting dominated by massive advertising budgets and the modern insular web. Web companies don’t want you leaving their platforms, so they make it harder and harder to even link elsewhere.

Is this going to spur an antitrust investigation? I think so.