Reddit is suing corporations SerApi, OxyLabs, AWMProxy and Perplexity for allegedly scraping its knowledge from search outcomes and utilizing it and not using a license, The New York Times reports. The brand new lawsuit follows legal action towards AI startup Anthropic, who allegedly used Reddit content material to coach its Claude chatbot.
As of 2023, Reddit charges companies trying entry to posts and different content material within the hopes of getting cash on knowledge that may very well be used for AI coaching. The corporate has additionally signed licensing offers with corporations like Google and OpenAI, and even constructed an AI answer machine of its own to leverage the information in customers’ posts. Scraping search outcomes for Reddit content material avoids these funds, which is why the corporate is in search of monetary damages and a everlasting injunction that stops corporations from promoting beforehand scraped Reddit materials.
A few of the corporations Reddit is concentrated on, like SerApi, OxyLabs and AWMProxy, will not be precisely family names, however they’ve all made accumulating knowledge from search outcomes and promoting it a key a part of their enterprise. Perplexity’s inclusion within the lawsuit could be extra apparent. The AI firm wants knowledge to coach its fashions, and has already been caught seemingly copying and regurgitating materials it hasn’t paid to license. That additionally consists of reportedly ignoring the robots.txt protocol, a manner for web sites to speak that they do not need their materials scraped.
Per a replica of the lawsuit offered to Engadget, Reddit had already despatched a cease-and-desist to Perplexity asking it to cease scraping posts and not using a license. The corporate claimed it did not use Reddit knowledge, however it additionally continued to quote the platform in solutions from its chatbot. Reddit says it was capable of show Perplexity was utilizing scraped Reddit content material by making a “take a look at publish” that “might solely be crawled by Google’s search engine and was not in any other case accessible anyplace on the web.” Inside a number of hours, queries made to Perplexity’s reply engine had been capable of reproduce the content material of the publish.
“The one manner that Perplexity might have obtained that Reddit content material after which used it in its ‘reply engine’ is that if it and/or its co-defendants scraped Google [search results] for that Reddit content material and Perplexity then shortly included that knowledge into its reply engine,” the lawsuit claims.
When requested to remark, Perplexity offered the next assertion:
Perplexity has not but acquired the lawsuit, however we are going to at all times battle vigorously for customers’ rights to freely and pretty entry public information. Our strategy stays principled and accountable as we offer factual solutions with correct AI, and we won’t tolerate threats towards openness and the general public curiosity.
This new lawsuit suits with the aggressive stance Reddit has taken in the direction of defending its knowledge, together with rate-limiting unknown bots and net crawlers in 2024, and even limiting what entry the Web Archive’s Wayback Machine has to its website in August 2025. The corporate has additionally sought to outline new phrases round how web sites are crawled by adopting the Really Simple Licensing standard, which provides licensing phrases to robots.txt.
Trending Merchandise
Wireless Keyboard and Mouse Combo, Lovaky 2.4G Full-Sized Ergonomic Keyboard Mouse, 3 DPI Adjustable Cordless USB Keyboard and Mouse, Quiet Click for Computer/Laptop/Windows/Mac (1 Pack, Black)
Acer KB272 EBI 27″ IPS Full HD (1920 x 1080) Zero-Body Gaming Workplace Monitor | AMD FreeSync Know-how | As much as 100Hz Refresh | 1ms (VRB) | Low Blue Mild | Tilt | HDMI & VGA Ports,Black
Acer Nitro KG241Y Sbiip 23.8â Full HD (1920 x 1080) VA Gaming Monitor | AMD FreeSync Premium Technology | 165Hz Refresh Rate | 1ms (VRB) | ZeroFrame Design | 1 x Display Port 1.2 & 2 x HDMI 2.0,Black
ASUS RT-AX55 AX1800 Twin Band WiFi 6 Gigabit Router, 802.11ax, Lifetime web safety, Parental Management, Mesh WiFi assist, MU-MIMO, OFDMA, 4 Gigabit LAN Ports, Beamforming
Samsung 32-Inch Odyssey G55C Collection QHD 1000R Curved Gaming Monitor, 1ms(MPRT), HDR10, 165Hz, AMD Radeon FreeSync, Eye Care, LS32CG550ENXZA, 2024
CORSAIR 6500X Mid-Tower ATX Twin Chamber PC Case – Panoramic Tempered Glass – Reverse Connection Motherboard Suitable – No Followers Included – Black
