I need help :/ I built an enormous Google-sized solution and I’m in search of a problem

thevoiceoftruth

TLDR: if you knew every product sold on the internet, what would you do with that info?

Backstory-
I have spent the last 6-12 months building a whole series of web crawlers which crawl and collect 90% of all products on the web. Sounds far-fetched, right? Well, it’s true, and I’m able to do this because I have built up a very large homelab to support it all. For the technical folks: a petabyte of HDD storage, 100 TB of flash, 160 cores, 3 TB of DDR4, four 3090 GPUs, and 10Gb SFP networking. It cost me $15k.

In total it’s about 900 million product pages across 2 million websites, and that does not factor in new products from Amazon or eBay, which I can pull in. It also doesn’t include Alibaba, AliExpress or other country-specific marketplaces.

I’m able to do price watching, availability monitoring, product grouping, image similarity search and many other things with this data.
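
As an illustration of the price watching and availability monitoring mentioned above, here is a minimal sketch that compares two crawl snapshots of the same product page and reports what changed. The `Observation` type, the event names, and the 5% drop threshold are assumptions made for this example, not details from the post.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Observation:
    """One crawl snapshot of a product page (hypothetical shape)."""
    product_url: str
    price: float
    in_stock: bool


def detect_changes(previous: Observation, current: Observation,
                   min_drop_pct: float = 5.0) -> List[str]:
    """Return event names describing what changed between two snapshots."""
    events = []
    # Availability monitoring: report a change in stock status.
    if previous.in_stock != current.in_stock:
        events.append("back_in_stock" if current.in_stock else "out_of_stock")
    # Price watching: report drops of at least min_drop_pct percent.
    if previous.price > 0:
        drop_pct = (previous.price - current.price) / previous.price * 100
        if drop_pct >= min_drop_pct:
            events.append("price_drop")
    return events


old = Observation("https://example.com/onesie", 20.0, False)
new = Observation("https://example.com/onesie", 15.0, True)
print(detect_changes(old, new))
```

A real pipeline would also need to decide which snapshots to compare and how to handle currency or variant changes; this only shows the comparison step.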

I have tested this out using Cloudflare tunnels, hosting it all out of my house. The net result is that a user can perform any of the above searches and get results and images within 2 seconds.

Why did I build it-
Originally I started this because I was looking for some very specific baby clothes (a raccoon baby onesie). I couldn’t find anything on Amazon. eBay didn’t have what I wanted, and I questioned the quality of stuff shipped from China. Etsy didn’t have the styles I wanted and was pricey. I ultimately went through 10 pages on Google until I found a store that featured what I wanted. It would have been great if I didn’t have to spend 4 hours finding it.

My problem-
Google and others have a renewed focus on their shopping tools, and they have implemented some of the features I originally thought made me unique. This was to be expected; I just didn’t think they would move so quickly.

What I need help with-
I need some ideas from other folks in this sub about what they would do with this data. I have a strong preference for providing a platform for everyday people to use, but I want to put users first and don’t want to bastardize it with sponsored listings/ads; maybe one ad per page or every 5 minutes but nothing more, as I’m anti-enshittification. I could also go the B2B SaaS route, but there are so many different ones out there that I’m not sure where the fit would be.

So the question is, if you had the details of every product on the internet, what business would you start to leverage that info?
 
@jord Create an API and it’s something we’d pay for. Companies like BrightData, for example, sell their database of 267M Amazon products for $400k USD.
 
I already crawl similar data, and there is more valuable information in such data than just the pricing. Clients will find you, I guess.
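
For reference, the per-record price implied by the quoted figure ($400k USD for 267M products) can be worked out directly. This is plain arithmetic on the numbers in the reply; the function name is made up for the example, and the result says nothing about how such datasets are actually priced or negotiated.

```python
def price_per_record(total_price_usd: float, record_count: int) -> float:
    """Implied unit price of a dataset sold for a flat fee."""
    if record_count <= 0:
        raise ValueError("record_count must be positive")
    return total_price_usd / record_count


# Figures quoted in the reply: $400k USD for 267M product records.
quoted = price_per_record(400_000, 267_000_000)
print(f"${quoted:.4f} per record, ${quoted * 1000:.2f} per 1,000 records")
```

That comes to roughly $0.0015 per record, or about $1.50 per thousand.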
 
@hricard1964 These are already a product. Used fairly extensively in the advertising world, they’re called digital share of shelf. They are used for pricing, promotion and SKU analysis.

My company has built a similar product that’s been running for about 5-6 years, and I’ve sold it a bit, but the market isn’t huge and there’s a bunch of competitors out there. It’s a slow, difficult B2B sale. Knowing the products available and the price is cool, but without the actual sales data behind it, it’s very much a ‘nice to have’.

A company I worked for before ran a similar, albeit smaller, exercise some 10+ years ago. They used it for inflation and currency comparison.

Also, without Amazon or marketplaces you’re not capturing the bulk of consumer goods sales. As a result, you realistically can’t compare across countries with your current cut.

You may find success selling on a smaller scale to niche companies looking to understand what competitors are doing online. But as someone who has sold data of this nature for 10+ years, the product you describe is not ready for wide commercialisation.
 
@edprof I'm not the OP. I have a few different sets of niche data that I don't think anyone else has. I have a few ideas for how other companies might use it but I haven't had luck reaching out to them.
 
@thevoiceoftruth Some of the most popular APIs on RapidAPI are Amazon/e-commerce scrapers; companies would pay monthly for reliable Amazon data. Uptime may be an issue, though, if everything is self-hosted on a residential connection.
 
@childofjesus201 I never heard of RapidAPI until you and others mentioned it, interesting platform to say the least. I will do some more research on this.

FWIW, I do have a good price on a co-location if I ever need it. In the meantime I just have dual ISPs going to my house because the bandwidth I use is insane (250-300 TB per month).
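
For a sense of scale, the 250-300 TB per month figure can be converted into an average sustained line rate. This is a rough sketch that assumes decimal terabytes (10^12 bytes), a 30-day month, and perfectly even traffic, none of which the post states.

```python
def average_mbps(terabytes_per_month: float, days_in_month: int = 30) -> float:
    """Average sustained throughput in megabits per second.

    Assumes decimal units (1 TB = 10**12 bytes) and evenly spread traffic.
    """
    bits = terabytes_per_month * 10**12 * 8
    seconds = days_in_month * 24 * 60 * 60
    return bits / seconds / 10**6


for tb in (250, 300):
    print(f"{tb} TB/month is about {average_mbps(tb):.0f} Mbps sustained")
```

Under those assumptions, 250 TB works out to roughly 770 Mbps and 300 TB to roughly 925 Mbps, averaged around the clock.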
 
@thevoiceoftruth I would build a browser extension that you can open on any page and it will show the same/similar item for a better price.

If you include grocery stores in the crawling, I would add the ability to upload a shopping list (or ideally pull it from cart data on a web page) and show the price at different stores, to help people save on their everyday online groceries.
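
The shopping list idea could look something like the sketch below: price the whole list at each store and rank the stores by total. The store names, items, and prices are made-up sample data, and the sketch assumes items have already been matched across stores under a shared name, which is likely the hard part in practice.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical sample data: store name -> {item name: price in USD}.
STORE_PRICES: Dict[str, Dict[str, float]] = {
    "store_a": {"milk": 3.49, "eggs": 2.99, "bread": 2.50},
    "store_b": {"milk": 3.19, "eggs": 3.49, "bread": 2.25},
    "store_c": {"milk": 2.99, "eggs": 2.79},  # does not carry bread
}


def basket_total(prices: Dict[str, float],
                 shopping_list: List[str]) -> Optional[float]:
    """Total cost of the list at one store, or None if any item is missing."""
    if any(item not in prices for item in shopping_list):
        return None
    return round(sum(prices[item] for item in shopping_list), 2)


def rank_stores(store_prices: Dict[str, Dict[str, float]],
                shopping_list: List[str]) -> List[Tuple[str, float]]:
    """Stores that carry every item on the list, cheapest total first."""
    totals = []
    for store, prices in store_prices.items():
        total = basket_total(prices, shopping_list)
        if total is not None:
            totals.append((store, total))
    return sorted(totals, key=lambda pair: pair[1])


print(rank_stores(STORE_PRICES, ["milk", "eggs", "bread"]))
```

Skipping stores that lack an item keeps the totals comparable; splitting the basket across several stores would be a separate, more involved design.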
 