I am writing a Python script that will scrape data from a customer's social media platform. The script will run on the server, but I want the actual API requests to come from the IP address of the establishment.
Are there existing libraries to set up an ESP32 so I can proxy these requests through it?
Edit: the script will run on a cloud server (EC2 instance)
You probably want the ESP32 to ping the remote server every so often, instead of trying to reach the ESP32 directly from the server.
I think I could’ve been clearer in my original post.
The script on the cloud server (EC2 instance) makes API calls. The API caller has an option to proxy the requests by just passing it an IP address and port.
I want to proxy those requests through the ESP32's public IP.
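For reference, "passing it an IP address and port" usually boils down to something like the following with Python's standard library. This is only a sketch: the address 203.0.113.10 and port 8080 are placeholders, and build_proxy_opener is a made-up helper name, not part of the actual API caller.

```python
import urllib.request


def build_proxy_opener(host: str, port: int) -> urllib.request.OpenerDirector:
    """Return an opener that sends HTTP and HTTPS requests via host:port."""
    proxy_url = f"http://{host}:{port}"
    handler = urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
    return urllib.request.build_opener(handler)


# Placeholder address from a documentation range, not a real proxy.
opener = build_proxy_opener("203.0.113.10", 8080)
# opener.open("https://example.com/") would now be routed through that proxy.
```

Whatever sits at that address has to speak the HTTP proxy protocol (including CONNECT for HTTPS) and be reachable from the EC2 instance.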
That’s confusing. Do you want to funnel all requests through the ESP as a proxy? That won’t work. With its puny 520 KB of RAM (of which you maybe have 200 KB tops available), that’s painful at best, if not impossible. Can you explain why you want this? It smells XY-problemy.
Run a proxy on the ESP32?
Should be doable, but do not expect a lot of pre-made code for doing that.
Also, how should the cloud instance reach the ESP32?
Do you have an ESP32 directly connected to the Internet?
The usual case would likely be a device on a LAN with a local IP.
Yeah, that’s what I was worried about. I used to run the script directly on a Raspberry Pi. This was great, but Pis are too expensive now.
The problem I am trying to solve is to have a unique IP for each of my customers, so that the social media platform has a lower chance of detecting that it’s being scraped. The way I went about solving it was to use the customer’s establishment IP by giving each of them a Pi running the scraper. The customer’s phone would also have the same public IP, so it would look quite “natural” from the platform’s point of view. Now I was thinking I could run the heavy script on a server and use a lighter/cheaper machine to just proxy the requests.
With its puny 520 KB of RAM (of which you maybe have 200 KB tops available)
Lots of ESP32 boards now come with 2 MB of PSRAM on top of the 520 KB of internal RAM. I switched to an ESP32 with 2 MB of PSRAM and it’s made a huge difference in my IoT project.
There are better ways to do whatever it is you want to do that have nothing to do with ESP32s, Raspberry Pis, or any IoT device. I’m not even sure why you’re trying to force it onto these small devices. I give up on this thread.
You are considering the wrong tool.
Buy a bunch of these $15 beauties - TP-Link TL-MR3020
Install OpenWrt on them. You get the cheapest Linux box with network capabilities. Set up your script and a management script on it. Tell your customers to simply plug it in at their place. Gift them a 1 ft Ethernet patch cord.
But be super careful with your setup. If you fuck up, you’d turn your customers’ networks into a free botnet someone could just take and use for bad purposes, and the police will be knocking on their doors. So no open proxies, no plain-text authorization, no passwords; authenticate with certificates only. For best-in-class security, learn about Cloudflare Zero Trust.
Can you share? What other ways can I make requests from a given establishment’s network without a microcomputer or microcontroller?
Thanks, this is perfect! That’s exactly what I did with the Pis before, when they were around 30 bucks :)!
I want the actual API requests to come from the IP address of the establishment.
Why do you think you need the requests to come from the IP address of the establishment? Why is that specific IP address important? What do you think would happen if it were just some random IP address?
Another option you could consider is buying access to proxies. People more experienced with this stuff than you, and shadier than you, buy specialized kits with dozens of USB modems and local SIM cards and sell access to them. Typically, mobile internet users are NATed behind a single IP by their carriers, and the large social networks have those IPs whitelisted. You can do your little scraping project through them.
Because this seems to be the cheapest way to ensure that the IP comes from the same city and country as the user generally does.
I could pay for proxies, but they are often already on blacklists, hence my workaround.
I can’t just use another AWS EC2 instance because the CIDR ranges of AWS are known, so the bot will get blocked by the platform.
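As an illustration of that last point: AWS publishes its address ranges, so a platform can check any client IP against them in a few lines. A rough sketch, where the two networks are placeholder documentation ranges rather than real AWS prefixes, and is_datacenter_ip is a hypothetical name:

```python
import ipaddress

# Placeholder networks (documentation ranges) standing in for a published list.
KNOWN_CLOUD_RANGES = [
    ipaddress.ip_network("198.51.100.0/24"),
    ipaddress.ip_network("203.0.113.0/24"),
]


def is_datacenter_ip(ip: str) -> bool:
    """Return True if the address falls inside any of the listed networks."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in KNOWN_CLOUD_RANGES)


print(is_datacenter_ip("198.51.100.42"))  # True
print(is_datacenter_ip("192.0.2.7"))      # False
```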