> Building a Docker image took ages due to the low compute
Why compile anything on the Raspberry Pi? Cross-compile on a machine with more compute (your laptop, desktop, phone, or an EC2 instance) once, and then transfer the compiled binaries or built Docker image over.
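A minimal sketch of that workflow, assuming Docker buildx with QEMU/binfmt emulation on the faster machine and SSH access to the Pi (the image name and host are placeholders):

    # Build an arm64 image on the fast x86 box, then ship it to the Pi
    docker buildx build --platform linux/arm64 -t scraper:arm64 --load .
    docker save scraper:arm64 | ssh pi@raspberrypi.local docker load

You could also push to a registry instead of piping over SSH; either way the Pi only runs the image and never builds it.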
> Pi with 2GB instead of 8GB
For headless Chrome, that should be enough unless you're doing other stuff with it. If you mean for compiling, that, as noted above, can be done elsewhere.
First of all, the answer recommends using either a Pi or a VPS. If a Pi doesn't cut it for you, just switch to a VPS with sufficient specs for your requirements. Problem solved; now it's viable.
Besides, a significant part of the web can be scraped without resorting to a heavyweight browser such as Chromium, which should always be the last resort. Even if you have to evaluate JavaScript (in the case of SPAs), there are much cheaper solutions than Puppeteer (JSDOM being an example) that can get the job done most of the time.
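As a hedged illustration of the jsdom route in TypeScript (the URL and selector are made up, and it assumes the target page's scripts are simple enough for jsdom to execute):

    // Fetch and parse a page without launching a browser.
    import { JSDOM } from "jsdom";

    async function scrapeTitle(url: string): Promise<string | null> {
      // fromURL downloads the page; runScripts/resources let inline and linked
      // JS execute, which is often enough for light SPAs. Heavy apps may still
      // need a real browser.
      const dom = await JSDOM.fromURL(url, {
        runScripts: "dangerously",
        resources: "usable",
      });
      const el = dom.window.document.querySelector("h1");
      return el ? el.textContent : null;
    }

    scrapeTitle("https://example.com").then(console.log);

Memory-wise this is a single Node process, typically a small fraction of what a Chromium instance needs.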
As for Docker, I fail to see why you would need it for this kind of job, unless you don't know how to do it without Docker.
When there is no portability requirement, the costs of Docker easily surpass its benefits.
...
So no, it's not that what's recommended in the answer isn't viable; you're doing it wrong. Either you're using a Pi when you need a VPS, or you're introducing unnecessary layers.
It's true that you don't need it, but I can see the advantages.
I have a Raspberry Pi natively running a scraper using headless Chromium and cron. It works great, except...
I ended up needing a virtual framebuffer. I got it working on the Raspberry Pi, but then I got a new workstation and wanted to edit my script and test it there. I hit cryptic errors that I had to debug just to learn they were framebuffer issues, then had to recreate the setup that's running on my Pi, then debug that...
My first mistake was not writing down what I did in my README, but a Docker image would have saved me a ton of time here.
That's really deviating from the nature of the "cheapest, easiest way to host a cronjob" question. If the OP has that kind of requirement, he won't get good answers.
You can use the "browserless" Docker service, which contains a headless Chrome browser in a Docker container. It also supports the Puppeteer and Playwright connect APIs. Works flawlessly! I use it in combination with n8n, all on a Raspberry Pi 4B (yes, I got one recently).
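For anyone curious, a rough sketch of connecting Puppeteer to a browserless container (port 3000 is its default; the image name and URL here are just examples, and some browserless setups also expect a token query parameter on the WebSocket URL):

    // Connect to a browserless container instead of launching Chromium locally,
    // e.g. one started with: docker run -p 3000:3000 browserless/chrome
    import puppeteer from "puppeteer-core";

    async function run() {
      const browser = await puppeteer.connect({
        browserWSEndpoint: "ws://localhost:3000",
      });
      const page = await browser.newPage();
      await page.goto("https://example.com");
      console.log(await page.title());
      await browser.close();
    }

    run();

The heavy Chromium process stays inside the container, so the scraper script itself stays small and can run wherever it can reach the container.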
> You can use the "browserless" Docker service, which contains a headless Chrome browser in a Docker container.
A lot of websites can detect the IPs this uses and block them, basically acting almost like a CAPTCHA.
I had other needs like Postgres, etc.
Scraping data is one thing; actually doing anything with it is another. I quickly hit the limits of a $50 Raspberry Pi 4, or whatever they're going for on Amazon these days with the price gouging.
I tried to do Chromium/Puppeteer-based scraping this way.
Building a Docker image took ages due to the low compute (Rust was a non-starter).
I had also (foolishly) bought the Pi with 2GB instead of 8GB, so RAM was an issue.
Disk was super slow.
I'm not sure how viable this is, especially with how hard it is currently to source a Pi, let alone its computation/memory constraints.