Looks like you're hosting this on fly.io - PAYG model. You could probably host this for free on Cloudflare Workers; 100k requests/day on the free tier; static content (the homepage) is free & unlimited.
Edit: The catch is the 10ms CPU cap per request - you'd need a super lean implementation. Django's too heavy for that.
In some ways a good thing, no? Shows you've got work to do on optimisation for large audiences. A free stress test (unless you're on a host that charges per hit or bandwidth excess), if you will.
Did load eventually for me. I thought it was broken since there were no styles, but it looks like that's intentional.
Not the same, but this gives me an idea… what if there was a map-reduce for DOMs as a web primitive? Imagine if I could make a DOM (or feed) that was some selection and transformation of another DOM.
I wrote a similar thing in Go (using chromedriver, so it could handle things that need JS).
Handled most things nicely, but I found a few sites where I wanted multiple selections to be combined into one document.
I emailed the result to myself, turning any images into attachments; this meant my “feed reader” had read/unread tracking that synced across devices, some html support, folders, offline viewing, etc.
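For anyone curious what the "email it to yourself with images as attachments" step might look like, here is a rough sketch using only the Python standard library (not the Go code described above). The function name, the `(filename, bytes)` shape of `images`, and the assumption that every image is a PNG are all made up for illustration; actually sending the message (e.g. via `smtplib`) is left out.

```python
from email.message import EmailMessage


def build_digest_email(sender, recipient, subject, html_body, images):
    """Build an HTML email with images attached.

    images: list of (filename, png_bytes) tuples. This shape is a
    hypothetical choice for the sketch; all images are assumed to be PNG.
    """
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    # The scraped document becomes the HTML body of the message.
    msg.set_content(html_body, subtype="html")
    # Each image is added as a separate attachment; adding the first one
    # turns the message into multipart/mixed.
    for filename, data in images:
        msg.add_attachment(data, maintype="image", subtype="png", filename=filename)
    return msg
```

The mail client then provides read/unread state, folders, and sync for free, which is the appeal of the approach.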
I made a CGI program that ran CSS selectors against URLs and returned the output. I debated making it public and then realized I probably didn't want to run an open proxy. I'm curious how long this will last.
Dates shouldn't matter. The feed has ID elements, which are what identify entries. Atom has no guid element. So I would expect this to work with any reader.
The few times I actually tried it, it worked badly, with huge chunks of text content missing from the page. Makes me wonder if with the modern web the task has become so difficult even a browser couldn't pull it off, or if they just weren't trying to do a good job with the feature.
If you're serious about RSS curation and reading, FreshRSS really is the Swiss Army Knife. It does so much, including this “any site/page to RSS”. My favorite feature is that it makes refreshes in your feed reader client pretty much instant, which is such a huge quality of life improvement.
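To illustrate the point about IDs: a reader can decide what is new purely from each entry's `id` element, without looking at dates. A minimal sketch using the standard library; the function name and the in-memory `seen_ids` set are assumptions for the example, and real readers will differ in how they store state.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"


def new_entry_ids(feed_xml, seen_ids):
    """Return ids of entries not seen before and record them in seen_ids."""
    root = ET.fromstring(feed_xml)
    fresh = []
    for entry in root.iter(ATOM + "entry"):
        entry_id = entry.findtext(ATOM + "id")
        # Only the id decides whether an entry is new; dates are ignored.
        if entry_id and entry_id not in seen_ids:
            seen_ids.add(entry_id)
            fresh.append(entry_id)
    return fresh
```

Calling it twice with the same feed and the same `seen_ids` set returns the ids the first time and an empty list the second time.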
59 requirements, including Django, seems pretty heavy though?
For my own RSS feed, I use this 48-line Python file with no dependencies outside the standard library:
https://github.com/no-gravity/atomfeed.py
It takes an array with the entries as input, not a web page. But I guess the HTML parsing should take no more than another few lines? For HTML parsing, I have good experiences with the lxml module which is in the Debian repos. It is fast and works pretty well.
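To give a feel for the "array of entries in, Atom out" idea, here is a rough standard-library sketch. It is not the code from the linked repo; the function name and the dict keys (`id`, `title`, `updated`, `link`, `content`) are assumptions for the example, and the HTML scraping step that would produce the entries is left out.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"


def _tag(name):
    # Qualified tag name in the Atom namespace.
    return f"{{{ATOM_NS}}}{name}"


def build_atom_feed(feed_id, title, author, updated, entries):
    """Return an Atom feed as a string.

    entries: list of dicts with keys "id", "title", "updated", "link",
    "content" (a hypothetical shape chosen for this sketch).
    """
    # Emit Atom as the default namespace instead of an ns0: prefix.
    ET.register_namespace("", ATOM_NS)
    feed = ET.Element(_tag("feed"))
    ET.SubElement(feed, _tag("id")).text = feed_id
    ET.SubElement(feed, _tag("title")).text = title
    ET.SubElement(feed, _tag("updated")).text = updated
    author_el = ET.SubElement(feed, _tag("author"))
    ET.SubElement(author_el, _tag("name")).text = author
    for item in entries:
        entry = ET.SubElement(feed, _tag("entry"))
        ET.SubElement(entry, _tag("id")).text = item["id"]
        ET.SubElement(entry, _tag("title")).text = item["title"]
        ET.SubElement(entry, _tag("updated")).text = item["updated"]
        ET.SubElement(entry, _tag("link"), href=item["link"])
        # The serializer escapes the HTML string when writing it out.
        content = ET.SubElement(entry, _tag("content"), type="html")
        content.text = item["content"]
    return ET.tostring(feed, encoding="unicode")
```

The `updated` values are expected to be RFC 3339 timestamps such as `2024-01-01T00:00:00Z`; the sketch does not validate them.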