Hacker News | jasoncartwright's comments

Web scraping - creating semi-structured data from a wide variety of horrific HTML soups.

Absolutely do swap out models sometimes, but Gemini 2.0 Flash is the right price/performance mix for me right now. Will test Gemini 2.5 Flash-Lite tomorrow though.
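For the "HTML soup" step, one plausible pre-processing pass is to flatten messy markup into visible text and links before handing it to a model. This is only a sketch using Python's standard-library parser, not the commenter's actual pipeline; the `TextAndLinks` class and `extract` helper are hypothetical names made up for illustration.

```python
from html.parser import HTMLParser


class TextAndLinks(HTMLParser):
    """Collects visible text and link targets; tolerant of unclosed tags."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.text = []    # visible text fragments, in document order
        self.links = []   # href values from <a> tags
        self._skip = 0    # non-zero while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        # Ignore script/style bodies and whitespace-only runs.
        if not self._skip and data.strip():
            self.text.append(data.strip())


def extract(html):
    """Hypothetical helper: returns (text fragments, links) for one page."""
    parser = TextAndLinks()
    parser.feed(html)
    parser.close()  # flush any text still buffered at end of input
    return parser.text, parser.links


# Unclosed <p> and <b> tags are typical of the soup in question.
soup = '<div><script>var x=1</script><p>Price: <b>£5<p>Next <a href="/x">link</div>'
text, links = extract(soup)
```

The flattened text is far smaller than the raw page, which matters when the per-token price of the model is part of the price/performance trade-off.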


Pirate seaborne internet has been tried.

"A Navy officer is demoted after sneaking a satellite dish onto a warship to get the internet"

https://apnews.com/article/navy-illegal-wireless-internet-53...

https://www.navytimes.com/news/your-navy/2024/09/03/how-navy...


Maybe the military is not the best place for "pirate internet", yet I think that with current tech, internet access should not be an issue.

How in the bloody hell do you install a Starlink on a submarine?

---The Starlink dish was secretly installed on the Independence Class vessel’s weather deck, where it was relatively out of view. The network was initially named “Stinky”, but it was later renamed to appear to be a wireless printer – despite there being no such devices aboard.

---The Starlink dish wasn’t discovered until a civilian technician, installing a Starshield satellite communications system, noted the device and reported it to a senior crew member.


Arg, thanks. I literally read that, but there was too much advertising on that site. Maybe the Navy needs to provide Wi-Fi to sailors if it means that much for their morale. And maybe consumer endpoint security needs to be so rock-solid that a sailor can trust their phone/laptop on a voyage.

Post suggests the underlying problem is the metadata takes too long to put in the HTML? A few tags? Really?

This kind of problem, and the vendor lock-in, is completely unimaginable for any web framework I can think of. Bizarre.


I think it was that if metadata requires getting data from third parties or something, it takes a long time, and it doesn't affect what the user sees.

It did say this was a problem for very few people.


Implementing an option to stream metadata would have been smarter. Am I missing something obvious?

That's "deprecating EKS for Omega star at the end of the month" level of dumb.

Let the programmer do high latency shit if they want. Give them a damn catch if they need it.


Still waiting on support for ISO timestamps...


Tangential! I bought a 210L water butt to collect rainwater to water plants a while back. It cost £110 to my door with all the installation parts. Out of interest, I looked up the cost to fill it with pristine London tap water: ~52p. Roughly 211 uses and I'm at breakeven, money-wise.
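The breakeven sum can be checked in a few lines. This is a rough sketch that takes the quoted £110 and ~52p figures at face value and assumes every use is a complete 210L fill; the function name is made up for illustration.

```python
# Figures quoted in the comment, converted to pence to avoid float rounding.
BUTT_COST_PENCE = 110 * 100   # £110 delivered, including installation parts
FILL_VALUE_PENCE = 52         # approximate tap-water cost of one full fill


def fills_to_break_even(cost_pence, fill_value_pence):
    """Smallest whole number of fills whose value covers the purchase cost."""
    return -(-cost_pence // fill_value_pence)  # ceiling division on integers


whole_fills = BUTT_COST_PENCE // FILL_VALUE_PENCE        # 211
recouped_pence = whole_fills * FILL_VALUE_PENCE          # 10972, i.e. £109.72
break_even = fills_to_break_even(BUTT_COST_PENCE, FILL_VALUE_PENCE)  # 212
```

So 211 fills recoup £109.72, and the 212th tips it past £110. Since the 52p figure is itself approximate, the two answers are within rounding of each other.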


Communities can 'adopt' them for £1. Other uses include libraries or food bank donation points.

https://business.bt.com/public-sector/street-hubs/adopt-a-ki...


Might want to consider upgrading that VPS for a couple of days


Haha, just enabled an endpoint-wide caching policy. Stale content, but should be a stopgap.
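The trade-off in that stopgap (fewer backend hits in exchange for stale content) can be illustrated with a small time-to-live cache. This is a generic sketch, not the poster's actual setup, which is not described; `TTLCache` and `render_page` are hypothetical names.

```python
import time


class TTLCache:
    """Serves a stored response until it is older than ttl_seconds."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock   # injectable so the behaviour can be tested
        self._store = {}      # key -> (stored_at, value)

    def get(self, key, compute):
        now = self._clock()
        hit = self._store.get(key)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]     # possibly stale, but no backend work
        value = compute()     # the expensive path: rebuild the response
        self._store[key] = (now, value)
        return value


calls = []


def render_page():
    """Hypothetical expensive handler; records each real render."""
    calls.append(1)
    return f"rendered {len(calls)}"


cache = TTLCache(ttl_seconds=60)
first = cache.get("/", render_page)
second = cache.get("/", render_page)  # served from cache, no second render
```

With a 60-second TTL, a traffic spike costs at most one render per path per minute, at the price of content up to a minute old.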


Must be a backend person! They were so proud to tell us about eliminating the cost of the scraping, but totally forgot about the cost of the user-facing part of it. /s

it is hard to imagine how quickly "going viral" can swamp your system until you've been there, done that.


Ah man, Access. Got a website to over 1m users/month with a dead simple Access DB and ASPv3. Back when a million users was a million users.


The heavy Mac UX compliance is the reason why I enjoy using Postico for 99% of straightforward Postgres tasks


I made this to get around pages being cached at CDN level, but still needing to get live data...

https://github.com/jasoncartwright/clientsideinclude

