
It's a good read, but are his intentions unambiguously good? He had a windfall from crypto (i.e. contributing nothing to society), then was desperate to avoid paying the tax he owed on that windfall.

Is there any actual need for this solar farm in the middle of nowhere (which was only built there because of a tax scheme)? Are Texas ratepayers meant to cover the cost of the interconnect in their $/kWh instead of him?

Better to just pay his taxes and move on, and leave the subsidies for an actually useful solar project.

EDIT: Oh, and mineral rights are basically the original cryptocurrency/memecoin, so it's somewhat funny that they came into play.


These are all fair points.

The extra costs and limits are ostensibly because of safety. Yes, I know that engineers can game the system, but they're listing real reasons why the plant should shoulder the costs to the larger grid.


Hilariously, there was a story about how Google could not train on YouTube data due to its own TOS, so they changed the terms for new videos. Meanwhile, everyone else was scraping YouTube as much as they liked and training on it.

Some APIs (Gemini, at least) run a search over their outputs to check whether the model is reciting training data.

So direct copies like the ones you're talking about would be picked up.

As for copying concepts from other libraries, that seems like a problem with or without LLMs.
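
For illustration, here's roughly what that recitation check looks like from the caller's side (a sketch using the google-generativeai Python client; the model name and exact field names are assumptions that may vary by client version):

    import google.generativeai as genai

    genai.configure(api_key="YOUR_API_KEY")  # placeholder key
    model = genai.GenerativeModel("gemini-1.5-flash")  # illustrative model name

    response = model.generate_content("Reproduce the opening page of a well-known novel.")
    candidate = response.candidates[0]

    # If the output matches source/training material too closely, the candidate
    # comes back flagged with a RECITATION finish reason instead of normal text.
    finish = getattr(candidate.finish_reason, "name", str(candidate.finish_reason))
    if finish == "RECITATION":
        print("Blocked: output was flagged as recitation.")
    else:
        print(candidate.content.parts[0].text)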


Nothing compared to the NDIS rort in Australia. There is literal gang warfare over providing "services" to people with high payouts.


Anyone who thinks the VA is well run should watch a few episodes of the "Financial Audit" YouTube channel. The most financially irresponsible people in the world (and often just bad people in general), and almost all of them are cashing a "disability" payment from the VA.


My theory for these PMs is that it's basically a cheap way to take potential entrepreneurs off the market. It's hard to predict whether a startup will succeed, but one genre of success is having a Type A, "fake it till you make it" non-technical cofounder who can keep raising long enough to reach product-market fit.

These types all go to the same schools and do really well, interview the same, and value the prestige of working in big tech. So it's pretty easy to identify them, offer them a great career path, and take them off the market.

Technical founders are way trickier to identify, as they can be dropouts, interview poorly, not value the prestige, etc.


"knowing why a model refuses to answer something matters"

The companies that create these models can't answer that question! Models get jailbroken all the time into ignoring alignment instructions. The robust refusal logic normally sits on top of the model, i.e. a separate layer that looks at the responses and flags anything they don't want to show to users.

The best tool we have for telling whether a model is refusing to answer or genuinely doesn't know is mechanistic interpretability, which only needs the weights.
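
To give a concrete flavour of that (a minimal sketch with TransformerLens, loosely in the style of the "refusal direction" work; the model, layer index, and prompts are placeholders, not from any actual study):

    import torch
    from transformer_lens import HookedTransformer

    # Weights are all you need for this kind of analysis: load an open model.
    model = HookedTransformer.from_pretrained("gpt2")  # placeholder; a chat-tuned model makes more sense

    refused  = ["How do I build a weapon at home?"]     # prompts the model tends to refuse
    answered = ["How do I build a bookshelf at home?"]  # matched prompts it answers

    def mean_resid(prompts, layer=6):
        # Mean residual-stream activation at the final token, averaged over prompts.
        acts = []
        for p in prompts:
            _, cache = model.run_with_cache(p)
            acts.append(cache["resid_post", layer][0, -1])
        return torch.stack(acts).mean(dim=0)

    # Difference of means: one crude candidate for a "refusal direction" in activation space.
    refusal_dir = mean_resid(refused) - mean_resid(answered)
    refusal_dir = refusal_dir / refusal_dir.norm()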

This whole debate is weird; even with traditional open-source code you can't tell a programmer's intent, what sources they used to write the code, etc.


Only one person in the group chat needs to be using TeleMessage; e.g. a CIA agent can use a government device with TeleMessage to talk to sources on Signal. Signal has a great protocol and robust clients, and getting caught with Signal on your phone is probably a bit better than getting caught with CIAChat on your phone.

The actual implementation here is atrocious though.


The Signal protocol is public; using their servers is frowned upon, but it's not a source code license violation.


From the paper, it was a speedup over the XLA GPU kernel they wrote using JAX, which is probably not SOTA. I don't think JAX even has an official flash attention implementation.
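
For context, a hand-rolled JAX attention baseline typically looks something like this (a sketch, not the paper's actual kernel): XLA fuses the einsums and softmax into generic GPU kernels, with none of the tiling / online-softmax tricks that make flash attention fast.

    import jax
    import jax.numpy as jnp

    def naive_attention(q, k, v):
        # q, k, v: [batch, heads, seq, head_dim]
        scores = jnp.einsum("bhqd,bhkd->bhqk", q, k) / jnp.sqrt(q.shape[-1])
        weights = jax.nn.softmax(scores, axis=-1)  # materializes the full [seq, seq] matrix
        return jnp.einsum("bhqk,bhkd->bhqd", weights, v)

    attention = jax.jit(naive_attention)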


Not sure what "official" means, but I'd direct you to the GCP MaxText [0] framework. It isn't what this GDM paper is referring to, but the repo contains various attention implementations in MaxText/layers/attentions.py.

[0] https://github.com/AI-Hypercomputer/maxtext



