There was an interesting post here a while back about autonomy and motivation. The gist was that people's motivation is proportional to their autonomy. This is quite intuitive: you can see people are really motivated when they have autonomy (think kids with Minecraft, musicians with instruments). One terrible thing about Anki is that it is probably horrible for autonomy. Quite possibly using Anki actually has a negative effect on motivation.


That sounds very interesting! Do you still have a link to that post?


Look up Deci and Ryan's self-determination theory.


It's something I've wondered: what is the point of memorizing a proof if it only ever proves something you already know? The answer is that you hope it generalises. There is a possible way you could do this in SRS, inspired by RL training: instead of cards, you'd show options within a game or simulation. But this would need a lot of expert knowledge for a single concept.
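To make the shape concrete, a hypothetical card might look like this (everything here is made up for illustration):

    from dataclasses import dataclass

    @dataclass
    class ScenarioCard:
        """A hypothetical SRS 'card' that poses a decision inside a
        simulation, rather than a literal prompt->answer pair."""
        state: str          # description of the game/simulation state
        options: list[str]  # actions the learner can choose between
        best: int           # index of the expert-labeled best action

    def review(card: ScenarioCard, choice: int) -> bool:
        """Grade the review: did the learner pick the expert action?"""
        return choice == card.best

    card = ScenarioCard(
        state="Prove P(n) for all n, where P(0) and P(n) -> P(n+1) are easy.",
        options=["contradiction", "induction", "case analysis"],
        best=1,
    )
    assert review(card, 1)

The expensive part is exactly that last step: an expert has to author the state and label the best action for every concept.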


Sure, but if you are memorizing the proof instead of understanding it, you aren't going to be able to generalize it.

In general, math is not a subject where memorization is going to get you ahead. The "why" matters much more than the "what".


I think the difference between recall knowledge and logical-model knowledge will be really interesting. LLMs appear to be strongly the former. But that's pretty hopeless for mathematics.


This is a very big problem. Virtually all the results from research here come from some form of simple word recall. Direct recall occupies some part of real-world tasks, but IRL if you're stopped from doing something, it's usually not because you can't remember a fact (and you could look it up if you forgot).


It's just logical that memorization is useful for broad areas like vocabulary and gets progressively worse the more depth is involved, e.g. vocabulary > grammar > maths. The first doesn't require generalization; the last most certainly does. That said, I find that SRS leads to good generalization if it is used for relatively shallow conceptual knowledge.


There are some UX problems with SRS (that I'm working on) that make it high friction:

1) Time taken to create cards

2) Need for self-marking

3) Creates a one-to-one mapping of prompt to answer

4) If you're an autodidact, you have to teach yourself first (alternatively called understanding, scaffolding, etc.)

More fundamentally, SRS isn't a superpower because it's very specific to creating direct prompt retrieval. Generalization is poor. Even creating a graph of knowledge, i.e. chains of edges between bits of knowledge, isn't something it does well.

And I suspect there's a very deep, fundamental difference between recollection knowledge and logical-modeling knowledge. Recollection seems very similar to a dictionary access: if you recorded recall times in humans, I suspect they'd all be roughly constant. But the knowledge of a logical model, like a mathematical concept, appears to be vastly different, with very different time to compute.

Proponents of SRS will point out that logical models need facts as well: formulas, lemmas, etc. Which is true. And if you already grasped something before, you'll grasp it faster the second time. So the practical use of SRS is a significant step above having a very well-sorted and labeled notebook, but still way below becoming a genius.


Poor generalization (overtraining on prompts) and loss of context over time are the biggest issues I've found with them. Slow card creation workflows and needing to rate your own reviews are merely UX issues -- losing context and losing generalization make SRS actively harmful when used for some topics.

There are two solutions I've thought of but haven't tried implementing:

1. A free-recall-based approach. Free recall allows you to operate at a higher level of organization and connect concepts at lower levels. However, how you would schedule SRS around free recall is not clear.

2. Have an LLM generate questions on-the-fly so that you don't overtrain on prompts. You might also instruct the LLM to create questions that connect multiple concepts together. The problem with this approach is that LLMs are still not that good at writing good test questions; a rough sketch follows below.
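Something like this, using the OpenAI Python client (the model name and prompt wording are placeholders, not a tested recipe):

    # On-the-fly question generation for whatever concepts are due.
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    def generate_question(concepts: list[str]) -> str:
        """Ask the model for a fresh question connecting several due concepts."""
        response = client.chat.completions.create(
            model="gpt-4o",  # placeholder model name
            messages=[{
                "role": "user",
                "content": (
                    "Write one short test question that connects these concepts: "
                    + ", ".join(concepts)
                    + ". Avoid stock textbook phrasings so I can't pattern-match."
                ),
            }],
        )
        return response.choices[0].message.content

Because the phrasing changes on every review, you can't overtrain on a fixed prompt, though you inherit the question-quality problem above.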


I implemented free recall into FSRS pretty easily. Granted, it’s only for language learning, and I have it set up to work in a free recall friendly way (you don’t learn cards, you learn actual words and morphemes) but it’s been working for a few weeks now. I’m working on a product video atm, but once that’s done my next task (sometime this week) is to clean up the UI and merge it to master.

I almost never see someone talk about free recall so I was too excited to see it mentioned not to comment


How are you handling scheduling with FSRS? The challenge I quickly ran into was figuring out when you should advance a segment of information. If you get 80% of the info right, should it be advanced? What happens to the 20% you missed? How do you prevent yourself from missing the same 20% every time it comes around?


If you don’t mention an item, it is skipped (no grade). If you can’t remember an item, but you recall learning it, you describe it and it will be marked as fail. At the end there is a screen with all the words and you can change any from skip to fail if you truly forgot it.

Any skipped items are then prioritized in the flashcards/cloze completion/shadowing modalities.
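In pseudocode, that grading flow is roughly the following (a toy sketch with made-up names, not the actual FSRS update rule):

    from dataclasses import dataclass

    @dataclass
    class Item:
        word: str
        stability: float = 1.0  # FSRS-style memory stability, in days

    def grade_free_recall(items: list[Item], recalled: set[str],
                          failed: set[str]) -> list[Item]:
        """Apply one free-recall session; return ungraded items for the
        flashcard/cloze/shadowing modalities."""
        skipped: list[Item] = []
        for item in items:
            if item.word in recalled:
                item.stability *= 2.0  # toy growth factor, not real FSRS math
            elif item.word in failed:
                item.stability = 1.0   # an explicit fail resets the item
            else:
                skipped.append(item)   # never mentioned: no grade recorded
        return skipped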

AFAIK free recall is not very high-signal as to which words you know and which ones you don’t. I skip words just as often because they’re so easy as because they’re so hard. It is, however, an incredibly effective exercise to cement your recall (and in my app’s case, a good way to skip a good portion of your reviews in a day).


Where do we find more about what you’re working on? :)


It's a common idea, going all the way back to Hoare logic. There was a time when people believed that, in the future, people would write specifications instead of code.

The problem is that it takes several times more effort to verify code than to write it. This makes intuitive sense if you consider that the search space for the properties of code is much larger than the space for the code itself. Rice's theorem states that all non-trivial semantic properties of a program are undecidable.


No, Rice's theorem states that there is no general procedure to take an arbitrary program and decide nontrivial properties of its behaviour. As software engineers, though, we write specific programs which have properties which can be decided, perhaps by reasoning specific to the program. (That's, like, the whole point of software engineering: you can't claim to have solved a problem if you wrote a program such that it's undecidable whether it solved the problem.)

The "several times more effort to verify code" thing: I'm hoping the next few generations of LLMs will be able to do this properly! Imagine if you were writing in a dependently typed language, and you wrote your test as simply a theorem, and used a very competent LLM (perhaps with other program search techniques; who knows) to fill in the proof, which nobody will never read. Seems like a natural end state of the OP: more compute may relax the constraints on writing software whose behaviour is formally verifiable.


Using an LLM to generate proofs from a spec, with a checker verifying them (OK/Error), would make it much faster.


I think if your assumption is that the AI is deducing where it is through rational thought, you would be mistaken. In truth, what probably happened is that a significant majority of the digital images of the world have been scraped, labeled, and used as training data.


Try it with your own photos from around the world. I used my own photos from Stockholm, San Francisco, Tvarožná, Saas-Fee, London, Bergen, Adelaide, Melbourne, Paris, and Sicily, and can confirm that it was within acceptable range for almost all of them (without EXIF data), and it absolutely nailed some of the more obvious spots.


They only posted one photo in the post, but going off of that, it's still an easy match based on Street View imagery. Furthermore, the AI just identified the license plate and got lucky that the photographer lives in a populous area, making it more prominent in the training data and therefore more likely to be found (even though it was off by 200 miles on its first guess).


I posted two more at the bottom, from Madagascar and Buenos Aires: https://simonwillison.net/2025/Apr/26/o3-photo-locations/#up...


Pretty rich coming from a company that's not-so-slowly outsourcing its workforce to India.


Strongly suspect OAI can't afford $20B in cash. Their latest funding round was $40B, and they're burning through money like it's rice paper. They could offer OAI equity, but Cursor's founders would probably be very suspicious of privately valued stock (which is fairy money).

How wise it is to buy Cursor is another question. Current valuation has them at 100x revenue. And I suspect agentic products will be a lot less cash flow positive than traditional SaaS because of the massive cost of all that constant codebase context and stream of code.


> The initial funding will be $10 billion, followed by the remaining $30 billion by the end of 2025, the person said. But the round comes with a caveat. SoftBank said in an updated disclosure on Monday that its total investment could be slashed to as low as $20 billion if OpenAI doesn’t restructure into a for-profit entity by Dec. 31.

They might not even get the full $40 billion


I would have assumed the same - earlyish-stage companies in this area will likely be happy to take a big wedge of OpenAI stock and a little cash.


Some of my relatives and colleagues actually actively encourage this. They give them an iPad with YouTube on it after meals and so on. It acts as a pacifier.


Are you so confident that they're not doing so in a limited and appropriate manner?


There’s no appropriate level of short-form, swipe-through content for kids. Adults can barely cope with it.

