It will work, but at the scale needed for pretraining you are bound to have many quality issues that will destroy your student model, so your data cleaning process better be very capable.
One way to think of it is that any little bias or undesirable path in your teacher model will be amplified in the resulting data and is likely to become overrepresented in the student model.
Since models can't reason, as you just pointed out, and need examples to do anything, and the LLM companies are abusing everyone's websites with crawlers, why aren't we generating plausible-looking but non-working code for the crawlers to gobble up, in order to poison them?
I mean seriously, fuck everything about how the data is gathered for these things, and everything that your comment implies about them.
The models cannot infer.
The upside of my salty attitude is that hordes of vibe coders are actively doing what I just suggested -- unknowingly.
That seems like a feedback loop that’s unlikely to exist currently. I guess if intentionally plausible but bad data became a really serious problem, the loop could be created… maybe? Although it would be necessary to attribute a bit of code output back to the training data that led to it.
They can't reason at all. The language specification for Tcl 9 is in the training data of the SOTA models but there exist almost no examples, only documentation. Go ahead, try to get a model to write Tcl 9 instead of 8.5 code and see for yourself. They can't do it, at all. They write 8.5 exclusively, because they only copy. They don't reason. "reasoning" in LLMs is pure marketing.
It becomes clear that it's just statistics once you get near a statistically significant "attractor".
A silly example is any of the riddles where you just simplify it to an obvious degree and the LLM can't get it (mostly gone with recent big models), like: "A man, a sheep, and a boat need to get across a river. How can they do this safely without the sheep being eaten?"
A more practically infuriating example is when you want to do something slightly different from a very common problem. The LLM might eventually get it right, after too much guidance, but then it'll slowly revert back to the "common" case. For example, replacing whole chunks of code with whatever the common thing is when you tell it to add comments. This happens frequently to me with super basic vector math.
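To give a made-up but representative sketch (the function and the weighting are hypothetical, not from my actual code): say I want a dot product where the y component is deliberately weighted double. After a round or two of "just add comments", the assistant tends to hand back the plain textbook dot product instead.

```c
#include <stdio.h>

/* Dot product under a deliberately unusual weighting: y counts double.
   The 2.0 factor is the whole point, and it's exactly the kind of detail
   that quietly gets "corrected" back to the textbook formula. */
static double weighted_dot(const double a[3], const double b[3]) {
    return a[0] * b[0] + 2.0 * a[1] * b[1] + a[2] * b[2];
}

int main(void) {
    const double u[3] = {1.0, 2.0, 3.0};
    const double v[3] = {4.0, 5.0, 6.0};
    printf("%.1f\n", weighted_dot(u, v)); /* 4 + 20 + 18 = 42.0 */
    return 0;
}
```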
I've thought about this a lot in the context of "why do I need to learn facts when I can just look them up?"
Understanding a concept means you are able to use it in higher order reasoning. Think about the rote practice necessary to build intuition in mathematics until you're able to use the concept being learned for the next concept which in turn relies on it.
Once that intuition is built, that's understanding.
> You might prefer manual coding, but you might just be bad at AI coding and you might prefer it if you improved at it.
ok but how much am I supposed to spend before I supposedly just "get good"? Because based on the free trials and the pocket change I've spent, I don't consider the ROI worth it.
It won't be the hippest of solutions, but you can use something like Devstral Small with a full open source setup to start experimenting with local LLMs and a bunch of tools - or just chat with it through a chat interface. I ping-ponged between Devstral running as a chat interface and my regular text editor some time ago to make a toy project of a raytracer [0] (output) [1] (code).
While it wasn't the fanciest integration (nor the best of codegen), it was good enough to "get going" (the loop was to ask the LLM to do something, do something else myself in the background, then fix and merge the changes it made - even though i often had to fix stuff[2], sometimes it was less of a hassle than if i had to start from scratch[3]).
It can give you a vague idea that with more dedicated tooling (i.e. something that does automatically what you'd do by hand[4]) you could do more interesting things (combining with some sort of LSP functionality to pass function bodies to the LLM would also help), though personally i'm not a fan of the "dedicated editor" that seems to be used and i think something more LSP-like (especially if it can also work with existing LSPs) would be neat.
IMO it can be useful for a bunch of boilerplate-y or boring work. The biggest issue i can see is that the context is too small to include everything (imagine, e.g., throwing the entire Blender source code at an LLM, which i don't think even the largest of cloud-hosted LLMs can handle), so there needs to be some external way to store stuff dynamically, but also for the LLM to know that this external stuff is available, look it up and store things if needed. Not sure how exactly that'd work though, to the extent where you could -say- open up a random Blender source code file, point to a function, ask the LLM to make a modification, have it reuse any existing functions in the codebase where appropriate (without you pointing them out) and then, if needed, have the LLM also update the code where the function you modified is used (e.g. if you added/removed some argument or changed the semantics of its use).
[2] e.g. when i asked it to implement a BVH to speed things up, it made something that wasn't hierarchical and actually slowed things down
[3] the code it produced for [2] was fixable to do a simple BVH
[4] i tried a larger project and wrote a script that `cat`ed and `xclip`ed a bunch of header files to pass to the LLM so it knew the available functions, and each function had a single-line comment about what it does - when the LLM wrote new functions it also added that comment. 99% of these one-liner comments were actually written by the LLM.
Not even close. I'm still under $100, creating full apps. Stick to reasonable models and you can achieve and learn a lot. You don't need the latest and greatest in max mode (or whatever the new one calls it) for the majority of tasks. You don't have to throw the whole project at the service every time either.
do I get a refund if I spend a grand and I'm still not convinced? at some point I'm going to start lying to myself to justify the cost and I don't know how much y'all earn but $1k is getting close
Would you ask for a refund from a university class if you didn’t get a job or skill from it? Investing in a potential skill is a risk and carries an opportunity cost; that’s part of what makes it a risk.
This kind of seems like asking “how are poor people supposed to keep up with rich people” which we seem to not have a long term viable answer for right now
For the past 10 years we have been telling everyone to learn to code; now it’s learn to build AI prompts.
Before, a poor kid with computer access could learn to code nearly for free, but if it costs $1k just to get started with AI, that poor kid will never have that opportunity.
If you lack "that kind of scratch", you are at the learning stage for software development, not the keeping up stage. Either that or horribly underpaid.
I recently had a coworker tell me he liked his last workplace because "we all spoke the same language." It was incredible how much he revealed about himself with what he thought was a simple fact about engineer culture. Your comment reminds me of that exchange.
- Employers, not employees, should provide workplace equipment or compensation for equipment. Don't buy bits for the shop, nails for the foreman, or Cursor for the tech lead.
- The workplace is not a meritocracy. People are not defined by their wealth.
- If $1,000 does not represent an appreciable amount of someone's assets, they are doing well in life. Approximately half of US citizens cannot afford rent if they lose a paycheck.
- Sometimes the money needs to go somewhere else. Got kids? Sick and in the hospital? Loan sharks? A pool full of sharks and they need a lot of food?
- Folks can have different priorities and it's as simple as that
We're (my employer) still unsure if new dev tooling is improving productivity. If we find out it was unhelpful, I'll be very glad I didn't lose my own money.
I agree with all this but the simple fact is that if you don't keep up you'll be out of a job faster than the rest of us. My strategy for being replaced by AI is to replace the company that replaces me. Software is getting trivial to implement, especially if you know how to specify it.
This applies to AI, too, albeit in different ways:
1. You can iteratively improve the rules and prompts you give to the AI when coding. I do this a lot. My process is constantly improving, and the AI makes fewer mistakes as a result.
2. AI models get smarter. Just in the past few months, the LLMs I use to code are making significantly fewer mistakes than they were.
There are definitely dumb errors that are hard for human reviewers to find because nobody expects them.
One concrete example is confusing value and pointer types in C. I've seen people try to cast a `uuid` variable into a `char` buffer to, for example, memset it, by doing `(const char *)&uuid`. It turned out, however, that `uuid` was not a value type but rather a pointer, and so this ended up just blasting the stack because instead of taking the address of the uuid storage, it's taking the address of the pointer to the storage. If you're hundreds of lines deep and are looking for more complex functional issues, it's very easy to overlook.
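To make that concrete, here's a minimal sketch of the failure mode (the `my_uuid` struct and the function names are made up, since I don't have the original code):

```c
#include <string.h>

/* Hypothetical 16-byte UUID type standing in for whatever was actually used. */
typedef struct { unsigned char bytes[16]; } my_uuid;

void clear_value(my_uuid uuid) {
    /* uuid is a value here, so &uuid is the address of the 16 bytes of
       storage and this clears (a local copy of) the UUID. */
    memset((char *)&uuid, 0, sizeof(uuid));
}

void clear_pointer(my_uuid *uuid) {
    /* BUG: uuid is already a pointer. &uuid is the address of the pointer
       variable itself, so this writes 16 zero bytes over the 8-byte pointer
       and whatever lives next to it on the stack, instead of clearing the
       storage it points to. */
    memset((char *)&uuid, 0, sizeof(my_uuid));

    /* Intended: clear the storage the pointer refers to. */
    memset(uuid, 0, sizeof(*uuid));
}
```

The broken call and the intended one differ by a single `&`, which is exactly why it sails past review when you're focused on higher-level logic.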
But my gripe with your first point is that by the time I write an exact, detailed, step-by-step prompt for them, I could have written the code by hand. Like, there is a reason we are not using fuzzy human language in math/coding: it is ambiguous. I always feel like I'm in one of those funny videos where you have to write exact instructions on how to make a peanut butter sandwich and get deliberately misinterpreted. Except it is not fun at all when you are the one writing the instructions.
2. It's very questionable that they will get any smarter; we have hit the plateau of diminishing returns. They will get more optimized, and we can run them more times with more context (e.g. chain of thought), but they fundamentally won't get better at reasoning.
> by the time I write an exact detailed step-by-step prompt for them, I could have written the code by hand
The improved prompt or project documentation guides every future line of code written, whether by a human or an AI. It pays dividends for any long term project.
> Like there is a reason we are not using fuzzy human language in math/coding
You're misunderstanding the point of structural analysis. Comparing AI to divination isn't about making everything equivalent, but about highlighting specific shared structures that reveal how humans interact with these systems. The fact that this comparison can be extended to other domains doesn't make it meaningless.
The issue isn't "cached intuitions" about divination, but rather that you're reading the comparison too literally. It's not about importing every historical association, but about identifying specific parallels that shed light on user behavior and expectations.
Your proposed "resolutions" are based on a false dichotomy between total equivalence and total abandonment of comparison. Structural analysis can be useful even if it's not a perfect fit. The comparison isn't about labeling AI as "divination" in the classical sense, but about understanding the interpretive practices involved in human-AI interaction.
You're sidestepping the actual insight here, which is that humans tend to project meaning onto ambiguous outputs from systems they perceive as having special insight or authority. That's a meaningful observation, regardless of whether AI is "causally disentangled from reality" or not.
> It's not about importing every historical association, but about identifying specific parallels that shed light on user behavior and expectations.
Indeed, I hold that driving readers to intuit one specific parallel to divination and apply it to AI is the goal of the comparison, and why it is so jealously guarded, as without it any substance evaporates.
The thermometer has well-founded authority to relay the temperature, the bones have not the well-founded authority to relay my fate. The insight, such as you call it, is only illuminative if AI is more like the latter than the former.
This mode of analysis (the structural) takes no valid step in either direction, only seeding the ground with a trap for readers to stumble into (the aforementioned propensity to not clear caches).
> That's a meaningful observation, regardless of whether AI is "causally disentangled from reality" or not.
If the authority is well-founded (i.e., is causally entangled in the way I described), the observation is meaningless, as all communication is interpretative in this sense.
The structural approach only serves as rhetorical sleight of hand to smuggle in a sense of not-well-founded authority from divination in general, and apply it to AI. But the same path opens to all communication, so what can it reveal in truth? In a word, nothing.
> That's a meaningful observation, regardless of whether AI is "causally disentangled from reality" or not.
And regardless of how many words someone uses in their failed attempt at "gotcha" that nobody else is playing. There are certainly some folks acting silly here, and it's not the vast majority of us who have no problem interpreting and engaging with the structural analysis.
This is a great example because the LLM answer was insufficiently complete but if you didn't look up the result you wouldn't know. I think I remain an AI skeptic because I keep looking up the results and this kind of omission is more common than not.
Lumberjacks didn't go away when chainsaws were invented; demand for wood rose to meet the falling cost of wood, and lumberjacks kept cutting down trees. I don't see why it would be any different for programmers.
In the US, there were entire cities with essentially just lumberjacks. In other words, there used to be a LOT more lumberjacks than there are today, and they were a lot better paid.
too expensive since those are all licensed sources, much easier to train on Reddit data