They can if they've been post-trained on what they know and don't know. The LLM can first be given questions to test its knowledge, and if the model returns a wrong answer, it can be given a new training example with an "I don't know" response.
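As a rough sketch (every name here is hypothetical, not any lab's actual pipeline), that kind of post-training data could be generated by probing the model and relabeling its misses:

```python
# Sketch: build "I don't know" fine-tuning examples from questions the
# model gets wrong. `model_answer` is a hypothetical stand-in for a call
# to the model being post-trained.

def build_idk_examples(qa_pairs, model_answer):
    """For each (question, gold) pair, probe the model; if its answer is
    wrong, emit a training example mapping the question to an IDK reply."""
    examples = []
    for question, gold in qa_pairs:
        predicted = model_answer(question)
        if predicted.strip().lower() != gold.strip().lower():
            examples.append({"prompt": question, "completion": "I don't know."})
        else:
            examples.append({"prompt": question, "completion": gold})
    return examples
```

A real pipeline would be much fuzzier (sampling multiple answers, grading with a judge model), but the shape is the same: wrong answers become IDK targets.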
“Hallucination” is seeing/saying something that a sober person clearly knows is not supposed to be there, e.g. “The Vice President under Nixon was Oscar the Grouch.”
Harry Frankfurt defines “bullshitting” as lying to persuade without regard to the truth. (A certain current US president does this profusely and masterfully.)
“Confabulation” is filling the unknown parts of a statement or story with bits that sound as if they could be true, i.e. they make sense within the context, but are not actually true. People with dementia (e.g. a certain previous US president) will do this unintentionally. Whereas the bullshitter generally knows their bullshit to be false and is intentionally deceiving out of self-interest, confabulation (like hallucination) can simply be the consequence of impaired mental capacity.
> Frankfurt understands bullshit to be characterized not by an intent to deceive but instead by a reckless disregard for the truth.
That is different from defining "bullshitting" as lying. I agree that "confabulation" could otherwise be more accurate. But with the previous definition, they are kind of synonyms? And "reckless disregard for the truth" may hit closer.
The paper has more direct quotes about the term.
You're right. It's "intent to persuade with a reckless disregard for the truth." But even by this definition, LLMs are not (as far as we know) trying to persuade us of anything, beyond the extent that persuasion is a natural/structural feature of all language.
Claude 4's system prompt was published and contains:
"Claude’s reliable knowledge cutoff date - the date past which it cannot answer questions reliably - is the end of January 2025. It answers all questions the way a highly informed individual in January 2025 would if they were talking to someone from {{currentDateTime}}, "
I thought best guesses were that Claude's system prompt ran to tens of thousands of tokens, with figures like 30,000 tokens being bandied about.
But the documentation page linked here doesn't bear that out. In fact the Claude 3.7 system prompt on this page clocks in at significantly less than 4,000 tokens.
Yup. Either the system prompt includes a date it can parrot, or it doesn't and the LLM will just hallucinate one as needed. Looks like it's the latter case here.
Technically they don’t, but OpenAI must be injecting the current date and time into the system prompt, and Gemini just does a web search for the time when asked.
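A minimal sketch of that kind of injection, assuming a template with a {{currentDateTime}} placeholder like the published Claude prompt uses (the template wording below is invented for illustration):

```python
from datetime import datetime, timezone

# Illustrative template; the placeholder syntax mirrors the published
# Claude system prompt, but this wording is made up.
TEMPLATE = (
    "The assistant's reliable knowledge cutoff is the end of January 2025. "
    "Today's date is {{currentDateTime}}."
)

def render_system_prompt(template, now=None):
    """Fill the date placeholder at request time."""
    now = now or datetime.now(timezone.utc)
    return template.replace("{{currentDateTime}}", now.strftime("%Y-%m-%d"))
```

The key point: the date comes from the serving infrastructure at request time, not from the weights, which is why the weights alone can't reliably report it.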
The point is you can't ask a model what its training cutoff date is and expect a reliable answer from the weights themselves.
The closest you could do is have a benchmark with -timed- questions the model could only answer if it had been trained on that period, and you'd have to deal with hallucinations vs. correctness, etc.
It's just not what LLMs are made for; RAG solves this, though.
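A toy version of that benchmark idea (the model call is stubbed here; a real run would also need to score confident-but-wrong answers carefully, which is the hallucination problem mentioned above):

```python
# Sketch: estimate a model's effective cutoff by asking questions whose
# answers only became knowable after a given date, then taking the latest
# date it still answers correctly. `ask` is a hypothetical model call.

def estimate_cutoff(dated_questions, ask):
    """dated_questions: list of (date_str, question, answer), sorted by date.
    Returns the latest date for which the model answered correctly, or None."""
    cutoff = None
    for date, question, answer in dated_questions:
        if ask(question).strip() == answer:
            cutoff = date
        else:
            break  # first miss; treat later questions as past the cutoff
    return cutoff
```

In practice you'd want many questions per month and a tolerance for noise rather than stopping at the first miss, but this is the basic shape.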
What would the benefits be of actual time concepts being trained into the weights? Isn’t just tokenizing the dates and including those as normal enough to yield benefits?
E.g. it probably has a pretty good understanding between “second world war” and the time period it lasted. Or are you talking about the relation between “current wall clock time” and questions being asked?
What I mean, I guess, is that LLMs can -reason- linguistically about time by manipulating language, but can't really experience it. A bit like physics. That's why they do badly on exercises/questions about physics/logic that their training corpus might not have seen.
Different teams work on the backend and frontend, surely, and the people experimenting on the prompts, for whatever reason, want to go through the frontend pipeline.
Not an answer to your question, but BTC does not really have wallets. Someone just signs a number of BTCs with their private key and your public key, so the only way to sign them again is with your private key. This is essentially how they "move" to "someone".
Some other DLTs (blockchains) actually have wallets/accounts with properties, where you can disable incoming funds, add a name, change the password, and such.
Unlike Ethereum, Bitcoin addresses are just a hash of a public key. The sent Bitcoins are unspent outputs that you can selectively (depends on the wallet) decide not to use.
That would work. But then there's also the question of whether the received transaction counts as taxable income in the jurisdiction of the receiver. If it does, and if they received a very significant amount, they would be forced to sell some of the coins so that they can pay the tax on them.
There's also the question of whether there is some limit or ratio of tainted coins that would be considered non-incriminating. If there isn't, would a single satoshi taint all the coins? And if there were such a ratio, wouldn't dilution of funds be possible? That is, wash them by sending them to wallets with enough funds...
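The dilution arithmetic is straightforward if taint is modeled as a simple ratio; whether any chain-analysis firm actually applies a threshold like this is exactly the open question, so treat this as illustration only:

```python
# Toy model: taint as a fraction of a holding's total value, in satoshis.
def taint_ratio(tainted_sats, total_sats):
    return tainted_sats / total_sats

# One tainted satoshi landing in a wallet that already holds 1 BTC:
SATS_PER_BTC = 100_000_000
ratio = taint_ratio(1, SATS_PER_BTC + 1)  # a vanishingly small fraction
```

Under any threshold-based rule, enough mixing drives the ratio below the threshold, which is why a pure-ratio definition of taint is hard to enforce.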
That's assuming a BTC miner even agrees to process the transaction. Why would a BTC miner want to deal with the headache of tainted bitcoins? It's such a problem that Marathon Digital Holdings, a major bitcoin miner, has stated they will refuse to process transactions from tainted addresses. In the near future, I expect more mining pools to do the same.
Would you return stolen goods to the thief? How about returning them to the owner/authorities? That may not be simple, but it's certainly better than sending them back. If all else fails, there are black-hole addresses that lock the BTCs forever.
What is tracked is outputs and inputs, not addresses. Addresses are derived from a public key or a script.
When a block is mined, an output is created; outputs can only be spent once. A Bitcoin transaction is just a list of existing outputs (inputs) and new outputs to create (outputs). Each output is created with a lock script; to spend the output, you must provide the unlock script, which normally contains at least a signature and a public key.
Returning the output that corresponds to the unwanted transaction should do.
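That model of a transaction can be sketched directly. Scripts are plain strings here purely for illustration; real lock/unlock scripts are Bitcoin Script programs, and real transactions carry more fields:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TxOutput:
    value_sats: int
    lock_script: str  # placeholder for a real Bitcoin Script locking script

@dataclass
class Transaction:
    inputs: list   # previously created, unspent outputs being consumed
    outputs: list  # new outputs, each locked with its own script

# Spending one 50,000-sat output into two new outputs; the value not
# re-assigned to an output is implicitly the miner's fee.
prev = TxOutput(50_000, "lock(alice)")
tx = Transaction(inputs=[prev],
                 outputs=[TxOutput(30_000, "lock(bob)"),
                          TxOutput(19_000, "lock(alice-change)")])
fee = prev.value_sats - sum(o.value_sats for o in tx.outputs)
```

This is why "returning" a tainted payment means spending that specific output back, not debiting an account balance.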
Would you pay the transaction fee from the returned funds, washing a fraction of it through mining? Or pay for it out of your own utxo, potentially leading to a griefing attack?
If I were a malicious actor with a good way to attack this, I would wait until this was being used in real applications which deal with larger amounts of ETH, and then attack.
Bug bounties aren't aimed at malicious actors, nor are they an attempt to outbid the black market. There are a lot of non-malicious people out there who are still competent hackers.
The cockpit recording from Japan Airlines Flight 123 (which crashed due to improper repairs of tail strike damage 7 years earlier) is chilling.
https://www.youtube.com/watch?v=Xfh9-ogUgSQ
> Casualties of the crash included all 15 crew members and 505 of the 509 passengers
> ...
> deadliest single-aircraft accident in history, the deadliest aviation accident in Japan, the second-deadliest Boeing 747 accident and the second-deadliest aviation accident after the 1977 Tenerife airport disaster.
> ...
> During the investigation, Boeing calculated that this incorrect installation would fail after approximately 10,000 pressurization cycles; the aircraft accomplished 12,318 successful flights from the time that the faulty repair was made to when the crash happened.
> ...
> In the aftermath of the incident, Hiroo Tominaga, a JAL maintenance manager, killed himself to atone for the incident, while Susumu Tajima, an engineer who had inspected and cleared the aircraft as flight-worthy, committed suicide due to difficulties at work.
> > In the aftermath of the incident, Hiroo Tominaga, a JAL maintenance manager, killed himself to atone for the incident, while Susumu Tajima, an engineer who had inspected and cleared the aircraft as flight-worthy, committed suicide due to difficulties at work.
That sucks. If ever there were two people who knew first-hand how important it was to get this right, and who would have worked to make sure that accident (and likely others) could never happen on their watch, it was them. Suicide as atonement is a stupid, counter-productive cultural norm (and hopefully it's much less of a norm in any modern society where it still exists).
Interestingly, at the end of WWII, when Tojo was captured by the Americans, he shot himself in the abdomen--evoking seppuku--and apologized that "I am taking so long to die."
Not long after, he was tried for war crimes and executed by hanging.
I'm sure he would rather have died from the gunshot.
Whether right in some cases and wrong in others, I think it's difficult to say cultural norms are simply and plainly wrong.
I'm not sure comparing a case where the person has a high likelihood of dying anyway is appropriate, based on what I was referring to. I don't fault the skydiver who packed their parachute incorrectly, causing it to fail, for committing suicide before hitting the ground; I do fault a system that would drive the person responsible for re-checking the parachutes to commit suicide because of the death.
I also didn't say the cultural norm was wrong, just that it was stupid and counter-productive. I meant stupid as an enhancement to counter-productive, and I meant counter-productive in relation to actually advancing a society to the point where the problem that caused the suicide in the first place is less common.
As someone who lives and works in Japan, it's absolutely shameful how people do this, when they are pressured so much from above not only to work insane hours, but to take the entire responsibility upon themselves. It's as if there is a magical white line: orders can pass down through it, but responsibility for errors cannot pass back up through it.
I've just been reading a bit on the aftermath and what is amazing to me is that Boeing apparently paid nothing in compensation or liability, but JAL did, despite the accident being nearly entirely the fault of Boeing!? Does anyone know why?
Based on my reading, the fault lies with JAL for executing the wrong repair procedure. Why do you think the fault lies with Boeing?
Boeing specifies: "repair procedure is like so" and due to misunderstanding, pressure to bring equipment back into use, failure to acquire the correct parts or some sort of similar problem [all my speculation btw], JAL maintenance executes another repair procedure that seems equivalent to them (or at least adequate).
Flaps do increase drag (especially at higher speeds) but they also decrease the stall speed which is why they're deployed during takeoff and landing-- they allow a slower takeoff speed. In effect, at lower speeds, the lift increase has a bigger effect than the drag.
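The trade-off shows up in the standard stall-speed formula, V_stall = sqrt(2W / (rho * S * CL_max)): flaps raise the maximum lift coefficient CL_max, which lowers V_stall. The numbers below are illustrative, not for any particular aircraft:

```python
import math

def stall_speed(weight_n, rho, wing_area_m2, cl_max):
    """Stall speed in m/s from the level-flight lift equation."""
    return math.sqrt(2 * weight_n / (rho * wing_area_m2 * cl_max))

# Same aircraft, clean vs. flaps down (CL_max 1.5 vs. 2.2, both illustrative):
clean = stall_speed(weight_n=10_000 * 9.81, rho=1.225, wing_area_m2=30, cl_max=1.5)
flaps = stall_speed(weight_n=10_000 * 9.81, rho=1.225, wing_area_m2=30, cl_max=2.2)
# flaps-down stall speed is sqrt(1.5/2.2), roughly 83% of the clean stall speed
```

So even though the extra camber costs drag, the lower stall speed is what makes shorter, slower takeoffs and landings possible.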
Note that this is a historical accident that hasn’t happened since and that accidents in aviation are extremely uncommon. You’re significantly more likely to get T-boned by a driver on their phone browsing Facebook than you are to encounter a minor incident in the air.
You should read Cockpit Confidential; it’s a great book written specifically to address your concerns
Well, cars don't have drive recorders; they seem to cause a lot more deaths, and probably should have them.
I think a lot of people have anxiety about flying; I used to for a while. The best recommendation I can give to get over it (or at least to mitigate it a little) is to take some flight lessons. I completely switched from being nervous to wanting to fly fairly quickly :-)
I know for a fact that I would have far less fear (maybe even NONE) if I were the one flying the plane. One of these days I might take your advice and go for some flying lessons.
I know that the issue at its heart, is about control. In a commercial airline flight I have no control over anything, and that's the root of the fear. It's why I'm not afraid of driving, despite knowing my risks are statistically MUCH higher. At least when I get behind the wheel, I feel like I have the ability to do something about risks as they arise, even if my chances of reacting in time are extremely small, at least I can do something.
I couldn't help but think about being on that flight for 30 minutes, knowing it was going to crash and being completely unable to do anything about it. I've always wondered what would be going through other people's minds. I would be having a crisis, and possibly would die from a heart attack before the plane even crashed.
For what it's worth, I'm a huge proponent of self driving cars BECAUSE I recognize the potential for a much safer world.
I've been using OrientDB (one of the leading Graph databases) for the last few months; it's been a horrible experience to get it working with Python, as the official Python OrientDB driver is essentially a very thin wrapper around the binary protocol.
Using Gremlin would actually be nice, and save me a lot of nasty queries, but it seems like there is no python ecosystem for it.
The presentation mentions "Gremlin-Python". A quick Google search brings up these results:
I don't think a lack of commits in 10 and 6 months is a very good indication of whether or not the projects are dead.
Assuming they just wrap other libraries, there's usually not much to screw up, so I wouldn't expect them to be heavily committed to, except after a release of the underlying library.
I also need a graph DB with Python. I thought I'd go with ArangoDB instead of OrientDB, but the Arango Python bindings also don't strike me as really stable. So now I'm thinking of using Blazegraph instead. The RESTful interface should be pretty easy from Python, and I'm thinking SPARQL is actually going to be a much better query language for my end users.
> Which version of tailwind css do you know?
> I have knowledge of Tailwind CSS up to version 3.4, which was the latest stable version as of my knowledge cutoff in January 2025.