levzettelin's comments | Hacker News

tl;dr

They made a C++ string type, let's call it "german_string", with sizeof(german_string) == 16. A german_string can be in SSO mode (a 4-byte size field plus 12 SSO bytes) or in long-string mode. In long-string mode they still use 4 bytes for the size, store no capacity (capacity is always equal to size), use another 4 bytes to always store the prefix of the string (this avoids a memory indirection in many cases when doing string comparisons), plus the 8-byte pointer. Furthermore, a long-mode german_string can also act as a plain string_view, or as a string_view into ".rodata"; which of these modes it is in is called the "string class". The string class is stored in two bits stolen from the 8-byte pointer.
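
A rough sketch of that layout in code (the field names, and the assumption that the two class bits sit at the top of the pointer, are mine for illustration, not the article's actual implementation):

    #include <cstdint>
    #include <cstring>

    struct german_string {
        uint32_t size;       // always the real length; no capacity is stored
        char     prefix[4];  // first (up to) 4 bytes of the string, always inline
        union {
            char     rest[8];        // short mode (size <= 12): remaining inline bytes
            uint64_t ptr_and_class;  // long mode: data pointer, 2 bits reused for the class
        };

        bool is_short() const { return size <= 12; }

        // String class: owned allocation, borrowed string_view-like memory,
        // or a persistent ".rodata" literal -- assumed here to sit in the top 2 bits.
        unsigned string_class() const {
            return is_short() ? 0u : unsigned(ptr_and_class >> 62);
        }

        const char* data() const {
            return is_short()
                ? prefix  // prefix and rest are laid out contiguously (12 inline bytes)
                : reinterpret_cast<const char*>(ptr_and_class & ~(uint64_t{3} << 62));
        }
    };
    static_assert(sizeof(german_string) == 16, "fits in two registers");

    // The always-inline prefix gives comparisons a cheap early-out without
    // dereferencing the pointer (assuming unused prefix bytes are zeroed).
    inline bool differ_early(const german_string& a, const german_string& b) {
        return a.size != b.size || std::memcmp(a.prefix, b.prefix, 4) != 0;
    }

Comparisons that pass the prefix check still have to look at the full payload, but many inequality checks never dereference the pointer at all.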


Nice collection of puzzles. But the ads on this are bloody awful. Can't even pay to get rid of them. What a pity.

TIP: Don't sign up! It doesn't give you any benefits, and you still can't get rid of the ads.


Terse C interpreter that can interpret itself.

Previous posts:

https://news.ycombinator.com/item?id=22353532

https://news.ycombinator.com/item?id=8558822


Probably just companies trying to impede the progress of other companies. Not to say the statement is necessarily wrong. But given that it's coming from a group of people who could very easily solve the problem, I'll take it with a grain of salt.


It’s not an easily solvable problem. They can’t make an AI which won’t lie or make stuff up, which is sort of the root of the problem. Imagine an AI which is granted access to control systems. We can’t trust such an AI to run control systems any more than we can trust it not to lie or make stuff up. There isn’t the sort of rigor behind AI development to permit creating a provably correct AI. There needs to be more study in order to understand the limits of AI fallibility and failure modes.


They could just collectively stop working on the problem until they feel that the issue is resolved (moratorium). That's what I meant by "they could very easily solve the problem".


Or they could still work on it, but not use it on customers until it's good enough.

> Zhou shared a striking example of how AI-generated content could lead to real-world consequences. “Some of the initial stock images of various ingredients looked like a hot dog, but it wasn’t quite a hot dog—it looked like, kind of like an alien hot dog,” he said. Such errors, he argued, could erode consumer trust or, in more extreme cases, pose actual harm. “If the recipe potentially was a hallucinated recipe, you don’t want to have someone make something that may actually harm them.”

There’s absolutely no reason Instacart has to show customers AI-hallucinated recipes from stock images. They choose to do it, then beat the drum about AI security as if they actually give a shit. It’s like Boeing self-certification.


There are some techniques to alleviate hallucinations and contradictory or confusing answers, but I have difficulty imagining a provably correct LLM because the attack surface is so large. The current methods for training for AI safety might be augmented with insights from chaos engineering, cognitive psychology, marketing, and persuasion, making LLMs agogic truth machines that score very low on hallucination benchmarks [1].

I think we should program and train LLMs with universally recognized agogic principles instead of staying neutral in this regard, to encourage critical thinking and prevent 'reality tunnels' in the mindsets of users, and perhaps also incorporate this in future training and curation techniques [2][3][4]. In short: how to raise GenAI and future AGI well.

* Data curation: Ensuring that the data used to train AI models is balanced and diverse helps prevent biases that could lead to hallucinations or harmful outputs. That means curating data from a wide range of sources, cultures, and viewpoints, and implementing quality control during data collection and preprocessing to filter out unreliable, outdated, or biased information.

* Targeted post-training (fine-tuning): After initial training, models can be fine-tuned on datasets specifically designed to emphasize helpfulness, harmlessness, and alignment with ethical principles. Embed ethical guidelines in these datasets, for example scenarios for handling sensitive topics, avoiding hate speech, and promoting fairness.

* Red-teaming: Stress-testing the model by simulating adversarial attacks or intentionally providing challenging prompts to see how it responds. This helps identify weaknesses, such as susceptibility to generating harmful content or hallucinations, and can be used to improve the model's robustness and safety.

* Post-training datasets focused on responsible-AI principles: Incorporating datasets that help the model understand the context and nuance of various topics, ensuring it can give responses appropriate to the situation.

* Refusal-aware instruction tuning: While data curation, targeted post-training, and red-teaming help prevent the introduction and propagation of false or harmful content, R-tuning directly enhances the model's ability to recognize its own limitations, enabling it to refuse to answer questions beyond its knowledge.

* Iterative refinement based on user feedback: Continuously collecting and analyzing feedback from users and independent review teams helps identify issues that may not have been apparent during development.

[1] Vectara hallucination leaderboard https://github.com/vectara/hallucination-leaderboard

[2] "On epistemic black holes: How self-sealing belief systems develop and evolve", Maarten Boudry and Steije Hofhuis, Theoria, August 2024. https://onlinelibrary.wiley.com/doi/epdf/10.1111/theo.12554

[3] Costello, T. H., Pennycook, G., & Rand, D. G. (2024, April 3). Durably reducing conspiracy beliefs through dialogues with AI. https://doi.org/10.31234/osf.io/xcwdn https://osf.io/preprints/psyarxiv/xcwdn

[4] BriX: Reducing polarization through Bridging and eXposure https://research.qut.edu.au/genailab/projects/brix-reducing-...


As commented before, by saying that they "could very easily solve the problem" I meant that they could just collectively stop using the problematic AIs in prod until they feel the issue is resolved (a moratorium), not that it would be easy to resolve the technical difficulties.


How often do you have to add single entries to some data file you're working with? For all other cases, Miller and xsv look more powerful.


Often you're adding multiple lines programmatically from a script or cron or whatever.


He wouldn't be speaking like this if he was born on Mars.


Sure he would, the meter would just be a different length.


I thought the key insight of this article was that if he had been born on Mars, the meter would have been defined differently so that gravity would still be 9.8 m/s^2.

I think what you meant to say was that he wouldn't be speaking like this if people were born with 3 fingers.


The pendulum from the Mars pole to Paris would be long indeed!


Finally an easy way to identify aliens!


How? Because they don't have a Venus?


neovim with the gp.nvim plugin.

Allows you to open chats directly in a neovim window. It also lets you select some text and run it through certain prompts (like "implement" or "explain this code"). Depending on the prompt, you can make the result appear directly inside the buffer you're currently working on. The request to the ChatGPT API is also enriched with the file type.

I hated AI before I discovered this approach. Now I'm an AI fanboy.


Can you post any evidence for this? I somehow have it in my mind that it's the other way around (i.e., mean doesn't grow anymore, but 99th percentile still does).


There is a chart floating around showing the mean age increasing, but the 99th-percentile age has been completely flat for at least a century.



Maybe a bit more than that. I'd say a week to start and about one or two months to achieve the previous speed.

