Are you using Flat indexes? If so, they should return the same results provided you are using the same distance function. If you aren't using Flat indexes, there might be more setup, but I'd recommend just using Flat indexes. They are plenty fast on most systems for searching ~1 million vectors (assuming 1024-dimensional 32-bit float vectors).

If you aren't doing anything crazy you could probably just get away with storing them all in a memory-mapped file.
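
For reference, a flat index is just exact brute-force search, which is why two implementations agree whenever the distance function matches. A minimal sketch of what it amounts to (illustrative names; assumes 32-bit float vectors and squared L2 distance):

    fn flat_search(query: &[f32], vectors: &[Vec<f32>], k: usize) -> Vec<(usize, f32)> {
        // Score every stored vector against the query (exact, no index structure).
        let mut scored: Vec<(usize, f32)> = vectors
            .iter()
            .enumerate()
            .map(|(i, v)| {
                let d: f32 = query.iter().zip(v).map(|(a, b)| (a - b) * (a - b)).sum();
                (i, d)
            })
            .collect();
        // Keep the k nearest by distance.
        scored.sort_by(|a, b| a.1.partial_cmp(&b.1).unwrap());
        scored.truncate(k);
        scored
    }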


Interesting to see this article on the performance advantage of not having to zero buffers, right after this article from 2 days ago: https://news.ycombinator.com/item?id=44032680

>Does anyone else have the sickening feeling that their actual plan is to straight up destroy USD

Stephen Miran, the chair of the Council of Economic Advisers under Trump, is pretty much trying to do this. He published 'A User's Guide to Restructuring the Global Trading System', which outlines why they should destroy the USD: to bring manufacturing home, so that the war hawks will no longer have any reason not to start a war with China.


I like Stephen Miran; I always like people who are (relatively) honest.

> CEA Chairman Steve Miran Hudson Institute Event Remarks

https://www.whitehouse.gov/briefings-statements/2025/04/cea-...


It's a bit of a stretch to call this juxtaposition of topics, with absolutely zero analysis of how they're connected, honest:

> we tax hardworking Americans mightily to finance global security. On the financial side, the reserve function of the dollar has caused persistent currency distortions and contributed, along with other countries’ unfair barriers to trade, to unsustainable trade deficits.

Money is fungible between these two concerns. The excess demand for USD is a source of revenue for our economic empire, realized through continual monetary inflation without nearly as much corresponding price inflation. Some of that monetary inflation has been used by the government (~"deficit spending"), but the vast majority has been dumped into the financial industry to bid up existing assets as a handout to the rich. That is what has left the American worker high and dry: a near-complete inability of the US government to use that already-centralized revenue to help wider society, due to a political movement based around fake austerity.

The article continues, using the passive voice to describe multiple things that the US government could have put a stop to any time it wanted, framed as if they were being done to us by other countries. For example:

> in the years running up to the 2008 crash, China along with many foreign financial institutions, increased their holdings of U.S. mortgage debt, which helped fuel the housing bubble, forcing hundreds of billions of dollars of credit into the housing sector without regard as to whether the investments made sense

Obviously if the government had set interest rates higher rather than lower, there would have been fewer mortgage bonds to buy and the dollars would have had to go elsewhere. I don't know if this pattern is deliberate or just an inevitable result of the bizarro framing where having the world reserve currency is asserted to be a liability, but either way it is most certainly not honest.


(relatively)

I call this kind of nonsense that can be seen through at a glance a 'relatively honest lie'.

The lines you quoted are nonsense that can be totally ignored without misunderstanding his point; they carry about as much meaning as 'get schwifty':

> Second, they can get schwifty by opening their markets and buying more from America

> Fifth, they could simply write checks to Treasury that help us get schwifty


SMH, the mental contortions people will go through to rationalize and whitewash the actions of this administration. You're either honest or you're not. There's no such thing as honest if people just read every third sentence and invent their own meanings for the rest. That's called dishonesty.

But even accepting your point for the sake of discussion, he is "honestly" doing what? Begging? How is that a good thing?


I don't think this is a matter of liking or disliking someone. While I can appreciate his candor about what he's trying to do, just because he's "honest" doesn't mean I agree with his policy.

What he is effectively saying is that large swaths of the American populace need to accept lower economic strength in order to decouple from our reliance on our trading partners. And his reason for doing so is so that we can wage war on them. It's a completely nonsensical approach to maintaining American hegemony. Why would anyone prefer strength through violence over the current system of American hegemony through trade?

While reserve currency status has its warts (especially, as he points out, the immense demand for US debt that fuels an uncontrolled domestic spending crisis), I believe it is 1000% preferable to my daughters working in factories and my sons dying in the Taiwan Strait. For what? So that maybe the US can bomb China back into a nation of poor farmers and claim ideological victory over the communist project? It's completely inane.

You need to go one step further and ask yourself why he's proposing this. Instead of reexamining our relationship with China and asking ourselves how we can win in a multi-polar future, Miran and Trump have taken the view that China must remain a global adversary and that we must maintain some sort of leverage over them.


Bizarre. I think I've been writing broken Rust code for a couple years. If I understand you correctly something like:

    let mut data = Vec::with_capacity(sz);
    unsafe { data.set_len(sz) };
    buf.copy_to_slice(data.as_mut_slice());
is UB?

It's an open question whether creating a reference to an uninitialized value is instant UB, or only UB if that reference is misused (e.g. if copy_to_slice reads an uninitialized byte). The specific discussion is whether the language requires "recursive validity for references", which would mean constructing a reference to an invalid value is "language UB" (your program is not well specified and the compiler is allowed to "miscompile" it) rather than "library UB" (your program is well-specified, but functions you call might not expect an uninitialized buffer and trigger language UB). See the discussion here: https://github.com/rust-lang/unsafe-code-guidelines/issues/3...

Currently, the team is leaning in the direction of not requiring recursive validity for references. This would mean your code is not language UB as long as you can assume `set_len` and `copy_to_slice` never read from `data`. However, it's still considered library UB, as this assumption is not documented or specified anywhere and is not guaranteed: changes to safe code in your program or in the standard library can turn this into language UB, so by doing something like this you're writing fragile code that gives up a lot of Rust's safety by design.
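
To make the language-UB vs. library-UB distinction concrete, here's a sketch of the two cases (a toy example, not from the thread):

    let mut v: Vec<u8> = Vec::with_capacity(4);
    unsafe { v.set_len(4) }; // len now covers uninitialized bytes
    // The contested step: a &mut [u8] pointing at uninitialized memory now exists.
    let r: &mut [u8] = v.as_mut_slice();
    // Reading through it produces an uninitialized integer value: definitely UB.
    let _x = r[0];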


That's right. Line 3 is undefined behaviour because you are creating a mutable reference to the uninitialized spare capacity of the vec. copy_to_slice only works when writing to already-initialized slices. The proper way for your example to mess with the uninitialized memory of a vec would be to use only raw pointers, or to call the newly added Vec::spare_capacity_mut function, which returns a slice of MaybeUninit.

Why not simply:

    let mut data = Vec::with_capacity(sz);
    data.extend(&buf[..sz]);
Vec::extend extends a container from an iterable. A Vec/slice is iterable.

And from the doc:

> This implementation is specialized for slice iterators, where it uses copy_from_slice to append the entire slice at once.

Of course this trivial example could also be written as:

    let mut data = buf.clone();

Yes, this is the case that I ran into as well. You have to zero the memory before reading, and/or keep some crazy combination of state tracking what's uninitialized capacity versus initialized len. I think the Rust stdlib's Write impl for &mut Vec got butchered over this concern.

It’s strictly more complicated and slower than the obvious thing to do and only exists to satisfy the abstract machine.


No. The correct way to write that code is to use .spare_capacity_mut() to get a &mut [MaybeUninit<T>], then write your Ts into that using .write_copy_of_slice(), then .set_len(). And that will not be any slower (though obviously more complicated) than the original incorrect code.
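
Roughly, that shape looks like this (a sketch, reusing `buf` and `sz` from the example upthread; as noted downthread, write_copy_of_slice is still unstable, so this needs a nightly feature gate):

    let mut data: Vec<u8> = Vec::with_capacity(sz);
    // Write through MaybeUninit, so no reference to uninitialized u8s is ever formed.
    data.spare_capacity_mut()[..sz].write_copy_of_slice(&buf[..sz]);
    // Safety: the first `sz` elements were just initialized above.
    unsafe { data.set_len(sz) };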

Oh this is very nice, I think it was stabilized since I wrote said code.

write_copy_of_slice doesn't look to be stable. I'll mess around with godbolt, but my hope is that whatever incantation is used compiles down to a memcpy.

As I wrote in https://news.ycombinator.com/item?id=44048391 , you have to get used to copying the libstd impl when working with MaybeUninit. For my code I put a "TODO(rustup)" comment on such copies, to remind myself to revisit them every time I update the Rust version in toolchain.toml

In other words the """safe""" stable code looks like this:

    let mut data = Vec::with_capacity(sz);
    // with_capacity may over-allocate, and copy_from_slice panics on a
    // length mismatch, so take exactly `sz` elements of the spare capacity.
    let dst_uninit = &mut data.spare_capacity_mut()[..sz];
    let uninit_src: &[MaybeUninit<T>] = unsafe { transmute(&buf[..sz]) };
    dst_uninit.copy_from_slice(uninit_src);
    // Safety: the first `sz` elements were initialized by the copy above.
    unsafe { data.set_len(sz) };

That's correct.

Valgrind it :)

Valgrind doesn't tell you about UB, just whether the code did something incorrect with memory, and that depends on what the optimizer did if you wrote UB code. You'll need Miri to tell you whether this kind of code triggers UB; it works by evaluating and analyzing the compiler's mid-level IR to check that Rust's safety rules are followed.

Reading from uninitialised memory is a fault that valgrind will detect.

But that's precisely NOT the problem that exists in OP's code. It's a problem Valgrind will detect if and only if the optimizer does something weird to exploit the UB in the code, which may or may not happen, AND it doesn't even necessarily happen on that line of code, which will leave you scratching your head.

UB is weird, and valgrind is not a tool for detecting UB. For that you want Miri or UBSAN. Valgrind's equivalents are ASAN and MSAN, which catch UB issues incidentally in some rare cases, and not necessarily where the UB actually happened.


You are still doing a copy, and people want to avoid the needless memory copy.

If you are decoding a 4 megabyte jpeg, and that jpeg already exists in memory, then copying that buffer by using the Reader interface is painful overhead.


Getting an io.Reader over a byte slice is a useful tool, but the primary use case for io.Reader is streaming stuff from the network or file system.

In this context, you can either have the io.Reader do a copy without allocating anything (take in a slice managed by the caller), or allocate and return a slice. There isn't really a middle ground here.


And are you going to work on all 4 MB at a time? Even if you wanted to plop it onto a socket, you would just use io.Copy, which adds no overhead; no matter what, you are always going to copy the bits out to place them in the socket to be sent.


>And are you going to work on all 4 MB at a time?

Yes? Assume you were going to decode the jpeg and display it on screen. I assume the user would want to see the whole jpeg at once.

Consider that you are writing a program that processes a bunch of jpegs by running some AI inference on them.

1. You read the jpegs from disk into memory.
2. You decode those jpegs into RGBA buffers.
3. You run inference on the RGBA buffers.

The current image.Decode interface forces you to do a memcopy between steps 1 and 2:

1. You read the jpegs from disk into memory.
2. You copy the data into another buffer, because you are using the Reader interface.
3. You decode those jpegs into RGBA buffers.
4. You run inference on the RGBA buffers.

Step 2 isn't needed at all, and if the images are large, it adds latency. If you are running on something like a Raspberry Pi, depending on the size of the jpegs, the delay would be noticeable.


>significantly faster for this kind of code

"Significantly" and "this kind" are load-bearing phrases here. In applications where predictable latency is desired, cloning is better than GC.

This is also the baby steps of learning the language. As programmers get better, they will recognize when they are making superfluous clones. Refactoring performance-critical stuff in FFI, however, is painful and won't get easier with time.

Furthermore, in real applications this only really applies to Strings and Vecs. In most of my applications, most `clone`s are of reference types, which is only marginally more expensive than memory sharing under a GC.
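
To illustrate the last point, a clone of a reference-counted handle is just a counter bump, not a deep copy (toy sketch):

    use std::sync::Arc;

    fn main() {
        let s = String::from("some large payload");

        // Deep clone: allocates a new buffer and copies every byte.
        let deep = s.clone();

        // Arc clone: bumps a reference count and shares the same buffer,
        // close in spirit to sharing a pointer under a GC.
        let shared = Arc::new(s);
        let alias = Arc::clone(&shared);

        assert_eq!(deep, *alias);
    }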


I feel like leftpad has given package managers a very bad name. I understand the OP's hesitation, but it feels a little ridiculous to me.

tokio is a work-stealing, asynchronous runtime. This is functionality that would be an entire language runtime elsewhere. Does OP consider it reasonable to audit the entire Go runtime? Or the V8 engine for Node? V8 is ~10x more lines than tokio.

If Cloudflare uses Node, would you expect Cloudflare to audit V8 quarterly?


And for what it's worth, people do audit tokio. I have audited tokio. Many times in fact. Sure, not everyone will, but someone will :)


How does one approach doing so? Do you open main.rs (or whichever file is the entry point) and start reading the code and referenced functions in a breadth-first search (BFS) manner?


If two different dependencies use a different version of some other dependency between them, does cargo still include both versions by default?

This is something I've only ever seen cargo do.


It'll do that if there isn't a single version that meets both requirements. Which is a great thing, because most other languages will just fail the build in that case (well, there are still cases where it won't work even in Rust, e.g. if types from those sub-dependencies are passed between the two closer dependencies).


> If two different dependencies use a different version of some other dependency between them does cargo still include both versions by default?

No, cargo will resolve using semver compatibility and pick the best version. NuGet, for C#, does something very similar.


> This is something I've only ever seen cargo do.

npm does this (which causes [caused?] the node_modules directory to usually have a megazillion files, though sometimes "hoisting" common dependencies helps; there's also Yarn's PnP [which hooks into Node's require() and keeps packages as ZIPs], and pnpm, which uses symlinks/hardlinks).


Of all the information I've seen about semaglutides, the only people I've seen keep the weight off are:

1. High-end personal trainers' clients, for whom semaglutide was used in conjunction with the trainer's workout regimen and diet.

2. Bodybuilders and models, for whom semaglutides simply replaced caffeine/adderall/ephedrine.

The drugs can't induce the lifestyle change needed to keep the weight off (nor will they give you the motivation to go to the gym and build muscle). For now, I think it's a race to see how cheap these drugs can get and whether they have side effects from very long-term use. Overall I think the drugs are a net good and I'm interested in seeing the effects for myself, but I'm in good shape and $500/mo is still steep.


I have to say I take basically the opposite view on everything you say here. I know quite a few people who aren't super active (myself included) who took it, went off it, and kept the weight off.

And they actually do induce lifestyle changes, which is the fascinating part. Not for everyone, but the impulse-control changes are dramatic. I had a friend credit it for his going to therapy for the first time in his life, and for reading for the first time since high school, which was crazy, but it makes sense because it also helped him quit smoking weed, so he had a lot more time.


Do you know which drug they took? Do you know if it was a compound? I'm a layman here and I'm interested in the success stories.

Ignoring the weight loss, I find the quality-of-life improvements from the drug fascinating, and they would be my number 1 reason for trying it.


The best is definitely Tirzepatide; it's both stronger and easier to tolerate. You can find it online from more or less reliable sources.


This seems to starkly contradict the current data on GLP-1 agonists.


There are only 3 effective browsers: Chrome, Safari, and Firefox. I don't see how limiting Google's control will create competition. The barrier to more browsers is the massive investment needed to create one, not anything Google is doing.


You are correct, although it's more correct to say there are only 3 major browser engines: Blink (used by all Chromium derivatives), WebKit (used by Safari and some minor browsers), and Gecko (used by Firefox and its derivatives). Creating a browser engine is hard, so hard that even a multi-billion-dollar company like Microsoft gave up on doing it. And we may soon witness Gecko going away as a side effect of the Google antitrust lawsuit.


This is circular reasoning. You are pretty much saying democracies should be the aggressors first. If you swap `authoritarian` and `democracy` in your statement, it will ring just as true.

However, the parent poster paints a different picture. If people in Moscow were economically threatened by reduced trade caused by an invasion, the elite appetite for such a move would be reduced.

