Hacker News | Xorlev's comments

This is often (though not always) a blanket statement.

Logs are always generated, and logs include some amount of data about the user, if only environmental.

It's quite plausible that the spellchecker doesn't store your actual user data but does store information about the request, or that error logging captures more UGC than intended.

Note: I don't have any insider knowledge about their spellcheck API, but I've worked on similar systems that use similar language to cover little more than basic request logging.


PII is stored _at most_ for 30 days.


Do you have a source for that? I'm not denying it, just curious to read more.


From a cursory search:

> Preliminary information about the accident remains scarce, though two people familiar with the aircraft tell The Air Current that the aircraft in question, N704AL, had presented spurious indications of pressurization issues during two instances on January 4. The first intermittent warning light appeared during taxi-in following a previous flight, which prompted the airline to remove the aircraft from extended range operations (ETOPS) per maintenance rules. The light appeared again later the same day in flight, the people said.

https://theaircurrent.com/feed/dispatches/alaska-737-max-9-t...

No idea about the accuracy of the site. And it seems like they have some script that prevents text highlighting for whatever reason (turn off JavaScript).


Well, that's an interesting thing. During taxi-in, the cabin altitude should be the ground altitude; outflow valves open at touchdown.

Hard to understand how an incipient failure could manifest then (e.g. from increased leakage).

Of course, there are warning lights for excessive cabin pressure, etc., too... which would point to a different theory than a structural manufacturing problem.


Is "sensor just no longer responding" a failure mode which could trigger the alarm?


Jon Ostrower is one of the best aviation reporters in the business, and The Air Current is a site many professionals and executives in the industry trust.


It will come out in the NTSB report, if it's true. Though that will take quite a bit of time.


It's too bad that asking "source?" comes across as hostile unless clarified to be otherwise. Maybe the internet should adopt something similar to the "/s" tag that signals that sentiment.


Asking for any sort of clarifying information inevitably leads to argumentation on Reddit. It’s like we’ve all learned to be so polite that the truth barely matters (I’m exaggerating of course).


https://github.com/openzfs/zfs/issues/7631

This is a long-standing issue with zvols which affects overall system stability, and has no real solution as of yet.


You'd think so, but for datacenter workloads it's absolutely common, especially if you're just scheduling a bunch of containers together. Computation also doesn't happen in a vacuum; unless you're doing some fairly trivial processing, you're likely loading quite a bit of memory, perhaps many multiples of what your business logic actually needs.

It's also not as simple as GB/s per core, since cores aren't entirely uniform and data access may cross core complexes.


I'm not sure what you mean by datacenter workloads.

The work I do could be called data science and data engineering. Outside some fairly trivial (or highly optimized) sequential processing, the CPU just isn't fast enough to saturate memory bandwidth. For anything more complex, the data you want to load is either in cache (and bandwidth doesn't matter) or it isn't (and you probably care more about latency).


I had these two dual-18-core Xeon web servers with seemingly identical hardware and software setups, but one was doing 1100 req/s and the other 500-600.

After some digging, I realized that one had 8x8GB RAM modules and the slower one had 2x32GB.

I did some benchmarking then and found that it really depends on the workload. The www app was 50% slower. Memcache 400% slower. Blender 5% slower. File compression 20%. Most single-threaded tasks no difference.

The takeaway was that workloads want some bandwidth per core, and shoving more cores into servers doesn't increase performance once you hit memory bandwidth limits.


This seems very unlikely. The CPU is almost always bottlenecked by memory.


It's usually bottlenecked by memory latency, not bandwidth. People talk about bandwidth because it's a simple number that keeps growing over time. Latency stays at ~100 ns because DRAM is not getting any faster. Bandwidth can become a real constraint if your single-threaded code is processing more than a couple of gigabytes per second, but it usually takes a lot of micro-optimization to do anything meaningful at such speeds.
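
To make the distinction concrete, here's a minimal sketch (assuming NumPy is installed; the sizes are illustrative) that reads the same buffer twice. Both passes touch the same bytes; only the access order differs, so the sequential pass is bandwidth-bound while the random pass pays DRAM latency on nearly every element.

    # Latency-bound vs. bandwidth-bound access over the same ~160 MB buffer.
    import time
    import numpy as np

    N = 20_000_000                      # float64 buffer far larger than any cache
    data = np.ones(N)
    seq_idx = np.arange(N)              # sequential: prefetch-friendly, bandwidth-bound
    rnd_idx = np.random.permutation(N)  # random: mostly cache misses, latency-bound

    def timed_gather_sum(idx):
        t0 = time.perf_counter()
        total = data[idx].sum()         # gather in the given order, then reduce
        return total, time.perf_counter() - t0

    _, t_seq = timed_gather_sum(seq_idx)
    _, t_rnd = timed_gather_sum(rnd_idx)
    print(f"sequential: {t_seq:.3f}s  random: {t_rnd:.3f}s  ({t_rnd / t_seq:.1f}x slower)")

On most machines the random pass is several times slower despite reading exactly the same data.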


Except it's also trivial to buy or produce tables of pre-hashed emails, so this cloak of "oh, we don't know who you are, it's a hash!" is usually just lip service.


They're not literally passing around the hash. Holders of hash(email) <=> browser cookie associations are heavily incentivized, for both regulatory and competitive reasons, not to blast that information around the internet -- or even to let direct partners A & B identify overlaps without themselves being in the middle.

When passing identifiers, there's generally some combination of lookup tables, per-distribution salted hashes, or encryption happening to make reverse mapping as difficult as possible.

(I was in this space up until a few years ago).
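
A rough sketch of the "per-distribution salted hash" idea from above; nothing here is any particular vendor's scheme, and the salts and function names are made up for illustration:

    # Illustrative only: derive a partner-specific identifier so that the ID
    # handed to partner A can't be joined against the one handed to partner B,
    # and a generic hash(email) lookup table doesn't reverse it.
    import hashlib
    import hmac

    def partner_id(email, partner_salt):
        normalized = email.strip().lower().encode("utf-8")
        return hmac.new(partner_salt, normalized, hashlib.sha256).hexdigest()

    salt_a = b"secret-salt-for-partner-A"   # hypothetical per-partner secrets
    salt_b = b"secret-salt-for-partner-B"

    print(partner_id("Jane.Doe@example.com", salt_a))
    print(partner_id("Jane.Doe@example.com", salt_b))  # different ID, same user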


Which makes perfect sense, because why would anyone sell their golden goose, if they had any other possible way of monetizing it?


I think this is false in most instances. All advertising pixels use unsalted SHA-1 hashes over HTTPS.


Things like homomorphic encryption make secure sharing possible without the security being lip service.


This is one of the things that drives me nuts when hardcore privacy advocates start wading into browser feature discussions and complaining about things being used to fingerprint users.

I mean, can eye-tracking in a WebXR session be used to identify users? Yes, clearly that is a possibility. But will the addition of eye-tracking increase the identifiability of users? No, not in the least, because users are already identifiable by means that involve core browser features.

But frequently, the "privacy advocates" win and we're left with a web platform that has a lot of weird, missing functionality in comparison to native apps, pushing developers to either compromise on functionality or develop a native app. Compromising is bad for users. And developing a native app can be bad for the developer, if one considers their existing investment in web technologies. Or bad for both the developer and users, when one considers the vig that app stores charge, or the editorial control that app stores enforce over socially-controversial-yet-not-actually-illegal topics. Or bad for just users, when one considers that the app stores simply hand app developers a user identity without even making them work for it with fingerprinting.

And often, the voices that are loudest in defence of "privacy" are browser developers that also just so happen to be employed by said app store vendors.


> produce tables of pre-hashed emails

Wouldn't this require knowledge of the email beforehand?


I think the idea is that you can trivially generate the MD5 hash of all, say, 8-letter @gmail.com addresses, and since the email hashes used for targeting don't have a salt, it's a one-time "expense" to build the reverse lookup table.
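
A quick sketch of that precomputed-table idea; the candidate list here is tiny and made up, since a real table would just enumerate or buy huge address lists once:

    # Hash candidates once, then reverse any unsalted MD5 with a dict lookup.
    import hashlib

    candidates = ["alice@example.com", "bob@example.com", "carol@example.com"]
    reverse_table = {
        hashlib.md5(addr.encode("utf-8")).hexdigest(): addr
        for addr in candidates
    }

    observed = hashlib.md5(b"bob@example.com").hexdigest()   # e.g. seen in a pixel
    print(reverse_table.get(observed, "unknown"))            # -> bob@example.com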


Android also reaps permissions that haven't been used recently. In the case of location, Android prompts for renewal even if it has been used recently.


There is no war in Ba Sing Se.

Just saying it doesn't make it true. Twitter has a long way to go to regain advertiser confidence, and Blue is barely a footnote.


He's probably thinking of it from a user experience point of view, where there are lots of shiny new features and no big downsides.


Is it though? The new verified system was rolled out really poorly.

There should have been a migration path from legacy to new verified, but instead they just unverified everyone (including accounts that are obviously government accounts and should retain a grey check under the new rules).


That's true, but I don't think it affected the average user much.


I must be using a different site from you. Letting people pay to get boosted has turned the top of every thread into a hive of emoji-pasting, cruel, low-effort cretins.


Everyone's Twitter feed is different. Mine doesn't have that.


Software, movies, and music are just strings of bits.

Using something leaked always carries some inherent risk.


The difference is that software and music are made by authors, unlike keys; that's what makes them copyrightable.


+1. And, it's in version control forever. It's not as if it entirely disappears. Like one of the sibling comments mentioned, I only rarely reject Sensenmann CLs.

That's worth explaining: it's automated code deletion, but the owner of the code (a committer to that directory hierarchy) must approve it, so it's rare there's ever a false deletion.


I think you're being downvoted because you've claimed "That's a case of not solving the problem," but that actually better describes this answer. It's clever, certainly, but it misses the fact that the stack of screens was never intended to be recursively escaped; changing the form it took was the real fix, rather than rubbing some compression sauce on what was never meant to be lots of backslashes in the first place. And indeed, that's what the author did: they shipped a bandaid fix while working on a more comprehensive one, which didn't require RLE or a quadtree (!).
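
For reference, run-length encoding here just means collapsing a run of repeated characters (e.g. thousands of backslashes) into (char, count) pairs; a minimal sketch:

    from itertools import groupby

    def rle_encode(s):
        # one (char, count) pair per run of identical characters
        return [(ch, len(list(run))) for ch, run in groupby(s)]

    def rle_decode(pairs):
        return "".join(ch * n for ch, n in pairs)

    escaped = "\\" * 4096 + "n"
    encoded = rle_encode(escaped)       # [('\\', 4096), ('n', 1)]
    assert rle_decode(encoded) == escaped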


You're describing the desire to avoid protocol compression when transmitting data that grows without bound. That has nothing to do with fixing the backslashes. The problem isn't backslashes. Those are just a symptom of data that can have any length.

