
The most impressive/unbelievable part of this story is that two Word files, created separately, ended up with the exact same file size. I'd assume that, with all of the metadata they cram in, this would be unlikely.


The simplest answer is that this didn’t happen and it’s just a myth.


I agree; this story is apocryphal. I was annoyed enough that I did some research into the story. Via this Atlantic article[1], I found the original Valleywag post[2] containing the letter from the employee. It's worth noting that the letter from the employee is pure text (i.e. not a scan of a letter), is only 170 words long, and has a couple of misspellings. The Valleywag post provides a paraphrasing of what the employee told them, followed by what seems to be the entire body of the email that they received.

The Musk detective work is described by Vance in a footnote:

“Musk would later discover the identity of this employee in an ingenious way. He copied the text of the letter into a Word document, checked the size of the file, sent it to a printer, and looked over the logs of printer activity to find one of the same size. He could then trace that back to the person who had printed the original file. The employee wrote a letter of apology and resigned.”

This leaves me with more than a few questions:

1. Why would this email/letter have been printed out? It's short, informal, and did not seem to contain any corporate materials that would require access from a work computer. Surely, this would be better sent as an email from a personal computer? All I can think of is that the employee did print some kind of private company information (perhaps as proof?) to send to Valleywag. But wouldn't this mean that the print-job sizes wouldn't match since there would be printed materials not made public? It seems beyond insane to physically mail a tip to a gossip site that was built around emailed tips.

2. If this was sent as a physical letter, why would the quote in the article contain the typos? Why would they even take the time to type up the entire letter when the article summarizes every single point that the Tesla employee mentioned? Shouldn't they have taken some care to not verbatim reproduce text that could have gotten their source into trouble? I will say that the minimal journalistic standards employed by former Gawker-network sites provide convenient explanations to these questions, so these aren't particularly damning.

3. Is this really all the text that was sent to Valleywag? The quoted part of the letter provides no salutation or signature. Sending this text exactly as quoted as a physical letter seems bizarre, even for an anonymous tip.

4. Would the sizes of the files sent to the printer even match up considering the document metadata? This actually seems somewhat plausible.

5. Would the print-job really be the best way to figure out who the leaker was? In October 2008, before the letter was written, Tesla only had 363 employees[3] and may have laid off a few dozen of them before this letter was written. This employee claims to have joined in 2004. A Wired article from 2006[4] mentions a meeting of 30 Tesla employees and board members in December 2004. It seems like there are a very small number of people who could potentially have been the leaker. How many of those people were using the corporate printers the day after the mentioned all-hands meeting to print a single-page document?

--------

I imagine that this story was told to present Musk as smarter than everyone else while also threatening disloyalty, which seems to be a frequent Musk bugbear. The bit about the caught employee writing a letter of apology and then resigning (amidst large-scale layoffs at Tesla in 2008!) also seems a little too cute, in a chain-email-atheist-professor-humiliated-by-freshman-Albert-Einstein kind of way.

--------

[1] https://www.theatlantic.com/technology/archive/2018/06/elon-...

[2] https://www.gawker.com/5071621/tesla-motors-has-9-million-in...

[3] https://www.latimes.com/archives/la-xpm-2008-oct-25-fi-tesla...

[4] https://www.wired.com/2006/08/tesla-3/


The wired.com link doesn't work for me ("Sorry, something has gone wrong"). Archived copy:

https://web.archive.org/web/20221116003941/https://www.wired...


You could use bounds from x KB to y KB if you know the version of Word used, and try to get close to the original size that way.

It is a difficult task, anyways.


The size of the saved Word file (depending on exactly how it was saved and what version) will vary depending on how it was edited, and yeah, what metadata is there, so this seems...wrong.
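
A minimal sketch of the metadata point, assuming the .docx format (a ZIP package with parts such as `word/document.xml` and `docProps/core.xml`). The helper `package_size`, the simplified XML, and the author names are made up for illustration, and the containers built here are not valid Word documents. It only shows that identical body text can yield different file sizes once the metadata differs:

```python
import io
import zipfile


def package_size(body_text: str, author: str) -> int:
    """Build a toy ZIP container in memory and return its size in bytes.

    Hypothetical helper: the part names mirror those in a .docx package,
    but the XML is simplified and would not open in Word.
    """
    buffer = io.BytesIO()
    # ZIP_STORED (no compression) keeps the size arithmetic easy to follow.
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_STORED) as archive:
        archive.writestr("word/document.xml", f"<body>{body_text}</body>")
        archive.writestr("docProps/core.xml", f"<creator>{author}</creator>")
    return len(buffer.getvalue())


letter = "Same letter text in both files."
author_a = "A. Employee"
author_b = "Somebody Else With A Longer Name"

size_a = package_size(letter, author_a)
size_b = package_size(letter, author_b)

# The body text is identical, so any size difference comes from the metadata.
print(size_a, size_b, size_b - size_a)
```

With real Word files, compression, revision history, and embedded settings would make the difference less predictable than in this toy case.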


Honestly speaking, the story sounds somewhat believable to me.

They likely misidentified the leaker with that silly analysis and fired them on the spot, which would be just the usual modus operandi of Musk.


they probably narrowed it down to one person and approached them with the logs and "admit and be fired or don't admit and legal (and all the bad/expensive stuff that comes with that) will be engaged to get to the bottom of it starting with you".


You can also easily narrow it down to a time period. Find out when the leak occurred and then look at prints shortly before that.


How likely is it that there would just happen to be a file with exactly the same wrong size in the printer logs?


I’d say if the letter was short it’s pretty likely.


Maybe there was a few % error but no other candidate.


Which printers in that timeframe ever directly printed word files?


It would be the resulting spool file that gets measured, not the source Doc file.


[flagged]


Two different leaks. One was a printed word file. One was a double-space-encoded email.


He searched the printer logs. Presumably when Word sends stuff to the printer, it doesn't bother with all the extraneous garbage it puts in files. Printer communication hasn't changed much in decades.


Still. He’d have to have the same normal.dot, make the same space/tab choices, same fonts, etc.

And in a large company, how many ~1 page letters get printed? What are the odds that there is only a single match in the entire company?

I take this story similar to the binary coded space story: more likely to be apocryphal and promoted to deter future leakers than true stories about catching them in the past.


I would imagine any two computers at Tesla have a very good chance of having identical normal.dot, and font can be determined by looking at the letter. For a letter, who is using tabs for anything but the start of paragraphs? Even if it's not perfect, it's unlikely to change the size of a word document (which is likely <25 kB to begin with) by a full kilobyte.

Basically all it would tell you is how many Word documents with approximately the same amount of text were printed, plus or minus about a paragraph. Letters aren't frequently printed and the contents of the leak would almost certainly limit the number of suspects to a small handful of people. It's not hard at all to believe that in a group of ~50 people and a time window of ~1 week you might only have one even close match.
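
A back-of-the-envelope sketch of that argument, under assumptions that are not from the story: `n_jobs` unrelated print jobs fall inside the time window, and each one independently has probability `p_match` of landing in the same size band as the letter. Both the function name and the sample values are hypothetical.

```python
def coincidence_stats(n_jobs: int, p_match: float) -> tuple[float, float]:
    """Return (expected coincidental matches, probability of none at all).

    Assumes each unrelated job matches the size band independently
    with the same probability p_match.
    """
    expected = n_jobs * p_match
    p_none = (1.0 - p_match) ** n_jobs
    return expected, p_none


# Illustrative values only: 1% of unrelated jobs happen to fall in the band.
for n_jobs in (50, 500, 5000):
    expected, p_none = coincidence_stats(n_jobs, 0.01)
    print(f"{n_jobs:>5} jobs: expected coincidences {expected:.1f}, "
          f"chance of none {p_none:.3f}")
```

The takeaway is only qualitative: with a few dozen jobs in the window, a unique match is unsurprising, while with thousands of jobs a size band alone would leave many candidates.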


Perhaps it wasn't an exact byte match, but close enough to only 1 print job size within a reasonable time frame.


Exactly this. If you know that it's around 36 KB, and that it had to be printed between two dates, the list narrows substantially. And more than likely, the printer queue or corporate compliance tools had additional functionality that made such a search trivial.

I have my own story about this: In the early 2010s I had a boss who loved to call us and check in every day. We were a remote team and he had anxiety that we were all larking off. I later learned his technique was to open the Dropbox admin tools to see the location of someone before calling them. "So, BarelySapient, where are you working at today?", he'd ask in a cheerful tone. But in reality, he was testing employee truthfulness. Every call. He later fired one of my co-workers when they reported to be working from home, but Dropbox reported them somewhere in the Florida Keys....


... and here I'm just happy about German employee and EU data protection laws. This kind of abuse would be plainly illegal here and entitle the employee to significant compensation.


What part would be illegal?


Certainly possible, but also relies on the assumption that this was done on company equipment. When leaking info under a vindictive boss I would think opsec rule #1 is avoid company equipment as much as possible. Even without a printer at home, it's quite easy to send a print job to a local office or shipping store.


Data exfiltration is an extremely dangerous risk in the world of corporate espionage, especially for a company like Tesla that is trying to be first to market with something as massive as FSD.

There is no way I would send anything to an external print shop from company equipment; they are almost certainly on top of that as well.


Sheesh, Tesla FSD would be even worse than the "secret Coca-Cola formula", no one would even want to be in the same computer with it.

Now, production cost figures and schedules ...


This has gotten me genuinely curious, I wonder what the safest way to get a document like that onto your own device is. Printing it on a work printer doesn't seem ideal, but I don't really know what the best approach is. Maybe emailing it to an outside address or sharing it as a document via Dropbox or similar? Copying to physical storage? All of those seem fairly easy to monitor as well though.

If any infosec experts feel like chiming in I'd love to learn more.


Anything done on a corporate machine needs to be assumed monitored. The paranoid approach would be to fully power off the machine, take the hard drive out, then use an independent machine to mount and read the data off that. Now, if they suspect that approach, or they suspect you personally, there will likely be evidence of your hardware tampering. But it would thwart automated mass-surveillance solutions.


Hard drive encryption makes this difficult.


Take photo of screen with phone camera, carry it home, OCR. If you're really paranoid you don't want to send the document's bytes off your system or use any unusual program (e.g. steganography) to edit them.


> Maybe emailing it to an outside address

Easily detected.

> sharing it as a document via Dropbox or similar

If you use a TLS-inspecting proxy/VPN, this will be detected. Otherwise, it depends on how much monitoring is going on, but at best they could suspect it.

> Copying to physical storage?

At my work, USB drives are disabled by MDM.

You could transfer the files over SSH. Even if you have an MitM SSH-inspecting VPN, once the SSH channel is established, you could tunnel a second SSH connection through the established insecure SSH session.

Even then, with enough logging, you could detect that all local files were accessed sequentially which would raise a red flag.

There's nothing you can do to prevent insider espionage that wouldn't raise false positives and block legitimate work, but you could at least detect it.


It will always be a game of cat and mouse. Your protections for leaking are limited to legal whistleblower protections against retaliation, so odds are anything suggested here will potentially be traceable or suspicious, which may invite further scrutiny.


Ironically, we had an employee appear to exfiltrate data through a print shop. They almost got away with it unnoticed.

The USB stick they used to make the transfer contracted ransomware from the public terminal, which put everyone on high alert when it was next introduced into the corporate network.


Or take a picture of the letter, or retype the letter, using the leaker’s own smartphone.


Could be the case that there were only a handful of things printed at the whole company in a week, considering how obsolete paper has been getting


It's probably not the only bit of evidence. "Hmm, it's about the same size, it was printed at about the right time and... yeah it sounds like *that* guy, let's see what else he printed - oh, nothing? Cool."


> I take this story similar to the binary coded space story: more likely to be apocryphal and promoted to deter future leakers than true stories about catching them in the past.

Or have a printer at home on stand-by for your leak press releases. If I were Musk, I would have fired him/her not for being a snitch, but for his/her sloppiness and overall carelessness and lack of discipline.


Is that so unlikely? They would be on the same network, same configuration.


You could filter on time and size and then get a subset of likely candidates and whittle it down to the expected people.


It is also possible he's just a lunatic who fired an innocent person.


Or lying. When you are rich or famous or in vogue socially, people let you get away with these stories, e.g. consider Feynman's "I'm modest but look at my genius"-isms.


Yeah exactly. My thought was he probably found the leaker in a much more mundane way, but being undeniably great at PR he decided to spice up the story with this little bit of detective work. It sounds so clever you want to believe it, even if it's pretty impractical (not impossible though!). You know what they say, don't let the truth get in the way of a good story.


Looking at twitter these days I sure have a feeling this is not super unlikely.


More important is spreading the perception he was able to sleuth down who the leaker was, to deter future leakers.


I would assume there was also at least a minor degree of due diligence, such as "Would this person actually have had the knowledge to write the letter".


It'd be even easier than that. Walk to the person's desk and ask them to bring up the document that they printed on ${printer} at ${time}. If he can't produce a document of the correct size, he's your guy.


This is a good way to fire innocent employees who had the audacity to edit a document after they’ve printed it.


What are the odds that he calculated a certain number of bytes, and exactly one other employee had sent a file to the printer that contained that number, but it was just a coincidence? If the method indeed doesn't work (as you suggest "is possible") it's impossible it would turn up a false positive, only a false negative.


It's also possible that he calculated the number of bytes, searched the logs, didn't find anything and then tweaked his numbers until something matched. And then fired that random person to make a statement and lied about how he found out who to fire.


This approach would work in isolation, but failing to find the true leaker would leave them empowered to leak more over time. Is there evidence that Tesla has a lot of regular leaks?


I didn't mean to claim other than that anything is possible. I was just saying the original theory put forward wasn't internally logical.


almost as good as the odds that he got something pretty close, then changed the details a bit to fit the narrative.

or, he stayed up for two days until he got it exactly to match by tweaking his approach a little at a time, and actually matched it to the right person. Honestly, he seems like he'd do this.


Sounds like a fun project any hacker would get distracted by.


You wouldn't do that though, would you? You'd type the thing up, print it, see that the printer received a 4.7 KB (or whatever) doc and look for a 4-6 KB doc the week of the leak, and then you have 100 documents to go through, rather than 100000.
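
The filtering step described above can be sketched roughly as follows. Everything here is hypothetical: the `PrintJob` fields, the `candidates` helper, the user names, and the log entries are made up, and real print logs vary by vendor. The size band (4 to 6 KB) and the one-week window follow the numbers in the comment.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class PrintJob:
    # Hypothetical log record; real printer logs have vendor-specific fields.
    user: str
    submitted: datetime
    size_bytes: int


def candidates(jobs, leak_time, window, min_bytes, max_bytes):
    """Return jobs submitted within `window` before `leak_time` whose
    size falls inside [min_bytes, max_bytes]."""
    start = leak_time - window
    return [
        job for job in jobs
        if start <= job.submitted <= leak_time
        and min_bytes <= job.size_bytes <= max_bytes
    ]


# Made-up log entries and dates, purely for illustration.
jobs = [
    PrintJob("alice", datetime(2020, 1, 14, 9, 30), 4_812),
    PrintJob("bob", datetime(2020, 1, 15, 16, 5), 250_000),
    PrintJob("carol", datetime(2019, 12, 2, 11, 0), 4_700),
    PrintJob("dave", datetime(2020, 1, 16, 8, 45), 12_300),
]

leak_time = datetime(2020, 1, 17, 12, 0)
shortlist = candidates(jobs, leak_time, timedelta(days=7), 4_000, 6_000)

for job in shortlist:
    print(job.user, job.submitted, job.size_bytes)
```

A shortlist like this is only a starting point for manual review, not proof of who printed the letter.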


If he had access to printer logs, he would know exactly what everyone printed and when they did it.

Networked printers store a lot of information about what is being printed.


In general a lot of metadata is kept, but very rarely is the full print content kept (though in some organizations it may be for security review). The difficulty here is figuring out which metadata is relevant; in this case, size was the most important.


The OP talks about the size in kilobytes, not bytes. That will almost certainly give false positives, but if you limit the time range, it may not be that bad.

Not a bad idea for initial filtering, I think, but I doubt it would hold up in court on its own.


Agreed - the goal isn't "the KBs prove you are guilty". It's metadata which serves as a means to reduce the number of computers you need to examine with a more complex/manual search (deleted files, temporary/corrupted files, Word document history, interviewing the employee, comparing the material in the document with the role/responsibilities of the employee, etc).


I think you're forgetting that he could just re-print the file once he had located which printer it was on to verify. Once verified it's the file in question, you could easily check which device sent it to the printer.


This was my first thought as well. Is the file size really that precisely deterministic from a mammoth program like Word given different dates, machines, versions, etc?

Given that Musk is not the most reliable narrator, to say the least, I'm skeptical of this story. I'm sure he sees plenty of utility in appearing omniscient to frighten potential leakers.


I can imagine that a company actually saves everything being printed for security / auditing reasons, along with who issued the print request, etc.


I think such a company would equally want to not retain such records (at least not indefinitely), for fear of discovery in legal processes, etc.


Or most likely laziness.


Or a fortuitous combination of the two...


Doing that suggests they think that the threat from insiders >>> outsiders.


It always is. Insiders have more access. They are trusted. That makes them a much greater threat if they decide to do harm.


I think that ... certain nations ... are quite good at playing "the long game," and often have their plants hired by companies; sometimes at fairly senior/sensitive levels. They may not even start phoning home, until they have had a few promotions.


Well when the concern is leaked internal memos, that is a very reasonable assumption.


OP says he compared them in terms of KBs, if this is taken literally then he didn't check to see if they were exactly the same length. Assuming that the two word documents are actually identical except for metadata, I wouldn't expect the size discrepancy to be on the order of kilobytes.

Still a reckless way to identify the leaker, there's plenty of reasonable doubt if all they have to go on is the size of the document that was printed.


Same here, maybe Tesla has a homogeneous Word deployment?

I’m sure there are differences in what is produced between various versions.

Unless you can ask Word for the equivalent of a PostScript file?


I believe the printer logs would contain the size of a Postscript-like equivalent, not the docx itself. I wonder if the story is missing the detail of submitting the new Word file to a printer and checking the size. Or he fired someone random where Word doc size == printer input size.


From what I've heard of their engineering practices, it would be shocking if any deployment they have is homogeneous


had the exact same thought. if anyone works in-office and has the time/capacity (don't get fired) to do this with some colleagues, it'd be a fun and interesting experiment!


I'd assume a billionaire heir to a diamond mine with an econ BA parsing various different models of printer logs for fifteen offices late into the evening would have discredited the quote entirely, but the cult of personality implores me to insist even Stalin himself did something similar in both Haskell and emacs.


Parallel construction.



