That engine evaluation is going to be absurdly expensive, though, when you're compressing and decompressing billions of games.
I think everyone's optimizing the wrong thing! The size of a useful, real-world chess database would be dominated by its search indexes—indexes that allow fast, random-access lookup of every position that occurs in every game. (Or even better: fuzzy searching for nearby, "similar" positions). This is the difficult, open-ended algorithmic problem. Disk space is cheap!
(Out of curiosity, does anyone here actually know how to solve the "fuzzy search of chess positions" problem?)
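One speculative angle, sketched below in Python with nothing but hand-rolled FEN parsing: treat each position as a set of (piece, square) features and rank stored positions by feature overlap (Jaccard similarity). Real fuzzy search would need smarter features and an inverted index, but this shows the shape of the problem.

```python
# A speculative sketch: index positions by (piece, square) features and rank
# candidates by Jaccard similarity. "Similarity" here is just feature overlap,
# which is one crude notion of a "nearby" position among many.

def position_features(fen: str) -> frozenset:
    """Turn the board part of a FEN string into a set of (piece, square) features."""
    board = fen.split()[0]
    features = set()
    rank, file = 7, 0
    for ch in board:
        if ch == "/":
            rank -= 1
            file = 0
        elif ch.isdigit():
            file += int(ch)
        else:
            features.add((ch, rank * 8 + file))
            file += 1
    return frozenset(features)

def similarity(a: frozenset, b: frozenset) -> float:
    """Jaccard similarity between two feature sets."""
    return len(a & b) / len(a | b)

def nearest(query_fen: str, indexed_fens: list[str], k: int = 5):
    """Return the k indexed positions most similar to the query."""
    q = position_features(query_fen)
    return sorted(indexed_fens,
                  key=lambda f: similarity(q, position_features(f)),
                  reverse=True)[:k]

# Example: the starting position and the position after 1.e4 differ by two features.
start = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1"
e4    = "rnbqkbnr/pppppppp/8/8/4P3/8/PPPP1PPP/RNBQKBNR b KQkq e6 0 1"
print(similarity(position_features(start), position_features(e4)))  # ~0.94
```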
Brazil is one of my favorite films of all time, which is unsurprising given how much I enjoy Terry Gilliam's other work as well. The first time I saw it I was in high school and my drummer lent me the DVD. It was aesthetically compelling and fantastical in a way that films simply don't seem to be anymore. When I revisited it as an adult the heavier thematic elements resonated with me much more, i.e. the havoc that nightmarish bureaucracies and technology can wreak on our lives and interpersonal relationships.
In the age of surveillance capitalism Brazil is more essential than ever; if you haven't seen it then you owe it to yourself to watch it.
I don't want to assume too much, since the details are sparse. But I know for a fact that few of my current coworkers know a thing about writing tooling code. It's becoming a bit of a lost art.
Here's how such a script should be done. You have a dry-run flag. Or, better yet, make the script dry-run only: it checks the database, gathers the actions to take, and writes those actions to stdout as executable commands. You dump this to a file. The commands can be SQL, or calls to additional shell scripts (e.g. "delete-recoverable <customer-id>" vs. "delete-permanent <customer-id>").
The idea is that you now have something to verify. You can scan it for errors. You can even put it up on GitHub for review by stakeholders. You double- and triple-check the output, and then you execute it.
Tooling that enhances visibility by breaking down changes into verifiable commands is incredibly powerful. Making these tools idempotent is also an art form, and important.
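A minimal sketch of the pattern in Python, assuming a made-up customers table and the hypothetical delete-recoverable / delete-permanent commands from above:

```python
#!/usr/bin/env python3
# A minimal sketch of the "emit commands, don't run them" pattern.
# The table/column names and the delete-recoverable / delete-permanent
# commands are placeholders for whatever your real cleanup looks like.
import sqlite3
import sys

def gather_actions(db_path: str):
    """Check the database and yield one executable command per action."""
    conn = sqlite3.connect(db_path)
    cur = conn.execute(
        "SELECT id, last_active FROM customers "
        "WHERE last_active < date('now', '-2 years')"
    )
    for customer_id, last_active in cur:
        # Emit the reversible command by default; a reviewer can edit the
        # file to upgrade specific lines to delete-permanent after review.
        yield f"delete-recoverable {customer_id}  # last active {last_active}"

if __name__ == "__main__":
    for command in gather_actions(sys.argv[1]):
        print(command)   # dump to a file, review, then execute
```

Re-running the generator is harmless because it never mutates anything itself; the emitted commands are where idempotence has to be designed in.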
I wish I had a recommendation based on experience for one of these really strange operating systems like EUMEL, Guardian, OS/400, and L3. But I don't. I've used CP/M and MS-DOS, but those are just really limited, not really interesting. Although you could make them reasonably usable with ZCPR and 4DOS, it was like coming out of Plato's cave when I switched my primary operating environment from 4DOS to csh on Ultrix.
Squeak is a pretty different operating environment that isn't simply primitive. Oberon is another. They can both run as user processes on top of Linux, as well as on bare metal. Both of them are somewhat alien.
Are you comfortable with embedded development? If not, try Arduino. It starts out easy, since you program the boards in C++, but you have the opportunity to build things that will run for months on a AA battery with submicrosecond interrupt response time — because there's no OS. (It's routine for even programming novices to write their own interrupt handlers.) Arduino instantly gives you the ability to measure things on microsecond timescales, a thousand times faster than you can normally see. Modern boards like the Blue Pill have response latencies in the 100-nanosecond range when they're awake. That's the time it takes light to go 30 meters, as you're probably aware.
In retrocomputing land, VMS was the first OS I used that was really usable. The OpenVMS Hobbyist Program still exists, and it's actually possible to run old versions of Mozilla on it. F-83 was an interactive Forth IDE that provided higher-order programming, virtual memory, and multithreading under MS-DOS, in 1983 — without syntax or types. Turbo Pascal was also an IDE, in a way the first modern IDE, around the same time; the first versions ran on CP/M and MS-DOS. But I think that you kind of had to be grappling with the limitations of BASIC on those systems to appreciate that.
There are Pick systems that still have enthusiastic users: https://www.pickwiki.com/index.php/Pick_Operating_System but they don't sound appealing to me. Other systems with cult fanbases include FileMaker, HyperCard, and Lotus Agenda, which last I think you can run successfully under FreeDOS. Agenda is interesting in part because it's so alien. (It's easy to forget that it was normal at the time to have to use the program manual to figure out how to exit.)
There are a bunch of modern specialized development environments that can do strange things. Radare2 is an environment focused on reverse engineering. Emacs is focused on text editing, but for some reason it's also the main user interface for interactive proof assistants like Coq and Lean, which are shaping up to be pretty interesting. R is focused on statistics. Jupyter is sort of focused on data visualization, although not really. (Now I see you've been doing deep learning for 10 years, so I guess Jupyter is your best friend.) LibreOffice Calc is focused on rectangular arrays of mostly numerical data (although in many cases their most advanced users use Excel instead). You can develop applications in all of them.
How about math? It's one thing to invoke a Runge-Kutta integration method; it's another to be able to prove convergence bounds on it. And machine-checked formal proof is shaping up to be an interesting thing, like I said.
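To make the contrast concrete, here's a classical RK4 step in a few lines of Python; invoking it is trivial, while proving its O(h^4) global error bound is a genuinely different kind of work:

```python
import math

# Classical fourth-order Runge-Kutta: easy to invoke, much harder to
# prove the O(h^4) global error bound for.
def rk4_step(f, t, y, h):
    k1 = f(t, y)
    k2 = f(t + h / 2, y + h / 2 * k1)
    k3 = f(t + h / 2, y + h / 2 * k2)
    k4 = f(t + h, y + h * k3)
    return y + (h / 6) * (k1 + 2 * k2 + 2 * k3 + k4)

# Example: y' = y, y(0) = 1; integrate to t = 1 and compare against e.
y, t, h = 1.0, 0.0, 0.01
for _ in range(100):
    y = rk4_step(lambda t, y: y, t, y, h)
    t += h
print(y, math.exp(1.0))  # agrees to roughly 10 decimal places
```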
How about cryptography? That has the advantage that there are right answers and wrong answers, so you can test your code.
How about shaders? Shadertoy is accessible and super fun. Maybe that's too similar to HPC, but the shader parallelism model (similar to ispc) is pretty different from both AVX and MPI.
How about mobile development? SIGCHI papers are full of experimental user interface ideas to explore, and Android Studio is free and relatively usable, if clumsy. Have you seen Onyx Ashanti's Beatjazz?
In the neighborhood of beatjazz, there's livecoding. It's a thrill to get a nightclub full of people dancing to your code, and there are a bunch of different environments.
GNU Radio with an RTL-SDR makes it possible for you to run DSP algorithms on RF signals over a pretty wide frequency range, with applications in communications and sensing. Maybe if you've been doing HPC, DSP is already second nature, but if not it might be rewarding. And DSP has close connections to control theory and image processing, as well as the more obvious applications.
How about alternative programming paradigms? If you're comfortable in procedural and OO programming, how about extreme alternatives — relational programming like miniKanren, answer-set programming, constraint-logic programming (as supported by modern Prologs https://www.metalevel.at/prolog/clpz not just Mozart/Oz), Erlang-style fault-tolerance-focused programming, APL-style array programming (though maybe you're familiar enough with that to take it for granted), or Forth? How about strongly typed programming like Haskell, Rust, or OCaml? (And of course Haskell is purely functional, and OCaml is mostly so.)
And SMT solvers like Z3 can easily solve problems now that were infeasible only a few years ago.
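For a taste of what that looks like in practice, here's a tiny example using Z3's Python bindings (assuming the z3-solver package):

```python
# A small taste of SMT solving via Z3's Python bindings (pip install z3-solver).
# Find positive integers on a circle of radius 5 with x < y.
from z3 import Ints, Solver, sat

x, y = Ints("x y")
s = Solver()
s.add(x > 0, y > 0, x * x + y * y == 25, x < y)
if s.check() == sat:
    print(s.model())   # e.g. [x = 3, y = 4]
```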
Also, wasm.
Or maybe try hacking together some games in Godot.
I don't know, myself I find that it's hard to avoid getting out of my comfort zone in some direction, just because the world is so big and my knowledge is so small. Deep learning is the out-of-my-comfort-zone programming thing I want to try next!
Just looking at data and descriptive statistics is one of the first things a person is taught in machine learning, data science and statistics coursework. It’s a major skill in the field that is emphasized all the time.
Practitioners frequently do cursory data analysis and data exploration to gain insight into the data, corner cases and which modeling approaches are plausible.
Just to give some examples, Bayesian Data Analysis (Gelman et al), Data Analysis Using Regression and Multilevel/Hierarchical Models (Gelman and Hill), Doing Bayesian Data Analysis (Kruschke), Deep Learning (Goodfellow, Bengio, Courville), Pattern Recognition and Machine Learning (Bishop) and the excellent practical blog post [0] by Karpathy all list graphical data checking, graphical goodness of fit investigation, descriptive statistics and basic data exploration as critical parts of the model building workflow.
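The kind of first-pass look those references recommend can be as plain as this pandas/matplotlib sketch (the file name and column names are made up):

```python
# A first-pass look at the data before any modeling: the kind of
# descriptive-statistics and graphical check the references above recommend.
# The file name and column names here are made up.
import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv("observations.csv")

print(df.describe())                   # ranges, means, obvious outliers
print(df.isna().mean().sort_values())  # fraction missing per column
print(df["label"].value_counts())      # class balance

df.hist(bins=50, figsize=(10, 8))      # marginal distributions
pd.plotting.scatter_matrix(df.sample(min(len(df), 2000)), figsize=(10, 8))
plt.show()
```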
If you are seeing people produce models without this, it's likely because companies try to have engineers do this work, or hire from bootcamps or other sources that don't produce professional statisticians with grounding in a proper professional approach to these problems.
When people mistakenly think models are commodities you can copy paste from some tutorials, and don’t require real professional specialization, then yes, you get this kind of over-engineered outcome with tons of “modeling” that’s disconnected from the actual data or stakeholder problem at hand.
The data, conclusions, and implications of this are huge, and go several ways.
First: virtually nobody using information technology has any idea of what it's really doing, or how to do anything beyond a very narrow bound of tasks. This includes the apparently proficient, and it's almost always amusing to discover the bounds and limits in knowledge and use of computers by the highly capable. Children, often described as "digital natives", are better described as "digitally fearless": they're unaware of any possible consequences, and tend to plunge in where adults are more reticent. Actual capability is generally very superficial (with notable exceptions, of course).
Second: if you're building for mass market, you've got to keep things exceedingly and painfully simple. Though this can be frustrating (keep reading), there's absolutely a place for this, and for systems that are used by millions to billions (think: lifts, fuel pumps, information kiosks), keeping controls, options, and presentation to the absolute minimum and clearest possible matters.
Third: Looking into the psychological foundations of intellectual capabilities and capacity is a fascinating (and frequently fraught) domain. Direct experience with blood relatives suggests any possible genetic contribution is dwarfed by experiential and environmental factors. Jean Piaget's work, and subsequent, makes for hugely instructive reading.
Fourth: If you are building for general use keep your UI and conceptual architecture as stable as possible. There simply is NOT a big win at UI innovation, a lesson Mozilla's jwz noted years ago. (Safe archive link: https://web.archive.org/web/20120511115213/https://www.jwz.o...) Apple's Mac has seen two variants of its UI in over 35 years, and the current OSX / MacOS variant is now older than the classic Mac UI was when OSX was introduced. Food for thought and humble pie for GUI tweakers.
Fifth: if you're building for domain experts, or are an expert user forced to contend with consumer-grade / general-market tools, you're going to get hit by this. The expert market is tiny. It's also subject to its own realms of bullshit (look into the audiophile market, as an example). This is much of the impetus behind my "Tyranny of the Minimum Viable User", based in part on the OECD study and citing the Nielsen Norman Group's article:
(I've been engaged in a decades-long love-hate, and increasingly the latter, battle with information technology.)
Absent some certification or requirements floor (think commercial and general aviation as examples), technical products are displaced, and general-market wants will swamp technical users' needs and interests.
Agreed that understanding lags behind when money and cheaper computation (god I hate that term "compute" used as a noun, learn to English my dudes) drives the SOTA rather than actual understanding (on both researchers' and computers' part).
There's a reason we haven't seen a hard takeoff with BERT writing BERT+1 in even better Python, and it probably isn't because nobody is trying; it's because BERT is too stupid to write a computer program.
> Am I misrepresenting anything in your position here?
Yes.
1) I never said "no value". The claims have value. Call it 2 bits per claim, with diminishing returns after the first few.
2) The Intel engineer has considerably more context, and more on the line, so their claim has considerably more weight. Call it... 8 bits. If they say it should work, then it should work; and if other people say it doesn't work, that doesn't mean someone is wrong. It might just mean that it should work, but that for someone, for whatever other reasons, it isn't working.
3) If the Intel engineer tells you your opinion is wrong or out of date, you should probably give that considerably more weight than if some random person on the internet without that context says it.
The engineer working in the field can be expected to know a lot more about the intended design, the actual build, and the real-world issues arising in practice than any individual user would know about any of those things. This is generally why experts don't like to engage with non-experts; the non-experts' priors on who to trust are likely to be all out of whack.
There are two interesting parts to an S1 (or a 10K):
* a qualitative description of the business, as the management sees it. I find this super interesting and it's usually in the middle of the document - look for "management's discussion and analysis ..", or go here: https://www.sec.gov/Archives/edgar/data/1561550/000119312519...
* the quantitative part. There are three key but interrelated concepts to learn: the income statement, the balance sheet, and the cash flow statement. An intro course to accounting would be great; I can recommend this one: https://www.wallstreetprep.com/self-study-programs/accountin...
>"If it was a high-return fruit somebody would be doing it." Not necessarily.
With high likelihood, given the current funding of ML with $$$ applications, but yes, not necessarily.
> AI benchmarcks are what direct progress in AI
Sadly this is largely true.
The AI benchmarks + culture around it are the bullshit.
What actually moves forward the field of AI is:
- accessible
- reproducible
- comprehensible
results done with some thought and reasoning which is explained well, published well, and justified by more than some #$!& "our F1 score went up by 2 therefore our approach makes sense" bullshit.
AI benchmarks have done as much to retard progress in AI as they have to promote it.
Current AI benchmark top scores are gamification for big companies to waste even more resources running algorithms they can't explain. They are not machine learning, they are machine pissing contests.
"Cornell University is studying human susceptibility to digital disinformation generated by language models."
"The Middlebury Institute of International Studies Center on Terrorism, Extremism, and Counterterrorism (CTEC) is exploring how GPT-2 could be misused by terrorists and extremists online."
"The University of Oregon is developing a series of “bias probes” to analyze bias within GPT-2."
But apparently no university studies the social and economic impact of using terabytes of public data to train algorithms that, for all practical purposes, end up being inaccessible to an average person.
If things go on the way they're going right now, in 20 years millions of people will be "mechanical turked". Most information-processing tools will be mediated exclusively through companies like Google and Amazon. They will be less like normal tools (e.g. word processors) and more like systems you have to be a part of. Can you imagine the levels of inequality involved? The hyper-centralization of power? This is the foremost challenge presented by AI, not some hypothetical nonsense involving terrorists using a text generator.
And it's not like there aren't any solutions. Douglas Engelbart, for example, pointed out a great way of introducing technology into society without screwing most of the society over:
People who don't sell out to the corporate agenda... watching HN'ers downvote me for pointing out that the videogame industry has been stealing PC games for the last 20 years was alarming. We went from owning our Diablo, Warcraft, and StarCraft games to not owning them... that is a major change, and to see a place that calls itself the home of "hackers" and nerds, who theoretically should all be about fighting to preserve culture against the corporate onslaught on our basic right to own our own software and not have it tied to "the cloud", react this way is disturbing in its own right.
Seems everyone wants to be a slave to the mainframe and have no privacy and no general computing.
Not getting that the corporations of the world are hell-bent on turning the PC into a dumb client, and that everyone's bending over, is disturbing.
The stuff Microsoft has in the pipe with UWP and encrypted computing is alarming on its own. The "honest files" of the past, not trapped in some VM or some remotely controlled new Microsoft file system with license servers, are giving way to this new Software as a Service (aka stealing your software and selling it back to you at inflated prices); it is madness itself.
There is no reason for any piece of software, whether that be an OS, an office application, or a game, to be divided between our computers and the companies'.
For those not aware of the background, the author is a wizard from a secretive underground society of wizards known as the Familia Toledo; he and his family (it is a family) have been designing and building their own computers (and ancillary equipment like reflow ovens) and writing their own operating systems and web browsers for some 40 years now. Unfortunately, they live on the outskirts of Mexico City, not Sunnyvale or Boston, so the public accounts of their achievements have been mostly written by vulgar journalists without even rudimentary knowledge of programming or electronics.
And they have maintained their achievements mostly private, perhaps because whenever they've talked about their details publicly, the commentary has mostly been of the form "This isn't possible" and "This is obviously a fraud" from the sorts of ignorant people who make a living installing virus scanners and pirate copies of Windows and thus imagine themselves to be computer experts. (All of this happened entirely in Spanish, except I think for a small amount which happened in Zapotec, which I don't speak; the family counts the authorship of a Zapotec dictionary among their public achievements.) In particular, they've never published the source or even binary code of their operating systems and web browsers, as far as I know.
This changed a few years back when Óscar Toledo G., the son of the founder (Óscar Toledo E.), won the IOCCC with his Nanochess program: https://en.wikipedia.org/wiki/International_Obfuscated_C_Cod... and four more times as well. His obvious achievements put to rest — at least for me — the uncertainty about whether they were underground genius hackers or merely running some kind of con job. Clearly Óscar Toledo G. is a hacker of the first rank, and we can take his word about the abilities of the rest of his family, even if they do not want to publish their code for public criticism.
I look forward to grokking BootOS in fullness and learning the brilliant tricks contained within! Getting a full CLI and minimalist filesystem into a 512-byte floppy-disk boot sector is no small achievement.
It's unfortunate that, unlike the IOCCC entries, BootOS is not open source.
This is the dirty secret to keeping life as an SRE unexciting. If you can't roll it back, re-engineer it with the dev team until you can. When there's no alternative, you find one anyway.
(When you really and truly cross-my-heart-and-hope-to-die can't re-engineer around it fully, isolate the non-rollbackable pieces, break them into small pieces, and deploy them in isolation. That way if you're going to break something, you break as little as possible and you know exactly where the problem is.)
Try having a postmortem, even an informal one, for every rollback. If you were confident enough to push to prod but it didn't work, figure out why that happened and what you can do to avoid it next time.
> Item: New states (enums) need to be forward compatible
Our internal Protobuf style guides strongly encourage this. In fact, some of the features of protobuf v2 that most often broke backward compatibility were changed for v3.
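Independent of protobuf's own machinery, the consumer-side discipline looks roughly like this sketch (generic Python with hypothetical states, not protobuf's API): anything you don't recognize degrades to UNKNOWN instead of blowing up.

```python
# A generic sketch of the consumer-side discipline (not protobuf's own API):
# any state value we don't recognize maps to UNKNOWN instead of crashing,
# so an old binary can safely receive states added by a newer producer.
from enum import IntEnum

class OrderState(IntEnum):
    UNKNOWN = 0
    PENDING = 1
    SHIPPED = 2
    # A newer producer might add DELAYED = 3 before this binary is redeployed.

def parse_state(raw: int) -> OrderState:
    try:
        return OrderState(raw)
    except ValueError:
        return OrderState.UNKNOWN   # forward compatible: don't blow up

assert parse_state(2) is OrderState.SHIPPED
assert parse_state(3) is OrderState.UNKNOWN   # future state, handled gracefully
```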
> Item: more than one person should be able to ship a given binary.
Easy to take this one for granted when it's true, but it 100% needs to be true. Also includes:
* ACLs need to be wide enough that multiple people can perform every canonical step.
* Release logic/scripts need to be accessible. That includes "that one" script "that guy" runs during the push that "is kind of a hack, I'll fix it later". Check. It. In. Anyway.
* Release process needs to be understood by multiple people. Doesn't matter if they can perform the release if they don't know how to do it.
> Item: if one of our systems emits something, and another one of our systems can't consume it, that's an error, even if the consumer is running on someone's cell phone far far away.
Easy first step is to monitor 4xx response codes (or some RPC equivalent). I've rolled back releases because of an uptick in 4xxs. Even better is to get feedback from the clients. Having a client->server logging endpoint is one option.
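The check itself can be almost embarrassingly simple; something like this sketch, where the access-log format and the 2% threshold are made up for illustration:

```python
# An almost embarrassingly simple version of the check: compare the 4xx
# rate in recent access logs against a threshold before/after a release.
# The log format and the 2% threshold are made up for illustration.
import sys

def four_xx_rate(log_lines):
    total = client_errors = 0
    for line in log_lines:
        try:
            # common log format: ... "GET /path HTTP/1.1" 200 1234
            status = int(line.split('"')[2].split()[0])
        except (IndexError, ValueError):
            continue
        total += 1
        if 400 <= status < 500:
            client_errors += 1
    return client_errors / total if total else 0.0

if __name__ == "__main__":
    rate = four_xx_rate(sys.stdin)
    print(f"4xx rate: {rate:.2%}")
    sys.exit(1 if rate > 0.02 else 0)   # non-zero exit => consider rolling back
```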
And if a release broke a client, rollback and see the first point. Postmortem should include why it wasn't caught in smoke/integration testing.
I hear about these things through the grapevine in San Francisco. There's a subculture there that's all about optimizing their healthcare with as much data as possible; having seen what's possible, you really come to believe that primary care as it exists today for most people under 55 is basically a scam. I have two personal friends who are probably only alive now because they were more proactive about getting health data! (No joke, abnormalities found on MRIs turned out to be early stage cancer.) And a bunch more friends who are overall much healthier than they were before because now they have numbers to move. (Or numbers that move faster: for example, using Dexcom G6s has been catching on as a hack to lose weight because the gradient is much better behaved than a scale.) I also saw one case where someone was diagnosed with cancer and they put together a research team to help guide their treatment... and they ended up developing all this experimental medicine for him which, while ultimately unsuccessful, is thought to have added about a year of life. (Did you know that the FDA can grant single-patient INDs over the phone within one day? [0])
There's a lot of accumulated knowledge out there on how much better healthcare can be for motivated patients who can afford to pay for some of it out of pocket, but it's seen as pretty contrarian, so not discussed that openly. (Just see the other comments in this thread about how "overtesting" is a dangerous waste of resources...)
I don't know of an online community or blog that collates all this info well, but Peter Attia's blog/podcast is a good place to start: https://peterattiamd.com/
In machine learning, you don't get credit for publishing rigorous papers. You get credit for publishing papers that show improved performance:
One big challenge the community faces is that if you want to get a paper published in machine learning now it's got to have a table in it, with all these different data sets across the top, and all these different methods along the side, and your method has to look like the best one. If it doesn’t look like that, it’s hard to get published. I don't think that's encouraging people to think about radically new ideas.
Now if you send in a paper that has a radically new idea, there's no chance in hell it will get accepted, because it's going to get some junior reviewer who doesn't understand it. Or it’s going to get a senior reviewer who's trying to review too many papers and doesn't understand it first time round and assumes it must be nonsense. Anything that makes the brain hurt is not going to get accepted. And I think that's really bad.
I guess people look at statistical machine learning and deep learning, see all the formulae and hear all the calculus terminology and think - "oh, wow, that's a really rigorous field! Look at all the formalisms!".
But it's not. It's an extremely, almost exclusively, empirical field. The mathiness and the formulae are just unfortunate attempts to pass off the whole endeavour as something that it's not: some kind of careful science that uncovers deep truths about intelligence and cognition. In truth, it's all just about beating other people's systems on very specific benchmarks.
If it wasn't for this culture of pretensions to higher science, machine learning papers would most likely be written with much more clarity than they are now and mistakes like the one described in the above article would be rare.
The Alphatype CRS ("Cathode Ray Setter") was nominally 5333dpi, but it's a bit of a fudge. As the person who wrote the DVI interface for it, and personally ran the entire Art of Computer Programming, Vol. 2, 2nd Edition through it, let me explain.
You are very correct that in the early 1980's, even 64k of RAM was quite expensive, and enough memory to handle a complete frame buffer at even 1000dpi would be prohibitive. The Alphatype dealt with this by representing fonts in an outline format handled directly by special hardware. In particular, it had an S100 backplane (typical for microcomputers in the day) into which was plugged a CPU card (an 8088, I think), a RAM card (64k or less), and four special-purpose cards, each of which knew how to take a character outline, trace through it, and generate a single vertical slice of bits of the character's bitmap.
A bit more about the physical machine, to understand how things fit together: It was about the size and shape of a large clothes washer. Inside, on the bottom, was a CRT sitting on its back side, facing up. There was a mirror and lens mounted above it, on a gimbal system that could move it left/right and up/down via stepper motors (kind of like modern Coke machines that pick a bottle from the row/column you select and bring it to the dispenser area). And, at the back, there was a slot in which you'd place a big sheet of photo paper (maybe 3ft by 3ft) that would hang vertically.
OK, we're all set to go. With the paper in, the lens gets moved so that it's focused on the very top left of the paper, and the horizontal stepper motor, under control of the CPU, starts moving it rightwards. Simultaneously, the CPU tells the first decoder card to DMA the outline info for the first character on the page, and to get the first vertical slice ready. When the stepper motor says it's gotten to the right spot, the CPU tells the decoder card to send its vertical slice to the CRT, which flashes it, and thus exposes the photo paper. In the meantime, the CPU has told the second card to get ready with the second vertical slice, so that there can be a bit of double-buffering, with one slice ready to flash while the next one is being computed. When the continuously-moving horizontal stepper gives the word, the second slice is flashed, and so on. (Why two more outline cards? Well, there might be a kern between characters that slightly overlaps them (think "VA"), and the whole thing is so slow we don't want to need a second pass, so actually two cards might flash at once, one with the last slice or two of the "V" and the other with the first slice of the "A".)
So, once a line is completed, the vertical stepper motor moves the lens down the page to the next baseline, and then the second line starts, this time right-to-left, to double throughput. But therein lies the first fallacy of the 5333dpi resolution: There is enough hysteresis in the worm gear drive that you don't really know where you are to 1/5333 of an inch. The system relies on the fact that nobody notices that alternate lines are slightly misaligned horizontally (which also makes it all the more important that you don't have to make a second pass to handle overlapping kerned characters; there it might be noticeable).
Looking closer at the CRT and lens: basically the height of the CRT (~1200 pixels, IIRC) gets reduced onto the photo paper to a maximum font size of ~18pt (IIRC), or 1/4in, giving a nominal resolution of ~5000dpi on the paper. But this design means you can't typeset a character taller than a certain size without breaking it into vertical pieces and setting them on separate baseline passes. Because of the hysteresis mentioned above, we had to make sure all split-up characters were only exposed on left-to-right passes, thus slowing things down. Even then, though, you could see that the pieces still didn't quite line up, and they also suffered from some effects of the lack of sharpness of the entire optical system. You can actually see this in the published 2nd edition of Vol 2.
Finishing up, once the sheet was done (six pages fit for Knuth's books, three across and two down), the system would pause, and the operator remove the photo paper, start it through the chemical developer, load another sheet, and push the button to continue the typesetting.
It's worth noting that the firmware that ran on the 8088 as supplied by Alphatype was not up to the job of handling dynamically downloaded Metafont characters, so Knuth re-wrote it from scratch. We're talking 7 simultaneous levels of interrupt (4 outline cards, 2 stepper motors that you had to accelerate properly and then keep going at a constant rate, and the RS-232 input coming from the DEC-20 mainframe with its own protocol). In assembly code. With the only debugging being from a 4x4 keyboard ("0-9 A-F") and a 16 character display. Fun times!
Now, if anybody asks, I can describe the replacement Autologic APS-5 that we replaced it with for the next volume. Teaser: Lower nominal resolution, but much nicer final images. No microcode required, but sent actual bitmaps, slowly but surely, and we were only able to do it because they accidentally sent a manual that specified the secret run-length encoding scheme.
Every time I'm on FB, it's pleasant, really pleasant. But every single time, I'll go on the site to do one thing, like check on the plans for my friend's birthday party, and I end up spending 30 minutes or so more than I expected just browsing aimlessly.
It's not like they are putting a gun to my head, but they are using a billion dollars of research and my friends' faces to entice me into doing something I don't want. I'm more or less a monkey, and my social circuitry is just so super-stimulated that it's annoying.
And all of that is before the FOMO kicks in and I start feeling bad that my life isn't as nice as my friends', even though, among my friends, I have a very enviable life (my friends have confided in me as much). But my everyday life cannot compete with the highlight reel of all the best moments of all my friends' lives.
Even though I know all of this stuff consciously, it still affects me, because keeping guard and watch over everything constantly is hard. Falling into how it's designed to make you feel is easy.
Deleting FB off your phone is the best thing.
This overwhelming feeling, I'm convinced, is one of the driving factors that created Instagram and Snapchat, because they were smaller and less invasive. Ironically, they became just like FB, or in some ways (to some people) worse, as they grew.
We are social animals and FB weaponizes the faces of all of your friends to suck as much attention from you as possible. It's as unfair a fight as there ever has been in the history of commerce.
I do not believe what we read is "forgotten" as in deleted. I believe it remains as a distinct series of patterns that may or may not be elicited again, depending on circumstances. Proust's "madeleine" is a perceptual example of this, and we all have many such. Often, entire passages of books I have read come back up to my awareness without me even knowing what the book was. I could possibly investigate, but I prefer to rely on this human ability to retrieve percepts or meaning in an apparently random fashion, as I trust my brain entirely.
There are amply enough potential connections, or patterns of connections, in anyone's brain to store, in one way or another, pretty much anything we are exposed to, from the trivial to the essential.
As far as I am concerned, a book read is a book stored. Maybe not wilfully remembered, but mine.
It's worth noting that the meanings of a number of very important words have changed over the last 500 years -- sometimes enlarged, sometimes really changing, including: true/false, fact, theory, explanation, believe, logic, etc.
And "thinking". Consider the difference between using logic in traditional mathematics to how thinking needs to be done in the sciences. In the former, we are deriving something, in the latter we are -negotiating- between phenomena and our representations. In science, we don't get to nail our premises/definitions, and we don't get to enumerate the extent cases. We only get inductions. And so on. A "perfectly reasoned conclusion" still needs to be checked out with nature.
In life in the large we have to be even more careful about logic, because it plays into several dozens of biases we have -- ultimately from genetics, but many have been honed by our cultures. In order to do modern thinking in almost every arena, we have to add a lot of context to normal logic, and often have to paradoxically weaken it in order to think better.
In talks, I try to get audiences to understand that logic is not nearly as good a way to think about thinking and do thinking as (say) science is. And science is not the only powerful context we can use to help us calibrate our mental compasses.
With regard to "thinking", logic is one of the servants of the Art, not the Master.
> The planets also are very close to each other. If a person was standing on one of the planet’s surface, they could gaze up and potentially see geological features or clouds of neighboring worlds, which would sometimes appear larger than the moon in Earth's sky.
This reminds me of one of my favorite films ever. It is just four minutes long.