kusmi · on Nov 9, 2017

Excellent, where can I trade in my ducky for this upgraded version?

kusmi · on Nov 7, 2017

Alfresco is open source enterprise content management software with a frontend UI called Alfresco Share. Installation is simple using their all-inclusive executable, which bundles the server, the database, and Solr for searching any and all data. It also has a concept of business workflows, which you say is a pain point for you. Each document can have workflows assigned to it, at which point it has a life of its own: passing from user to user, emailing itself, tagging itself, etc. There is a learning curve on the dev side, especially if you aren't Java savvy, but once you get the hang of it, it's very powerful. I've personally written an API in Lua for syncing documents. I assign bot accounts from the Share UI, which are given permission to crawl and sync directories to bot accounts on the server (origin server or external server). Each bot has a job to do on each document it gets, for which it writes and dispatches a service with systemd, then uploads the processed documents back to Alfresco. And this is probably 1% of what Alfresco is actually capable of (and my implementation reeks of hackiness, because I hate Java). A team of devs is bound to get more use out of it.

kusmi · on Oct 25, 2017

I once wrote a thing that would take text files uploaded to the Alfresco ECM, containing run options or input data for some automation script sitting on a server on the corporate subnet. Depending on how the text file was tagged or which folder it was in, it would create service and timer unit files for systemd. The files would sync to the server via Alfresco's Atom API, and any output generated would be uploaded back into Alfresco. This was neat because you could make Alfresco user accounts for these bots, turning them on or off for any project by simply adding the user, and then control which directories were crawled for tasks by assigning read permissions to the bot account, all via the client-side Alfresco Share interface. Server images containing the bot code could be provisioned in, say, AWS and integrated into an Alfresco cluster. It made it much simpler for non-technical users to run scheduled web crawlers, pipe their output into document formatters for web publication, or email the results out as attachments to a given email list, all from the nice Alfresco Share interface.
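For anyone curious what the systemd part could look like, here is a minimal sketch in Python (not the original code). It assumes the folder name alone picks the schedule; `FOLDER_SCHEDULES`, `render_units`, and `units_for_upload` are hypothetical names made up for illustration, and the Alfresco sync side is left out entirely.

```python
# Hypothetical mapping: Alfresco folder name -> systemd OnCalendar value.
FOLDER_SCHEDULES = {"hourly": "hourly", "daily": "daily", "weekly": "weekly"}


def render_units(job_name, command, on_calendar):
    """Return {filename: contents} for a oneshot service and its timer."""
    service = "\n".join([
        "[Unit]",
        f"Description=Bot job {job_name}",
        "",
        "[Service]",
        "Type=oneshot",
        f"ExecStart={command}",
        "",
    ])
    timer = "\n".join([
        "[Unit]",
        f"Description=Schedule for bot job {job_name}",
        "",
        "[Timer]",
        f"OnCalendar={on_calendar}",
        "Persistent=true",
        "",
        "[Install]",
        "WantedBy=timers.target",
        "",
    ])
    # A timer named X.timer activates X.service by default.
    return {f"{job_name}.service": service, f"{job_name}.timer": timer}


def units_for_upload(folder, job_name, command):
    """Pick a schedule from the folder the file was uploaded to."""
    on_calendar = FOLDER_SCHEDULES.get(folder)
    if on_calendar is None:
        return {}  # unknown folder: nothing to schedule
    return render_units(job_name, command, on_calendar)
```

The resulting files would then be written to a systemd unit directory and the timer enabled; that step is omitted here because it depends on how the bot account is set up on the server.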

kusmi · on Oct 23, 2017

“I’d be the last one to give you a quote saying that we don’t need to bother with these [unsequenced] regions.”

I wonder if he said this with a straight face.

kusmi · on Oct 23, 2017

Cambridge Analytica is propped up in a major way by Robert Mercer of Renaissance Technologies. See https://www.newyorker.com/magazine/2017/03/27/the-reclusive-....

I happen to work near their corporate headquarters in NY, and I can't tell you the number of times I had to fight the urge to get into their network.

kusmi · on Oct 19, 2017

I don't need it to be faster, I need it to stop crashing whenever I try to stuff over 2GB into memory.

neomantra · on Oct 19, 2017

I hear ya.... try:

* Building with LJ_GC64 mode. [1]

It is newer but people are using it and it has been merged by Mike Pall into v2.1. Still beta though.

* Using FFI to allocate large objects or large pools of small objects.

I made some simple containers to help with the FFI parts [2] and also a (not recently updated) jemalloc binding [3] to help tune memory usage. Here's a gist I made with some experimental results [4].

[1] https://github.com/LuaJIT/LuaJIT/issues/25

[2] https://github.com/neomantra/lds

[3] https://github.com/neomantra/luajit-jemalloc

[4] https://gist.github.com/neomantra/9122165

kusmi · on Oct 9, 2017

Doesn't e-Estonia use blockchain for identity?

dmitrygr · on Oct 9, 2017

No. They simply use digital signatures. Blockchain is as needed here as a bicycle is needed by a fish.

kusmi · on Oct 7, 2017

Edge cases. I feel like edge cases balloon exponentially relative to data size, to the point where you might as well just screw the automation and handle each entry individually. Who cares, it's never going to be 'complete' anyway.

kusmi · on Oct 4, 2017

I got the T450s recently off eBay. I would not make the same purchase again. Instead I would go for the X series models. This is because the keys on a T450s-size laptop (which is larger) are too far apart for coding. Then there's the TN panel screen: the viewing angles are so bad that at 14" you have to choose whether you want the top or the bottom of the screen to be clearly visible; the other side will appear washed out. I never noticed this on the X230, which probably has an equally shitty screen, because its size is just right for it never to be an issue.

If I could return the T450s I would go for an X240 or X230. Upgrade the battery, and toss in an SSD if it doesn't already have one.

Also, check out mini PCs like the Gigabyte Brix. I use them as headless dev boxes on the local network. They've run 24/7 without issue for months now.

lonesword · on Oct 4, 2017

I already work on a full-size keyboard, so I guess keys being spaced out won't be an issue. Also, a refurbished T450 would be too expensive for me anyway. Sticking with a T420 or T430.

kusmi · on Oct 4, 2017

The X230 sounds right up your alley then; you can find one for as low as $200. If you're a hardened Emacs user, your fingers will thank you.

kusmi · on Oct 1, 2017

I embedded the sentiment analyzer from Stanford CoreNLP into a web crawler I wrote, with the idea of instructing the crawler to follow only links surrounded by positive (or only by negative) text. Didn't seem all that useful at the time, but now I wonder if it's worth digging up?
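Roughly, the link-filtering idea looks like the sketch below. This is Python with a toy word-list scorer standing in for CoreNLP (it is not the CoreNLP API), and it uses a net score rather than a strict "only positive" test; `toy_sentiment`, `links_to_follow`, and the word lists are all made up for illustration.

```python
import re

# Toy lexicon, a stand-in for a real sentiment model.
POSITIVE = {"good", "great", "excellent", "love", "useful"}
NEGATIVE = {"bad", "awful", "terrible", "hate", "useless"}


def toy_sentiment(text):
    """Net count of positive minus negative words in the text."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)


def links_to_follow(links, want="positive", score=toy_sentiment):
    """Keep URLs whose surrounding text scores in the wanted direction.

    `links` is an iterable of (url, surrounding_text) pairs that the
    crawler is assumed to have extracted already.
    """
    keep = []
    for url, context in links:
        s = score(context)
        if (want == "positive" and s > 0) or (want == "negative" and s < 0):
            keep.append(url)
    return keep
```

Swapping in a real model only means passing a different `score` function; links with neutral or unscored context are simply never followed.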

Iv · on Oct 1, 2017

I wonder if you could feed it, e.g., Reddit comments and identify long chains of respectful conversation, for instance.

It could also be a useful moderation tool by pointing out who first started trolling and turned a conversation into a feces-flinging contest.

divanvisagie · on Oct 2, 2017

You may want to train it on some new data then. The data CoreNLP comes trained with is basically Yelp-review type of stuff, so people talking about a product. It seems to work fine for people talking about other people too, but I think you will get much better results if you train it to actually detect trolls.

kusmi · on Oct 3, 2017

Maybe set up a crowdfund to make the internet's first troll content training corpus.

Iv · on Oct 2, 2017

Try to predict karma scores then?