Solid Project: All of your data, under your control (solidproject.org)
350 points by gibsonf1 on Feb 1, 2021 | 118 comments



1) I am a big fan of the general philosophy of Solid. But as others have mentioned, I've found it extremely difficult to understand specifics of the proposal or to see concrete progress, based on publicly available resources.

The best place I've found is Ruben Verborgh's blog (he's a researcher who's affiliated with Solid). For example here's a nice post which goes into more detail on the ideas behind Solid:

https://ruben.verborgh.org/blog/2017/12/20/paradigm-shifts-f...

2) In terms of applications, I'd personally like to see more of a focus on productivity applications, rather than so much focus on social media. Better interop between cloud SaaS apps would be valuable to businesses and professionals, and would sidestep many thorny challenges like decentralized moderation.

As a concrete starting point: what if I could store my Google Docs text files on my own storage layer, and edit them in realtime using a variety of editor apps? This would resemble my ability to edit a .txt file on my computer in vim or TextEdit, but would port that metaphor to the world of modern online collaboration.

Here's a short Twitter thread I wrote on this topic:

https://twitter.com/geoffreylitt/status/1355255162626068482

3) I'm very curious to see more incremental paths from the current web to a decentralized approach. For example, what if we could start annotating existing websites with private data and sharing those annotations P2P, rather than starting over from scratch?

I've explored this a bit with my Wildcard project, where users can store annotations in a "spreadsheet" that is linked to a web page:

https://www.geoffreylitt.com/wildcard/


I think that with a problem statement of "I want to control my data and how third parties use my data" Solid makes a good deal of sense. Personal information should be owned by you and you should be able to take it from service to service as you see fit. The two problems that I find with Solid, based on my own limited understanding are as follows:

(1) Services don't need to give back to Solid - Services (Facebook, your medical provider, whatever else you are using) do not have a clear incentive to provide their own data on you back to your Solid pod. It is far easier for them to keep it: it lets them do offline processing, and it keeps you more locked into their service. I'm not sure how one would solve this issue.

(2) Much like mobile apps with excessive permissions and the abuse of tracking elements - I don't see how Solid prevents the abuse of its service. If Solid catches on and Facebook has a permissions check saying "Let Facebook do 'SELECT * FROM .;' on your Solid data", how many people will click yes? Even if you request it each time, once the data is copied out, it is out there and can be packaged and resold, used to build advertising profiles, etc. You're back to the original problem of not being able to limit access to your data, but with extra steps. Where I think this could be solved is by Solid not providing the data directly, but by being a service which can answer queries. Queries could be items such as "Does user like cats? y/n/m". Or it could be something like "Here is an anonymized dataset being built out. Please add your input to it." Replies to queries could also include some deliberately wrong or misleading answers, depending on the service and endpoint, to obfuscate your personal data in places that don't need it. While this can still be abused, it raises the bar for abuse.
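
A rough sketch of that query-answering idea, in Python. To be clear, this is not anything Solid specifies: `QueryPod`, `ask`, and `p_truth` are made-up names, and the noise scheme is plain randomized response (answer truthfully with some probability, otherwise flip a coin), which I'm using here as one possible way to get "deliberately wrong answers".

```python
from __future__ import annotations

import random


def randomized_response(truth: bool, p_truth: float, rng: random.Random) -> bool:
    """Return the true answer with probability p_truth, otherwise a fair coin flip."""
    if rng.random() < p_truth:
        return truth
    return rng.random() < 0.5


class QueryPod:
    """Hypothetical pod that answers yes/no questions instead of handing out raw data."""

    def __init__(self, facts: dict[str, bool], p_truth: float = 0.75, seed: int | None = None):
        self._facts = facts              # private data, never returned directly
        self._p_truth = p_truth          # how often the honest answer is given
        self._rng = random.Random(seed)  # seedable only to make the sketch testable

    def ask(self, question: str) -> str:
        """Answer 'y', 'n', or 'm' (maybe) for a known or unknown question."""
        if question not in self._facts:
            return "m"
        answer = randomized_response(self._facts[question], self._p_truth, self._rng)
        return "y" if answer else "n"


if __name__ == "__main__":
    pod = QueryPod({"likes_cats": True})
    print(pod.ask("likes_cats"), pod.ask("likes_dogs"))
```

With `p_truth = 0.75`, a true "yes" comes back as "y" with probability 0.75 + 0.25 × 0.5 = 0.875, so a single reply is deniable while aggregates over many users stay roughly usable. It does nothing about a service that simply asks the same question many times, so a real design would need rate limits or cached answers.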


The problem with Facebook is not that they are using user-uploaded content; it is that the user has no clue whatsoever that data is being collected by means of tracking actions (likes, time spent on a video...) and cross-site tracking, or what it is going to be used for (ads? making the site more addictive? recommendations?).

It's an issue of collecting user-generated data without awareness and with a lack of transparency. That's very different from "I upload a picture and I share it with my closest friends only"; for that issue, one could argue Facebook has a fair enough UI/UX.

I'm assuming that what would go into a pod is decided by the user and therefore is a separate issue.

I imagine the permission you are describing more like "Let Facebook access your Wedding Photos pod", if well implemented, and "Let Garage Band access your Music pod".


> The problem with Facebook is not that they are using user-uploaded content; it is that the user has no clue whatsoever that data is being collected by means of tracking actions (likes, time spent on a video...) and cross-site tracking, or what it is going to be used for (ads? making the site more addictive? recommendations?).

I'd say the two biggest questions that outweigh any others are:

Who will access the data?

What will they do with it?

The problem is that few of us know how our data is being used against us. "We will use the latest artificial intelligence methods to convince you that [political issue] is a good thing, since you're just on the other side of the fence."

The key thing you didn't mention is who the data will be sold to. I doubt the extremist nonsense, not related at all to any videos I've watched, creeps into my YouTube recommendations by accident. Who's paying them to generate clicks and viewing hours? What conversion are they attempting with my information? Only one side understands this game. Is there any justification for not having to reveal who paid them for my information or my clicks?


Another problem is that once data has been accessed, it can be stored and analyzed. Maybe we need to play big tech's own game and apply ToS to any data we allow them to access?


When you put it that way it feels like Solid is a technical solution to a legislative issue.


Or perhaps a complement to a legislative issue. Solid could be an easy tool/set-of-practices that allows companies to abide by data privacy laws.


This exactly. This problem can never get better until normal individuals have the capacity and leverage to demand reasonable contractual limits on the use of their data, and the confidence that these limits will be enforced.

That's a tall order when we don't even have the language to describe what data-usage limits even mean.


You raise a good point - my example of asking if the user likes cats is something that could be an inferred metric based on their clicking on cat pictures and having lots of pictures of cats. It isn't the user data itself which Solid aims to control.

Trying to subdivide pods so that services map access from album A -> pod B and album B -> pod C sounds like a painful UX problem as well. Do I place my ambient music in my music pod or my "office music" pod? Do I have a hierarchy of musical pods? I am joking a little, but it is a hard thing to fit into an easy flow.


One possible answer to both of your questions is "with legislation", and I feel like to some extent a tightening of the rules here is what Solid anticipates. Perhaps not to the point that we could expect the big players of today to adopt a platform like Solid, but perhaps where the companies of tomorrow looking to avoid the headaches of compliance see offloading data storage to a dedicated entity managed by the user as an appealing option.


I think that may be the answer! If data is toxic waste then offloading the storage of toxic waste may be the new way to focus on your "core competency" of running your service.


> If Solid catches on and Facebook has a permissions check saying "Let Facebook do 'SELECT * FROM .;' on your Solid data", how many people will click yes?

Excellent point. For the answer, just look at how many people give the FB app permission to access all their contacts and photos.

Hint: if you add an observer on Android, you will see that the WhatsApp and FB apps both scan your contact list every few minutes!

If you allow anything (access my contact list so I can find person X), they will take literally everything they can (your entire contact list, every few minutes).


With respect to (1), data is a liability (running afoul of GDPR, etc.), so if someone else can manage the storage then that reduces my liability considerably. Facebook might decide that their incentive to track you around the web exceeds their incentive to avoid this liability (if the two are even mutually exclusive), but for many, avoiding liability is a powerful incentive.


In March 2020 I was investigating Solid for legaltech projects I'm designing. I thought it sounded like a great way for a user to create a master profile that can be used with multiple web apps and service providers, while maintaining more control over their data. I engaged with the Solid community and with Ruben, who was very nice and helpful. However, I found that the tech was still at the hobby stage, and I didn't really think the toys being built on it were very compelling. It was very disappointing considering it was already a couple of years old and had a lot of hype around it. I hope that this moves forward, but it's almost a year later and it seems to be the same story.


there's a "joke" on the dweb groups that "SOLID is freenet with less caching of other people's illegal content"


Look at Hyper Hyper Space!

https://github.com/hyperhyperspace

Its goals are similar, the approach is more pragmatic (p2p data layer using standard web browsers and webrtc).


We are launching both a Solid server TrinPod and a suite of productivity apps with TrinApp launchpad - free versions will be available end of the month https://graphmetrix.com


Every now and then this gets shared again and every time I am struck by what a terrible job the front page does of explaining what Solid is and why anyone would want to use it. Like, okay, yeah, "control my own data" is a nice pitch in this day and age I guess, but... what can I do with it? "Any kind of data can be stored in a Solid pod, including normal files like you might store in Google Drive or Dropbox" - that sounds kinda useful, where's the link to SolidFile? Is there even a SolidFile app or is that just a thing someone could make with this protocol?

Hell, where's the front page link to check out the list of any apps using Solid? It's like two or three random clicks to find this page (https://solidproject.org/apps) if you are curious, and it's super uninspiring - why does one of the apps in the "Showcase" section at the top not even have a description?

Solid always just feels like a solution looking for a problem.


You're right. It is still not clear to me what/how this will be used (I think I have an idea but not sure; the front page should have made it quite clear).

It would have been great if they had described a target use-case right there on the front page.

The writeup seems to be targeting developers but it looks to me like the product is for the public (end-users). The site doesn't seem to have done a good enough job linking both of them.


I'm in the early stages of bootstrapping a SaaS app that prioritizes security and privacy. For my business, this would be great: users are far more likely to trust an app if the app won't be storing any of the application data itself -- if they can choose their own data hosting provider.

Google Cloud Drive, Box, and others already allow this kind of model to an extent. Third-party apps can access and modify user data hosted there, although not through a standardized API.


This project might be worth a look: https://www.etebase.com/


I'm the founder of Etebase, cool to see it mentioned here!

If anyone has any questions, please let me know (email in profile). We will also be giving a talk about Etebase at FOSDEM this year if anyone is interested: https://fosdem.org/2021/schedule/event/etebase/


E2E encrypted firebase equivalent, with CalDAV frontend for E2E encrypted calendar from mobile & Thunderbird clients, with OSS clients, server and multiple language bindings? Seems too good to be true! Thanks for the pointer.


I found this months ago, lost my pinned tabs and spent _hours_ looking for this. This will teach me to not bookmark interesting stuff. Thank you!


This is fantastic. Thank you!


Thanks (I'm the founder of Etebase)!

Let me know if you have any questions, my email is tom @ the domain.


Whoah!! That's frikkin awesome!

Has this made front page in HN yet at any point?


This post[0] by Ruben Verborgh helped me understand the problem Solid is trying to solve:

> In December 2019, Google and Facebook proudly announced a major milestone, which was echoed in news media all around the world: it is now possible to copy a picture from Facebook to Google Photos. This news came in mere months after we celebrated the 50th anniversary of another technological feat: the moon landing of 20 July 1969, when millions of households witnessed Neil Armstrong take a giant leap for mankind.

> So let me get this straight: two of the largest tech companies in history make headlines because in 2019, they move a single photo over the whopping distance of 11 km it takes from the Facebook headquarters in Menlo Park to the Googleplex in Mountain View, whereas in 1969, we sent live video signals from 380,000 km away on the actual moon?

> If those two companies, both widely hailed as pinnacles of technology, genuinely consider this to be innovation they are proud of, the only logical conclusion is that data-driven innovation today is fundamentally broken.

> The problem is widespread and not limited to technology or social media. Any sector that requires personal data to deliver services, from retail over insurance to health, suffers from the damaging effects of siloization. Companies increasingly need more access to data, but they won’t get there if they keep on collecting that data themselves.[...]

0. https://ruben.verborgh.org/blog/2020/12/07/a-data-ecosystem-...


I've never seen a compelling explanation of how this is supposed to work. Do people self-host their pods? Is the pod one more thing to sign up with an external provider?

If I'm self-hosting my pod anyway, then wouldn't I want my self-hosting to include the (for example) photo-library viewing layer instead of giving google photos or flickr access to the photos in my pod? If I did give them access, then how is it more private than if they hosted it themselves? Maybe the pitch is more about mobility of services and avoiding lock-in than it is about privacy and security, but that doesn't seem to be the message I get. Does anyone have any pro-solid blog posts or articles you think might be helpful in convincing me? It smells like it is well aligned with my values but I just can't see how it will actually work.


The hosting story is still a little messy, but in my version of this, I think it needs a trusted third-party to offer hosting, along with an open committee to manage the data specifications.

The advantages of separating data storage from usage are similar to those in application development. If you have a robust model for defining and retrieving your data, the tools for working with that data can be iterated on independently from the data itself.

So let's say your photos are in Google Photos, because your phone backs them up there automatically, but all your friends who you want to see the photos are on Facebook. Theoretically, "Solid" could be pulling those photos into a standard data specification, and provide easy tools for automating how they get shared into Facebook.

Then let's say you wanted to edit some of those photos, so you open up PhotoShop Solid, and get editing.

But then you join Instagram, and you want to quickly show off your Photoshop skills. All you need to do is connect Solid.

The key aspect here is that you can use as many different tools as you like to work with your data, but it's the same data, and if Facebook goes belly-up tomorrow, or you just don't want your photos there anymore, Solid has your back.

One thing that I don't think is well defined yet, but really should be, is how services will request access to your data. This needs some kind of standard interface, very similar to how phones let you customize app access. It should make it very clear what data you are "selling" to Facebook, and what they are providing in return for that data.


That makes sense, but it also means the selling point is data portability more than data privacy/security. Since I would already need to give an application access to my data, I might as well have them host it for me. It still feels like Solid is a more complex and harder-to-reason-about solution to the same problem that GDPR data takeout tries to solve. Consistent open standards are nice, but it doesn't take that much work for Facebook (or others) to accept a Google Photos takeout dump as an input format.

If the only real problem that Solid solves is "I lose my data if provider X locks me out or goes bankrupt", then it isn't even good enough, since the third-party pod-hosting company can have the same failure mode. Maybe we should expand on GDPR data takeout legislation to require something like API-driven access that would allow people/companies to build automated backup/export solutions.

That becomes a much simpler thing to build and get buy-in for instead of a whole new paradigm which companies aren't incentivized to follow and is hard for users to understand.


I think data portability is the most obvious value proposition for most people, especially between potentially hostile services (Facebook and Google Photos could easily inter-operate, but that's not something either company wants to invest in). For someone like me, who has moved a streaming music "collection" between 4 services over the past decade or so, it would have been real handy.

It also, conceptually at least, dramatically lowers the barrier to entry for new services in the same space, since they can use the existing data models and persistence layers.

The privacy aspects are secondary, at least to start, though I think the last few years have helped establish more context for them. Providing granular controls for data access in a common format across services seems like a big win, at least compared to my experience hunting through preferences/settings/account details/etc menus in Facebook and other places.

Finally, I think there's significant potential value in this being used as a way to authenticate the source of media. As deepfakes are getting better and easier to create, having all the video and audio of you connected to a known identity seems like a necessity. Letting Facebook or Google be that identity provider would be very bad.


The privacy aspects need more attention. There does need to be a neutral and trusted identity provider. It should be possible for the pod owner to control access via some kind of scope/role/policy setup. That's going to be complicated for most people. Using an example from elsewhere in this thread, I probably would want PhotoShop to be able to have full access to many (but not all) images in my pod, but Instagram should really only be able to read the images I want to post there.
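
To make the PhotoShop/Instagram example concrete, here is a minimal sketch in Python of a per-app grant check. This is not Solid's actual access-control mechanism or API; `Pod`, `Grant`, the mode names, and the paths are all hypothetical, and grants list explicit resources rather than patterns to keep it short.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Grant:
    app: str                    # which application the grant is for
    modes: frozenset[str]       # e.g. {"read"} or {"read", "write"}
    resources: frozenset[str]   # explicit resource paths covered by the grant


class Pod:
    """Hypothetical pod that checks every request against the owner's grants."""

    def __init__(self):
        self._grants = []

    def grant(self, app, modes, resources):
        self._grants.append(Grant(app, frozenset(modes), frozenset(resources)))

    def revoke(self, app):
        # Only blocks future requests; it cannot recall copies already made.
        self._grants = [g for g in self._grants if g.app != app]

    def allowed(self, app, mode, resource) -> bool:
        return any(
            g.app == app and mode in g.modes and resource in g.resources
            for g in self._grants
        )


if __name__ == "__main__":
    pod = Pod()
    pod.grant("photoshop", {"read", "write"}, {"/photos/a.jpg", "/photos/b.jpg"})
    pod.grant("instagram", {"read"}, {"/photos/a.jpg"})
    print(pod.allowed("instagram", "read", "/photos/b.jpg"))  # not granted
```

The hard part is not this check but the UI for building grants like these without overwhelming people, plus the fact (raised elsewhere in the thread) that revoking a grant does nothing about data an app already copied.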


I think federated identity is the way to go. Technically that's what we have now, i.e. every identity provider I'm aware of lets you reset your password via email, which is federated.

But we tried federated and users didn't care. They want convenience. Maybe with privacy apparently picking up some public interest, we can try again.


> every identity provider I'm aware of lets you reset your password via email, which is federated

I'm not sure I understand what you're getting at. Federated identity is far more than self-service password reset. And as far as convenience goes, how many places do you use your Google/GitHub/Facebook/LinkedIn/Twitter credentials to log into a third party? How convenient is that? That's identity federation. The problem is that those companies own your identity profile and use it for their purposes, not yours.


Solid seems like an idea that only makes sense if your key goal is allowing cloud providers to offer proprietary services that use your data, while still storing it elsewhere.

But I'd say most people fall either into "I don't mind proprietary services holding my data" or "I want to self-host the preferably open source application and my data". I don't see the benefit of separating the two at all, and I think the only reason this project keeps coming up and around is the name of the guy pushing it.

"Solid is a mid-course correction for the Web by its inventor, Sir Tim Berners-Lee" is very literally the only selling point I think Solid has.

Sandstorm or Cloudron have a better model, self-host (or pay someone to host for you) all of your apps and data together, in an easy interface for adding new apps like an app store.


I really liked Sandstorm’s capability-based security model. Very cool concept for a team app server.


Spritely project is developing Goblins, implementing the capability-based CapTP transport protocol for distributed programming and the fediverse, which looks very interesting.

https://spritelyproject.org#goblins


It really is. The challenge is that it is tough to get app support for it; everyone expects their apps to work the old way of doing things. ;)


I had this idea a while ago when trying to figure out how we're going to do web-tech in space. Obviously you don't want to send an entire web page, just the data.

If data is clearly templated, it's up to the user to choose a UI for displaying it. Many UIs can compete. They don't have to be proprietary.

In time, UI would become an art form; like music.


You can choose whether to self-host or whether to have someone else handle that for you.

As for also hosting your apps: I think the answer to that is the same as for why you might still choose to use an email client even if you're hosting your email server.

However, I would expect the majority of people not to self-host, but to have that handled by a party they trust. They would then not be limited by that host also having to support the particular photo-viewing layer they are interested in, but instead are able to choose one independently.

(Disclosure: I work for Inrupt, but opinions are my own.)


>photo-library viewing layer instead of giving google photos or flickr access to the photos in my pod?

IMHO, privacy is the wrong tack for these arguments. The issue is that Flickr or Google Photos can shut down and/or raise prices while all your data is stuck there.


You may also like Etebase (I created it), which is an open-source and end-to-end encrypted backend for apps. So it removes the extra layer (an additional provider) but still maintains the privacy and security benefits of self-hosting.


Philosophically, and in the long-term, Solid is compelling to me for all of the reasons that the project purports to exist. But we have a classic chicken-and-egg problem with the absence of both reputable pod providers and the development of an application ecosystem.

Practically, and immediately, it is compelling to me as a developer of small-ish applications (plugins, etc.) in which I want to give users the ability to store some data (preferences, etc.) in the cloud without my having to manage that data, or the associated services/infrastructure - including having to deal with absorbing or recouping the cost.

Dropbox once offered the Datastore API[0], which was a handy little bring-your-own-database service allowing apps read/write access to a key/value store in a user's account (Dropbox accounts being quite common then), but it was deprecated[1] due to lack of traction at the time.

[0] https://dropbox.tech/developers/the-datastore-api-a-new-way-...

[1] https://dropbox.tech/developers/deprecating-the-sync-and-dat...


I think it's important to point out (if only for end-users) that Solid puts your data "under your control" in the sense of "allows you to authorize third party access to your data", not in the sense of "can actually control your data in practice, post-access".

It'll let you granularly authorize first-party access to the data you have in your pod, but there can’t be any technical guarantees with regards to illegitimate sharing or otherwise copying (many might at least cache, for instance) – nor about what is collected and shared outside this system. Its own documentation even states this, if you dig a little.

I keep seeing data-hubs and identity-providers touting themselves as solutions to the web's privacy issues, but I don't see how they actually solve anything. It seems like an attempted technical solution to a social problem to me.

The real problem with how data is used these days is really that a bunch of data is collected opaquely, unethically, and in some cases illegally – and possibly shared. The whole system including data brokers and real time bidding is out of control.


How does this work in practice? I didn't see a worked example on the site.

For instance, say I give Solid Social access to my contacts and my photos. At that point, Solid Social can access all my contacts and photos, copy them, and then do whatever they please with them (even if I later revoke permissions). In the current web, I upload my photos to a website, which I then trust not to use them inappropriately. In the Solid world, it seems like I give photo-access to a website (which could then copy them) and trust that website to not use them inappropriately. This seems like the current system, but with extra steps.

In terms of data compatibility, I think that there's a potentially compelling case (user-driven standards vs. bespoke handling per website), but again I fail to see the practical implications. For example, if I upload a video to a website and it becomes incompatible with other websites, my assumption is that there's some reason for that step. It could be out of walled-garden malice, but it also could be that the website is encoding it and co-locating it in such a way that it'll be really fast for other users to watch. In Solid-world, it's unclear what the flow here should be: does the website only get to consume the videos as I have them on my pod (potentially terrible performance)? Does the website get to still do optimizations (and potentially keep artifacts of my video after I revoke permissions?)

I like the drive towards consistent standards for interfacing with data, but beyond that I fail to see the positive change in most web-based user experiences. I could just be missing the important bits here (and please let me know if so).


Solid is just the technical part. You're also going to need the regulatory part that says it's illegal for a company to store data after you revoke access.

But the technical part is still necessary.


How is this really any fundamentally different from say GDPR then?


Because Solid is a completely different model for where the data is stored. It decouples the apps from the storage.



This project is doomed in isolation. It needs to be coupled with at least one major popular/killer service, and preferably more than that. Plus make that as easy as uploading data to “the cloud,” and as transparent, plus “free” to meaningfully compete.

Right now it just looks like technology in search of a problem.


> Right now it just looks like technology in search of a problem.

This is a common turn of phrase (which has turned up more than once here, even), but it's not appropriate to describe Solid. Solid is a half-implemented solution to a real problem.


The whole internet was built on this concept. The Web technologies were built long before e-commerce or dating sites started using them.


I'm not sure what you mean.


Reminds me of https://remotestorage.io/ which I have used in a (mostly personal) website. I like that I can provide a service as a static website, and users pay for, manage and have control over their own storage. One of the things that could help it bootstrap is that they also support client-side Google Drive and Dropbox shims, so users who want to use something else can, but most users will just use one or the other.

There are downsides though.

- The API may be less than ideal for some use cases. This can make a well-performing app hard. For example there is no relational-sql API.

- You can't migrate users' data. So if you make changes to the format you need to retain backwards compatibility "forever".

- If you want to provide "discovery" or cross-user features you will need to have your own storage anyways.
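
On the migration point, the usual workaround is to upgrade records on read in the client, since the app can't run a batch migration against storage it doesn't own. A minimal Python sketch of that pattern follows; the record fields and the three format versions are invented for illustration and have nothing to do with remoteStorage's actual API.

```python
CURRENT_VERSION = 3


def _v1_to_v2(rec: dict) -> dict:
    # Hypothetical change: a single "name" field was split into "first"/"last".
    first, _, last = rec["name"].partition(" ")
    return {"version": 2, "first": first, "last": last}


def _v2_to_v3(rec: dict) -> dict:
    # Hypothetical change: a "tags" list was added, defaulting to empty.
    return {**rec, "version": 3, "tags": []}


# One upgrade step per old version; every step has to be kept "forever".
UPGRADES = {1: _v1_to_v2, 2: _v2_to_v3}


def load(rec: dict) -> dict:
    """Bring a stored record up to CURRENT_VERSION, one step at a time."""
    version = rec.get("version", 1)  # records written before versioning count as v1
    if version > CURRENT_VERSION:
        raise ValueError(f"record version {version} is newer than this client")
    while version < CURRENT_VERSION:
        rec = UPGRADES[version](rec)
        version = rec["version"]
    return rec


if __name__ == "__main__":
    print(load({"name": "Ada Lovelace"}))
```

The client can write the upgraded record back the next time the user saves, but it still has to ship every old upgrade step, because it never knows which users have not opened the app since the format changed.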


I actually started working on just such a shim yesterday. The idea is to represent all major cloud storage through a single simple frontend API. You have to implement each provider's OAuth flow, but once you have the files it's pretty much the same.

Dropbox, Google, and my own protocol[0] will come first, but Solid is planned eventually.

What I've learned so far:

* Google is extremely draconian unless you use their JavaScript picker. To really integrate nicely you would have to spend at least $15000-$75000 getting your app audited by a 3rd party.

* Dropbox picker integration is very easy and slick.

[0]: https://github.com/gemdrive
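
For anyone wondering what "a single simple frontend API" could look like, here is a hedged Python sketch. It is not the commenter's code or any provider's SDK: `StorageProvider`, `InMemoryProvider`, and `copy_all` are hypothetical names, and the in-memory class stands in for real backends so the example runs without OAuth or network access.

```python
from __future__ import annotations

from abc import ABC, abstractmethod


class StorageProvider(ABC):
    """Hypothetical common interface that each cloud backend would implement."""

    @abstractmethod
    def list_files(self, prefix: str = "") -> list[str]: ...

    @abstractmethod
    def read(self, path: str) -> bytes: ...

    @abstractmethod
    def write(self, path: str, data: bytes) -> None: ...


class InMemoryProvider(StorageProvider):
    """Stand-in backend; a real one would wrap Dropbox, Google Drive, a pod, etc."""

    def __init__(self):
        self._files = {}

    def list_files(self, prefix: str = "") -> list[str]:
        return sorted(p for p in self._files if p.startswith(prefix))

    def read(self, path: str) -> bytes:
        return self._files[path]

    def write(self, path: str, data: bytes) -> None:
        self._files[path] = data


def copy_all(src: StorageProvider, dst: StorageProvider, prefix: str = "") -> int:
    """Copy every file under prefix from one provider to another; return the count."""
    count = 0
    for path in src.list_files(prefix):
        dst.write(path, src.read(path))
        count += 1
    return count


if __name__ == "__main__":
    a, b = InMemoryProvider(), InMemoryProvider()
    a.write("/notes/todo.txt", b"buy milk")
    print(copy_all(a, b, "/notes/"), b.list_files())
```

App code written against `StorageProvider` never needs to know which backend it is talking to; the per-provider pain (auth flows, pickers, audits) stays inside each implementation.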


What Solid really needs to do is create a page that I could show to my nontechnical friends where it explains how it works and what services/sites support it. Without those two things being front and center, I can't see Solid being anything more than a niche interest for technical folks.


I think the problem there is that there is not much to show. But it will require that page at some point. Maybe it would help now as well, since managers and other "decision makers" might understand it better. And they have to implement it in the end, right?


At least for the services/sites/apps which support it, there is a page:

https://solidproject.org/apps

There are lots of apps, most of them work just fine, which is quite inspiring.


Lovely to see this come up again. It's really a great time to promote Solid, with the evidence of shady practices at the big corporations mounting, I know lots of folk would like a way to reclaim some of their data.

That said, I feel like the big starting point for Solid is exfiltration/syncing of data between services, and that appears to be missing, for the most part.

Give folk an app that will pull their music preferences from Spotify, and let them try out Amazon or Apple music with the same collection. It's a very simple data set, and a fairly common and comprehensible use case.

Doing photos from Facebook/Instagram/Twitter/etc would be a bigger lift, but not too much different in concept, and I think very compelling in terms of selling the value of the service.


Music, movie, and book history would be trivial to implement. This is timely, given the removal of Goodreads' API.

User profiles as well, à la gravatar.

How about work history, LinkedIn exports, standard resume formats?

A lot could be done with medical data, using FHIR's ontologies.

Fitness data also deserves a place there.

It's difficult to understand how Solid isn't a major success after all these years.


Is this related to Jaron Lanier's vision of union (in the labor sense) groups which sell to companies access to the group's anonymized data? In his model, one individual can belong to many groups and their information is like a share/stock they have in a pool of data with other people. They sell in a cooperative fashion to select companies, gaining power through anonymity and group bargaining power.


Data in those contexts nearly always have "reciprocal actions" fundamentally and those who gather the data know what they actually want out of it. Company-user interactions would likely produce better data. It feels like half of an understanding of both data usage and union style collective bargaining.

Unless I am missing something it seems a bit like trying to combine a semitruck and a bicycle to get green bulk cargo transportation. Sure both things individually have these advantages and it would be great to have all in one but implementing it is fundamentally nonviable.


Would love to learn more about this. Any resources/links on this subject?


Parent seems to be referring to MIDs [1]

"Mediators of Individual Data (MIDs) are a union-like organization championed by technologist Jaron Lanier as a framework for addressing the key issues around user-generated data like ownership, monetization, and reparations. An effective MID or data union could perform a number of critical functions on behalf of its members that are currently lacking in the digital economy."

There is also a great Harvard Business Review article, but it is paywalled.

[1] https://blog.datadividendproject.com/this-week-in-data-101-w...


"Access from your Country was disabled by the administrator.", The Netherlands not allowed (the hashtag for the project is #DDPforAll ironically) :(

http://archive.today/swdlk

Edit: DataDividendProject.com (https://archive.is/iFhT0). Given that it is all blocked, it is not that interesting to me.


Interesting. Do you have any good links?


I think something like this makes sense at some # of users closer to 10-20, or maybe even higher numbers, kind of monkeysphere / Dunbar's number level.

That is, a smaller group of people (an extended family, a neighborhood street, a collective ... fine, a cult) band their resources together to pay for pod hardware to realize some savings / distributed back-up, and have "edge" apps for trusted sharing within the pod, and then P2P (pod-to-pod?) apps for sharing outside the pod.

Right now, the economics just don't quite make sense.


My favorite example of this is image sharing for large extended families.

If most people thought about it, they'd realize they don't actually want pictures of their kids all over Instagram, but it's a way to distribute them and, after a fashion, back them up. We don't have to worry about the matriarch's house burning down and taking out our photo archive with it. Or her clicking on a malware link and wiping it all out.

If they had a piece of software that wasn't designed for building a startup (deeply technical, highly fiddly), Uncle Bob and you and your nephew Timmy could host geographically redundant copies of the data that the group has access to.


If this were to take off, I would expect to see some governments offering it to all citizens, both to support whatever digital services they provide, and as a tool for helping users manage their privacy.


Such as that of Flanders, for example: https://inrupt.com/flanders-solid


Exactly what I want in my tech… gov control of my data, combined with gov-paced innovation…


A "collaboration" led by Jan Jambon, the guy who said this about collaborating with the Nazis:

"The people who collaborated with the Germans had their reasons. Me, I was not alive during that time."

I don’t trust GAFAM to protect my privacy, but I wouldn’t trust this government either.


How would governments finance this? Developing these offerings with the quality of service of the GAFAM (which users take now for granted) costs a fortune (financed by advertising in practice as of today).


I think the self-hosting story only works for the wider world if those of us capable of self-hosting provide services to less-technical users we know.


Solid is awesome and I'm bullish on this model in the long run, but it seems to be caught in a classic chicken/egg problem. Nobody is going to use it until there are good apps, and nobody is going to make good apps until there is a market of users. IMO this is the same problem that kept Sandstorm and remoteStorage from getting big.

I see 2 solutions currently:

1. Make an awesome storage product, that just happens to implement Solid. It might have to start out also implementing S3, which unfortunately doesn't support OAuth, which basically kills it for apps. You could try making it compatible with another proprietary protocol like Google Drive, or provide a frontend library that easily talks to all the major cloud storage providers, plus Solid...

2. Take the slow track. If developers themselves want to use it, they'll write useful open source apps over time. This can take forever though. Even after 30 years there are still large feature gaps on Linux because the market is too small to attract big app developers.


> It might have to start out also implementing S3, which unfortunately doesn't support OAuth

The expectation that Solid is backed by a smart server that's intimately aware that it's hosting a Solid pod and what exactly it takes to do so is a critical failure in Solid's design. If Solid's protocols instead better accommodated "dead" formats that can be served by static web hosts, so your $USERNAME.github.io could be the front door to your data store—at least the parts of it that you already make available to the public, anyway—then we'd have passed the chicken-and-egg stage years ago and Solid would already be a runaway success.

At the end of the day, it's supposed to be about data, after all—linked data—and not about the endpoints/servers themselves. Compare to the ease of creating an RSS feed for your blog (you can get your static site generator to spit one out pretty easily) and contrast it to needing to provision and actively administer a Mastodon server, for example.
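To make the static-hosting idea concrete, here is a minimal sketch (not how Solid works today) of generating a "dead" linked-data file that any static web host could serve. The WebID URL, the name, and the helper function names are all hypothetical; the output is plain N-Triples using the FOAF vocabulary.

```python
FOAF = "http://xmlns.com/foaf/0.1/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def escape_literal(text):
    """Escape the characters that must be escaped inside an N-Triples string literal."""
    return (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def profile_ntriples(webid, name):
    """Return a tiny FOAF profile for `webid` as N-Triples text."""
    lines = [
        f"<{webid}> <{RDF_TYPE}> <{FOAF}Person> .",
        f'<{webid}> <{FOAF}name> "{escape_literal(name)}" .',
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical WebID on a static host; the output could be committed as profile.nt
    print(profile_ntriples("https://example.github.io/profile.nt#me", "Alice Example"), end="")
```

Like an RSS feed, the resulting file needs no server-side logic: a static site generator could emit it at build time.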


> "dead" formats that can be served by static web hosts, so your $USERNAME.github.io could be the front door to your data store—at least the parts of it that you already make available to the public

Could you share some examples of such data, other than "news" feeds? FOAF, for example? http://www.foaf-project.org/


Inline annotations[1][2], personal patch queues, project notes, bookmarks, calendars, photo galleries, reading lists, media trackers[3], anything you might think of dumping into a Gist or other pastebin-style site, anything you might think of dumping into a hosted TiddlyWiki, and so on. I.e., most of the type of data that Solid pods are already meant to handle, really (except served statically, without the POLP-violating[4] architecture that exists in Solid's current design).

1. https://hypothes.is

2. https://www.w3.org/annotation/

3. https://noeldemartin.github.io/media-kraken/

4. https://www.w3.org/2001/tag/doc/leastPower-2005-12-19.html


Do I understand this (https://solidproject.org/developers/tutorials/getting-starte...) correctly: you can only write Solid applications with frontend Javascript?


No, you're looking at a Getting Started Page targeted at web developers, hence they use the tools they are familiar with. These tools take care of auth/data format and other glue code so you can get started easily... But because Solid is based on open standards, you could write a desktop app that makes HTTP requests using your language's clients without issues, but you'll have to write some boilerplate code yourself until there's frameworks in your language that handle that for you.

I believe working with Solid is mostly about working with HTTP and RDF... so, if you're into the JVM, try Apache Jena[0].

PHP? Try EasyRDF[1].

Python? rdflib[2]

[0] https://jena.apache.org/

[1] https://www.easyrdf.org/

[2] https://rdflib.readthedocs.io/en/stable/
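As a rough illustration of the "it's mostly HTTP" point, here is a minimal sketch using only Python's standard library. It assumes a publicly readable resource, so no authentication is involved; the pod URL and the function name are hypothetical, and the request is built but not sent.

```python
import urllib.request


def build_resource_request(resource_url):
    """Build (but do not send) a GET request that asks for a Turtle representation."""
    return urllib.request.Request(
        resource_url,
        headers={"Accept": "text/turtle"},
        method="GET",
    )


if __name__ == "__main__":
    # Hypothetical public resource in a pod; substitute a real URL to try it.
    req = build_resource_request("https://pod.example/public/notes.ttl")
    print(req.get_method(), req.full_url, req.get_header("Accept"))
    # To actually fetch it: urllib.request.urlopen(req).read().decode("utf-8")
```

The returned Turtle text could then be handed to an RDF library such as the ones listed above. Private resources would additionally need the authentication boilerplate mentioned earlier, which is not shown here.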


Asking devs to work with RDF is the main issue I have with the approach of Solid.

I don't expect SPARQL and ontologies to go mainstream. So why "force your hand" and make Solid depend on the standards of the semantic web?

There are much simpler and productive approaches to solve the problem of data interoperability.


> There are much simpler and productive approaches to solve the problem of data interoperability.

Really? Which ones?


Set a standard.

Working with RDF is computationally intensive. And writing UIs that fit to various ontologies is not solved. So everyone inevitably ends up writing applications to fit a particular ontology.

Excluding theoretical quibbles, what changes? In practice, the particular ontology that your application supports is the standard.


> And writing UIs that fit to various ontologies is not solved.

Are you aware of any projects that try to solve this? Very interested.


The evaluation of SPARQL patterns is PSPACE-complete.

Which means that to approach the problem of writing UIs (running on realistic hardware) that adapt to different ontologies, you need to restrict the set of possible solutions...

That is, to limit the number and place constraints on the ontologies you accept.

The ActivityPub ecosystem is a glaring example of this. ActivityPub defines a set of baseline requirements (and a basic ontology) that all implementations need to support.

Once you meet the baseline, you can use linked data tools to extend the standard. Applications that understand your extension will also interact with it in the UI, those that don't understand your extension usually just ignore it.

But you need to specify base requirements (set the baseline for an extensible standard)! You can't just throw totally random ontologies at devs, it won't work. I think this is the error Solid makes.
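A minimal sketch of that "baseline plus ignorable extensions" pattern, treating the object as plain JSON. The baseline key set below is an assumption chosen for illustration, not the full ActivityStreams vocabulary, and `ex:mood` is a made-up extension property.

```python
import json

# Assumed baseline for illustration only; a real implementation would follow the spec.
BASELINE_KEYS = {"id", "type", "name", "content", "published"}


def split_baseline(raw_json):
    """Split a JSON object into (understood properties, names of ignored properties)."""
    obj = json.loads(raw_json)
    understood = {k: v for k, v in obj.items() if k in BASELINE_KEYS}
    ignored = sorted(k for k in obj if k not in BASELINE_KEYS)
    return understood, ignored


if __name__ == "__main__":
    note = json.dumps({
        "id": "https://social.example/notes/1",
        "type": "Note",
        "content": "Hello",
        "ex:mood": "curious",  # hypothetical extension property
    })
    understood, ignored = split_baseline(note)
    print(understood["type"], ignored)
```

An app that knows the extension would add its keys to the understood set; everyone else can still render the baseline part.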


> Solid is mostly about working with HTTP and RDF

Surely this incarnation of the RDF-based Semantic Web nirvana will succeed where all other attempts have failed! :)


Love the idea, but how does Solid prevent a malicious actor from replicating, over time, all the data that users authorize into a centralized repository? In that case, it only makes data consolidation even easier.


In my (personal) view, Solid is technology that enables service providers to put you in control of the data, not that forces them to do so. In other words, it's just part of the puzzle, where the other part is e.g. customer demand or regulation.


To use an old metaphor: locks on the doors are not what keep people from getting into a house.


I attended the presentation of Tim Berners Lee at MozFest 2018 where he outlined the Solid Project, and I've been waiting to see this project evolve. I'd like to see some more documentation and a 'get started' kind of tutorial to make this pretty abstract concept understandable. Looking around the project pages doesn't give me much incentive to start using it yet.

I'd love to see a 'Solid for dummies' kind of approach in the documentation, things are quite technical still. Some easy to understand examples would be nice!


Solid seems like a great technological idea. It empowers people to own their data and give others and apps access to it. This is empowering.

But!!!! people want useful things. Things that help them solve problems so they want to pick up those things. Solid is lacking these solutions right now.

Where is my todo list, contact app, calendar app, and so forth... all built on Solid? This is an opportunity. But until these kinds of solutions exist for Solid, it's going to stay a neat technology that hasn't bridged the gap to being useful.


There's definitely not many Solid-supporting apps at the moment, but as for the todo list, there is https://noeldemartin.github.io/solid-focus/


Indeed, I feel projects like this can invest for years in the platform, but fail to foster or develop the core useful apps people want to use. Ultimately, if your platform lacks that, few people will invest long enough to bring those apps from third parties.


To me the whole pitch is a bit too nebulous and seemingly inconsistent. What do they mean by "portable" and "interoperable" data standards? Does that mean specific file formats? Protocols? Apps?

What do they mean by "Linked Data"? What's novel about this, what makes it different from hyperlinks or shared Dropbox folders?

Why does it say pods are decentralized data stores, but then on the "about" page it tells me I have to host the pod on a server - how is that decentralized?


Linked Data does have a concrete definition: RDF data that is accessible through SPARQL, with well defined semantics. That is, following a published ontology and re-using existing vocabularies as much as possible.

Linked Data itself is not novel: https://www.w3.org/standards/semanticweb/data


I like the concept of Solid, and proposed it to the BeWelcome community (who are running a hospitality exchange network like CouchSurfing, but free).

Although Solid is a better technical idea, I'm not sure how the migration from PHP/MySQL would work. Nobody on the Solid forum replied to my post asking about it. Another guy, Chagai, is pushing for Matrix and the fediverse. Ultimately, there's many great technical ideas, but the real solution doesn't lie in writing code - it's in building the community.

Therefore I'm trying harder to create more activity in BeWelcome's existing community (which is hard when nobody's travelling), like an online meetup every Thursday. If it gets critical mass, I believe it could be as widespread as Wikipedia or OpenStreetMap (which it models in org structure). I wish that Solid could be a part of it, but that requires active developers, and currently it's hard enough to find someone willing to make an API or native iOS app, never mind a breaking back-end change that people aren't asking for.


Last month, I was working with some people on building a decentralized health data store for vaccine waitlists (a la Mailbox) and certificates and we were looking at Solid as one potential backbone. I really admire what they’re trying to do. However, as some other people have observed, their example applications are quite awful and not very useful(?) and they seem to be marketing it to end users, not developers. In my experience, most end users do not care about privacy and security until 1) they either lose a lot of money or other valuables 2) some hacker comes knocking on their door for extortion and they lose a lot of money or other valuables.

It would be so much better if they just made it so easy and simple for developers to use that it becomes their default, and not rely on hope that end users will suddenly start using decentralized apps in the name of privacy.


I'm the founder of Etebase[1], an open-source and end-to-end encrypted backend for building apps. The goal is to maintain privacy and security while still bringing users the benefits of central storage. Unlike Solid, it's already used in a few places, such as GNOME and KDE.

[1] https://www.etebase.com


I read and it says you have to use a third party host or host on your own [1]. I’m not sure how this differs from the WWW - could someone explain? Thank you!

[1] https://solidproject.org/users/get-a-pod


It's an inversion of control over your data. Instead of all the data being stored in multiple places outside of your control, your data is stored on a pod and you have better control over what happens to it.


I was also confused by this, on the home-page it says

> Solid lets people store their data securely in decentralized data stores called Pods.

Then in the "Get-a-pod" page it suggests you should get Amazon for hosting?


Has anyone built anything with this yet?


How would you build something with this?

Given that none of my prospective users have a pod, or would want to go through additional sign-up steps.

Also, all pod providers seem to be experimental, and have no pricing.


I want to love this.

But, in trying to imagine any scenario in which tech corporations might conceivably start hosting their internal user data on these external "Solid Pods", I'm left drawing a blank.

Until we have a way that tech companies start storing users' data in user-controlled infrastructure, I think this idea is a noble fantasy.

I would love to hear an argument that will convince me I'm wrong on this.


The only scenarios that I think are likely for tech corporations are either legislative pressure, or a tech corporation for which user tracking is not their core competency (e.g. Apple) using it as their USP and a moat against Google and Facebook.

That said, there's a lot of user data at non-tech companies. Especially given that they're not particularly skilled at securing that data, a changing regulatory climate turns that data more and more into a liability. If, as a user, you can ensure that that data is stored at a trusted and competent party (potentially yourself), that alone would already be a major win, in my (personal) opinion.


Do pods support queries or just HTTP GET/POST/PUT/DELETE?

Queries are important for performance. They avoid transferring all data over the network from the pod to the application.

Although it's possible to define an open index format (metadata built from the actual data), it's probably still less efficient to access and also tricky to update without race conditions, data corruption, etc.
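A rough sketch of what such an open index could look like for a "photos near a location" lookup. The grid scheme, field names, and function names are all assumptions for illustration; it deliberately ignores the update and race-condition problems mentioned above.

```python
import math
from collections import defaultdict


def cell_key(lat, lon, cell_deg=1.0):
    """Map a coordinate to a coarse grid cell key such as '52:4'."""
    return f"{math.floor(lat / cell_deg)}:{math.floor(lon / cell_deg)}"


def build_index(photos, cell_deg=1.0):
    """Group photo URLs by grid cell; the result could be stored as one JSON document."""
    index = defaultdict(list)
    for photo in photos:
        index[cell_key(photo["lat"], photo["lon"], cell_deg)].append(photo["url"])
    return dict(index)


def lookup(index, lat, lon, cell_deg=1.0):
    """Return the photo URLs recorded for the cell containing (lat, lon)."""
    return index.get(cell_key(lat, lon, cell_deg), [])


if __name__ == "__main__":
    # Hypothetical photo metadata; URLs are placeholders.
    photos = [
        {"url": "https://pod.example/photos/a.jpg", "lat": 52.37, "lon": 4.89},
        {"url": "https://pod.example/photos/b.jpg", "lat": 48.85, "lon": 2.35},
    ]
    index = build_index(photos)
    print(lookup(index, 52.1, 4.3))
```

With this layout an app fetches one index document with a plain GET and then only the matching photos, rather than everything in the pod. It is still coarser than a real server-side query, since a search near a cell boundary would have to check neighbouring cells too.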


How would querying the data work? Will a pod provide querying capabilities as well? Example use case would be finding all pictures around a certain location. It "seems" unavoidable that certain data still has to be at a service provider since they enrich the original data. How would that work?


I haven't looked into it for a couple years, I was initially very excited about the concept. TLDR: Unless the user experience/accessibility/hosting experience is improved, I just couldn't suggest anyone use it.

Last time I really dug in the user interface and accessibility concerns were not great.

I understand the idea was more about the data handling, and that anyone could produce something based off the framework, but I had all sorts of issues with the login tokens/usability. (i.e., if you logged in with the wrong cert it didn't have an easy way to clear it, and would 'log in' to an error page, etc.)

as far as I can tell, each pod/data host can run different versions of solid, so your experience may vary between them. You can self host as well.

I'd have to dig in again to see if things have gotten better, but I don't have time at the moment. on quick look, some of my data sites are no longer working. oh! link says it was shut down in oct. https://lists.w3.org/Archives/Public/public-solid/2020Oct/00...

That said, the community was pretty nice and willing to answer questions.

Maybe things have improved?

I was hoping that maybe they'd start using DAT as well, because it seemed like it might be nice to dovetail with some of the beakerbrowser/distributed web stuff going on. re @pfrazee


quick followup, after checking my old links and making a new sign up. it does seem like the inrupt.com pod has a bit better /more up to date design. Accessibility, while not perfect, does seem to be improved. (quick WAVE tool check)

Things are still fairly confusing for new users though. And some how tos I wrote down from then no longer work. Which is not the worst thing, as some of the tutorials required you to hover over and wait to be able to edit things.

Much easier to look at though. ymmv

for historical/hysterical reference, I uploaded my old solid adventuress file, it is not up to date, and some of the css is not always showing the right colors, etc.

but you can see some of the directions I was looking into then: https://pod.inrupt.com/metahari/public/solidnotes/SolidAdven...


Note: it was only the old solid.community domain that shut down; the same servers are now reachable through solidcommunity.net. But yes, the default UI used there is very lacking and slow to develop.

I also built a more developer-focused interface that allows you to view the data in your Pod, and that should be more accessible. If you're interested, you can try it at https://penny.vincenttunru.com


Tim went on to leave MIT, found Inrupt and get funded to the tune of several million dollars. Some great guys there on the solid team, including Melvin Carvalho and others who maintain ties with the MaidSAFE people. Give it time, it’s the future.


From a quick look at the site and a quick look at perkeep's site, both projects look quite similar. Could anyone with some experience comment on their differences?


I self-host via a 1T SD card in my phone (encrypted). Why pull from/risk "the cloud" when I can have my stuff on me in the first place?


Something like Solid is more useful if you want to share data with other people, or use apps that only support cloud storage. There are many such apps (Google Docs for example), but most of them are currently embedded in their respective cloud storage provider. Solid decouples the apps from the storage.


So if you lose your phone, you lose all your data? (asking in a genuine manner)


I have a glacier store of my phone and SD card data that I upload every weekend. I'm good with 5 or so days of phone data loss in that event. (I use a desktop VM that does nothing other than touch Glacier, and my phone, for this.)


a project with similar goals is Sandstorm.io

i plan to check it out soon..



