> You don't have to delete these logs when someone emails you if they're still useful for those purposes. You do need to delete those logs eventually though, and you need to disclose how often that is. It can be ten years if you want.
That sort of argument has worked pretty well for Disney and friends when it comes to extending copyright indefinitely and yet not technically violating the US constitution's "temporary" wording. I wonder how well it will work in this case. How can anyone possibly make a reasonable, intelligent, a priori assessment of how long that data will be useful for that kind of purpose?
> It is good advice to mask off the bottom bits of the IP address since it will still likely be unique enough for you, without being useful to someone who has an unrelated list of IP addresses.
That isn't necessarily true if, for example, you're using IP addresses to detect a pattern of abusive access within a general pool of acceptable access. And if a full IP address isn't enough to potentially identify a specific threat, it surely also isn't enough to constitute personal data. Either way, your advice seems excessively broad here.
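For anyone unsure what "masking off the bottom bits" means in practice, here is a minimal sketch using Python's standard `ipaddress` module. The function name `mask_ip` and the prefix lengths (/24 for IPv4, /48 for IPv6) are assumptions chosen for illustration; nothing in this thread or in any guidance quoted here specifies how many bits to keep.

```python
import ipaddress

def mask_ip(addr: str, v4_prefix: int = 24, v6_prefix: int = 48) -> str:
    """Zero the low-order bits of an IP address, keeping only the network part.

    The default prefix lengths are illustrative assumptions, not a
    recommendation from any regulator.
    """
    ip = ipaddress.ip_address(addr)
    prefix = v4_prefix if ip.version == 4 else v6_prefix
    # strict=False lets us pass a host address and get its enclosing network
    network = ipaddress.ip_network(f"{ip}/{prefix}", strict=False)
    return str(network.network_address)

print(mask_ip("203.0.113.77"))           # 203.0.113.0
print(mask_ip("2001:db8:abcd:1234::1"))  # 2001:db8:abcd::
```

Whether the masked value is still specific enough for abuse detection is exactly the disagreement above; the code only shows the mechanics.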
> You should be able to delete usernames, passwords and email addresses on request. You should be able to remember that request if you restore from backups.
Doing this effectively without immediately becoming self-contradictory probably requires something like storing a hash of every user name or email address that has had deletion requested, and querying a database of such hashes during the restoration process in order to exclude affected data. While this may well be in the spirit of the GDPR, there is no denying that it is extremely onerous, particularly for anyone working with personal data who rarely if ever receives such a request yet would have to redesign a large part of their IT infrastructure around the ability to do it.
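To make the hash-based idea concrete, here is a rough sketch of what such a restore-time filter might look like. Everything named here (`erasure_token`, `filter_restored_rows`, the `email` field, the secret `key`) is hypothetical. Using a keyed hash (HMAC) rather than a plain hash is an added assumption, since plain hashes of email addresses are easy to guess; whether such a token would itself still count as personal data is not something this sketch settles.

```python
import hashlib
import hmac

def erasure_token(email: str, key: bytes) -> str:
    """Derive a keyed hash of a normalised email address.

    Only this token is stored when someone asks for erasure,
    not the address itself.
    """
    normalised = email.strip().lower().encode("utf-8")
    return hmac.new(key, normalised, hashlib.sha256).hexdigest()

def filter_restored_rows(rows, erased_tokens, key):
    """Drop restored rows whose email matches a recorded erasure request."""
    return [
        row for row in rows
        if erasure_token(row["email"], key) not in erased_tokens
    ]

# Hypothetical usage: one erasure request on file, two rows in the backup.
key = b"example-secret-kept-outside-the-backup"
erased = {erasure_token("Alice@example.com ", key)}
backup_rows = [{"email": "alice@example.com"}, {"email": "bob@example.com"}]
print(filter_restored_rows(backup_rows, erased, key))
# [{'email': 'bob@example.com'}]
```

The token set and the key have to live outside the backup being restored, otherwise the restore wipes out the record of who asked to be deleted.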
> How can anyone possibly make a reasonable, intelligent, a priori assessment of how long that data will be useful for that kind of purpose?
By first coming to grips with the fact that this isn't your data.
Keeping data is always a risk. You risk being hacked and losing control of people's personal data. The longer you keep it, the longer you are putting that data at risk.
How long can you be expected to go without being hacked? A year? Five years? Ten years? At which point do you think you can be compliant?
> > If you're not using IP addresses for those reasons [fraud detection], you shouldn't collect it. It is good advice to mask off the bottom bits of the IP address since it will still likely be unique enough for you, without being useful to someone who has an unrelated list of IP addresses.
> That isn't necessarily true if, for example, you're using IP addresses to detect a pattern of abusive access within a general pool of acceptable access
Why do you say something is not necessarily true when I have already given this exact reason?
> Doing this effectively without immediately becoming self-contradictory
No, it requires keeping the two or three requests for data or erasure posted to you, or emailed to you.
This isn't complicated. I have email from over twenty years ago.
> By first coming to grips with the fact that this isn't your data...
The trouble is that you haven't adopted any objective or actionable position here. It's easy to pass the buck with more questions. What small organisations need is simple, verifiable answers. The absence of such answers from authoritative sources is possibly the single greatest criticism being made of the GDPR.
> Why do you say something is not necessarily true when I have already given this exact reason?
You said it was good advice to mask off the bottom bits of the IP address. I don't think that is good advice in general, for the reason I gave: either the full address has a legitimate use for identifying specific threats, or it probably isn't specific enough to constitute controlled personal data in the first place. Either way, it's unclear what benefit derives from masking part of it.
> No, it requires keeping the two or three requests for data or erasure posted to you, or emailed to you.
Sorry, but again I can't see how your reply fits with anything I wrote. What point were you trying to make here?
> The trouble is that you haven't adopted any objective or actionable position here.
What, that you don't own someone else's personal data?
If someone signs up for your service, you do not own their email address and cannot use it in a way they wouldn't want.
What part of that is unclear?
> You said it was good advice to mask off the bottom bits of the IP address
If it wasn't needed.
You keep dropping this, so maybe it's because you didn't read it when I wrote it or when you wrote it.
> I don't think that is good advice in general, for the reason I gave: either the full address has a legitimate use for identifying specific threats, or it probably isn't specific enough to constitute controlled personal data in the first place.
Well, you're wrong. The ICO has recommended it in general for exactly this reason:
> The absence of such answers from authoritative sources is possibly the single greatest criticism being made of the GDPR.
Possibly, but also possibly not. So what?
It's still law.
> I can't see how your reply fits with anything I wrote. What point were you trying to make here?
I said: "You should be able to delete usernames, passwords and email addresses on request. You should be able to remember that request if you restore from backups."
You said I can't do that without storing a hash of every user name or email address that has had deletion requested, and querying a database of such hashes during the restoration process, which is bonkers. How many complaints and requests for deletion are you going to get? You might get two or three. Ever. How often do you restore backups? Monthly? Yearly? How hard is it to search two or three requests whenever you restore a database, if that's something you do a dozen times a year?
If you're a bigger company you might do it more frequently, but then it's clearly no longer onerous.
> What, that you don't own someone else's personal data?
No, sorry, I was referring to your whole opening section, which was a response to my question, "How can anyone possibly make a reasonable, intelligent, a priori assessment of how long that data will be useful for that kind of purpose?"
My point is that even something as "simple" as deciding how long you should retain some data that you originally acquired and have used for some obviously legitimate purpose is not necessarily an easy question. These are the sorts of issues where I'm arguing that small organisations really need simple, concise, clear guidance.
You've linked to a few pages on the ICO's site throughout this discussion. I note in passing that reading and understanding them fully would take many hours, even just looking at the high-level guides and checklists, and having done so, they still leave numerous subjects open to interpretation or judgement where someone would probably need real legal advice to find out where they might stand in practice.
Now, if you're a business with an in-house legal team and an in-house IT team and a turnover well into the millions and a designated management structure and established processes for doing most things, that's probably not a big deal. But if you're three guys running an Internet startup from someone's bedroom, or even a significantly more established online business but not large enough for dedicated IT staff or in-house lawyers (which is still the scale that the vast majority of businesses are working at), no-one has time to read through dozens of pages of "guidance" full of subjective-anyway legalese. You need clear, actionable guidance, and you need a clear, unambiguous legal framework.
I would argue that expecting even a good faith effort at compliance from many of those smaller businesses is unrealistic, given the "support" available today. They're just going to break the law, either out of ignorance or out of apathy, and either way the law didn't achieve anything useful in all of those cases. That then means that any of us who are concerned with trying to do the right thing and do want to understand our real legal obligations will automatically be at a disadvantage. That's not a good way to encourage compliance or to support smaller businesses growing (and in particular, growing responsibly and legally).
> No, sorry, I was referring to your whole opening section, which was a response to my question, "How can anyone possibly make a reasonable, intelligent, a priori assessment of how long that data will be useful for that kind of purpose?"
Doesn't matter.
It's not your data.
It's their personal data.
You may use it as long as it benefits them; as long as they would want you to.
Why do I want a site to have my email address? To send me notifications I've requested? To facilitate a password recovery process I initiate? To send me marketing material that I am interested in? Surely it's obvious that if I change my mind, that's up to me.
Why do I want a site to have my IP address? To help protect my account? Sure. For blacklisting addresses that try to log into my account with an invalid password? Of course.
See? Specific examples are easy. Enumerating them is hard though, which is why European courts don't do that. They rely on organisations like the ICO to field questions, make a judgement call, and publish guidance for frequently asked questions. But for the most part, it's just common sense.
> My point is that even something as "simple" as deciding how long you should retain some data that you originally acquired and have used for some obviously legitimate purpose is not necessarily an easy question.
You should keep it for as short a period as possible.
The longer you keep personal data, the longer it is at risk. Remember you're also responsible for keeping that data safe. If you get hacked and lose control of that data, you're responsible!
> You've linked to a few pages on the ICO's site throughout this discussion. I note in passing that reading and understanding them fully would take many hours, even just looking at the high-level guides and checklists, and having done so, they still leave numerous subjects open to interpretation or judgement where someone would probably need real legal advice to find out where they might stand in practice.
Understanding how to pay tax in the US correctly takes much more than "many hours", and very often requires professional advice.
However, very few people worry about tax on millions when their income is under $100, and very few companies need to worry about how to handle millions of requests for erasure when they don't even have any personal data.
But some prudence helps: simply "not keeping data you don't need" is the ICO's advice. It's also best practice for IT security.
Take the few hours to understand it. If you've got specific questions, I might try to answer them, but you're unlikely to have more than a few for your specific business case, and you'll find legal advice for those questions cheaper than tax advice.
> I would argue that expecting even a good faith effort at compliance from many of those smaller businesses is unrealistic
Well, good luck with that.
My experience with European regulators like the ICO is that they're not going to be amused by your argument.
The fact that most of this has been law for decades is a big part of why I think it's not unrealistic.
> That then means that any of us who are concerned with trying to do the right thing and do want to understand our real legal obligations will automatically be at a disadvantage.
Then do the right thing: Treat the data as the person would want it treated. Make sure you are proactive and do your best. Don't worry so much that someone else is going to treat people a little worse so you need to abuse people as much as is legally permitted.
I'm still grasping GDPR myself, but deleting users from backups might be solved via uids. Each user should also have a uid that isn't PII by itself. Upon getting an erasure request, you remove everything else, preserve the uid, and flag it. Then, when restoring from backup, you can easily see which users need to be erased, and you haven't kept any PII in the meantime.
As an aside, you also have to understand that some data is only PII when you have other data joined with it. Extended PII can easily be ingested into a system and stripped of its association with the user. That value, independent of other identity data, is then no longer PII, extended or otherwise. But again, I'm still grasping this myself. Please correct me if I am wrong.
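Roughly, the uid idea could be sketched like this. The `User` record, its fields, and the `reapply_erasures` helper are all made up for illustration, and the sketch assumes the set of flagged uids is kept somewhere that survives the restore (i.e. not only inside the database being restored).

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Set

@dataclass
class User:
    uid: str                 # opaque identifier, assumed not to be PII by itself
    email: Optional[str]     # stands in for all personal fields on the record
    erased: bool = False

def erase(user: User) -> None:
    """Remove the personal fields but keep the uid, flagged as erased."""
    user.email = None
    user.erased = True

def reapply_erasures(restored: Iterable[User], erased_uids: Set[str]) -> List[User]:
    """After a restore, re-erase every user whose uid was flagged earlier."""
    result = []
    for user in restored:
        if user.uid in erased_uids:
            erase(user)
        result.append(user)
    return result

# Hypothetical usage: u2 asked for erasure after the backup was taken.
backup = [User("u1", "a@example.com"), User("u2", "b@example.com")]
for user in reapply_erasures(backup, {"u2"}):
    print(user)
```

This only covers the single-table case; any other tables keyed by the same uid would need the same treatment on restore.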
I think your approach with UUIDs is practically equivalent to my suggestion involving hashing: you replace something that is the actual personal data with an irreversible proxy.
My concern remains the same either way. It's not that such measures can't technically be implemented, it's that the effort required to do so in practice is disproportionate, particularly for smaller organisations using limited personal data for legitimate purposes where there is little risk to privacy from otherwise properly handled data not being fully deleted on demand.
For example, instead of a small transport business buying a standard backup service and using backup and restore tools that just save all their important data to a secure, reliable location in case of disaster, it appears that they might now have to implement data crunching logic customised to their specific circumstances, despite possibly having no knowledge about how databases or programming work at all.
I fail to see how such a requirement would be constructive in terms of safeguarding anyone's privacy in a meaningful way, but I also fail to see how it isn't required according to the letter of the GDPR.
PII isn't a term used by the GDPR, and the association stripping we're used to (from healthcare, for example) isn't necessarily enough... GDPR covers pseudonymized and anonymized personal data too.
It's even easier than that: Just keep the requests for deletion in a separate database, like hardcopy in a filing cabinet. Then train your staff to check the filing cabinet whenever they restore from backups.