
Heh, I think we're talking about different scenarios. In the case of event sourcing, we often set the retention period to 'forever', because the events in Kafka are our source of truth. Then we just build a materialising layer on top of Kafka, with the possibility to rehydrate based on _every_ event in the Kafka topics. In this case we would have to do some really weird compaction to delete singular events.
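The rehydration idea above can be sketched as a fold over the full event log; the event shapes and the `apply` logic here are illustrative assumptions, not anyone's actual schema:

```python
# Minimal sketch: rebuilding materialized state by folding over *every*
# event in an append-only log, as the materializing layer described above
# would. Event types/fields are made up for illustration.

def apply(state, event):
    # Fold one event into the materialized view.
    if event["type"] == "deposit":
        state["balance"] = state.get("balance", 0) + event["amount"]
    elif event["type"] == "withdraw":
        state["balance"] = state.get("balance", 0) - event["amount"]
    return state

def rehydrate(events):
    # Replay the whole log from offset zero to rebuild current state.
    state = {}
    for event in events:
        state = apply(state, event)
    return state

log = [
    {"type": "deposit", "amount": 100},
    {"type": "withdraw", "amount": 30},
    {"type": "deposit", "amount": 5},
]
print(rehydrate(log))  # {'balance': 75}
```

Deleting one user's events here means rewriting the log itself, which is exactly the "really weird compaction" problem.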



I think that scenario is addressed in this Confluent blog post:

> Deleting a message from a compacted topic is as simple as writing a new message to the topic with the key you want to delete and a null value. When compaction runs the message will be deleted forever.

Handling GDPR with Apache Kafka: How does a log forget?

https://www.confluent.io/blog/handling-gdpr-log-forget/
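The tombstone semantics the post describes can be sketched in a few lines; this simulates what Kafka's log cleaner does for a compacted topic (it is not a Kafka API, and the keys/values are made up):

```python
# Sketch of compacted-topic semantics: compaction keeps only the latest
# record per key, and a record with a null (None) value is a tombstone
# that deletes the key entirely once compaction runs.

def compact(log):
    latest = {}
    for key, value in log:           # replay records in offset order
        latest[key] = value          # later records shadow earlier ones
    # Drop tombstoned keys (value None) so the data is gone for good.
    return {k: v for k, v in latest.items() if v is not None}

log = [
    ("user-42", b"name=Alice"),
    ("user-7",  b"name=Bob"),
    ("user-42", None),               # tombstone: forget user-42
]
print(compact(log))  # {'user-7': b'name=Bob'}
```

In real Kafka the tombstone itself also lingers for `delete.retention.ms` before being removed, so deletion is eventual rather than immediate.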


Oh, wow, must have missed that post, thanks!


It really depends on what you're doing. If you're doing something where the people you're storing data about don't need to interact with each other, you can actually store each user's events in a separate event log.

Where interaction is required, these per-user logs can be fed into more transient event queues that don't have an indefinite retention period.

Not sure about how well-geared Kafka is to this scenario though.
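A rough sketch of that split, with names that are purely illustrative (this is plain Python state, not a Kafka API): each user gets their own append-only log, so erasing a user is just dropping one log, while cross-user interactions flow through a shared, bounded queue.

```python
# Sketch: per-user event logs (kept indefinitely) plus a transient shared
# queue for events where users interact. Erasing a user deletes their
# whole log in one step; the shared queue ages data out on its own.

from collections import defaultdict, deque

user_logs = defaultdict(list)             # one indefinite log per user
interaction_queue = deque(maxlen=1000)    # transient, bounded retention

def record(user_id, event, interactive=False):
    user_logs[user_id].append(event)
    if interactive:
        interaction_queue.append((user_id, event))

def forget(user_id):
    # Erasure request: drop the user's entire event log at once.
    user_logs.pop(user_id, None)

record("alice", {"type": "signup"})
record("alice", {"type": "message", "to": "bob"}, interactive=True)
record("bob", {"type": "signup"})
forget("alice")
print(sorted(user_logs))  # ['bob']
```

The trade-off is that anything already copied into the shared queue isn't erased immediately; it relies on the queue's short retention to age out.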





