I would say something more akin to spam scoring would be good.
Contextual filters/scanners would evaluate a piece of content and give it a "score" based on whatever categories are being filtered (NSFW, non-inclusive language, slurs, disinfo, etc.).
Then both the creator and the consumer should be able to see the score in a transparent manner, with the consumer able to set a threshold that filters out any post scoring higher than what they choose.
A free-speech absolutist could set it to 0, the default could be 50, and go from there.
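To make the idea concrete, here is a rough sketch of the filtering step in Python. Everything in it is an assumption for illustration: the 0-100 scale, the `Post` fields, and the names `is_visible` and `filter_level`. It also assumes one particular reading of the setting, namely that 0 means "filter nothing" (so the absolutist sees everything) and higher values filter more aggressively; the scoring itself is taken as given.

```python
from dataclasses import dataclass

MAX_SCORE = 100  # assumed scale: 0 = nothing flagged, 100 = maximally flagged


@dataclass
class Post:
    author: str
    text: str
    score: int  # hypothetical scanner output, 0..MAX_SCORE


def is_visible(post: Post, filter_level: int = 50) -> bool:
    """Return True if the post passes the reader's filter.

    Assumed reading: filter_level 0 means "filter nothing", so the cutoff
    is MAX_SCORE - filter_level and posts scoring above it are hidden.
    """
    if not 0 <= filter_level <= MAX_SCORE:
        raise ValueError("filter_level must be between 0 and 100")
    cutoff = MAX_SCORE - filter_level
    return post.score <= cutoff


def visible_posts(posts, filter_level=50):
    """Keep only the posts the reader has chosen to see."""
    return [p for p in posts if is_visible(p, filter_level)]
```

With this reading, `visible_posts(feed, 0)` returns the whole feed, while the default of 50 hides anything scoring above 50. Both the creator and the reader could be shown `post.score` next to the post, so the filtering stays transparent.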
I agree. This is the only part I'm doubtful about: whether I should be able to see an individual's score, as it might create a prejudice against that individual that is felt by that individual ("I see you're a bot/troll/like hate speech…"). It also makes me wonder whether individual centralised mods should be able to see more than that, but I digress.
Scores across a range of measures would be best, in my view.
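A per-category version might look something like the sketch below. The category names, the 0-100 scale, the default of 50, and the function name `hidden_categories` are all made up for illustration. As an assumption, a level of 0 means "don't filter this category at all", and a post is hidden in a category when its score exceeds 100 minus the reader's level.

```python
MAX_SCORE = 100     # assumed scale for every category
DEFAULT_LEVEL = 50  # assumed default when the reader has not set a level


def hidden_categories(scores: dict[str, int], levels: dict[str, int]) -> list[str]:
    """Return the categories in which a post exceeds the reader's cutoff.

    scores: hypothetical per-category scanner output for one post.
    levels: the reader's per-category filter levels (0 = filter nothing).
    """
    flagged = []
    for category, score in scores.items():
        level = levels.get(category, DEFAULT_LEVEL)
        if score > MAX_SCORE - level:
            flagged.append(category)
    return sorted(flagged)


post_scores = {"nsfw": 80, "slurs": 5, "disinfo": 60}
reader_levels = {"nsfw": 0, "disinfo": 70}  # relaxed on NSFW, strict on disinfo
print(hidden_categories(post_scores, reader_levels))  # ['disinfo']
```

A post would be shown only when the returned list is empty, and the list itself tells the reader (and the creator) which measure caused it to be hidden.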