Context awareness for message moderation
Context is crucial in content moderation. A message can seem innocent in one context but hateful in another.
You can already supply contextId and authorId with your content, which helps you understand the context when reviewing items in the review queue.
Now you can also enable context awareness in your project settings.
What is context awareness?
When you enable context awareness, the moderation pipeline pulls in the latest messages within the same conversation, thread, or game room, and includes the previous messages when analysing the new message.
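As a rough illustration of the idea (not the actual pipeline implementation), the sketch below keeps a rolling window of recent messages per context ID and returns the earlier messages alongside each new one. The class name, method name, and window size are assumptions made for this example.

```python
from collections import defaultdict, deque

# Hypothetical window size; the real pipeline's limit is not stated in this article.
WINDOW_SIZE = 10


class ContextWindow:
    """Keeps the most recent messages per context ID (illustrative only)."""

    def __init__(self, size: int = WINDOW_SIZE) -> None:
        # One bounded queue per context; old messages fall off automatically.
        self._messages = defaultdict(lambda: deque(maxlen=size))

    def add_and_get_context(self, context_id: str, text: str) -> list[str]:
        """Return the earlier messages in this context, then record the new one."""
        previous = list(self._messages[context_id])
        self._messages[context_id].append(text)
        return previous
```

A model would then receive both the new message and the returned list of previous messages, instead of the new message alone.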
How it helps
Each type of model might use context awareness in different ways.
LLM-based models, such as AI agents, are especially good at understanding conversations, and they can now assess each message in light of the existing conversation.
Look at this example:
user 1 -> What's the worst thing you know?
user 2 -> European people [FLAGGED with context awareness]
Simpler ML models can still benefit from context awareness, even though they do not understand conversations.
For example, some users try to circumvent guardrails by spreading their content over multiple messages - now you can catch that as well.
For example, sharing a phone number:
msg 1 -> 2
msg 2 -> 4
msg 3 -> 6
msg 4 -> 5
msg 5 -> 5
msg 6 -> 5
msg 7 -> 5
msg 8 -> 5 [FLAGGED with context awareness]
Or someone swearing over multiple lines:
msg 1 -> f
msg 2 -> u
msg 3 -> c
msg 4 -> k [FLAGGED with context awareness]
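To make the split-message cases above concrete, here is a minimal sketch of how a simple check could benefit from context: it joins the previous messages with the new one and looks for a long run of digits that the new message alone does not contain. The function name and the eight-digit threshold are assumptions for illustration, not how any particular model works.

```python
import re

# Illustrative pattern: 8 or more consecutive digits. Real phone formats vary.
DIGIT_RUN = re.compile(r"\d{8,}")


def flags_split_number(previous: list[str], new_message: str) -> bool:
    """Return True if the joined messages contain a long digit run
    that the new message on its own does not."""
    joined = "".join(part.strip() for part in [*previous, new_message])
    found_alone = DIGIT_RUN.search(new_message) is not None
    found_joined = DIGIT_RUN.search(joined) is not None
    return found_joined and not found_alone
```

With the phone number example, the first seven messages do not trigger the check, and the eighth does once it is joined with the earlier ones. The same joining step could feed a word list to catch the swearing example.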
How to enable it
First, make sure that you include contextId and/or authorId in your API requests.
The context ID can be the ID of the chatroom, thread, or anything else where messages appear sequentially.
The author ID would be the ID of the user that wrote the message.
Context awareness starts working with either of these two fields - but include both for the best results, if possible.
Afterwards, make sure that you enable context awareness in your project settings.
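As a sketch of the request side, the helper below builds a JSON body that carries the two IDs when they are available. Only contextId and authorId come from this article; the function name and the "text" field holding the message content are placeholders, so check the API reference for the exact request shape.

```python
import json
from typing import Optional


def build_payload(
    text: str,
    context_id: Optional[str] = None,
    author_id: Optional[str] = None,
) -> str:
    """Serialize a moderation request body, including only the IDs provided."""
    body = {"text": text}  # "text" is a placeholder field name
    if context_id is not None:
        body["contextId"] = context_id  # e.g. chatroom or thread ID
    if author_id is not None:
        body["authorId"] = author_id  # ID of the user who wrote the message
    return json.dumps(body)
```

For example, build_payload("hello", context_id="room-1", author_id="user-2") produces a body with both fields set, which is the recommended setup.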
What's next?
We are excited to hear whether context awareness improves accuracy for your use case. We hope it helps you flag content that previously went uncaught.
If you have any questions or ideas for making it better, please let us know.