Moderation
The Moderation API lets you use OpenAI's harmful text API, along with some of our own custom moderation functions:
- Checking whether text is harmful and which harmful category it falls into
- Analyzing sentiment
- Redacting personal information
Config
Default config files are provided for Moderation endpoints at config/prompts/moderation/*.
For each you can define:
- The model and fallback models used by chat, e.g. gpt-4-turbo
- Options to pass to OpenAI, e.g. size or quality
Most moderation requests are quite simple, so they don't need a lot of configuration.
Harmful text
Given some input text, outputs whether the model classifies it as potentially harmful in each of several categories.
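As a rough illustration of working with per-category results, the sketch below collects the flagged categories from a response shaped like OpenAI's moderation output (a results list whose entries carry flagged and categories fields). The function name flagged_categories and the sample data are hypothetical, and the exact shape returned by this endpoint is an assumption, so treat this as a sketch rather than the endpoint's actual contract.

```python
def flagged_categories(response: dict) -> list[str]:
    """Return the sorted names of every category marked True in any result.

    Assumes an OpenAI-style moderation payload:
    {"results": [{"flagged": bool, "categories": {name: bool, ...}}, ...]}
    """
    names = set()
    for result in response.get("results", []):
        for name, hit in result.get("categories", {}).items():
            if hit:
                names.add(name)
    return sorted(names)


# Hypothetical sample payload, for illustration only.
sample = {
    "results": [
        {
            "flagged": True,
            "categories": {"hate": False, "violence": True, "harassment": True},
        }
    ]
}

print(flagged_categories(sample))
```

An empty list means no category was flagged in any result.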
Chunks
OpenAI recommends splitting long text into chunks for moderation. This is handled automatically for you: input is chunked into sensible snippets.
If the results are not what you expect, you can tweak the chunk_size parameter in the moderation/harmful.yml file.
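To give a feel for what chunking involves, here is a minimal sketch that splits text at sentence boundaries and packs sentences into chunks no longer than chunk_size. It is not the library's actual implementation: the function name chunk_text is hypothetical, and measuring chunk_size in characters (rather than tokens) is an assumption.

```python
import re


def chunk_text(text: str, chunk_size: int = 2000) -> list[str]:
    """Pack sentences into chunks of at most chunk_size characters (assumed unit)."""
    # Split after sentence-ending punctuation followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        # A single sentence longer than chunk_size is hard-split.
        while len(sentence) > chunk_size:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:chunk_size])
            sentence = sentence[chunk_size:]
        if not sentence:
            continue
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= chunk_size:
            current = candidate
        else:
            # Adding this sentence would overflow, so close the current chunk.
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks


print(chunk_text("One. Two. Three.", chunk_size=10))
```

A smaller chunk_size yields more, shorter snippets; a larger one yields fewer snippets that each carry more context.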
Redaction
Given some input text, checks if any Personally Identifiable Information is present and redacts it.
By default the following PII will be replaced by the text [redacted]:
- Phone numbers
- Email addresses
- Physical addresses
For example:
Becomes:
The endpoint will also output the details of the items that have been redacted.
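The sketch below illustrates the general idea of replacing PII with a placeholder and reporting what was removed. It is a simplified, regex-based approximation and not how the endpoint itself works: the redact function, the patterns, and the shape of the returned items are all hypothetical, and physical addresses are left out because simple patterns do not match them reliably.

```python
import re

# Deliberately simple, illustrative patterns; real-world PII detection needs more care.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{7,}\d")


def redact(text: str, placeholder: str = "[redacted]") -> tuple[str, list[dict]]:
    """Replace emails and phone numbers with placeholder and list what was removed."""
    items: list[dict] = []

    def make_replacer(kind: str):
        def replacer(match: re.Match) -> str:
            # Record the original value before it is replaced.
            items.append({"type": kind, "value": match.group(0)})
            return placeholder

        return replacer

    text = EMAIL_RE.sub(make_replacer("email"), text)
    text = PHONE_RE.sub(make_replacer("phone"), text)
    return text, items


redacted, found = redact("Call 020 7946 0958 or mail jo@example.com.")
print(redacted)
print(found)
```

Returning the removed items alongside the redacted text mirrors the endpoint's behaviour of reporting the details of what was redacted.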
Sentiment
Given some text, outputs the sentiment: either positive, negative, or neutral.