Why normalization exists
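Raw transaction text is inconsistent across banks, processors, countries, and channels, so the same kind of purchase can arrive under many different descriptors. Normalization maps those variants to a consistent representation that analytics and decisioning can rely on.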
Input and output model
- Input: raw descriptor plus optional amount/date metadata.
- Output: canonical transaction wording and structured signal fields.
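As a rough illustration of this input/output model, the two shapes could be sketched as Python dataclasses. The class names, the `amount` and `posted_on` fields, and the `signals` dictionary are assumptions made for this sketch, not a defined schema; the amount and the `channel` signal are made-up sample values.

```python
from dataclasses import dataclass, field
from datetime import date
from decimal import Decimal
from typing import Optional


@dataclass(frozen=True)
class RawTransaction:
    """Input: the raw descriptor plus optional amount/date metadata."""
    raw_description: str
    amount: Optional[Decimal] = None      # optional metadata
    posted_on: Optional[date] = None      # optional metadata


@dataclass(frozen=True)
class NormalizedTransaction:
    """Output: canonical wording plus structured signal fields."""
    normalized_description: str
    # Hypothetical signal fields; the real set depends on the pipeline.
    signals: dict = field(default_factory=dict)


# Illustrative values only.
tx = RawTransaction("POS 09321 MKTPLACE*ONLINE", amount=Decimal("42.50"))
out = NormalizedTransaction("online marketplace purchase", {"channel": "online"})
```

Making the metadata fields optional mirrors the input description: a descriptor alone is enough to construct a `RawTransaction`.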
Practical implementation pattern
- Keep original descriptor unchanged for audit.
- Store normalized descriptor as a separate field.
- Attach a confidence score and processing metadata to each normalized record.
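The three points above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the `_RULES` lookup table is a toy stand-in for whatever model or rule set actually performs normalization, and the `normalizer_version` and `processed_at` field names are hypothetical examples of processing metadata.

```python
from datetime import datetime, timezone

NORMALIZER_VERSION = "example-0.1"  # hypothetical version label

# Toy lookup table standing in for a real normalization model or rule set.
_RULES = {
    "MKTPLACE*ONLINE": ("online marketplace purchase", 0.94),
}


def normalize(raw_description: str) -> dict:
    """Return a record that keeps the raw text and adds normalized fields."""
    normalized, confidence = None, 0.0
    for token, (label, score) in _RULES.items():
        if token in raw_description.upper():
            normalized, confidence = label, score
            break
    return {
        "raw_description": raw_description,      # untouched, kept for audit
        "normalized_description": normalized,    # separate field; None if no rule matched
        "normalization_confidence": confidence,
        "normalizer_version": NORMALIZER_VERSION,                # assumed metadata field
        "processed_at": datetime.now(timezone.utc).isoformat(),  # assumed metadata field
    }


record = normalize("POS 09321 MKTPLACE*ONLINE")
```

Because the raw text is copied through unchanged rather than overwritten, a later change to the rules can be re-run against the same stored descriptors and compared with earlier results.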
Example
{
  "raw_description": "POS 09321 MKTPLACE*ONLINE",
  "normalized_description": "online marketplace purchase",
  "normalization_confidence": 0.94
}