There are two typical integration approaches for developers:
- Basic method: join the sensitive words into a regex alternation pattern (e.g., "word1|word2") and scan the text with regular expressions. This is easy to set up but slows down as the lexicon grows, so it suits scenarios with low performance requirements (see the regex sketch after this list).
- Efficient method: use a DFA or Trie-based algorithm. The lexicon is first loaded into the data structure (e.g., via a Python Trie library), and the text is then matched against it. Matching time depends only on the length of the input text rather than the size of the lexicon, which makes this approach suitable for high-concurrency scenarios. The project provides pseudo-code illustrating the whole flow of loading the lexicon, building the matcher, and checking text (see the Trie sketch after this list).
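A minimal sketch of the basic regex approach, not the project's own code; the word list here is a placeholder for the lexicon files the project ships:

```python
import re

# Hypothetical lexicon entries for illustration; in practice these would be
# read from the project's word-list files.
words = ["word1", "word2", "word3"]

# Join the words into one alternation pattern. re.escape guards against
# entries that contain regex metacharacters; sorting longest-first makes
# longer words win over their prefixes.
pattern = re.compile("|".join(re.escape(w) for w in sorted(words, key=len, reverse=True)))

def contains_sensitive(text: str) -> bool:
    """Return True if the text contains any word from the lexicon."""
    return pattern.search(text) is not None

print(contains_sensitive("this text mentions word2"))  # True
```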
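And a minimal Trie-based sketch of the load / build / check flow the project describes in pseudo-code. This is an assumption-level illustration using a nested-dict trie rather than any specific library; function names and the sample words are made up:

```python
END = object()  # marker for "a lexicon word ends at this node"

def build_trie(words):
    """Load the lexicon into a nested-dict trie (the matcher)."""
    root = {}
    for word in words:
        node = root
        for ch in word:
            node = node.setdefault(ch, {})
        node[END] = True
    return root

def find_sensitive(text, trie):
    """Check the text: from each position, walk the trie and collect hits."""
    hits = []
    for start in range(len(text)):
        node = trie
        for i in range(start, len(text)):
            ch = text[i]
            if ch not in node:
                break
            node = node[ch]
            if END in node:
                hits.append(text[start:i + 1])
                break  # report the shortest match at this position
    return hits

# Hypothetical lexicon entries for illustration.
trie = build_trie(["badword", "spam"])
print(find_sensitive("please no spam here", trie))  # ['spam']
```

Because each scan position only walks as deep as the longest lexicon word, the cost of checking a text is governed by the text length rather than the number of words in the lexicon.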
This answer comes from the article "Sensitive-lexicon: a continuously updated thesaurus of Chinese sensitive words".