Tokenization

Text tokenization using Byte Pair Encoding (BPE). Supports Indian languages.
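To illustrate the general idea behind BPE, here is a minimal sketch. It is not this project's implementation: the function names (`learn_bpe`, `apply_bpe`, and the helpers) are hypothetical, and it assumes whitespace-separated words split into Unicode code points rather than raw bytes. For Indic scripts this means vowel signs and viramas start out as separate symbols and are joined only if the learned merges cover them.

```python
from collections import Counter


def get_pair_counts(words):
    """Count adjacent symbol pairs, weighted by how often each word occurs."""
    pairs = Counter()
    for symbols, freq in words.items():
        for a, b in zip(symbols, symbols[1:]):
            pairs[(a, b)] += freq
    return pairs


def merge_symbols(symbols, pair):
    """Replace every occurrence of `pair` in a symbol sequence with one merged symbol."""
    out = []
    i = 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return out


def learn_bpe(corpus, num_merges):
    """Learn up to `num_merges` merge rules from a whitespace-separated corpus."""
    # Assumption: each word starts as a tuple of Unicode code points.
    words = dict(Counter(tuple(w) for w in corpus.split()))
    merges = []
    for _ in range(num_merges):
        pairs = get_pair_counts(words)
        if not pairs:
            break  # every word is already a single symbol
        best = max(pairs, key=pairs.get)  # most frequent adjacent pair
        merges.append(best)
        merged_words = {}
        for symbols, freq in words.items():
            key = tuple(merge_symbols(symbols, best))
            merged_words[key] = merged_words.get(key, 0) + freq
        words = merged_words
    return merges


def apply_bpe(word, merges):
    """Tokenize one word by replaying the learned merges in order."""
    symbols = list(word)
    for pair in merges:
        symbols = merge_symbols(symbols, pair)
    return symbols


if __name__ == "__main__":
    # Hypothetical sample text, used only for illustration.
    rules = learn_bpe("नमस्ते नमस्ते भारत", num_merges=5)
    print(apply_bpe("नमस्ते", rules))
```

Because merges are replayed in the order they were learned, joining the output tokens always reproduces the input word, so tokenization in this sketch is lossless.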

Try with samples:

or