BPE Tokenizer

Interactive byte-pair encoding visualization

Type or paste text below and watch how BPE tokenization works in real time. The algorithm trains on your input text — it finds the most frequent character pairs and merges them iteratively until the target vocabulary size is reached.

Vocab Size
Display