This is an extremely simple tokenizer that splits only and exactly on the space
character. It is intended to work in tandem with prepare_text, which cleans up
existing spaces and inserts new ones as necessary before the tokenizer runs.
This function and prepare_text are combined in prepare_and_tokenize.
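
The split-on-space behavior and its dependence on a preparation step can be illustrated with a minimal Python sketch. The names space_tokenize, prepare_text_sketch, and prepare_and_tokenize_sketch are hypothetical, and the preparation logic (padding punctuation with spaces, collapsing whitespace runs) is an assumption about what "cleaned up and inserted as necessary" might involve, not the actual behavior of prepare_text.

```python
import re


def space_tokenize(text: str) -> list[str]:
    """Split only and exactly on the space character (U+0020).

    Tabs and newlines are NOT treated as separators, and consecutive
    spaces produce empty tokens, which is why a preparation step is needed.
    """
    return text.split(" ")


def prepare_text_sketch(text: str) -> str:
    """Hypothetical stand-in for prepare_text (assumed behavior)."""
    # Assumption: put a space on both sides of each punctuation character.
    text = re.sub(r"([^\w\s])", r" \1 ", text)
    # Assumption: collapse any run of whitespace into a single space and trim.
    return re.sub(r"\s+", " ", text).strip()


def prepare_and_tokenize_sketch(text: str) -> list[str]:
    """Hypothetical stand-in for prepare_and_tokenize: prepare, then split."""
    return space_tokenize(prepare_text_sketch(text))
```

Under these assumptions, space_tokenize("a  b") yields an empty token between the two spaces and space_tokenize("a\tb") yields a single token, whereas prepare_and_tokenize_sketch("Hello,  world!\n") returns the four tokens Hello, a comma, world, and an exclamation mark. This is the sense in which the tokenizer relies on the preparation step having already run.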