Bug: white spaces missing between tokens

During tokenization, all whitespace is discarded. As a result, the reconstructed spans have no whitespace between them, and the individual words are glued together.
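A minimal sketch of the issue and one possible fix (the tokenizer and helper names here are hypothetical, not the project's actual API): if each token records its character offsets in the original text, a span can be reconstructed by slicing the source string instead of concatenating token strings.

```python
import re

def tokenize(text):
    # Hypothetical tokenizer: keeps each token's character offsets so the
    # original whitespace can be restored later.
    return [(m.group(), m.start(), m.end()) for m in re.finditer(r"\S+", text)]

def span_text(text, tokens, i, j):
    # Buggy variant: concatenating token strings drops the whitespace.
    glued = "".join(tok for tok, _, _ in tokens[i:j])
    # Fixed variant: slice the original text using the recorded offsets.
    fixed = text[tokens[i][1]:tokens[j - 1][2]]
    return glued, fixed

text = "white spaces missing between tokens"
tokens = tokenize(text)
glued, fixed = span_text(text, tokens, 0, 3)
# glued -> "whitespacesmissing"
# fixed -> "white spaces missing"
```

Slicing from the original text also preserves runs of multiple spaces, tabs, and newlines, which naive re-joining with a single `" "` separator would not.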