
Simple conversation generator is faking streaming.

Unlike tool generations, the simple conversation generator does not stream its output in real time. It fakes streaming: it waits for the LLM to finish generating, splits the generated content into tokens, and then simulates generation by emitting the text with a delay after each token.
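
To make the difference concrete, here is a minimal Python sketch of the two behaviours. It is not the project's actual code: `fake_stream`, `real_stream`, the `generate` callback, the `delay` parameter, and the whitespace-based token split are all hypothetical stand-ins chosen for illustration.

```python
import re
import time
from typing import Callable, Iterable, Iterator


def fake_stream(generate: Callable[[], str], delay: float = 0.05) -> Iterator[str]:
    """Simulated streaming: wait for the whole reply, then replay it piece by piece."""
    # Hypothetical blocking call: nothing is emitted until the LLM has finished.
    full_text = generate()
    # Naive "tokenizer" (an assumption): each word plus its trailing whitespace.
    for token in re.findall(r"\S+\s*", full_text):
        time.sleep(delay)  # artificial pause after the text already exists
        yield token


def real_stream(chunks: Iterable[str]) -> Iterator[str]:
    """Actual streaming: forward each chunk as soon as the model produces it."""
    for chunk in chunks:
        yield chunk


if __name__ == "__main__":
    for piece in fake_stream(lambda: "Hello there world", delay=0.1):
        print(piece, end="", flush=True)
    print()
```

In the fake version the time to the first visible token is the full generation time plus one delay, whereas a real stream shows the first chunk as soon as the model emits it.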