Why are code completions truncated in Tabby Server versions 0.11 and 0.12 compared to version 0.10?
I upgraded Tabby from 0.10.0 to 0.12.0 and noticed that code completions are truncated in the new versions. I tested with a specific model and input context, and the function returned by versions 0.11 and 0.12 is cut off compared with what version 0.10 returns. I would like to understand why this truncation occurs and whether it is expected behavior or a bug.
moqi
Asked on Jun 13, 2024
- The truncation comes from a change in the generated-token limit: it was 128 tokens in version 0.10 and was reduced to 64 tokens in versions 0.11 and 0.12.
- With the smaller limit, a completion that fit within 128 tokens in version 0.10 can be cut off before the function is complete in the newer versions.
- The user suggested adding an optional `max_decoding_tokens` field to the `CompletionRequest` to allow users to control the maximum number of tokens generated, which could provide more comprehensive completions for users with powerful servers or more patience.
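
The suggested optional field could behave roughly as in the sketch below. This is only an illustration of the idea, not Tabby's actual code: the `CompletionRequest` class here is a simplified stand-in, `effective_token_limit` and `HARD_CAP` are hypothetical names, and the 64 and 128 values are simply the limits reported above.

```python
from dataclasses import dataclass
from typing import Optional

# Limits as reported in the answer above (not checked against Tabby's source here).
DEFAULT_MAX_DECODING_TOKENS = 64   # reported limit in versions 0.11 and 0.12
LEGACY_MAX_DECODING_TOKENS = 128   # reported limit in version 0.10

# Hypothetical server-side ceiling so a client cannot request unbounded output.
HARD_CAP = 512


@dataclass
class CompletionRequest:
    """Simplified, hypothetical stand-in for the server's request type."""
    prompt: str
    # Optional override proposed in the thread; None means "use the default".
    max_decoding_tokens: Optional[int] = None


def effective_token_limit(request: CompletionRequest) -> int:
    """Return the number of tokens the server would generate for this request."""
    requested = request.max_decoding_tokens
    if requested is None:
        return DEFAULT_MAX_DECODING_TOKENS
    # Clamp the client's value into the range [1, HARD_CAP].
    return max(1, min(requested, HARD_CAP))
```

Under this sketch, a client that wants the longer completions from version 0.10 would send `max_decoding_tokens=LEGACY_MAX_DECODING_TOKENS`, while clients that omit the field keep the current default.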
Jun 14, 2024 (edited)