I upgraded Tabby from 0.10.0 to 0.12.0 and noticed that code completions are truncated in the newer versions. For a specific model and input context, the completion returned by versions 0.11 and 0.12 is cut short compared with the one returned by version 0.10. I would like to understand why this truncation occurs and whether it is expected behavior or a bug.
moqi
Asked on Jun 13, 2024
max_decoding_tokens field to the CompletionRequest to allow users to control the maximum number of tokens generated, which could provide more comprehensive completions for users with powerful servers or more patience.
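
To make the idea of a per-request token limit concrete, here is a minimal Python sketch. It is not Tabby's actual code or API: only the names CompletionRequest and max_decoding_tokens come from this thread, while the prompt field, the default of 64, the server-side cap of 512, and the helper effective_max_tokens are all assumptions made up for illustration.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical values chosen for illustration; they are not taken from Tabby.
DEFAULT_MAX_DECODING_TOKENS = 64
SERVER_HARD_LIMIT = 512


@dataclass
class CompletionRequest:
    """Illustrative stand-in for a completion request, not Tabby's real type."""
    prompt: str
    # None means "the caller did not ask for a specific limit".
    max_decoding_tokens: Optional[int] = None


def effective_max_tokens(request: CompletionRequest) -> int:
    """Return the number of tokens the server would be allowed to generate."""
    requested = request.max_decoding_tokens
    if requested is None:
        # Fall back to the default when the field is omitted.
        return DEFAULT_MAX_DECODING_TOKENS
    if requested < 1:
        raise ValueError("max_decoding_tokens must be a positive integer")
    # Clamp to a server-side cap so one request cannot hold a worker indefinitely.
    return min(requested, SERVER_HARD_LIMIT)


if __name__ == "__main__":
    print(effective_max_tokens(CompletionRequest(prompt="def fib(n):")))
    print(effective_max_tokens(CompletionRequest(prompt="def fib(n):", max_decoding_tokens=256)))
```

Keeping the field optional means existing clients that never set it would see the same behavior as before, while the cap keeps latency bounded for everyone else.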