
[Bug]: last_token_time is equal to arrival_time #9998

Closed
@wolfgangsmdt


Your current environment

The bug is not related to the environment.

Model Input Dumps

The bug is not related to the model.

🐛 Describe the bug

QUESTION 1:

How are the RequestMetrics in RequestOutput calculated? Please see the screenshot below (highlighted in yellow):

[Screenshot: RequestMetrics fields of a RequestOutput, highlighted in yellow]

I found here, at L. 696, that last_token_time is set equal to arrival_time. Is this a bug?

Could you also tell me what unit the times are in: seconds or nanoseconds? I believe arrival_time is set with something like the example below (correct me if I am wrong):

import time
arrival_time = time.perf_counter()
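
As a sanity check on the units: in the Python standard library, both time.time() and time.perf_counter() return float seconds. time.time() counts from the Unix epoch, while time.perf_counter() has an arbitrary reference point, so only differences between two of its readings are meaningful. Which clock the engine actually uses is still my open question; the helper name elapsed_seconds below is made up for illustration.

```python
import time


def elapsed_seconds(work):
    """Time a callable with perf_counter; the difference is in float seconds."""
    start = time.perf_counter()
    work()
    return time.perf_counter() - start


wall = time.time()  # float seconds since the Unix epoch
dt = elapsed_seconds(lambda: time.sleep(0.05))  # roughly 0.05 seconds
print(f"time.time() = {wall:.3f} s, elapsed = {dt:.3f} s")
```

If the timestamps in RequestMetrics look like large numbers around 1.7e9, they are most likely epoch seconds; small numbers would suggest a monotonic clock.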

QUESTION 2:

How can I calculate output tokens/second, TTFT (time to first token), TBT (time between tokens), throughput, and total time?
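
To make the question concrete, here is a minimal sketch of the definitions I have in mind, assuming I can record one timestamp (in seconds) per generated token myself. TokenTimeline, summarize and aggregate_throughput are hypothetical names of my own, not part of the library; the open question is how the fields in RequestMetrics map onto these quantities.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class TokenTimeline:
    """Hypothetical per-request record; all times in seconds on one clock."""
    arrival_time: float
    token_times: List[float]  # one timestamp per generated token, ascending


def summarize(t: TokenTimeline) -> Dict[str, float]:
    if not t.token_times:
        raise ValueError("no tokens were generated")
    n = len(t.token_times)
    first, last = t.token_times[0], t.token_times[-1]
    total_time = last - t.arrival_time  # arrival to last token
    ttft = first - t.arrival_time       # time to first token
    # Mean time between tokens over the decode phase (undefined for 1 token).
    tbt = (last - first) / (n - 1) if n > 1 else 0.0
    tokens_per_s = n / total_time if total_time > 0 else float("inf")
    return {"ttft": ttft, "tbt": tbt, "total_time": total_time,
            "output_tokens_per_s": tokens_per_s}


def aggregate_throughput(timelines: List[TokenTimeline]) -> float:
    """Output tokens per second across requests, over the wall-clock span."""
    tokens = sum(len(t.token_times) for t in timelines)
    span = (max(t.token_times[-1] for t in timelines)
            - min(t.arrival_time for t in timelines))
    return tokens / span if span > 0 else float("inf")


req = TokenTimeline(arrival_time=10.0,
                    token_times=[10.5, 10.75, 11.0, 11.25, 11.5])
stats = summarize(req)
print(stats)
```

Under these definitions, a finished request whose last_token_time equals its arrival_time would have a total time of zero, which is why the value at L. 696 looks suspicious to me.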

Before submitting a new issue...

  • Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.

Labels: bug
