The server returns token usage when you request it (via a bit of JSON). IMO, it would be very useful to surface the cost of each LLM operation to the user.
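Something along these lines could work. This is just a rough sketch in Python: the `usage` field names follow OpenAI-style responses and the per-token prices are made up, so both are assumptions, not what our server actually returns or charges.

```python
# Hypothetical per-1K-token prices in dollars; real pricing would come from config.
PRICE_PER_1K = {"prompt": 0.001, "completion": 0.002}


def estimate_cost(response_json: dict) -> float:
    """Estimate the dollar cost of one LLM call from its JSON response.

    Assumes an OpenAI-style "usage" object with "prompt_tokens" and
    "completion_tokens"; adjust the field names to whatever the server sends.
    """
    usage = response_json.get("usage", {})
    prompt_tokens = usage.get("prompt_tokens", 0)
    completion_tokens = usage.get("completion_tokens", 0)
    return (
        prompt_tokens / 1000 * PRICE_PER_1K["prompt"]
        + completion_tokens / 1000 * PRICE_PER_1K["completion"]
    )


# Example: report the cost to the user right after the call.
response_json = {"usage": {"prompt_tokens": 1200, "completion_tokens": 350}}
print(f"This request cost ~${estimate_cost(response_json):.4f}")
```

The nice part is that the cost report piggybacks on data the server already sends back, so it's basically free to show.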