Skip to content

Add: handling case where RL is lower then expected #1153

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 6 commits into
base: main
Choose a base branch
from

Conversation

Sakoram
Copy link
Collaborator

@Sakoram Sakoram commented May 16, 2025

  • Increase RL speed if measured speed turns out too low while training pacing
  • Refactor functions responsible for storing and retrieving pacing train result
  • Adjust dev_tx_queue_set_rl_rate() to handle IAFV on DPDK 25.03
  • Add wrapper function of rwlock
  • Use rwlock to protect DPDK port during resets.
  • STATIC_PAD to be disabled by default

@Sakoram
Copy link
Collaborator Author

Sakoram commented May 19, 2025

This works well for PF, but fails for VF. IAVF does not support dynamic TM updates. I need to figure out how to deal with it.
Everything seems to work now.

Signed-off-by: Kasiewicz, Marek <marek.kasiewicz@intel.com>
@Sakoram Sakoram force-pushed the RL-lower-speed branch 3 times, most recently from 79152e2 to ad3c438 Compare May 22, 2025 07:01
@Sakoram Sakoram requested a review from Copilot May 22, 2025 07:23
Copy link

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR enhances the video pacing logic to detect when the measured bitrate is lower than the requested limit (RL) and retrain at a higher rate, and standardizes pacing-train APIs to handle both pad intervals and BPS profiling.

  • Add measured_bps check in tv_train_pacing to trigger a BPS-based retrain
  • Rename and refactor mt_pacing_train_* APIs for pad and BPS variants
  • Update hardware init paths to consult mt_pacing_train_bps_result_search for initial BPS

Reviewed Changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
lib/src/st2110/st_tx_video_session.c Added measured BPS branch, recursive retrain, renamed pad APIs
lib/src/st2110/st_tx_audio_session.c Swapped audio-specific pacing APIs to generic BPS variants
lib/src/mt_util.h Renamed pacing-train declarations for pad and BPS
lib/src/mt_util.c Updated pacing-train implementations to use unified struct
lib/src/mt_main.h Merged audio pacing struct into unified mt_pacing_train_result
lib/src/dev/mt_dev.c Improved error handling and port restart logic for RL

Signed-off-by: Kasiewicz, Marek <marek.kasiewicz@intel.com>
@Sakoram Sakoram marked this pull request as ready for review May 22, 2025 08:06
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
@Sakoram Sakoram requested a review from Copilot May 27, 2025 14:25
Copy link

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR enhances rate-limiting (RL) pacing by adding dynamic bps retraining when measured throughput is too low, refactors pacing result storage APIs, and updates port reset handling for IAVF on DPDK 25.03.

  • Adds logic in tv_train_pacing to increase bps and retrain if measured rate < expected
  • Unifies “pad” and “bps” pacing train result APIs in mt_util and updates all callers
  • Replaces a mutex with a rwlock for port reset protection and propagates it through flow creation, datapath bursts, and RL updates

Reviewed Changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
lib/src/st2110/st_tx_video_session.c Added measured_bps comparison, bps retraining loop, and renamed pad result API calls
lib/src/st2110/st_tx_audio_session.c Updated audio pacing calls to unified bps result API
lib/src/mt_util.h Split pacing APIs into pad vs. bps variants
lib/src/mt_util.c Implemented mt_pacing_train_pad_* and mt_pacing_train_bps_* functions
lib/src/mt_platform.h Introduced inline pthread rwlock wrappers
lib/src/mt_main.h Consolidated train result structs, removed old audio struct, added rl_rwlock member
lib/src/mt_flow.c Replaced vf_cmd_mutex with rl_rwlock around rte_flow operations
lib/src/mt_cni.c Removed the resetting atomic check in cni_traffic
lib/src/dev/mt_dev.c Refactored dev_tx_queue_set_rl_rate for IAVF, moved to rwlock, updated error paths
lib/src/datapath/mt_queue.c Wrapped Rx/Tx bursts with try-read rwlock to avoid blocking datapath
Comments suppressed due to low confidence (2)

lib/src/mt_util.c:269

  • [nitpick] Parameter name rl_bps is outdated in this function signature; rename it to input_bps to match the header and purpose.
int mt_pacing_train_pad_result_search(struct mtl_main_impl* impl, enum mtl_port port,

lib/src/mt_cni.c:347

  • Removing the resetting flag check allows cni_traffic to run during interface reset, which may result in invalid operations or races.
// if (rte_atomic32_read(&impl->inf[i].resetting)) continue;

@Sakoram Sakoram force-pushed the RL-lower-speed branch 3 times, most recently from dc7ad8b to ba8e062 Compare May 28, 2025 09:21
Signed-off-by: Kasiewicz, Marek <marek.kasiewicz@intel.com>
Copy link
Collaborator

@DawidWesierski4 DawidWesierski4 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is changes in critical path
I don't think this can be merged without perf tests

Comment on lines 806 to 809
ret = mt_pthread_rwlock_unlock(&inf->rl_rwlock);
if (ret) err("%s(%d), failed to release write lock, ret %d\n", __func__, port, ret);

return ret;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

using goto, although prevalent in kernel codin,g shouldn't be standard in our project as we are working witha structural paradigm

Can't we use some inline error handling function ?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i am wrong

Lets use it for error handling and just have it properly designated in functions
mby good to add to programing style documents

Comment on lines 769 to 773
ret = rte_eth_dev_start(port_id);
if (ret) {
err("%s(%d), start port %d fail %d\n", __func__, port, port_id, ret);
goto exit;
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i don't get it
Shoudn't we check if the device is started ?
why are we making sure it's started ?
Looking at the comment above if it's not we can just commit hierarchy and go through with it ?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

on PF any flow works. rte_tm_hierarchy_commit() handles on its own stopping the device. on VF to change node properties we need to:

  • close the device
  • remove old node ( can be done only on closed device, and for some weird reason only after hierarychy was committed while device was UP, when it is done while the device is DOWN some flag in DRV_IAVF is switched and we can't remove any node )
  • add node with new properties
  • start the device (only for reason described above)
  • commit hierarchy
  • stop device
  • start device ( the port needs to be reset to update the hierarchy, again idk why)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IAVF from DPDK seems to be... Imperfect. I also hate to create such a weird flow, but I couldn't make it work otherwise

Comment on lines +48 to +61
enum mtl_port port_id = entry->rxq->port;
struct mt_interface* inf = mt_if(entry->parent, port_id);
int ret;

/* Trylock as we should not block in tasklates */
ret = mt_pthread_rwlock_tryrdlock(&inf->rl_rwlock);
if (ret) {
dbg("%s(%d), try lock fail %d\n", __func__, port_id, ret);
return 0;
}
ret = mt_dpdk_rx_burst(entry->rxq, rx_pkts, nb_pkts);
mt_pthread_rwlock_unlock(&inf->rl_rwlock);

return ret;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is critical path

i wonder if we can make it faster even a bit by ditching everything apart from the lock handle and ret
like
pthread_rwlock_t lock = mt_if(entry->parent, entry->rxq->port)->rl_rwlock

Idk ... I'm supper worried about the performance implication of adding anything there
but i dont' see any sensible alternative
This needs performance validation

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a really good concern, how much overhead this code introduces, I agree that we need to test it,
but I do not agree that changing the way we access variables will change anything. the compiler will handle such a small optimization by itself.

Copy link
Collaborator

@DawidWesierski4 DawidWesierski4 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

    if (inf->tx_pacing_way == ST21_TX_PACING_WAY_RL && bytes_per_sec) {
      ret = dev_tx_queue_set_rl_rate(inf, q, bytes_per_sec);
      if (ret < 0) {
        err("%s(%d), fallback to tsc as rl fail\n", __func__, port);
        inf->tx_pacing_way = ST21_TX_PACING_WAY_TSC;
      }
    }
    tx_queue->active = true;
    mt_pthread_mutex_unlock(&inf->tx_queues_mutex);
    if (inf->tx_pacing_way == ST21_TX_PACING_WAY_RL) {
      float bps_g = (float)tx_queue->bps * 8 / (1000 * 1000 * 1000);
      info("%s(%d), q %d with speed %fg bps\n", __func__, port, q, bps_g);
    } else {
      info("%s(%d), q %d without rl\n", __func__, port, q);
    }

idk why i dont see it flagged down in commit changes but
would change the mt_dev.c line 1596 code above to somthing like

    if (inf->tx_pacing_way == ST21_TX_PACING_WAY_RL && bytes_per_sec) {
      ret = dev_tx_queue_set_rl_rate(inf, q, bytes_per_sec);
      if (ret < 0) {
        err("%s(%d), fallback to tsc as rl fail\n", __func__, port);
        inf->tx_pacing_way = ST21_TX_PACING_WAY_TSC;
      }

      float bps_g = (float)tx_queue->bps * 8 / (1000 * 1000 * 1000);
      info("%s(%d), q %d with speed %fg bps\n", __func__, port, q, bps_g);
    } else {
      info("%s(%d), q %d without rl\n", __func__, port, q);
    }
    tx_queue->active = true;
    mt_pthread_mutex_unlock(&inf->tx_queues_mutex);

Sakoram added 2 commits May 30, 2025 14:11
Signed-off-by: Kasiewicz, Marek <marek.kasiewicz@intel.com>
Signed-off-by: Kasiewicz, Marek <marek.kasiewicz@intel.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants