Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Growing memory usage with DASH live streams #6610

Closed
MichaelSweden opened this issue May 14, 2024 · 20 comments
Closed

Growing memory usage with DASH live streams #6610

MichaelSweden opened this issue May 14, 2024 · 20 comments
Labels
component: DASH The issue involves the MPEG DASH manifest format priority: P1 Big impact or workaround impractical; resolve before feature release status: archived Archived and locked; will not be updated type: bug Something isn't working correctly type: performance A performance issue
Milestone

Comments

@MichaelSweden
Copy link
Contributor

MichaelSweden commented May 14, 2024

Have you read the FAQ and checked for duplicate open issues?
Yes

If the problem is related to FairPlay, have you read the tutorial?
No

What version of Shaka Player are you using?
v4.8.3

Can you reproduce the issue with our latest release version?
Yes

Can you reproduce the issue with the latest code from main?
Yes

Are you using the demo app or your own custom app?
Own

If custom app, can you reproduce the issue using our demo app?
Not done, but tested with a minimal html page and default shaka config.

What browser and OS are you using?
Cobalt browser, Linux

For embedded devices (smart TVs, etc.), what model and firmware version are you using?
N/A

What are the manifest and license server URIs?
Sorry, have not found any free public streams showing this problem.

What configuration are you using? What is the output of player.getConfiguration()?
Not relevant, since problem also occurs with default config.

What did you do?
Upgrade shaka from v4.5.0 to v4.8.3 and playing a dash live stream.

What did you expect to happen?
Be able to play for long time.

What actually happened?
After a few hours memory usage have grown very much, several hundred megabytes. Later in our platform oom-killer kicks in and terminate the browser process.

Are you planning send a PR to fix it?
Yes

@MichaelSweden MichaelSweden added the type: bug Something isn't working correctly label May 14, 2024
@MichaelSweden
Copy link
Contributor Author

Trouble shooting
Playing in Firefox and using web dev tool show shortly after start, strings 40 MiB (count 28902). After 70 minutes, 185 MiB (count 39141). More detail:
157.261.256 74% objects Array, g_Player parser_ streamMap_ id_PT1713303398S,audio1300_0 segmentIndex references
Below this many 262.720, with objectElements[xxxx].
The shaka.media.SegmentIndex class have an array of shaka.media.SegmentReference named "references".

Since the older version doesn't show this behavior I started to test different releases to track down when the problem started to occur. Seems to be between v4.6.0 and v4.6.1, more precisely:
Issue: #5898 SegmentTemplate@media not updated after change in DASH manifest
PR: #5899 fix(DASH): SegmentTemplate@media not updated after change in manifest
Commit: 30de177 fix(DASH): SegmentTemplate@media not updated after change in manifest (#5899)
Row: this.templateInfo_.mediaTemplate = info.mediaTemplate;

If this row is removed in v4.8.3, the huge growing memory usage disappear. This row is not needed for the stream that cause this problem.

@MichaelSweden
Copy link
Contributor Author

MichaelSweden commented May 14, 2024

Thoughts and attempts to fix
My guess is that the assigment of the string will create a reference between objects that cause the garbage collector not to free up memory that is associated with these objects. However shouldn't it be only the string that is hold in memory, not other parts. This part of the code is quite complex and handle different scenarios. A dash manifest (mpd) support many functionalities and many parameters. This problem don't show up with all streams. I haven't found any free public test stream on internet that gives this problem. So maybe some special in this manifest that cause some part to behave faulty.

Well I try to get rid of this growing memory, by changing this row. It turned out to be problematic. Here is a summary of testing (by playing in Cobalt browser):
These doesn't help:

#1
this.templateInfo_.mediaTemplate = String(info.mediaTemplate);
#2
this.templateInfo_.mediaTemplate = String(info.mediaTemplate).toString();
#3
{
  // eslint-disable-next-line no-new-wrappers
  const newString = new String(info.mediaTemplate);
  this.templateInfo_.mediaTemplate = newString.toString();
}
#4
this.templateInfo_.mediaTemplate = String(info.mediaTemplate).trim();
Documentation of trim says: "If neither the beginning or end of str has any whitespace, a new string is still returned (essentially a copy of str)."
#5:
this.templateInfo_.mediaTemplate = String(info.mediaTemplate).replace('\0', '\0');

These helps:

#6 ok
this.templateInfo_.mediaTemplate = String(info.mediaTemplate).replace('_'. '_');
Could be done like this:
  static createNewString(str) {
    if (!str || str.length < 1) {
      return '';
    }
    const replaceStr = str.substring(0, 1);
    return str.replace(replaceStr, replaceStr);
  }
#7 ok
{
  /** @const {!ArrayBuffer} */
  const buf = shaka.util.StringUtils.toUTF8(String(info.mediaTemplate));
  const newString = shaka.util.StringUtils.fromUTF8(buf);
  this.templateInfo_.mediaTemplate = newString;
}
#8 ok
{ // Create a new string, which do not have any reference to the source
  const newString = '_' + info.mediaTemplate;
  this.templateInfo_.mediaTemplate = newString.substring(1);
}

I prefer the last, put in a new function.

I would appreciate getting links to live streams that trigger calls to the appendTemplateInfo function.

@shaka-bot shaka-bot added this to the v4.9 milestone May 14, 2024
@MichaelSweden
Copy link
Contributor Author

More tests
In issue #6239 Memory leak on multi-period streams
I found this link: https://d24rwxnt7vw9qb.cloudfront.net/v1/dash/e6d234965645b411ad572802b6c9d5a10799c9c1/All_Reference_Streams/4577dca5f8a44756875ab5cc913cd1f1/index.mpd
This also show growing memory problem.
I have now tested this stream with the latest main (12 May, last commit 5f8628a).
pure main: 1h 207MB (v4.8.3-main-20-g5f8628a4a)
main with fix: 1h 206MB (v4.8.3-main-20-g5f8628a4a-dirty)
Problem with this stream will not be solved by my fix.

Same test condition, but our stream:
main with fix: 1h 61MB (v4.8.3-main-20-g5f8628a4a-dirty)
pure main: 1.25h 182MB (v4.8.3-main-20-g5f8628a4a)

Test in Chromium (124.0.6367.155) with default shaka config (MB from heap snapshot with developer tools), stream from cloudfront:
pure main: 1h 80MB
main with fix: 1.25h 119MB

Test in Chromium with our stream:
pure main: 1h 7.8MB Oops, no huge growing memory!

Please don't compare MB (MegaByte) values between Cobalt and Chromium, they show different memory.

The fix:

--- a/lib/dash/segment_template.js
+++ b/lib/dash/segment_template.js
@@ -714,7 +714,11 @@ shaka.dash.TimelineSegmentIndex = class extends shaka.media.SegmentIndex {
     } else {
       const currentTimeline = this.templateInfo_.timeline;
 
-      this.templateInfo_.mediaTemplate = info.mediaTemplate;
+      {
+        // Create a new string, which has no reference to the source
+        const newString = '_' + info.mediaTemplate;
+        this.templateInfo_.mediaTemplate = newString.substring(1);
+      }
 
       // Append timeline
       const lastCurrentEntry = currentTimeline[currentTimeline.length - 1];

The pull request will also include an if-statement (to avoid unnecessary assignment/copy of string) and a function.

@MichaelSweden
Copy link
Contributor Author

Conclusion
I have a fix. However since the fault doesn't appear in Chromium, can it be a javascript engine problem?

So right now I'm little confused. Shall I put up a pull request, or not. Any ideas are welcome. I believe I need to think about this for a while, but guess the fix would be good. Maybe I will trouble shoot again, but with a different approach. Maybe it would be possible to solve the issue at another place in the code.

@joeyparrish
Copy link
Member

Chrome and Firefox have different JS engines (V8 and Spidermonkey, respectively), so it's very possible that GC issues could differ by browser.

However, the workaround above (build new string and take a substring) looks very fishy. A string value should not hold any references to a parent object that contains it. I have a hard time imagining a scenario where it's not an independent memory allocation.

Obviously, solving this issue would be good, but I'm hesitant to put in the fix you posted above, because of all of the following:

  1. We don't have any automation to avoid regressing on this point
  2. We don't understand the mechanism causing this in the JS engine, so...
  3. We don't actually know why this "fixes" it, so...
  4. We can't have any confidence in the fix working in the future (say, after a browser update)
  5. We don't know whether we have other instances of this problem
  6. We don't know how to avoid introducing other instances of this problem in the future

I would like to see this solved, but I feel strongly that we need to understand and document the root cause or the mechanism behind the workaround.

@MichaelSweden
Copy link
Contributor Author

I agree with you.

Cobalt using V8, but old 8.8 version (Chromium have 12.4).

@MichaelSweden
Copy link
Contributor Author

I can add, as info, that for our problematic stream the template string is always the same.

I have tested more browsers, here is a summary:

Cobalt 23               : V8  8.8.278.8  : Problem
Cobalt main (Ubuntu)    : V8  8.8.278.17 : Problem, 1.5h+180MB (fix 1h+0MB)
Chromium 124.0.6367.155 : V8 12.4.254.15 : ok
Chromium 112.0.5615.49  : V8 11.2.214.9  : ok
Chromium 90.0.4430.72   : V8  9.0.257.17 : ok
Chromium 49.0.2623.108  : V8  4.9.385.33 : ok
Firefox  125.0.3                         : Problem
Firefox  88.0                            : Problem
Edge 42.17134.1098.0                     : ok, 1h 29->33MB 168->755MB(Process memory usage)
Chromium 124.0.6367.207 (V8 12.4.254.15) : With and without fix for 3 hours VmRSS: +0MB vs +200MB
ok = No growing JS heap size in dev tool

The above "ok" is a check if the JS heap size reported by development tools in the browser doesn't show a big increase. However while testing Cobalt I don't have the possibility to check JS heap size. Instead I have looked at the memory used by the cobalt process. To do similar with Chromium (Ubuntu) I started two instances of Chromium. One running with the fix and one without the fix. Monitor the VmRSS value like this:
while true; do echo "$(cat /proc//status | grep VmRSS) : $(cat /proc//status | grep VmRSS)"; sleep 60s; done
Got this result after about 3 hours:
20240519chromium
The red is without the fix. Oops, it increase. But earlier test show that JS heap doesn't increase. Directly after this I started to play without the fix and open dev tool, to see JS heap size. After 2 hours no significant increase.

So now I'm starting to think that chromium also suffers from this problem.

// To be continued (at low speed)

@avelad
Copy link
Member

avelad commented May 28, 2024

@MichaelSweden do you have any update? Thanks!

@cristian-atehortua
Copy link
Contributor

cristian-atehortua commented May 28, 2024

This is curious because mediaTemplate is create using shaka.util.StringUtils.htmlUnescape which intentionally avoids to call replace:

   static htmlUnescape(input) {
    // Used to map HTML entities to characters.
    const htmlUnescapes = {
      '&amp;': '&',
      '&lt;': '<',
      '&gt;': '>',
      '&quot;': '"',
      '&#39;': '\'',
      '&apos;': '\'',
      '&nbsp;': '\u{a0}',
      '&lrm;': '\u{200e}',
      '&rlm;': '\u{200f}',
    };

    // Used to match HTML entities and HTML characters.
    const reEscapedHtml = /&(?:amp|lt|gt|quot|apos|#(0+)?39|nbsp|lrm|rlm);/g;
    const reHasEscapedHtml = RegExp(reEscapedHtml.source);
    // This check is an optimization, since replace always makes a copy
    if (input && reHasEscapedHtml.test(input)) {
      return input.replace(reEscapedHtml, (entity) => {
        // The only thing that might not match the dictionary above is the
        // single quote, which can be matched by many strings in the regex, but
        // only has a single entry in the dictionary.
        return htmlUnescapes[entity] || '\'';
      });
    }
    return input || '';
  }

So, this should not be an issue if the media attribute has an escaped HTML sequence (?)

@MichaelSweden
Copy link
Contributor Author

I have seen that part. Our problematic stream don't have any that will be unescaped. Did this test:

--- a/lib/util/string_utils.js
+++ b/lib/util/string_utils.js
@@ -300,9 +300,9 @@ shaka.util.StringUtils = class {
 
     // Used to match HTML entities and HTML characters.
     const reEscapedHtml = /&(?:amp|lt|gt|quot|apos|#(0+)?39|nbsp|lrm|rlm);/g;
-    const reHasEscapedHtml = RegExp(reEscapedHtml.source);
+    // const reHasEscapedHtml = RegExp(reEscapedHtml.source);
     // This check is an optimization, since replace always makes a copy
-    if (input && reHasEscapedHtml.test(input)) {
+    if (input/* && reHasEscapedHtml.test(input)*/) {
       return input.replace(reEscapedHtml, (entity) => {
         // The only thing that might not match the dictionary above is the
         // single quote, which can be matched by many strings in the regex, but

Memory still grow. I guess that "since replace always makes a copy" might not be true (as for some other javascript functions that is specified like that).

I have traced the string in question and have these notes:

- This function:
shaka.dash.SegmentTemplate.appendTemplateInfo(info, periodStart, periodEnd, shouldFit, initSegmentReference)
    this.templateInfo_.mediaTemplate = info.mediaTemplate;
- is called from:
createStreamInfo(context, requestSegment, streamMap, isUpdate, segmentLimit, periodDurationMap, aesKey, lastSegmentNumber)
    @param {shaka.dash.DashParser.Context} context
    @type {shaka.dash.SegmentTemplate.SegmentTemplateInfo}
    const info = SegmentTemplate.parseSegmentTemplateInfo_(context);
    tsi.appendTemplateInfo(info, periodStart, periodEnd, shouldFit, initSegmentReference);
- which calling:
parseSegmentTemplateInfo_(context)
    const media = MpdUtils.inheritAttribute(context, SegmentTemplate.fromInheritance_, 'media');
    mediaTemplate: media && StringUtils.htmlUnescape(media)

Let do some more testing. We will take help by this new function:

--- a/lib/util/string_utils.js
+++ b/lib/util/string_utils.js
@@ -312,6 +312,22 @@ shaka.util.StringUtils = class {
     }
     return input || '';
   }
+
+  /**
+   * Creates a copy of a string.
+   * The new returned string do not have any reference to the source string.
+   *
+   * @param {?string} str
+   * @return {string}
+   * @export
+   */
+  static createNewString(str) {
+    if (!str || str.length < 1) {
+      return '';
+    }
+    const newString = '_' + str;
+    return newString.substring(1);
+  }
 };
 
 

Let us test with this:

--- a/lib/dash/segment_template.js
+++ b/lib/dash/segment_template.js
@@ -201,8 +201,8 @@ shaka.dash.SegmentTemplate = class {
     const segmentInfo =
         MpdUtils.parseSegmentInfo(context, SegmentTemplate.fromInheritance_);
 
-    const media = MpdUtils.inheritAttribute(
-        context, SegmentTemplate.fromInheritance_, 'media');
+    const media = StringUtils.createNewString(MpdUtils.inheritAttribute(
+        context, SegmentTemplate.fromInheritance_, 'media'));
     const index = MpdUtils.inheritAttribute(
         context, SegmentTemplate.fromInheritance_, 'index');
 

No huge memory grow. I will do more testing. I have had very much to do last week and also this week. But I will be back with more information.

@MichaelSweden
Copy link
Contributor Author

I also tested this:

 --- a/lib/dash/dash_parser.js
+++ b/lib/dash/dash_parser.js
@@ -1891,6 +1891,12 @@ shaka.dash.DashParser = class {
 
     const segmentBase = TXml.findChild(elem, 'SegmentBase');
     const segmentTemplate = TXml.findChild(elem, 'SegmentTemplate');
+    if (segmentTemplate && segmentTemplate.attributes) {
+      if (segmentTemplate.attributes['media']) {
+        segmentTemplate.attributes['media'] = shaka.util.StringUtils
+            .createNewString(segmentTemplate.attributes['media']);
+      }
+    }
 
     // The availabilityTimeOffset is the sum of all @availabilityTimeOffset
     // values that apply to the adaptation set, via BaseURL, SegmentBase,

and memory usage will not grow huge.

I have some ideas about what going on, but need to do more tests, investigate and think it through.

@avelad avelad modified the milestones: v4.9, v4.10 May 30, 2024
@MichaelSweden
Copy link
Contributor Author

This will not fix growing memory:

--- a/lib/dash/segment_template.js
+++ b/lib/dash/segment_template.js
@@ -54,6 +54,10 @@ shaka.dash.SegmentTemplate = class {
 
     /** @type {shaka.dash.SegmentTemplate.SegmentTemplateInfo} */
     const info = SegmentTemplate.parseSegmentTemplateInfo_(context);
+    if (info.mediaTemplate) {
+      info.mediaTemplate =
+          shaka.util.ObjectUtils.cloneObject(info.mediaTemplate);
+    }
 
     SegmentTemplate.checkSegmentTemplateInfo_(context, info);
 

But this...

--- a/lib/dash/segment_template.js
+++ b/lib/dash/segment_template.js
@@ -54,6 +54,10 @@ shaka.dash.SegmentTemplate = class {
 
     /** @type {shaka.dash.SegmentTemplate.SegmentTemplateInfo} */
     const info = SegmentTemplate.parseSegmentTemplateInfo_(context);
+    if (info.mediaTemplate) {
+      info.mediaTemplate =
+          shaka.util.StringUtils.createNewString(info.mediaTemplate);
+    }
 
     SegmentTemplate.checkSegmentTemplateInfo_(context, info);
 

will not grow memory (i.e. cloneObject vs createNewString).

@MichaelSweden
Copy link
Contributor Author

I have done some investigation about the memory (heap) content to see if I can verify what I think going on. I have played the stream one hour with 3 different variants of Shaka player. Let us call them: MainZero :: MainFix1 :: MainFix2 (I will use this syntax with double colon to present all 3 results)
MainZero = The main branch
MainFix1 = In createStreamInfo after the only call to parseSegmentTemplateInfo_, call the new createNewString for the media template string.
MainFix2 = The above, plus: In appendTemplateInfo only assign the template string if it is different. This function is only called from createStreamInfo.

MainFix1:

--- a/lib/dash/segment_template.js
+++ b/lib/dash/segment_template.js
@@ -54,6 +54,10 @@ shaka.dash.SegmentTemplate = class {
 
     /** @type {shaka.dash.SegmentTemplate.SegmentTemplateInfo} */
     const info = SegmentTemplate.parseSegmentTemplateInfo_(context);
+    if (info.mediaTemplate) {
+      info.mediaTemplate =
+          shaka.util.StringUtils.createNewString(info.mediaTemplate);
+    }
 
     SegmentTemplate.checkSegmentTemplateInfo_(context, info);
 
--- a/lib/util/string_utils.js
+++ b/lib/util/string_utils.js
@@ -312,6 +312,22 @@ shaka.util.StringUtils = class {
     }
     return input || '';
   }
+
+  /**
+   * Creates a copy of a string.
+   * The new returned string do not have any reference to the source string.
+   *
+   * @param {?string} str
+   * @return {string}
+   * @export
+   */
+  static createNewString(str) {
+    if (!str || str.length < 1) {
+      return '';
+    }
+    const newString = '_' + str;
+    return newString.substring(1);
+  }
 };
 
 

MainFix2 is the above patch, plus this:

--- a/lib/dash/segment_template.js
+++ b/lib/dash/segment_template.js
@@ -714,7 +718,9 @@ shaka.dash.TimelineSegmentIndex = class extends shaka.media.SegmentIndex {
     } else {
       const currentTimeline = this.templateInfo_.timeline;
 
-      this.templateInfo_.mediaTemplate = info.mediaTemplate;
+      if (this.templateInfo_.mediaTemplate !== info.mediaTemplate) {
+        this.templateInfo_.mediaTemplate = info.mediaTemplate;
+      }
 
       // Append timeline
       const lastCurrentEntry = currentTimeline[currentTimeline.length - 1];

After run for one hour I use the "kill -s SIGSEGV " command to terminate the browser process (Cobalt) and a core dump file being created. I didn't find out how to dump the heap in gdb. Instead inspect all the core dump file. The core dump file sizes are 478 :: 357 :: 360 MB. The memory usage at kill was 198 :: 75 :: 60 MB (approximately, they goes up and down).

The "media" template is 112 characters long in our manifest and exist 5 times in each manifest. First I searched for the media template string. Then adding little bit more in front of this string, the way it is written in manifest. The matches of this search will actually be all manifest. Please note that these hits are not all true. It can be deleted objects left in memory (garbage), but that is not important for the result of this test. Third I searched for the template string with a underscore in front of it. This will hit strings created by my new createNewString function.

grep -a -o '<media-template-string>' coreMain???? | wc -l
#  4817 :: 1734 :: 176
grep -a -o 'media="<media-template-string>' coreMain???? | wc -l
#  4817 ::  112 :: 105
grep -a -o '_<media-template-string>' coreMain???? | wc -l
#     0 :: 1622 ::  71

Sum the second and third row will give value on first row, for each column.
For the Main variant we can see that all 4817 existens of the string is part of a manifest. A manifest is approximately 111 kB long.
For the Fix1 variant we can see 112 manifests and 1622 new strings.
For the Fix2 variant we can see same amount of manifests as Fix1, but less new strings.

Keep in mind that these numbers are very approximately, since some of them are garbage (I can see it by looking in the files).

The result show how javascript engine handle these two rows in my new function:
const newString = '_' + str;
return newString.substring(1);
The first row will allocate a new string buffer for the new string, since it shall contain a underscore first. The second row will not result in a new buffer, instead it will point to the second character in the newString buffer.
Now we come to what I suspect happen and how javascript engine works. It will try to use existing occurence of a string, even if an existing variable only contain a part of another string. For example a string variable called mediaTemplate can instead of having it's own buffer, point to stringManifest[240] and have length 112.

So the xml parsing part and levels above that will have references to the big manifest string. This causes the row "this.templateInfo_.mediaTemplate = info.mediaTemplate;" to hold a reference to the big manifest. In other words: String objects that have their origin of a bigger string, will hold a reference to the big string. This is in contrast to having its own new buffer with only the short string in it.

@cristian-atehortua
Copy link
Contributor

I realized the same when I did a heap capture on Chrome. There are (sliced string) entries in the heap and each holds internally a parent value which is the original string from where the slice comes.

This shouldn't be a problem since, in general, we should trust in the optimization the JS engine does. The problem here is probably that we are keeping references to old strings, not allowing the GC to collect them (and in consequence the long strings are also kept)

In particular, the mediaTemplate is being kept through the createUrisCb closure in the get method of TimelineSegmentIndex. Can you try this? (with and without the createNewString fix, and with and without the check for replacement on the appendTemplateInfo method)

--- a/lib/dash/segment_template.js
+++ b/lib/dash/segment_template.js
@@ -917,7 +917,6 @@ shaka.dash.TimelineSegmentIndex = class extends shaka.media.SegmentIndex {
     let ref = this.references[correctedPosition];
 
     if (!ref) {
-      const mediaTemplate = this.templateInfo_.mediaTemplate;
       const range = this.templateInfo_.timeline[correctedPosition];
       const segmentReplacement = range.segmentPosition;
       const timeReplacement = this.templateInfo_
@@ -990,9 +989,12 @@ shaka.dash.TimelineSegmentIndex = class extends shaka.media.SegmentIndex {
         if (range.partialSegments > 0) {
           return [];
         }
+        if (!this.templateInfo_) {
+          return [];
+        }
         return shaka.dash.TimelineSegmentIndex
             .createUris_(
-                mediaTemplate,
+                this.templateInfo_.mediaTemplate,
                 this.representationId_,
                 segmentReplacement,
                 this.bandwidth_,

@MichaelSweden
Copy link
Contributor Author

Great proposal, it seems to fix it :-)

All tests take time, I will be back tomorrow with more details.

@MichaelSweden
Copy link
Contributor Author

Here are result from previous test of 3 variants and now added 4 more.
The first 3 new are the same as the previous 3 with addition of the proposal patch above (in the "get" function, I call it CA). The coreCA3 is this patch plus the if-not-equal check in appendTemplateInfo, ie not any createNewString function.

The core dump file sizes are 478 :: 357 :: 360 ::: 356 :: 364 :: 358 :: 360 MB.
The memory usage at kill was 198 :: 75 :: 60 ::: 58 :: 67 :: 58 :: 59 MB (approximately, they goes up and down a lot).

# coreMainZero :: coreMainFix1 :: coreMainFix2 ::: coreCA0 :: coreCA1 :: coreCA2 :: coreCA3
grep -a -o '<media-template-string>' coreMainZero | wc -l
# 4817 :: 1734 :: 176 ::: 81 :: 208 :: 108 :: 85
grep -a -o 'media="<media-template-string>' coreMainZero | wc -l
# 4817 ::  112 :: 105 ::: 81 :: 143 ::  61 :: 85
grep -a -o '_<media-template-string>' coreMainZero | wc -l
#    0 :: 1622 ::  71 :::  0 ::  65 ::  47 ::  0

We can see that the new get-patch lower the amount of manifest in memory.
We can see that the if-ne-assign patch will not make any difference in combination with only the get-patch (CA0 vs CA3).
Adding the createNewString seems to increase number of manifest in memory. This is suspicious, I guess this can be by coincidence. Because lowest is a combination of all. But between CA1 and CA2, I didn't expect what we can see (they should be like equal). It should not effect number of manifests, only number of new strings. Please keep in mind that we look in all memory, also garbage area.
I'm wondering if the new string stuff have any effect or not, with the get-patch. Clearly the get-patch is very good.

It might be a good idea to redo this kind of test several times. Also run for longer time.

@MichaelSweden
Copy link
Contributor Author

MichaelSweden commented Jun 12, 2024

I have executed more of the last test. All variants contains the get-patch.
CA0: Only the get-patch
CA1: createNewString of mediaTemplate in createStreamInfo
CA2: Above plus if-not-equal check of mediaTemplate in appendTemplateInfo
CA3: if-not-equal check of mediaTemplate in appendTemplateInfo

     Mediatemplate:   Manifest:       NewString:
Four one hours run for each:
CA0:  90 108 119 123, 90 108 119 123,  0  0  0  0
CA1: 110 131 137 150, 72  89  95 110, 36 38 40 48
CA2: 143 165 173 182, 85 102 103 104, 58 61 71 79
CA3:  91 106 111 130, 91 106 111 130,  0  0  0  0
Two two hours run for each:
CA0 2h:      131 174,        131 174,        0  0
CA1 2h:      129 181,         84 122,       45 59
CA2 2h:      187 189,        124 125,       62 65
CA3 2h:      102 115,        102 115,        0  0

I expected to see no difference in CA3 compare to CA0. However the 2 hour runs show some, but probably just a coincidence.
I also expect that the if-not-equal check doesn't effect number of manifest if the createNewString is used. Seems to like that. I have thought that the if-not-equal shall be used with createNewString, so these doesn't grow in number. However test show different.
If we can assume above it leaves us with CA0 and CA1 (or CA2). I was wondering if the createNewString is needed or not. I little hope not, but result seems to indicate otherwise. However this way of looking in memory is not so sure. I can run more tests.

I think this memory graph for the 2 hours run don't say any valuable to us.
Memory2

@avelad avelad added component: DASH The issue involves the MPEG DASH manifest format type: performance A performance issue priority: P1 Big impact or workaround impractical; resolve before feature release labels Jun 13, 2024
@avelad avelad modified the milestones: v4.10, v4.11 Jul 1, 2024
@MichaelSweden
Copy link
Contributor Author

Here are result from 8 hours runs. Should have been 4 runs for each shaka variant, but got problem with the stream at end. It might have been problem with stream some times during test. The result have some indication of that.

Three eight hours run for each:

      Mediatemplate  Manifest      NewString
CA0:   93 110 114,    93 110 114,   0  0  0
CA1:  145 159 173,    97 111 113,  48 48 60
CA2:  169 181 206,   101 106 135,  63 71 80
CA3:   79 103 123,    79 103 123,   0  0  0

We can compare the CA0 and CA3, with the earlier four hours run:

CA0:  90 108 119 123, 90 108 119 123,  0  0  0  0
CA3:  91 106 111 130, 91 106 111 130,  0  0  0  0

I guess we can conclude that it doesn't grow more at 8 compare with 4 hours run. I guess that all variants will have the same amount of manifests in memory. Please remember that the way I inspect memory content is very unreliable.

If we look at the browser process memory usage in Linux. I believe we cannot make any conclusion from it. All CA0 is red (4), CA1 green (4), CA2 blue (3), CA3 magenta (3).
mem

PS. I will come back with more info.

@avelad
Copy link
Member

avelad commented Jul 15, 2024

In version 4.10.6 we have introduced some improvements, can you validate if it is enough or do we need more improvements? Thanks!

@avelad avelad added the status: waiting on response Waiting on a response from the reporter(s) of the issue label Jul 16, 2024
@MichaelSweden
Copy link
Contributor Author

I have done some test runs. Four variants and three times of each. Every run was one hour long. Code of two versions of source code, the v4.10.6 release and main branch from some days ago (before the effective commit). This issue is about memory usage, so let us take a look of browser memory usage.

Red: v4.10.6
Green: v4.10.6!
Blue: main20240714! incl CAfix
Magenta: main20240714!
Memory
The most interesting here is that v4.10.6 is good.

Let us also check the search in memory I done earlier. After mediatemplate string and manifest. During one hour run approximate 1800 manifests would have been downloaded and parsed. Each contaning four instances of the mediatemplate string.

                  Mediatemplate:   Manifest:        NewString:
v4.10.6         :       83 90 98,        83 90 98,  0 0 0
v4.10.6!        :       60 82 85,        60 82 85,  0 0 0
20240714! CAfix :       80 88 99,        80 88 99,  0 0 0
20240714!       : 4648 4750 7143,  4648 4750 7143,  0 0 0
! = Own patch to be able to play the stream good, else gap-jumps.

In v4.10.6 this commit is included: c1480c7 fix(DASH): Improve memory usage with live streams (#7039)
This have the patch of createUrisCb, I call CAfix (proposal from cristian-atehortua, Jun 4). This is what I would recommend as fix. This commit also contain an if-not-equal check of mediaTemplate in appendTemplateInfo. This is what I did as a quick fix. I believe that these two is a good solution for this issue.

I consider this issue now being solved by #7039 and can be closed. Thanks!

@shaka-bot shaka-bot removed the status: waiting on response Waiting on a response from the reporter(s) of the issue label Jul 16, 2024
@avelad avelad closed this as completed Jul 16, 2024
@shaka-bot shaka-bot added the status: archived Archived and locked; will not be updated label Sep 14, 2024
@shaka-project shaka-project locked as resolved and limited conversation to collaborators Sep 14, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
component: DASH The issue involves the MPEG DASH manifest format priority: P1 Big impact or workaround impractical; resolve before feature release status: archived Archived and locked; will not be updated type: bug Something isn't working correctly type: performance A performance issue
Projects
None yet
Development

No branches or pull requests

5 participants