Commit

docs

vpj committed Jun 30, 2023
1 parent 689842a commit e8a5feb
Showing 10 changed files with 512 additions and 500 deletions.
278 changes: 141 additions & 137 deletions docs/diffusion/stable_diffusion/model/unet.html

Large diffs are not rendered by default.

298 changes: 151 additions & 147 deletions docs/ja/diffusion/stable_diffusion/model/unet.html

Large diffs are not rendered by default.

42 changes: 21 additions & 21 deletions docs/ja/transformers/rope/index.html

Large diffs are not rendered by default.

4 changes: 2 additions & 2 deletions docs/neox/utils/llm_int8.html
@@ -74,7 +74,7 @@
<h1>LLM.int8() on GPT-NeoX</h1>
<p>This implements a utility function to transform a <code class="highlight"><span></span><span class="n">nn</span><span class="o">.</span><span class="n">Linear</span></code>
layer to LLM.int8() linear layer.</p>
-<p><a href="https://papers.labml.ai/paper/eb2bcaee1d0011edaa66a71c10a887e7">LLM.int8() paper</a> shows you can use int8 quantization while handling outliers to reduce memory footprint without performance degradation in large language models. They convert weights and inputs to scaled 8-bit integers and does matrix multiplication producing int32 results which is then converted back to float16 and rescaled. They show that in large language models, some features can give extreme values (outliers) that dominate the model&#x27;s output. These features get clamped in 8-bit integer space which causes the model performance to degrade. As a solution they pick these outliers (greater than a specified threshold) and compute their multiplications separately in float16 space. Since the percentage of outliers is around 0.01% this doesn&#x27;t increase memory usage, and prevents the model from degrading performance.</p>
+<p><a href="https://papers.labml.ai/paper/eb2bcaee1d0011edaa66a71c10a887e7">LLM.int8() paper</a> shows you can use int8 quantization while handling outliers to reduce memory footprint without performance degradation in large language models. They convert weights and inputs to scaled 8-bit integers and does matrix multiplication producing int32 results which is then converted back to float16 and rescaled. They show that in large langauge models, some features can give extreme values (outliers) that dominate the model&#x27;s output. These features get clamped in 8-bit integer space which causes the model performance to degrade. As a solution they pick these outliers (greater than a specified threshold) and compute their multiplications separately in float16 space. Since the percentage of outliers is around 0.01% this doesn&#x27;t increase memory usage, and prevents the model from degrading performance.</p>
<p>The code to transform GPT-NeoX layers is defined in <a href="../model.html#post_load_prepare">model.py</a>.</p>
<p>Here are example uses of GPT-NeoX with int8 quantization.</p>
<ul><li><a href="../samples/llm_int8.html">Generate Text</a> </li>
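The paragraph above describes the paper's mixed-precision decomposition. As a rough illustration of that scheme, here is a minimal PyTorch sketch; the function name, the row/column-wise absmax scaling, and the threshold default are assumptions for illustration, not the repository's implementation (which, as the text notes, lives in model.py).

```python
import torch

def int8_matmul_with_outliers(x: torch.Tensor, w: torch.Tensor,
                              threshold: float = 6.0) -> torch.Tensor:
    """Hypothetical helper illustrating the LLM.int8() decomposition.

    `x` is an [n, d] input, `w` a [d, m] weight matrix. Feature
    columns of `x` that contain any value above `threshold` are the
    "outliers" and are multiplied in float precision.
    """
    outlier_cols = (x.abs() > threshold).any(dim=0)  # [d] bool mask
    y_fp = x[:, outlier_cols].float() @ w[outlier_cols, :].float()

    # Remaining features: scale by absmax (per row of x, per column
    # of w), round to int8, multiply with int32 accumulation.
    xs = x[:, ~outlier_cols].float()
    ws = w[~outlier_cols, :].float()
    cx = xs.abs().amax(dim=1, keepdim=True).clamp(min=1e-5) / 127.0
    cw = ws.abs().amax(dim=0, keepdim=True).clamp(min=1e-5) / 127.0
    xq = torch.round(xs / cx).to(torch.int8)
    wq = torch.round(ws / cw).to(torch.int8)
    # CPU sketch; production kernels use dedicated int8 GEMMs.
    y_i8 = (xq.to(torch.int32) @ wq.to(torch.int32)).float() * (cx * cw)

    # Outliers are ~0.01% of features, so the float path stays cheap.
    return (y_fp + y_i8).to(x.dtype)

# e.g.: y = int8_matmul_with_outliers(torch.randn(4, 768), torch.randn(768, 3072))
```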
@@ -240,4 +240,4 @@ <h2>Transform a <code class="highlight"><span></span><span class="n">nn</span><
handleImages()
</script>
</body>
-</html>
+</html>
6 changes: 3 additions & 3 deletions docs/sitemap.xml
@@ -484,7 +484,7 @@

<url>
<loc>https://nn.labml.ai/index.html</loc>
-    <lastmod>2023-04-02T16:30:00+00:00</lastmod>
+    <lastmod>2023-06-30T16:30:00+00:00</lastmod>
<priority>1.00</priority>
</url>

@@ -610,7 +610,7 @@

<url>
<loc>https://nn.labml.ai/diffusion/stable_diffusion/model/unet.html</loc>
-    <lastmod>2023-01-19T16:30:00+00:00</lastmod>
+    <lastmod>2023-06-30T16:30:00+00:00</lastmod>
<priority>1.00</priority>
</url>

@@ -932,7 +932,7 @@

<url>
<loc>https://nn.labml.ai/transformers/rope/index.html</loc>
-    <lastmod>2023-04-02T16:30:00+00:00</lastmod>
+    <lastmod>2023-06-28T16:30:00+00:00</lastmod>
<priority>1.00</priority>
</url>
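Each sitemap record above is a standard sitemaps.org <url> entry, and this commit only bumps their lastmod stamps. As a hedged sketch of how such an entry could be regenerated (the helper name and formatting are illustrative assumptions, not the repository's tooling):

```python
from datetime import datetime, timezone
from xml.sax.saxutils import escape

def sitemap_url_entry(loc: str, priority: float = 1.00) -> str:
    # Stamp the entry with the current UTC time, matching the
    # 2023-06-30T16:30:00+00:00 format used in the file above.
    lastmod = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S+00:00")
    return ("<url>\n"
            f"    <loc>{escape(loc)}</loc>\n"
            f"    <lastmod>{lastmod}</lastmod>\n"
            f"    <priority>{priority:.2f}</priority>\n"
            "</url>")

print(sitemap_url_entry("https://nn.labml.ai/transformers/rope/index.html"))
```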

42 changes: 21 additions & 21 deletions docs/transformers/rope/index.html

Large diffs are not rendered by default.

292 changes: 148 additions & 144 deletions docs/zh/diffusion/stable_diffusion/model/unet.html

Large diffs are not rendered by default.

42 changes: 21 additions & 21 deletions docs/zh/transformers/rope/index.html

Large diffs are not rendered by default.

4 changes: 2 additions & 2 deletions translate_cache/diffusion/stable_diffusion/model/unet.ja.json
@@ -6,7 +6,7 @@
"<h2>U-Net model</h2>\n": "<h2>U-\u30cd\u30c3\u30c8\u30e2\u30c7\u30eb</h2>\n",
"<h3>Group normalization with float32 casting</h3>\n": "<h3>float32 \u30ad\u30e3\u30b9\u30c6\u30a3\u30f3\u30b0\u306b\u3088\u308b\u30b0\u30eb\u30fc\u30d7\u6b63\u898f\u5316</h3>\n",
"<h3>Group normalization</h3>\n<p>This is a helper function, with fixed number of groups..</p>\n": "<h3>\u30b0\u30eb\u30fc\u30d7\u6b63\u898f\u5316</h3>\n<p>\u3053\u308c\u306f\u30b0\u30eb\u30fc\u30d7\u6570\u304c\u56fa\u5b9a\u3055\u308c\u305f\u30d8\u30eb\u30d1\u30fc\u95a2\u6570\u3067\u3059\u3002</p>\n",
"<h3>Sequential block for modules with different inputs</h3>\n<p>This sequential module can compose of different modules suck as <span translate=no>_^_0_^_</span>, <span translate=no>_^_1_^_</span> and <span translate=no>_^_2_^_</span> and calls them with the matching signatures</p>\n": "<h3>\u5165\u529b\u304c\u7570\u306a\u308b\u30e2\u30b8\u30e5\u30fc\u30eb\u7528\u306e\u30b7\u30fc\u30b1\u30f3\u30b7\u30e3\u30eb\u30d6\u30ed\u30c3\u30af</h3>\n<p>\u3053\u306e\u30b7\u30fc\u30b1\u30f3\u30b7\u30e3\u30eb\u30e2\u30b8\u30e5\u30fc\u30eb\u306f\u3001\u8907\u6570\u306e\u30e2\u30b8\u30e5\u30fc\u30eb\u304b\u3089\u69cb\u6210\u3067\u304d<span translate=no>_^_0_^_</span>\u3001<span translate=no>_^_1_^_</span>\u305d\u308c\u305e\u308c\u3092\u4e00\u81f4\u3059\u308b\u30b7\u30b0\u30cd\u30c1\u30e3\u3067\u547c\u3073\u51fa\u3059\u3053\u3068\u304c\u3067\u304d\u307e\u3059\u3002<span translate=no>_^_2_^_</span></p>\n",
"<h3>Sequential block for modules with different inputs</h3>\n<p>This sequential module can compose of different modules such as <span translate=no>_^_0_^_</span>, <span translate=no>_^_1_^_</span> and <span translate=no>_^_2_^_</span> and calls them with the matching signatures</p>\n": "<h3>\u5165\u529b\u306e\u7570\u306a\u308b\u30e2\u30b8\u30e5\u30fc\u30eb\u7528\u306e\u30b7\u30fc\u30b1\u30f3\u30b7\u30e3\u30eb\u30d6\u30ed\u30c3\u30af</h3>\n<p>\u3053\u306e\u30b7\u30fc\u30b1\u30f3\u30b7\u30e3\u30eb\u30e2\u30b8\u30e5\u30fc\u30eb\u306f\u3001\u3001<span translate=no>_^_1_^_</span><span translate=no>_^_2_^_</span>\u306a\u3069\u306e\u3055\u307e\u3056\u307e\u306a\u30e2\u30b8\u30e5\u30fc\u30eb\u3067\u69cb\u6210\u3067\u304d<span translate=no>_^_0_^_</span>\u3001\u305d\u308c\u3089\u3092\u5bfe\u5fdc\u3059\u308b\u30b7\u30b0\u30cd\u30c1\u30e3\u3067\u547c\u3073\u51fa\u3059\u3053\u3068\u304c\u3067\u304d\u307e\u3059\u3002</p>\n",
"<h3>Up-sampling layer</h3>\n": "<h3>\u30a2\u30c3\u30d7\u30b5\u30f3\u30d7\u30ea\u30f3\u30b0\u30ec\u30a4\u30e4\u30fc</h3>\n",
"<p> </p>\n": "<p></p>\n",
"<p> Test sinusoidal time step embeddings</p>\n": "<p>\u6b63\u5f26\u6ce2\u30bf\u30a4\u30e0\u30b9\u30c6\u30c3\u30d7\u57cb\u3081\u8fbc\u307f\u306e\u30c6\u30b9\u30c8</p>\n",
@@ -52,7 +52,7 @@
"<ul><li><span translate=no>_^_0_^_</span> is the input feature map of shape <span translate=no>_^_1_^_</span> </li>\n<li><span translate=no>_^_2_^_</span> are the time steps of shape <span translate=no>_^_3_^_</span> </li>\n<li><span translate=no>_^_4_^_</span> conditioning of shape <span translate=no>_^_5_^_</span></li></ul>\n": "<ul><li><span translate=no>_^_0_^_</span>\u306f\u30b7\u30a7\u30a4\u30d7\u306e\u5165\u529b\u30d5\u30a3\u30fc\u30c1\u30e3\u30de\u30c3\u30d7\u3067\u3059 <span translate=no>_^_1_^_</span></li>\n<li><span translate=no>_^_2_^_</span>\u5f62\u72b6\u306e\u30bf\u30a4\u30e0\u30b9\u30c6\u30c3\u30d7\u3067\u3059 <span translate=no>_^_3_^_</span></li>\n<li><span translate=no>_^_4_^_</span>\u5f62\u72b6\u306e\u30b3\u30f3\u30c7\u30a3\u30b7\u30e7\u30cb\u30f3\u30b0 <span translate=no>_^_5_^_</span></li></ul>\n",
"<ul><li><span translate=no>_^_0_^_</span> is the input feature map with shape <span translate=no>_^_1_^_</span> </li>\n<li><span translate=no>_^_2_^_</span> is the time step embeddings of shape <span translate=no>_^_3_^_</span></li></ul>\n": "<ul><li><span translate=no>_^_0_^_</span>\u5f62\u72b6\u4ed8\u304d\u306e\u5165\u529b\u30d5\u30a3\u30fc\u30c1\u30e3\u30de\u30c3\u30d7\u3067\u3059 <span translate=no>_^_1_^_</span></li>\n<li><span translate=no>_^_2_^_</span>\u5f62\u72b6\u306e\u30bf\u30a4\u30e0\u30b9\u30c6\u30c3\u30d7\u57cb\u3081\u8fbc\u307f\u3067\u3059 <span translate=no>_^_3_^_</span></li></ul>\n",
"<ul><li><span translate=no>_^_0_^_</span> is the input feature map with shape <span translate=no>_^_1_^_</span></li></ul>\n": "<ul><li><span translate=no>_^_0_^_</span>\u5f62\u72b6\u4ed8\u304d\u306e\u5165\u529b\u30d5\u30a3\u30fc\u30c1\u30e3\u30de\u30c3\u30d7\u3067\u3059 <span translate=no>_^_1_^_</span></li></ul>\n",
"<ul><li><span translate=no>_^_0_^_</span> is the number of channels in the input feature map </li>\n<li><span translate=no>_^_1_^_</span> is the number of channels in the output feature map </li>\n<li><span translate=no>_^_2_^_</span> is the base channel count for the model </li>\n<li><span translate=no>_^_3_^_</span> number of residual blocks at each level </li>\n<li><span translate=no>_^_4_^_</span> are the levels at which attention should be performed </li>\n<li><span translate=no>_^_5_^_</span> are the multiplicative factors for number of channels for each level </li>\n<li><span translate=no>_^_6_^_</span> the number of attention heads in the transformers</li></ul>\n": "<ul><li><span translate=no>_^_0_^_</span>\u306f\u5165\u529b\u30d5\u30a3\u30fc\u30c1\u30e3\u30de\u30c3\u30d7\u306e\u30c1\u30e3\u30cd\u30eb\u6570\u3067\u3059</li>\n<li><span translate=no>_^_1_^_</span>\u306f\u51fa\u529b\u30d5\u30a3\u30fc\u30c1\u30e3\u30de\u30c3\u30d7\u306e\u30c1\u30e3\u30cd\u30eb\u6570\u3067\u3059</li>\n<li><span translate=no>_^_2_^_</span>\u30e2\u30c7\u30eb\u306e\u30d9\u30fc\u30b9\u30c1\u30e3\u30f3\u30cd\u30eb\u6570\u3067\u3059</li>\n<li><span translate=no>_^_3_^_</span>\u5404\u30ec\u30d9\u30eb\u306e\u6b8b\u7559\u30d6\u30ed\u30c3\u30af\u6570</li>\n<li><span translate=no>_^_4_^_</span>\u6ce8\u610f\u3092\u5411\u3051\u308b\u3079\u304d\u30ec\u30d9\u30eb\u306f</li>\n<li><span translate=no>_^_5_^_</span>\u306f\u5404\u30ec\u30d9\u30eb\u306e\u30c1\u30e3\u30f3\u30cd\u30eb\u6570\u306e\u4e57\u7b97\u4fc2\u6570\u3067\u3059</li>\n<li><span translate=no>_^_6_^_</span>\u30c8\u30e9\u30f3\u30b9\u30d5\u30a9\u30fc\u30de\u30fc\u306e\u30a2\u30c6\u30f3\u30b7\u30e7\u30f3\u30d8\u30c3\u30c9\u306e\u6570</li></ul>\n",
"<ul><li><span translate=no>_^_0_^_</span> is the number of channels in the input feature map </li>\n<li><span translate=no>_^_1_^_</span> is the number of channels in the output feature map </li>\n<li><span translate=no>_^_2_^_</span> is the base channel count for the model </li>\n<li><span translate=no>_^_3_^_</span> number of residual blocks at each level </li>\n<li><span translate=no>_^_4_^_</span> are the levels at which attention should be performed </li>\n<li><span translate=no>_^_5_^_</span> are the multiplicative factors for number of channels for each level </li>\n<li><span translate=no>_^_6_^_</span> is the number of attention heads in the transformers </li>\n<li><span translate=no>_^_7_^_</span> is the number of transformer layers in the transformers </li>\n<li><span translate=no>_^_8_^_</span> is the size of the conditional embedding in the transformers</li></ul>\n": "<ul><li><span translate=no>_^_0_^_</span>\u306f\u3001\u5165\u529b\u30d5\u30a3\u30fc\u30c1\u30e3\u30de\u30c3\u30d7\u306e\u30c1\u30e3\u30cd\u30eb\u6570\u3067\u3059</li>\n<li><span translate=no>_^_1_^_</span>\u306f\u51fa\u529b\u30d5\u30a3\u30fc\u30c1\u30e3\u30de\u30c3\u30d7\u306e\u30c1\u30e3\u30cd\u30eb\u6570\u3067\u3059\u3002</li>\n<li><span translate=no>_^_2_^_</span>\u306f\u30e2\u30c7\u30eb\u306e\u30d9\u30fc\u30b9\u30c1\u30e3\u30f3\u30cd\u30eb\u6570</li>\n<li><span translate=no>_^_3_^_</span>\u5404\u30ec\u30d9\u30eb\u306e\u6b8b\u5dee\u30d6\u30ed\u30c3\u30af\u6570</li>\n<li><span translate=no>_^_4_^_</span>\u6ce8\u610f\u3059\u3079\u304d\u30ec\u30d9\u30eb\u306f\u3069\u308c\u3050\u3089\u3044\u306e\u30ec\u30d9\u30eb\u304b</li>\n<li><span translate=no>_^_5_^_</span>\u306f\u5404\u30ec\u30d9\u30eb\u306e\u30c1\u30e3\u30f3\u30cd\u30eb\u6570\u306e\u4e57\u6cd5\u4fc2\u6570</li>\n<li><span translate=no>_^_6_^_</span>\u306f\u5909\u5727\u5668\u5185\u306e\u30a2\u30c6\u30f3\u30b7\u30e7\u30f3\u30d8\u30c3\u30c9\u306e\u6570\u3067\u3059</li>\n<li><span translate=no>_^_7_^_</span>\u306f\u5909\u5727\u5668\u5185\u306e\u5909\u5727\u5668\u5c64\u306e\u6570\u3067\u3059\u3002</li>\n<li><span translate=no>_^_8_^_</span>\u306f\u30c8\u30e9\u30f3\u30b9\u30d5\u30a9\u30fc\u30de\u30fc\u5185\u306e\u6761\u4ef6\u4ed8\u304d\u57cb\u3081\u8fbc\u307f\u306e\u30b5\u30a4\u30ba\u3067\u3059</li></ul>\n",
"<ul><li><span translate=no>_^_0_^_</span> is the number of channels</li></ul>\n": "<ul><li><span translate=no>_^_0_^_</span>\u306f\u30c1\u30e3\u30cd\u30eb\u6570</li></ul>\n",
"<ul><li><span translate=no>_^_0_^_</span> the number of input channels </li>\n<li><span translate=no>_^_1_^_</span> the size of timestep embeddings </li>\n<li><span translate=no>_^_2_^_</span> is the number of out channels. defaults to `channels.</li></ul>\n": "<ul><li><span translate=no>_^_0_^_</span>\u5165\u529b\u30c1\u30e3\u30f3\u30cd\u30eb\u6570</li>\n<li><span translate=no>_^_1_^_</span>\u30bf\u30a4\u30e0\u30b9\u30c6\u30c3\u30d7\u57cb\u3081\u8fbc\u307f\u306e\u30b5\u30a4\u30ba</li>\n</ul><li><span translate=no>_^_2_^_</span>\u306f\u51fa\u529b\u30c1\u30e3\u30f3\u30cd\u30eb\u306e\u6570\u3067\u3059\u3002\u30c7\u30d5\u30a9\u30eb\u30c8\u306f `channels\u3067\u3059\u3002</li>\n",
"Annotated PyTorch implementation/tutorial of the U-Net in stable diffusion.": "\u5b89\u5b9a\u7248\u62e1\u6563\u306b\u304a\u3051\u308bU-Net\u306e\u6ce8\u91c8\u4ed8\u304dPyTorch\u5b9f\u88c5/\u30c1\u30e5\u30fc\u30c8\u30ea\u30a2\u30eb\u3002",