Alignment Blog
JingfengYang committed Feb 18, 2024
1 parent 14c4349 commit 06df7ab
Showing 8 changed files with 279 additions and 1 deletion.

1 change: 1 addition & 0 deletions _includes/head.html
@@ -8,4 +8,5 @@
<link rel="alternate" type="application/rss+xml" title="RSS Feed for {{ site.name }}" href="{{ site.baseurl }}/feed.xml" />
<script src="https://kit.fontawesome.com/5186839478.js" crossorigin="anonymous"></script>
<link rel="stylesheet" href="https://cdn.jsdelivr.net/gh/jpswalsh/academicons@1/css/academicons.min.css">
+<script async defer src="https://buttons.github.io/buttons.js"></script>
</head>
1 change: 1 addition & 0 deletions _site/404.html
@@ -10,6 +10,7 @@
<link rel="alternate" type="application/rss+xml" title="RSS Feed for Jingfeng Yang" href="/feed.xml" />
<script src="https://kit.fontawesome.com/5186839478.js" crossorigin="anonymous"></script>
<link rel="stylesheet" href="https://cdn.jsdelivr.net/gh/jpswalsh/academicons@1/css/academicons.min.css">
+<script async defer src="https://buttons.github.io/buttons.js"></script>
</head>


3 changes: 2 additions & 1 deletion _site/alignment.html
@@ -10,6 +10,7 @@
<link rel="alternate" type="application/rss+xml" title="RSS Feed for Jingfeng Yang" href="/feed.xml" />
<script src="https://kit.fontawesome.com/5186839478.js" crossorigin="anonymous"></script>
<link rel="stylesheet" href="https://cdn.jsdelivr.net/gh/jpswalsh/academicons@1/css/academicons.min.css">
+<script async defer src="https://buttons.github.io/buttons.js"></script>
</head>


@@ -59,7 +60,7 @@ <h1 class="head-title">
<h1 class="page-title">Capability or Alignment? Respect the LLM Base Model’s Capability During Alignment</h1>
<p>I started writing this post on Dec 28th, 2023, and finished it on Feb 14th, 2024. All the opinions are my own, not reflecting the view of my affiliation or any others. Actually, this post largely summarizes and elaborates my thoughts expressed on Twitter since March 21st, 2023, as demonstrated in the Appendix. I also added a lot more detailed thoughts from some early Anthropic and OpenAI papers and blog posts. Thanks Hongye Jin for providing useful feedback on the initial draft.</p>

-<p>Last year saw a boom of LLMs research. Based on the research, one important lesson would be that we should devote most of our efforts to training a general-purpose LLM base model, and leverage it as much as possible after all. I might be opinionated, but through literature review and some practical experience, one general principle is that we need to respect the base model’s capability, when trying to achieve the alignment goal. This argument might be common sense among many people, but it is still controversial among others, especially when it comes to the boundary between capability and alignment.  Feel free to correct me if you have more solid empirical evidence. </p>
+<p>Last year saw a boom of LLMs research. Based on the research, one important lesson would be that we should devote most of our efforts to training a general-purpose LLM base model, and leverage it as much as possible after all. I might be opinionated, but I always believe that one general principle is that we need to respect the base model’s capability, when trying to achieve the alignment goal. This argument might be common sense among many people, but it is still controversial among others, especially when it comes to the boundary between capability and alignment.  Feel free to correct me if you have more solid empirical evidence. </p>

<p>In this post, I will first motivate discussion of the capability and alignment with some evidence on the source of LLM capabilities. Then I will define model capability and alignment respectively. I also discuss their boundaries and why capabilities mainly come from base models. Based on this, I will introduce some principles to respect base model capability during each method of alignment. Finally, I emphasize the importance of evaluation used to show the effectiveness of our principle.</p>

1 change: 1 addition & 0 deletions _site/blog/index.html
@@ -10,6 +10,7 @@
<link rel="alternate" type="application/rss+xml" title="RSS Feed for Jingfeng Yang" href="/feed.xml" />
<script src="https://kit.fontawesome.com/5186839478.js" crossorigin="anonymous"></script>
<link rel="stylesheet" href="https://cdn.jsdelivr.net/gh/jpswalsh/academicons@1/css/academicons.min.css">
+<script async defer src="https://buttons.github.io/buttons.js"></script>
</head>


1 change: 1 addition & 0 deletions _site/gpt.html
@@ -10,6 +10,7 @@
<link rel="alternate" type="application/rss+xml" title="RSS Feed for Jingfeng Yang" href="/feed.xml" />
<script src="https://kit.fontawesome.com/5186839478.js" crossorigin="anonymous"></script>
<link rel="stylesheet" href="https://cdn.jsdelivr.net/gh/jpswalsh/academicons@1/css/academicons.min.css">
+<script async defer src="https://buttons.github.io/buttons.js"></script>
</head>


1 change: 1 addition & 0 deletions _site/index.html
@@ -10,6 +10,7 @@
<link rel="alternate" type="application/rss+xml" title="RSS Feed for Jingfeng Yang" href="/feed.xml" />
<script src="https://kit.fontawesome.com/5186839478.js" crossorigin="anonymous"></script>
<link rel="stylesheet" href="https://cdn.jsdelivr.net/gh/jpswalsh/academicons@1/css/academicons.min.css">
+<script async defer src="https://buttons.github.io/buttons.js"></script>
</head>


1 change: 1 addition & 0 deletions _site/safety.html
@@ -10,6 +10,7 @@
<link rel="alternate" type="application/rss+xml" title="RSS Feed for Jingfeng Yang" href="/feed.xml" />
<script src="https://kit.fontawesome.com/5186839478.js" crossorigin="anonymous"></script>
<link rel="stylesheet" href="https://cdn.jsdelivr.net/gh/jpswalsh/academicons@1/css/academicons.min.css">
+<script async defer src="https://buttons.github.io/buttons.js"></script>
</head>

