## Overview
- Remove usage of H5 (and deeper) headings in guides; only headings at H4 or above generate anchor links
## Type of change
**Type:** Fix
## Checklist
<!-- Put an 'x' in all boxes that apply -->
- [ ] I have read the [contributing guidelines](README.md)
- [ ] I have tested my changes locally using `docs dev`
- [ ] All code examples have been tested and work correctly
- [ ] I have used **root relative** paths for internal links
- [ ] I have updated navigation in `src/docs.json` if needed
- [ ] I have gotten approval from the relevant reviewers
- [ ] (Internal team members only / optional) I have created a preview deployment using the [Create Preview Branch workflow](https://github.com/langchain-ai/docs/actions/workflows/create-preview-branch.yml)
## Additional notes
<!-- Any other information that would be helpful for reviewers -->
### `src/langsmith/administration-overview.mdx` (+5 −7)

```diff
@@ -255,9 +255,7 @@ LangSmith has rate limits which are designed to ensure the stability of the serv
 To ensure access and stability, LangSmith will respond with HTTP Status Code 429 indicating that rate or usage limits have been exceeded under the following circumstances:

-#### Scenarios
-
-###### Temporary throughput limit over a 1 minute period at our application load balancer
+#### Temporary throughput limit over a 1 minute period at our application load balancer

 This 429 is the the result of exceeding a fixed number of API calls over a 1 minute window on a per API key/access token basis. The start of the window will vary slightly — it is not guaranteed to start at the start of a clock minute — and may change depending on application deployment events.
@@ -276,7 +274,7 @@ This 429 is thrown by our application load balancer and is a mechanism in place
 The LangSmith SDK takes steps to minimize the likelihood of reaching these limits on run-related endpoints by batching up to 100 runs from a single session ID into a single API call.
 </Note>

-######Plan-level hourly trace event limit
+#### Plan-level hourly trace event limit

 This 429 is the result of reaching your maximum hourly events ingested and is evaluated in a fixed window starting at the beginning of each clock hour in UTC and resets at the top of each new hour.
@@ -291,7 +289,7 @@ This is thrown by our application and varies by plan tier, with organizations on
 | Startup/Plus | 500,000 events | 1 hour |
 | Enterprise | Custom | Custom |

-######Plan-level hourly trace data ingest limit
+#### Plan-level hourly trace data ingest limit

 This 429 is the result of reaching the maximum amount of data ingested across your trace inputs, outputs, and metadata and is evaluated in a fixed window starting at the beginning of each clock hour in UTC and resets at the top of each new hour.
@@ -306,7 +304,7 @@ This is thrown by our application and varies by plan tier, with organizations on
 | Startup/Plus | 5.0GB | 1 hour |
 | Enterprise | Custom | Custom |

-######Plan-level monthly unique traces limit
+#### Plan-level monthly unique traces limit

 This 429 is the result of reaching your maximum monthly traces ingested and is evaluated in a fixed window starting at the beginning of each calendar month in UTC and resets at the beginning of each new month.
@@ -316,7 +314,7 @@ This is thrown by our application and applies only to the Developer Plan Tier wh
 | Developer (no payment on file) | 5,000 traces | 1 month |

-######Self-configured monthly usage limits
+#### Self-configured monthly usage limits

 This 429 is the result of reaching your usage limit as configured by your organization admin and is evaluated in a fixed window starting at the beginning of each calendar month in UTC and resets at the beginning of each new month.
```
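Since every scenario in this file surfaces as an HTTP 429, callers typically pair it with client-side retries. A minimal sketch of exponential backoff with jitter; the `send_request` callable and its `status_code` return shape are hypothetical stand-ins, not part of the LangSmith SDK:

```python
import random
import time


def call_with_backoff(send_request, max_retries=5, base_delay=1.0):
    """Retry an HTTP-like call while it returns 429.

    `send_request` is a hypothetical zero-argument callable returning an
    object with a `status_code` attribute. Any non-429 response is
    returned immediately; 429 triggers exponential backoff with jitter.
    """
    for attempt in range(max_retries):
        response = send_request()
        if response.status_code != 429:
            return response
        # Backoff doubles each attempt (1s, 2s, 4s, ...) plus proportional jitter.
        time.sleep(base_delay * (2 ** attempt) + random.random() * base_delay)
    raise RuntimeError("rate limit still exceeded after retries")
```

Because the hourly and monthly limits reset on fixed UTC windows, backoff alone will not recover from a plan-level limit mid-window; it mainly smooths over the per-minute load-balancer limit.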
```diff
 LangSmith supports a number of experiment configurations which make it easier to run your evals in the manner you want.

-####Repetitions
+### Repetitions

 Running an experiment multiple times can be helpful since LLM outputs are not deterministic and can differ from one repetition to the next. By running multiple repetitions, you can get a more accurate estimate of the performance of your system.

 Repetitions can be configured by passing the `num_repetitions` argument to `evaluate` / `aevaluate` ([Python](https://docs.smith.langchain.com/reference/python/evaluation/langsmith.evaluation._runner.evaluate), [TypeScript](https://docs.smith.langchain.com/reference/js/interfaces/evaluation.EvaluateOptions#numrepetitions)). Repeating the experiment involves both re-running the target function to generate outputs and re-running the evaluators.

 To learn more about running repetitions on experiments, read the [how-to-guide](/langsmith/repetition).

-####Concurrency
+### Concurrency

 By passing the `max_concurrency` argument to `evaluate` / `aevaluate`, you can specify the concurrency of your experiment. The `max_concurrency` argument has slightly different semantics depending on whether you are using `evaluate` or `aevaluate`.

-#####`evaluate`
+#### `evaluate`

 The `max_concurrency` argument to `evaluate` specifies the maximum number of concurrent threads to use when running the experiment. This is both for when running your target function as well as your evaluators.

-#####`aevaluate`
+#### `aevaluate`

 The `max_concurrency` argument to `aevaluate` is fairly similar to `evaluate`, but instead uses a semaphore to limit the number of concurrent tasks that can run at once. `aevaluate` works by creating a task for each example in the dataset. Each task consists of running the target function as well as all of the evaluators on that specific example. The `max_concurrency` argument specifies the maximum number of concurrent tasks, or put another way - examples, to run at once.

-####Caching
+### Caching

 Lastly, you can also cache the API calls made in your experiment by setting the `LANGSMITH_TEST_CACHE` to a valid folder on your device with write access. This will cause the API calls made in your experiment to be cached to disk, meaning future experiments that make the same API calls will be greatly sped up.
```
### `src/langsmith/faq.mdx` (+1 −1)

```diff
@@ -87,7 +87,7 @@ The user will be deprovisioned from your LangSmith organization according to you
 Yes. If your identity provider supports syncing alternate fields to the `displayName` group attribute, you may use an alternate attribute (like `description`) as the `displayName` in LangSmith and retain full customizability of the identity provider group name. Otherwise, groups must follow the specific naming convention described in the [Group Naming Convention](#group-naming-convention) section to properly map to LangSmith roles and workspaces.

-#####_Why is my Okta integration not working?_
+#### _Why is my Okta integration not working?_

 See Okta's troubleshooting guide here: https://help.okta.com/en-us/content/topics/users-groups-profiles/usgp-group-push-troubleshoot.htm.
```
### `src/langsmith/observability-studio.mdx` (+3 −3)

````diff
@@ -21,11 +21,11 @@ Studio supports the following methods for modifying prompts in your graph:
 Studio allows you to edit prompts used inside individual nodes, directly from the graph interface.

-####Graph Configuration
+### Graph Configuration

 Define your [configuration](/oss/langgraph/use-graph-api#add-runtime-configuration) to specify prompt fields and their associated nodes using `langgraph_nodes` and `langgraph_type` keys.

-#####`langgraph_nodes`
+#### `langgraph_nodes`

 - **Description**: Specifies which nodes of the graph a configuration field is associated with.
 - **Value Type**: Array of strings, where each string is the name of a node in your graph.
@@ -38,7 +38,7 @@ Define your [configuration](/oss/langgraph/use-graph-api#add-runtime-configurati
 )
 ```

-#####`langgraph_type`
+#### `langgraph_type`

 - **Description**: Specifies the type of configuration field, which determines how it's handled in the UI.
````
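For reviewers unfamiliar with this page, the `langgraph_nodes` / `langgraph_type` keys it documents attach to a configuration field's JSON schema. A hedged sketch using Pydantic's `json_schema_extra`; the node name `"model"` and the field itself are assumptions for illustration, not taken from the PR:

```python
from pydantic import BaseModel, Field


class Configuration(BaseModel):
    """Graph configuration whose prompt field is editable from Studio."""

    system_prompt: str = Field(
        default="You are a helpful assistant.",
        json_schema_extra={
            # "model" is a hypothetical node name in your graph.
            "langgraph_nodes": ["model"],
            # Marks this field as a prompt so Studio renders a prompt editor.
            "langgraph_type": "prompt",
        },
    )
```

Pydantic merges `json_schema_extra` into the field's JSON schema, which is where Studio reads these keys from.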
### `src/oss/concepts/memory.mdx` (+9 −9)

```diff
@@ -36,11 +36,9 @@ For more information on common techniques for managing messages, see the [Add an
 Long-term memory is a complex challenge without a one-size-fits-all solution. However, the following questions provide a framework to help you navigate the different techniques:

-* [What is the type of memory?](#memory-types) Humans use memories to remember facts ([semantic memory](#semantic-memory)), experiences ([episodic memory](#episodic-memory)), and rules ([procedural memory](#procedural-memory)). AI agents can use memory in the same ways. For example, AI agents can use memory to remember specific facts about a user to accomplish a task.
+* What is the type of memory? Humans use memories to remember facts ([semantic memory](#semantic-memory)), experiences ([episodic memory](#episodic-memory)), and rules ([procedural memory](#procedural-memory)). AI agents can use memory in the same ways. For example, AI agents can use memory to remember specific facts about a user to accomplish a task.
 * [When do you want to update memories?](#writing-memories) Memory can be updated as part of an agent's application logic (e.g., "on the hot path"). In this case, the agent typically decides to remember facts before responding to a user. Alternatively, memory can be updated as a background task (logic that runs in the background / asynchronously and generates memories). We explain the tradeoffs between these approaches in the [section below](#writing-memories).

-### Memory types
-
 Different applications require various types of memory. Although the analogy isn't perfect, examining [human memory types](https://www.psychologytoday.com/us/basics/memory/types-of-memory?ref=blog.langchain.dev) can be insightful. Some research (e.g., the [CoALA paper](https://arxiv.org/pdf/2309.02427)) have even mapped these human memory types to those used in AI agents.

 | Memory Type | What is Stored | Human Example | Agent Example |
@@ -49,23 +47,25 @@ Different applications require various types of memory. Although the analogy isn
 | [Episodic](#episodic-memory) | Experiences | Things I did | Past agent actions |
 | [Procedural](#procedural-memory) | Instructions | Instincts or motor skills | Agent system prompt |

-####Semantic memory
+### Semantic memory

 [Semantic memory](https://en.wikipedia.org/wiki/Semantic_memory), both in humans and AI agents, involves the retention of specific facts and concepts. In humans, it can include information learned in school and the understanding of concepts and their relationships. For AI agents, semantic memory is often used to personalize applications by remembering facts or concepts from past interactions.

 <Note>
 Semantic memory is different from "semantic search," which is a technique for finding similar content using "meaning" (usually as embeddings). Semantic memory is a term from psychology, referring to storing facts and knowledge, while semantic search is a method for retrieving information based on meaning rather than exact matches.
 </Note>

-##### Profile
+Semantic memories can be managed in different ways:
+
+#### Profile

-Semantic memories can be managed in different ways. For example, memories can be a single, continuously updated "profile" of well-scoped and specific information about a user, organization, or other entity (including the agent itself). A profile is generally just a JSON document with various key-value pairs you've selected to represent your domain.
+Memories can be a single, continuously updated "profile" of well-scoped and specific information about a user, organization, or other entity (including the agent itself). A profile is generally just a JSON document with various key-value pairs you've selected to represent your domain.

 When remembering a profile, you will want to make sure that you are **updating** the profile each time. As a result, you will want to pass in the previous profile and [ask the model to generate a new profile](https://github.com/langchain-ai/memory-template) (or some [JSON patch](https://github.com/hinthornw/trustcall) to apply to the old profile). This can be become error-prone as the profile gets larger, and may benefit from splitting a profile into multiple documents or **strict** decoding when generating documents to ensure the memory schemas remains valid.

 

-#####Collection
+#### Collection

 Alternatively, memories can be a collection of documents that are continuously updated and extended over time. Each individual memory can be more narrowly scoped and easier to generate, which means that you're less likely to **lose** information over time. It's easier for an LLM to generate _new_ objects for new information than reconcile new information with an existing profile. As a result, a document collection tends to lead to [higher recall downstream](https://en.wikipedia.org/wiki/Precision_and_recall).
@@ -79,7 +79,7 @@ Finally, using a collection of memories can make it challenging to provide compr
 Regardless of memory management approach, the central point is that the agent will use the semantic memories to [ground its responses](/oss/langchain/retrieval), which often leads to more personalized and relevant interactions.

-####Episodic memory
+### Episodic memory

 [Episodic memory](https://en.wikipedia.org/wiki/Episodic_memory), in both humans and AI agents, involves recalling past events or actions. The [CoALA paper](https://arxiv.org/pdf/2309.02427) frames this well: facts can be written to semantic memory, whereas *experiences* can be written to episodic memory. For AI agents, episodic memory is often used to help an agent remember how to accomplish a task.
@@ -103,7 +103,7 @@ Note that the memory [store](/oss/langgraph/persistence#memory-store) is just on
 See this how-to [video](https://www.youtube.com/watch?v=37VaU7e7t5o) for example usage of dynamic few-shot example selection in LangSmith. Also, see this [blog post](https://blog.langchain.dev/few-shot-prompting-to-improve-tool-calling-performance/) showcasing few-shot prompting to improve tool calling performance and this [blog post](https://blog.langchain.dev/aligning-llm-as-a-judge-with-human-preferences/) using few-shot example to align an LLMs to human preferences.
 :::

-####Procedural memory
+### Procedural memory

 [Procedural memory](https://en.wikipedia.org/wiki/Procedural_memory), in both humans and AI agents, involves remembering the rules used to perform tasks. In humans, procedural memory is like the internalized knowledge of how to perform tasks, such as riding a bike via basic motor skills and balance. Episodic memory, on the other hand, involves recalling specific experiences, such as the first time you successfully rode a bike without training wheels or a memorable bike ride through a scenic route. For AI agents, procedural memory is a combination of model weights, agent code, and agent's prompt that collectively determine the agent's functionality.
```
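The profile-vs-collection distinction in this file reduces to "merge into one document" versus "append new documents". A toy sketch of that contrast; `update_profile` and `add_memory` are made-up helpers, not a LangGraph or LangSmith API:

```python
def update_profile(profile: dict, new_facts: dict) -> dict:
    """Profile-style memory: merge new facts into a single JSON document,
    overwriting stale keys (a real system would use an LLM or JSON patch)."""
    return {**profile, **new_facts}


def add_memory(collection: list, memory: str) -> list:
    """Collection-style memory: append a new, narrowly scoped document,
    so existing memories are never rewritten (and never lost)."""
    return collection + [memory]
```

The profile path risks losing or corrupting fields on each rewrite; the collection path trades that for the retrieval problem of picking relevant documents later.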