
Commit a63ae71

final doc tweaks
1 parent bdadc54 commit a63ae71

File tree

3 files changed: +11 -7 lines changed


docs/make.jl

Lines changed: 2 additions & 2 deletions
@@ -9,8 +9,8 @@ makedocs(; modules=[TensorOperations],
 "Manual" => ["man/indexnotation.md",
 "man/functions.md",
 "man/interface.md",
-"man/implementation.md",
-"man/autodiff.md"],
+"man/autodiff.md",
+"man/implementation.md"],
 "Index" => "index/index.md"])

 # Documenter can also automatically deploy documentation to gh-pages.

docs/src/man/indexnotation.md

Lines changed: 3 additions & 2 deletions
@@ -408,8 +408,9 @@ implementation that is used for e.g. arrays with an `eltype` that is not
 permutations, but still in a cache-friendly and multithreaded way (again relying on
 `JULIA_NUM_THREADS > 1`). This implementation can also be used for `BlasFloat` types (but
 will typically be slower), and the use of BLAS can be controlled by explicitly switching the
-backend between `StridedBLAS` and `StridedNative`. Similarly, when different allocation
-strategies are available, their backend can be controlled with the `allocator` keyword.
+backend between `StridedBLAS` and `StridedNative` using the `backend` keyword to
+[`@tensor`](@ref). Similarly, different allocation strategies, when available, can be
+selected using the `allocator` keyword of [`@tensor`](@ref).

 The primitive tensor operations are also implemented for `CuArray` objects of the
 [CUDA.jl](https://github.com/JuliaGPU/CUDA.jl) library. This implementation is essentially a
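
As a minimal sketch of the `backend` keyword this hunk documents: the keyword syntax follows the manual text above, but the exact backend constructors (here written as `TensorOperations.StridedNative()`) are an assumption and may differ between TensorOperations versions.

```julia
using TensorOperations

A = randn(4, 4)
B = randn(4, 4)

# Default behaviour: BLAS-based contraction for `BlasFloat` element types.
@tensor C[i, j] := A[i, k] * B[k, j]

# Assumed usage of the `backend` keyword to force the non-BLAS strided kernel;
# the backend object construction is illustrative, not taken from this commit.
@tensor backend=TensorOperations.StridedNative() D[i, j] := A[i, k] * B[k, j]
```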

src/indexnotation/tensormacros.jl

Lines changed: 6 additions & 3 deletions
@@ -28,11 +28,14 @@ Additional keyword arguments may be passed to control the behavior of the parser
 - `contractcheck`:
 Boolean flag to enable runtime check for contractibility of indices with clearer error messages.
 - `costcheck`:
-Adds runtime checks to ensure that the contraction order is optimal. Can be either `:warn` or `:cache`. The former will issues warnings when sub-optimal expressions are encountered, while the latter will cache the optimal contraction order for each tensor site and calling site.
+Can be either `:warn` or `:cache` and adds runtime checks to compare the compile-time contraction order to the optimal order computed for the actual run time tensor costs.
+If `costcheck == :warn`, warnings are printed for every sub-optimal contraction that is encountered.
+If `costcheck == :cache`, only the most costly run of a particular sub-optimal contraction will be cached in `TensorOperations.costcache`.
+In both cases, a suggestion for the `order` keyword argument is computed to switch to the optimal contraction order.
 - `backend`:
-Inserts a backend call for the different tensor operations.
+Inserts an implementation backend as a final argument in the different tensor operation calls in the generated code.
 - `allocator`:
-Inserts a backend call for the different tensor allocations.
+Inserts an allocation strategy as a final argument in the tensor allocation calls in the generated code.
 """
 macro tensor(args::Vararg{Expr})
 isempty(args) && throw(ArgumentError("No arguments passed to `@tensor`"))
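
A hedged sketch of the `costcheck` keyword described in the updated docstring; the tensor shapes and index labels below are illustrative only.

```julia
using TensorOperations

A = randn(5, 5, 5)
B = randn(5, 5, 5)
C = randn(5, 5)

# Warn at run time whenever the contraction order chosen at parse time is
# sub-optimal for the actual tensor sizes.
@tensor costcheck=:warn D[a, d] := A[a, b, c] * B[b, c, e] * C[e, d]

# Or record the most costly sub-optimal contractions in
# `TensorOperations.costcache` instead of printing warnings.
@tensor costcheck=:cache D[a, d] := A[a, b, c] * B[b, c, e] * C[e, d]
```

In both cases, the suggested value for the `order` keyword reported by the check can be passed back to `@tensor` to fix the contraction order.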
