Skip to content

Commit

Permalink
add proof
Browse files Browse the repository at this point in the history
  • Loading branch information
linsyking committed Apr 13, 2023
1 parent a49d3ba commit 3034f44
Show file tree
Hide file tree
Showing 5 changed files with 92 additions and 32 deletions.
2 changes: 1 addition & 1 deletion Readme.md
Original file line number Diff line number Diff line change
Expand Up @@ -2,7 +2,7 @@

Meow lang is a programming language that compiles to *catlet*, another language that is hard to read but easy to execute. Meow's syntax is easier to read than catlet, but meow and catlet are equivalent.

(It's called *catlet* because the only available two keywords are `cat` and `let`, *currently*)
The **only** operation is doing "let" (cat can be implemented by let).

Its main design purpose is to do experiments with "string substitution".

Expand Down
45 changes: 29 additions & 16 deletions docs/Encoding.md
Original file line number Diff line number Diff line change
Expand Up @@ -23,28 +23,41 @@ We have to first implement the `encode` and `decode` macro to help us.

Encoding means to encode a string into another string that has some special properties that we can use.

## Code
## Cat

We want `cat` to concatenate two strings. It is possible by using `rep2`.

```meow
encode(s) {
var rep = {
"$" = "\$";
"#" = "\#";
"\" = "\\";
s
};
"#$"+ rep +"$#"
cat(x,y) {
rep2("\$","\",x,"$",y)
}
decode(s) {
var rep = {
"$#" = "";
"#$" = "";
s
rep2(s,x1,y1,x2,y2) {
var s1 = {
"\$$" = enc(y1);
"\$$$" = enc(y2);
"\$$$$" = enc(x2) + "$";
enc(x2) = "\$$$";
enc(x1) = "\$$";
enc(s)
};
dec(s1)
}
```

## Code

```meow
enc(x) {
"$" = "\$";
"\" = "\\";
x
}
dec(x) {
"\\" = "\";
"\#" = "#";
"\$" = "$";
rep
x
}
```
46 changes: 31 additions & 15 deletions docs/proof/main.tex
Original file line number Diff line number Diff line change
Expand Up @@ -52,12 +52,12 @@

\section{Introduction}

String substituion is widely used in computer programming. Many daily tasks need to be done by string substitution.
String substitution is widely used in computer programming. Many daily tasks need to be done by string substitution.

As an example, consider the ``backslash-escaping string'' problem.
When we want to define a string that including special characters in a programming language, we usually use backslash to escape the special characters. For example, in Python, we can define a string that contains a backslash by ``$\backslash \backslash$'', and a string that contains a single quote by ``$\backslash \texttt{"}$''. And it seems that we can always transform a backslash-escaped string back to the original string. But how to prove its correctness? Is it possible that some string cannot be transformed back to the original string?

In this note, we will investigate the theory behind string substituion.
In this note, we will investigate the theory behind string substitution.

\section{Theorems}

Expand All @@ -75,6 +75,8 @@ \section{Theorems}
\item When $B\neq \epsilon$, $[A/B]: \text{String} \rightarrow \text{String}, [A/B]C=C.\mathtt{replace}(B, A)$. ($[A/\epsilon]C$ is \textbf{undefined behavior})
\item $A\sqsubset B\Leftrightarrow \exists C, B = AC$.
\item $A\sqsupset B\Leftrightarrow \exists C, B = CA$.
\item $\prod_i^{m\rightarrow n} X(i) \triangleq X(m)X(m+1)\cdots X(n)$.
\item $\prod_i^{m\leftarrow n} X(i) \triangleq X(n)X(n-1)\cdots X(m)$.
\end{itemize}

\begin{theorem}
Expand Down Expand Up @@ -106,7 +108,7 @@ \section{Theorems}
\begin{align}
A\subset [B/C]S \Leftrightarrow A \subset S
\end{align}
\label{thm:independent substituion}
\label{thm:independent substitution}
\end{lemma}

\begin{proof}
Expand Down Expand Up @@ -211,7 +213,7 @@ \section{Theorems}
\begin{align}
[\sigma/\Omega]S \triangleq [\sigma/\omega_1][\sigma/\omega_2]\cdots[\sigma/\omega_M]S
\end{align}
is well-defined, i.e. the order of substituion does not matter. And when $\Omega=\Sigma$,
is well-defined, i.e. the order of substitution does not matter. And when $\Omega=\Sigma$,
\begin{align}
[\sigma/\Sigma]S = \underbrace{\sigma\cdots\sigma}_{|S|}
\end{align}
Expand All @@ -220,9 +222,10 @@ \section{Theorems}

\begin{theorem}
Assume that $\Sigma = \{\sigma_1,\cdots,\sigma_N\}$.

We define the following functions (let $\mathtt{enc}=\mathtt{enc}_{\sigma_1,\sigma_2},\mathtt{dec}=\mathtt{dec}_{\sigma_1,\sigma_2}$):
\begin{align}
\mathtt{tail}(X) & \triangleq \mathtt{dec}\left([\epsilon/\sigma_1\sigma_1][\epsilon/\sigma_1\sigma_1\sigma_2\sigma_2][\epsilon/\sigma_1\sigma_1\sigma_2\sigma_1](\prod_{i=3}^N[\epsilon/\sigma_1\sigma_1\sigma_i]) (\sigma_1\sigma_1\mathtt{enc}(X))\right) \\
\mathtt{tail}(X) & \triangleq \mathtt{dec}\left([\epsilon/\sigma_1\sigma_1][\epsilon/\sigma_1\sigma_1\sigma_2\sigma_2][\epsilon/\sigma_1\sigma_1\sigma_2\sigma_1](\prod_i^{3\rightarrow N}[\epsilon/\sigma_1\sigma_1\sigma_i]) (\sigma_1\sigma_1\mathtt{enc}(X))\right) \\
\mathtt{head}(X) & \triangleq \mathtt{dec}([\epsilon/\mathtt{enc}(\mathtt{tail}(X))\sigma_1\sigma_1](\mathtt{enc}(X)\sigma_1\sigma_1))
\end{align}
Then we have:
Expand Down Expand Up @@ -260,10 +263,25 @@ \section{Theorems}
(\emph{Multiple Substitution})

Let $n\in \mathbb{N}^*$, and for every $i\neq j, X_i \not\subset X_j$. We define the following functions:
\begin{multline}
\mathtt{rep}_n(S, X_1, Y_1, \cdots, X_n, Y_n) \triangleq \\
\mathtt{dec}_{b,x}\left((\prod_i^{1\rightarrow n} [\mathtt{enc}_{b,x}(Y_i)/x\underbrace{b\cdots b}_{i+1}])(\prod_i^{1\leftarrow n} ([\mathtt{enc}_{b,x}(X_i)b/x\underbrace{b\cdots b}_{i+2}][x\underbrace{b\cdots b}_{i+1}/\mathtt{enc}_{b,x}(X_i)]))\mathtt{enc}_{b,x}(S)\right)
\end{multline}
\label{lem:multiple substitution}
\end{theorem}

It turns out that we can build \texttt{cat} function by using multiple substitution.

\begin{theorem}
Let $X,Y$ be two strings, choose any $a\neq b$ in $\Sigma$, define
\begin{align}
\mathtt{rep}_n(S, X_1, Y_1, \cdots, X_n, Y_n) & \triangleq \mathtt{dec}\left((\prod_{i=1}^n[\mathtt{enc}(Y_i)/x\underbrace{b\cdots b}_{i+1}x])(\prod_{i=1}^n[x\underbrace{b\cdots b}_{i+1}x/\mathtt{enc}(X_i)])\mathtt{enc}(S)\right)
\mathtt{cat}(X,Y) \triangleq \mathtt{rep}_{2}(ab, a, X, b, Y)
\end{align}
which is what we known as the ``string concatenation'' function.
For any string $X,Y$, we have
\begin{align}
\mathtt{cat}(X,Y) = XY
\end{align}
\label{lem:multiple substitution}
\end{theorem}

Now to formally define the escaping function, we need to introduce the concept of \textbf{escaping charset} and \textbf{escaped charset}.
Expand All @@ -275,21 +293,19 @@ \section{Theorems}
\begin{theorem}
(\emph{Character Escaping})

Suppose $f: U\rightarrow V, U=\{u_1,\cdots,u_n\}, f(u_i) = v_i (1\leq i\leq n), f(u_k)=u_k$.
Suppose $f: U\rightarrow V, U=\{u_1,\cdots,u_n\}, f(u_k)=u_k$.
We define:
\begin{align}
% \mathtt{escape}_f(S) & \triangleq (\prod_{i=1}^{k-1}[u_ku_i/u_i])(\prod_{i=k+1}^{n}[u_ku_i/u_i])[u_ku_k/u_k]S \\
% \mathtt{unescape}_f(S) & \triangleq [u_k/u_ku_k] (\prod_{i=1}^{n-k}[u_ku_{n-i+1}/u_{n-i+1}])(\prod_{i=1}^{k-1}[u_ku_{k-i}/u_{k-i}])S
\mathtt{escape}_f(S) & \triangleq (\prod_{i=1}^{n}[u_kf(u_i)/u_i])S \\
\mathtt{unescape}_f(S) & \triangleq (\prod_{i=1}^{n}[u_kf(u_{n-i+1})/u_{n-i+1}])S
\mathtt{escape}_f(S) & \triangleq \mathtt{rep}_n(S,u_1,u_kf(u_1),\cdots, u_n,u_kf(u_n)) \\
\mathtt{unescape}_f(S) & \triangleq \mathtt{rep}_n(S,u_kf(u_1),u_1,\cdots, u_kf(u_n), u_n)
\end{align}
Then, for any string $S$,
\begin{align}
\mathtt{unescape}_f(\mathtt{escape}_f(S)) = S
\end{align}
\end{theorem}

\begin{proof}
By \ref{lem:double substitution}.
\end{proof}
% \begin{proof}
% By \ref{lem:double substitution}.
% \end{proof}
\end{document}
4 changes: 4 additions & 0 deletions examples/single.cat
Original file line number Diff line number Diff line change
@@ -0,0 +1,4 @@
enc x = let "$" "\$" let "\" "\\" x
dec x = let "\\" "\" let "\$" "$" x
catc x y = rep2 "\$" "\" x "$" y
rep2 s x1 y1 x2 y2 = dec let "\$$" enc y1 let "\$$$" enc y2 let "\$$$$" cat enc x2 "$" let enc x2 "\$$$" let enc x1 "\$$" enc s
27 changes: 27 additions & 0 deletions examples/single.meow
Original file line number Diff line number Diff line change
@@ -0,0 +1,27 @@
enc(x) {
"$" = "\$";
"\" = "\\";
x
}

dec(x) {
"\\" = "\";
"\$" = "$";
x
}

catc(x,y) {
rep2("\$","\",x,"$",y)
}

rep2(s,x1,y1,x2,y2) {
var s1 = {
"\$$" = enc(y1);
"\$$$" = enc(y2);
"\$$$$" = enc(x2) + "$";
enc(x2) = "\$$$";
enc(x1) = "\$$";
enc(s)
};
dec(s1)
}

0 comments on commit 3034f44

Please sign in to comment.