WI #1752 Implements Replace :: by two characters :. #1787

mayanje · 2020-10-23T08:20:37Z

What has been done:

A new class, GroupToken, represents for instance the token :: as a group of two ':' tokens.
When the REPLACE directive is created, the GroupToken is replaced by its sub-tokens.
The ImportedTokensDocument class, which is used as the token iterator for handling REPLACE, now creates a preprocessed text fragment that is scanned again to get the new set of tokens to iterate over.
The error message for an empty comparison pseudo-text, which was generated by the Scanner, is now generated by the CompilerDirectiveBuilder class.

fm-117 · 2020-10-23T08:49:25Z

Codegen/test/resources/output/TypeCobol/CopyReplace4Colon/CopyReplace4ColonErr.rdz.cbl

@@ -0,0 +1 @@
+


Empty file? Is the file actually required, since no generation occurs because of the previous errors?

fm-117 · 2020-10-23T08:51:37Z

Codegen/test/resources/input/TypeCobol/CopyReplace4Colon/CopyReplace4Colon.rdz.cbl

+       ENVIRONMENT DIVISION.
+       DATA DIVISION.
+       WORKING-STORAGE SECTION.
+       01  xxxENT. COPY YxxxENT REPLACING ==::== BY ====.


Maybe modify one of the tests to have a non-empty replacement text.

fm-117 · 2020-10-23T08:53:34Z

Codegen/test/resources/input/TypeCobol/CopyReplace4Colon/CopyReplace4ColonErr.rdz.cbl

@@ -0,0 +1,10 @@
+       IDENTIFICATION DIVISION.


I don't see the test method for this sample.

fm-117 · 2020-10-23T09:02:06Z

TypeCobol/Compiler/CompilationDocument.cs

+        internal bool IsForCopy
+        {
+            get;
+            set;


CompilationUnit is for programs, so this property can be made read-only and set at construction time (default true, false when called from the CompilationUnit constructor).
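To make the suggestion concrete, here is a minimal sketch of a get-only property set at construction time. The two classes are reduced stand-ins for the real ones: the constructor parameter `isForCopy` and the assumption that CompilationUnit derives from CompilationDocument are illustrative, not taken from the PR.

```csharp
using System;

// Reduced stand-in for TypeCobol's CompilationDocument (assumption: only the property matters here).
public class CompilationDocument
{
    // Get-only auto-property: it can only be assigned in a constructor.
    internal bool IsForCopy { get; }

    // Hypothetical parameter: defaults to true (copy), as suggested in the review.
    public CompilationDocument(bool isForCopy = true)
    {
        IsForCopy = isForCopy;
    }
}

// Reduced stand-in for CompilationUnit, assumed here to derive from CompilationDocument.
public class CompilationUnit : CompilationDocument
{
    // A CompilationUnit is always a program, so it passes false.
    public CompilationUnit() : base(isForCopy: false) { }
}
```

With this shape no caller can flip the flag after construction, which is the point of making it read-only.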

fm-117 · 2020-10-23T09:15:58Z

TypeCobol/Compiler/CupPreprocessor/CompilerDirectiveBuilder.cs

+            }
+            return list;
+        }
+        private bool BuildReplaceOperation(IList<ReplaceOperation> replaceOperations, ref Token comparisonToken, ref Token[] followingComparisonTokens, ref Token replacementToken, ref Token[] replacementTokens, bool replaceTokens, List<Token> operandTokens)


The ANTLR-related duplicate method in TypeCobol.Compiler.Preprocessor.CompilerDirectiveBuilder has not been updated.

I think we can remove the ANTLR Preprocessor.
I've created #1788 about that.

fm-117 · 2020-10-23T09:24:08Z

TypeCobol/Compiler/Preprocessor/ImportedTokensDocument.cs

+                    TypeCobolOptions tcOptions = new TypeCobolOptions();                    
+                    tcOptions.AreForCopyParsing = true;//Doing that any "::" will be treated as two tokens


tcOptions is not used.

Yes, it must take the place of the new TypeCobolOptions().

fm-117 · 2020-10-23T09:25:42Z

TypeCobol/Compiler/Preprocessor/ImportedTokensDocument.cs

+                    FileCompiler fileCompiler = new FileCompiler(initialTextDocumentLines, null, null, new TypeCobolOptions(), null, false, null);
+                    fileCompiler.CompilationResultsForProgram.InitialScanStateForCopy = state;
+                    fileCompiler.CompilationResultsForProgram.UpdateTokensLines();


Do we really need a whole FileCompiler? Using the Scanner only would be cleaner.

I asked myself the same question; the answer is that FileCompiler is what connects everything together properly.

fm-117 · 2020-10-23T09:27:12Z

TypeCobol/Compiler/Preprocessor/ProcessedTokensDocument.cs

+        /// <param name="compilerOptions"></param>
+        /// <param name="allowWhitespaceTokens">This parameters is used to force whitspaces tokens to be also returned.</param>
+        /// <returns></returns>
+        public static ITokensLinesIterator GetProcessedTokensIterator(TextSourceInfo textSourceInfo, ISearchableReadOnlyList<IProcessedTokensLine> lines, TypeCobolOptions compilerOptions, bool allowWhitespaceTokens)


The new allowWhitespaceTokens parameter is always false except for one usage; we could use a default value.

I did that so as not to miss any of the places where it is called, but in the end I think yes.
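The default-value idea can be sketched as follows. The signature is deliberately reduced and hypothetical: the real GetProcessedTokensIterator also takes a TextSourceInfo, the processed lines and the TypeCobolOptions, and tokens are plain strings here only to keep the example self-contained.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ProcessedTokensSketch
{
    // Hypothetical reduced method: whitespace tokens are dropped unless the caller
    // explicitly opts in, so existing call sites need no change.
    public static List<string> GetProcessedTokens(IEnumerable<string> tokens, bool allowWhitespaceTokens = false)
    {
        return tokens
            .Where(t => allowWhitespaceTokens || !string.IsNullOrWhiteSpace(t))
            .ToList();
    }
}
```

Only the single call site that needs whitespace tokens would pass `allowWhitespaceTokens: true`; the others keep their current argument list.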

fm-117 · 2020-10-23T09:28:01Z

TypeCobol/Compiler/Scanner/GroupToken.cs

+    /// <summary>
+    /// A class that represents a Group Of Tokens.
+    /// </summary>
+    public class GroupToken : Token


A TokenGroup class already exists. Can we use it instead (with adaptations if need be)?

Yes, you are right, I must test it.

fm-117 · 2020-10-23T09:28:55Z

TypeCobol/Compiler/Scanner/Scanner.cs

+                            //if (!(followingChar == ' ' || followingChar == ',' || followingChar == ';' || followingChar == '.'))
+                            //{
+                            //    tokensLine.AddDiagnostic(MessageCode.InvalidCharAfterPseudoTextDelimiter, delimiterToken);
+                            //}


Commented-out code may confuse future readers; better to remove it.

This code is commented out because I still need to work out how to emit this diagnostic in the right context.

smedilol · 2020-10-23T12:01:04Z

TypeCobol/Compiler/Scanner/Scanner.cs

+                        // consume two :: chars in token1 and token2
+                        var token1 = ScanOneChar(startIndex, TokenType.ColonSeparator);
+                        var token2 = ScanOneChar(startIndex + 1, TokenType.ColonSeparator);
+                        GroupToken group = new GroupToken(TokenType.QualifiedNameSeparator, startIndex, startIndex + 1, tokensLine);


Why use a GroupToken for the :: token now?

A group of tokens is needed here so that a single token is still returned when GetNextToken() is called. The REPLACE parsing flattens the group into its sub-tokens, and I just wanted to keep returning TokenType.QualifiedNameSeparator.
But in fact the group of tokens can perhaps be avoided if we treat the flag this.compilerOptions.AreForCopyParsing as meaning pure COBOL processing, not TypeCobol.

Currently the TypeCobol operator :: is a single token.
Each : character inside a pseudo-text (like ==:xxxx:==) is a single : token.

In my opinion, this issue should treat the characters :: inside a COPY or inside a pseudo-text as two separate tokens, : and :.
See my proposal in another comment.

After more investigation: the characters :: that need to be replaced must be generated as a Partial Cobol Word token.
E.g.:
move W-Var1:: to W-Var2
W-Var1:: must be a single token of type Partial Cobol Word. Otherwise the replacement won't occur.

smedilol · 2020-10-25T10:45:33Z

TypeCobol.Test/Parser/Programs/Cobol85/CopyReplace4ColonErr.rdz.cbl

+       WORKING-STORAGE SECTION.
+       01  xxxENT. COPY YxxxENT REPLACING ==== BY ====.
+       PROCEDURE DIVISION.
+           MOVE 'A' TO xxxENT-FCT01-Var1.


In fact the test also needs to check that variable names inside the copy are correctly replaced.
Can you add this:
01 xxxENT2. COPY YxxxENT REPLACING ==::== BY ====.
01 xxxENT3. COPY YxxxENT REPLACING ==::== BY ==S==.

And this move:
MOVE 'A' TO xxxENTS-FCT01-Var1.

And also create the copy YxxxENT with the same content as Codegen/test/resources/input/TypeCobol/CopyReplace4Colon/YxxxENT.cpy

Yes, I also need further tests.

Can you also add a test with a mix of empty and non-empty replacing, like:
Copy YDVZOSM:

       01 :MDVZOSM:-A::.
          05 :MDVZOSM:-A::-Var0 pic X.

And the source:

       IDENTIFICATION division.
       PROGRAM-ID. DVZS0OSM.
       data division.
       working-storage section.
       COPY YDVZOSM replacing ==::== by ====
                              ==:MDVZOSM:== by ==MDVZOSM==.
       COPY YDVZOSM replacing ==::== by ==2==
                              ==:MDVZOSM:== by ==Foo==.
       procedure division.
           move 'A' to MDVZOSM-A
           move 'A' to Foo-A2
           .
       end program DVZS0OSM.

fm-117 · 2020-10-26T14:10:51Z

Although the original issue focuses on COPY ... REPLACING, could you add a test with a REPLACE? I think REPLACING and REPLACE should have the same behavior.

       IDENTIFICATION DIVISION.
       PROGRAM-ID. DVZZMFT0.
       data division.
       working-storage section.
       REPLACE ==::== BY ==name1==.
       01 var-:: PIC X.
       REPLACE ==::== BY ====.
       01 var2:: PIC X.
       procedure division.
           display var-name1
           display var2
           goback
           .
       end program DVZZMFT0.

smedilol · 2020-10-26T14:06:06Z

TypeCobol/Compiler/Scanner/Scanner.cs

                    // QualifiedNameSeparator => qualifierName::qualifiedName
-                    if (currentIndex < lastIndex && line[currentIndex + 1] == ':')
+                    if (currentIndex < lastIndex && line[currentIndex + 1] == ':' && (this.compilerOptions == null || !this.compilerOptions.AreForCopyParsing))
                    {


Can you try the following solution:
Modify this condition:

    if (!currentState.InsidePseudoText   //No QualifiedNameSeparator allowed in pseudoText
        && (we are not in a COPY)        //No QualifiedNameSeparator allowed in COPY
        && currentIndex < lastIndex && line[currentIndex + 1] == ':')
    {

Delete line 2307 in the original Scanner.cs, because we now allow the empty pseudo-text ==::==

    // no legal cobol word chars found
    if (index == startIndex + 1 && !CobolChar.IsCobolWordChar(line[index]))
        return false;

Keep your fix in TypeCobol/Compiler/Scanner/Token.cs
Still return a bool from the BuildReplaceOperation method in TypeCobol/Compiler/CupPreprocessor/CompilerDirectiveBuilder.cs to report the diagnostics "REPLACE Empty Comparison Pseudo Text." and "REPLACE Empty Pseudo Text Delimiter"

Otherwise all the rest of the C# code can be deleted.

I didn't test Codegen, but it worked on the Parser part.

My bad, I forgot to launch all tests.
In my previous comment, line 2307 in the original Scanner.cs must be changed to:

    if (index == startIndex + 1 && !CobolChar.IsCobolWordChar(line[index]))
    {
        //Empty partialCobolWord are only allowed inside pseudo text and copy
        if (!(currentState.InsidePseudoText || is inside a COPY))
        {
            return false;
        }
    }

And the MultilineScanState currentState must now be a parameter of the method CheckForPartialCobolWordPattern.

Yes, it works like that!

I am going to create a new PR and give up on this one. I didn't know about the Partial Cobol Word pattern and how it works :-|
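For readers unfamiliar with the rule discussed in this thread, the sketch below isolates it: a `:tag:` pattern with an empty tag (that is, `::`) is accepted only inside pseudo-text or a COPY. This is not the real CheckForPartialCobolWordPattern, whose full logic is not shown in this conversation. The method name, the single boolean standing in for `currentState.InsidePseudoText || is inside a COPY`, and the simplified `IsCobolWordChar` are all assumptions made for illustration.

```csharp
using System;

public static class PartialCobolWordSketch
{
    // Simplified stand-in for CobolChar.IsCobolWordChar (assumption: letters, digits and '-').
    private static bool IsCobolWordChar(char c) => char.IsLetterOrDigit(c) || c == '-';

    // Returns true when line[startIndex] opens a ":tag:" pattern.
    // An empty tag ("::") is accepted only when insidePseudoTextOrCopy is true.
    public static bool StartsPartialWordTag(string line, int startIndex, bool insidePseudoTextOrCopy)
    {
        if (startIndex >= line.Length || line[startIndex] != ':') return false;

        // Skip the tag characters following the opening ':'.
        int index = startIndex + 1;
        while (index < line.Length && IsCobolWordChar(line[index])) index++;

        // The tag must be closed by a second ':'.
        if (index >= line.Length || line[index] != ':') return false;

        // No legal COBOL word character was found between the two ':'.
        bool emptyTag = index == startIndex + 1;
        return !emptyTag || insidePseudoTextOrCopy;
    }
}
```

Under these assumptions, `:MDVZOSM:` is recognized everywhere, while `::` is recognized only when the flag is set, which leaves `::` free to act as the TypeCobol qualified-name separator in ordinary program text.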

mayanje · 2020-10-28T14:36:57Z

@smedilol The last requested test (the one with Copy YDVZOSM) fails with your method but succeeds with the method of this PR. I am looking into why.

mayanje · 2020-10-29T08:21:39Z

As a new solution has been suggested by @smedilol, a new pull request has been created:
#1794

fm-117 · 2020-11-13T11:09:52Z

TypeCobol/Compiler/Preprocessor/CopyTokensLinesIterator.cs

        /// Set the initial position of the iterator with startToken.
        /// </summary>
-        public CopyTokensLinesIterator(string sourceFileName, IReadOnlyList<IProcessedTokensLine> tokensLines, int channelFilter, bool allowWhitespaceTokens)
+        public CopyTokensLinesIterator(string sourceFileName, IReadOnlyList<IProcessedTokensLine> tokensLines, int channelFilter, Func<Token, bool> tokenFilterCallback)


Both channelFilter and tokenFilterCallback are used to select/discard tokens during iteration so they should be merged together, adding a default value (filtering based on channel like before the change).
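One possible shape for the merged parameter is sketched below: a single optional predicate whose default reproduces the former channel-based filtering. Every name here (`SketchToken`, `CopyTokensIteratorSketch`, `OnChannel`, `SourceChannel` and its value 0) is hypothetical and chosen only to keep the example self-contained; the real constructor also takes the source file name and the processed token lines.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical minimal token: only the channel matters for this sketch.
public class SketchToken
{
    public string Text { get; }
    public int Channel { get; }
    public SketchToken(string text, int channel) { Text = text; Channel = channel; }
}

public class CopyTokensIteratorSketch
{
    // Assumed value for the default source-token channel.
    public const int SourceChannel = 0;

    private readonly IReadOnlyList<SketchToken> _tokens;
    private readonly Func<SketchToken, bool> _tokenFilter;

    // Ready-made predicate reproducing the former channel-based behavior.
    public static Func<SketchToken, bool> OnChannel(int channel) => t => t.Channel == channel;

    // A single filter parameter; when omitted, tokens are selected by channel as before.
    public CopyTokensIteratorSketch(IReadOnlyList<SketchToken> tokens, Func<SketchToken, bool> tokenFilter = null)
    {
        _tokens = tokens;
        _tokenFilter = tokenFilter ?? OnChannel(SourceChannel);
    }

    // Enumerates only the tokens accepted by the filter.
    public IEnumerable<SketchToken> Tokens() => _tokens.Where(_tokenFilter);
}
```

A caller that needs whitespace tokens as well would pass its own predicate, while all other callers rely on the default.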

mayanje · 2020-11-17T21:21:07Z

Replaced by PR #1794

mayanje · 2020-11-17T21:21:35Z

Handled by PR #1794

WI #1752 Implements Replace :: by two characters :.

c0d28b0

mayanje requested review from fm-117, rooksdo and smedilol October 23, 2020 08:20

trafico-bot bot added the 🔍 Ready for Review Pull Request is not reviewed yet label Oct 23, 2020

fm-117 requested changes Oct 23, 2020

trafico-bot bot added ⚠️ Changes requested Pull Request needs changes before it can be reviewed again and removed 🔍 Ready for Review Pull Request is not reviewed yet labels Oct 23, 2020

smedilol reviewed Oct 23, 2020

rooksdo linked an issue Oct 26, 2020 that may be closed by this pull request

Handle copy replacing with ==::== #1752

Closed

smedilol suggested changes Oct 26, 2020

smedilol suggested changes Oct 27, 2020

WI #1752 better Whitespaces handling using token filter.

c2c4d68

trafico-bot bot added 🔍 Ready for Review Pull Request is not reviewed yet and removed ⚠️ Changes requested Pull Request needs changes before it can be reviewed again labels Oct 27, 2020

WI #1752 Add test replace variable name inside copy.

3898fd1

smedilol suggested changes Oct 27, 2020

trafico-bot bot added ⚠️ Changes requested Pull Request needs changes before it can be reviewed again and removed 🔍 Ready for Review Pull Request is not reviewed yet labels Oct 27, 2020

rooksdo assigned mayanje Oct 30, 2020

rooksdo added this to the CobolEditor milestone Nov 5, 2020

fm-117 reviewed Nov 13, 2020

mayanje closed this Nov 17, 2020

mayanje deleted the 1752_Handle_copy_replacing_with_4colon branch November 17, 2020 21:21
