Description
Hi,
I have been experimenting with PowerShell-YAML for the last week. It has been quite simple and straightforward to use, but the one area that I feel is lacking is the ability to easily control the formatting of the produced YAML.
My use case is that I need to provide a tag along with the string value. For example, I am seeking to produce key: !tag stringValue but the current behaviour of PowerShell-YAML is to put quotes around the string, so you end up with key: '!tag stringValue'.
I am not complaining about that, because quoting is the correct behaviour when serializing a string that contains an !. I also fully realize that my use case is a bit hacky and that the correct way to achieve it would be to use custom classes in combination with the YamlDotNet WithTagMapping() functionality.
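The quoting behaviour described above can be reproduced with a short snippet. This is only a minimal sketch, assuming the powershell-yaml module is installed and importable in the current session; the key and value are the ones from the example above.

```powershell
# Assumption: the powershell-yaml module is installed (Install-Module powershell-yaml).
Import-Module powershell-yaml

# A string that starts with '!' would be read back as a tag if it were emitted
# as a plain scalar, so the serializer quotes it instead.
$yaml = @{ key = '!tag stringValue' } | ConvertTo-Yaml
$yaml
```

With the behaviour described above, this should show the value wrapped in single quotes rather than emitted as a tagged plain scalar.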
This got me thinking, though: there must be instances where people want to quickly and easily output a YAML tag without messing with classes, or to adjust the output format of certain values to match some arbitrary requirement of the software they are working with. Keep in mind that not all software is fully YAML compliant, and not everyone is fortunate enough to have a say in the software packages that they need to work with.
So I am of the view that it could be very helpful to add some functionality that allows flexible control of output formats at a node level.
To that end, I have been reading up on some of the closed issues and the underlying YamlDotNet library to understand what is and is not possible. I understand that we can make use of the [YamlMemberAttribute()] attribute by adding it to PowerShell classes.
PowerShell example:
class MyClass {
    [YamlMemberAttribute(ScalarStyle = "Plain")]
    [String] $YamlStringPlain = "This is a plain string."

    [YamlMemberAttribute(ScalarStyle = "SingleQuoted")]
    [String] $YamlStringSingleQuoted = "This is a single quoted string."

    [YamlMemberAttribute(ScalarStyle = "DoubleQuoted")]
    [String] $YamlStringDoubleQuoted = "This is a double quoted string."

    [YamlMemberAttribute(ScalarStyle = "Literal")]
    [String] $YamlStringLiteral = "This`n`nis`n`na`n`nliteral`n`nstring."

    [YamlMemberAttribute(ScalarStyle = "Folded")]
    [String] $YamlStringFolded = "This`n`nis`n`na`n`nfolded`n`nstring."

    [YamlMemberAttribute(ScalarStyle = "Plain", Description = "This is a comment")]
    [String] $YamlStringPlainComment = "This is a plain string with a comment."

    [YamlMemberAttribute(ScalarStyle = "SingleQuoted", Description = "This is also a comment")]
    [String] $YamlStringSingleQuotedComment = "This is a single quoted string with a comment."
}
[MyClass]::new() | ConvertTo-Yaml
Outputs:
YamlStringPlain: This is a plain string.
YamlStringSingleQuoted: 'This is a single quoted string.'
YamlStringDoubleQuoted: "This is a double quoted string."
YamlStringLiteral: |-
  This

  is

  a

  literal

  string.
YamlStringFolded: |-
  This

  is

  a

  folded

  string.
# This is a comment
YamlStringPlainComment: This is a plain string with a comment.
# This is also a comment
YamlStringSingleQuotedComment: 'This is a single quoted string with a comment.'
Note - I noticed that YamlStringFolded did not serialize correctly: it was emitted as a literal block (|-) rather than a folded one. I haven't looked into that and am unsure whether it is related to PowerShell-YAML or YamlDotNet.
Using the [YamlMemberAttribute()] inside of a PowerShell class is a good option and I think that should be used when appropriate. However, as the attribute is defined at the class definition and the attribute values cannot be changed for different instances of the class, I find that to be quite limiting.
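The per-class nature of attributes can be illustrated without YamlDotNet at all. The sketch below is a stand-in: it uses .NET's built-in DescriptionAttribute instead of YamlMemberAttribute, and the Sample class and Get-TextDescription helper are hypothetical names made up for the illustration. The point is only that attribute values live in the type's metadata, so two instances with different data still share them.

```powershell
# Stand-in example: DescriptionAttribute ships with .NET and is attached to a
# property the same way YamlMemberAttribute would be.
class Sample {
    [System.ComponentModel.DescriptionAttribute('fixed at class definition')]
    [string] $Text
}

$first  = [Sample]@{ Text = 'first' }
$second = [Sample]@{ Text = 'second' }

# Hypothetical helper: reads the attribute from the property metadata of the
# instance's type. Nothing here depends on the instance's own values.
function Get-TextDescription([object] $Instance) {
    $property  = $Instance.GetType().GetProperty('Text')
    $attribute = $property.GetCustomAttributes([System.ComponentModel.DescriptionAttribute], $false)[0]
    return $attribute.Description
}

$descFirst  = Get-TextDescription $first
$descSecond = Get-TextDescription $second
"$descFirst / $descSecond"
```

Both lookups return the same description even though the two objects hold different text, which is why a scalar style set through an attribute cannot vary from one instance to the next.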
I also think that if you are using PowerShell-YAML, you likely want to avoid going down the path of building classes and instead define your YAML directly, using PowerShell syntax to build your object hierarchy, which is both easy to work with and easy to read.
For example, the PowerShell code:
$YAMLStructure = @(
@{
'PlainString' = 'This is a plain string.'
'SingleQuotedString' = 'This is a single quoted string.'
}
@{
'DoubleQuotedString' = 'This is a double quoted string.'
}
)
Outputs:
- SingleQuotedString: This is a single quoted string.
  PlainString: This is a plain string.
- DoubleQuotedString: This is a double quoted string.
So what I propose is that some custom classes be added to the module that strings can be cast to, when working with objects in PowerShell syntax.
For example, the PowerShell code could now look like:
$YAMLStructure = @(
    @{
        'PlainString' = [YamlStringPlain]'This is a plain string.'
        'SingleQuotedString' = [YamlStringSingleQuote]'This is a single quoted string.'
    }
    @{
        'DoubleQuotedString' = [YamlStringDoubleQuote]'This is a double quoted string.'
    }
)
$serializer = [YamlDotNet.Serialization.SerializerBuilder]::new().WithTypeConverter([YamlDoubleQuotedStringConverter]::new()).Build()
$serializer.Serialize($YAMLStructure)
Outputs:
- SingleQuotedString: 'This is a single quoted string.'
  PlainString: This is a plain string.
- DoubleQuotedString: "This is a double quoted string."
I would suggest creating classes for all of the scalar styles supported by YamlDotNet.
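To see which scalar styles would need a matching class, the ScalarStyle enum can be listed from PowerShell. This is a small sketch, assuming powershell-yaml is imported and that the YamlDotNet types are reachable from the session, as they are in the serializer example above; that may not hold for every module version.

```powershell
# Assumption: importing powershell-yaml makes the YamlDotNet types visible here.
Import-Module powershell-yaml

# List the names defined by YamlDotNet's ScalarStyle enum.
$styles = [System.Enum]::GetNames([YamlDotNet.Core.ScalarStyle])
$styles
```

The list includes the styles used in the class example earlier (Plain, SingleQuoted, DoubleQuoted, Literal and Folded), plus Any, which leaves the choice to the emitter.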
I have done some experimenting to see if this is possible and from what I could tell it looks like it could be done relatively easily. I am not a C# pro but this is what I was able to work out to test this.
C# Code:
using System;
using System.Numerics;
using System.Text.RegularExpressions;
using System.Collections.Generic;
using YamlDotNet.Core;
using YamlDotNet.Serialization;
using YamlDotNet.Serialization.EventEmitters;
using YamlDotNet.Core.Events;
using YamlDotNet.Serialization.NamingConventions;
using YamlDotNet.Serialization.ObjectGraphVisitors;
public class YamlDoubleQuotedString
{
    [YamlDotNet.Serialization.YamlMember(ScalarStyle = YamlDotNet.Core.ScalarStyle.DoubleQuoted)]
    private string Value;

    public YamlDoubleQuotedString(string input)
    {
        Value = input;
    }

    public override string ToString()
    {
        return Value;
    }

    // Implicit conversion from YamlDoubleQuotedString to string
    public static implicit operator string(YamlDoubleQuotedString obj)
    {
        return obj.Value;
    }

    // Implicit conversion from string to YamlDoubleQuotedString
    public static implicit operator YamlDoubleQuotedString(string str)
    {
        return new YamlDoubleQuotedString(str);
    }
}

public class YamlDoubleQuotedStringConverter : IYamlTypeConverter {
    public bool Accepts(Type type) {
        return type == typeof(YamlDoubleQuotedString);
    }

    public object ReadYaml(IParser parser, Type type, ObjectDeserializer rootDeserializer) {
        // We don't really need to do any custom deserialization.
        return null;
    }

    public void WriteYaml(IEmitter emitter, object value, Type type, ObjectSerializer serializer) {
        // Emit the wrapped string as a scalar with the double-quoted style forced.
        var quoted = (YamlDoubleQuotedString)value;
        emitter.Emit(new Scalar(AnchorName.Empty, TagName.Empty, quoted.ToString(), ScalarStyle.DoubleQuoted, true, false));
    }
}
Then when used in PowerShell:
$YAMLStructure = @(
@{
'PlainString' = 'This is a plain string.'
'SingleQuotedString' = 'This is a single quoted string.'
}
@{
'DoubleQuotedString' = [YamlDoubleQuotedString]'This is a double quoted string.'
}
)
# For testing I had to create my own serializer so that I could add the extra TypeConverter.
$serializer = [YamlDotNet.Serialization.SerializerBuilder]::new().WithTypeConverter([YamlDoubleQuotedStringConverter]::new()).Build()
$serializer.Serialize($YAMLStructure)
Outputs:
- SingleQuotedString: This is a single quoted string.
  PlainString: This is a plain string.
- DoubleQuotedString: "This is a double quoted string."
I realize that roundtrip serializing and deserializing may not be possible when custom formatting is used, but I am not sure if that is a good reason to prevent exploring useful ways to assist in customizing the serializing process.
Regarding roundtrip serializing/deserializing, it could be worth exploring the idea of deserializing to format-specific string classes based on how each string was formatted in the YAML. Roundtrip serializing/deserializing could then be possible without losing any formatting. I do not know if that is a good idea, but I think it is an interesting one to discuss.
To take this idea a step further, something similar could perhaps be done for producing YAML comments, although I am unsure whether it is a good idea or even technically possible. For example, casting to a [YamlComment] could insert a comment at the current position in the YAML file, and the comment could then potentially be deserialized back to a [YamlComment] object.
If you got this far, thanks for reading!