Skip to content

Creating a vector from an array list instead of a seq #18

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Jan 26, 2021

Conversation

narkisr
Copy link
Contributor

@narkisr narkisr commented Jan 23, 2021

A small change that fixes the way java.util.ArrayList is handled, right now when you parse a yml string that has an array you get back a seq, this break the expected contract since you can't access items within that array using an index for example:

services:
  wazuh-master:
    image: wazuh/wazuh-odfe:4.0.4_1.11.0
    hostname: wazuh-master
    restart: always
    ports:
      - "1515:1515"
      - "514:514/udp"
      - "55000:55000"
    environment:
      - ELASTICSEARCH_URL=https://elasticsearch:9200
      - ELASTIC_USERNAME=admin
      - ELASTIC_PASSWORD=...
      - FILEBEAT_SSL_VERIFICATION_MODE=full
      - SSL_CERTIFICATE_AUTHORITIES=/etc/ssl/root-ca.pem
      - SSL_CERTIFICATE=/etc/ssl/filebeat.pem
      - SSL_KEY=/etc/ssl/filebeat.key
      - API_USERNAME=acme-user```

The environment variables are parsed as a seq, this breaks the expected contract with the yml input (which is an array):

> (get-in (yaml/parse-string (slurp "docker-compose.yml") :mark false) [:services :wazuh-master :environment] )

("ELASTICSEARCH_URL=https://elasticsearch:9200"
"ELASTIC_USERNAME=admin"
"ELASTIC_PASSWORD=SecretPassword"
"FILEBEAT_SSL_VERIFICATION_MODE=full"
"SSL_CERTIFICATE_AUTHORITIES=/etc/ssl/root-ca.pem"
"SSL_CERTIFICATE=/etc/ssl/filebeat.pem"
"SSL_KEY=/etc/ssl/filebeat.key"
"API_USERNAME=acme-user")

After this fix the result will be:

> (get-in (yaml/parse-string (slurp "docker-compose.yml") :mark false) [:services :wazuh-worker :environment] )

["ELASTICSEARCH_URL=https://elasticsearch:9200"
 "ELASTIC_USERNAME=admin"
 "ELASTIC_PASSWORD=..."
 "FILEBEAT_SSL_VERIFICATION_MODE=full"
 "SSL_CERTIFICATE_AUTHORITIES=/etc/ssl/root-ca.pem"
 "SSL_CERTIFICATE=/etc/ssl/filebeat.pem"
 "SSL_KEY=/etc/ssl/filebeat.key"]

Thanks

@borkdude
Copy link
Collaborator

@slipset @marcomorain LGTM.

@borkdude
Copy link
Collaborator

@narkisr You might want to capture this behavior in the tests.

@slipset
Copy link
Member

slipset commented Jan 25, 2021

@narkisr Thanks! I'd really appreciate it if you added a test so that we don't inadvertently break this in the future.

@borkdude
Copy link
Collaborator

borkdude commented Jan 26, 2021

I personally would prefer just a test with (is (vector? ...)) to capture the intent rather than editing existing tests.

@narkisr
Copy link
Contributor Author

narkisr commented Jan 26, 2021

Hi @borkdude and @slipset after taking a look into at the existing tests it made more sense to me to change them and use get instead of nth, this forces the returned parsed value to be a vector (a seq will return nil).

As a side note prior to this change using nth in this context had linear runtime (instead of o(1))

@narkisr
Copy link
Contributor Author

narkisr commented Jan 26, 2021

Hi @borkdude I don't mind going that route, but I see two main downsides:

  • I feel that checking for types in Clojure is less idiomatic (usually we test behavior).
  • We bind the code to a specific data structure implementation instead of an interfaces.

Happy to change it any case, just wanted to point out my way of thinking on this :)

@borkdude
Copy link
Collaborator

@narkisr The title of this issue is "Creating a vector from an array list instead of a seq"

If that is the intent, I think it's better to explicitly test this and make only the minimal changes to the code-base to test this by adding a test, while preserving the existing tests.

@narkisr
Copy link
Contributor Author

narkisr commented Jan 26, 2021

Ok, cool just made the change let me know if any thing else is required

Thanks

Copy link
Member

@slipset slipset left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@slipset slipset merged commit b1b9bc1 into clj-commons:master Jan 26, 2021
@borkdude
Copy link
Collaborator

borkdude commented Jan 26, 2021

@slipset It seems the tests broke on master:

lein test :only clj-yaml.core-test/allow-recursive-works

ERROR in (allow-recursive-works) (LinkedHashMap.java:635)
expected: (parse-string recursive-yaml :allow-recursive-keys true)
  actual: java.lang.StackOverflowError: null

Any reason the tests aren't run for PRs?

borkdude added a commit that referenced this pull request Jan 26, 2021
@borkdude
Copy link
Collaborator

@slipset I reverted the commit and that fixed the tests again. It would be nice if PRs were tested before they were merged.

@borkdude
Copy link
Collaborator

borkdude commented Jan 26, 2021

@slipset @narkisr It seems the recursive test broke, but it also seems that it has an infinite recursion. When realizing this, you get a stackoverflow. I'm not sure if the test should maybe change, or if laziness was intended to work like this. Even when taking the first element from the recursive result, you get a stackoverflow.

@borkdude
Copy link
Collaborator

The recursive test was added by @erichaberkorn in #13. Could you comment on this?

@narkisr
Copy link
Contributor Author

narkisr commented Jan 26, 2021

Hi @borkdude just checked again that the tests passed on my branch (iv ran them locally):

$   lein test                            
Compiling 1 source files to /home/re-ops/code/clj-yaml/target/classes

lein test clj-yaml.core-test

Ran 25 tests containing 50 assertions.
0 failures, 0 errors.

Not sure what I am missing here

@borkdude
Copy link
Collaborator

@narkisr That's pretty weird. I get the same error as on master in CI.

@borkdude
Copy link
Collaborator

@narkisr Can you try: lein do clean, test?

@narkisr
Copy link
Contributor Author

narkisr commented Jan 26, 2021

Hi @borkdude just did and got the same result:

$ lein do clean, test
Compiling 1 source files to /home/re-ops/code/clj-yaml/target/classes

lein test clj-yaml.core-test

Ran 25 tests containing 50 assertions.
0 failures, 0 errors.

Maybe our JVM is different? I'm using Amazon corretto:

$ java -version 
openjdk version "1.8.0_252"
OpenJDK Runtime Environment Corretto-8.252.09.1 (build 1.8.0_252-b09)
OpenJDK 64-Bit Server VM Corretto-8.252.09.1 (build 25.252-b09, mixed mode)

Maybe the default stack sizes are different

@slipset
Copy link
Member

slipset commented Jan 26, 2021

Screenshot 2021-01-26 at 13 36 06
From what I can see here and on circle, tests are run against PRs?

Tests fail for me on java 11

@borkdude
Copy link
Collaborator

$ java --version
openjdk 11.0.8 2020-07-14
OpenJDK Runtime Environment AdoptOpenJDK (build 11.0.8+10)
OpenJDK 64-Bit Server VM AdoptOpenJDK (build 11.0.8+10, mixed mode)

@slipset
Copy link
Member

slipset commented Jan 26, 2021

So, Since we can have infinite recursive structures in yaml (TIL), I guess mapv is a no-go, since it will eagerly consume the infinite seq.

@narkisr
Copy link
Contributor Author

narkisr commented Jan 26, 2021

Another option is to enable seqs only when using the recursive flag and document this (use vecs in the standard case), I'm not sure how common recursive yamls are in the wild (could be wrong though)

BTW this still doesn't explain why it seems to be working on JDK 8 :)

@slipset
Copy link
Member

slipset commented Jan 26, 2021

I'm not overly enthused about changing types of things depending on options passed.
We're just a postwalk away from changing seqs into vectors anyway, right?

@narkisr
Copy link
Contributor Author

narkisr commented Jan 26, 2021

Sure, its easy to change once you are aware the seq is in there, personally I discovered this the hard way as I got an exception in a live system (trying to do a get-in).

Thank you @borkdude and @slipset for the help ill leave this PR closed.

@slipset
Copy link
Member

slipset commented Jan 26, 2021

Thank you for your work on this, too bad it wasn't really feasible.

@borkdude
Copy link
Collaborator

@slipset What is the usefulness of an infinitely recursive YAML? I could not even do first on the result without getting a SO. There might be a chance that this test isn't intended the way it is written @erichaberkorn?

@borkdude
Copy link
Collaborator

@slipset I am referring to this test:

(def recursive-yaml "
---
&A
- *A: *A
")

(deftest allow-recursive-works
  (is (thrown-with-msg? YAMLException #"Recursive" (parse-string recursive-yaml)))
  (is (parse-string recursive-yaml :allow-recursive-keys true)))

Calling first on (parse-string recursive-yaml :allow-recursive-keys true) already throws.

@slipset
Copy link
Member

slipset commented Jan 26, 2021

I guess that's just taken from the snakeyaml docs

SnakeYAML now fully supports recursive objects. For instance, the document
&A [ *A ]
will produce a list object containing a reference to itself.

https://bitbucket.org/asomov/snakeyaml/wiki/Documentation

Which doesn't quite explain how infinitely recursive objects are to be handled.

@borkdude
Copy link
Collaborator

Producing vector has come up in #38 and #29. I still think we should have at least an option for this or do it automatically if :allow-recursive-keys isn't truthy, since it's just annoying to have to postwalk the result, imo. YAML isn't fun, but let's not make it even less fun :).

cc @grzm.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants