Description
The run-length encode/decode step on exercism.io. The tests all pass and I want to submit my solution, but this vector thing iches. Look at this decoding step:
(partition-by alphabet "12AB3CD4E")
;; => ((\1 \2) (\A) (\B) (\3) (\C) (\D) (\4) (\E))
(->> (partition-by alphabet "12AB3CD4E")
(parse-run-sequence))
;; => [[12 "A"] [1 "B"] [3 "C"] [1 "D"] [4 "E"]]
So it is in parse-run-sequence where I build up that vector that I can then just (mapcat #(apply repeat %), and then (apply str) and I get the uncompressed text. But inside there I build up the result as a vector, because I can't figure out a clean way to build it as a seq. In this particular case the input sequence builds one thing from either one or two things. Which to me is ”not a straight mapping”. I'd like to see if I can build a lazy solution before I submit it to my mentor. Maybe mapcat is the way, but I don't see it, if so...
andy.fingerhut 18:04
One bit of food for thought: consider after the partition-by call to do (partition 2 1 ...) on the result, and then process that:
18:04
andy.fingerhut Yesterday at 18:05
The first element of processing that is ((\1 \2) (\A)) , which would become one thing in the output, e.g. [12 "A"] . The second thing would become just [1 "B"] , because the letter \A is not a sequence of digits.
4 replies
pez:calva: 17 hours ago
Wow, yes. Very cool!
pez:calva: 16 hours ago
Almost. It fails for input like "XYZ" , => (((\X) (\Y)) ((\Y) (\Z)))
andy.fingerhut 16 hours ago
Ah, yes, boundary conditions. Doing (partition 2 1 ...) on a sequence with a dummy element added at the beginning or end might help with that.
pez:calva: 16 hours ago
Wonderful!
(def alphabet
(let [upper (range (int \A)
(inc (int \Z)))
lower (range (int \a)
(inc (int \z)))]
(-> #{\space}
(into (map char upper))
(into (map char lower)))))(defn run-length-decode
"decodes a run-length-encoded string"
[encoded-text]
(let [decode-char (fn [[head tail]]
(if (alphabet (first head))
[1 (str (first tail))]
[(Integer/parseInt (apply str head)) (apply str tail)]))]
;; "d" is for "dummy".
;; Prepending it at the start for (partition 2 1)
to chew on
(->> (partition-by alphabet (str "d" encoded-text))
(partition 2 1)
(filter #(alphabet (first (second %))))
(map decode-char)
(mapcat #(apply repeat %))
(apply str))))
user=> (partition 2 1 '((\1 \2) (\A) (\B) (\3) (\C) (\D) (\4) (\E)))
(((\1 \2) (\A)) ((\A) (\B)) ((\B) (\3)) ((\3) (\C)) ((\C) (\D)) ((\D) (\4)) ((\4) (\E)))
andy.fingerhut 18:06
The third element ((\B) (\3)) would become nothing in the output (e.g. using mapcat with a function that returns an empty sequence) because it is a letter followed by a digit sequence.
18:07
(partition 2 1 ...) can be a nice trick that lets you write later code that considers all consecutive pairs of a sequence, without having to explicitly remember state from one step to the next.