fails to parse overpass-turbo export due to BOM #7

Open
michaelkirk opened this issue Jun 8, 2021 · 2 comments
@michaelkirk

To reproduce, run the default overpass-turbo script:

/*
This is an example Overpass query.
Try it out by pressing the Run button above!
You can find more examples with the Load tool.
*/
node
  [amenity=drinking_water]
  ({{bbox}});
out;
  1. click run
  2. click export
  3. then download/copy as raw OSM data

You'll get a file like this (though unzipped):
overpass-export-bom.osm.gz

When I try to process it, the osmio parser explodes with:

thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: Error { pos: 1:1, kind: Syntax("Unexpected characters outside the root element: \u{feff}") }', /Users/mkirk/src/georust/osmio/src/xml/mod.rs:65:25
stack backtrace:
   0: rust_begin_unwind
             at /rustc/9bc8c42bb2f19e745a63f3445f1ac248fb015e53/library/std/src/panicking.rs:493:5
   1: core::panicking::panic_fmt
             at /rustc/9bc8c42bb2f19e745a63f3445f1ac248fb015e53/library/core/src/panicking.rs:92:14
   2: core::option::expect_none_failed
             at /rustc/9bc8c42bb2f19e745a63f3445f1ac248fb015e53/library/core/src/option.rs:1329:5
   3: core::result::Result<T,E>::unwrap
             at /Users/mkirk/.rustup/toolchains/stable-x86_64-apple-darwin/lib/rustlib/src/rust/library/core/src/result.rs:1037:23
   4: <osmio::xml::XMLReader<R> as osmio::OSMReader>::next
             at /Users/mkirk/src/georust/osmio/src/xml/mod.rs:65:22
   5: <osmio::OSMObjectIterator<R> as core::iter::traits::iterator::Iterator>::next
             at /Users/mkirk/src/georust/osmio/src/lib.rs:505:9
   6: osm2fgb::convert_xml
             at ./src/main.rs:63:20
   7: osm2fgb::main
             at ./src/main.rs:46:5
   8: core::ops::function::FnOnce::call_once
             at /Users/mkirk/.rustup/toolchains/stable-x86_64-apple-darwin/lib/rustlib/src/rust/library/core/src/ops/function.rs:227:5

If I open the file in vim, run :set nobomb, and save it as overpass-export-nobom.osm.gz, then osmio processes the input successfully.
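For anyone without vim at hand, the same workaround can be sketched in code. This is a minimal sketch, assuming the export has already been read into memory as bytes and that the BOM is the UTF-8 one (EF BB BF, the encoding of the \u{feff} shown in the panic). The name strip_utf8_bom is made up for illustration and is not part of osmio or xml_rs.

```rust
/// UTF-8 encoding of U+FEFF, the byte order mark.
const UTF8_BOM: &[u8] = b"\xEF\xBB\xBF";

/// Hypothetical helper: returns `input` without a leading UTF-8 BOM,
/// or `input` unchanged when it does not start with one.
fn strip_utf8_bom(input: &[u8]) -> &[u8] {
    input.strip_prefix(UTF8_BOM).unwrap_or(input)
}
```

The returned slice can then be handed to the parser or written back out, which should have the same effect as the :set nobomb round trip.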

@michaelkirk

Maybe a dupe of netvl/xml-rs#155

@michaelkirk

So it seems that this is a known issue - xml_rs explodes when encountering a BOM.

The author's position is that, since the BOM exists "outside of the xml", this should be handled by every user of the crate: callers are expected to strip any BOM before handing input to xml_rs.

Their suggestion is to build something like:

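use std::fs::File;
use std::io::BufReader;

use xml::reader::EventReader;
// Hypothetical wrapper (to be written) that drops a leading BOM from the stream.
use bom_remover::BomRemover;

let file = File::open("file.xml").unwrap();
let file = BufReader::new(file);
let file = BomRemover::new(file);
let reader = EventReader::new(file);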
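One way the BOM-removal step might look with only the standard library is to peek at the buffered bytes and consume the BOM before the parser sees the stream. This is a rough sketch rather than the wrapper the xml_rs author had in mind: skip_utf8_bom is a made-up name, it only handles the UTF-8 BOM, and it assumes the first fill_buf call returns at least three bytes (which should hold for a BufReader over a regular file, but is not guaranteed for every reader).

```rust
use std::io::{self, BufRead};

/// Hypothetical helper: consumes a leading UTF-8 BOM (EF BB BF) from `reader`
/// if one is present. Returns true when a BOM was skipped.
///
/// Assumption: the first `fill_buf` yields at least 3 bytes when the input
/// has that many; a reader that returns shorter chunks could miss the BOM.
fn skip_utf8_bom<R: BufRead>(reader: &mut R) -> io::Result<bool> {
    const UTF8_BOM: &[u8] = b"\xEF\xBB\xBF";
    // Peek at the buffered bytes without consuming them.
    let has_bom = reader.fill_buf()?.starts_with(UTF8_BOM);
    if has_bom {
        // Advance past the BOM so the next read starts at the XML itself.
        reader.consume(UTF8_BOM.len());
    }
    Ok(has_bom)
}
```

After calling this on the BufReader, the same reader could be passed to EventReader::new in place of the BomRemover step above.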
