Commit e02480f

Update README.md
1 parent 26c542b commit e02480f

1 file changed: +1 -53 lines changed

README.md

Lines changed: 1 addition & 53 deletions
@@ -10,7 +10,6 @@ Learn how to parse XML in Python using libraries like ElementTree, lxml, and SAX
 - [lxml](#lxml)
 - [minidom](#minidom)
 - [SAX Parser](#sax-parser)
-- [untangle](#untangle)
 
 ## Key Concepts of an XML File
 
@@ -378,59 +377,8 @@ Unlike other parsers that load the entire file into memory, SAX processes files
 
 SAX is ideal for efficiently scanning large XML files (e.g., log files) to extract specific information (e.g., error messages). However, if your analysis needs to explore relationships between different data segments, SAX may not be the best choice.
 
-## untangle
-
-[untangle](https://untangle.readthedocs.io/en/latest/) is a lightweight Python library that simplifies XML parsing by allowing you to access XML elements and attributes directly as Python objects. Unlike traditional parsers, which require navigating through hierarchical structures, untangle converts XML documents into nested Python dictionaries. XML elements become dictionary keys, with attributes and text content stored as their corresponding values, making data manipulation easy with standard Python structures.
-
-Untangle is not part of the default Python library and needs to be installed using the following `PyPI` command:
-
-```sh
-pip install untangle
-```
-
-The following example demonstrates how to parse the XML file using the untangle library and access the XML elements:
-
-```python
-import untangle
-import requests
-
-url = "https://brightdata.com/post-sitemap.xml"
-
-response = requests.get(url)
-
-if response.status_code == 200:
-
-    obj = untangle.parse(response.text)
-
-    for url in obj.urlset.url:
-        print(url.loc.cdata.strip())
-else:
-    print("Failed to retrieve XML file from the URL.")
-```
-
-Your output will look like this:
-
-```
-https://brightdata.com/case-studies/powerdrop-case-study
-https://brightdata.com/case-studies/addressing-brand-protection-from-every-angle
-https://brightdata.com/case-studies/taking-control-of-the-digital-shelf-with-public-online-data
-https://brightdata.com/case-studies/the-seo-transformation
-https://brightdata.com/case-studies/data-driven-automated-e-commerce-tools
-https://brightdata.com/case-studies/highly-targeted-influencer-marketing
-https://brightdata.com/case-studies/data-driven-products-for-smarter-shopping-solutions
-https://brightdata.com/case-studies/workplace-diversity-facilitated-by-online-data
-https://brightdata.com/case-studies/alternative-travel-solutions-enabled-by-online-data-railofy
-https://brightdata.com/case-studies/data-intensive-analytical-solutions
-https://brightdata.com/case-studies/canopy-advantage-solutions
-https://brightdata.com/case-studies/seamless-digital-automations
-```
-
-Untangle simplifies XML parsing in Python by converting XML data into easy-to-use Python objects, eliminating the need for complex navigation. However, it requires separate installation as it’s not part of the core Python package.
-
-Use untangle when you need to quickly convert well-formed XML into Python objects for processing. For example, if you’re working with weather data in XML, untangle can help parse the data and create objects for temperature, humidity, and forecast, which can be easily manipulated in your application.
-
 ## Conclusion
 
 Python offers versatile libraries to simplify XML parsing. However, when using the requests library to access files online, you may face quota and throttling issues. [Bright Data](https://brightdata.com/) offers reliable proxy solutions to help bypass these limitations.
 
-If you'd rather skip the scraping and parsing, check out our [dataset marketplace](https://brightdata.com/products/datasets) for free!
+If you'd rather skip the scraping and parsing, check out our [dataset marketplace](https://brightdata.com/products/datasets) for free!
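
The SAX paragraph kept as context in this diff mentions scanning a large log file for error messages. As a rough illustration only, that use case might look like the sketch below, which uses the standard-library `xml.sax` module. The `<log>` / `<entry level="...">` schema, the `ErrorMessageHandler` class, and the `extract_errors` helper are assumptions made up for this example; they do not come from the README.

```python
import io
import xml.sax


class ErrorMessageHandler(xml.sax.ContentHandler):
    """Collects the text of <entry level="ERROR"> elements (hypothetical log schema)."""

    def __init__(self):
        super().__init__()
        self.errors = []
        self._buffer = None  # list of text chunks while inside an ERROR entry, else None

    def startElement(self, name, attrs):
        # Start buffering only for entries flagged as errors.
        if name == "entry" and attrs.get("level") == "ERROR":
            self._buffer = []

    def characters(self, content):
        # SAX may deliver an element's text in several chunks, so accumulate them.
        if self._buffer is not None:
            self._buffer.append(content)

    def endElement(self, name):
        if name == "entry" and self._buffer is not None:
            self.errors.append("".join(self._buffer).strip())
            self._buffer = None


def extract_errors(source):
    """Stream-parse `source` (a file path or binary file object) and return error messages."""
    handler = ErrorMessageHandler()
    xml.sax.parse(source, handler)
    return handler.errors


# Small in-memory stand-in for a large log file (assumed format).
SAMPLE_LOG = b"""<log>
  <entry level="INFO">Service started</entry>
  <entry level="ERROR">Disk quota exceeded</entry>
  <entry level="ERROR">Connection timed out</entry>
</log>"""

errors = extract_errors(io.BytesIO(SAMPLE_LOG))
print(errors)  # ['Disk quota exceeded', 'Connection timed out']
```

Because the handler keeps only the text of matching entries, memory use stays roughly proportional to the number of errors found rather than to the size of the file, which is the trade-off the SAX paragraph describes.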
