The error you're encountering is due to an invalid XML character reference in your string. Let's break down what's happening:
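`b&#55357;&#56485;c` contains two XML character references: `&#55357;` and `&#56485;`. These are the two halves of a UTF-16 surrogate pair, but an XML character reference must name a Unicode code point, not a UTF-16 code unit, and the surrogate range (U+D800 to U+DFFF) is not allowed in XML at all. The parser therefore reads `&#55357;` (U+D83D) and `&#56485;` (U+DCA5) as two separate, invalid references, hence the error: `xmlParseCharRef: invalid xmlChar value 55357`

You have two options: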
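Option 1: use the character itself. Because the document is encoded as UTF-8, the emoji can appear literally in the attribute value:

```python
from lxml import etree

xml_string = '''<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<hierarchy a="b💥c"/>'''
etree.fromstring(xml_string.encode('utf-8'))
```

This works because the emoji is a single Unicode character.

Option 2: use one character reference for the real code point. The emoji 💥 is U+1F4A5, the code point that the surrogate pair encodes in UTF-16. You can reference it directly in your XML:

```python
from lxml import etree

xml_string = '''<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<hierarchy a="b&#128165;c"/>'''
etree.fromstring(xml_string.encode('utf-8'))
```

Here, `&#128165;` is the decimal representation of U+1F4A5.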