Convert a Python ElementTree to a string

Question 1

Whenever I call ElementTree.tostring(e), I get the following error message:

AttributeError: 'Element' object has no attribute 'getroot'

Is there another way to convert an ElementTree object to an XML string?

Traceback:

Traceback (most recent call last):
  File "Development/Python/REObjectSort/REObjectResolver.py", line 145, in <module>
    cm = integrateDataWithCsv(cm, csvm)
  File "Development/Python/REObjectSort/REObjectResolver.py", line 137, in integrateDataWithCsv
    xmlstr = ElementTree.tostring(et.getroot(),encoding='utf8',method='xml')
AttributeError: 'Element' object has no attribute 'getroot'

Question 2

Element objects have no .getroot() method. Drop that call and the .tostring() call works:

xmlstr = ElementTree.tostring(et, encoding='utf8', method='xml')
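
The error comes from the difference between the two classes: getroot() is defined on ElementTree (the wrapper for a whole document), not on Element (a single node). As a minimal sketch, a helper could accept either type. The name to_xml_bytes is hypothetical and not part of the library:

```python
from xml.etree import ElementTree


def to_xml_bytes(node):
    """Serialize an Element or an ElementTree (hypothetical helper)."""
    # ElementTree wraps a whole document and provides getroot();
    # a plain Element is already the node that tostring() expects.
    if isinstance(node, ElementTree.ElementTree):
        node = node.getroot()
    return ElementTree.tostring(node, encoding="utf-8", method="xml")


root = ElementTree.Element("Person", Name="John")
tree = ElementTree.ElementTree(root)

# Both calls serialize the same root element.
assert to_xml_bytes(root) == to_xml_bytes(tree)
```

With a helper like this, the calling code does not need to know whether it holds the tree or its root element.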

Question 3

How do I convert `ElementTree.Element` to a string?

Python 3:

xml_str = ElementTree.tostring(xml, encoding='unicode')

Python 2:

xml_str = ElementTree.tostring(xml, encoding='utf-8')

For compatibility with both Python 2 and 3:

xml_str = ElementTree.tostring(xml).decode()

Example usage

from xml.etree import ElementTree

xml = ElementTree.Element("Person", Name="John")
xml_str = ElementTree.tostring(xml).decode()
print(xml_str)

Output:

<Person Name="John" />

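Explanation

Despite what the name implies, ElementTree.tostring() returns a bytestring by default in both Python 2 and 3. This is an issue in Python 3, which uses Unicode for strings.

In Python 2 you could use the str type for both text and binary data. Unfortunately this confluence of two different concepts could lead to brittle code which sometimes worked for either kind of data, sometimes not. [...]

To make the distinction between text and binary data clearer and more pronounced, [Python 3] made text and binary data distinct types that cannot blindly be mixed together.

Source: Porting Python 2 Code to Python 3

If you know which Python version is in use, you can specify the encoding as unicode (Python 3) or utf-8 (Python 2). Otherwise, if you need compatibility with both Python 2 and 3, you can use decode() to convert the result to the correct type.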
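
The version-dependent choice described above could be wrapped in one function. This is only a sketch: element_to_text is a hypothetical name, and the Python 2 branch follows the behavior described in this answer rather than anything verified here.

```python
import sys
from xml.etree import ElementTree


def element_to_text(element):
    """Return the element as a text string (hypothetical helper)."""
    if sys.version_info[0] >= 3:
        # Python 3: 'unicode' asks tostring() for str instead of bytes.
        return ElementTree.tostring(element, encoding="unicode")
    # Python 2 (assumption, per the answer above): tostring() returns
    # a byte str, so decode it to get unicode text.
    return ElementTree.tostring(element, encoding="utf-8").decode("utf-8")


xml = ElementTree.Element("Person", Name="John")
print(element_to_text(xml))
```

Unlike a bare .decode(), this variant passes an explicit encoding on each branch, so non-ASCII characters are not replaced with numeric character references.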

For reference, here is a comparison of .tostring() results between Python 2 and Python 3.

ElementTree.tostring(xml)
# Python 3: b'<Person Name="John" />'
# Python 2: <Person Name="John" />

ElementTree.tostring(xml, encoding='unicode')
# Python 3: <Person Name="John" />
# Python 2: LookupError: unknown encoding: unicode

ElementTree.tostring(xml, encoding='utf-8')
# Python 3: b'<Person Name="John" />'
# Python 2: <Person Name="John" />

ElementTree.tostring(xml).decode()
# Python 3: <Person Name="John" />
# Python 2: <Person Name="John" />

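Thanks to Martijn Peters for pointing out that the str datatype changed between Python 2 and 3.

Why not use str()?

In most scenarios, using str() would be the "canonical" way to convert an object to a string. Unfortunately, using it with Element returns the object's location in memory as a hex string, rather than a string representation of the object's data.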

from xml.etree import ElementTree

xml = ElementTree.Element("Person", Name="John")
print(str(xml))  # <Element 'Person' at 0x00497A80>

Question 4

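Extension of the answer for non-Latin characters

This extends @Stevoisiak's answer to cover non-Latin characters. Only one call displays non-Latin characters as-is, and that call differs between Python 3 and Python 2.

Input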

xml = ElementTree.fromstring('<Person Name="크리스" />')
xml = ElementTree.Element("Person", Name="크리스")  # Read Note about Python 2

Note: In Python 2, calling tostring(...) after assigning xml with ElementTree.Element("Person", Name="크리스") raises an error:

UnicodeDecodeError: 'ascii' codec can't decode byte 0xed in position 0: ordinal not in range(128)

Output

ElementTree.tostring(xml)
# Python 3 (크리스): b'<Person Name="&#53356;&#47532;&#49828;" />'
# Python 3 (John): b'<Person Name="John" />'

# Python 2 (크리스): <Person Name="&#53356;&#47532;&#49828;" />
# Python 2 (John): <Person Name="John" />


ElementTree.tostring(xml, encoding='unicode')
# Python 3 (크리스): <Person Name="크리스" />             <-------- Python 3
# Python 3 (John): <Person Name="John" />

# Python 2 (크리스): LookupError: unknown encoding: unicode
# Python 2 (John): LookupError: unknown encoding: unicode

ElementTree.tostring(xml, encoding='utf-8')
# Python 3 (크리스): b'<Person Name="\xed\x81\xac\xeb\xa6\xac\xec\x8a\xa4" />'
# Python 3 (John): b'<Person Name="John" />'

# Python 2 (크리스): <Person Name="크리스" />             <-------- Python 2
# Python 2 (John): <Person Name="John" />

ElementTree.tostring(xml).decode()
# Python 3 (크리스): <Person Name="&#53356;&#47532;&#49828;" />
# Python 3 (John): <Person Name="John" />

# Python 2 (크리스): <Person Name="&#53356;&#47532;&#49828;" />
# Python 2 (John): <Person Name="John" />
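
The numeric character references in the default output are still valid XML. As a small Python 3 sketch (the variable names escaped and readable are illustrative only), both forms parse back to the same attribute value, so the choice mainly affects how readable the serialized text is:

```python
from xml.etree import ElementTree

xml = ElementTree.Element("Person", Name="크리스")

# The default encoding is us-ascii, so non-ASCII characters are
# written as numeric character references such as &#53356;.
escaped = ElementTree.tostring(xml).decode()

# encoding='unicode' keeps the characters as-is (Python 3 only).
readable = ElementTree.tostring(xml, encoding="unicode")

# Both forms parse back to the same attribute value.
assert ElementTree.fromstring(escaped).get("Name") == "크리스"
assert ElementTree.fromstring(readable).get("Name") == "크리스"
```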
