Dom4j解析xml发生如下错误:
org.dom4j.DocumentException: Error on line 4 of document file:///D:/eclipsework/SelenTest/data/test.xml : Invalid byte 1 of 1-byte UTF-8 sequence. Nested exception: Invalid byte 1 of 1-byte UTF-8 sequence.
at org.dom4j.io.SAXReader.read(SAXReader.java:482)at org.dom4j.io.SAXReader.read(SAXReader.java:264)
at com.baidu.common.ReadXml.main(ReadXml.java:33)
Nested exception:
org.xml.sax.SAXParseException; systemId: file:///D:/eclipsework/SelenTest/data/test.xml; lineNumber: 4; columnNumber: 15; Invalid byte 1 of 1-byte UTF-8 sequence.
at org.apache.xerces.util.ErrorHandlerWrapper.createSAXParseException(Unknown Source)
at org.apache.xerces.util.ErrorHandlerWrapper.fatalError(Unknown Source)
at org.apache.xerces.impl.XMLErrorReporter.reportError(Unknown Source)
at org.apache.xerces.impl.XMLErrorReporter.reportError(Unknown Source)
at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch(Unknown Source)
at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)
at org.apache.xerces.jaxp.SAXParserImpl$JAXPSAXParser.parse(Unknown Source)
at org.dom4j.io.SAXReader.read(SAXReader.java:465)
at org.dom4j.io.SAXReader.read(SAXReader.java:264)
at com.baidu.common.ReadXml.main(ReadXml.java:33)
Caused by: org.apache.xerces.impl.io.MalformedByteSequenceException: Invalid byte 1 of 1-byte UTF-8 sequence.
at org.apache.xerces.impl.io.UTF8Reader.invalidByte(Unknown Source)
at org.apache.xerces.impl.io.UTF8Reader.read(Unknown Source)
at org.apache.xerces.impl.XMLEntityScanner.load(Unknown Source)
at org.apache.xerces.impl.XMLEntityScanner.skipChar(Unknown Source)
... 10 more
Nested exception: org.xml.sax.SAXParseException; systemId: file:///D:/eclipsework/SelenTest/data/test.xml; lineNumber: 4; columnNumber: 15; Invalid byte 1 of 1-byte UTF-8 sequence.
at org.apache.xerces.util.ErrorHandlerWrapper.createSAXParseException(Unknown Source)
at org.apache.xerces.util.ErrorHandlerWrapper.fatalError(Unknown Source)
at org.apache.xerces.impl.XMLErrorReporter.reportError(Unknown Source)
at org.apache.xerces.impl.XMLErrorReporter.reportError(Unknown Source)
at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch(Unknown Source)
at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)
at org.apache.xerces.jaxp.SAXParserImpl$JAXPSAXParser.parse(Unknown Source)
at org.dom4j.io.SAXReader.read(SAXReader.java:465)
at org.dom4j.io.SAXReader.read(SAXReader.java:264)
at com.baidu.common.ReadXml.main(ReadXml.java:33)
Caused by: org.apache.xerces.impl.io.MalformedByteSequenceException: Invalid byte 1 of 1-byte UTF-8 sequence.
at org.apache.xerces.impl.io.UTF8Reader.invalidByte(Unknown Source)
at org.apache.xerces.impl.io.UTF8Reader.read(Unknown Source)
at org.apache.xerces.impl.XMLEntityScanner.load(Unknown Source)
at org.apache.xerces.impl.XMLEntityScanner.skipChar(Unknown Source)
... 10 more
XML文件中<?xml version="1.0" encoding="UTF-8"?>虽然标示了utf8,但是在window下实际上保存的文件是gbk格式
最简单的解决办法就是将xml文件中的<?xml version="1.0" encoding="UTF-8"?> 改成<?xml version="1.0" encoding="GBK"?> 或者用notepad把xml重新保存一遍,保存时字符集选择utf-8