Question
Wednesday, November 22, 2017 2:31 PM
When I try to parse an XML document, from time to time it contains an invalid character. I would like to be able to remove it without having to go into the document myself. This is the error I get: "System.Xml.XmlException: '_', hexadecimal value 0x02, is an invalid character." for the following line of XML -
Thanks in advance for the help.
All replies (8)
Wednesday, November 22, 2017 8:32 PM | 1 vote
So what do you do to fix it when you do it manually?
Apparently somehow there is an invalid character being added to the document along its way to you or maybe you are creating the file and inserting this character somehow without knowing it. That would be the best place to fix the problem so it never gets put in the xml file to begin with.
However, maybe you have no control over the creation of the file. In that case, an XML file is basically just a glorified text file, so you can open it, remove every Chr(2) (that is, Chr(&H02)) character using standard text file methods, and save it back to the hard drive. You could set up something like this to look for invalid characters and remove them if needed before opening and reading the file with the XML methods:
Imports System.Linq
Imports System.Xml

Public Class Form1
    Private Sub Button2_Click(sender As Object, e As EventArgs) Handles Button2.Click
        Const filePath As String = "C:\TestFolder\MyFile.xml"
        ' Read the whole file as text; this assumes the file is UTF-8 encoded.
        Dim xmlText As String = IO.File.ReadAllText(filePath, System.Text.Encoding.UTF8)
        ' Rewrite the file only if it contains at least one character that is not valid in XML.
        If xmlText.Any(Function(c) Not XmlConvert.IsXmlChar(c)) Then
            Dim cleaned As New String(xmlText.Where(Function(c) XmlConvert.IsXmlChar(c)).ToArray())
            IO.File.WriteAllText(filePath, cleaned, System.Text.Encoding.UTF8)
        End If
        'now open and read the xml file...
    End Sub
End Class
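One caveat with filtering one Char at a time: as far as I can tell, XmlConvert.IsXmlChar returns False for each half of a surrogate pair, so characters above U+FFFF would be stripped along with the real offenders. A minimal sketch of a variant that keeps valid pairs is below. The module and function names (XmlCleaner, RemoveInvalidXmlChars) are made up for illustration; only XmlConvert.IsXmlChar and XmlConvert.IsXmlSurrogatePair come from the framework.

```vb
Imports System.Text
Imports System.Xml

Module XmlCleaner
    ' Hypothetical helper: returns a copy of the input without characters that are invalid in XML.
    Public Function RemoveInvalidXmlChars(input As String) As String
        Dim result As New StringBuilder(input.Length)
        Dim i As Integer = 0
        While i < input.Length
            Dim c As Char = input(i)
            If XmlConvert.IsXmlChar(c) Then
                result.Append(c)
            ElseIf i + 1 < input.Length AndAlso
                   XmlConvert.IsXmlSurrogatePair(lowChar:=input(i + 1), highChar:=c) Then
                ' Keep a valid surrogate pair (one character above U+FFFF) intact.
                result.Append(c).Append(input(i + 1))
                i += 1
            End If
            ' Anything else (such as Chr(2)) is skipped.
            i += 1
        End While
        Return result.ToString()
    End Function
End Module
```

It could be dropped into the button handler above by passing the text from ReadAllText through RemoveInvalidXmlChars before writing it back.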
If you say it can`t be done then i`ll try it
Thursday, November 23, 2017 12:14 PM | 1 vote
Have a look at this on SO:
The accepted answer is in C# but it doesn't look like it would be hard to change to VB. I've seen similar utilities on the net for "cleaning the XML", but I've always wondered how the bad characters got there to start with.
If you created the XML to start with then let's talk about what you've got there - that will be the ultimate best solution.
"A problem well stated is a problem half solved." - Charles F. Kettering
Thursday, November 23, 2017 2:44 PM
Frank, I knew you would get in on this one since you seem to answer most of the XML questions around here. I guess I don't use XML enough, because I have never run into the XmlConvert.IsXmlChar method before. It is quite handy for a situation like this if you have no control over the creation of the XML file.
After testing it, I noticed that the file saved back to the hard drive in my prior example was not the same size as the original. It turned out I had not specified UTF8 encoding when reading and writing the file; adding it fixed the size problem for my saved XML file, which uses UTF8 encoding.
Anyway, I am updating my prior example to use both of these fixes, but I wanted to ask you whether there are XML files that use other encodings, like UTF7, UTF32, or even plain ASCII. It seems like I have only ever seen UTF8 in all that I have messed around with.
If you say it can`t be done then i`ll try it
Thursday, November 23, 2017 3:03 PM | 1 vote
I'd still like to know how the blemish got there to start with -- that's the best way to deal with it; prevention. ;-)
"A problem well stated is a problem half solved." - Charles F. Kettering
Thursday, November 23, 2017 3:25 PM | 1 vote
I agree. I mentioned that too, but I figured I would also give an option for fixing the file, just in case Gixxerluke has no control over the creation of it. 8)
If you say it can`t be done then i`ll try it
Thursday, November 23, 2017 3:30 PM | 1 vote
If it has a bunch of odd anomalies in it, can you really count on the XML itself to now be valid though?
Anyway, I hope he gets to the bottom of it all. :)
"A problem well stated is a problem half solved." - Charles F. Kettering
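On the question of whether the cleaned file can still be trusted: one cheap check is to try parsing the cleaned text and see whether it is at least well-formed. This is only a sketch; the module and function names (XmlCheck, IsWellFormedXml) are hypothetical, and passing this check says nothing about whether the data itself survived intact.

```vb
Imports System.Xml
Imports System.Xml.Linq

Module XmlCheck
    ' Hypothetical helper: True if the text parses as well-formed XML, False otherwise.
    Public Function IsWellFormedXml(xmlText As String) As Boolean
        Try
            XDocument.Parse(xmlText)
            Return True
        Catch ex As XmlException
            ' The parser rejected the text (for example, a broken tag or an invalid character).
            Return False
        End Try
    End Function
End Module
```

If this returns False after the invalid characters have been removed, the damage probably goes beyond a stray control character, and the source of the file is the place to look.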
Friday, November 24, 2017 12:40 PM
How are you parsing the XML document? Do you know how the document was corrupted?
"Those who use Application.DoEvents() have no idea what it does and those who know what it does never use it" - MSDN User JohnWein
Multics - An OS ahead of its time. Serial Port Info
Friday, November 24, 2017 1:16 PM | 1 vote
Ray, XML is a markup language similar to HTML, with a focus on data instead of presentation.
Plain ASCII is a 7-bit code that dates from the paper tape era. Like paper tape, it is no longer the best way to encode text (it even contains a lot of control codes for printer handling).
HTML and XML are just text files. Which character encoding is used depends on how the file was written; an XML file can name its encoding in the XML declaration, and UTF8 is assumed when nothing else is indicated.
Success
Cor
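For the encoding question raised earlier in the thread, here is a minimal sketch of reading a file without hard-coding UTF8 for every case. It assumes the file starts with a byte order mark when it is not UTF8; the module and function names (EncodingSketch, ReadAllTextDetectingEncoding) are hypothetical. It does not look at the encoding named in the XML declaration. Handing the file directly to XDocument.Load or XmlReader.Create leaves that to the parser, but that path still fails on the invalid character.

```vb
Imports System.IO
Imports System.Text

Module EncodingSketch
    ' Hypothetical helper: reads a text file and reports the encoding the reader settled on.
    Public Function ReadAllTextDetectingEncoding(filePath As String,
                                                 ByRef detected As Encoding) As String
        ' UTF-8 is only the fallback here; a byte order mark in the file overrides it.
        Using reader As New StreamReader(filePath, Encoding.UTF8, True)
            Dim content As String = reader.ReadToEnd()
            ' CurrentEncoding is only meaningful after the first read.
            detected = reader.CurrentEncoding
            Return content
        End Using
    End Function
End Module
```

Writing the cleaned text back with the detected encoding, rather than always UTF8, should keep the file in the form it arrived in.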