- This topic has 8 replies, 3 voices, and was last updated 19 years, 3 months ago by alexiz.
-
AuthorPosts
-
Aaron Digulla (Member)
Hello,
I’ve just loaded a document with the Content-type “text/html; charset=ISO-8859-1”. It contained the following code:
“Japanese: &#23470;”
During loading, the encoded kanji were replaced with the symbol 宮.
Then I tried to save. MyEclipse warned me that the symbols couldn’t be converted to ISO-8859-1 (which is not true; you just have to escape them). So I changed the encoding in the Content-type to UTF-8. Now MyEclipse would save, but in the resulting file the kanji symbols were replaced by “?”.
Suggestions:
1. When a symbol can’t be expressed in a certain encoding, convert it to the escaped form (&#…;)
2. Leave escaped symbols in the input alone
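Both suggestions can be illustrated outside the editor. The following is only a minimal sketch in Python, chosen for brevity; `save_bytes` is a hypothetical helper name, and nothing here reflects how MyEclipse is actually implemented. It relies on the standard `xmlcharrefreplace` error handler, which writes any character the target charset lacks as a decimal `&#…;` reference:

```python
def save_bytes(text: str, charset: str) -> bytes:
    """Encode text for saving (hypothetical helper, not MyEclipse code).

    Characters the charset cannot represent are written as decimal
    numeric character references instead of being dropped or replaced.
    """
    return text.encode(charset, errors="xmlcharrefreplace")


# Suggestion 1: a kanji that ISO-8859-1 cannot express is escaped on save.
escaped = save_bytes("Japanese: \u5bae", "iso-8859-1")

# Suggestion 2: an already escaped reference is plain ASCII, so it passes
# through the encoder unchanged.
untouched = save_bytes("Japanese: &#23470;", "iso-8859-1")

print(escaped)    # b'Japanese: &#23470;'
print(untouched)  # b'Japanese: &#23470;'
```

In this sketch both inputs produce the same bytes on disk, which is the behaviour the two suggestions ask for.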
Thanks,
—
Aaron Digulla
http://www.philmann-dark.de/
Aaron Digulla (Member)
Here is an example HTML. To reproduce the problem:
1. Open Notepad, paste the text into the file, save it into the workspace
2. Refresh the project
3. Open the file in the MyEclipse HTML editor
4. Switch to design mode. The first kanji will show as 雨, the next two will be OK.
5. Click anywhere and type something
6. Save the file
7. You will get a warning: The encoding (ISO-8859-1) cannot … (such as the one in position 376). Press OK.
8. Switch back to Source view: The editor will have changed the escaped kanji to UTF-8. If you load the file from disk with Notepad, you’ll see that it will have written “?” to the file. Now, the editor and the file on disk contain different data.
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<title>Usagi Yojimbo Dojo - Ame Tomoe</title>
<meta http-equiv="Content-type" content="text/html; charset=ISO-8859-1" />
<meta name="distribution" content="global,local" />
</head>
<body>
<h1>Character Information</h1>
<p>&amp;#38632;&amp;#26379;&amp;#32117;</p>
</body>
</html>
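The “?” described in step 8 is what a lossy encode produces. As an assumption only (the thread does not say what the editor does internally), the sketch below shows how encoding the three kanji to ISO-8859-1 with a “replace” policy yields question marks, and why the bytes on disk can then no longer be turned back into the text shown in the editor. `write_lossy` is a hypothetical name used for illustration:

```python
KANJI = "\u96e8\u670b\u7d75"  # the three characters from the example page


def write_lossy(text: str, charset: str) -> bytes:
    """Encode with the 'replace' policy: unencodable characters become '?'."""
    return text.encode(charset, errors="replace")


on_disk = write_lossy(KANJI, "iso-8859-1")
reloaded = on_disk.decode("iso-8859-1")

print(on_disk)            # b'???'
print(reloaded == KANJI)  # False: editor buffer and file now differ
```

A strict encode (`KANJI.encode("iso-8859-1")` with no error handler) raises `UnicodeEncodeError` instead, which corresponds to warning the user rather than silently writing “?”.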
Aaron Digulla (Member)
Sorry for the double post, but for some reason I can neither preview nor edit the post…
Aaron Digulla (Member)
Additional note: To make the example work, remove the three “amp;” in the code after you have pasted it into Notepad.
Riyad Kalla (Member)
Aaron,
Thank you for the very detailed report. I am looking into this now and will file it if I can reproduce it.
Riyad Kalla (Member)
Aaron,
I followed your steps as you posted them and was unable to reproduce this problem. The content I ended up with in the editor and in Notepad, and double-checked in TextPad, was this:
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<title>Usagi Yojimbo Dojo - Ame Tomoe</title>
<meta http-equiv="Content-type" content="text/html; charset=ISO-8859-1" />
<meta name="distribution" content="global,local" />
</head>
<body>
<h1>dCharacter Information</h1>
<p>#38632;#26379;#32117;</p>
</body>
</html>
I did switch to the design view, did add the “p” at the beginning of “Character” and saved it… I wasn’t prompted about any encoding issues, however. When I checked the file encoding in Eclipse, it was in fact ISO-8859-1. I am on Windows XP Pro SP2 with a US English locale.
Did I miss any of the steps you outlined?
Aaron Digulla (Member)
@support-rkalla wrote:
<p>#38632;#26379;#32117;</p>
Did I miss any of the steps you outlined?
Yes, it’s just a slight but important typo: the “&” before each “#” is missing. In my instructions, I said to remove only “amp;”, not the “&” 🙂
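The difference between the three spellings is easy to check with any HTML entity decoder. This is a small sketch using Python's standard `html.unescape`, picked purely for illustration; it makes no claim about how the MyEclipse editor parses the file:

```python
import html

with_amp = "&#38632;&#26379;&#32117;"       # what the test file should contain
without_amp = "#38632;#26379;#32117;"       # what was actually tested
as_posted = "&amp;#38632;&amp;#26379;&amp;#32117;"  # what the forum post shows

decoded = html.unescape(with_amp)        # three kanji characters
plain = html.unescape(without_amp)       # unchanged: not a character reference
one_level = html.unescape(as_posted)     # only "&amp;" is decoded, giving with_amp

print(len(decoded))           # 3
print(plain == without_amp)   # True
print(one_level == with_amp)  # True
```

Without the leading “&”, the text is just literal ASCII, so there is nothing for an editor to convert and no encoding warning is triggered.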
Riyad Kalla (Member)
Hmmm, yes, I see what you mean. I was able to reproduce this problem. I will file it ASAP. Thank you for taking the time to walk me through this.
alexiz (Member)
<%@page contentType="text/html; charset=ISO-2022"%>
Please put the above line at the top of your JSP file. It must be the first line.