Tab becomes &nbsp; characters and is displayed as Â


I have a Word document that contains tabs. When I cut and paste the document contents into the Ephox Live editor, the tabs become &nbsp; characters, as seen in the HTML code view. When published, they appear as Â. Any idea why?

You should never cut and paste from Word directly into Ephox… the text copies across with all the MSWord formatting and additional code and basically screws up your content big time.

Instead, right click and use the Paste Special -> Paste as plain text menu options, and then format your text correctly, as you want it to appear in the published web page.

If you check in the code view tab in Ephox for the problem text you will see that the code is bloated with tons of Microsoft formatting code which will overwrite any CSS etc. on your site.


I did exactly that to see if the problem goes away. I copied the content from Word as plain text and formatted the text with Ephox. The “tabs” I enter look fine in the “preview”. It is only when the content is published to a site that the “tabs” are shown as Â.

Please report this to Technical Support.

When you view your published page in a web browser, what encoding does the browser report? I’m going to assume your published page is being read as ISO-8859-1 when it is really meant to be UTF-8, and this is why some white space and special characters are coming up incorrectly. To fix this you may need to do three things (and again, I will assume that you want to use UTF-8):

1. Make sure your templates have a meta tag defining the character set. This is not strictly necessary, as the HTTP header typically tells the browser the character set, but it is good practice.
2. Check whether your web server is explicitly set to a character set and make the necessary changes there as well.
3. Check your Ephox configuration located in [rx root]\rx_resources\ephox\elj_config.xml and set (or add) the meta tag there to be UTF-8.

When you submit in the content editor, the backend database blob which holds the Ephox field data is in UTF-8 (I believe). You might be submitting in ISO-8859-1, and therefore an unexpected conversion takes place. These character sets agree only on the ASCII range (that’s why an “a” in ISO-8859-1 is still an “a” in UTF-8). A non-breaking space, however, is a single byte in ISO-8859-1 but two bytes in UTF-8, and when those two bytes are read as ISO-8859-1 the first one shows up as Â.
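To make the mismatch described above concrete, here is a small Python sketch. It is only an illustration of the general encoding behaviour, not anything taken from Ephox or IIS, and it assumes the tabs ended up as non-breaking spaces (U+00A0) and that the "wrong" character set is ISO-8859-1.

```python
# Illustration only: how a non-breaking space (U+00A0) can turn into "Â".
# Assumption: the text is stored as UTF-8 but read back as ISO-8859-1.
nbsp = "\u00a0"

# In UTF-8 a non-breaking space is two bytes: 0xC2 0xA0.
utf8_bytes = nbsp.encode("utf-8")

# Reading those bytes as ISO-8859-1 treats each byte as its own character:
# 0xC2 is "Â" and 0xA0 is again a non-breaking space.
misread = utf8_bytes.decode("iso-8859-1")

# If the misread text is then saved as UTF-8, the "Â" becomes part of the file.
double_encoded = misread.encode("utf-8")

print(utf8_bytes)       # b'\xc2\xa0'
print(repr(misread))    # 'Â\xa0'
print(double_encoded)   # b'\xc3\x82\xc2\xa0'
```

So one visible Â per pasted tab is what you would expect if a UTF-8 non-breaking space is decoded with the wrong character set somewhere between the editor and the published page.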

Hope this helps…I had the same exact problem!


P.S. I’m writing this late on a Friday evening so my technical explanation might not be 100% correct :)

First of all, my published output is “aspx” (just in case that matters) and I am running it on IIS 5.x.

When I view the source of my published page in the browser, I see the meta tag as:

<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />

Text as: •ÂÂÂÂ<b>Exercise and Sudden Death: Investiga…

I also checked the “aspx” file offline in WordPad, and these characters are already in the file.

I didn’t change anything in the Ephox config, but I confirmed that it has:
<meta content="text/html; charset=UTF-8" http-equiv="Content-Type"/>
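Since WordPad itself may be showing the file in the wrong character set, one way to see what is really in the published file is to look at the raw bytes. The following is a rough Python sketch; `count_nbsp_encodings` is a made-up helper name, and it only distinguishes two cases: a normal UTF-8 non-breaking space (bytes C2 A0) and one that has been converted twice (bytes C3 82 C2 A0), in which case the Â is genuinely stored in the file.

```python
def count_nbsp_encodings(data: bytes) -> dict:
    """Count how non-breaking spaces are stored in raw file bytes.

    Hypothetical helper for illustration; it only looks for two byte patterns.
    """
    # "Â" + non-breaking space, both encoded as UTF-8 (double-encoded case).
    double_encoded = data.count(b"\xc3\x82\xc2\xa0")
    # Every double-encoded sequence also contains C2 A0 once, so subtract it.
    plain_utf8 = data.count(b"\xc2\xa0") - double_encoded
    return {"utf8": plain_utf8, "double_encoded": double_encoded}


# A bullet followed by two correctly encoded non-breaking spaces.
good = "\u2022\u00a0\u00a0".encode("utf-8")

# One non-breaking space that was misread as ISO-8859-1 and saved again as UTF-8.
bad = "\u00a0".encode("utf-8").decode("iso-8859-1").encode("utf-8")

print(count_nbsp_encodings(good))
print(count_nbsp_encodings(bad))
```

To run it against a real file, read the file in binary mode (for example `open(path, "rb").read()`, where `path` is wherever your published page lives) and pass the bytes in. If only the "utf8" count is non-zero, the file is probably fine and the problem is in how it is served or viewed; if "double_encoded" is non-zero, the conversion happened before the file was written.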

Just wanted to add one more solution to this. For those who are using IIS, under the properties of your website there is an ASP.NET tab. On that tab select Edit Configuration > Application. There you will find the encoding settings AND the culture settings. I had to change both to fix this issue. In my case I entered utf-8 for the encoding, and the culture was en-CA (English Canada).