Encoding issue publishing asp.net files

I’m working on a project using Rhythmyx 5.6, and we’re currently having an issue when publishing files as ASP.NET (.aspx). If the page contains a “£” character, it is preceded by a “Â” character. I’ve noticed this in the past, and adding

<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />

to the header of the HTML or ASP file fixes the issue.

Unfortunately this doesn’t work for asp.net files. If I publish the file out as html or asp it works fine.

I’ve tried changing the encoding in the web.config file, but that doesn’t seem to work either.

Has anyone else encountered this issue?

Cheers
James

ASP.NET is generally happier with CP-1252, sometimes called “Windows ANSI” to distinguish it from the actual ANSI standard :)

Rhythmyx, of course, can publish in just about any character set. In 5.x, this is set in the properties of the Assembler Query resource. You’ll also want to make sure that the <xsl:output> statement of the stylesheet has the correct encoding.

Of course, this strategy probably isn’t the best if you have to publish lots of Chinese content (for example), but for standard Western European languages, you should be ok. Rhythmyx defaults to UTF-8 because it is universal, but sometimes this isn’t the best choice.
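For anyone wondering where the stray character comes from, here is a minimal Python sketch. Python is used purely for illustration and is not part of Rhythmyx or ASP.NET. It assumes the published file holds UTF-8 bytes that are then read back as CP-1252:

```python
# Illustration only: how "£" picks up a stray character when UTF-8 bytes
# are read back as CP-1252 (Windows-1252).
pound = "£"

utf8_bytes = pound.encode("utf-8")      # two bytes: C2 A3
cp1252_bytes = pound.encode("cp1252")   # one byte: A3

# Decoding the UTF-8 bytes with the wrong code page turns the
# leading C2 byte into a visible "Â" in front of the pound sign.
misread = utf8_bytes.decode("cp1252")

print(utf8_bytes.hex(), cp1252_bytes.hex(), misread)
```

Publishing as CP-1252 avoids the problem because the pound sign is then a single byte that means the same thing to the reader.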

Hi Dave

Thanks for the reply. It fixed the encoding problem, although it created another one. When I set the template to publish as windows-1252, it adds an XML declaration at the top of the template:

<?xml version="1.0" encoding="windows-1252"?>

Even if I set omit-xml-declaration="yes", it still appears:

<xsl:output
        omit-xml-declaration="yes" method="saxon:xhtml"
        doctype-public="-//W3C//DTD XHTML 1.0 Transitional//EN"
        doctype-system="http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"
        indent="yes" encoding="windows-1252"/>

This is the source of the resulting .aspx page:

<?xml version="1.0" encoding="windows-1252"?>
<!DOCTYPE html
  PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" lang="en">
   <head>
      <meta content="Percussion Rhythmyx" name="generator" />

There are now some formatting issues in IE 6, although the page works fine in Firefox and IE 7. You can see it here: http://81.89.131.131/

If I manually remove the xml declaration then the page displays correctly on the webserver.

Cheers
James

Gotta love Microsoft… I suppose you can’t convince people that IE 6 is just broken.

Some other things to try:

  1. Switch to a different output method (e.g. HTML) instead of saxon:xhtml.

  2. Use the “Namespace Cleanup” checkbox on the stylesheet (on the box in the workbench, not in the XSL itself).

I’m sure that there’s a combination that works, I just don’t know what it is, and it’s been a while since I’ve played with this in Rhythmyx 5.

Dave

Hi Dave

Changing the output method to html fixed the formatting issues.

Thanks for the help on this one.

Cheers
James

I just wanted to post an explanation and an alternative solution. .NET strings are UTF-16 internally, but when ASP.NET parses a source file it does not assume UTF-8: without other information it falls back to a default encoding, typically the system ANSI code page. The two ways to specify a different encoding are to include a byte order mark (BOM) in the file or to add an option to web.config. The first option can be simulated by opening the file in Notepad and saving it on top of itself as UTF-8. If you look at the file in a hex editor, you’ll see that the first three bytes are now EF BB BF, which is the BOM for UTF-8. Obviously you don’t want to do this every time (although Percussion could make a change to their publishing routine). The other, easier option is to add the following to your web.config file, inside system.web:

<globalization fileEncoding="utf-8" />
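If you would rather script the BOM approach than re-save files in Notepad, something like the following Python sketch could work. It is only an illustration: the helper name write_with_utf8_bom and the file name page.aspx are made up for the example, and nothing here is part of the Rhythmyx publishing routine.

```python
import codecs
import os
import tempfile


def write_with_utf8_bom(path, text):
    """Write text as UTF-8 preceded by the BOM (bytes EF BB BF)."""
    # The "utf-8-sig" codec emits the BOM once at the start of the file.
    with open(path, "w", encoding="utf-8-sig", newline="") as f:
        f.write(text)


# Demo on a temporary file; "page.aspx" is a hypothetical name.
tmp_dir = tempfile.mkdtemp()
demo_path = os.path.join(tmp_dir, "page.aspx")
write_with_utf8_bom(demo_path, "<p>Price: £5</p>")

with open(demo_path, "rb") as f:
    raw = f.read()

# The same check you would do by eye in a hex editor.
has_bom = raw.startswith(codecs.BOM_UTF8)
print(raw[:3].hex(), has_bom)
```

The rest of the file is ordinary UTF-8, so only the first three bytes differ from what was published without a BOM.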