Input Transform: Regular Expression?

I’ve done some searching here on the forums, as well as in the documentation, and I can’t find much about input transforms in general.

Here is what I’m looking for:

Ephox puts weird things in the HTML, especially when creating tables. For example, a new table in Ephox has a width set on the table cells by default. I’m sure we could find a way to prevent Ephox from doing this by default, but that would not prevent a content contributor from manually setting a width, or a width coming over in a copy-paste from Word. The width issue is one of several examples of things we want to prevent (or replace) when the content is stored.

The best solution I can think of is a regular expression to either eliminate or replace the offending items. I see that there is a validation extension that uses regular expressions, but I cannot find any extension for using regular expressions in a transform.

Is there another way to do this, or is there an extension available that I can use to achieve this? I am not well versed in Java, and while I plan to learn it, I’m under some time constraints right now and don’t have the time to tinker.

BTW, I know that there is a basic “replace” feature available in the input transforms section, but I can’t put in an entry for every case (e.g. width="25%" works fine for one case, but if the width is any other value, it will not get removed). I would rather use /width="[\d]*[%px ]*"/ to remove any width value given in percent or pixels, and also to remove the attribute when the quotes are empty. Any way you slice it, the regular expression is much more efficient.
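To make the idea concrete, here is a minimal sketch of that kind of replacement in plain Java, using only java.util.regex. It is not a Rhythmyx extension: the class name StripWidth, the method stripWidth, and the exact pattern are assumptions for illustration, and a regular expression applied to HTML will miss unusual markup such as single-quoted attribute values.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of Rhythmyx: strips width attributes from markup.
class StripWidth {
    // Matches a width attribute together with its leading whitespace when the
    // value is empty, or a number optionally followed by % or px.
    // The leading \s+ keeps attributes such as data-width from matching.
    private static final Pattern WIDTH_ATTR = Pattern.compile(
        "\\s+width\\s*=\\s*\"\\s*(?:\\d+\\s*(?:%|px)?)?\\s*\"",
        Pattern.CASE_INSENSITIVE);

    static String stripWidth(String html) {
        return WIDTH_ATTR.matcher(html).replaceAll("");
    }

    public static void main(String[] args) {
        String in = "<td width=\"25%\">a</td><td width=\"120px\">b</td><td width=\"\">c</td>";
        // prints <td>a</td><td>b</td><td>c</td>
        System.out.println(stripWidth(in));
    }
}
```

Other attributes are left alone, so the same approach could be repeated with one pattern per unwanted attribute.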

Thanks,

-Jason

Jason,

You can also use XSLT as described in this thread:

http://forum.percussion.com/showthread.php?t=2632

XSLT has its own learning curve, and it may or may not be easier than Java.

Dave

Hi Dave…
The XSL solution in that thread you referred to was posted by Creig, my co-worker here. That solution deals with the issue at the template level, during output. Is there a way to do this during the input stage, so that it is stored correctly in the database? The reason I want that is so that if someone tries to save a field containing things that may break the layout, I can fix them, and the next time the item is edited it will appear correctly in the editor as well as during output. I can’t find any way to do XSL transforms from the Input Transform tab, and the closest thing I can find is on the Pre-Processing tab, there is some stuff about TextToDom, TransformDom, DomToText, but I’m unsure what parameters need to go into that. Is there any documentation on how to use them?

Thanks,

-Jason

Jason

“the closest thing I can find is on the Pre-Processing tab, there is some stuff about TextToDom, TransformDom, DomToText, but I’m unsure what parameters need to go into that. Is there any documentation on how to use them?”

These extensions (in fact, all Rhythmyx extensions) are documented in the Rhythmyx Technical Reference.

RLJII

From the Technical Reference documentation page 209:

“The XSL stylesheet must reside in the current application directory. To do this, attach it to a query in the current application.”

Can anyone explain this a little better? I don’t know what it means by the current application, and I’m not sure what query it is referring to.

Jason

The DOM extensions were originally developed to work with XML applications in the Rhythmyx server. Each XML application resides in a directory, which also includes the supporting files for that application.

For more about XML applications, see the XML server section of the Rhythmyx Workbench Help.

RLJII

A further clarification…

All Content Types have an “XML Application”. In 6.x, this application will be named “psx_ce<ContentType>”. (In previous versions, this name was set by the implementer.) There is a corresponding XML file in the ObjectStore directory for each content type, and a folder with the same name under the Rhythmyx server root folder.

If you want to attach a transform to the “rffBrief” content type, the folder name would be “psx_cerffBrief”, for example.

The XMLDOM extensions were originally built for attaching to custom XML queries in one of these applications, so their documentation seems a bit out of place looking at it from the 6.x content editor perspective. However, these extensions will work on 6.x content editors, even if the documentation is not explicit about it.
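As a rough illustration of the text-to-DOM, transform, DOM-to-text sequence that the extension names suggest, here is a standalone sketch using only the JDK's standard javax.xml APIs. It is not the Rhythmyx extension code and does not show the extensions' real parameters (those are in the Technical Reference). The class name DomTransformSketch, the method clean, and the inline stylesheet are assumptions for illustration; in Rhythmyx the stylesheet would be a file in the application directory rather than a string. The input must be well-formed XML.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical sketch, not Rhythmyx code: parse text, apply XSLT, serialize back.
class DomTransformSketch {
    // Identity transform that copies everything except width attributes on td/th.
    static final String XSL =
        "<xsl:stylesheet version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
      + "<xsl:template match=\"@*|node()\">"
      + "<xsl:copy><xsl:apply-templates select=\"@*|node()\"/></xsl:copy>"
      + "</xsl:template>"
      + "<xsl:template match=\"td/@width|th/@width\"/>"
      + "</xsl:stylesheet>";

    static String clean(String xhtml) {
        try {
            // Step 1: text to DOM
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true);
            Document doc = dbf.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xhtml)));
            // Step 2: compile the stylesheet
            Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSL)));
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            // Step 3: run the transform and serialize the result back to text
            StringWriter out = new StringWriter();
            t.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("transform failed", e);
        }
    }

    public static void main(String[] args) {
        String in = "<table width=\"100%\"><tr><td width=\"25%\">a</td></tr></table>";
        System.out.println(clean(in));
    }
}
```

Because the second template matches only td/@width and th/@width, the width on the table element itself is copied through unchanged; widening the match pattern would remove it too.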

I hope this helps

Dave

While the “psx_ce” XML file exists for my desired content type, the folder does not. Can I simply create the folder and drop my transform file inside, or is there some configuration that needs to happen? Also, when I enter a filename for the transform XSL and try to edit an item of that particular type, I get no error messages, even if I intentionally put in a bogus file… Is a silent failure appropriate in this instance, or do I have a problem that runs a little deeper?

Thanks,

-Jason