Linked Content items appear multiple times in JSR-170 query

I am trying to create an RSS feed that contains recently updated and newly created content items (of a certain content type). Of course there are many technical obstacles surrounding this issue. Leaving the other issues aside, there is one problem I can’t seem to find a solution to:

A handful of these content items exist in more than one folder of the site, by way of copy, then past as link. The following JSR-170 query is what I am using to find the content items:

SELECT rx:sys_contentid, rx:sys_folderid FROM rx:CLTTopic

When I run this in the Query debugger, I also add the rx:displaytitle, just to help me distinguish between them, but I can clearly see two content items, with the same title, same contentid, and different folders. I need to find a way to filter those items from the results, so I only get one instance of the content item.

Some ideas that I tried that didn’t work:

I tried to use SQL-like keywords such as distinct or group by. These did not work.
I tried to discover what fields are available by using describe, and show fields, these also failed.
doing a ‘select * from rx:CLTTopic’ shows me the fields available, but none of them seem to be related to the content item’s status as far as being a linked item rather than the real item.

So does anyone know how I can identify and remove those “linked” items, or how to group results by the contentid?

Thanks,
-Jason

When you find out, let me know. I’m having the same problem. There doesn’t seem to be a good online reference for JSR-170 queries either…

Hi Jason

You have couple of options, first you can use jcr:path in your jsr query to restrict the folders it searches on eg

SELECT rx:sys_contentid, rx:sys_folderid FROM rx:CLTTopic WHERE jcr:path like '//Site/yoursite%'

you can also add further conditions to the jsr query to limit the folders searched on:

SELECT rx:sys_contentid, rx:sys_folderid FROM rx:CLTTopic WHERE jcr:path like '//Site/yoursite%' AND NOT jcr:path LIKE '//Site/yoursite/specificfolder%'

Failing that you will have to write a custom slot which doesn’t display the duplicate content reference. eg

		#set($contentidfield = "")
		
		#initslot("auto_index_slot" $slotparams)
			#if($sys.currentslot.relresults.size() > 0)
				#set($contentid = "")

				#foreach( $relresult in $sys.currentslot.relresults )
					#set($contentid = $relresult.Node.getProperty("rx:sys_contentid").String)
					
					#if($contentidfield != $contentid)
						$rx.doc.extractBody($relresult)
						#set($contentidfield = $contentid)
					#end

				#end
			#end
		#endslot("auto_index_slot")

You will have to order by the contentid to make sure the content items are grouped together.

Cheers
James

Hello Jason,

Where are you using the JSR query? Are you using it for publishing the items or for some other purpose?

  • Ravi

As you’ve already discovered, the DISTINCT keyword doesn’t exist in JCR Queries.

The only way to remove the duplicates in the query is to omit the sys_folderid column. If you select only the content id, you will get only one row per item.

Do you need the folder id for something in the process (e.g. you want to publish the folder path)?

If it is not practical, then Jimbo’s suggestion (the custom slot) seems best to me.

Dave