Incremental file publish is frozen

Hi,

Publishing on one of my sites is frozen. It is frozen while running an incremental content list.

If I plug the URL for the same site and content list into my browser, it returns meaningful XML containing a list of items that are ready for publishing.

Any tips on what to do about this and why it might be stuck? I’m not seeing any errors in the console log or in the pub logs. File publishing for all other sites is running fine.

It’s probably hanging (or taking a long time) on one of the items on the list. If the list isn’t too large, you can query each item using the URL in the clist.
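
For a longer list, querying each item by hand gets tedious, so here is a rough Python sketch of the same idea. It assumes the content list XML holds each item's assembly URL in a `contenturl` element with no XML namespace (check your own clist output and pass a different `url_tag` if not), and that the URLs can be fetched without logging in. The function names and the 60-second socket timeout are made up for illustration.

```python
import time
import urllib.request
import xml.etree.ElementTree as ET


def extract_item_urls(clist_xml, url_tag="contenturl"):
    """Return the text of every <url_tag> element in the content list XML.

    The tag name is an assumption; adjust it to match your clist output.
    """
    root = ET.fromstring(clist_xml)
    return [el.text.strip() for el in root.iter(url_tag)
            if el.text and el.text.strip()]


def time_item(url, timeout=60):
    """Fetch one item URL and return (seconds, status).

    status is 'ok' or a short description of the error. The timeout is a
    socket timeout, so a hanging item surfaces as an error instead of
    blocking the whole run.
    """
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            resp.read()
        status = "ok"
    except Exception as exc:  # timeouts, HTTP errors, bad URLs
        status = f"{type(exc).__name__}: {exc}"
    return time.monotonic() - start, status


def check_content_list(clist_url, timeout=60):
    """Fetch the content list, request each item, print slowest first."""
    with urllib.request.urlopen(clist_url, timeout=timeout) as resp:
        urls = extract_item_urls(resp.read())
    results = [(*time_item(u, timeout), u) for u in urls]
    for seconds, status, url in sorted(results, reverse=True):
        print(f"{seconds:7.1f}s  {status}  {url}")
```

Calling `check_content_list(...)` with the same content list URL you would paste into the browser should put any item that times out or errors near the top of the output.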

Does the publish “freeze” at the same place? Have you checked the console or server log? You might be running out of memory in the JVM heap. If it is the JVM memory, look in the forums for how to increase the heap size for your version.

We experienced a problem where a user created an infinite loop via the navigation subnav slots. Eventually, it crashed the Oracle database and subsequently, the application itself.

Anything of interest in your server.log?

And of course, the day after I read your post, we encountered this issue. We had a broken binding on a template. An error was thrown to the server.log, but the publishing edition did “hang.”

We’re tracking down the problem, and we’ll file the issue with Percussion TS.

There were no errors in the logs. A server restart fixed the problem. I haven’t seen it again.

We encounter this too. The system is “frozen” and nobody can log in, yet Rhythmyx is still “running” and reporting the Polling Aging action…

I was wondering if a user accessing an item from the list about to get published could be causing this? (Wild guess.)

We just had another one! Nothing in the logs, which makes it very frustrating. Is there a way to troubleshoot this, for instance by setting up some kind of tracing while the incremental is running? I’m pretty sure this is related to user interaction.

We’ve run into this issue in the past, and I just noticed that it is recurring with one of our incremental editions.

One way that I have been able to work around this is to have another site that is identical to the one that has hung. This way we can run other editions on the other site during the day, then restart the server overnight.

The previous time I encountered this issue, it had to do with an autoindex whose SQL was taking a long time to return a result. I was able to fix this by changing the SQL a little. In this case it looks to be a fairly generic page, so I’m not quite sure what the problem is yet.

cwammes,

How did you determine where exactly it’s breaking? Our publogs are inconclusive.

Thanks…

In the instance where I was able to figure it out, the problem was in a content list that only returned two items. From there it was pretty easy to determine which one was causing the problem.

I’m still working through the current one, which is in a much larger content list, over 2,000 items. I’ll post if I can figure out the problem with this one.

Does anyone know if this problem still exists in 6.7?

It still happens in 6.7 with the latest patch - we just had this issue last weekend. What drives me crazy is that there are no conclusive logs or Errors that indicate the item, its content ID, or any meaningful state that would help figure it out. It also does NOT skip the offending item with a “Failed” flag - it just stalls the whole Edition, so none of the rest of the items are published.
Console and server logs show no issues, except a message about an incorrect HashMap and a meaningless Error with no indication of what went wrong other than “ClassCastException [] cannot be cast to …” - nothing about which String or Array was involved.
The Edition is still running, but stuck on “Queued 1” (yes, that’s one content item out of 2500) and “Prepared for Delivery 2499”.

I tried to find the item at fault in the currently running log, but it’s just not there. I also tried eliminating the latest few batches of added content, since I knew it was a newly edited or added item (all was good 3 days ago), but I couldn’t find it that way either.

I finally found it by comparing the time of the Error in the server log with the times in the running Edition log. Since the running Edition log is sorted by content ID, I could work out which content IDs fell between the items logged just before and just after the time of the Error in the server log. That left 10 or 11 items sitting between entries in the pub log that were OK.

By further eliminating content IDs that are not in Public state, I got down to three items that might be at fault. By previewing each one, I determined which item was wrong: it had a bad substring index in a binding (DB publishing).
Wow - that was a long preface :)
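
In case it saves someone the manual comparison, the narrowing step above can be sketched roughly like this in Python. It is only an illustration under a few assumptions: you have already pulled the content IDs from the content list and the (content ID, timestamp) pairs from the pub log by whatever means you have, items are processed roughly in content ID order, and the function name `suspect_ids` is made up.

```python
from datetime import datetime


def suspect_ids(content_list_ids, logged_times, error_time):
    """Return content IDs that are missing from the pub log and sit between
    the items logged just before and just after error_time.

    content_list_ids: iterable of int content IDs in the content list
    logged_times:     dict of content ID -> datetime it appeared in the pub log
    error_time:       datetime of the Error in the server log
    """
    before = [(t, cid) for cid, t in logged_times.items() if t <= error_time]
    after = [(t, cid) for cid, t in logged_times.items() if t > error_time]
    if not before or not after:
        return []  # cannot bracket the error; fall back to manual inspection
    id_before = max(before)[1]  # last item logged before the error
    id_after = min(after)[1]    # first item logged after the error
    low, high = sorted((id_before, id_after))
    return sorted(
        cid for cid in content_list_ids
        if low < cid < high and cid not in logged_times
    )
```

The IDs that come back still need the remaining steps: drop anything not in Public state, then preview what is left.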

The moral, for Product Development:

  • PLEASE provide more meaningful Errors in logs…
  • PLEASE catch the faulty item and Provide us with some sort of ID…
  • PLEASE let the Edition catch the Exception, skip the faulty item and go on with the rest (set a timeout - really, our Edition ran for 32 hrs over the weekend before I caught it)

Cheers,
Mike

Can you run the same edition without running the incremental at the same time?
If the same edition publishes fine by itself, then check the database to see if there are any deadlocks while the publishing is in the “hang” state.

Yes, as suggested by rmark, please check the memory usage of the JVM/Rhythmyx server as well.