How does the unpublish filter work?

The detailed explanation of how the incremental filter works is a great help.

I wonder if someone can kindly explain the unpublish filter in detail as well.

In particular, we run into a situation where someone creates a new item with the same filename as another item in the same folder, and that other item is archived. When unpublish runs, the system unpublishes the new item!


I would hazard that your problem isn’t caused by the unpublish filter but rather by the order of your content lists. The filter is actually a very simple one: if the content-valid flag of the item’s current state matches one of those supplied (by default, ‘u’ is supplied), then the item is passed.
To unpublish the item, the location is looked up in the site item table. If you happened to publish first, then the unpublisher blindly removes the item you just published because it happens to match by name. For this reason, it is recommended that unpublish lists run before publish lists.
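
To make the ordering problem concrete, here is a toy model of the logic described above. It is not the actual server code: the Item class, the function names, and the dictionary standing in for the site item table are all made up for illustration, and the 'y' flag for a public item is an assumption of the sketch.

```python
from dataclasses import dataclass


@dataclass
class Item:
    content_id: int
    location: str     # published path of the item
    valid_flag: str   # content-valid flag of the item's current state


def unpublish_filter(items, flags=("u",)):
    """Pass only items whose content-valid flag is one of the supplied flags."""
    return [it for it in items if it.valid_flag in flags]


def run_unpublish(site_items, candidates):
    """Remove each passed item by location; the lookup matches on name only."""
    removed = []
    for it in unpublish_filter(candidates):
        if it.location in site_items:
            # Whatever currently sits at this location is removed,
            # even if it belongs to a different content item.
            removed.append(site_items.pop(it.location))
    return removed


def run_publish(site_items, items):
    """Record public items in the (hypothetical) site item table."""
    for it in items:
        if it.valid_flag == "y":  # assumption: 'y' marks a public item
            site_items[it.location] = it.content_id


old_item = Item(1, "/news/story.html", "u")  # archived item
new_item = Item(2, "/news/story.html", "y")  # new item, same folder and filename

# Publish first, then unpublish: the new item is removed.
site = {"/news/story.html": 1}
run_publish(site, [new_item])
print(run_unpublish(site, [old_item, new_item]), site)  # [2] {}

# Unpublish first, then publish: the new item survives.
site = {"/news/story.html": 1}
print(run_unpublish(site, [old_item, new_item]))  # [1]
run_publish(site, [new_item])
print(site)  # {'/news/story.html': 2}
```

In this model the filter itself behaves correctly in both runs; only the order of the two lists decides whether the new item survives.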

No, I have the unpublish list run first, before any other content lists.

I also have a few hundred items “stuck” in the unpublish list, no matter how many times I run it. I tried editing the site item table to set the operation to “success” for those items, but they are still caught in the unpublish list.

As of now, we run unpublish only in full editions.

I would like to take a stab at solving this myself first, before going through tech support.

Can you explain to me the heuristics behind the sys_PublishedSiteItems generator?