Limiting the number of revisions stored

We would like to limit the number of revisions stored in the database. The overall concern is a content-type that is used to store PDFs. Say you have a content type with an upload field, and a user uploads a 10 MB PDF. The content type also stores keywords, meta data, etc. A user edits the meta data, creating a revision of the content item. From what we can tell, now you have the PDF in the database twice. That’s going to eat up database space quickly.

Of course, we’re also instituting a max filesize validation, but that’s only the preventative step.

Sure, we realize users can still get around the max number of revisions stored by altogether creating a new content item, but we don’t think it’s reasonable to allow an infinite number of revisions of a content item. Ultimately, we’d like to be able to set a max number of revisions unique to each content type.

Have you been able to implement such a policy? we would be interested to do the same thing.

thanks,

We are thinking of limiting such expensive revisions using a database trigger to delete old revisions as new ones are created. We did worry that it might mess up the PSX_OBJECTRELATIONSHIP table, but the OWNER_REVISION column is always updated to match the latest revision, and the DEPENDENT_REVISION column is always set to -1.

But we haven’t got round to setting it up yet, so if someone can think of a problem with this please do reply with a reason why it won’t work.

Not sure if I understand what you mean by “OWNER_REVISION column …” I’m not aware of that column ever getting modified after the rel is created. When a revision of an item is created, all relationships are cloned and these clones have their owner id set to the new rev. The existing rels to the previous revs are not modified.

I don’t know if there is anywhere in the system that would misbehave if old revs are removed w/o cleaning up all the other tables referencing those revs.

Yes, I didn’t express myself very clearly. New rows are added to the PSX_OBJECTRELATIONSHIP table when a new revision of the owner item (the one whose slot is filled with a dependent item) is created. The OWNER_REVISION field in that new row is set to the new revision number. So as well as deleting rows in the table containing local fields, and the table(s) containing values from multiple-value controls, and tables(s) containing shared fields, the database trigger could (or should?) remove old rows in the PSX_OBJECTRELATIONSHIP table.

We received some SQL code from PSO that deletes all but the current revision. We’re not actively using it to manage our content revisions, but it may help you in writing code to server your purposes. We’ve only tested it in on 6.5.2. I posted it in the Code and Snippet Repository.

The problem is that you probably don’t want to delete everything older than the current revision. You need to delete revisions that are older than the last public revision (where the item has a last public revision).

This will work okay in VT’s limited case, but it can cause data loss if used indiscriminately.

Dave