SirixDB Community Forum / Discussions

Thoughts on database versioning

In a database, many versioned resources can be inserted at the top level.
Is there a concept of a versioned database as a whole?
That is, if some resource in a database is modified/deleted/created, the database's own version changes.

It would be possible to have one top-level resource and make all other resources children of it, but I guess that is not the best or most performant way to do this (?)

(I guess my thinking was triggered by Datomic.)


Hey, great question(s) :slight_smile:

I’d probably create a special database resource that stores the resource names and their respective versions. We’d need to finish the work on adding node transactions to a database-wide transaction, though; once that transaction commits, the special resource is also kept up to date.
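To make the idea concrete, here is a minimal sketch of such a "special database resource": a catalog mapping each resource name to its latest committed revision, with the database's own version bumped on every database-wide commit. All names (`DatabaseCatalog`, `commit`, `revisionOf`) are my own illustrations, not SirixDB's actual API.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the "special database resource" idea:
// a catalog of resource-name -> latest committed revision.
final class DatabaseCatalog {
    private final Map<String, Integer> revisions = new HashMap<>();
    private int databaseRevision;

    /** Called once a database-wide transaction has committed successfully. */
    void commit(Map<String, Integer> committedResources) {
        revisions.putAll(committedResources);
        databaseRevision++; // the database's own version changes
    }

    int databaseRevision() {
        return databaseRevision;
    }

    Integer revisionOf(String resource) {
        return revisions.get(resource);
    }
}
```

The catalog itself would of course be versioned like any other resource, so the database version history stays queryable.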

The slightly harder thing will be to manage failed database-wide transactions, but maybe in the first step we can ignore this and add cleanup methods in a second step.

But in general I think it’s not that much work. When a special .commit file is not deleted inside a resource folder, it means a transaction started but crashed in the middle; we then need to truncate the resource back to the offset of the most recently committed revision and UberPage, otherwise some storage space would be wasted :slight_smile: To trigger this cleanup we could either run background threads periodically or wait until a read-write transaction is opened on the resource again.
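The recovery check described above could look roughly like this. The marker-file name, the method, and the offset bookkeeping are assumptions for illustration, not the actual SirixDB implementation:

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical sketch: if a ".commit" marker survived in the resource
// folder, a transaction crashed mid-commit, so we truncate the data file
// back to the offset of the most recently committed revision.
final class CrashRecovery {
    static final String COMMIT_MARKER = ".commit";

    /** Returns true if recovery (truncation) was performed. */
    static boolean recoverIfNeeded(Path resourceFolder, Path dataFile,
                                   long lastCommittedOffset) throws IOException {
        Path marker = resourceFolder.resolve(COMMIT_MARKER);
        if (!Files.exists(marker)) {
            return false; // clean shutdown, nothing to do
        }
        try (RandomAccessFile file = new RandomAccessFile(dataFile.toFile(), "rw")) {
            file.setLength(lastCommittedOffset); // discard partially synced data
        }
        Files.delete(marker); // recovery done, remove the stale marker
        return true;
    }
}
```

This could be run either by the periodic background thread or lazily when the next read-write transaction opens the resource.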

We’d also have to check the special resource in its most recent revision and truncate the most recent revisions of the affected resources if a database-wide transaction fails. Maybe we can manage this, too, with a special .commit file that simply denotes whether a database-wide transaction failed or not.

I think the first part is nearly finished, but I didn’t like the API and haven’t come up with a good alternative yet. The thing is:

I’d use the existing node transactions on a resource and simply add them to a database-wide transaction. However, if someone commits a node transaction in between, it defeats the whole purpose. We’d need an internal commit method on the node transaction that is not exposed to the public.
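One standard way to hide the commit from users while the database-wide transaction can still invoke it is package-private visibility: the public interface omits `commit()`, and the database transaction (living in the same package) calls an internal method on the implementation class. All names below are illustrative, not SirixDB's real types:

```java
// Public-facing API: mutations only, no commit() method exposed.
interface NodeTrx {
    void insertNode(String value);
}

// Implementation in the same package as the database-wide transaction.
final class NodeTrxImpl implements NodeTrx {
    private boolean committed;

    @Override
    public void insertNode(String value) {
        // mutate the resource (omitted in this sketch)
    }

    // Package-private: reachable by DatabaseTrx, invisible to user code.
    void internalCommit() {
        committed = true;
    }

    boolean isCommitted() {
        return committed;
    }
}

// The database-wide transaction commits all enlisted node transactions.
final class DatabaseTrx {
    void commitAll(NodeTrxImpl... trxs) {
        for (NodeTrxImpl trx : trxs) {
            trx.internalCommit(); // users can never call this directly
        }
    }
}
```

Users only ever hold a `NodeTrx` reference, so the premature-commit problem disappears at the type level.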


Another idea would be to add a Git dependency for versioning the database-wide transaction and trigger the Git commit as a commit hook. However, I think that would be overkill.

Could we simply have a REST API that only uses database versioning, so that clients can’t commit on individual resources through any API calls?


You are referring to the currently unused function, right?


Yes, this function should simply truncate the RandomAccessFile, or the storage in general, to the most recently committed revision. That is, if the system crashes for some reason during a commit, some data might already have been synced. In the case of database-wide transactions we also have to lock the most recent revisions until all node transactions opened on the resources have committed, and we may need to truncate those back to most-recent-revision - 1 if one of the node transactions fails.

So, basically, delaying this call to unLock() to the last possible moment?

And not calling unLock() guarantees that the newest revision is not readable anywhere?


Hm, maybe by splitting the method, as we’d need a two-phase commit protocol. Yes, something like that.
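A sketch of that split, under my assumptions about the participant interface (none of these names exist in SirixDB): `prepare()` writes and syncs the new revision but keeps it locked and invisible; only when every participant has prepared does `commitPhase()` publish it (the delayed unLock() moment), otherwise `rollback()` truncates back to the previous revision.

```java
import java.util.List;

// Illustrative two-phase commit over the node transactions enlisted in a
// database-wide transaction. Names are assumptions, not SirixDB's API.
interface Participant {
    boolean prepare();   // write + fsync the new revision, keep it locked
    void commitPhase();  // publish the revision: the delayed unLock()
    void rollback();     // truncate back to the previous revision
}

final class TwoPhaseCommit {
    /** Returns true if all participants committed, false if rolled back. */
    static boolean commitAll(List<Participant> participants) {
        for (Participant p : participants) {
            if (!p.prepare()) {
                // Any prepare failure aborts the whole database-wide trx.
                participants.forEach(Participant::rollback);
                return false;
            }
        }
        // Point of no return: every revision is durable, now publish all.
        participants.forEach(Participant::commitPhase);
        return true;
    }
}
```

Since prepared-but-unpublished revisions are still locked, readers never observe a partially committed database-wide transaction.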

I think I introduced the lock, as of now, only for scheduler-based commits, so that they don’t interfere with user calls to the commit method. But good catch :slight_smile:

The line numbers also remind me that I want to split these huge transaction classes…

I had already created an issue for this in December last year :wink: