Persistent Hierarchies and Key Generators

We had a heated discussion with Alex Yakunin about this strange DataObjects.Net behavior. How would you describe his solution to the problem? Is it a workaround or a clean, solid solution? Why did such a problem appear at all?

In this post I'll give just my own opinion.

First of all, let's recall that a DO4 model is divided into several persistent hierarchies. Any non-abstract persistent type must belong to one of them. In general, a hierarchy can be completely defined as a set of classes with the following properties:
  • All types (tables) within a single hierarchy have an identical key structure.
  • All entities within a hierarchy have unique key values.
So different hierarchies can have different key structures, and an entity from one hierarchy can have the same key value as an entity from another. The 'Key' type, which uniquely identifies any entity within a domain, originally contained two fields: 'Hierarchy' and 'KeyValue'.
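
To make the original rule concrete, here is a minimal sketch in Python (not DataObjects.Net code). The class and field names simply mirror the 'Key', 'Hierarchy' and 'KeyValue' mentioned above; the hierarchy names "Person" and "Order" are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Key:
    """Illustrative model of the original key: (Hierarchy, KeyValue)."""
    hierarchy: str   # name of the hierarchy the entity belongs to
    key_value: int   # value that is unique only within that hierarchy


# Two entities from different hierarchies may share the same key value...
person_key = Key("Person", 1)
order_key = Key("Order", 1)

# ...and the keys still differ, because the hierarchy is part of the key.
distinct = person_key != order_key

# Within one hierarchy, the same key value always means the same entity.
same = Key("Person", 1) == person_key
```

Under this rule a key value alone is never enough to identify an entity; the hierarchy always has to be known as well.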

If you declare a persistent field as a reference to some entity, you must explicitly specify the hierarchy you are referencing. I.e. you cannot create a field of type 'Entity', but you can create one of type 'Person', because two different entities with a given key value can exist, but two persons with that key value cannot.

OK, we have very solid concepts here, everything is clear, but...

...several months ago, Xtensive headquarters, Ekaterinburg... DO4 team meeting...

- Hi, guys! Our customers keep asking when "Persistent Interfaces" (a cool feature of DO3) will be implemented in DO4. I think it's time to do this.
- Sure, Persistent Interfaces is such a cool feature! Must have!
- OK, let's say some types from our model can implement some persistent interface. We will be able to query them in a unified way and declare persistent fields as references to such interfaces.
- But to make such references we need to know which hierarchy we are referencing, i.e. all implementers of a persistent interface must belong to a single hierarchy. Is that cool enough?
- Surely not.
- I have a funny idea!!! We have a set of key generators in our database. In general, one key generator can generate keys for several hierarchies; in that case all these hierarchies have a common key structure and key values that are unique across all of them. So we can require the implementers of any persistent interface to use a common key generator, while they can still belong to different hierarchies. That's decided.
- OK, but there is still one little problem. When we are about to resolve a reference to a persistent interface, we must create a 'Key' instance to fetch the entity, but we don't know the exact hierarchy.
- Yes, but we surely know the key generator, because all implementers of our interface share a common one. Let's replace the 'Hierarchy' property in the 'Key' type with 'KeyGenerator'.
- OK, I'll do this immediately.
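
The idea from this meeting can be sketched roughly as follows, again in plain Python rather than real DO4 code. The generator names ("SharedGen", "OrderGen"), the entity data and the helper `resolve_interface_reference` are all hypothetical; only the (KeyGenerator, KeyValue) shape of the key comes from the conversation above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Key:
    """Illustrative model of the changed key: (KeyGenerator, KeyValue)."""
    key_generator: str
    key_value: int


# Hypothetical storage: "Person" and "Company" are different hierarchies
# that implement the same persistent interface and therefore share
# one key generator; "Order" uses its own generator.
store = {
    Key("SharedGen", 1): ("Person", "Alice"),
    Key("SharedGen", 2): ("Company", "Xtensive"),
    Key("OrderGen", 1): ("Order", "order #1"),
}


def resolve_interface_reference(interface_generator: str, key_value: int):
    """Fetch an entity through an interface-typed reference.

    The exact hierarchy is unknown here; the key generator common to all
    implementers of the interface is enough to build the key.
    """
    return store[Key(interface_generator, key_value)]


hierarchy, name = resolve_interface_reference("SharedGen", 2)
```

The lookup finds the right entity without knowing its hierarchy, which is exactly what the interface reference needs. The price is that the hierarchy no longer takes part in key identity.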

So now we have a very cool feature and two interesting concepts: Persistent Hierarchy and Key Generator. My question is: what's the difference between them?
  • Both concepts divide persistent classes into several groups
  • In both cases classes within a group have a common key structure
  • In both cases entities within a group have unique key values
In fact, in most cases we use key generators as if they were persistent hierarchies, and this leads to misunderstanding.

So in the example Alex Yakunin described in his blog, we faced a problem caused by that decision to mix hierarchies and key generators. The customer knows that he should declare different hierarchies if he uses identical key values, and he did, but the keys are still equal. Why use hierarchies at all? Do they still make sense or not?
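
The kind of collision meant here can be shown by putting the two comparison rules side by side. This is only a sketch of the principle, not the actual example from Alex Yakunin's blog; the hierarchies "Cat" and "Dog" and the generator name "Int32Gen" are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HierarchyKey:
    """Key identity as originally designed: (Hierarchy, KeyValue)."""
    hierarchy: str
    key_value: int


@dataclass(frozen=True)
class GeneratorKey:
    """Key identity after the change: (KeyGenerator, KeyValue)."""
    key_generator: str
    key_value: int


# Assumption: two separately declared hierarchies that happen to be
# configured with the same key generator.
generator_of = {"Cat": "Int32Gen", "Dog": "Int32Gen"}


def generator_key(hierarchy: str, key_value: int) -> GeneratorKey:
    # The hierarchy is used only to look up its generator, then dropped.
    return GeneratorKey(generator_of[hierarchy], key_value)


# Old rule: different hierarchies, same value, so the keys differ.
old_rule_equal = HierarchyKey("Cat", 1) == HierarchyKey("Dog", 1)

# New rule: same generator, same value, so the keys are equal,
# even though the customer declared two hierarchies.
new_rule_equal = generator_key("Cat", 1) == generator_key("Dog", 1)
```

Here `old_rule_equal` is False and `new_rule_equal` is True: declaring separate hierarchies no longer separates the keys once the generator is shared.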

I think it would be more effective to require all implementers of a persistent interface to belong to a single persistent hierarchy, and not to mix these two concepts. Surely there are other solutions, but in any case we should keep our architecture as clear and simple as possible.


Alex Ilyin said...

I fully agree with you. I was surprised by Alex Yakunin's blog post, but I gave up arguing.

Alex Yakunin said...

In some cases it's simply a bad idea to keep all the types implementing IXxx in a single hierarchy. E.g. if there are two sets of types, one huge and the other small, and we need to read the small one fast.

Generally, merging two hierarchies implies there will be an additional common base type, which also has some cost.

Btw, we already agreed to refactor [KeyGenerator] a bit to make it possible to specify a kind of scope there (either a string or a type), so it won't be necessary to use [ProxyKeyGenerator] any longer.

I agree this is confusing. On the other hand, it's logical that keys are totally different if they're generated by different key generators, and vice versa. The problem is that it isn't obvious that we use this key comparison rule. So I wrote about it, and will add it to the Manual ;)

Alex Kofman said...

Note that the fact that you described something in the Manual doesn't make it more logical or obvious for customers and product developers (-: