A plan for a SyncML system
[Aug. 30th, 2004|05:37 pm]
For a few months, I've been toying with the idea of implementing (probably based on existing open source libraries) a SyncML system. Such a beast would handle syncing contacts between my phone, my email client, the company contacts/calendaring system, the company contacts LDAP directory, and possibly an LDAP directory at home. Oh, and it needs to work for some other phones used by people at work, so I can justify doing some of the development on company time...
One of the things that has been bugging me for a while is how to store the data. I originally had ideas about chucking a few extra fields into the LDAP directory for contacts, or maybe having a couple of extra database fields to store things. However, after much thought, I just don't see how this would work.
Instead, I have decided that a full database solution will be required. I'll try to explain this now.
There will be a database, holding each item 1+(number of sync'd devices) times. Looking first at contacts (calendar will be similar), we will keep track of the last time each device sync'd. In addition to a master store of the data, we will keep our own copy of the data as it stands on each client (hence the number of times we keep the data).
When a client connects, deciding what new entries to send it is easy. Any entries from the main table that aren't in that client's table need sending. When the client sends its data, that's quite easy as well. For every change entry we get from the client, we compare the entry in their table with the one in the main table. If these two agree, nothing else has touched the entry since the last sync, so we just accept the change, and pop it into both tables. If the tables differ, but the change matches the main table, we know the same update was made on both the device and the main table, so we just have to update the client table. If all three disagree, there have been two opposing changes, and we'll need the user to decide which one to pick.
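As a rough sketch of that decision (Python, purely illustrative; the names `Action` and `reconcile` are mine, not anything from SyncML or Sync4J), the three-way check for one incoming change might look like this:

```python
from enum import Enum


class Action(Enum):
    ACCEPT = "accept"                            # write the change to both tables
    UPDATE_CLIENT_TABLE = "update_client_table"  # main table already has it
    CONFLICT = "conflict"                        # ask the user


def reconcile(client_copy, master, incoming):
    """Decide what to do with one change sent by a client.

    client_copy: the entry in that client's table (what we last knew it held)
    master:      the entry in the main table
    incoming:    the new value the client has just sent
    """
    if client_copy == master:
        # Nobody else changed it since the last sync.
        return Action.ACCEPT
    if incoming == master:
        # The same edit was made on the device and in the main table.
        return Action.UPDATE_CLIENT_TABLE
    # All three disagree: two opposing changes.
    return Action.CONFLICT
```

For example, `reconcile("old", "old", "new")` accepts the change, while `reconcile("old", "theirs", "mine")` is a conflict.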
To reduce the chances of this latter situation, we can take a simple step. Rather than storing all the data for a contact together, we'll store it split out. There will be a "contact" entry with no data, then a whole bunch of data rows (e.g. type=home phone number, data=01xxxxxxx), in all the tables. We can then handle the case when the address was updated in one place and the phone number in the other. If the user really did, say, edit the phone number in two places to two different values, then we'll really have to ask them.
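To show why splitting the data out helps, here is a small sketch of a per-field merge. It assumes each contact is represented as a dict of data type to value, which is my simplification of the data rows; `merge_contact` is a made-up name.

```python
def merge_contact(base, master, device):
    """Merge one contact field by field.

    base:   the contact as stored in the client's table
    master: the contact as stored in the main table
    device: the contact as the device now reports it
    Returns (merged, conflicts), where conflicts lists the field types
    that were changed to different values in both places.
    """
    merged = {}
    conflicts = []
    for field in sorted(set(base) | set(master) | set(device)):
        b, m, d = base.get(field), master.get(field), device.get(field)
        if d == b:
            value = m            # device did not touch it; keep the main table's value
        elif m == b or m == d:
            value = d            # only the device changed it, or both made the same edit
        else:
            conflicts.append(field)
            value = m            # keep the main table's value until the user decides
        if value is not None:    # None means the field is absent (deleted)
            merged[field] = value
    return merged, conflicts
```

With this, an address edited in the main table and a phone number edited on the device merge cleanly, and only a field edited to two different values ends up in `conflicts`.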
(In full sync mode, where the client sends all its data, we can synthesise the change data ourselves, by comparing the data from the client with that in their client table.)
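A minimal sketch of synthesising those changes, assuming both the client table and the full dump from the client can be viewed as dicts of item id to data (the function name and the `(operation, id, data)` tuple shape are my own invention):

```python
def synthesise_changes(client_table, full_dump):
    """Work out what the client must have changed since the last sync.

    client_table: item id -> data, as we last recorded it for this client
    full_dump:    item id -> data, as the client has just sent it
    Returns a list of (operation, item_id, data) tuples.
    """
    changes = []
    for item_id, data in full_dump.items():
        if item_id not in client_table:
            changes.append(("add", item_id, data))
        elif client_table[item_id] != data:
            changes.append(("modify", item_id, data))
    for item_id in client_table:
        if item_id not in full_dump:
            changes.append(("delete", item_id, None))
    return changes
```

Each synthesised change can then be fed through the same comparison against the main table as a change the client had reported itself.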
The other thing we'll need to store with the data rows is a "last checked" date flag. This will allow us to, at the end of an update run, scan through and find any rows whose date was not updated during that run. These can be considered to have been deleted, and can then be purged accordingly.
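The "last checked" idea is basically mark and sweep. A sketch, assuming the rows for one data set are held in a dict keyed by (contact, data type), which is just a stand-in for the real database rows:

```python
from datetime import datetime


def mark_seen(rows, key, run_started: datetime):
    """Stamp a row as seen during the update run that began at run_started."""
    rows[key]["last_checked"] = run_started


def sweep_deleted(rows, run_started: datetime):
    """Purge rows not stamped during this run and return their keys."""
    stale = [key for key, row in rows.items()
             if row["last_checked"] < run_started]
    for key in stale:
        del rows[key]
    return stale
```

Every row the source still reports gets `mark_seen` called on it during the run; whatever `sweep_deleted` returns at the end is what was deleted at the source.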
So, our database has: one data set per client + one master set, one contact per data set, then multiple data rows per contact, with a data type+the data+last checked date.
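One possible shape for that database, sketched with Python's built-in `sqlite3` module. The table and column names are guesses on my part, and the `uid` column (used to tie together the copies of the same contact across data sets) is an extra assumption not spelled out above.

```python
import sqlite3

SCHEMA = """
CREATE TABLE data_set (
    id        INTEGER PRIMARY KEY,
    name      TEXT NOT NULL UNIQUE,  -- 'master' or a device name
    last_sync TEXT                   -- last time this device sync'd
);
CREATE TABLE contact (
    id          INTEGER PRIMARY KEY,
    data_set_id INTEGER NOT NULL REFERENCES data_set(id),
    uid         TEXT NOT NULL,       -- assumed: links copies of one contact
    UNIQUE (data_set_id, uid)
);
CREATE TABLE data_row (
    id           INTEGER PRIMARY KEY,
    contact_id   INTEGER NOT NULL REFERENCES contact(id),
    type         TEXT NOT NULL,      -- e.g. 'home phone number'
    data         TEXT NOT NULL,
    last_checked TEXT NOT NULL       -- date of the last update run that saw it
);
"""


def create_store(path=":memory:"):
    """Open (or create) the sync store and set up the three tables."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn
```

The contact row deliberately carries no data of its own; everything lives in the data rows so that fields can be compared one at a time.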
The final step is to handle non-SyncML datasources. With LDAP, we dump the whole directory, then perform a three-way compare. From this we deduce the changes, and proceed as per a full sync. We will need to run this at regular intervals, to keep our directory in sync (we can't just check the last sync time and skip if no changes were made, since the LDAP directory is writeable). Other systems are handled via writing something to handle "dump all", "add", "delete" and "modify" actions against them.
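Those four actions suggest a small adapter interface. Here is a sketch; the class names are hypothetical, and the in-memory implementation is only there to make the example runnable (a real one would talk to LDAP or whatever the other system is).

```python
from abc import ABC, abstractmethod


class DataSource(ABC):
    """What a non-SyncML system has to support to join the sync."""

    @abstractmethod
    def dump_all(self):
        """Return every item as a dict of item id -> data."""

    @abstractmethod
    def add(self, item_id, data): ...

    @abstractmethod
    def delete(self, item_id): ...

    @abstractmethod
    def modify(self, item_id, data): ...


class InMemorySource(DataSource):
    """Toy data source backed by a dict, for illustration only."""

    def __init__(self):
        self._items = {}

    def dump_all(self):
        return dict(self._items)

    def add(self, item_id, data):
        self._items[item_id] = data

    def delete(self, item_id):
        del self._items[item_id]

    def modify(self, item_id, data):
        self._items[item_id] = data


def apply_changes(source, changes):
    """Push (operation, item_id, data) tuples out to a data source."""
    for op, item_id, data in changes:
        if op == "add":
            source.add(item_id, data)
        elif op == "modify":
            source.modify(item_id, data)
        elif op == "delete":
            source.delete(item_id)
        else:
            raise ValueError(f"unknown operation: {op}")
```

The sync engine would only ever see `DataSource`, so adding another system means writing one more subclass.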
I think it should be fairly quick to implement this (I've done the LDAP-to-database sync thing already). Before I do go ahead, I'll need to read up on SyncML and how Sync4J handles it, since this is what I'll probably use to do the client communication. All being well, once I understand how that'll all interact, it should be quite quick to code something up.