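I have been working on a way to sync Core Data stored in an iPhone application between multiple devices, such as an iPad or a Mac. There are not many (if any at all) sync frameworks for Core Data on iOS. However, I have been thinking about the following concept: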
A change is made to the local core data store, and the change is saved. (a) If the device is online, it tries to send the changeset to the server, including the device ID of the device which sent the changeset. (b) If the changeset does not reach the server, or if the device is not online, the app will add the change set to a queue to send when it does come online.
The server, sitting in the cloud, merges the specific change sets it receives with its master database.
After a change set (or a queue of change sets) is merged on the cloud server, the server makes all of those change sets available to the other devices registered with the server, which pick them up through some sort of polling system. (I thought to use Apple's Push services, but apparently according to the comments this is not a workable system.)
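The "send now or queue for later" behaviour in the first step can be sketched as a small outbox. This is only an illustration, not a finished design: `ChangeSet`, `ChangeSetOutbox` and the `send` callback are hypothetical names, and `send` is assumed to return `True` only when the server has accepted the change set.

```python
from collections import deque
from dataclasses import dataclass
from typing import Any, Callable, Deque, Dict


@dataclass
class ChangeSet:
    device_id: str            # ID of the device that produced the change
    changes: Dict[str, Any]   # the changed attributes (shape is an assumption)


class ChangeSetOutbox:
    """Keeps unsent change sets in order and retries them on flush()."""

    def __init__(self, send: Callable[[ChangeSet], bool]) -> None:
        self._send = send     # hypothetical transport; True means "server got it"
        self._pending: Deque[ChangeSet] = deque()

    def submit(self, change_set: ChangeSet) -> None:
        # Always enqueue first so ordering is preserved, then try to send.
        self._pending.append(change_set)
        self.flush()

    def flush(self) -> int:
        """Send queued change sets oldest-first; stop at the first failure."""
        sent = 0
        while self._pending:
            if not self._send(self._pending[0]):
                break
            self._pending.popleft()
            sent += 1
        return sent

    @property
    def pending_count(self) -> int:
        return len(self._pending)
```

Calling `flush()` again when the device comes back online drains whatever accumulated while it was offline.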
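Is there anything in particular I need to consider? I have looked at REST frameworks such as ObjectiveResource, Core Resource, and RestfulCoreData. Of course, these all work with Ruby on Rails, which I am not tied to, but it's a place to start. The main requirements I have for my solution are:
Any changes should be sent in the background without pausing the main thread.
It should use as little bandwidth as possible.
I have thought about a number of challenges: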
Making sure that the object IDs for the different data stores on different devices are attached on the server. That is to say, I will have a table of object IDs and device IDs, which are tied via a reference to the object stored in the database. I will have a record (DatabaseId [unique to this table], ObjectId [unique to the item in the whole database], Datafield1, Datafield2), the ObjectId field will reference another table, AllObjects: (ObjectId, DeviceId, DeviceObjectId). Then, when the device pushes up a change set, it will pass along the device Id and the objectId from the core data object in the local data store. Then my cloud server will check against the objectId and device Id in the AllObjects table, and find the record to change in the initial table.
All changes should be timestamped, so that they can be merged.
The device will have to poll the server, without using up too much battery.
The local devices will also need to update anything held in memory if/when changes are received from the server.
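A minimal sketch of the `AllObjects` lookup described in the first challenge, assuming an in-memory table on the server. The class and method names are made up for illustration; a real server would back this with a database table.

```python
import itertools
from typing import Dict, Tuple


class AllObjects:
    """Maps (DeviceId, DeviceObjectId) pairs to a server-wide ObjectId."""

    def __init__(self) -> None:
        self._next_object_id = itertools.count(1)
        self._by_device: Dict[Tuple[str, str], int] = {}

    def resolve(self, device_id: str, device_object_id: str) -> int:
        """Return the ObjectId for a device-local ID, creating one if unseen."""
        key = (device_id, device_object_id)
        if key not in self._by_device:
            self._by_device[key] = next(self._next_object_id)
        return self._by_device[key]

    def link(self, object_id: int, device_id: str, device_object_id: str) -> None:
        """Record that another device stores the same object under its own ID."""
        self._by_device[(device_id, device_object_id)] = object_id
```

When a change set arrives, the server would call `resolve` with the device ID and the local object ID from the change set, then update the record that owns the returned `ObjectId`.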
Is there anything else I am missing here? What kinds of frameworks should I look at to make this possible?
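I think a good solution to the GUID issue is a "distributed ID system". I'm not sure what the correct term is, but I think that's what the MS SQL Server docs used to call it (SQL Server uses/used this method for distributed/synced databases). It's pretty simple: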
The server assigns all IDs. Each time a sync is done, the first thing that is checked is "How many IDs do I have left on this client?" If the client is running low, it asks the server for a new block of IDs. The client then uses IDs in that range for new records. This works great for most needs, if you can assign a block large enough that it should "never" run out before the next sync, but not so large that the server runs out over time. If the client ever does run out, the handling can be pretty simple: just tell the user "sorry, you cannot add more items until you sync". If they are adding that many items, shouldn't they sync to avoid stale data issues anyway?
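Here is a rough sketch of that block-allocation idea, just to show the moving parts. `IdServer`, `IdClient`, the block size and the low-water mark are all assumptions for illustration, and the "server" is a plain in-process object rather than a network service.

```python
from collections import deque
from typing import Deque


class IdServer:
    """Hands out non-overlapping blocks of integer IDs."""

    def __init__(self, block_size: int = 1000) -> None:
        self._block_size = block_size
        self._next_start = 1

    def allocate_block(self) -> range:
        start = self._next_start
        self._next_start += self._block_size
        return range(start, start + self._block_size)


class IdClient:
    """Uses IDs from its local pool and tops the pool up during sync."""

    def __init__(self, low_water_mark: int = 100) -> None:
        self._ids: Deque[int] = deque()
        self._low_water_mark = low_water_mark

    def sync(self, server: IdServer) -> None:
        # First check of every sync: how many IDs are left on this client?
        if len(self._ids) < self._low_water_mark:
            self._ids.extend(server.allocate_block())

    def next_id(self) -> int:
        if not self._ids:
            # Corresponds to "sorry, you cannot add more items until you sync".
            raise RuntimeError("No IDs left; sync before adding more items")
        return self._ids.popleft()
```

Because each block is handed out only once, two clients can create records offline without ever colliding on an ID.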
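I think this is superior to using random GUIDs because random GUIDs are not 100% safe, and usually need to be much longer than a standard ID (128 bits vs 32 bits). You usually build indexes on the ID and often keep ID numbers in memory, so it is important to keep them small.
I didn't really want to post this as an answer, but I doubt anyone would see it as a comment, and I think it is important to this topic and not covered in the other answers.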
First, you should rethink how much data and how many tables and relations you will have. In my solution I've implemented syncing through Dropbox files. I observe changes in the main MOC and save the changed data to files (each row is saved as gzipped JSON). If there is a working internet connection, I check whether there are any changes on Dropbox (Dropbox gives me delta changes), download and merge them (latest wins), and finally upload the changed files. Before a sync I put a lock file on Dropbox to prevent other clients from syncing incomplete data. While downloading changes, it is safe for only part of the data to arrive (e.g. because of a lost internet connection). When downloading is finished (fully or partially), the app starts loading the files into Core Data. When there are unresolved relations (not all files have been downloaded), it stops loading files and tries to finish the download later. Relations are stored only as GUIDs, so I can easily check which files to load to have full data integrity.
Syncing starts after changes are made to Core Data. If there are no changes, then it checks for changes on Dropbox every few minutes and on app startup. Additionally, when changes are sent to the server, I send a broadcast to the other devices to inform them about the changes, so they can sync faster.
Each synced entity has a GUID property (the GUID is also used as the filename for exchange files). I also have a sync database where I store the Dropbox revision of each file (I can compare it when the Dropbox delta resets its state). Files also contain the entity name, state (deleted/not deleted), GUID (same as the filename), database revision (to detect data migrations or to avoid syncing with newer app versions) and of course the data (if the row is not deleted).
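To give an idea of what such an exchange file could look like, here is a minimal sketch of writing and reading one row as gzipped JSON. The key names (`entity`, `guid`, `deleted`, `db_revision`, `data`) and both function names are assumptions chosen for this example; the real files may be laid out differently.

```python
import gzip
import json
from typing import Any, Dict, Optional, Tuple


def encode_row(entity: str, guid: str, db_revision: int,
               data: Optional[Dict[str, Any]]) -> Tuple[str, bytes]:
    """Return (filename, gzipped JSON body); data=None marks a deleted row."""
    payload: Dict[str, Any] = {
        "entity": entity,
        "guid": guid,
        "deleted": data is None,
        "db_revision": db_revision,
    }
    if data is not None:
        payload["data"] = data   # deleted rows carry no data
    body = gzip.compress(json.dumps(payload, sort_keys=True).encode("utf-8"))
    return guid, body            # the GUID doubles as the filename


def decode_row(body: bytes, supported_revision: int) -> Optional[Dict[str, Any]]:
    """Parse a file; return None if it was written by a newer app version."""
    payload = json.loads(gzip.decompress(body).decode("utf-8"))
    if payload["db_revision"] > supported_revision:
        return None
    return payload
```

Keeping one small file per row means an interrupted download leaves whole, parseable rows behind rather than a half-written database.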
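This solution works for thousands of files and about 30 entities. Instead of Dropbox I could use a key/value store behind a REST web service, which I want to do later, but I have no time for it right now :) For now, in my opinion, my solution is more reliable than iCloud and, which is very important, I have full control over how it works (mainly because it is my own code).
Another solution is to save MOC changes as transactions. Far fewer files would be exchanged with the server, but it is harder to do the initial load into an empty Core Data store in the proper order. iCloud works this way, and other syncing solutions take a similar approach, e.g. TICoreDataSync.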
--
UPDATE
After a while, I migrated to Ensembles. I recommend that solution over reinventing the wheel.
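I've done something similar to what you're trying to do. Let me tell you what I've learned and how I did it.
I assume you have a one-to-one relationship between your Core Data objects and the model (or db schema) on the server. You simply want to keep the server contents in sync with the clients, but clients can also modify and add data. If I got that right, then keep reading.
I added four fields to assist with synchronization: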
sync_status - Add this field to your core data model only. It's used by the app to determine if you have a pending change on the item. I use the following codes: 0 means no changes, 1 means it's queued to be synchronized to the server, and 2 means it's a temporary object and can be purged.
is_deleted - Add this to the server and core data model. A delete event shouldn't actually delete a row from the database or from your client model, because that leaves you with nothing to synchronize back. By having this simple boolean flag, you can set is_deleted to 1, synchronize it, and everyone will be happy. You must also modify the code on the server and client to query non-deleted items with "is_deleted=0".
last_modified - Add this to the server and core data model. This field should automatically be updated with the current date and time by the server whenever anything changes on that record. It should never be modified by the client.
guid - Add a globally unique id (see http://en.wikipedia.org/wiki/Globally_unique_identifier) field to the server and core data model. This field becomes the primary key and becomes important when creating new records on the client. Normally your primary key is an incrementing integer on the server, but we have to keep in mind that content could be created offline and synchronized later. The GUID allows us to create a key while being offline.
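On the client, add code to set sync_status to 1 on your model object whenever something changes and needs to be synchronized to the server. New model objects must generate a GUID.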
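As a language-neutral illustration of that client-side bookkeeping (in a real app this would live in your Core Data model classes), here is a small sketch. `LocalRecord`, its `update`/`delete` helpers and the status constants are hypothetical names; the field names and status codes are the ones listed above.

```python
import uuid
from dataclasses import dataclass, field
from typing import Any, Dict, Optional

# sync_status codes from the list above
SYNCED, QUEUED, TEMPORARY = 0, 1, 2


@dataclass
class LocalRecord:
    data: Dict[str, Any] = field(default_factory=dict)
    # New objects generate their own GUID so they can be created offline.
    guid: str = field(default_factory=lambda: str(uuid.uuid4()))
    sync_status: int = QUEUED             # a new object is a pending change
    is_deleted: bool = False
    last_modified: Optional[str] = None   # only ever set from server data

    def update(self, **changes: Any) -> None:
        self.data.update(changes)
        self.sync_status = QUEUED         # queue for the next sync

    def delete(self) -> None:
        # Soft delete: keep the row so the deletion can be synchronized.
        self.is_deleted = True
        self.sync_status = QUEUED
```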
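The synchronization is a single request. The request contains:
The MAX last_modified time stamp of your model objects. This tells the server you only want changes after this time stamp.
A JSON array containing all items with sync_status=1.
The server gets the request and does this:
It takes the contents from the JSON array and modifies or adds the records it contains. The last_modified field is automatically updated.
The server returns a JSON array containing all objects with a last_modified time stamp greater than the time stamp sent in the request. This will include the objects it just received, which serves as an acknowledgment that the record was successfully synchronized to the server.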
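The server side of that exchange could be sketched roughly like this. It is an in-memory stand-in, not a real web service: `SyncServer` and `handle_sync` are made-up names, and an integer counter is used in place of a real date/time for last_modified so the example stays deterministic.

```python
from typing import Any, Dict, List


class SyncServer:
    """In-memory stand-in for the server's table of records, keyed by guid."""

    def __init__(self) -> None:
        self._records: Dict[str, Dict[str, Any]] = {}
        self._clock = 0   # assumption: counter standing in for a timestamp

    def handle_sync(self, since: int, changed: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
        # 1. Add or modify the records the client sent. The server, never
        #    the client, decides the last_modified value.
        for item in changed:
            self._clock += 1
            record = dict(item)
            record["last_modified"] = self._clock
            self._records[record["guid"]] = record
        # 2. Return everything newer than the client's timestamp. This also
        #    echoes the records just received, which acts as the acknowledgment.
        newer = [dict(r) for r in self._records.values() if r["last_modified"] > since]
        return sorted(newer, key=lambda r: r["last_modified"])
```

Soft-deleted records (is_deleted set to 1) travel through the same path as any other change, which is why they must not be removed from the table.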
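The app receives the response and does this:
It takes the contents from the JSON array and modifies or adds the records it contains. Each record gets its sync_status set to 0.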
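On the client, building the request and applying the response might look something like the sketch below. The local store is modelled as a plain dict keyed by guid, `build_request` and `apply_response` are hypothetical helper names, and the request keys `since` and `changed` are assumptions for this example.

```python
from typing import Any, Dict, List

Store = Dict[str, Dict[str, Any]]   # local records keyed by guid


def build_request(store: Store) -> Dict[str, Any]:
    """Collect the MAX last_modified and every record with sync_status == 1."""
    since = max((r.get("last_modified") or 0 for r in store.values()), default=0)
    changed = [
        # sync_status exists only in the client model, so it is not sent.
        {k: v for k, v in r.items() if k != "sync_status"}
        for r in store.values()
        if r["sync_status"] == 1
    ]
    return {"since": since, "changed": changed}


def apply_response(store: Store, response: List[Dict[str, Any]]) -> None:
    """Add or overwrite local records from the response and mark them synced."""
    for item in response:
        record = dict(item)
        record["sync_status"] = 0
        store[record["guid"]] = record
```

A record only drops back to sync_status 0 once the server has echoed it, so a request that never reaches the server simply gets retried on the next sync.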
I used the words "record" and "model" interchangeably, but I think you get the idea.