我一直在研究一种方法,在多个设备(如iPad或Mac)之间同步存储在iPhone应用程序中的核心数据。在iOS上,用于core data的同步框架并不多(如果有的话)。然而,我一直在思考以下概念:

A change is made to the local core data store, and the change is saved. (a) If the device is online, it tries to send the changeset to the server, including the device ID of the device which sent the changeset. (b) If the changeset does not reach the server, or if the device is not online, the app will add the change set to a queue to send when it does come online. The server, sitting in the cloud, merges the specific change sets it receives with its master database. After a change set (or a queue of change sets) is merged on the cloud server, the server pushes all of those change sets to the other devices registered with the server using some sort of polling system. (I thought to use Apple's Push services, but apparently according to the comments this is not a workable system.)

有什么特别的需要我考虑的吗?我已经研究了REST框架,如ObjectiveResource、Core Resource和RestfulCoreData。当然,这些都是与Ruby on Rails一起工作的,我并不依赖于Ruby on Rails,但这是一个起点。我的解决方案的主要要求是:

任何更改都应该在后台发送,而不需要暂停主线程。 它应该使用尽可能少的带宽。

我想过一些挑战:

Making sure that the object IDs for the different data stores on different devices are attached on the server. That is to say, I will have a table of object IDs and device IDs, which are tied via a reference to the object stored in the database. I will have a record (DatabaseId [unique to this table], ObjectId [unique to the item in the whole database], Datafield1, Datafield2), the ObjectId field will reference another table, AllObjects: (ObjectId, DeviceId, DeviceObjectId). Then, when the device pushes up a change set, it will pass along the device Id and the objectId from the core data object in the local data store. Then my cloud server will check against the objectId and device Id in the AllObjects table, and find the record to change in the initial table. All changes should be timestamped, so that they can be merged. The device will have to poll the server, without using up too much battery. The local devices will also need to update anything held in memory if/when changes are received from the server.

我还遗漏了什么吗?我应该考虑什么样的框架来实现这一点?


当前回答

First you should rethink how many data, tables and relations you will have. In my solution I’ve implemented syncing through Dropbox files. I observe changes in main MOC and save these data to files (each row is saved as gzipped json). If there is an internet connection working, I check if there are any changes on Dropbox (Dropbox gives me delta changes), download them and merge (latest wins), and finally put changed files. Before sync I put lock file on Dropbox to prevent other clients syncing incomplete data. When downloading changes it’s safe that only partial data is downloaded (eg lost internet connection). When downloading is finished (fully or partial) it starts to load files into Core Data. When there are unresolved relations (not all files are downloaded) it stops loading files and tries to finish downloading later. Relations are stored only as GUID, so I can easly check which files to load to have full data integrity. Syncing is starting after changes to core data are made. If there are no changes, than it checks for changes on Dropbox every few minutes and on app startup. Additionaly when changes are sent to server I send a broadcast to other devices to inform them about changes, so they can sync faster. Each synced entity has GUID property (guid is used also as a filename for exchange files). I have also Sync database where I store Dropbox revision of each file (I can compare it when Dropbox delta resets it’s state). Files also contain entity name, state (deleted/not deleted), guid (same as filename), database revision (to detect data migrations or to avoid syncing with never app versions) and of course the data (if row is not deleted).

这个解决方案适用于数千个文件和大约30个实体。而不是Dropbox,我可以使用键/值存储作为REST web服务,我想稍后做,但没有时间做这个:)目前,在我看来,我的解决方案比iCloud更可靠,这是非常重要的,我可以完全控制它的工作方式(主要是因为它是我自己的代码)。

另一种解决方案是将MOC更改保存为事务-与服务器交换的文件会少得多,但很难按适当的顺序将初始加载到空的核心数据中。iCloud就是这样工作的,其他同步解决方案也有类似的方法,例如TICoreDataSync。

-- 更新

过了一段时间,我迁移到Ensembles——我推荐这个解决方案,而不是重新发明轮子。

其他回答

类似于@Cris,我已经实现了客户端和服务器之间的同步类,并解决了迄今为止所有已知的问题(向服务器发送/接收数据,基于时间戳合并冲突,在不可靠的网络条件下删除重复条目,同步嵌套数据和文件等。)

您只需告诉类应该同步哪个实体和哪些列,以及服务器在哪里。

M3Synchronization * syncEntity = [[M3Synchronization alloc] initForClass: @"Car"
                                                              andContext: context
                                                            andServerUrl: kWebsiteUrl
                                             andServerReceiverScriptName: kServerReceiverScript
                                              andServerFetcherScriptName: kServerFetcherScript
                                                    ansSyncedTableFields:@[@"licenceNumber", @"manufacturer", @"model"]
                                                    andUniqueTableFields:@[@"licenceNumber"]];


syncEntity.delegate = self; // delegate should implement onComplete and onError methods
syncEntity.additionalPostParamsDictionary = ... // add some POST params to authenticate current user

[syncEntity sync];

你可以在这里找到源代码,工作示例和更多说明:github.com/knagode/M3Synchronization。

通过推送通知通知用户更新数据。 在应用程序中使用后台线程检查本地数据和云服务器上的数据,当服务器上发生变化时,更改本地数据,反之亦然。

所以我认为最困难的部分是估计哪一方的数据是无效的。

希望这能对你有所帮助

我建议仔细阅读并实施Dan Grover在iPhone 2009会议上讨论的同步策略,这里有一个pdf文档。

这是一个可行的解决方案,实现起来并不难(Dan在它的几个应用程序中实现了它),与Chris描述的解决方案重叠。关于同步的深入理论讨论,请参阅Russ Cox(麻省理工学院)和William Josephson(普林斯顿大学)的论文:

矢量时间对文件同步

它同样适用于经过一些明显修改的核心数据。这提供了一个整体上更加健壮和可靠的同步策略,但是需要更多的努力才能正确实现。

编辑:

格罗弗的pdf文件似乎不再可用(断开链接,2015年3月)。更新:链接可通过回头机在这里

由Marcus Zarra开发的名为ZSync的Objective-C框架已经被弃用,因为iCloud似乎终于支持正确的核心数据同步了。

First you should rethink how many data, tables and relations you will have. In my solution I’ve implemented syncing through Dropbox files. I observe changes in main MOC and save these data to files (each row is saved as gzipped json). If there is an internet connection working, I check if there are any changes on Dropbox (Dropbox gives me delta changes), download them and merge (latest wins), and finally put changed files. Before sync I put lock file on Dropbox to prevent other clients syncing incomplete data. When downloading changes it’s safe that only partial data is downloaded (eg lost internet connection). When downloading is finished (fully or partial) it starts to load files into Core Data. When there are unresolved relations (not all files are downloaded) it stops loading files and tries to finish downloading later. Relations are stored only as GUID, so I can easly check which files to load to have full data integrity. Syncing is starting after changes to core data are made. If there are no changes, than it checks for changes on Dropbox every few minutes and on app startup. Additionaly when changes are sent to server I send a broadcast to other devices to inform them about changes, so they can sync faster. Each synced entity has GUID property (guid is used also as a filename for exchange files). I have also Sync database where I store Dropbox revision of each file (I can compare it when Dropbox delta resets it’s state). Files also contain entity name, state (deleted/not deleted), guid (same as filename), database revision (to detect data migrations or to avoid syncing with never app versions) and of course the data (if row is not deleted).

这个解决方案适用于数千个文件和大约30个实体。而不是Dropbox,我可以使用键/值存储作为REST web服务,我想稍后做,但没有时间做这个:)目前,在我看来,我的解决方案比iCloud更可靠,这是非常重要的,我可以完全控制它的工作方式(主要是因为它是我自己的代码)。

另一种解决方案是将MOC更改保存为事务-与服务器交换的文件会少得多,但很难按适当的顺序将初始加载到空的核心数据中。iCloud就是这样工作的,其他同步解决方案也有类似的方法,例如TICoreDataSync。

-- 更新

过了一段时间,我迁移到Ensembles——我推荐这个解决方案,而不是重新发明轮子。

我做过和你想做的类似的事。让我告诉你我学到了什么以及我是怎么做的。

我假设您的Core Data对象和服务器上的模型(或db模式)之间有一对一的关系。您只是想让服务器内容与客户机保持同步,但是客户机也可以修改和添加数据。如果我没猜错,那就继续读下去。

我添加了四个字段来帮助同步:

sync_status - Add this field to your core data model only. It's used by the app to determine if you have a pending change on the item. I use the following codes: 0 means no changes, 1 means it's queued to be synchronized to the server, and 2 means it's a temporary object and can be purged. is_deleted - Add this to the server and core data model. Delete event shouldn't actually delete a row from the database or from your client model because it leaves you with nothing to synchronize back. By having this simple boolean flag, you can set is_deleted to 1, synchronize it, and everyone will be happy. You must also modify the code on the server and client to query non deleted items with "is_deleted=0". last_modified - Add this to the server and core data model. This field should automatically be updated with the current date and time by the server whenever anything changes on that record. It should never be modified by the client. guid - Add a globally unique id (see http://en.wikipedia.org/wiki/Globally_unique_identifier) field to the server and core data model. This field becomes the primary key and becomes important when creating new records on the client. Normally your primary key is an incrementing integer on the server, but we have to keep in mind that content could be created offline and synchronized later. The GUID allows us to create a key while being offline.

在客户机上,添加代码,在您的模型对象上设置sync_status为1,当某些内容发生变化并需要同步到服务器时。新的模型对象必须生成一个GUID。

同步是一个单独的请求。该请求包括:

模型对象的MAX last_modified时间戳。这告诉服务器您只希望在此时间戳之后进行更改。 包含sync_status=1的所有项的JSON数组。

服务器获得请求并执行以下操作:

它从JSON数组中获取内容,并修改或添加其中包含的记录。last_modified字段会自动更新。 服务器返回一个JSON数组,其中包含所有last_modified时间戳大于请求中发送的时间戳的对象。这将包括它刚刚接收到的对象,这将作为记录成功同步到服务器的确认。

应用程序接收响应并执行以下操作:

它从JSON数组中获取内容,并修改或添加其中包含的记录。每条记录的sync_status设置为0。

我交替使用了“记录”和“模型”这两个词,但我认为您已经明白了这个意思。