How do I put a database under git (version control)?

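We used to run a social website on a standard LAMP configuration. We had a live server, a test server, and a development server, as well as the local developer machines. All of these were managed with Git.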

On each machine we had the PHP files, the MySQL service, and a folder of images uploaded by users. The live server grew to some 100K (!) recurring users, the dump was about 2 GB (!), and the image folder was some 50 GB (!). By the time I left, our server was reaching the limits of its CPU, RAM, and, most of all, its concurrent network connections (we even compiled our own version of the network card driver to max out the server, lol). We could not put 2 GB of data and 50 GB of images in Git, and you should not assume you can with your website either.

To manage all this under Git easily, we ignored the binary folders (the folders containing the images) by adding their paths to .gitignore. We also had a folder called SQL outside the Apache document root. In that SQL folder we put the developers' SQL files with incremental numbering (001.florianm.sql, 001.johns.sql, 002.florianm.sql, etc.). These SQL files were managed by Git as well. The first SQL file contained a large part of the DB schema. We did not add user data to Git (e.g. the records of the users table or the comments table), but data such as configs, topology, or other site-specific data was maintained in the SQL files (and hence by Git). Mostly it is the developers (who know the code best) who determine what is and is not maintained by Git with regard to SQL schema and data.
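The numbering scheme above implies an apply order. As a minimal sketch in Python (not the scripts we actually used), the files could be sorted like this. The exact pattern `<number>.<developer>.sql` and the tie-break by developer name for files that share a number are assumptions, and the names `migration_key` and `ordered_migrations` are hypothetical.

```python
import re
from pathlib import PurePath

# Assumed naming pattern: <number>.<developer>.sql, as in 001.florianm.sql
_NAME = re.compile(r"^(\d+)\.([^.]+)\.sql$")


def migration_key(filename: str) -> tuple[int, str]:
    """Return (sequence number, developer) so files sort in apply order."""
    match = _NAME.match(PurePath(filename).name)
    if match is None:
        raise ValueError(f"not a migration file name: {filename!r}")
    return int(match.group(1)), match.group(2)


def ordered_migrations(filenames: list[str]) -> list[str]:
    """Sort by number first, then by developer name (assumed tie-break)."""
    return sorted(filenames, key=migration_key)


if __name__ == "__main__":
    files = ["002.florianm.sql", "001.johns.sql", "001.florianm.sql"]
    for name in ordered_migrations(files):
        print(name)
```

Parsing the number as an integer rather than comparing strings keeps the order right even if someone forgets the zero padding (9 before 10).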

When it came to a release, the administrator logged in to the dev server, merged the live branch with all developer branches and other needed branches on the dev machine into an update branch, and pushed it to the test server. On the test server, he checked whether the update process for the live server was still valid and then, in quick succession, pointed all traffic in Apache to a placeholder site, created a DB dump, switched the working directory from 'live' to 'update', executed all new SQL files in MySQL, and pointed the traffic back to the correct site. When all stakeholders agreed after reviewing the test server, the administrator did the same thing from the test server to the live server. Afterwards, he merged the live branch on the production server into the master branch across all servers and rebased all live branches. The developers were responsible for rebasing their own branches, but they generally knew what they were doing.

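If there were problems on the test server, e.g. the merges had too many conflicts, the code was reverted (by pointing the working branch back to 'live') and the SQL files were never executed. Once the SQL files had been executed, this was considered a non-reversible action at the time. If the SQL files did not work properly, the DB was restored from the dump (and the developers were told off for providing ill-tested SQL files).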
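The dump, apply, and restore-on-failure sequence described above can be sketched in Python. This is only an illustration under loud assumptions: it uses the standard-library `sqlite3` module and an in-memory copy as a stand-in for MySQL and a dump file, and the function name `apply_with_dump` is hypothetical.

```python
import sqlite3


def apply_with_dump(conn: sqlite3.Connection, scripts: list[str]) -> bool:
    """Copy the database, run every script, restore the copy on failure.

    Stand-in for "create a DB dump, execute all new SQL files, restore
    from the dump if they do not work". SQLite is an assumption here;
    the original setup used MySQL.
    """
    dump = sqlite3.connect(":memory:")
    conn.backup(dump)  # full copy taken before any script runs
    try:
        for script in scripts:
            conn.executescript(script)
        return True
    except sqlite3.Error:
        conn.rollback()    # close any transaction left open by the script
        dump.backup(conn)  # overwrite the database with the earlier copy
        return False
    finally:
        dump.close()
```

Restoring the whole copy, rather than trying to undo individual statements, mirrors the point that executing the SQL files was treated as a one-way step.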

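Today, we maintain both a sql-up and a sql-down folder with matching file names, and the developers have to test that every upgrading SQL file can equally be downgraded. This could ultimately be executed with a bash script, but it is a good idea to have human eyes monitoring the upgrade process.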
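A small check that every file in sql-up has a counterpart in sql-down (and vice versa) could look like the following Python sketch. It only compares file names and does not verify that a downgrade actually reverses its upgrade; the function names are hypothetical.

```python
from pathlib import Path


def sql_names(folder: Path) -> set[str]:
    """Collect the names of all .sql files directly inside a folder."""
    return {path.name for path in folder.glob("*.sql")}


def unpaired(up_names: set[str], down_names: set[str]) -> tuple[set[str], set[str]]:
    """Return (upgrades without a downgrade, downgrades without an upgrade)."""
    return up_names - down_names, down_names - up_names


if __name__ == "__main__":
    missing_down, missing_up = unpaired(
        sql_names(Path("sql-up")), sql_names(Path("sql-down"))
    )
    for name in sorted(missing_down):
        print(f"no downgrade for {name}")
    for name in sorted(missing_up):
        print(f"no upgrade for {name}")
```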

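It's not great, but it's manageable. I hope this gives some insight into a real-life, practical, relatively high-availability site. It may be a bit outdated, but it is still followed.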

2017-06-07 20:21:54

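Use a tool like iBatis Migrations (manual, short tutorial video), which lets you version control the changes you make to a database throughout the lifecycle of a project, rather than the database itself.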

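This allows you to selectively apply individual changes to different environments, keep a changelog of which changes are in which environments, create scripts to apply changes A through N, roll back changes, and so on.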
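To make the changelog idea concrete, here is a minimal Python sketch. It is not the iBatis Migrations API; it only illustrates the general technique of recording applied change IDs in a table inside each environment's database. The `changelog` table, the function names, and the use of `sqlite3` are all assumptions for illustration.

```python
import sqlite3


def pending_changes(conn: sqlite3.Connection, changes: dict[str, str]) -> list[str]:
    """Return IDs of changes not yet recorded in this environment, in order."""
    conn.execute("CREATE TABLE IF NOT EXISTS changelog (id TEXT PRIMARY KEY)")
    applied = {row[0] for row in conn.execute("SELECT id FROM changelog")}
    return sorted(change_id for change_id in changes if change_id not in applied)


def apply_pending(conn: sqlite3.Connection, changes: dict[str, str]) -> list[str]:
    """Run each pending change script and record its ID in the changelog."""
    done = []
    for change_id in pending_changes(conn, changes):
        conn.executescript(changes[change_id])
        conn.execute("INSERT INTO changelog (id) VALUES (?)", (change_id,))
        conn.commit()
        done.append(change_id)
    return done
```

Because the changelog lives in each database, running `apply_pending` against dev, test, and live applies only what that particular environment is missing.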

2010-08-11 01:16:14

You can't do this without atomicity, and you can't get atomicity without either using pg_dump or a snapshotting filesystem.

My postgres instance is on zfs, which I snapshot occasionally. It's nearly instant and consistent.

2009-05-11 04:47:57

This question is pretty much answered, but I would like to complement X-Istence's and Dana the Sane's answers with a small suggestion.

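If you need revision control with some degree of granularity, say daily, you could couple a text dump of both the tables and the schema with a tool like rdiff-backup, which does incremental backups. The advantage is that instead of storing snapshots of daily backups, you simply store the differences from the previous day.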

This way you get the advantage of revision control without wasting too much space.
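The space saving comes from keeping a delta between consecutive text dumps instead of a second full copy. The Python sketch below illustrates that idea with the standard-library `difflib`; it is not how rdiff-backup works internally, and the function name and file labels are hypothetical.

```python
import difflib


def dump_delta(previous: str, current: str) -> list[str]:
    """Return a unified diff between two text dumps of the same database."""
    return list(
        difflib.unified_diff(
            previous.splitlines(),
            current.splitlines(),
            fromfile="yesterday.sql",  # hypothetical label
            tofile="today.sql",        # hypothetical label
            lineterm="",
        )
    )


if __name__ == "__main__":
    yesterday = "CREATE TABLE t (id INT);\nINSERT INTO t VALUES (1);"
    today = yesterday + "\nINSERT INTO t VALUES (2);"
    print("\n".join(dump_delta(yesterday, today)))
```

A dump that has not changed yields an empty delta, so a quiet day costs almost nothing to store.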

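In any case, using git directly on big flat files that change very frequently is not a good solution. If your database becomes too big, git will start to have problems managing the files.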

2010-07-26 15:00:13

There is a great project called Migrations under Doctrine that was built just for this purpose.

It's still in alpha state and built for PHP.

http://docs.doctrine-project.org/projects/doctrine-migrations/en/latest/index.html

2011-02-25 11:19:38
