所以我们在项目中有这个巨大的mainmodule.cpp源文件(11000行很大吗?),每次我不得不触摸它时,我都会畏缩。

由于这个文件是如此的核心和大,它不断积累越来越多的代码,我想不出一个好方法来让它实际上开始缩小。

该文件在我们产品的几个(> 10)维护版本中被使用和积极更改,因此很难重构它。如果我“简单地”将其拆分为3个文件,那么从维护版本合并回更改将成为一场噩梦。而且,如果您拆分具有如此长而丰富历史的文件,跟踪和检查SCC历史中的旧更改突然变得非常困难。

这个文件基本上包含了我们程序的“主类”(主要的内部工作调度和协调),所以每次添加一个特性,它也会影响这个文件,每次它的增长。:-(

在这种情况下你会怎么做?关于如何在不打乱SCC工作流程的情况下将新特性移动到单独的源文件中,您有什么想法吗?

(注意:我们使用c++和Visual Studio;我们使用AccuRev作为SCC,但我认为SCC的类型在这里并不重要;我们使用Araxis Merge来做实际的文件比较和合并)


当前回答

这个问题在“有效地使用遗留代码”(http://www.amazon.com/Working-Effectively-Legacy-Michael-Feathers/dp/0131177052)一书的其中一章中得到了解决。

其他回答

这让我想起了我以前的工作。似乎,在我加入之前,所有东西都在一个巨大的文件中(也是c++)。然后他们将其拆分(在完全随机的点上使用include)为大约三个(仍然是巨大的文件)。正如你所预料的那样,这个软件的质量非常糟糕。该项目总标线约为40k。(几乎没有注释,但有大量重复代码)

最后,我完全重写了这个项目。我从头开始重做项目中最糟糕的部分。当然,我想到了这个新部分和其他部分之间可能的(小)接口。然后我把这个部分插入到旧的项目中。我没有重构旧代码来创建必要的接口,只是替换了它。然后我从那里迈出了一小步,重写了旧代码。

我不得不说,这花了大约半年的时间,在此期间,除了修复错误之外,没有开发旧的代码库。


编辑:

它的大小保持在40k LOC左右,但与8年前的软件相比,新应用程序在初始版本中包含了更多的功能,可能bug也更少。重写的一个原因是我们需要新的特性,而在旧代码中引入这些特性几乎是不可能的。

该软件是为一个嵌入式系统,一个标签打印机。

我应该补充的另一点是,理论上这个项目是c++的。但它根本不是面向对象的,它可能是c。新版本是面向对象的。

我不知道这是否解决了您的问题,但我猜您想要做的是将文件的内容迁移到彼此独立的更小的文件中(合计)。 我还了解到,你有大约10个不同版本的软件,你需要在不搞砸的情况下支持它们。

首先,这是不可能的简单,将解决自己在几分钟的头脑风暴。文件中链接的函数对应用程序都非常重要,简单地将它们删除并迁移到其他文件中并不能解决问题。

我认为你只有这些选择:

Don't migrate and stay with what you have. Possibly quit your job and start working on serious software with good design in addition. Extreme programming is not always the best solution if you are working on a long time project with enough funds to survive a crash or two. Work out a layout of how you would love your file to look once it's split up. Create the necessary files and integrate them in your application. Rename the functions or overload them to take an additional parameter (maybe just a simple boolean?). Once you have to work on your code, migrate the functions you need to work on to the new file and map the function calls of the old functions to the new functions. You should still have your main-file this way, and still be able to see the changes that were made to it, once it comes to a specific function you know exactly when it was outsourced and so on. Try to convince your co-workers with some good cake that workflow is overrated and that you need to rewrite some parts of the application in order to do serious business.

所以从一开始重写产品代码的API是一个坏主意。需要做两件事。

首先,您需要让您的团队决定对该文件的当前生产版本进行代码冻结。

第二,您需要使用这个生产版本并创建一个分支,该分支使用预处理指令来管理构建,以分割大文件。使用JUST预处理器指令(#ifdefs, #includes, #endifs)拆分编译比重新编码API更容易。对于您的sla和持续的支持来说,这绝对更容易。

在这里,您可以简单地删除类中与特定子系统相关的函数,并将它们放在一个文件(例如mainloop_foostuff.cpp)中,并将其包含在mainloop.cpp中的正确位置。

OR

一种更耗时但健壮的方法是设计一个内部依赖关系结构,在包含内容的方式上具有双重间接性。这将允许您分割内容,并仍然照顾到共同依赖关系。注意,这种方法需要位置编码,因此应该加上适当的注释。

这种方法将包括基于您正在编译的变体而使用的组件。

基本结构是mainclass.cpp将在如下语句块后包含一个名为MainClassComponents.cpp的新文件:

#if VARIANT == 1
#  define Uses_Component_1
#  define Uses_Component_2
#elif VARIANT == 2
#  define Uses_Component_1
#  define Uses_Component_3
#  define Uses_Component_6
...

#endif

#include "MainClassComponents.cpp"

MainClassComponents.cpp文件的主要结构将在那里计算子组件中的依赖关系,如下所示:

#ifndef _MainClassComponents_cpp
#define _MainClassComponents_cpp

/* dependencies declarations */

#if defined(Activate_Component_1) 
#define _REQUIRES_COMPONENT_1
#define _REQUIRES_COMPONENT_3 /* you also need component 3 for component 1 */
#endif

#if defined(Activate_Component_2)
#define _REQUIRES_COMPONENT_2
#define _REQUIRES_COMPONENT_15 /* you also need component 15 for this component  */
#endif

/* later on in the header */

#ifdef _REQUIRES_COMPONENT_1
#include "component_1.cpp"
#endif

#ifdef _REQUIRES_COMPONENT_2
#include "component_2.cpp"
#endif

#ifdef _REQUIRES_COMPONENT_3
#include "component_3.cpp"
#endif


#endif /* _MainClassComponents_h  */

现在,为每个组件创建一个component_xx.cpp文件。

当然,我使用数字,但你应该使用一些更符合逻辑的基于你的代码。

使用预处理器可以让你把事情分开,而不必担心API的变化,这在生产中是一个噩梦。

一旦你确定了产品,你就可以开始重新设计了。

Find some code in the file which is relatively stable (not changing fast, and doesn't vary much between branches) and could stand as an independent unit. Move this into its own file, and for that matter into its own class, in all branches. Because it's stable, this won't cause (many) "awkward" merges that have to be applied to a different file from the one they were originally made on, when you merge the change from one branch to another. Repeat. Find some code in the file which basically only applies to a small number of branches, and could stand alone. Doesn't matter whether it's changing fast or not, because of the small number of branches. Move this into its own classes and files. Repeat.

因此,我们去掉了到处都一样的代码,以及特定于某些分支的代码。

This leaves you with a nucleus of badly-managed code - it's needed everywhere, but it's different in every branch (and/or it changes constantly so that some branches are running behind others), and yet it's in a single file that you're unsuccessfully trying to merge between branches. Stop doing that. Branch the file permanently, perhaps by renaming it in each branch. It's not "main" any more, it's "main for configuration X". OK, so you lose the ability to apply the same change to multiple branches by merging, but this is in any case the core of code where merging doesn't work very well. If you're having to manually manage the merges anyway to deal with conflicts, then it's no loss to manually apply them independently on each branch.

我认为你说这种SCC无关紧要是错误的,因为例如git的合并能力可能比你正在使用的合并工具更好。因此,核心问题“合并困难”发生在不同scc的不同时期。但是,您不太可能更改scc,因此这个问题可能无关紧要。

让我猜猜:10个拥有不同功能集的客户和一个提倡“定制化”的销售经理?我以前做过这样的产品。我们遇到了同样的问题。

您认识到拥有一个巨大的文件是很麻烦的,但更麻烦的是您必须保持10个版本的“最新”。这是多重维护。SCC可以使这更容易,但它不能使它正确。

Before you try to break the file into parts, you need to bring the ten branches back in sync with each other so that you can see and shape all the code at once. You can do this one branch at a time, testing both branches against the same main code file. To enforce the custom behavior, you can use #ifdef and friends, but it's better as much as possible to use ordinary if/else against defined constants. This way, your compiler will verify all types and most probably eliminate "dead" object code anyway. (You may want to turn off the warning about dead code, though.)

一旦所有分支隐式地共享了该文件的一个版本,那么就更容易开始使用传统的重构方法。

#ifdefs主要适用于受影响的代码只在其他分支自定义上下文中有意义的部分。有人可能会说,这也为相同的分支合并方案提供了机会,但不要太疯狂。一次只做一个大项目。

In the short run, the file will appear to grow. This is OK. What you're doing is bringing things together that need to be together. Afterwards, you'll begin to see areas that are clearly the same regardless of version; these can be left alone or refactored at will. Other areas will clearly differ depending on the version. You have a number of options in this case. One method is to delegate the differences to per-version strategy objects. Another is to derive client versions from a common abstract class. But none of these transformations are possible as long as you have ten "tips" of development in different branches.