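Template Haskell often seems to be viewed by the Haskell community as an unfortunate convenience. It's hard to put into words exactly what I have observed in this regard, but consider these few examples:

- Template Haskell is listed under "The Ugly (but necessary)" in response to the question "Which Haskell (GHC) extensions should users use/avoid?"
- Template Haskell is considered a temporary/inferior solution in the "Unboxed Vectors of newtype'd values" thread (libraries mailing list)
- Yesod is often criticized for relying too much on Template Haskell (see the blog post written in response to this sentiment)

I've seen various blog posts where people do pretty neat things with Template Haskell, enabling prettier syntax that simply wouldn't be possible in regular Haskell, as well as tremendous boilerplate reduction. So why is Template Haskell looked down upon in this way? What makes it undesirable? Under what circumstances should Template Haskell be avoided, and why?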


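Current answer

I'd like to address a few of the points dflemstr brings up.

I don't find the fact that you can't typecheck TH to be that worrying. Why? Because even if there is an error, it will still be caught at compile time. I'm not sure if this strengthens my argument, but it is similar in spirit to the errors you receive when using templates in C++. I think these errors are more understandable than C++'s errors, though, as you'll get a pretty-printed version of the generated code.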
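As a minimal sketch of the pretty-printing mentioned above: `runQ` and `pprint` from the template-haskell package turn a quoted expression back into source text. The quoted lambda is an arbitrary illustration, not something taken from this discussion.

```haskell
{-# LANGUAGE TemplateHaskell #-}
module Main where

import Language.Haskell.TH (Exp, Q, pprint, runQ)

-- A quoted expression; the quote builds an AST value, it does not run anything.
sample :: Q Exp
sample = [| \x -> (x, not x) |]

main :: IO ()
main = do
  -- runQ works in IO here because the quote needs no compiler state (no reify).
  expr <- runQ sample
  -- pprint renders the AST as Haskell source, with qualified and renamed identifiers.
  putStrLn (pprint expr)
```

For real splices, compiling with GHC's `-ddump-splices` flag prints the code each splice expanded to, which gives a similar view.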

If a TH expression / quasi-quoter does something so advanced that tricky corners can hide, then perhaps it's ill-advised?

I break this rule quite a bit with quasi-quoters I've been working on lately (using haskell-src-exts / meta) - https://github.com/mgsloan/quasi-extras/tree/master/examples . I know this introduces some bugs such as not being able to splice in the generalized list comprehensions. However, I think that there's a good chance that some of the ideas in http://hackage.haskell.org/trac/ghc/blog/Template%20Haskell%20Proposal will end up in the compiler. Until then, the libraries for parsing Haskell to TH trees are a nearly perfect approximation.

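Regarding compilation speed / dependencies, we can use the "zeroth" package to inline the generated code. This is at least nice for the users of a given library, but we can't do much better for the case of editing the library. Can TH dependencies bloat generated binaries? I thought it left out everything that's not referenced by the compiled code.

The staging restriction / splitting of compilation steps of the Haskell module does suck.

RE Opacity: This is the same for any library function you call. You have no control over what Data.List.groupBy will do. You just have a reasonable "guarantee" / convention that the version numbers tell you something about compatibility. That said, generated code is a somewhat different matter.

This is where using zeroth pays off: you're already versioning the generated files, so you'll always know when the form of the generated code has changed. Looking at the diffs might be a bit gnarly for large amounts of generated code, though, so that's one place where a better developer interface would be handy.

RE Monolithism: You can certainly post-process the results of a TH expression using your own compile-time code. It wouldn't be very much code to filter on top-level declaration type / name. Heck, you could imagine writing a function that does this generically. For modifying / de-monolithisizing quasiquoters, you can pattern match on "QuasiQuoter" and extract the transformations used, or make a new one in terms of the old.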
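The post-processing idea can be sketched roughly as follows. Everything here is hypothetical: `generated` stands in for some library's monolithic generator, and `keepNamed` and `decName` are illustrative helpers that only handle signatures and simple bindings.

```haskell
{-# LANGUAGE TemplateHaskell #-}
module Main where

import Language.Haskell.TH

-- Stand-in for a monolithic generator that emits two unrelated interfaces.
generated :: Q [Dec]
generated =
  [d|
    toDb :: Int -> String
    toDb = show

    toJson :: Int -> String
    toJson n = "{" ++ show n ++ "}"
  |]

-- Name introduced by a declaration, for the few declaration forms handled here.
decName :: Dec -> Maybe String
decName (SigD n _)          = Just (nameBase n)
decName (FunD n _)          = Just (nameBase n)
decName (ValD (VarP n) _ _) = Just (nameBase n)
decName _                   = Nothing

-- Keep only the declarations whose name is in the wanted list.
keepNamed :: [String] -> Q [Dec] -> Q [Dec]
keepNamed wanted gen = filter keep <$> gen
  where
    keep d = maybe False (`elem` wanted) (decName d)

main :: IO ()
main = do
  decs <- runQ (keepNamed ["toDb"] generated)
  putStrLn (pprint decs)
```

In a real project the filtered generator would be spliced at the top level of another module, for example `keepNamed ["toDb"] generated`, since a splice cannot call functions defined in its own module.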

Other answers

One reason for avoiding Template Haskell is that, as a whole, it isn't type-safe at all, thus going against much of "the spirit of Haskell." Here are some examples of this:

- You have no control over what kind of Haskell AST a piece of TH code will generate, beyond where it will appear; you can have a value of type Exp, but you don't know if it is an expression that represents a [Char] or a (a -> (forall b . b -> c)) or whatever. TH would be more reliable if one could express that a function may only generate expressions of a certain type, or only function declarations, or only data-constructor-matching patterns, etc.
- You can generate expressions that don't compile. You generated an expression that references a free variable foo that doesn't exist? Tough luck, you'll only see that when actually using your code generator, and only under the circumstances that trigger the generation of that particular code. It is very difficult to unit test, too.
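A small sketch of both points, under the assumption that the two generators below are purely illustrative: each builds a perfectly valid `Exp` value, yet neither would compile if it were spliced into a program.

```haskell
{-# LANGUAGE TemplateHaskell #-}
module Main where

import Language.Haskell.TH

-- Ill-typed on purpose: represents  True + "x".
-- The type Q Exp says nothing about the type of the expression inside.
illTyped :: Q Exp
illTyped =
  pure (InfixE (Just (ConE 'True)) (VarE '(+)) (Just (LitE (StringL "x"))))

-- Refers to a variable named foo that is not defined anywhere.
dangling :: Q Exp
dangling = pure (VarE (mkName "foo"))

main :: IO ()
main = do
  -- Constructing and printing both ASTs succeeds; the problems would only
  -- surface when a client module splices them.
  e1 <- runQ illTyped
  e2 <- runQ dangling
  putStrLn (pprint e1)
  putStrLn (pprint e2)
```

Later GHC versions added typed quotations (`[|| ... ||]`), which address part of this complaint for expressions, though not for declarations or patterns.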

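TH is also outright dangerous:

- Code that runs at compile time can do arbitrary IO, including launching missiles or stealing your credit card. You don't want to have to look through every cabal package you ever download in search of TH exploits.
- TH can access "module-private" functions and definitions, completely breaking encapsulation in some cases.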
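A harmless stand-in for "arbitrary IO" might look like the sketch below, which reads an environment variable on the build machine while GHC compiles the module. The name `buildUser` and the choice of the `USER` variable are assumptions made for illustration; `runIO` accepts any `IO` action in the same position.

```haskell
{-# LANGUAGE TemplateHaskell #-}
module Main where

import Data.Maybe (fromMaybe)
import Language.Haskell.TH (runIO, stringE)
import System.Environment (lookupEnv)

-- The IO action passed to runIO executes at compile time, on the machine
-- doing the build. Its result is baked into the binary as a string literal.
buildUser :: String
buildUser = $(runIO (lookupEnv "USER") >>= stringE . fromMaybe "<unknown>")

main :: IO ()
main = putStrLn ("compiled by: " ++ buildUser)
```

Nothing in the type of `buildUser` reveals that IO happened during compilation, which is the point being made above.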

Then there are some problems that make TH functions less fun to use as a library developer:

- TH code isn't always composable. Let's say someone makes a generator for lenses, and more often than not, that generator will be structured in such a way that it can only be called directly by the "end-user," and not by other TH code, by for example taking a list of type constructors to generate lenses for as the parameter. It is tricky to generate that list in code, while the user only has to write generateLenses [''Foo, ''Bar].
- Developers don't even know that TH code can be composed. Did you know that you can write forM_ [''Foo, ''Bar] generateLens? Q is just a monad, so you can use all of the usual functions on it. Some people don't know this, and because of that, they create multiple overloaded versions of essentially the same functions with the same functionality, and these functions lead to a certain bloat effect. Also, most people write their generators in the Q monad even when they don't have to, which is like writing bla :: IO Int; bla = return 3; you are giving a function more "environment" than it needs, and clients of the function are required to provide that environment as an effect of that.
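Both points in the second item can be sketched with a toy generator. All names here (`defaultDec`, `generateDefault`, `generateDefaults`) are hypothetical. One caveat on the `forM_` example: if a generator returns `Q [Dec]`, `forM_` throws the declarations away, so this sketch collects them with `mapM` and `concat` instead.

```haskell
module Main where

import Language.Haskell.TH

-- The pure core of the generator: builds  <base>Default = 0.
-- It needs nothing from Q, so it does not live in Q.
defaultDec :: String -> Dec
defaultDec base =
  ValD (VarP (mkName (base ++ "Default"))) (NormalB (LitE (IntegerL 0))) []

-- Thin Q wrapper, present only because top-level splices expect Q [Dec].
generateDefault :: String -> Q [Dec]
generateDefault base = pure [defaultDec base]

-- Composition with ordinary monadic combinators: Q is just a monad.
generateDefaults :: [String] -> Q [Dec]
generateDefaults = fmap concat . mapM generateDefault

main :: IO ()
main = do
  decs <- runQ (generateDefaults ["foo", "bar"])
  putStrLn (pprint decs)
```

Keeping `defaultDec` pure also makes it easy to test with plain equality on `Dec` values, without involving a compiler run.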

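Lastly, there are some things that make TH functions less fun to use as an end-user:

- Opacity. When a TH function has type Q Dec, it can generate absolutely anything at the top-level of a module, and you have absolutely no control over what will be generated.
- Monolithism. You can't control how much a TH function generates unless the developer allows it; if you find a function that generates a database interface and a JSON serialization interface, you can't say "No, I only want the database interface, thanks; I'll roll my own JSON interface".
- Run time. TH code takes a relatively long time to run. The code is interpreted anew every time a file is compiled, and often, a ton of packages are required by the running TH code, that have to be loaded. This slows down compile time considerably.

What's so bad about TH? For me, it comes down to this:

If you need to produce so much repetitive code that you find yourself trying to use TH to auto-generate it, you're doing it wrong!

Think about it. Half the appeal of Haskell is that its high-level design lets you avoid the huge amounts of useless boilerplate code you have to write in other languages. If you need compile-time code generation, you're basically saying that either your language or your application design has failed you. And we programmers don't like to fail.

Sometimes, of course, it's necessary. But sometimes you can avoid needing TH by being a bit more clever with your design.

(The other thing is that TH is quite low-level. There's no grand high-level design; a lot of GHC's internal implementation details are exposed. And that makes the API prone to change...)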
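To illustrate what "prone to change" means in practice, here is a sketch of the kind of CPP guard TH code sometimes needs. It relies on one concrete change: the `DataD` constructor gained a `Maybe Kind` field in template-haskell 2.11, which shipped with GHC 8.0. The helpers `conNames` and `dataConNames` are hypothetical.

```haskell
{-# LANGUAGE CPP, TemplateHaskell #-}
module Main where

import Language.Haskell.TH

-- Constructor names for the classic constructor forms.
conNames :: Con -> [String]
conNames (NormalC n _)   = [nameBase n]
conNames (RecC n _)      = [nameBase n]
conNames (InfixC _ n _)  = [nameBase n]
conNames (ForallC _ _ c) = conNames c
conNames _               = []  -- GADT-style constructors are left out of this sketch

-- The shape of DataD depends on the template-haskell version in use.
dataConNames :: Dec -> [String]
#if __GLASGOW_HASKELL__ >= 800
dataConNames (DataD _ _ _ _ cons _) = concatMap conNames cons
#else
dataConNames (DataD _ _ _ cons _)   = concatMap conNames cons
#endif
dataConNames _ = []

main :: IO ()
main = do
  decs <- runQ [d| data Colour = Red | Green | Blue |]
  print (concatMap dataConNames decs)
```

Each new field or constructor added to the TH datatypes can require another branch like this in code that pattern matches on them.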

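A rather pragmatic problem with Template Haskell is that it only works when GHC's bytecode interpreter is available, which is not the case on all architectures. So if your program uses Template Haskell or relies on libraries that use it, it will not run on machines with an ARM, MIPS, S390 or PowerPC CPU.

This is relevant in practice: git-annex is a tool written in Haskell that makes sense to run on machines concerned with storage, and such machines often have non-i386 CPUs. Personally, I run git-annex on an NSLU 2 (32 MB RAM, 266MHz CPU; did you know Haskell works fine on such hardware?). If it used Template Haskell, this would not be possible.

(The situation with GHC on ARM has been improving a lot these days, and I think 7.4.2 even works, but the point still stands.)

This is solely my own opinion.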

- It's ugly to use. $(fooBar ''Asdf) just does not look nice. Superficial, sure, but it contributes.
- It's even uglier to write. Quoting works sometimes, but a lot of the time you have to do manual AST grafting and plumbing. The API is big and unwieldy, there's always a lot of cases you don't care about but still need to dispatch, and the cases you do care about tend to be present in multiple similar but not identical forms (data vs. newtype, record-style vs. normal constructors, and so on). It's boring and repetitive to write and complicated enough to not be mechanical. The reform proposal addresses some of this (making quotes more widely applicable).
- The stage restriction is hell. Not being able to splice functions defined in the same module is the smaller part of it: the other consequence is that if you have a top-level splice, everything after it in the module will be out of scope to anything before it. Other languages with this property (C, C++) make it workable by allowing you to forward declare things, but Haskell doesn't. If you need cyclic references between spliced declarations or their dependencies and dependents, you're usually just screwed.
- It's undisciplined. What I mean by this is that most of the time when you express an abstraction, there is some kind of principle or concept behind that abstraction. For many abstractions, the principle behind them can be expressed in their types. For type classes, you can often formulate laws which instances should obey and clients can assume. If you use GHC's new generics feature to abstract the form of an instance declaration over any datatype (within bounds), you get to say "for sum types, it works like this, for product types, it works like that". Template Haskell, on the other hand, is just macros. It's not abstraction at the level of ideas, but abstraction at the level of ASTs, which is better, but only modestly, than abstraction at the level of plain text.*
- It ties you to GHC. In theory another compiler could implement it, but in practice I doubt this will ever happen. (This is in contrast to various type system extensions which, though they might only be implemented by GHC at the moment, I could easily imagine being adopted by other compilers down the road and eventually standardized.)
- The API isn't stable. When new language features are added to GHC and the template-haskell package is updated to support them, this often involves backwards-incompatible changes to the TH datatypes. If you want your TH code to be compatible with more than just one version of GHC you need to be very careful and possibly use CPP.
- There's a general principle that you should use the right tool for the job and the smallest one that will suffice, and in that analogy Template Haskell is something like this. If there's a way to do it that's not Template Haskell, it's generally preferable.
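The scoping half of the stage restriction described in the list above can be seen in a single module. This is a minimal sketch with made-up names: an empty top-level splice splits the module into two declaration groups, and only the later group can see the earlier one.

```haskell
{-# LANGUAGE TemplateHaskell #-}
module Main where

import Language.Haskell.TH (Dec, Q)

-- First declaration group. Mentioning `later` here would be a scope error,
-- because it is defined after the splice below.
earlier :: Int
earlier = 1

-- An empty top-level splice; it generates nothing but still ends the group.
$(return [] :: Q [Dec])

-- Second declaration group. Everything above the splice is in scope.
later :: Int
later = earlier + 1

main :: IO ()
main = print later  -- prints 2
```

The other half, not being able to call a function inside a splice when that function is defined in the same module, cannot be shown in a compiling single-file example; it is the reason generators normally live in a separate module from their use sites.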

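The advantage of Template Haskell is that you can do things with it that you couldn't do any other way, and it's a big one. Most of the time, the things TH is used for could otherwise only be done if they were implemented directly as compiler features. TH is extremely beneficial to have, both because it lets you do these things and because it lets you prototype potential compiler extensions in a much more lightweight and reusable way (see the various lens packages, for example).

To summarize why I think there are negative feelings towards Template Haskell: it solves a lot of problems, but for any given problem that it solves, it feels like there should be a better, more elegant, more disciplined solution better suited to that problem, one which doesn't solve the problem by automatically generating the boilerplate, but by removing the need to have the boilerplate.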
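As one possible illustration of "removing the need for the boilerplate", here is a small sketch using GHC.Generics, the generics feature mentioned earlier in this answer. The classes `GFields` and `FieldCount` and the `Shape` type are hypothetical; the instances spell out "for sum types it works like this, for product types it works like that" once, for every datatype.

```haskell
{-# LANGUAGE DefaultSignatures, DeriveGeneric, FlexibleContexts, TypeOperators #-}
module Main where

import GHC.Generics

-- Counts the fields of the constructor a value was built with,
-- defined over the generic representation of a type.
class GFields f where
  gfields :: f p -> Int

instance GFields U1 where                      -- constructor with no fields
  gfields _ = 0
instance GFields (K1 i c) where                -- a single field
  gfields _ = 1
instance GFields f => GFields (M1 i t f) where -- metadata wrapper: look inside
  gfields (M1 x) = gfields x
instance (GFields f, GFields g) => GFields (f :*: g) where  -- product: add up
  gfields (x :*: y) = gfields x + gfields y
instance (GFields f, GFields g) => GFields (f :+: g) where  -- sum: follow the branch
  gfields (L1 x) = gfields x
  gfields (R1 y) = gfields y

class FieldCount a where
  fieldCount :: a -> Int
  default fieldCount :: (Generic a, GFields (Rep a)) => a -> Int
  fieldCount = gfields . from

data Shape = Circle Double | Rect Double Double
  deriving Generic

-- No generated code and no hand-written method: the default does the work.
instance FieldCount Shape

main :: IO ()
main = print (map fieldCount [Circle 1, Rect 2 3])  -- prints [1,2]
```

The behaviour is constrained by the instance types rather than by whatever AST a macro happens to emit, which is the sense of "disciplined" used above.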

* Though I often feel that CPP has a better power-to-weight ratio for those problems that it can solve.

EDIT 23-04-14: What I was frequently trying to get at in the above, and have only recently gotten at exactly, is that there's an important distinction between abstraction and deduplication. Proper abstraction often results in deduplication as a side effect, and duplication is often a telltale sign of inadequate abstraction, but that's not why it's valuable. Proper abstraction is what makes code correct, comprehensible, and maintainable. Deduplication only makes it shorter. Template Haskell, like macros in general, is a tool for deduplication.