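Template Haskell seems to often be viewed by the Haskell community as an unfortunate convenience. It's hard to put into words exactly what I have observed in this regard, but consider these few examples:

- Template Haskell is listed under "The Ugly (but necessary)" in response to the question "Which Haskell (GHC) extensions should users use/avoid?"
- Template Haskell is considered a temporary/inferior solution in the "Unboxed Vectors of newtype'd values" thread (libraries mailing list)
- Yesod is often criticized for relying too much on Template Haskell (see the blog post in response to this sentiment)

I've seen various blog posts where people do pretty neat stuff with Template Haskell, enabling prettier syntax that simply wouldn't be possible in regular Haskell, as well as tremendous boilerplate reduction. So why is it that Template Haskell is looked down upon in this way? What makes it undesirable? Under what circumstances should Template Haskell be avoided, and why?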


Current answer

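One reason for avoiding Template Haskell is that, as a whole, it isn't type-safe at all, and thus goes against much of the "spirit of Haskell." Here are some examples of this: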

- You have no control over what kind of Haskell AST a piece of TH code will generate, beyond where it will appear; you can have a value of type Exp, but you don't know if it is an expression that represents a [Char] or a (a -> (forall b . b -> c)) or whatever. TH would be more reliable if one could express that a function may only generate expressions of a certain type, or only function declarations, or only data-constructor-matching patterns, etc.
- You can generate expressions that don't compile. You generated an expression that references a free variable foo that doesn't exist? Tough luck, you'll only see that when actually using your code generator, and only under the circumstances that trigger the generation of that particular code. It is very difficult to unit test, too.
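To make the point about Exp concrete, here is a minimal sketch (not from the original answer; all three definitions are made up for illustration). Each value has the same Haskell type, Exp, even though the code it stands for would have unrelated types, and the last one refers to a variable foo that is assumed not to exist anywhere. Building and printing the ASTs succeeds; the problem would only surface when something splices danglingExp.

```haskell
import Language.Haskell.TH

-- Three ASTs that all have type `Exp`, although the code they stand for
-- would have unrelated types (or would not compile at all).

-- Represents a String literal.
stringExp :: Exp
stringExp = LitE (StringL "hello")

-- Represents the lambda \x -> x.
identityExp :: Exp
identityExp = LamE [VarP x] (VarE x)
  where
    x = mkName "x"

-- Represents `foo 1`, where `foo` is a hypothetical, undefined variable.
-- Nothing complains at this point.
danglingExp :: Exp
danglingExp = AppE (VarE (mkName "foo")) (LitE (IntegerL 1))

main :: IO ()
main = mapM_ (putStrLn . pprint) [stringExp, identityExp, danglingExp]
```

The type checker treats all three identically, which is exactly why a generator's signature tells you so little about what it produces.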

TH is also outright dangerous:

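- Code that runs at compile time can do arbitrary IO, including launching missiles or stealing your credit card. You don't want to have to look through every cabal package you ever download in search of TH exploits.
- TH can access "module-private" functions and definitions, completely breaking encapsulation in some cases.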
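As a harmless illustration of compile-time IO, the sketch below (an assumed example, not from the original answer) reads the USER environment variable while the module is being compiled and bakes the result into the binary. runIO accepts any IO action, so the same mechanism could just as well read files or open network connections.

```haskell
{-# LANGUAGE TemplateHaskell #-}
module Main where

import Data.Maybe (fromMaybe)
import Language.Haskell.TH (runIO, stringE)
import System.Environment (lookupEnv)

-- The splice runs on the machine that compiles this module. Whatever
-- USER was set to at that moment becomes a string literal in the program.
buildUser :: String
buildUser = $(do
  mUser <- runIO (lookupEnv "USER")
  stringE (fromMaybe "<unknown>" mUser))

main :: IO ()
main = putStrLn ("Compiled by: " ++ buildUser)
```

Nothing in the type of buildUser hints that IO happened during the build.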

Then there are some problems that make TH functions less fun to use as a library developer:

- TH code isn't always composable. Let's say someone makes a generator for lenses. More often than not, that generator will be structured in such a way that it can only be called directly by the "end-user," and not by other TH code, for example by taking a list of type constructors to generate lenses for as the parameter. It is tricky to generate that list in code, while the user only has to write generateLenses [''Foo, ''Bar].
- Developers don't even know that TH code can be composed. Did you know that you can write concat <$> forM [''Foo, ''Bar] generateLens? (forM_ would discard the generated declarations.) Q is just a monad, so you can use all of the usual functions on it. Some people don't know this, and because of that, they create multiple overloaded versions of essentially the same functions with the same functionality, and these functions lead to a certain bloat effect. Also, most people write their generators in the Q monad even when they don't have to, which is like writing bla :: IO Int; bla = return 3; you are giving a function more "environment" than it needs, and clients of the function are required to provide that environment as an effect of that.
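A rough sketch of both remarks, using a made-up generator rather than a real lens library: labelDecs, genLabel and genLabels are hypothetical names. The AST construction is kept pure because it needs nothing from Q, and the list version is obtained with ordinary monadic combinators. To stay within one module the result is printed with runQ instead of being spliced.

```haskell
{-# LANGUAGE TemplateHaskell #-}
import Control.Monad (forM)
import Language.Haskell.TH

-- Pure: building this AST needs nothing from the Q monad.
-- For "foo" it describes:  fooLabel :: String; fooLabel = "foo"
labelDecs :: String -> [Dec]
labelDecs n =
  [ SigD name (ConT ''String)
  , ValD (VarP name) (NormalB (LitE (StringL n))) []
  ]
  where
    name = mkName (n ++ "Label")

-- Single-item generator in Q, the shape end users usually see.
genLabel :: String -> Q [Dec]
genLabel = return . labelDecs

-- Composition: run the single-item generator over a list and
-- concatenate the declarations it produces.
genLabels :: [String] -> Q [Dec]
genLabels ns = concat <$> forM ns genLabel

main :: IO ()
main = runQ (genLabels ["foo", "bar"]) >>= putStrLn . pprint
```

In a client module the same generator would be used as a top-level splice, e.g. genLabels ["foo", "bar"] on a line of its own.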

Finally, there are some things that make TH functions less fun to use as an end-user:

- Opacity. When a TH function has type Q [Dec], it can generate absolutely anything at the top level of a module, and you have absolutely no control over what will be generated.
- Monolithism. You can't control how much a TH function generates unless the developer allows it; if you find a function that generates a database interface and a JSON serialization interface, you can't say "No, I only want the database interface, thanks; I'll roll my own JSON interface."
- Run time. TH code takes a relatively long time to run. The code is interpreted anew every time a file is compiled, and often the running TH code requires a ton of packages that have to be loaded. This slows down compile time considerably.

Other answers

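What makes TH bad? For me, it comes down to this:

If you need to produce so much repetitive code that you find yourself trying to use TH to auto-generate it, you're doing it wrong!

Think about it. Half the appeal of Haskell is that its high-level design allows you to avoid huge amounts of useless boilerplate code that you have to write in other languages. If you need compile-time code generation, you're basically saying that either your language or your application design has failed you. And we programmers don't like to fail.

Sometimes, of course, it's necessary. But sometimes you can avoid needing TH by just being a bit more clever with your designs.

(The other thing is that TH is quite low-level. There's no grand high-level design; a lot of GHC's internal implementation details are exposed. And that makes the API prone to change...)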

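One rather pragmatic problem with Template Haskell is that it only works when GHC's bytecode interpreter is available, which is not the case on all architectures. So if your program uses Template Haskell or relies on libraries that use it, it will not run on machines with an ARM, MIPS, S390 or PowerPC CPU.

This is relevant in practice: git-annex is a tool written in Haskell that makes sense to run on machines that deal with storage, and such machines often have non-i386 CPUs. Personally, I run git-annex on an NSLU 2 (32 MB RAM, 266MHz CPU; did you know Haskell works fine on such hardware?). If it used Template Haskell, this would not be possible.

(The situation with GHC on ARM is improving a lot these days, and I think 7.4.2 even works, but the point still stands.)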

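This answer is in response to the issues brought up by illissius, point by point:

It's ugly to use. $(fooBar ''Asdf) just does not look nice. Superficial, sure, but it contributes.

I agree. I feel like $( ) was chosen to look like it was part of the language, using the familiar symbol palette of Haskell. However, that's exactly what you /don't/ want in the symbols used for your macro splicing. They definitely blend in too much, and this cosmetic aspect is quite important. I like the look of {{ }} for splices, because they are quite visually distinct.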

It's even uglier to write. Quoting works sometimes, but a lot of the time you have to do manual AST grafting and plumbing. The API is big and unwieldy, there's always a lot of cases you don't care about but still need to dispatch, and the cases you do care about tend to be present in multiple similar but not identical forms (data vs. newtype, record-style vs. normal constructors, and so on). It's boring and repetitive to write and complicated enough to not be mechanical. The reform proposal addresses some of this (making quotes more widely applicable).

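I also agree with this. However, as some of the comments in "New Directions for TH" observe, the lack of good out-of-the-box AST quoting is not a critical flaw. In this WIP package, I seek to address these problems in library form: https://github.com/mgsloan/quasi-extras . So far I allow splicing in a few more places than usual, and can pattern match on ASTs.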

The stage restriction is hell. Not being able to splice functions defined in the same module is the smaller part of it: the other consequence is that if you have a top-level splice, everything after it in the module will be out of scope to anything before it. Other languages with this property (C, C++) make it workable by allowing you to forward declare things, but Haskell doesn't. If you need cyclic references between spliced declarations or their dependencies and dependents, you're usually just screwed.

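I've run into the issue of cyclic TH definitions being impossible before... It's quite annoying. There is a solution, but it's ugly: wrap the things involved in the cyclic dependency in a single TH expression that combines all of the generated declarations. One of these declaration generators could just be a quasi-quoter that accepts Haskell code.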
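The scoping consequence of the stage restriction can be seen with an empty splice. This is a minimal sketch, assuming a single module compiled with GHC; the names before and after are made up. The splice generates nothing, yet it still cuts the module into two declaration groups.

```haskell
{-# LANGUAGE TemplateHaskell #-}
module Main where

before :: Int
before = 1
-- Writing `before = after` here would be rejected with a "not in scope"
-- error, because `after` is declared below the splice.

-- A top-level declaration splice that generates no declarations at all.
$(return [])

-- Declarations after the splice can see everything above it.
after :: Int
after = before + 1

main :: IO ()
main = print after  -- prints 2
```

Without the splice line, the order of before and after would not matter at all, which is what makes the restriction surprising in practice.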

It's unprincipled. What I mean by this is that most of the time when you express an abstraction, there is some kind of principle or concept behind that abstraction. For many abstractions, the principle behind them can be expressed in their types. When you define a type class, you can often formulate laws which instances should obey and clients can assume. If you use GHC's new generics feature to abstract the form of an instance declaration over any datatype (within bounds), you get to say "for sum types, it works like this, for product types, it works like that". But Template Haskell is just dumb macros. It's not abstraction at the level of ideas, but abstraction at the level of ASTs, which is better, but only modestly, than abstraction at the level of plain text.

It's only unprincipled if you do unprincipled things with it. The only difference is that with the compiler implemented mechanisms for abstraction, you have more confidence that the abstraction isn't leaky. Perhaps democratizing language design does sound a bit scary! Creators of TH libraries need to document well and clearly define the meaning and results of the tools they provide. A good example of principled TH is the derive package: http://hackage.haskell.org/package/derive - it uses a DSL such that the example of many of the derivations /specifies/ the actual derivation.
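For comparison, this is roughly what "for sum types, it works like this, for product types, it works like that" looks like with GHC.Generics. It is a small sketch with made-up names (GFields, fieldCount, Shape) that counts the fields of whichever constructor a value was built with; each instance covers one structural case.

```haskell
{-# LANGUAGE DeriveGeneric, FlexibleContexts, TypeOperators #-}
import GHC.Generics

-- One instance per structural case of a datatype's representation.
class GFields f where
  gfields :: f p -> Int

-- A constructor without fields.
instance GFields U1 where
  gfields _ = 0

-- A single field.
instance GFields (K1 i c) where
  gfields _ = 1

-- Product types: add up the fields on both sides.
instance (GFields a, GFields b) => GFields (a :*: b) where
  gfields (a :*: b) = gfields a + gfields b

-- Sum types: follow whichever alternative is present.
instance (GFields a, GFields b) => GFields (a :+: b) where
  gfields (L1 a) = gfields a
  gfields (R1 b) = gfields b

-- Metadata wrappers (datatype, constructor, selector) are skipped.
instance GFields a => GFields (M1 i c a) where
  gfields (M1 x) = gfields x

fieldCount :: (Generic a, GFields (Rep a)) => a -> Int
fieldCount = gfields . from

data Shape = Circle Double | Rect Double Double
  deriving Generic

main :: IO ()
main = do
  print (fieldCount (Circle 1))  -- prints 1
  print (fieldCount (Rect 2 3))  -- prints 2
```

Everything here is checked by the type system, at the cost of moving the case analysis into instance resolution, which is the trade-off both sides of this exchange are arguing about.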

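It ties you to GHC. In theory another compiler could implement it, but in practice I doubt this will ever happen. (This is in contrast to various type system extensions which, though they might only be implemented by GHC at the moment, I could easily imagine being adopted by other compilers down the road and eventually standardized.)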

That's a pretty good point - the TH API is pretty big and clunky. Re-implementing it seems like it could be tough. However, there are really only a few ways to slice the problem of representing Haskell ASTs. I imagine that copying the TH ADTs and writing a converter to the internal AST representation would get you a good deal of the way there. This would be equivalent to the (not insignificant) effort of creating haskell-src-meta. It could also be simply re-implemented by pretty printing the TH AST and using the compiler's internal parser.

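While I could be wrong, I don't see TH as being that complicated of a compiler extension from an implementation perspective. This is actually one of the benefits of "keeping it simple" and not having the fundamental layer be some theoretically appealing, statically verifiable templating system.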

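The API isn't stable. When new language features are added to GHC and the template-haskell package is updated to support them, this often involves backwards-incompatible changes to the TH datatypes. If you want your TH code to be compatible with more than just one version of GHC you need to be very careful and possibly use CPP.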

This is also a good point, but somewhat dramatized. While there have been API additions lately, they haven't caused extensive breakage. Also, I think that with the superior AST quoting I mentioned earlier, the API that actually needs to be used can be very substantially reduced. If no construction / matching needs distinct functions, and these are instead expressed as literals, then most of the API disappears. Moreover, the code you write would port more easily to AST representations for languages similar to Haskell.
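The CPP juggling mentioned in the quoted point tends to look something like the sketch below. It relies on one change I'm confident about: template-haskell 2.17 gave TyVarBndr a type parameter and its constructors an extra field. The helper name tyVarBndrName is made up, and the MIN_VERSION_template_haskell macro is assumed to be provided by the build (Cabal defines it, as do reasonably recent GHCs when CPP is enabled).

```haskell
{-# LANGUAGE CPP #-}
import Language.Haskell.TH

-- Extract the name of a type variable binder on both sides of the
-- template-haskell 2.17 change to TyVarBndr.
#if MIN_VERSION_template_haskell(2,17,0)
tyVarBndrName :: TyVarBndr flag -> Name
tyVarBndrName (PlainTV n _)    = n
tyVarBndrName (KindedTV n _ _) = n
#else
tyVarBndrName :: TyVarBndr -> Name
tyVarBndrName (PlainTV n)    = n
tyVarBndrName (KindedTV n _) = n
#endif

main :: IO ()
main = putStrLn (nameBase (tyVarBndrName (plainTV (mkName "a"))))
```

Whether a handful of such shims counts as "very careful" or "somewhat dramatized" is a matter of taste, but this is the kind of code both sides are talking about.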


In summary, I think that TH is a powerful, semi-neglected tool. Less hate could lead to a more lively eco-system of libraries, encouraging the implementation of more language feature prototypes. It's been observed that TH is an overpowered tool, that can let you /do/ almost anything. Anarchy! Well, it's my opinion that this power can allow you to overcome most of its limitations, and construct systems capable of quite principled meta-programming approaches. It's worth the usage of ugly hacks to simulate the "proper" implementation, as this way the design of the "proper" implementation will gradually become clear.

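In my personal ideal version of nirvana, much of the language would actually move out of the compiler and into libraries of this variety. The fact that the features are implemented as libraries does not heavily influence their ability to faithfully abstract.

What's the typical Haskell answer to boilerplate code? Abstraction. What are our favorite abstractions? Functions and typeclasses!

Typeclasses let us define a set of methods that can then be used in all manner of functions generic on that class. However, other than this, the only way classes help avoid boilerplate is by offering "default definitions". Now here is an example of an unprincipled feature!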

- Minimal binding sets are not declarable / compiler checkable. This could lead to inadvertent definitions that yield bottom due to mutual recursion.
- Despite the great convenience and power this would yield, you cannot specify superclass defaults, due to orphan instances: http://lukepalmer.wordpress.com/2009/01/25/a-world-without-orphans/ These would let us fix the numeric hierarchy gracefully!
- Going after TH-like capabilities for method defaults led to http://www.haskell.org/haskellwiki/GHC.Generics . While this is cool stuff, my only experience debugging code using these generics was nigh-impossible, due to the size of the type induced for an ADT as complicated as an AST. https://github.com/mgsloan/th-extra/commit/d7784d95d396eb3abdb409a24360beb03731c88c

In other words, this went after the features provided by TH, but it had to lift an entire domain of the language, the construction language, into a type system representation. While I can see it working well for your common problem, for complex ones, it seems prone to yielding a pile of symbols far more terrifying than TH hackery.

TH gives you value-level compile-time computation of the output code, whereas generics forces you to lift the pattern matching / recursion part of the code into the type system. While this does restrict the user in a few fairly useful ways, I don't think the complexity is worth it.
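The first bullet, about defaults and mutual recursion, can be sketched with a cut-down equality class. MyEq and Color are made-up names for illustration. Each default is written in terms of the other method, so an instance that defines neither still compiles and then loops when used.

```haskell
-- Each default is defined in terms of the other method.
class MyEq a where
  eq, neq :: a -> a -> Bool
  eq  x y = not (neq x y)
  neq x y = not (eq x y)

data Color = Red | Green

-- Defining one method is enough; `neq` falls back to its default.
instance MyEq Color where
  eq Red   Red   = True
  eq Green Green = True
  eq _     _     = False

-- An empty instance such as
--   instance MyEq Bool
-- would also be accepted, but both methods would then call each other
-- forever.

main :: IO ()
main = print (neq Red Green)  -- prints True
```

GHC versions newer than this discussion added a MINIMAL pragma for declaring the required methods, which speaks to this particular complaint but not to the others in the list.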

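I think that the rejection of TH and lisp-like metaprogramming led to the preference for things like method defaults instead of more flexible, macro-expansion-like declarations of instances. The discipline of avoiding things that could lead to unforeseen results is wise; however, we should not ignore that Haskell's capable type system allows for more reliable metaprogramming than in many other environments (by checking the generated code).

This is solely my own opinion.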

- It's ugly to use. $(fooBar ''Asdf) just does not look nice. Superficial, sure, but it contributes.
- It's even uglier to write. Quoting works sometimes, but a lot of the time you have to do manual AST grafting and plumbing. The API is big and unwieldy, there's always a lot of cases you don't care about but still need to dispatch, and the cases you do care about tend to be present in multiple similar but not identical forms (data vs. newtype, record-style vs. normal constructors, and so on). It's boring and repetitive to write and complicated enough to not be mechanical. The reform proposal addresses some of this (making quotes more widely applicable).
- The stage restriction is hell. Not being able to splice functions defined in the same module is the smaller part of it: the other consequence is that if you have a top-level splice, everything after it in the module will be out of scope to anything before it. Other languages with this property (C, C++) make it workable by allowing you to forward declare things, but Haskell doesn't. If you need cyclic references between spliced declarations or their dependencies and dependents, you're usually just screwed.
- It's undisciplined. What I mean by this is that most of the time when you express an abstraction, there is some kind of principle or concept behind that abstraction. For many abstractions, the principle behind them can be expressed in their types. For type classes, you can often formulate laws which instances should obey and clients can assume. If you use GHC's new generics feature to abstract the form of an instance declaration over any datatype (within bounds), you get to say "for sum types, it works like this, for product types, it works like that". Template Haskell, on the other hand, is just macros. It's not abstraction at the level of ideas, but abstraction at the level of ASTs, which is better, but only modestly, than abstraction at the level of plain text.*
- It ties you to GHC. In theory another compiler could implement it, but in practice I doubt this will ever happen. (This is in contrast to various type system extensions which, though they might only be implemented by GHC at the moment, I could easily imagine being adopted by other compilers down the road and eventually standardized.)
- The API isn't stable. When new language features are added to GHC and the template-haskell package is updated to support them, this often involves backwards-incompatible changes to the TH datatypes. If you want your TH code to be compatible with more than just one version of GHC you need to be very careful and possibly use CPP.

There's a general principle that you should use the right tool for the job and the smallest one that will suffice, and in that analogy Template Haskell is something like this. If there's a way to do it that's not Template Haskell, it's generally preferable.

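The advantage of Template Haskell is that you can do things with it that you couldn't do any other way, and it's a big one. Most of the time the things TH is used for could otherwise only be done if they were implemented directly as compiler features. TH is extremely beneficial to have both because it lets you do these things, and because it lets you prototype potential compiler extensions in a much more lightweight and reusable way (see the various lens packages, for example).

To summarize why I think that there are negative feelings towards Template Haskell: it solves a lot of problems, but for any given problem that it solves, it feels like there should be a better, more elegant, more disciplined solution better suited to that problem, one which doesn't solve the problem by automatically generating the boilerplate, but by removing the need to have the boilerplate.

* Though I often feel that CPP has a better power-to-weight ratio for those problems that it can solve.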

EDIT 23-04-14: What I was frequently trying to get at in the above, and have only recently gotten at exactly, is that there's an important distinction between abstraction and deduplication. Proper abstraction often results in deduplication as a side effect, and duplication is often a telltale sign of inadequate abstraction, but that's not why it's valuable. Proper abstraction is what makes code correct, comprehensible, and maintainable. Deduplication only makes it shorter. Template Haskell, like macros in general, is a tool for deduplication.

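I'd like to address a few of the points dflemstr brings up.

I don't find the fact that you can't typecheck TH to be that worrying. Why? Because even if there is an error, it will still be at compile time. I'm not sure if this strengthens my argument, but this is similar in spirit to the errors you receive when using templates in C++. I think these errors are more understandable than C++'s errors though, as you'll get a pretty-printed version of the generated code.

If a TH expression / quasi-quoter does something so advanced that tricky corners can hide, then perhaps it's ill-advised?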
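For a feel of what "pretty-printed generated code" means, the sketch below runs two quotations in IO with runQ and prints them with pprint. It is only an illustration under the assumption that the quotes need nothing beyond fresh names, which runQ in IO can supply; GHC's -ddump-splices flag shows similar output for real splices during compilation.

```haskell
{-# LANGUAGE TemplateHaskell #-}
import Language.Haskell.TH

main :: IO ()
main = do
  -- An expression quotation, turned into an AST and printed back as source.
  e <- runQ [| \x -> (x, not x) |]
  putStrLn (pprint e)
  -- A declaration quotation works the same way.
  ds <- runQ [d| double :: Int -> Int; double n = n * 2 |]
  putStrLn (pprint ds)
```

The printed code is readable Haskell, although names may come out qualified or with numeric suffixes on fresh variables.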

I break this rule quite a bit with quasi-quoters I've been working on lately (using haskell-src-exts / meta) - https://github.com/mgsloan/quasi-extras/tree/master/examples . I know this introduces some bugs such as not being able to splice in the generalized list comprehensions. However, I think that there's a good chance that some of the ideas in http://hackage.haskell.org/trac/ghc/blog/Template%20Haskell%20Proposal will end up in the compiler. Until then, the libraries for parsing Haskell to TH trees are a nearly perfect approximation.

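Regarding compilation speed / dependencies, we can use the "zeroth" package to inline the generated code. This is at least nice for the users of a given library, but we can't do much better for the case of editing the library. Can TH dependencies bloat generated binaries? I thought it left out everything that's not referenced by the compiled code.

The staging restriction / splitting of compilation steps of the Haskell module does suck.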

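RE Opacity: This is the same for any library function you call. You have no control over what Data.List.groupBy will do. You just have a reasonable "guarantee" / convention that the version numbers tell you something about the compatibility. That said, it is a somewhat different matter.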

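This is where using zeroth pays off: you're already versioning the generated files, so you'll always know when the form of the generated code has changed. Looking at the diffs might be a bit gnarly for large amounts of generated code, though, so that's one place where a better developer interface would be handy.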

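RE Monolithism: You can certainly post-process the results of a TH expression, using your own compile-time code. It wouldn't be very much code to filter on top-level declaration type / name. Heck, you could imagine writing a function that does this generically. For modifying / de-monolithisizing quasiquoters, you can pattern match on "QuasiQuoter" and extract out the transformations used, or make a new one in terms of the old.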
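A minimal sketch of such post-processing, with made-up names throughout: everything stands in for the output of some monolithic third-party generator, and keepNamed / filterDecs keep only the top-level bindings whose name passes a predicate. To stay in one module the result is printed with runQ rather than spliced.

```haskell
import Language.Haskell.TH

-- Stand-in for the output of a monolithic generator (hypothetical).
everything :: [Dec]
everything =
  [ ValD (VarP (mkName "dbInterface"))   (NormalB (LitE (StringL "db")))   []
  , ValD (VarP (mkName "jsonInterface")) (NormalB (LitE (StringL "json"))) []
  ]

genEverything :: Q [Dec]
genEverything = return everything

-- Name introduced by a declaration, for the few forms handled here.
declaredName :: Dec -> Maybe Name
declaredName (ValD (VarP n) _ _) = Just n
declaredName (FunD n _)          = Just n
declaredName (SigD n _)          = Just n
declaredName _                   = Nothing

-- Keep only declarations whose unqualified name satisfies the predicate.
keepNamed :: (String -> Bool) -> [Dec] -> [Dec]
keepNamed p = filter (maybe False (p . nameBase) . declaredName)

-- Wrap any generator so that its output is filtered before splicing.
filterDecs :: (String -> Bool) -> Q [Dec] -> Q [Dec]
filterDecs p gen = keepNamed p <$> gen

main :: IO ()
main = runQ (filterDecs (== "dbInterface") genEverything) >>= putStrLn . pprint
```

A real version would need to cover more Dec constructors (instances, data declarations and so on), which is where the size of the TH API starts to show.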