There are already a number of questions about text rendering in OpenGL, such as:

How to do OpenGL live text-rendering for a GUI?

But those mostly cover rendering textured quads with the fixed-function pipeline. Surely there must be a better way with shaders.

I don't really care about internationalization; most of my strings will be plot tick labels (dates and times, or plain numbers). But the plots will be re-rendered at the screen refresh rate, and there could be quite a bit of text (no more than a few thousand glyphs on screen, but enough that hardware-accelerated layout would be worthwhile).

What is the recommended approach for text rendering using modern OpenGL? (Citing existing software that uses the approach is good evidence that it works well.)

- A geometry shader that takes, for example, a position, an orientation, and a character sequence, and emits textured quads
- A geometry shader that renders vector fonts
- As above, but using tessellation shaders instead
- A compute shader to do the font rasterization


The most widely used technique is still textured quads. However, in 2005 LORIA developed something called vector textures, i.e. rendering vector graphics as textures on primitives. If you use this to convert TrueType or OpenType fonts into vector textures, you get this:

http://alice.loria.fr/index.php/publications.html?Paper=VTM@2005


I think your best bet would be to look into cairo graphics with its OpenGL backend.

The only problem I ran into when developing a prototype against the 3.3 core profile was the use of deprecated functions in the OpenGL backend. That was 1-2 years ago, so the situation may have improved...

Either way, I hope the desktop OpenGL graphics drivers will eventually implement OpenVG.


Rendering outlines remains a "no go" unless you render only a dozen characters in total, because of the number of vertices needed per character to approximate the curvature. Although there are approaches to evaluating bezier curves in the pixel shader, these are not easily antialiased (which is trivial with a distance-map-textured quad), and evaluating curves in the shader is still computationally much more expensive than necessary.

The best trade-off between "fast" and "quality" is still textured quads with a signed distance field texture. It is slightly slower than using a plain, ordinary textured quad, but not by much. The quality, on the other hand, is in an entirely different ballpark: the results are truly stunning, it is about as fast as you can get, and effects such as glow are trivially easy to add. Also, the technique can be downgraded nicely to older hardware if needed.

See the famous Valve paper for the technique.

The technique is conceptually similar to how implicit surfaces (metaballs and such) work, though it does not generate polygons. It runs entirely in the pixel shader and takes the distance sampled from the texture as a distance function. Everything above a chosen threshold (usually 0.5) is "in", everything else is "out". In the simplest case, on 10 year old non-shader-capable hardware, setting the alpha test threshold to 0.5 will do that exact thing (though without special effects and antialiasing). If one wants to add a little more weight to the font (faux bold), a slightly smaller threshold will do the trick without modifying a single line of code (just change your "font_weight" uniform). For a glow effect, one simply considers everything above one threshold as "in" and everything above another (smaller) threshold as "out, but in glow", and LERPs between the two. Antialiasing works similarly.
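As a concrete illustration of the thresholding described above, here is a minimal GLSL fragment shader sketch; the uniform names (u_sdf, u_font_weight, u_glow_color) and the use of fwidth/smoothstep for antialiasing are my own choices, not code from the Valve paper:

```glsl
#version 330 core
// Minimal signed-distance-field text fragment shader (sketch).
// Assumes the distance field is stored in the red channel, with 0.5 on the outline.
uniform sampler2D u_sdf;      // distance field atlas (assumed name)
uniform vec4 u_color;         // text color
uniform vec4 u_glow_color;    // glow color
uniform float u_font_weight;  // threshold: 0.5 = normal, slightly lower = faux bold (assumed)
in vec2 v_uv;
out vec4 frag_color;

void main() {
    float dist = texture(u_sdf, v_uv).r;

    // Antialiasing: soften the threshold over roughly one screen pixel.
    float aa = fwidth(dist);
    float alpha = smoothstep(u_font_weight - aa, u_font_weight + aa, dist);

    // Glow: everything above a smaller threshold (0.2 here) is "out, but in glow".
    float glow = smoothstep(0.2, u_font_weight, dist);
    vec4 glow_col = vec4(u_glow_color.rgb, u_glow_color.a * glow);

    // LERP between the glow and the solid text.
    frag_color = mix(glow_col, u_color, alpha);
}
```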

By using an 8-bit signed distance value instead of a single bit, this technique increases the effective resolution of the texture map 16-fold in each dimension (rather than black and white, all possible shades are used, so we get 256 times the information with the same storage). But even if you magnify far beyond 16x, the result still looks acceptable. Long straight lines will eventually become a bit wobbly, but there will be no typical "blocky" sampling artifacts.

You can use a geometry shader for generating the quads out of points (reduce bus bandwidth), but honestly the gains are rather marginal. The same is true for instanced character rendering as described in GPG8. The overhead of instancing is only amortized if you have a lot of text to draw. The gains are, in my opinion, in no relation to the added complexity and non-downgradeability. Plus, you are either limited by the amount of constant registers, or you have to read from a texture buffer object, which is non-optimal for cache coherence (and the intent was to optimize to begin with!). A simple, plain old vertex buffer is just as fast (possibly faster) if you schedule the upload a bit ahead in time and will run on every hardware built during the last 15 years. And, it is not limited to any particular number of characters in your font, nor to a particular number of characters to render.
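For reference, the point-to-quad amplification mentioned above looks roughly like this in a geometry shader; this is only a sketch, and the per-character size and atlas rectangle varyings (g_size, g_texrect) are assumed inputs rather than part of any particular implementation:

```glsl
#version 330 core
// Sketch: expand one point per character into a textured quad.
layout(points) in;
layout(triangle_strip, max_vertices = 4) out;

in vec2 g_size[];     // quad size, passed from the vertex shader (assumed)
in vec4 g_texrect[];  // atlas rectangle (s0, t0, s1, t1) (assumed)
out vec2 v_uv;

uniform mat4 u_mvp;

void main() {
    vec4 base = gl_in[0].gl_Position;  // bottom-left corner of the glyph
    vec2 size = g_size[0];
    vec4 tr   = g_texrect[0];

    // Emit the four corners as a triangle strip: BL, BR, TL, TR.
    gl_Position = u_mvp * base;                                  v_uv = tr.xy; EmitVertex();
    gl_Position = u_mvp * (base + vec4(size.x, 0.0, 0.0, 0.0));  v_uv = tr.zy; EmitVertex();
    gl_Position = u_mvp * (base + vec4(0.0, size.y, 0.0, 0.0));  v_uv = tr.xw; EmitVertex();
    gl_Position = u_mvp * (base + vec4(size, 0.0, 0.0));         v_uv = tr.zw; EmitVertex();
    EndPrimitive();
}
```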

If you are sure that you do not have more than 256 characters in your font, texture arrays may be worth a consideration to strip off bus bandwidth in a similar manner as generating quads from points in the geometry shader. When using an array texture, the texture coordinates of all quads have identical, constant s and t coordinates and only differ in the r coordinate, which is equal to the character index to render. But like with the other techniques, the expected gains are marginal at the cost of being incompatible with previous generation hardware.
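A fragment-shader sketch of the array-texture variant (sampler and varying names are my own): the s and t coordinates are identical for every quad, and the third coordinate selects the layer, i.e. the character:

```glsl
#version 330 core
// Sketch: sample a glyph from a 2D array texture, one layer per character.
uniform sampler2DArray u_glyphs;  // assumed name; one SDF glyph per layer
in vec2 v_uv;                     // identical 0..1 coordinates on every quad
flat in float v_char_index;       // character index = array layer
out vec4 frag_color;

void main() {
    float dist = texture(u_glyphs, vec3(v_uv, v_char_index)).r;
    // Simple antialiased threshold, assuming an SDF atlas as above.
    frag_color = vec4(1.0, 1.0, 1.0, smoothstep(0.45, 0.55, dist));
}
```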

Jonathan Dummer provides a handy tool for generating distance-field textures: description page

Update: As more recently pointed out in Programmable Vertex Pulling (D. Rákos, "OpenGL Insights", pp. 239), there is no significant extra latency or overhead associated with pulling vertex data programmatically from the shader on the newest generations of GPUs, as compared to doing the same using the standard fixed function. Also, the latest generations of GPUs have more and more reasonably sized general-purpose L2 caches (e.g. 1536kiB on nvidia Kepler), so one may expect the incoherent access problem when pulling random offsets for the quad corners from a buffer texture being less of a problem.

This makes the idea of pulling constant data (such as the quad sizes) from a buffer texture more attractive. A hypothetical implementation could thus reduce PCIe and memory transfers, as well as GPU memory, to a minimum along these lines:

- Only upload a character index (one per character to be displayed) as the only input to a vertex shader that passes on this index and gl_VertexID, and amplify that to 4 points in the geometry shader, still having the character index and the vertex id (this will be "gl_primitiveID made available in the vertex shader") as the sole attributes, and capture this via transform feedback.
- This will be fast, because there are only two output attributes (main bottleneck in GS), and it is close to "no-op" otherwise in both stages.
- Bind a buffer texture which contains, for each character in the font, the textured quad's vertex positions relative to the base point (these are basically the "font metrics"). This data can be compressed to 4 numbers per quad by storing only the offset of the bottom left vertex, and encoding the width and height of the axis-aligned box (assuming half floats, this will be 8 bytes of constant buffer per character -- a typical 256 character font could fit completely into 2kiB of L1 cache).
- Set a uniform for the baseline.
- Bind a buffer texture with horizontal offsets. These could probably even be calculated on the GPU, but it is much easier and more efficient to do that kind of thing on the CPU, as it is a strictly sequential operation and not at all trivial (think of kerning). Also, it would need another feedback pass, which would be another sync point.
- Render the previously generated data from the feedback buffer; the vertex shader pulls the horizontal offset of the base point and the offsets of the corner vertices from buffer objects (using the primitive id and the character index). The original vertex ID of the submitted vertices is now our "primitive ID" (remember the GS turned the vertices into quads). A sketch of this vertex-pulling step follows below.
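A minimal GLSL sketch of that last vertex-pulling step, under the simplifying assumption that the glyph index is derived directly from gl_VertexID (rather than captured via transform feedback as described above); the buffer texture and attribute names (u_metrics, u_advances, a_char_index) are made up for illustration:

```glsl
#version 330 core
// Sketch: vertex shader that pulls quad corner data from buffer textures.
uniform samplerBuffer u_metrics;   // per character: x_off, y_off, width, height (assumed)
uniform samplerBuffer u_advances;  // per glyph: horizontal offset of the base point (assumed)
uniform float u_baseline;          // baseline y (uniform => single line of text)
uniform mat4 u_mvp;

in int a_char_index;               // character index, one per emitted vertex
out vec2 v_uv;

void main() {
    int glyph  = gl_VertexID / 4;  // which glyph of the string (the "primitive ID")
    int corner = gl_VertexID % 4;  // which corner of that glyph's quad

    vec4 m   = texelFetch(u_metrics, a_char_index);    // font metrics for this character
    float x0 = texelFetch(u_advances, glyph).r + m.x;  // base point + per-character offset
    float y0 = u_baseline + m.y;

    // Reconstruct the corner from the axis-aligned box (width = m.z, height = m.w).
    vec2 corner_off = vec2(float(corner & 1), float(corner >> 1));
    vec2 pos = vec2(x0, y0) + corner_off * m.zw;

    v_uv = corner_off;  // a real implementation would also pull a per-glyph atlas rect
    gl_Position = u_mvp * vec4(pos, 0.0, 1.0);
}
```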

Like this, one could ideally reduce the required vertex bandwidth by 75% (amortized), though it would only be able to render a single line of text. If one wanted to render several lines in one draw call, the baseline would need to be added to the buffer texture rather than passed as a uniform (making the bandwidth gains smaller).

However, even assuming a 75% reduction -- since the vertex data to display "reasonable" amounts of text is only somewhere around 50-100kiB (which is practically zero to a GPU or a PCIe bus) -- I still doubt that the added complexity and losing backwards-compatibility is really worth the trouble. Reducing zero by 75% is still only zero. I have admittedly not tried the above approach, and more research would be needed to make a truly qualified statement. But still, unless someone can demonstrate a truly stunning performance difference (using "normal" amounts of text, not billions of characters!), my point of view remains that for the vertex data, a simple, plain old vertex buffer is justifiably good enough to be considered part of a "state of the art solution". It's simple and straightforward, it works, and it works well.

With "OpenGL Insights" already mentioned above, it is worth also pointing out the chapter "2D Shape Rendering by Distance Fields" by Stefan Gustavson, which explains distance-field rendering in great detail.

Update 2016:

Meanwhile, several additional techniques exist which aim to remove the corner-rounding artifacts that become disturbing at extreme magnification.

One approach simply uses pseudo distance fields instead of distance fields (the difference being that the distance is the shortest distance not to the actual outline, but to the outline or to an imaginary line protruding beyond the edge). This is somewhat better, runs at the same speed (identical shader), and uses the same amount of texture memory.

Another approach uses three texture channels, with details and an implementation available on GitHub. This aims to be an improvement over the and-or hacks previously used to address the issue. Good quality; slightly, almost unnoticeably, slower; but it uses three times as much texture memory. Also, additional effects (e.g. glow) are harder to get right.
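The core of that three-channel (multi-channel signed distance field) approach is taking the median of the three channels in the fragment shader, roughly as below; this is a sketch, and the u_px_range uniform (the distance-field range expressed in screen pixels) is my own naming:

```glsl
#version 330 core
// Sketch: multi-channel SDF (MSDF) sampling via median-of-three.
uniform sampler2D u_msdf;  // three-channel distance field (assumed name)
uniform float u_px_range;  // distance-field range in screen pixels (assumed)
uniform vec4 u_color;
in vec2 v_uv;
out vec4 frag_color;

float median3(float r, float g, float b) {
    return max(min(r, g), min(max(r, g), b));
}

void main() {
    vec3 msd = texture(u_msdf, v_uv).rgb;
    float sd = median3(msd.r, msd.g, msd.b);           // reconstructed signed distance
    float screen_dist = u_px_range * (sd - 0.5);       // distance in screen pixels
    float alpha = clamp(screen_dist + 0.5, 0.0, 1.0);  // roughly one-pixel antialiased edge
    frag_color = vec4(u_color.rgb, u_color.a * alpha);
}
```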

Lastly, storing the actual bezier curves that make up the characters and evaluating them in the fragment shader has become practical, with slightly inferior performance (but not so much that it is a problem) and stunning results even at the highest magnification. A WebGL demo renders a large PDF in real time using this technique.


http://code.google.com/p/glyphy/

The main difference between GLyphy and other SDF-based OpenGL renderers is that most other projects sample the SDF into a texture. That has all the usual problems sampling has, i.e. it distorts the outline and is low quality. GLyphy instead represents the SDF using actual vectors submitted to the GPU, which results in very high quality rendering.

The downside is that the code targets iOS with OpenGL ES. I will probably make a Windows/Linux OpenGL 4.x port (hopefully the author will add some real documentation).


I'm surprised Mark Kilgard's baby, NV_path_rendering (NVpr), was not mentioned by any of the above. Although its goals are more general than font rendering, it can also render text from fonts and with kerning. It doesn't even require OpenGL 4.1, but it is a vendor/Nvidia-only extension at the moment. It basically turns fonts into paths using glPathGlyphsNV which depends on the freetype2 library to get the metrics, etc. Then you can also access the kerning info with glGetPathSpacingNV and use NVpr's general path rendering mechanism to display text using the path-"converted" fonts. (I put that in quotes, because there's no real conversion, the curves are used as is.)

Unfortunately, the recorded demos of NVpr's font capability aren't particularly impressive. (Maybe someone should make one along the lines of the much snazzier SDF demos one can find on the intertubes...)

The fonts part of the 2011 NVpr API presentation talk starts here and continues in the next part; it is a bit unfortunate how that presentation is split up.

More general material on NVpr:

- Nvidia NVpr hub, but some material on the landing page is not the most up-to-date
- Siggraph 2012 paper for the brains of the path-rendering method, called "stencil, then cover" (StC); the paper also explains briefly how competing tech like Direct2D works. The font-related bits have been relegated to an annex of the paper. There are also some extras like videos/demos.
- GTC 2014 presentation for an update status; in a nutshell: it's now supported by Google's Skia (Nvidia contributed the code in late 2013 and 2014), which in turn is used in Google Chrome and [independently of Skia, I think] in a beta of Adobe Illustrator CC 2014
- the official documentation in the OpenGL extension registry
- USPTO has granted at least four patents to Kilgard/Nvidia in connection with NVpr, which you should probably be aware of in case you want to implement StC yourself: US8698837, US8698808, US8704830 and US8730253. Note that there are something like 17 more USPTO documents connected to this as "also published as", most of which are patent applications, so it's entirely possible more patents may be granted from those.

And since the word "stencil" did not produce any hits on this page before my answer, it appears the subset of the SO community that participated on this page so far, despite being pretty numerous, was unaware of tessellation-free, stencil-buffer-based methods for path/font rendering in general. Kilgard has a FAQ-like post on the opengl forum which may illuminate how the tessellation-free path rendering methods differ from bog standard 3D graphics, even though they're still using a [GP]GPU. (NVpr needs a CUDA-capable chip.)

As a historical note, Kilgard is also the author of the classic "A Simple OpenGL-based API for Texture Mapped Text" (SGI, 1997), which should not be confused with the stencil-based NVpr introduced in 2011.


Most if not all the recent methods discussed on this page, including stencil-based methods like NVpr or SDF-based methods like GLyphy (which I'm not discussing here any further because other answers already cover it) have however one limitation: they are suitable for large text display on conventional (~100 DPI) monitors without jaggies at any level of scaling, and they also look nice, even at small size, on high-DPI, retina-like displays. They don't fully provide what Microsoft's Direct2D+DirectWrite gives you however, namely hinting of small glyphs on mainstream displays. (For a visual survey of hinting in general see this typotheque page for instance. A more in-depth resource is on antigrain.com.)

I'm not aware of any open & productized OpenGL-based stuff that can do what Microsoft can with hinting at the moment. (I admit ignorance to Apple's OS X GL/Quartz internals, because to the best of my knowledge Apple hasn't published how they do GL-based font/path rendering stuff. It seems that OS X, unlike MacOS 9, doesn't do hinting at all, which annoys some people.) Anyway, there is one 2013 research paper that addresses hinting via OpenGL shaders written by INRIA's Nicolas P. Rougier; it is probably worth reading if you need to do hinting from OpenGL. While it may seem that a library like freetype already does all the work when it comes to hinting, that's not actually so for the following reason, which I'm quoting from the paper:

The FreeType library can rasterize a glyph using sub-pixel anti-aliasing in RGB mode. However, this is only half of the problem, since we also want to achieve sub-pixel positioning for accurate placement of the glyphs. Displaying the textured quad at fractional pixel coordinates does not solve the problem, since it only results in texture interpolation at the whole-pixel level. Instead, we want to achieve a precise shift (between 0 and 1) in the subpixel domain. This can be done in a fragment shader [...].

The solution is not exactly trivial, so I won't attempt to explain it here. (The paper is open access.)


One other thing I've learned from Rougier's paper (and which Kilgard doesn't seem to have considered) is that the font powers that be (Microsoft+Adobe) have created not one but two kerning specification methods. The old one is based on a so-called kern table and it is supported by freetype. The new one is called GPOS and it is only supported by newer font libraries like HarfBuzz or pango in the free software world. Since NVpr doesn't seem to support either of those libraries, kerning might not work out of the box with NVpr for some new fonts; there are some of those apparently in the wild, according to this forum discussion.

Finally, if you need to do complex text layout (CTL) you seem to be currently out of luck with OpenGL as no OpenGL-based library appears to exist for that. (DirectWrite on the other hand can handle CTL.) There are open-sourced libraries like HarfBuzz which can render CTL, but I don't know how you'd get them to work well (as in using the stencil-based methods) via OpenGL. You'd probably have to write the glue code to extract the re-shaped outlines and feed them into NVpr or SDF-based solutions as paths.