Ray March Volumetric Clouds

Author: HD_520 | Published 2022-06-22 19:05

Volume Raymarching

The basic concept behind volumetric rendering is to evaluate rays of light as they pass through a volume. This generally means returning an Opacity and a Color for each pixel that intersects the volume. If your volume is an analytical function you can probably calculate the result directly, but if your volume is stored in a texture, you will need to take multiple steps through the volume, looking up the texture at each step. This can be broken down into two parts:


1) Opacity (Light Absorption)


2) Color (Illumination, Scattering)


Opacity Sampling

To generate an opacity for a volume, the density or thickness at each visible point must be known. If the volume is assumed to have a constant density and color, all that is needed is the total length of each ray before it hits an opaque occluder. For simple untextured fog, this is just the Scene Depth which gets remapped using a standard function: D3DFOG_EXP.  This function is defined as:


F = 1/ e ^(t * d).

Where t is the distance traveled through some media and d is the density of the media. This is how cheap unlit fog has been calculated in games for quite some time. This comes from the Beer-Lambert law which defines transmittance through a volume of particles as:


Transmittance = e ^ (-t * d).

These may look similar because they are exactly the same thing. Note that x^(-y) is the same as 1/(x^y), so the Exponential Fog function is really just an applied version of the Beer-Lambert law. To understand how these functions apply to volumetrics, we can point out an equation from an old paper by Drebin [1]. It describes how much light will exit a voxel in the ray direction as it passes through it. It is designed to return an accurate color for a volume having a unique color at every voxel:


Cout(v) = Cin(v) * (1 - Opacity(x)) + Color(x) * Opacity(x)

Cin(v) is the light color before it passes the voxel, Cout(v) is the color after passing through it. This states that as a ray of light passes through a volume, at every voxel, the color of the light will be multiplied by the inverse opacity (1 - Opacity) of the current voxel to simulate absorption, and the color of the current voxel times the opacity of the current voxel will be added to simulate scattering. This equation works as is, as long as the volume is traced back to front. If we track a variable for Transmittance that is initialized to 1, the volume can be traced in either direction. Transmittance can be thought of as the inverse of opacity.
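As a minimal CPU-side sketch of that claim (Python rather than shader code, with scalar colors and hypothetical function names that are not from the original post), the two traversal orders can be compared directly. The front-to-back version carries the Transmittance variable described above:

```python
def composite_back_to_front(samples, background):
    """samples is ordered nearest-to-camera first; each entry is (color, opacity)."""
    color_out = background
    # Walk from the farthest voxel towards the camera, applying
    # Cout = Cin * (1 - Opacity) + Color * Opacity at each voxel.
    for color, opacity in reversed(samples):
        color_out = color_out * (1.0 - opacity) + color * opacity
    return color_out


def composite_front_to_back(samples, background):
    """Same result, traced from the camera outwards by tracking transmittance."""
    accumulated = 0.0
    transmittance = 1.0  # nothing has been absorbed yet
    for color, opacity in samples:
        accumulated += color * opacity * transmittance
        transmittance *= 1.0 - opacity
    # Whatever transmittance is left lets the background show through.
    return accumulated + background * transmittance
```

With three voxels such as (1.0, 0.25), (0.5, 0.5), (0.2, 0.1) over a background of 0.8, both functions return the same value, which is why the tracing direction becomes a free choice once transmittance is tracked.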


This is where Exp, or the e^x function comes into play. Similar to the problem of bank account interest, the more often you apply interest to an account, the more money that will be earned but only up to a certain point. That point is defined by e. The same effect is found when comparing the results of integrating density over a volume. The more steps that are taken, the more that the final result will converge on a solution defined by the function Exp or e raised to some power. This is where the Beer-Lambert Law as well as the D3DFOG_EXP functions come from.
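The convergence can be checked numerically. The following is a small Python sketch, not code from the original post; the function names are made up for illustration, and a constant density of 1 over a distance of 1 is assumed:

```python
import math


def fog_exp(distance, density):
    """D3DFOG_EXP-style factor: 1 / e^(t * d)."""
    return 1.0 / math.exp(distance * density)


def beer_lambert(distance, density):
    """Beer-Lambert transmittance: e^(-t * d)."""
    return math.exp(-distance * density)


def stepped_transmittance(distance, density, steps):
    """Multiply (1 - per-step opacity) once per step, like compounding interest."""
    step_opacity = density * distance / steps
    return (1.0 - step_opacity) ** steps


if __name__ == "__main__":
    # More steps bring the stepped result closer to the exponential.
    for steps in (1, 4, 16, 64, 256):
        print(steps, stepped_transmittance(1.0, 1.0, steps), beer_lambert(1.0, 1.0))
```

The fog function and the Beer-Lambert function agree to floating-point precision, and the stepped product approaches both as the step count grows.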


The math we have explored so far gives us some hints about how to proceed to build a custom volume renderer. We know we need to figure out the thickness of the volume at each point. This thickness value can then be used with an exponential density function to approximate how much light the volume would block.


To sample the density of our volume, several steps are taken along each ray passing through the volume and the value of the volume texture is read at each point. This example shows an imagined volume texture of a sphere. The camera rays show the result of sampling the volume at regular intervals to measure distance traveled within the media:


If the ray is inside the media during a step, the step length is added to an accumulation variable. If the ray is outside of the media during a step, nothing is accumulated during that step. At the end of this, for each pixel, we have a value describing how far the camera ray traveled while inside of the media in the volume texture. Because the distance is also multiplied by the opacity at each point, the final distance returned represents Linear Density.


That distance is represented in the above example as the yellow line between the yellow dots. Note that when low step counts are used like in the above example, the distances may not match the actual content very well and slicing artifacts become visible. These kinds of artifacts and solutions will be described in more detail further on.


At this point we are just accumulating linear values and returning a linear distance at the end. In order to make this look volumetric, we use an exponential function to remap the final value. The standard Direct3D exponential fog function D3DFOG_EXP mentioned above works well for this.


Example Opacity-Only Ray March

It is possible to do all of the ray marching code in the custom node, but that requires nested function calls which requires multiple custom nodes. Custom nodes get auto-named by the translator which means you have to call them assuming you know the order the compiler will add them (ie, CustomExpression0, 1, 2...). The compiler can start renaming the functions just by adding new ones or changing how they are hooked up between various material pins. 

To make this part a bit easier, I have added a PseudoVolumeTexture function into the common.usf file. Simply download and overwrite the common.usf located in Engine\Shaders. You can do this with the editor running and it should work immediately. This is basically just repeated code from the previous post on pseudo volume textures. Having this function greatly simplifies the raymarching code, and it can just be swapped for a standard 3D texture sample when a future version of UE4 adds support. If you do not have one of the versions below, download one of them and just copy the last 2 functions into your version. I suggest using 4.13.2 for now over 4.14 until the 4.14.1 version is released. I will go into that at the very end.


common.usf (UE4.14):


common.usf (UE4.13.2):


Example Volume Texture of Smoke Ball:


RayMarching Code:

float numFrames = XYFrames * XYFrames;
float accumdist = 0;

// Camera vector in the local space of the volume
float3 localcamvec = normalize( mul(Parameters.CameraVector, Primitive.WorldToLocal) );
float StepSize = 1 / MaxSteps;

for (int i = 0; i < MaxSteps; i++)
{
    // Read the density at the current position and accumulate it
    float cursample = PseudoVolumeTexture(Tex, TexSampler, saturate(CurPos), XYFrames, numFrames).r;
    accumdist += cursample * StepSize;

    // Advance the ray away from the camera
    CurPos += -localcamvec * StepSize;
}

return accumdist;

This simple code advances a ray through a specified volume texture over a distance of 0-1 in texture space and returns the linear density of the particulates traveled through. It is by no means complete and is missing crucial details. Some bits will be added to the code later and some of the details will be provided in the form of material nodes.
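To see the numbers this loop produces without running the engine, here is a rough CPU-side sketch in Python. It is not the material code: the volume texture is replaced by a hypothetical analytic sphere (radius 0.4 at the center of the 0-1 box), the ray direction is passed in directly instead of being derived from the camera vector, and the density multiplier of 4.0 used for the final remap is an arbitrary assumption:

```python
import math


def sphere_density(pos, center=(0.5, 0.5, 0.5), radius=0.4):
    """Stand-in for the volume texture lookup: 1 inside a sphere, 0 outside."""
    return 1.0 if math.dist(pos, center) <= radius else 0.0


def march_density(entry, direction, max_steps):
    """Accumulate linear density over a 0-1 distance, as in the loop above."""
    step_size = 1.0 / max_steps
    pos = list(entry)
    accum = 0.0
    for _ in range(max_steps):
        clamped = [min(max(c, 0.0), 1.0) for c in pos]  # saturate(CurPos)
        accum += sphere_density(clamped) * step_size
        pos = [c + d * step_size for c, d in zip(pos, direction)]
    return accum


# A ray entering at the middle of one face and travelling straight through.
thickness = march_density((0.5, 0.5, 0.0), (0.0, 0.0, 1.0), 64)

# Remap the linear value with the exponential fog function to get an opacity.
opacity = 1.0 - math.exp(-thickness * 4.0)
```

The true chord through the sphere is 0.8 long; with 64 steps the accumulated value lands within one step length of that, and lowering the step count makes the error visibly larger, which is the source of the slicing artifacts discussed next.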


This allows you to control the number of steps and frame layout you want to use. 


In this simplified example, the node BoundingBoxBased_0-1_UVW is used because it's an easy way to get a local 0-1 starting position. It works with box or sphere meshes, but it is not what we will end up using by the end of this, for reasons that will soon be apparent.


Here is what this should look like if you put it on StaticMesh'/Engine/EditorMeshes/EditorCube.EditorCube' with 64 steps:

A random volumetric puffball, neat! But let's not get too excited yet. With the above 64 steps, the result looks pretty smooth. With 32 steps, strange slicing artifacts appear:


These artifacts betray the box geometry used to render the material.  They are a kind of moire pattern that results from tracing the volume texture starting at exactly the surface of the box intersection. Doing that causes the pattern of sampling to continue the box shape and give it that pattern. By snapping the start positions to view aligned planes, the artifacts can be reduced.


This is an example of emulating a geometric slicing approach using only the pixel shader. It still has slicing artifacts in motion but they are far less noticeable and do not betray the box geometry which is key. Additional sampling improvements can be had with low step counts by introducing temporal jitter. More on that later. Here is the additional code to align the samples. 


// Plane Alignment
// get object scale factor
// NOTE: This assumes the volume will only be UNIFORMLY scaled. Non uniform scale would require tons of little changes.
float scale = length( TransformLocalVectorToWorld(Parameters, float3(1.00000000,0.00000000,0.00000000)).xyz);
float worldstepsize = scale * Primitive.LocalObjectBoundsMax.x*2 / MaxSteps;

float camdist = length( ResolvedView.WorldCameraOrigin - GetObjectWorldPosition(Parameters) );
float planeoffset = GetScreenPosition(Parameters).w / worldstepsize;
float actoroffset = camdist / worldstepsize;
planeoffset = frac( planeoffset - actoroffset);

float3 localcamvec = normalize( mul(Parameters.CameraVector, Primitive.WorldToLocal) );
float3 offsetvec = localcamvec * StepSize * planeoffset;

return float4(offsetvec, planeoffset * worldstepsize);

Notice that both the depth and the actor position are accounted for. That stabilizes the slices relative to the actor so there is no movement as the camera moves towards or away. I put this into another custom node for now. It will help to keep the setup part of the code separate from the core raymarching code so that other primitives like spheres can be added more easily. This is not a nested custom node since the value is used directly and only once. It is never called specifically by other custom nodes.


The next task is to control the step count more carefully. You may have noticed that the code so far is saturating the ray position to keep it inside the 0-1 space. That means whenever the tracer hits the edge of the box, it continues to waste time checking the volume. It also will never trace the full corner to corner distance of the volume since the trace distance is limited to 1, and the corner to corner distance of the volume is 1.732. This just happens to not be a problem in the example volume so far because the content is roundish. One way to fix this is by checking to see if the ray exits the volume during the loop, but a solution like that is not ideal because it adds to the overhead of the loop and that should be kept as simple as possible. A better solution is to pre-calculate the number of steps that fit.


It helps to use a simple primitive like a box or a sphere so that you can use simple math to determine thickness. While spheres may be the more performant shape due to covering fewer screen pixels, boxes let us display the entire content of volume textures and tend to be more flexible when distorting the volume. For now we will just deal with using a box. Here is how we precalculate the steps for a box. The world->local transforms allow the mesh to move. Note that this actually changes a few things about how we calculate the above plane alignment, so I just rolled the above code into this. Now the function returns the local Ray Entry Position and Thickness directly:


//bring vectors into local space to support object transforms
float3 localcampos = mul(float4( ResolvedView.WorldCameraOrigin,1.00000000), (Primitive.WorldToLocal)).xyz;
float3 localcamvec = -normalize( mul(Parameters.CameraVector, Primitive.WorldToLocal) );

//make camera position 0-1
localcampos = (localcampos / (Primitive.LocalObjectBoundsMax.x * 2)) + 0.5;

float3 invraydir = 1 / localcamvec;

float3 firstintersections = (0 - localcampos) * invraydir;
float3 secondintersections = (1 - localcampos) * invraydir;
float3 closest = min(firstintersections, secondintersections);
float3 furthest = max(firstintersections, secondintersections);

float t0 = max(closest.x, max(closest.y, closest.z));
float t1 = min(furthest.x, min(furthest.y, furthest.z));

float planeoffset = 1-frac( ( t0 - length(localcampos-0.5) ) * MaxSteps );

t0 += (planeoffset / MaxSteps) * PlaneAlignment;
t0 = max(0, t0);

float boxthickness = max(0, t1 - t0);
float3 entrypos = localcampos + (max(0,t0) * localcamvec);

return float4( entrypos, boxthickness );
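The intersection part of this function is the familiar slab test against the 0-1 box. As a hedged illustration only, the same idea can be written as a small Python sketch. The function names are hypothetical and the plane alignment offset is left out. Where the shader relies on 1 / localcamvec producing infinities for axis-parallel rays, this sketch handles zero direction components with an explicit branch, which is an assumption of the sketch rather than something the shader does:

```python
def ray_box_unit(origin, direction):
    """Slab test against the 0-1 box. Returns (t0, t1); t1 < t0 means a miss."""
    t_near, t_far = float("-inf"), float("inf")
    for o, d in zip(origin, direction):
        if d == 0.0:
            # Ray is parallel to this pair of planes.
            if o < 0.0 or o > 1.0:
                return float("inf"), float("-inf")
            continue
        a = (0.0 - o) / d
        b = (1.0 - o) / d
        t_near = max(t_near, min(a, b))  # latest entry across all slabs
        t_far = min(t_far, max(a, b))    # earliest exit across all slabs
    return t_near, t_far


def entry_and_thickness(origin, direction):
    """Ray entry position and distance travelled inside the box."""
    t0, t1 = ray_box_unit(origin, direction)
    t0 = max(0.0, t0)  # a camera inside the box starts at its own position
    if t1 <= t0:
        return None, 0.0
    entry = tuple(o + t0 * d for o, d in zip(origin, direction))
    return entry, t1 - t0
```

For a camera at (0.5, 0.5, -1.0) looking along +Z this gives an entry position of (0.5, 0.5, 0.0) and a thickness of 1.0; with the camera at the center of the box the entry is the camera position and the thickness is 0.5.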

The node marked "Ray Entry" hooks to the CurPos input on the main ray marching node. The parameter Plane Alignment allows toggling the alignment on and off.

Note that parts of the code now assume that you are using a Box static mesh that has its pivot at the center of the box and not on the floor of the box.



So far we have been using the local position of the geometry to easily start a trace from the outside, but that won't let the camera go inside the volume. To support going inside, we can instead use the Ray Entry Position output from the already solved box intersection above, and then flip the faces of the polygons on the box geometry so they face inwards.  This works because we know where the ray would have intersected the outside of the volume and we also know how long the ray will travel through the volume.

Flipping the faces and using the intersection will allow the camera to go inside the volume but it will not make objects sort correctly. Any object inside the cube will appear to draw completely on top of the volume. To solve that, we just need to take the localized scene depth into account when calculating the ray distance within the box. This requires a few new lines to be added to the setup function:


float scale = length( TransformLocalVectorToWorld(Parameters, float3(1.00000000,0.00000000,0.00000000)).xyz);
float localscenedepth = CalcSceneDepth(ScreenAlignedPosition(GetScreenPosition(Parameters)));
float3 camerafwd = mul(float3(0.00000000,0.00000000,1.00000000),ResolvedView.ViewToTranslatedWorld);

localscenedepth /= (Primitive.LocalObjectBoundsMax.x * 2 * scale);
localscenedepth /= abs( dot( camerafwd, Parameters.CameraVector ) );

//this line goes just before the line: t0 = max(0, t0);
t1 = min(t1, localscenedepth);

Now, in the material settings, Disable Depth Test should be set to true in order to gain control over how the material blends with the scene. Sorting with other translucent objects will be done on a per object basis and we won't have much control over that, but at least we can solve sorting with opaque objects. While in the material settings, also change the blend mode to AlphaComposite to avoid edge blending artifacts that occur with translucency. Also make sure the material is set to unlit.


Now we can generate accurate sorting with opaque geometry by adding one Scene Depth lookup. This automatically causes the ray marcher to return the correct opacity because we are stopping the ray from accumulating beyond the scene depth. There is still one artifact to fix though. Because we are stopping the ray march using whole step sizes, we will see stair step like artifacts where opaque geometry intersects the volume:


To fix those slicing artifacts requires just taking one additional step. We track how many steps would have fit up to the scene depth and then take one final step sized to fit the remainder. That assures we end up taking a final sample right at the depth location which smooths out those seams. In order to keep the main tracing loop as simple as possible, we do this outside of the main loop as an additional density/shadow pass.
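One way to picture the extra step, as a sketch under assumptions rather than the post's actual setup code: split the ray length into whole steps plus a fractional remainder, then weight the last sample by that fraction. The names split_steps and final_step below are hypothetical, and a constant density is assumed so the result is easy to check by hand:

```python
import math


def split_steps(thickness, max_steps):
    """Split a 0-1 ray length into whole steps plus one fractional step."""
    scaled = thickness * max_steps
    whole = math.floor(scaled)
    final_step = scaled - whole  # fraction of a full step, in [0, 1)
    return whole, final_step


def march_constant(thickness, max_steps, density=1.0):
    """Accumulate a constant density using whole steps plus the remainder."""
    step_size = 1.0 / max_steps
    whole, final_step = split_steps(thickness, max_steps)
    accum = whole * density * step_size           # main loop
    accum += density * step_size * final_step     # one extra partial step
    return accum
```

For a thickness of 0.4 at 32 steps, the whole steps alone cover only 12 / 32 = 0.375 of the distance. The partial step supplies the missing 0.025, so the accumulated value changes smoothly as the scene depth moves instead of jumping by a full step.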


The resulting blend with opaque objects appears accurate as objects move and the view direction changes:



So far we have a fairly functional density only ray marcher. As you can see, the core ray marching part of a shader is probably the simplest part. Handling the tracing behavior for different primitives, sampling and sorting problems are the tricky bits. 


Light Sampling

To render convincingly lit volumes, the behavior of light transport must be modeled. As rays of light pass through a volume, a certain amount of that light will be absorbed and scattered by the particulates in the volume. Absorption is how much light energy is lost to the volume and scattering is how much light is reflected out. The ratio of Absorption (A) to Scattering (S) determines the diffuse brightness of the particulates [shopf2007].


In this case, we are only going to care about one kind of scattering for simplicity and performance reasons: Out-Scattering. That is basically how much of the light that hits the volume will be reflected back out isotropically or diffusely. In-Scattering refers to light bouncing from within the volume, and that is generally too expensive to do in real time, but it can be decently approximated by blurring the results of the Out-Scattering. To know the out-scattering at a given point, it must be known how much light energy was lost due to absorption as the photons reached that point from the light source, as well as how much energy will then be lost heading towards the eye back out of the volume.


There are a number of techniques to calculate these values, but this post will deal primarily with the brute force method of performing a nested ray march towards the light from each density sample. This method is quite expensive as it means the cost of the shader will be DensitySteps * ShadowSteps, or N*M. It is also by far the easiest and most flexible to implement.

The above example shows nested shadow samples being traced from each density sample originating from a single camera ray. Note that only density samples that are inside of the volume media have to perform the shadow samples, and the shadow loop can quit early if a ray reaches the volume border, or if the shadow density exceeds a threshold where close to full absorption has occurred. These few things can reduce the drastic N * M situation a bit.

At each sample, the density is taken and used to determine how much light that sample can scatter back out. That also affects how much transmittance will decrease for the next iteration. The shader then shoots rays towards the light and sees how much of the potential light energy made it to that point. Thus, the visible light transmitted from the point to the camera is controlled by the total photon path length through the volume and the scattering coefficient of the point itself. This process can still be described by the prior formula from Drebin, 1988 [1]:

Cout(v) = Cin(v) * (1 - Opacity(x)) + Color(x) * Opacity(x)

But the above formula only describes a single light path to the camera. To be able to propagate light from out-scattering as well as calculate volume opacity, we need to recreate that iterative ray sample at each sample location, towards the light. Let's define a few basic functions which describe our lighting calculations.


Linear Density is defined at each point x along the ray as simply Opacity * Density Parameter. The parameter allows user tweaking of the density but will be dropped from the equations for simplicity from here on out, as it could also be pre-multiplied into the volume opacity.


Linear Density is accumulated along a ray from point x to point x' like this:
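In integral form, assuming Opacity(s) denotes the volume opacity sampled at position s along the ray (with the density parameter already folded in, as noted above), the accumulation would read:

```latex
\mathrm{LinearDensity}(x, x') = \int_{x}^{x'} \mathrm{Opacity}(s)\, ds
```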


Thus, Transmittance over the length of a ray from point x to x' is defined as:
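Following the Beer-Lambert form given earlier, with the accumulated linear density taking the place of t * d, this can be written as:

```latex
\mathrm{Transmittance}(x, x') = e^{-\mathrm{LinearDensity}(x, x')}
```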


This is how we calculated the density for the density-only ray march started above. To add lighting, we now need to account for the light scattering and absorption at each point along the ray. This involves nesting a bunch of these terms. At a point x within the volume, the amount of out-scattering that makes it to that point from a light from direction w is equal to:
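Written with the same terms, and treating the light color and diffuse color as implied multipliers, a form consistent with the description that follows is:

```latex
\mathrm{OutScattering}(x, w) = e^{-\mathrm{LinearDensity}(x, l)}
```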


Where w is the light direction and l is a point outside the volume towards the negative light direction. The term -LinearDensity(x,l) represents the linear density accumulation from point x towards the light until the volume boundary is reached which represents the amount of particulate that would absorb light. Note that this is still only the value for the amount of light visible at that point, it does not yet account for the fraction of that light absorbed based on the opacity of the sample. For that, the OutScattering term gets multiplied by Opacity(x). It also does not account for further transmission loss as that light exits back out of the volume. To account for that loss, the transmittance from the camera to the point x must be determined. 


We can make a modified function TotalOutScattering(x', w) which describes how much out-scattering is visible along a ray w from point x  to point x', rather than just describing it for a single point:
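A sketch of that function, assuming s is the integration variable along the view ray and using the abbreviations OS and T for the OutScattering and Transmittance terms, would be:

```latex
\mathrm{TotalOutScattering}(x', w) = \int_{x}^{x'} \mathrm{OS}(s, w)\, \mathrm{T}(x, s)\, ds
```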

Note that OS and T are short for the OutScattering and Transmission terms above. OS should also be multiplied by Opacity(s), which I forgot to add but may recreate the expression later. This function will return the total scattering from all points along a view ray through the volume. It is actually a few nested integrals, which is too nasty to bother writing out in the expanded form, so we might as well start dealing with the code itself. Terms like OutScattering are implied to be multiplied by light color and diffuse color at the beginning.


Traditionally you may see this equation written as Radiance (L) in other papers but I have excluded that because for radiance you also account for the amount of background color transmitted into the volume which is basically just SceneColor * FinalOpacity. We won't add that into the math here for reasons that I somewhat arbitrarily decided upon:

1) We aren't going to blend the background color like that. Instead we will just use the AlphaComposite blend mode and plug in our opacity.


2) We aren't actually going to be blurring or scattering the background color, which is why I am not going to bother talking about that term too much. For much more detail on the full math, see Shopf [2]. Much of the math on this page is based on equations from that paper, but I have attempted to make them more artist friendly by using real words instead of Greek symbols and explaining the relationships in more simplified ways.


Example Shadowed Volume Code

float numFrames = XYFrames * XYFrames;
float curdensity = 0;
float transmittance = 1;
float3 localcamvec = normalize( mul(Parameters.CameraVector, Primitive.WorldToLocal) ) * StepSize;

// Pre-multiply by the step sizes outside of the loops
float shadowstepsize = 1 / ShadowSteps;
LightVector *= shadowstepsize;
ShadowDensity *= shadowstepsize;
Density *= StepSize;
float3 lightenergy = 0;

for (int i = 0; i < MaxSteps; i++)
{
    float cursample = PseudoVolumeTexture(Tex, TexSampler, saturate(CurPos), XYFrames, numFrames).r;

    //Sample Light Absorption and Scattering
    if( cursample > 0.001)
    {
        float3 lpos = CurPos;
        float shadowdist = 0;

        // Nested march towards the light, accumulating linear density
        for (int s = 0; s < ShadowSteps; s++)
        {
            lpos += LightVector;
            float lsample = PseudoVolumeTexture(Tex, TexSampler, saturate(lpos), XYFrames, numFrames).r;
            shadowdist += lsample;
        }

        curdensity = saturate(cursample * Density);
        float shadowterm = exp(-shadowdist * ShadowDensity);
        float3 absorbedlight = shadowterm * curdensity;
        lightenergy += absorbedlight * transmittance;
        transmittance *= 1-curdensity;
    }
    CurPos -= localcamvec;
}

return float4( lightenergy, transmittance);

As you can see, just adding basic shadowing adds quite a lot of complexity to the simple density only tracer we started with.


Notice that in this version, the cameravector and lightvector get pre-multiplied by their respective stepsize in the beginning, outside of the loop. That is because shadow tracing makes the shader much more expensive so we want to move as many operations outside of the loops as possible (especially the inner loop).


In the current form, the shader code above is still very slow. We did add one optimization: the shader only evaluates a voxel if it has an opacity > 0.001. This can potentially save a lot of time if our volume texture has a lot of empty space, but it won't help at all if the whole volume is written to. We need more optimizations to make this shader practical.

The biggest problem with the above version is that it is going to run all shadow steps for all density samples. So if we used something like 64 density steps and 64 shadow steps, that would be 4096 samples. Because our pseudovolume function requires 2 lookups, that means our shader would be doing 8192 texture lookups per pixel! That is pretty bad, but we can optimize it significantly by quitting early if either the ray leaves the volume or full absorption is reached.


The first part can be handled by checking if the ray has left the volume at each shadow iteration. That would be something like:


if(lpos.x > 1 || lpos.x < 0 || lpos.y > 1 || lpos.y < 0 || lpos.z > 1 || lpos.z < 0) break;

While a check like that works, it turns out to be pretty slow since the shadow loop runs so many times. I have also tried precalculating the number of shadow steps before each shadow loop instead, very similar to how I precalculated the number of density iterations for a box shape. Surprisingly that turned out to be the slowest method. The fastest method I have found so far to early-terminate the shadow loop is with this simple box test math:


float3 shadowboxtest = floor( 0.5 + ( abs( 0.5 - lpos ) ) );
float exitshadowbox = shadowboxtest.x + shadowboxtest.y + shadowboxtest.z;
if(exitshadowbox >= 1) break;
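To convince yourself that this arithmetic behaves like the six-comparison version, the two tests can be compared on the CPU. This is a Python sketch with hypothetical function names, not shader code:

```python
import math


def outside_explicit(p):
    """The six-comparison version of the exit test."""
    return any(c > 1.0 or c < 0.0 for c in p)


def outside_floor_test(p):
    """floor(0.5 + abs(0.5 - p)) is 0 per axis inside the box and >= 1 outside."""
    box_test = [math.floor(0.5 + abs(0.5 - c)) for c in p]
    return sum(box_test) >= 1
```

The two agree for points strictly inside or strictly outside the 0-1 box. The only difference is a point lying exactly on a face (a coordinate of exactly 0 or 1), which the floor version already counts as having exited. For an early-out in a shadow loop that difference should not matter in practice.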

The next bit we need to add is early termination based on an absorption threshold. Typically this means you quit the shadow loop once the transmittance is below some small number such as 0.001. The larger this threshold, the more artifacts will appear so this value should be tweaked to be as large as is visually acceptable.


If we wrote the shadow marching loop by just multiplying the light transmittance by the inverse opacity at each point, then we would implicitly know the transmittance at every iteration, and checking for the threshold would be as simple as checking:


if( transmittance < threshold) break;

But notice that we are not actually calculating transmittance during shadow iterations. We are accumulating linear density just like in our first density-only example. This is in an effort to make the shadow loop as cheap as possible, since doing a single add for each shadow accumulation is much cheaper than doing two multiplies and a 1-x which would otherwise be required. This just means we need to use some math to determine our shadow threshold in terms of a distance rather than a transmission value.


To do that, we simply invert the final transmittance term which is calculated as e ^ (-t * d). So we want to determine for what value of t would transmittance be less than our threshold. Thankfully this is exactly what the function log(x) does. The default base of log is e. It returns an answer to the question "e raised to what power equals x". So if we want to know at what value of t the transmittance would be less than 0.001, we can calculate:

DistanceThreshold = -log(0.001) / d;

Assuming the user defined density d = 1,  this would give us a linear accumulation value of 6.907755 needed to reach 0.001 transmittance. We add this to our shader code with the line:

float shadowthresh = -log(ShadowThreshold) / ShadowDensity;

Where ShadowThreshold is a user defined transmittance threshold and ShadowDensity is a user defined shadow density multiplier. This line needs to go after the line that multiplies ShadowDensity by shadowstepsize, above the loops.
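The inversion is easy to sanity check outside the shader. In this Python sketch the function name and the example values (a shadow density of 2 spread over 16 shadow steps) are assumptions chosen purely for illustration:

```python
import math


def distance_threshold(transmittance_threshold, density):
    """Accumulated linear density at which e^(-t * d) falls to the threshold."""
    return -math.log(transmittance_threshold) / density


# With d = 1 this reproduces the value quoted above.
unit_thresh = distance_threshold(0.001, 1.0)

# In the shader, ShadowDensity has already been multiplied by the shadow step
# size, so the threshold is expressed in units of accumulated raw samples.
shadow_steps = 16
shadow_density_per_step = 2.0 / shadow_steps
shadow_thresh = distance_threshold(0.001, shadow_density_per_step)
```

Plugging shadow_thresh back into exp(-shadowdist * ShadowDensity) returns the original 0.001, so comparing the running sum against this value is equivalent to comparing transmittance against the threshold, without evaluating an exponential inside the loop.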


Updated Shadow Code:

Adding in the shadow exit and transmittance thresholds, as well as the final partial step evaluation outside of the loop (which also has to perform the same shadow steps) yields this code:


float numFrames = XYFrames * XYFrames;
float accumdist = 0;
float curdensity = 0;
float transmittance = 1;
float3 localcamvec = normalize( mul(Parameters.CameraVector, Primitive.WorldToLocal) ) * StepSize;

float shadowstepsize = 1 / ShadowSteps;
LightVector *= shadowstepsize;
ShadowDensity *= shadowstepsize;
Density *= StepSize;
float3 lightenergy = 0;

// Transmittance threshold expressed as an accumulated linear density
float shadowthresh = -log(ShadowThreshold) / ShadowDensity;

for (int i = 0; i < MaxSteps; i++)
{
    float cursample = PseudoVolumeTexture(Tex, TexSampler, saturate(CurPos), XYFrames, numFrames).r;

    //Sample Light Absorption and Scattering
    if( cursample > 0.001)
    {
        float3 lpos = CurPos;
        float shadowdist = 0;

        for (int s = 0; s < ShadowSteps; s++)
        {
            lpos += LightVector;
            float lsample = PseudoVolumeTexture(Tex, TexSampler, saturate(lpos), XYFrames, numFrames).r;

            float3 shadowboxtest = floor( 0.5 + ( abs( 0.5 - lpos ) ) );
            float exitshadowbox = shadowboxtest.x + shadowboxtest.y + shadowboxtest.z;
            shadowdist += lsample;

            // Quit early on full absorption or when the ray leaves the box
            if(shadowdist > shadowthresh || exitshadowbox >= 1) break;
        }

        curdensity = saturate(cursample * Density);
        float shadowterm = exp(-shadowdist * ShadowDensity);
        float3 absorbedlight = shadowterm * curdensity;
        lightenergy += absorbedlight * transmittance;
        transmittance *= 1-curdensity;
    }
    CurPos -= localcamvec;
}

// Final partial step, evaluated outside of the main loop
CurPos += localcamvec * (1 - FinalStep);
float cursample = PseudoVolumeTexture(Tex, TexSampler, saturate(CurPos), XYFrames, numFrames).r;

//Sample Light Absorption and Scattering
if( cursample > 0.001)
{
    float3 lpos = CurPos;
    float shadowdist = 0;

    for (int s = 0; s < ShadowSteps; s++)
    {
        lpos += LightVector;
        float lsample = PseudoVolumeTexture(Tex, TexSampler, saturate(lpos), XYFrames, numFrames).r;

        float3 shadowboxtest = floor( 0.5 + ( abs( 0.5 - lpos ) ) );
        float exitshadowbox = shadowboxtest.x + shadowboxtest.y + shadowboxtest.z;
        shadowdist += lsample;

        if(shadowdist > shadowthresh || exitshadowbox >= 1) break;
    }

    curdensity = saturate(cursample) * Density;
    float shadowterm = exp(-shadowdist * ShadowDensity);
    float3 absorbedlight = shadowterm * curdensity;
    lightenergy += absorbedlight * transmittance;
    transmittance *= 1-curdensity;
}

return float4( lightenergy, transmittance);

Now we have a functioning translucent volume ray marcher that can self-shadow from one directional light. The above shadow steps would have to be repeated for each additional light supported. The code can easily support point lights in addition to directional lights by calculating inverse squared falloff in addition to each shadow term, but the vector from CurPos to the light must be calculated at each density sample.
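As a rough illustration of the point light case, the per-sample work might look like the Python sketch below. It is not part of the original shader: the function name is hypothetical, the light position is assumed to already be in the same local 0-1 space as CurPos, and no light radius or falloff clamp is applied (so the light must not sit exactly on the sample position):

```python
import math


def point_light_terms(cur_pos, light_pos, shadow_steps):
    """Per-sample light step vector and inverse squared falloff for a point light."""
    to_light = [l - c for l, c in zip(light_pos, cur_pos)]
    dist_sq = sum(v * v for v in to_light)
    dist = math.sqrt(dist_sq)
    falloff = 1.0 / dist_sq
    # Normalized direction, pre-scaled by the shadow step size (1 / ShadowSteps).
    light_step = [v / dist / shadow_steps for v in to_light]
    return light_step, falloff
```

The returned step vector plays the role of the pre-multiplied LightVector in the shadow loop, and the falloff would multiply the shadow term for that sample. Unlike the directional case, neither value can be hoisted out of the density loop.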


Ambient Light

So far we have only been dealing with Out-Scattering contributed from a single light. This generally will not look very good, because wherever the light is fully shadowed the volume will appear flat. Usually some kind of ambient light term is added to address this. There are lots of ways to handle the ambient light. One way is to pre-calculate the ambience inside of the volume texture, like deep shadow maps. The downside to that approach is you won't be able to rotate and instance the volumes, as the ambient light would remain fixed. A realtime approach is to cast a few sparse rays up from each voxel to estimate overhead shadowing. This can be done with one additional offset sample, but the results get better with each additional averaged sample.


Another reason to favor a dynamic ambient term over a prebaked one is if you are planning to procedurally stack multiple volume textures. One example of this is described in the Horizon Zero Dawn cloud paper [3]. In this paper, one volume texture describes the macro shape of unique detail over an entire area and a second tiling volume texture is used to modulate the density of the base volume. An approach like this is very powerful as volume rendering techniques are currently limited by resolution. Applying blend modulation is a great way to create the appearance of more detail, but it means methods that precalculate lighting will not match the new details that arise from the combination of volume textures.
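The stacking idea can be sketched on the CPU as follows. This is only an illustration of modulating a base density with a tiling detail density: the analytic `base_density` and `detail_density` functions stand in for the two volume textures, and the blend formula, `tiling` and `detail_strength` are assumptions rather than the method used in [3].

```python
import math


def base_density(p):
    """Macro shape: a soft sphere centred in the 0-1 volume (stand-in for the base texture)."""
    d = math.dist(p, (0.5, 0.5, 0.5))
    return max(0.0, 1.0 - d / 0.5)


def detail_density(p):
    """Tiling detail in the 0-1 range (stand-in for a tiling noise texture)."""
    x, y, z = (c % 1.0 for c in p)  # wrap coordinates so the pattern repeats
    two_pi = 2.0 * math.pi
    return 0.5 + 0.5 * math.sin(two_pi * x) * math.sin(two_pi * y) * math.sin(two_pi * z)


def combined_density(p, tiling=4.0, detail_strength=0.5):
    """Base density modulated by the tiled detail density."""
    base = base_density(p)
    detail = detail_density(tuple(c * tiling for c in p))
    # Blend between the unmodified base and the base scaled by the detail.
    return base * ((1.0 - detail_strength) + detail_strength * detail)
```

Because the final density only exists after the two lookups are combined, any lighting baked into either texture alone cannot account for the detail that the modulation carves out.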


Here is how we take three additional offset samples to estimate overhead ambient occlusion. This can go just after the transmittance is multiplied in the main loop:


//Sky Lighting

shadowdist = 0;

lpos = CurPos + float3(0,0,0.05);

float lsample = PseudoVolumeTexture(Tex, TexSampler, saturate(lpos), XYFrames, numFrames).r;

shadowdist += lsample;

lpos = CurPos + float3(0,0,0.1);

lsample = PseudoVolumeTexture(Tex, TexSampler, saturate(lpos), XYFrames, numFrames).r;

shadowdist += lsample;

lpos = CurPos + float3(0,0,0.2);

lsample = PseudoVolumeTexture(Tex, TexSampler, saturate(lpos), XYFrames, numFrames).r;

shadowdist += lsample;

//shadowterm = exp(-shadowdist * AmbientDensity);

//absorbedlight = exp(-shadowdist * AmbientDensity) * curdensity;

lightenergy += exp(-shadowdist * AmbientDensity) * curdensity * SkyColor * transmittance;

The two commented out terms were just an attempt to reduce the number of temporaries used. The same can be done to all of the code.
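The three unrolled samples above can also be written as a loop over offsets, which makes it easy to experiment with more samples. Below is a CPU-side Python sketch of the same arithmetic, assuming a hypothetical `density(p)` callable in place of `PseudoVolumeTexture` and a scalar sky color for brevity:

```python
import math


def saturate3(p):
    """Clamp each component to the 0-1 range, like HLSL saturate()."""
    return tuple(min(max(c, 0.0), 1.0) for c in p)


def sky_light(cur_pos, density, cur_density, transmittance,
              ambient_density=1.0, sky_color=1.0, offsets=(0.05, 0.1, 0.2)):
    """Ambient contribution for one density sample.

    Sums the density found at a few offsets straight up (+Z) from cur_pos and
    turns that sum into an exponential shadow term, as in the snippet above.
    """
    shadow_dist = 0.0
    for dz in offsets:
        lpos = (cur_pos[0], cur_pos[1], cur_pos[2] + dz)
        shadow_dist += density(saturate3(lpos))
    return math.exp(-shadow_dist * ambient_density) * cur_density * sky_color * transmittance
```

The returned value is what would be added to `lightenergy` for that sample.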


Light Extinction Color

Notice that we are only applying the LightColor to the shadow term once per density sample. Doing it this way does not allow the scattering to change color with depth. The scattering from clouds in real life is mostly Mie scattering, which scatters light wavelengths equally, so the single-color scatter is not bad for clouds. Still, colored extinction can emulate extinction spectra in liquids, sunset IBL response, or artistic effects just by replacing the scalar ShadowDensity parameter with a V3 (float3). To get a particular color, divide the ShadowDensity by the color you want it to show.
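As a rough illustration of that division, here is a CPU-side Python sketch of a per-channel shadow term. The function name and the epsilon guard against zero-valued channels are assumptions, not part of the original material.

```python
import math


def colored_shadow_term(shadow_dist, shadow_density, extinction_color):
    """Per-channel Beer-Lambert shadow term.

    Dividing the density by the colour means brighter channels are
    extinguished more slowly, so deep shadows drift toward that colour.
    """
    eps = 1e-4  # assumed guard against division by zero for black channels
    return tuple(math.exp(-shadow_dist * shadow_density / max(c, eps))
                 for c in extinction_color)
```

With a warm color such as `(1.0, 0.5, 0.25)`, red survives the longest path through the volume and blue is absorbed first.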


Here is what the entire material should look like now:


Notice a phase function was added to the light color (that function exists in engine\content but is not exposed to the function library). It was done this way rather than on the output side of the ray marcher so that the phase function could be separated to just the directional light and not affect the ambient light.
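The text does not say which phase function the material uses. As one common choice, and purely as an assumption here, a Henyey-Greenstein phase function could be applied to the directional light only, as in this CPU-side Python sketch (the function names and the parameter `g` are illustrative):

```python
import math


def henyey_greenstein(cos_theta, g):
    """Henyey-Greenstein phase function, normalised over the sphere.

    cos_theta is the cosine of the angle between the light direction and the
    view direction; g in (-1, 1) controls forward (g > 0) or backward scatter.
    """
    denom = (1.0 + g * g - 2.0 * g * cos_theta) ** 1.5
    return (1.0 - g * g) / (4.0 * math.pi * denom)


def phased_light_color(light_color, cos_theta, g=0.5):
    """Scale only the directional light colour; ambient light is left alone."""
    phase = henyey_greenstein(cos_theta, g)
    return tuple(c * phase for c in light_color)
```

Applying the phase term to the light color before it enters the ray marcher, rather than to the marcher's output, is what keeps the ambient term unaffected.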


Additional Shadowing Options

It is possible to add support for various shadowing methods, such as the custom per-object depth based shadow maps discussed in a previous post. While a solution like that can work here, depth based shadowmaps do not look great for volumetrics because the shadow will be crisp without performing expensive custom blurring (and remember we are already inside of a crazy expensive nested loop).


I have only experimented so far with enabling Distance Field Shadows. Distance field shadows are nice for volumetrics because the shadows can be made soft without extra cost. The downside is that looking up the global distance fields many times for volumetric purposes is extremely expensive, and the resolution of the distance fields themselves is not great. Only try this if you have a 980+ level GPU.


Adding distance field shadows also requires passing in or recomputing the world-space light vector, preferably outside of the loop:


float3 LightVectorWS = normalize( mul( LightVector, Primitive.LocalToWorld));

Then inside of the main loop, just after the shadow steps:


float3 dfpos = 2 * (CurPos - 0.5) * Primitive.LocalObjectBoundsMax.x;

dfpos = TransformLocalPositionToWorld(Parameters, dfpos).xyz;

float dftracedist = 1;

float dfshadow = 1;

float curdist = 0;

float DistanceAlongCone = 0;

for (int d = 0; d < DFSteps; d++)
{
    DistanceAlongCone += curdist;
    curdist = GetDistanceToNearestSurfaceGlobal(dfpos.xyz);

    //The cone radius grows with distance travelled; the ratio gives a soft penumbra
    float SphereSize = DistanceAlongCone * LightTangent;
    dfshadow = min( saturate(curdist / SphereSize) , dfshadow);

    dfpos.xyz += LightVectorWS * dftracedist * curdist;
    dftracedist *= 1.0001;
}


Then the term dfshadow gets multiplied by the absorbed light.
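To see how the cone trace produces soft shadows, here is a CPU-side Python sketch of the same loop against an analytic signed distance function. The `sphere_sdf` occluder stands in for `GetDistanceToNearestSurfaceGlobal`, the `dftracedist` growth factor is omitted, and the first iteration is skipped explicitly because the cone radius is still zero there. All of these are assumptions of the sketch.

```python
import math


def sphere_sdf(p, center=(0.0, 0.0, 5.0), radius=1.0):
    """Signed distance from p to a sphere (stand-in for the global distance field)."""
    return math.dist(p, center) - radius


def df_shadow(start, light_dir, sdf, steps=16, light_tangent=0.1):
    """Sphere-trace toward the light, tracking the narrowest cone coverage seen."""
    pos = start
    shadow = 1.0
    cur_dist = 0.0
    distance_along_cone = 0.0
    for _ in range(steps):
        distance_along_cone += cur_dist
        cur_dist = sdf(pos)
        sphere_size = distance_along_cone * light_tangent
        if sphere_size > 0.0:
            # Ratio of free space to cone radius, clamped to 0-1.
            coverage = min(max(cur_dist / sphere_size, 0.0), 1.0)
            shadow = min(coverage, shadow)
        # Step forward by the distance to the nearest surface.
        pos = tuple(p + d * cur_dist for p, d in zip(pos, light_dir))
    return shadow
```

A ray that hits the occluder returns 0, a ray that stays well clear returns 1, and a ray that passes close by returns an intermediate value, which is where the soft edge comes from.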


Temporal Jitter

Sometimes slicing artifacts will show up even with high step counts and other times the resolution of the volume texture itself can cause artifacts. When low step counts are used, still images can be improved by using the plane snapping described above, but camera motion will still show the slicing artifacts as the slices rotate. Temporal Jitter basically randomly moves around the starting locations every frame and smooths the result. It generally works well unless you have moving objects in front of the jittered surface.


In the past I used the DitherTemporalAA material function to do this, but there is a cheaper and better way now, thanks to Marc Olano's improved pseudorandom functions added to UE4 in 4.12. It boils down to these three lines (note that localcamvec has been pre-multiplied by step size at this point):

int3 randpos = int3(Parameters.SvPosition.xy, View.StateFrameIndexMod8);

float rand = float(Rand3DPCG16(randpos).x) / 0xffff;

CurPos += localcamvec * rand.x * Jitter;
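The same idea can be tried on the CPU with this Python sketch. The hash below is a PCG-style 3D integer hash used as a stand-in; it is not claimed to match the engine's `Rand3DPCG16` bit for bit, and the names `rand3d_pcg16` and `jitter_start` are illustrative.

```python
MASK32 = 0xFFFFFFFF  # emulate 32-bit unsigned wraparound


def rand3d_pcg16(x, y, z):
    """PCG-style 3D integer hash returning three 16-bit values."""
    vx = (x * 1664525 + 1013904223) & MASK32
    vy = (y * 1664525 + 1013904223) & MASK32
    vz = (z * 1664525 + 1013904223) & MASK32
    # Two rounds of cross-component mixing.
    for _ in range(2):
        vx = (vx + vy * vz) & MASK32
        vy = (vy + vz * vx) & MASK32
        vz = (vz + vx * vy) & MASK32
    # Keep the high 16 bits, which are the best mixed.
    return vx >> 16, vy >> 16, vz >> 16


def jitter_start(cur_pos, local_cam_vec, pixel_xy, frame_index_mod8, jitter=1.0):
    """Offset the ray start by a random fraction of one step along the ray."""
    rx, _, _ = rand3d_pcg16(int(pixel_xy[0]), int(pixel_xy[1]), frame_index_mod8)
    rand = rx / 0xFFFF  # 0-1 inclusive
    return tuple(p + v * rand * jitter for p, v in zip(cur_pos, local_cam_vec))
```

Because the hash input includes the frame index, each pixel gets a different start offset every frame, and temporal smoothing then averages the slices away.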


Final Notes

Earlier I suggested using 4.13.2 since 4.14 introduced a regression that prevents the material compiler from sharing instructions between pins. So connecting the opacity and emissive color means the entire raymarch function is done twice. One workaround in 4.14 is to use 1.0 for opacity and then use the opacity to lerp between emissive and scene color.


(I had more notes, but it turns out this blog template limits the post length and simply omits things beyond that point, so I will add more information in a followup post. It won't even let me fit all the references.)



[1]: Drebin, R. A., Carpenter, L., and Hanrahan, P. Volume rendering. In SIGGRAPH ’88: Proceedings of the 15th Annual Conference on Computer Graphics and Interactive Techniques (1988), pp. 65–74.

