Age | Commit message | Author |
|
|
|
|
|
away the split storage.
Also fixes a bug discovered by Mike Jewell that crashed BVH creation after
the last commit.
|
|
between GPU and CPU. This allows much more complex geometries to
be run on CUDA devices with less memory.
The GPUGeometry object now takes a min_free_gpu_mem parameter giving
the minimum number of bytes that must remain free on the GPU after the
BVH is loaded. By default, this number is 300 MB. Cards with sufficient
memory will have the entire BVH on card, but those without enough
memory will have the BVH split such that the top of the hierarchy (the
most frequently traversed) is on the GPU.
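
As an illustration of the split described above, here is a minimal sketch of how the number of on-card nodes could be derived from the free memory. It is not the code from this commit: the function name `split_node_count`, the reading of "300 MB" as 300 * 10^6 bytes, and the assumption that nodes are stored top-down (so the top of the hierarchy is the leading slice of the node array) are all hypothetical. The 16-byte node size is taken from a later commit message in this log.

```python
def split_node_count(total_nodes, free_gpu_bytes,
                     min_free_gpu_mem=300 * 1000**2, node_bytes=16):
    """Return (nodes_on_gpu, nodes_on_host) for a BVH of total_nodes nodes.

    Assumption: nodes are laid out top-down, so the first nodes_on_gpu
    entries are the top of the hierarchy and the rest stay in host memory.
    """
    # Memory we may use while still leaving min_free_gpu_mem bytes free.
    budget = max(free_gpu_bytes - min_free_gpu_mem, 0)
    # Whole nodes that fit in that budget.
    fit = budget // node_bytes
    on_gpu = min(total_nodes, fit)
    return on_gpu, total_nodes - on_gpu
```

With plenty of free memory the whole tree lands on the card; otherwise only the leading (upper) part of the node array does.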
|
|
'weld' a solid onto another one at identical shared triangles. optionally apply a ``Surface`` or color to the shared surface.
this isn't a boolean solid operation -- the triangles must be identical in the two meshes.
|
|
|
|
update remaining unit tests to build BVHs with
``loader.create_geometry_from_obj`` instead of the (removed) ``build``
method.
|
|
The ``Material`` struct now includes two new arrays: ``reemission_prob`` and ``reemission_cdf``. The former is sampled only when a photon is absorbed, and should be normalized accordingly. The latter defines the distribution from which the reemitted photon wavelength is drawn.
This process changes the photon wavelength in place, and is not capable of producing multiple secondaries. It also does not enforce energy conservation: the reemission spectrum does not depend on the wavelength of the absorbed photon.
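
The two-step sampling described above can be sketched on the host side as follows. This is an illustrative sketch only, not the CUDA code from this commit: the function name `reemit`, the shared `wavelengths` grid, and the use of linear interpolation are assumptions. `reemission_cdf` is assumed to rise from 0 to 1 over the grid.

```python
import numpy as np

def reemit(wavelength, wavelengths, reemission_prob, reemission_cdf, rng):
    """Decide the fate of an absorbed photon.

    Returns the new wavelength if the photon is reemitted, or None if it
    stays absorbed. All three arrays share the same wavelength grid
    (an assumption of this sketch).
    """
    # Reemission probability at the wavelength of the absorbed photon.
    p = np.interp(wavelength, wavelengths, reemission_prob)
    if rng.random() >= p:
        return None  # absorbed for good
    # Inverse-transform sampling: invert the CDF at a uniform random number.
    u = rng.random()
    return float(np.interp(u, reemission_cdf, wavelengths))
```

Note that the new wavelength is drawn without reference to the incoming one, which matches the remark that energy conservation is not enforced.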
|
|
remove the ``SURFACE_SPECULAR`` and ``SURFACE_DIFFUSE`` models, since their functionality is available using the more-general ``SURFACE_DEFAULT``. also allow the user to specify the reflection type (specular/diffuse) for the complex and wls models. change wls so the normalization of properties is more consistent with the default.
|
|
All surface models including ``SURFACE_COMPLEX`` and ``SURFACE_WLS`` are now working. Note that the WLS won't work right in hybrid rendering mode since that mode relies on matching up incoming and outgoing photon wavelengths in a lookup table.
|
|
this fixes hybrid rendering mode
|
|
|
|
reduce models to the following:
SURFACE_DEFAULT, // specular + diffuse + absorption + detection
SURFACE_SPECULAR, // perfect specular reflector
SURFACE_DIFFUSE, // perfect diffuse reflector
SURFACE_COMPLEX, // use complex index of refraction
SURFACE_WLS // wavelength-shifting reemission
where ``SURFACE_COMPLEX`` uses the complex index of refraction (``eta`` and ``k``) to compute reflection, absorption, and transmission. this model comes from the SNO+ RAT PMT optical model.
|
|
surfaces now have an associated model which defines how photons are propagated. currently, these include specular, diffuse, mirror, photocathode (not implemented), and tpb. the default is the old behavior, where surfaces do some weighted combination of detection, absorption, and specular and diffuse reflection.
`struct Surface` contains as members the superset of all model parameters; not all are used by all models. documentation (forthcoming) will make clear what each model looks at.
|
|
isotropically distributed directions.
|
|
|
|
|
|
|
|
but no improvement to actual simulation.
|
|
group photons so that they take similar paths on the GPU.
argsort_direction() morton-orders an array of normalized direction
vectors according to their spherical coordinates. Photons sorted in
this way tend to follow similar paths through a detector geometry,
which enhances cache locality. As a result, get_node() uses the GPU
L1 cache again, with good results.
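
A rough NumPy sketch of Morton-ordering directions by their spherical coordinates is shown below. It is not the implementation from this commit: the function names, the 16-bit quantization, and the choice of which angle takes the even bits are assumptions made for illustration.

```python
import numpy as np

def _spread_bits(v):
    """Insert a zero bit between each of the low 16 bits of v."""
    v = v.astype(np.uint32) & 0x0000FFFF
    v = (v | (v << 8)) & 0x00FF00FF
    v = (v | (v << 4)) & 0x0F0F0F0F
    v = (v | (v << 2)) & 0x33333333
    v = (v | (v << 1)) & 0x55555555
    return v

def argsort_direction_sketch(dirs):
    """Return indices that sort unit vectors along a Morton (Z-order) curve
    in (theta, phi) space, so nearby directions end up adjacent."""
    dirs = np.asarray(dirs, dtype=np.float64)
    theta = np.arccos(np.clip(dirs[:, 2], -1.0, 1.0))     # polar angle, [0, pi]
    phi = np.arctan2(dirs[:, 1], dirs[:, 0]) + np.pi      # azimuth, [0, 2*pi]
    scale = (1 << 16) - 1
    t = np.round(theta / np.pi * scale).astype(np.uint32)
    p = np.round(phi / (2.0 * np.pi) * scale).astype(np.uint32)
    # Interleave the bits: theta on even bit positions, phi on odd ones.
    code = _spread_bits(t) | (_spread_bits(p) << 1)
    return np.argsort(code, kind="stable")
```

Applying the returned index array to the photon arrays before propagation places photons with similar directions in neighbouring threads.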
|
|
|
|
|
|
background.
|
|
|
|
|
|
|
|
about 60-90 seconds.
|
|
This is an adaptation of the original Chroma BVH construction algorithm.
The generation stage is very slow, but can be fixed.
|
|
old location in newer pycuda releases.
|
|
parent nodes if combining them would result in a parent node that is
excessively large compared to the surface area of the children.
This doesn't help as much as you might imagine.
|
|
|
|
|
|
|
|
areas of the BVH nodes in a particular layer of the tree.
|
|
|
|
Node access is very irregular as each thread descends the BVH tree.
Each node is only 16 bytes, so the 128 byte cache line size in the L1
cache means that a lot of useless data is often fetched. Using some
embedded PTX, we can force the L1 cache to be skipped, going directly
to L2. The L2 cache line is 32 bytes long, which means that both
children in a binary tree will be cached at the same time.
This improves the speed on the default generated binary trees, but
does not help an optimized tree yet.
|
|
|
|
|
|
using the Camera class.
|
|
|
|
tracing for CUDA" by Hannu Saransaari.
The intersect_box() function has been rewritten to be much shorter and
use the min() and max() functions, which map directly to hardware
instructions. Additionally, the calculations inside intersect_box()
have been reorganized to allow the compiler to use the combined
multiply-add instruction, instead of doing a subtraction followed by a
division (which is much slower).
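
The reorganization described above can be illustrated with a slab test written in Python. This is a sketch of the general technique, not the CUDA `intersect_box()` from this commit: the function name and the explicit handling of zero direction components are assumptions. The point is that `(bound - origin) / d` is rewritten as `bound * inv + bias`, a form that maps onto a fused multiply-add.

```python
def intersect_box_sketch(origin, direction, lower, upper):
    """Return True if the ray origin + t*direction (t >= 0) hits the box."""
    tmin, tmax = 0.0, float("inf")
    for o, d, lo, hi in zip(origin, direction, lower, upper):
        if d == 0.0:
            # Ray is parallel to this slab: it must start inside it.
            if o < lo or o > hi:
                return False
            continue
        inv = 1.0 / d          # one reciprocal per axis
        bias = -o * inv
        t0 = lo * inv + bias   # multiply-add instead of subtract-then-divide
        t1 = hi * inv + bias
        tmin = max(tmin, min(t0, t1))
        tmax = min(tmax, max(t0, t1))
    return tmax >= tmin
```

On the GPU the reciprocal and bias would typically be computed once per ray rather than once per box; that detail is left out here for brevity.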
|
|
consistency everywhere.
|
|
minimization
|
|
* chroma-bvh create [name] [degree] - Creates a new BVH with the specified
branching degree.
* chroma-bvh node_swap [name] [layer] - Optimizes a BVH layer with a
"greedy, short-sighted" algorithm that swaps around nodes to minimize
the surface area of the immediate parent layer. Rebuilds the tree
above the modified layer when finished.
Also modified the chroma-bvh stat command to print the sum of the
logarithms of the areas of each layer. It seems to be a rough
predictor of the simulation speed of the BVH.
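
The per-layer figure mentioned above could be computed along these lines. This is a minimal sketch, not the code behind `chroma-bvh stat`: the function names, the representation of a node as a `(lower, upper)` corner pair, and the use of the natural logarithm are assumptions.

```python
import math

def box_area(lower, upper):
    """Surface area of an axis-aligned box given its two corners."""
    dx, dy, dz = (u - l for l, u in zip(lower, upper))
    return 2.0 * (dx * dy + dy * dz + dz * dx)

def layer_log_area_sum(boxes):
    """Sum of log(surface area) over all node boxes in one BVH layer.

    Note: a degenerate box with zero area makes math.log raise ValueError.
    """
    return sum(math.log(box_area(lo, hi)) for lo, hi in boxes)
```

Comparing this sum before and after a `node_swap` pass gives a quick indication of whether the layer improved.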
|
|
|
|
|
|
|
|
Geometry.
|
|
Note that rendering is still broken by the new BVH format.
|
|
searching through files, named geometries in the cache, and geometry
creation functions.
The loader function is also responsible for fetching or creating a BVH
to go with the geometry.
This commit also removes some code that has been replaced by the new
system. Other bits will come back in future commits.
|