SOC

SOC – Dust radiative transfer with OpenCL

SOC is used for the modelling of scattering and thermal emission of interstellar dust. The name was originally an acronym for Scattering with OpenCL, which refers to the fact that the program is implemented using OpenCL .

The program was used in the code comparison published in Gordon et al. 2017 . However, since then the main program has been translated from C++ to Python and some additional features have been added. The initial Python version was described in Juvela 2019 . Because the main computations are carried out in OpenCL kernels that are compiled to machine code, the use of Python in the main program did not result in any significant degradation of the run times. On the contrary, the main program is now much shorter and allows easier modifications and tests of alternative implementation details.

This page describes the use of the Python version of SOC. Our goal has been to keep SOC simple so that it only concentrates on the actual radiative transfer calculations. SOC needs only minimal information of the dust models, which is given through input files. SOC can calculate dust temperatures for grains that are in equilibrium with the radiation field but the emission of stochastically heated grains is handled outside SOC (for example, using DustEM ). The script SOC.py is used for calculations of dust temperatures and emission and there is a separate script SOCS.py for calculations of light scattering (using the forced first scattering and peel-off methods). A development version of the program (called ASOC.py and ASOCS.py) can be found at GitHub . The GitHub directory contains an example how to make a 3D model, run SOC, and plot the calculated surface brightness and dust temperatures.

Input files

Parameter file ^

The parameter file (ini-file) is what is given to SOC on the command line and from where SOC extracts all information about other input files and the parameters of the run. It is a text file where each line starts with a keyword that is possibly followed by one or more parameters of that keyword. Lines that do not start with a known keyword are discarded and one can also use the “#” character to indicate a comment. The following list contains most of the common keywords with their arguments (before colon) with a short explanation.

In the descriptions we refer to the total number of cells in the model as CELLS and the number of simulated frequencies as NFREQ. When talking about data arrays, for example array [CELLS, NFREQ] corresponds to a storage order where the index over frequencies runs faster.

  • absorbed filename gives the name for the file that will contain the number of absorbed photons in each cell and frequency
  • ali turns on accelerated lambda iterations (only for dust emission, in calculations with multiple iterations, not for stochastically heated grains)
  • background filename [k] specifies the isotropic background where the intensities are read from file filename, optionally scaled by a factor of k
  • bgpackets # specifies the number of photon packages to be simulated (per iteration and per frequency) from the background (see background and hpbg)
  • cellpackets # specifies the number of photon packages to be simulated (per iteration and per frequency) from dust within the model volume
  • CLE tells SOC to calculate dust emission on device instead of in the host Python code (much faster; only for equilibrium dust)
  • CLT tells SOC to calculate dust temperatures on device instead of in the host Python code (much faster; only for equilibrium dust)
  • cload filename tells SOC to read absorptions from the specified file that has been written in a previous SOC run using the csave option. In this case the simulation of “constant” radiation sources (everything apart from the dust medium itself) will be omitted. Initial run with csave can be combined with subsequent SOC runs with cload to avoid the unnecessary overhead of resimulating the absorptions caused by those constant radiation sources.
  • cloud filename gives the name of the cloud file (containing densities and information of the grid hierarchy)
  • colden filename asks for column density maps (corresponding to the actual maps being calculated) to be saved to file named based on filename – the keyword colden has been replaced with savetau
  • csave filename tells SOC to save absorptions to the given file; the file will contain the absorptions that result from all radiation sources except for the dust medium itself; the file can be read in subsequent SOC runs by using the cload keyword
  • DEFS to ease experimentation with different kernels, this can be used to add definitions that are passed to the kernel at compile time; this is a string that should follow the format “-D definition” (or any strings understood by the compiler)
  • density # specifies additional scaling for the density values that are read from the cloud file (default is 1.0, no rescaling)
  • device string specifies the device (in some versions devices) to be used; c stands for CPU (the default) and g for GPU. The string could thus be ‘c’ or ‘g’, or ‘cg’ (to allow either CPU or GPU to be chosen). One can also give a second optional parameter, which chooses a particular device based on string in its name. For example “device c Intel” or “device g GTX”. The alternative way to choose a particular device is to use the platform keyword.
  • diffpack # gives the number of photon packages (per frequency) that are used to simulate emission from the diffuse emission sources specified with diffuse; this is separate from the dust emission from the medium
  • diffuse filename [#] specifies a file with additional sources of diffuse radiation (photons/cm3/Hz); this is a plain binary file containing CELLS and NFREQ (two int32 numbers) followed by CELLS float32 numbers for the actual emission; see keyword diffpack; note: this diffuse emission is completely separate from the emission that is calculated for the actual dust medium
  • directions theta phi specifies the direction towards the observer that is used when calculating the the output maps; theta is the angle from positive Z axis and phi the rotation around the +Z axis, the angle increasing from +X towards +Y. For direction 90 0, the observer is in the direction of the positive X axis and the maps are oriented so that cloud Y axis is right and the Z axis up. For direction 0 0, observer is in the direction of positive Z axis and in the maps Y is right and X is down. For direction 90 90, the observer is in the direction of the positive Y axis and in the maps the cloud X axis points left and the cloud Z axis up.
  • dsc filename specifies the name for the scattering function file; there should be either a single line with dsc or as many instances as there are dust components
  • emitted filename specifies the name for a file where the emissions (number of emitted photons per cell and frequency) are stored
  • emweight # for numbers #>0, generates photon packages from the dust medium according to a weighting scheme (currently options enweight 1 and emweight 2 exist)
  • EXEC executes the remainder of the line in a shell at the time when the ini-file is read. Because of obvious security concerns, this is by default commented out in the code (see the file pySOCAux.py)
  • fits [ra de] save spectra to FITS file instead of a plain binary file. Ra and de are optional image centre coordinates [degrees] used in the file header.
  • ffs # turns on (#=1) or off (#=0) the method of forced first scattering. This applies only to the SOCS.py script where the forced first scattering method is by default on.
  • gridlength # the size of a root grid cell in parsecs
  • hpbg filename # # specifies a Healpix file that is used to describe the surface brightness of the background sky (instead of or in addition to (check!) an isotropic background); the first optional numerical parameter is a scaling factor applied to the intensity values in the file (default=1) and the second optional parameter, if >0, tells SOC to use weighted sampling when generating photon packages from the background
  • iterations # number of iterations (when dust emission is simulated and the self-coupling between this emission and dust temperature is significant)
  • loadtemp filename in case of equilibrium dust (and only a single dust population), recalculates emission based on dust temperatures in the given file; this is used mainly to save disk space, storing CELLS temperature values instead of directly the CELLS×NFREQ emission values
  • local # overrides the default size of OpenCL work groups; the defaults are 8 for CPU and 32 for GPU
  • mapping nx ny step specifies the number of pixels and the pixel size of the output maps; nx and ny are the map dimensions and step the pixel size in units of the root grid cells (=in units of gridlength)
  • mapum # # ... list wavelengths [µm] for which (and only for which) maps should be written; this is used together with the FITS keyword, to produce separate FITS files for the surface brightness at the listed wavelengths
  • mirror xXyYzZ changes simulation so that photon packages that would exit the model volume are reflected back on selected boundaries; in the argument string (1-6 characters long), lower-case letters refer to lower boundaries and upper-case letters to upper boundaries along the corresponding coordinate axes
  • mmapabs I selects the absorption data to be handled using a normal file (I=0) or as a memory mapped file (I=1, default). Option I=1 requires less memory (one frequency at a time) but option I=0 is significantly faster, if data for all frequencies can be kept in the main memory during the run.
  • mmapemit I selects the emission data to be handled using a normal file (I=0) or as a memory mapped file (I=1, default). Option I=1 requires less memory but option I=0 is faster, if data for all frequencies can be kept in the main memory.
  • nnabs # for runs involving neural networks, # is the list of wavelengths at which the intensities are used as the inputs for the networks.
  • nnemit # for runs involving neural networks, # is the list of wavelengths for which the networks provides intensity predictions
  • nnmake string instructs SOC to create a mapping from absorptions (at wavelengths specified by nnabs) to emission (at wavelengths specified by nnemit). The trained networks are saved to files consisting of the string provided by the keyword nnemit and the name of the dust component.
  • nnsolve string instructs SOC to use neural networks to predict the emission (at wavelengths specified by nnemit), based on RT simulations at wavelengths specified by nnabs.
  • nnthin # instructs SOC to solve the emission only for every #:th cell, to be used in the training of neural networkds (with keyword nnmake)
  • noabsorbed in dust emission calculations with equilibrium-temperature dust (no stochastically heated grains), the integral of absorbed energy in each cell can be calculated on the fly, without saving the per-frequency values to a file; by using this keyword, one can in that case skip the writing of the large (CELLS×NFREQ×4 bytes) file of absorptions
  • nomap skip the writing of intensity and polarisation maps
  • nosolve skip the solving of dust temperatures and dust emission inside SOC (even when calculating a single dust species with grains in equilibrium temperature; used for example when SOC should write spectrum files based on emission that was already calculated in a previous run or with an external program)
  • optical filename [abundance-file] this specifies the name of the file containing the dust optical parameters (see Dust Files below); the second optional parameters gives the name of a binary file of [CELLS] float32 numbers that specifies the relative abundance of the dust in every cell (the default would be 1.0 for every cell)
  • optishalf when calculations include multiple dust components with variable abundances, we need to pass to the kernel large arrays that contain absorption and scattering cross section separately for each cell – with this keyword these are passed as half-precision floating point numbers (16 bits) allowing larger models to be calculated (especially on GPUs) but with some loss of accuracy (saving 4×CELLS bytes of device memory)
  • platform # select a specific OpenCL platform; by default SOC uses the first available OpenCL platform, based on the CPU vs. GPU selection made with the keyword device. One can also give an optional second integer argument, which chooses a particular device from the given platform. The default is “platform 0 0”, while “platform 1 2” would choose the second platform and its third device (e.g. if one had multiple GPUs)
  • p0 # if keyword polred is not used, specifies the intrinsic polarisation fraction (= polarisation fraction for uniform plane-of-the-sky magnetic field); note that the keyword is letter “p” followed by zero (0, not o)
  • pointsource x y z filename [scale] adds a point-like radiation source to the model; the position is given by (x,y,z) in units of the root grid, filename refers to the luminosity file (see section Radiation Field below), and scale is an optional scaling (a floating point number) for the source luminosity; there can be many lines with this keyword
  • polmap Bx By Bz asks SOC to write out (I, Q, U) maps using the three data files given as the arguments; these are hierarchical files with the same structure as the density file and contain the components of the magnetic field vectors for each cell
  • polred filename file of polarisation reduction factors R for each cell; the format is a plain binary file with the number of cells (int32) followed by CELLS floating point numbers (float32); the input file format could be changed in the code, in pySOCR.py (look for USER.file_polred) also to a hierarchical file with the same format as the density file
  • psmethod # alternative schemes for the simulation of point sources when they are located outside the model volume; the default is to simulate photon packages isotropically; method 1 sends packages only in the 2 pi solid angle that contains the cloud (a small improvement in SNR); method 2 sends photon packages only towards visible cloud surface elements but the weighting between visible cloud sides is not quite optimal; method 5 sends photon packages towards a cone that is aligned with one of the coordinate axes and contains the cloud (less optimal if the point source position is far from the cloud centre when both are projected onto the plane perpendicular to the centre direction of the cone); currently psmethod 2 is still the recommended one, especially if the point sources is far outside the model volume
  • pspackets # specifies the number of photon packages (per frequency and iteration) that are used to simulate emission from point sources
  • prefix <string> specifies the prefix attached to the names of some output files
  • reference turns on the use of a reference field (only for dust emission with multiple iterations, in practice not for stochastically heated grains)
  • remit min_um max_um limits the produced emission file (see keyword emitted) to contain data only for the given wavelength range, specified with lower and upper limits in micrometers; used mainly to reduce the size of the output files when surface brightness data are needed only for a limited set of wavelengths (note: in case of stochastically heated grains, the helper script A2E_MABU.py will also read this option)
  • saveint # saves intensity values for all cells; if parameter # is equal to 1, SOC saves to ISRF.DAT intensities estimated from photons entering a cell (two int32 numbers CELLS, NFREQ followed by float32 array [CELLS, NFREQ]); with value 2 it saves to the file also the net vector sum using additional three vector components (file begins with three int32 numbers CELLS, NFREQ, 4 that are followed by the float32 array [CELLS, NFREQ, 4]); with parameter value 3 SOC saves intensities that are calculated based on the absorptions within the cell (effective value for the intensity over the cell volume, not the average intensity on the cell boundary), the file starts with two int32 numbers (CELLS, NFREQ) followed by the float32 array [NFREQ, CELLS] of intensity values in cgs units
  • savetau filename float save map of optical depth to the named file for the wavelength specified in units of µm. If the wavelength is omitted, a map of the column density is saved instead. The map (pixel size, number of pixels) is the same as for regular maps (see mapping).
  • seed # specifies a floating point seed value for the random number generation; if the provided value is negative, the seed value is taken from the clock
  • simum # # specifies in units of µm the wavelength range used in the simulations (can be used for example to skip long wavelengths that do not contribute to heating); default range is 0.003µm-3m
  • singleabu in the special case of calculation having two dust components for which the sum of relative abundances is one, this advises SOC to store internally only an array for the abundances of the first dust component, the second having implicitly abundances one minus the abundance of the first dust (saving 4×CELLS bytes of host memory)
  • temperature filename in case of calculations with a single equilibrium temperature dust, specifies a file where the dust temperatures of all cells are stored (the hierarchical file format is the same as for the cloud density file)
  • wavelengths # # specifies on optional smaller wavelength range for the output spectrum files (resulting in smaller files, if spectra are needed only for a smaller wavelength range); the arguments are the minimum and maximum wavelength in units of µm

Cloud description ^

We describe here the format for the density file that uses a hierarchical grid. This is a binary file where all values are either 4 byte integers or 4 byte floats. The file begins with five integers: [Nx, Ny, Nz, levels, cells]. Here Nx-Nz are the dimensions of the cartesian root grid, levels is the number of hierarchy levels and cells the total number of cells in the model. This is followed, for each level in the grid hierarchy, by the number of cells on the current level and the vector of density values for the cells on that level. The file thus in a sense consists of separate density vectors, one for each hierarchy level of the disretisation. We follow the C-order where the index of the X-axis is running fastest.

If the whole hierarchy only contains the root grid, the file thus starts with the numbers [Nx, Ny, Nz, 1, NX×Ny×Nz, Nx×Ny×Nz], where 1 tells that there is only one level. In this case the number Nx×Ny×Nz appears twice, first as the total number of cells, then as the number of cells on the first level. These six integers are followed by Nx×Ny×Nz floating point numbers. Note that if you write the density cube from a 3D python array, the last index is running fastest so that for SOC the Python array element n[k,j,i] would correspond to coordinates x=i, y=j, and z=k.

If the cloud contains multiple hierarchy levels, the density value of a refined cell should be replaced with a link to its first subcell that is part of the density vector for the next hierarchy level. If the first subcell is the k:th cell in the vector containing all cells of the next level, then the density value of the parent cell is replaced by the bit representation of the integer number –k. The fact that it is negative tells SOC that it cannot be a density value and it is thus interpreted this way as a way to find the child cells. Note that in bit representation the value is still integer, not a floating point value like the rest of the elements in the density vector. Thus, when working with Python, one cannot simply take a negative value m from the density vector of floats, multiply it by -1 and assume that int(-m) is the index of the child cell. You have to convert it to integer bit representation using Python ctypes library, something like ctypes.c_int.from_buffer(ctypes.c_float(m)).value. The converse transformation from integer to floating point number would be ctypes.c_float.from_buffer(ctypes.c_int(i)).value.

The density values are typically density of hydrogen atoms per cm³ or density of hydrogen molecules per cm³. The correct unit is determined by the dust model, relative to which density the dust opacities are defined.

Dust description ^

Standard file format - the “simple” format

SOC concentrates on the actual radiative transfer (RT) simulations and was written to be dust model agnostic. Many things that are later needed to compute dust temperatures are not needed during the RT phase, even when model includes several possibly stochastically heated dust populations. Therefore the input file for dust properties can be quite simple and is made only slightly more complex by being backwards compatible with the format used for the CRT program.

We refer to this file format as the “simple” format. Here is an example of the beginning of the file that in the SOC ini file is specified using the optical keyword.

   eqdust
   1.00000e-07      # number density of grains
   1.00000e-04      # grain size [cm]
   184              # number of frequencies
   # freq        g         Qabs         Qsca
   2.99792e+09   0.00000   6.22363e-15  2.97781e-27
   3.67645e+09   0.00000   7.83909e-15  6.73353e-27
   4.50854e+09   0.00000   9.87451e-15  1.51629e-26
   5.52896e+09   0.00000   1.24392e-14  3.40209e-26
   6.78033e+09   0.00000   1.56729e-14  7.67112e-26
   8.31493e+09   0.00000   1.97494e-14  1.73108e-25
   ...

The first line is always eqdust (format assuming that grains are at some equilibrium temperature, ignoring parameters needed to solve temperature distributions of stochastically heated grains). The second line contains the ratio x between the number of grains and the number of particles, whose number density is given in the cloud file. The third row is the radius a of the grains [cm] and the forth row the number of frequencies in the main table. The main table below the four-line header has four columns: frequency, asymmetry parameter g of the scattering function, absorption efficiency Qabs and scattering efficiency Qsca. The actual cross section for absorption per unit density is thus pi×a²×Qabs×x and similarly the cross section for scattering per unit density is pi×a²×Qsca×x and only the values of these products are used. In other words, the fact that a and x are listed in the header is superfluous because those could as well be set to a value of one and the true cross sections per unit density given directly in the third and fourth column of the main table. Also, for real dust models the cross sections are integrals over size distributions so the dust does not have a single value for a. For historical reasons, the format is as described above. However, in future we may allow input files that contain just the main table of four columns – or just three columns because also the g parameter is superfluous when the scattering function is in any case specified separately (see below).

In the ini-file keyword dsc is used to specify the file that contains data on the discrete scattering function. The file contains two data tables. The first one is discretised linearly as a function of the cosine of the scattering angle theta (corresponding to linspace(-1.0,+1.0, bins) where bins is the number of bins). For each value of cos(theta), one lists the scattering probability per unit solid angle (not per unit step in theta or in cosine of theta!). This is repeated for every frequency used in the simulation. The second half of the file includes the cumulative probability functions of cos(theta) that provide a mapping from values [0,1] to [-1,+1]. SOC uses this directly to generate random scattering directions. This is also simply a vector of bins elements that implicitly correspond to linear discretisation of the input range [0,1] and that is repeated for every frequency included in the simulation. All in all, the scattering function file consists of bins×NFREQ+bins×NFREQ floats of 4 bytes each.

When runs include several dust species with varying abundances, SOC must have information of the dust opacities and scattering properties separately for each dust component. In that case the ini-file should just have one line with optical and one line with dsc keyword per each dust component (of course also specifying dust-specific abundances). If one has many dust components but their relative abundances do not vary over the model volume, it is better (=faster) to combine them into a single dust species with a single scattering function.

File format for stochastically heated grains

Although SOC itself has not knowledge of stochastic heating, one can use it for calculations of stochatically heated grains (see the section below ). In that connection other programs use a more complete file format for the dust populations, including the information of the grain size distributions and grain enthalpies. For historical reasons, this is called the “gs” format.

Here is an example of a “gs” file for astrosilicates

gsetdust
prefix     aSilx
nstoch     -1
optical    gs_aSilx.opt
enthalpies gs_aSilx.ent
sizes      gs_aSilx.size

The file starts with the line gsetdust, stating that the file confirms to the “gs” format. Otherwise the main content consists of references to files for optical properties, enthalpies and sizes. The file of optical properties (here gs_aSilx.opt) begins with

50 800  # NSIZE, NFREQ
3.00000e-04   # SIZE [um]
# FREQ      Qabs        Qsca        g             .... 800 lines ....
2.99792e+09  1.91850e-12  2.51210e-31  0.00000e+00
3.18935e+09  2.18650e-12  3.21210e-31  0.00000e+00
3.39301e+09  2.49200e-12  4.10730e-31  0.00000e+00
3.60965e+09  2.84020e-12  5.25190e-31  0.00000e+00
3.84015e+09  3.23700e-12  6.71550e-31  0.00000e+00

The file thus contains a table (freq, Qabs, Qsca, g) for each size (here 50) and a frequency (here 800 frequencies). Each section begins with the grain size given in micrometers. The enthalpy file (here gs_aSilx.ent) contains

50   #  NSIZE
  3.00000e-04  # first size [um]
  3.71040e-04  # second size [um]
  4.58910e-04  # ...
  ...
  8.08530e+00
  1.00000e+01  # 50th size [um]
30   #  NTEMP
  1.00000e-01  # first temperatures [K]
  1.45221e-01
  2.10902e-01
  ...
  3.44271e+03
  5.00035e+03  # 30th temperature [K]
5.46893e-23  1.44984e-22  4.41055e-22  ...
1.03467e-22  2.74296e-22  8.34432e-22  ...
2.24147e-19  6.86570e-19  1.95759e-22  ...
...

After stating the size and temperature grids, the final array in the file (one row per size, one column per temperature) contains the actual enthalpy values (per grain, in cgs units).

Finally, the size distribution is contained in a file like gs_aSilx.size:

 3.28016e-10   # GRAIN_DENSITY
 25 128    # NSIZE NE
 #  SIZE [um]    S_FRAC      Tmin [K]   Tmax [K]
 4.00000e-03  4.62849e-01   4.000e+00  2.500e+03
 5.18223e-03  2.48623e-01   4.000e+00  2.223e+03
 6.71389e-03  1.33550e-01   4.000e+00  1.978e+03
 8.69824e-03  7.17378e-02   4.000e+00  1.759e+03
 1.12691e-02  3.85346e-02   4.000e+00  1.564e+03
 1.45997e-02  2.06992e-02   4.000e+00  1.391e+03
 1.89148e-02  1.11188e-02   4.000e+00  1.237e+03
 2.45053e-02  5.97255e-03   4.000e+00  1.100e+03
 3.17480e-02  3.20821e-03   4.000e+00  9.787e+02
 4.11314e-02  1.72332e-03   4.000e+00  8.705e+02
 5.32882e-02  9.25698e-04   4.000e+00  7.742e+02
 6.90380e-02  4.97247e-04   4.000e+00  6.885e+02
 8.94427e-02  2.67101e-04   4.000e+00  6.124e+02
 1.15878e-01  1.43476e-04   4.000e+00  5.446e+02
 1.50127e-01  7.70693e-05   4.000e+00  4.844e+02
 1.94498e-01  4.13985e-05   4.000e+00  4.308e+02
 2.51984e-01  2.07848e-05   4.000e+00  3.832e+02
 3.26460e-01  8.00860e-06   4.000e+00  3.408e+02
 4.22949e-01  1.85190e-06   4.000e+00  3.031e+02
 5.47955e-01  1.67059e-07   4.000e+00  2.696e+02
 7.09907e-01  2.78308e-09   4.000e+00  2.397e+02
 9.19727e-01  2.36276e-12   4.000e+00  2.132e+02
 1.19156e+00  1.12940e-17   4.000e+00  1.896e+02
 1.54374e+00  7.13642e-27   4.000e+00  1.687e+02
 2.00000e+00  1.02343e-42   4.000e+00  1.500e+02              

The main information is the fraction (S_FRAC) of grains per size bin. The table also includes hints for the minimum and maximum temperature that should be included in the calculations of size distributions. However, the actual methods may ignore these prescriptions.

Using DustEM dust models

DustEM is a modelling tool that includes descriptions for a number of dust model. One can use these dust models in SOC but this requires a conversion to the “simple” or “gs” formats described above. However, the GitHub SOC directory contains a file DustLib.py with routines that assist in this conversion.

import os, sys
sys.path.append('/home/mika/starformation/SOC/')
import DustLib
from DustLib import *

# Specify the DustEM directory
DustLib.DUSTEM_DIR  =  '/home/mika/tt/dustem4.2_web'
# Specify the frequency grid = frequencies used in SOC simulations
FREQ        =  loadtxt('freq.dat')
                
# DUSTEM_FILE = 'GRAIN_C11_aSilx.DAT'  # aSilx component only
DUSTEM_FILE = 'GRAIN_DL07.DAT'         # this contains five dust components

# DustEM may use same dust name multiple times, with different size distributions.
# Routine write_DUSTEM_files assigns each dust component a unique name and writes
# files dustem_<dust name>.dust for use below.
NEWNAME   =  write_DUSTEM_files(DUSTEM_FILE)  # returns unique dust names
        
# Write "simple" dust files for each dust component
DEDUST = []
for name in NEWNAME:
DUST  = DustemDustO('dustem_%s.dust' % name, force_nsize=200)
DEDUST.append(DUST)
write_simple_dust([DUST,], FREQ, filename='%s_simple.dust' % name, dscfile='%s.dsc' % name)
    
# Write one "simple" dust file for the combination of all dust components
# One can use this in SOC, if the relative abundances of the dusts are constant.
write_simple_dust(DEDUST, FREQ, filename='combined_simple.dust', dscfile='combined.dsc')

# Write dust files of the "gs" type, one for each dust component
write_A2E_dustfiles(NEWNAME, DEDUST, NE_ARRAY=128*ones(len(DEDUST)))

In the above example, we export the dust model described in the DustEM file GRAIN_DL07.DAT, which includes five dust components. For each dust component, the above script writes one file in the “simple” format, one file in the “gs” format, and one file for the discretised scattering function. Because the DustEM file has two entries named “Gra” (associated with different grain size distributions), we have to rename one of those using a unique name. Thus, the script produces the files listed below.

#  "gs" dust files
gs_PAH0_DL07.dust
gs_PAH1_DL07.dust
gs_Gra.dust
gs_Gra_copy1.dust     # renamed Gra -> Gra_copy1
gs_aSil.dust

# "simple" dust files
PAH0_DL07_simple.dust
PAH1_DL07_simple.dust
aSil_simple.dust
Gra_simple.dust
Gra_copy1_simple.dust

# scattering functions
PAH0_DL07.dsc
PAH1_DL07.dsc
aSil.dsc
Gra.dsc
Gra_copy1.dsc

# all dust components combined
combined_simple.dust
combined.dsc

One would use the above files in different connections:

  • If one used SOC without stochastic heating, one would use in the ini file only the files combined_simple.dust and combined.dsc.
  • However, if the relative abundances are not constant, one would use the five individual “simple” dust files, each with a file specifying its spatial abundance variation (see ini file keyword optical).
  • If one includes the stochastic heating, SOC radiative transfer calculations would still use the “simple” files. However, the dust emission would be solved outside SOC, based on the information in the “gs” type dust files. See the notes on advanced use of SOC .

In the case of stochastically heated grains, the combination of radiative transfer and dust emission calculations can be automated with the [ASOC_driver.py](#shg) script. That would get as input an ini file with references to “gs”-type dust files.

Radiation field prescription ^

Radiation field can be produced by isotropic and non-isotropic background sources and point-like radiation sources that reside inside or outside the actual model volume.

Specification of the intensity of an isotropic background component requires a binary file that lists the intensity in cgs units for each frequency (in the order of increasing frequency) included in the simulation. The frequency grid must be identical to the one specified by the keyword optical. It is thus a plain binary file with NFREQ 4 byte floating point numbers.

Anisotropic background is specified using a Healpix-type binary file. At the moment the resolution of the background map is hardcoded to correspond to healpix maps with size NSIDE=64 (49152 pixels over the sky, each pixel ~55 arcmin in size) and in the RING order. The file simply lists the pixel values (4 byte floats) for 49152 pixels on the first frequency, followed by similar vectors for the remaining frequencies in the order of increasing frequency. The total size of the file is thus always NFREQ×49152×4 bytes. For example, if the simulation uses 100 frequencies (as specified by the optical keyword), the healpix file for the background intensity is still only 18.8 MB in size. The background file can be prepared in Python for example using the healpy library.

For point sources one of course specifies luminosity (in cgs units, erg/s/Hz) instead of intensity. The file is a plain binary file with NFREQ 4 byte floats, one number per simulated frequency, in the order of increasing frequency. The frequency grid must again be identical to the one specified by the optical keyword.

Output files

Dust emission ^

SOC.py (and the development version ASOC.py)saves the dust emissio maps to binary files map_dir_#.bin where # is a running index of the observer direction, corresponding to the inifile directions keyword. Each file begins with the map dimensions NX and NY in pixels (2×int32), followed by the data cube that in Python corresponds to an array with dimensions (NFREQ, NY, NX). Here NFREQ is the number of frequencies that usually is the same as specified by the keyword optical but may be limited by the keyword wavelengths.

If the output maps are calculated in the healpix format, the files have names map_dir_0_H.bin. The file begins with four int32 numbers: NSIDE, -1, NFREQ, LEVELS. NSIDE is the NSIDE parameter of the Healpix maps, the second number is a dummy negative value, NFREQ is the number of frequencies and levels is the number of levels in the hierarchical discretisation of the model cloud. The rest of the file contains the allsky healpix maps that as a Python array corresponds to an array (NFREQ, 12×nside×nside) of float32 numbers.

Scattered light ^

The scattered light images computed with SOCS.py (and ASOCS.py) are stored in binary files called outcoming.dat that also contain the different observer directions within the same file. The file begins with the number of pixels in the y and x directions and the number of frequencies (3 × int32). These are followed by the list of frequencies (float32 values). The rest of the file forms a data cube (float32 values) that in Python has dimensions (nfreq, ndir, ny, nx): the number of frequencies, the number of observer directions, and the dimensions of each scattered radiation map.

Basic SOC runs ^

In this section we look at basic SOC runs that involve a model with Cartesian spatial discretisation and a single dust population, calculations ignoring the stochastic heating of the dust grains. This would be appropriate mainly for the FIR and longer wavelengths >100 µm - with the caveat that the long-wavelength emission will be slightly overestimated because stochastic heating would move some of that radiated energy to shorter wavelengths.

The format of the cloud file was described above . The writing of the cloud file is shown more conceretely in the example below.

When there is only a single dust component and stochastic heating is not taken into account, the dust temperatures and dust emission can be calculated inside SOC. Thus, a single SOC run will do all three steps: simulate the radiation field, recomputed dust temperatures and emission, and write the results. Depending on the parameters in the [ini file]{#file_ini}, the output files may include both surface brightness maps and the 3D distribution of the dust equilibrium temperature.

To optimise the actual runs, it is possible split them to smaller steps. With the ini file keywords nosolve and nomap one can first do radiation field simulation only. Even further, one can separate the simulations of the dust reemission from the simulations of other (immutable) radiation sources. For example, one could first simulate radiation from the diffuse background and point sources, and save the result with the csave keyword. If one needs to iterate the calculations to find the dust equilibrium temperatures (and one does not know beforehand how many iterations might be needed), one can then remove the nosolve keyword, add the cload keyword to read in the precomputed part of the radiation field. On all subsequent iterations, one then needs to simulate only the photons reemitted by the dust. After all, changes in the dust temperature do not affect the energy input from the other radiation sources (as long as the dust optical properties do not depend on temperature). Once the dust temperatures have converged, a separate SOC run can be done to write the spectra (a run with zero iterations, after removing the nomap keyword). When there is only a single, equilibrium-temperature dust component (i.e. no stochastic heating), one can safely use they keyword noabsorbed to save some disk space.

Below is complete example of a basic model run.

Example of basic SOC runs ^

In the following example we create a 3D model for a cloud, calculate the dust emission with SOC, and plot a surface brightness map and a cross section of the dust temperature in the model. We are here ignoring the stochastic heating so that each cell does have a well-defined single temperature value. The files used in this example can be found in the GitHub SOC directory , in the file soc_example.zip .

Cloud model

SOC uses models with octree discretisation. However, here we use only the root grid and create a regular Cartesian grid of 64³ cells, all with the same density.

from numpy import *
N   =  64                                       # model with N^3 cells
fp  =  open('tmp.cloud', 'w')
asarray([N, N, N, 1, N*N*N], int32).tofile(fp)  #  NX, NY, NZ, LEVELS, CELLS
asarray([N*N*N,], int32).tofile(fp)             #  cells on the first (and only) level
n   = ones((N,N,N), float32)                    #  density cube
n.tofile(fp)
fp.close()

Ini file and SOC run

We define the run-time parameters with a text file tmp.ini. As described above, this consists of keywords, followed by one or more parameters. The commented file looks like this:

gridlength      0.01               # root grid cells have a size of 0.01 pc each
cloud           tmp.cloud          # density field (reference to a file)
mapping         64 64 1.0          # output 64x64 pixels, pixel size == root-grid cell size
density         1.0e4              # scale values read from tmp.cloud
seed           -1.0                # random seed for random numbers
directions      0.0  0.0           # observer in direction (theta,phi)
optical         tmp.dust           # dust optical parameters
dsc             tmp.dsc  2500      # dust scattering function
bgpackets       999999             # photon packages simulated from the background
background      bg_intensity.bin   # background intensity at the
iterations      1                  # one iteration is enough
prefix          tmp                # prefix for output files
absorbed        absorbed.data      # save absorptions to a file
emitted         emitted.data       # save dust emission to a file
noabsorbed                         # actually, we integrate absorptions on the fly and skip
the *file*
temperature     tmp.T              # save dust temperatures
device          g                  # run calculations on a GPU
CLT                                # temperature calculations done on the device
CLE                                # emission calculations done on the device

The format is free, the keyword must just be the first string on a line and the arguments are separated by at least by one space (or tab). One can add comments after the arguments or also as separate lines starting with ‘#’.

Above the keyword density scales the raw density values read from the cloud file with a constant. With the keyword mapping, we define that the number of pixels in the putput maps will be the same as the projected number of cells in the model, 64×64, and the pixel size is equal to the model cell size. The keyword direction put the observer in the direction of the positive z axis. We ask the calculations to be performed on a gpu; if the device line is commented out, the default is to use the CPUs.

Before we can run SOC, we must prepare three files in addition the tmp.cloud that contains the density distribution. These are tmp.dust (listing dust absorption and scattering cross sections for the selected frequencies), tmp.dsc (containing discretised scattering functions), and here also bg_intensity.bin. The file bg_intensity.bin contains the intensities of the external radiation field for the frequencies listed in tmp.dust. The files were described above and examples of these files are included in soc_example.zip, in the SOC directory in our GitHub page. .

The actual radiative transfer calculations are run from the command line:

ASOC.py   my.ini

Note that the latest version of the program is called ASOC.py, not SOC.py. The run should take only some seconds. It will produce a few files, including the file map_dir_00.bin for the dust-emitted intensities and the file tmp.T for the dust temperatures.

Plotting

The following script shows how the data in the files can be read and plotted. For convenience, we assume that we have a separate text file freq.dat that simply lists the frequencies that are included in the calculations - that is thus identical to the first column in tmp.dust, after skipping the first four lines.

from matplotlib.pylab import *
freq     = loadtxt('freq.dat')            # read the frequencies
nfreq    = len(freq)                      # ... number of frequencies
figure(1, figsize=(8,2.8))
fp       = open('map_dir_00.bin', 'rb')   # open the surface-brightness file
dims     = fromfile(fp, int32, 2)         # read map dimensions in pixels
S        = fromfile(fp, float32).reshape(nfreq, dims[0], dims[1]) # read intensities
S       *= 1.0e-5                         # convert surface brightness from Jy/sr to MJy/sr
fp.close()
ifreq    = argmin(abs(freq-2.9979e8/250.0e-6)) # Choose frequency closest to 250um
ax = subplot(121)
imshow(S[ifreq,:,:])                      # plot the 250um surface brightness map
title("Surface brightness")
colorbar()
text(1.34, 0.5, r'$S_{\nu} \/ \/ \rm (MJy \/ sr^{-1})$', transform=ax.transAxes,
va='center', rotation=90)
fp       = open('tmp.T', 'rb')            # open the file with dust temperatures
NX, NY, NZ, LEVELS, CELLS, CELLS_LEVEL_1 = fromfile(fp, int32, 6)
T        = fromfile(fp, float32).reshape(NZ, NY, NX) # T is simply a 64x64x64 cell cube
ax = subplot(122)
imshow(T[NZ//2, :, :])                    # plot cross section through the model centre
title("Dust temperature")
colorbar()
text(1.34, 0.5, r'$T_{\rm dust} \/ \/ \rm (K)$', transform=ax.transAxes, va='center', rotation=90)
show()

The code produces the following figure with the 250µm surface brightness map and a cut through the dust temperature cube. These are not very exciting because the density was constant everywhere. However, the plot does show the expected increase of surface brightness towards the borders and the drop in temperature towards the model centre.

Advanced SOC runs ^

In this section we discuss more complex use cases for the SOC program. Compared to the above basic case, these may include:

Hierarchical discretisation

The use of hierarchical discretisation in principle only means that one adds more hierarchy levels in the cloud file . Otherwise the main consideration is the number of photons packages in the simulation. This affects especially the simulation of the background radiation entering the model volume. Each additional level of refinement means that the smallest cells have a factor of four smaller projected area. If the number of background photon packages (keyword bgpackets) is kept constant, the number of rays hitting those smallest cells goes by a factor of four and the noise in the dust temperature estimates in principle doubles. With deeper hierarchies, one will also need many more photon packages. One can also use the keyword split, which means that simulated photon packages are split into four each time the refinement level increases along the photon path. This does not prevent the noise from increasing at higher refiment levels but the increase is hopefully only linear instead of exponential. However, the behaviour will vary from model to model. In particular, if the refined regions are optically thick, most photon packages will scatter away from those regions and the temperature uncertainties will be high. Note that the photon splitting can create a very large number of rays, making the calculations slow relative to the actual value of the bgphot parameter.

Another affected parameter is the number of pixels in the surface brightness maps. Even if the maps are finally needed at lower resolution, there is little use to make a model with many hierarchy levels unless the emitted radiation is sampled at a corresponding resolution. However, while the cloud model uses octree discretisation to reduce the amount of memory used, the output files use a fixed pixel size. When the map pixel size is set to match the smallest model cell size, the map files can become large. For example, for a model with a root grid of 512×512×512 cells and six levels of refinement, a single map would 4 GB - potentially more than the whole 3D model cloud file itself. It is advisable to limit the number of frequencies in the map files to the absolute minimum using keywords wavelengths or libmap.

Multiple dust components

When there are multiple dust populations with spatially varying abundances, dust emission is no longer computed inside the SOC program itself. SOC is still used to (1) simulate the radiation field and (2) to write the dust emission spectra. However, between these two steps one needs to run another script A2E_MABU.py, which reads the radiation field information written by SOC, splits the number of absorptions between the dust species, solves the temperatures and emission for each dust species separately, and finally adds the results into a single file describing the total dust emission (in each cell and at each frequency). The steps are thus:

  • Initial radiation field simulation
    • After adding nosolve and nomap in the ini file, one runs ASOC.py my.ini. SOC will make the radiation field simulation and save information about the absorbed energy into the file specified by the keyword absorbed (e.g. absorbed.data)
  • Solving of dust temperatures and emission
    • One executes A2E_MABU.py my.ini absorbed.data emitted.data to compute the dust emission into (e.g.) the file emitted.data. Internally, A2E_MABU.py will call the script A2E.py to solve the emission for each dust component separately.
  • Writing of emission maps
    • One modifies the ini file by setting iterations 0 and removing nomap. The second SOC run ASOC.py my.ini will then produce the map files.

The script ASOC_driver.py automates the above. Thus, instead of the normal ASOC.py my.ini one can execute ASOC_driver.py my.ini and this should work irrespective of the number of dust components.

Stochastically heated grains

Even if the calculations involved only a single dust component, if one solves the stochastic heating of the grains, the calculation of dust temperatures and emission cannot be done inside SOC. In fact, SOC only understands dust files of the simple type , listing absorption and scattering cross sections but no information about size distributions or enthalpies. Like in the case of multiple dust components above, the solution is to delegate the dust emission calculation to the A2E_MABU.py script.

In the case of stochastic heating, we need actually two descriptions of the dust model. The “simple” file is used by SOC. The temperature and emission calculations use the [“gs” type]{#dust_gs}, which includes those additional descriptions of size distributions and enthalpies. These more complete dust files can be exported for example from DustEM using existing scripts.

The role of A2E_MABU.py is to loop over different dust components. Inside the script, the emission of a each single dust component is computed with the script A2E.py. One could run A2E.py also directly, but that makes sense only if there is only one dust component, i.e. the absorptions are no divided between different dust components. The call to A2E.py takes the form

A2E.py   some_dust.solver absorbed.data emitted.data

In other words, the script reads and writes similar absorption and emission files as SOC. The first parameter some_dust.solver is a precomputed file (created automatically in the ASOC_driver.py script) that speeds up the actual calculations of the dust temperature distributions and dust emission (for this particular dust component). That file is prepared with yet another script A2E_pre.py.

A2E_pre.py some_dust.dust  freq.dat some_dust.solver

The inputs include the dust file (of the “gs” type) and a file listing the same frequencies that are going to be used in the radiation field simulations. The result of the A2E_pre.py run is the file some_dust.solver.

The full calculations can thus involve quite a few steps:

  • creation of the solver files for each dust component with stochastic heating
  • creation of the “simple” dust files for use with SOC
  • run of SOC to simulate the radiation field and save the result (e.g. the file absorbed.data)
  • split the absorptions between the different dust components, run A2E.py to solve the emission for each of them, and sum the results to get the total emission (e.g. as the file emitted.data), taking into account the possibly spatially varying abudances of each of the dust components
  • run SOC another time to write the emission maps

Fortunately, these are already included in the script ASOC_driver.py. Thus, all one needs to do is to run

ASOC_driver.py   my.ini
The ini file can contain several dust components, possibly a mixture of the “simple” and “gs” kind. For each “gs” dust, the “simple” version is created (if it does not already exist, as needed by SOC) and the solver file is created (if it does not already exist). ASOC_driver.py will then carry out the initial SOC simulation, call A2E_MABU.py to compute the emission, and will finish with another SOC run to write the spectrum files. Note than A2E_MABU.pycan solve emission for both stochastically heated and equilibrium-temperature grains.

Neural networks for stochastic heating

Neural networks can be used to approximate dust emission based on information of absorptions at a small number of wavelengths. Especially in the case of stochastically heated grains, this provides a faster (but approximate) way to calculate the emission, after simulating the radiation field only at a few wavelengths. The use of neural networks with SOC requires the installation of the Pytorch machine learning library.

The calculations with neural networks include two steps. During the training run, one solves the emission from stochastically heated grains in the normal way. This provides the inputs for a neural network that then learns the mapping from the local radiation field intensity at the set of reference frequencies to the resulting dust emission at another set of output frequencies. During subsequent runs, the RT calculations can then be carried using only the small set of reference frequencies, and neural network then converts the information about the absorbed energy directly into estimates of the dust emission. The RT step is much faster because of the smaller number of frequencies, and the use of the neural networks is also much faster than the full calculations of stochastic heating and emission.

The neural network fit is mainly valid for the range of conditions found in the training set. Neural networks are able to give a good representation of the emission vs. absorption mapping already with a small number of connections, which suggests that also the training sets do not have to be very large. Traning sets of tens of thousands of absorption-emission pairs might be enough, compared to the potentially millions of cells in the examined models. This suggests further possibilities for speed-up, as the training data can be extracted from a lower resolution model (cutting down the cost of both the RT simulation and the calculations of stochastic heating), or using a full-resolution model (with full RT simulation) but only a representative subset of cells (i.e. skipping the full calculations of stochastic heating for most of the cells). Usually the neural networks provide a good description of the emission over the full range of the radiation fields found in the training data. The fit may be valid (with increasing errors) also some way outside this range. Thus, if one is calculating a grid of models or especially if one is optimising a cloud model, the network does not have to be retrained on every step.

When one is using neural networks, the ini file must specify both a set of the reference frequencies (probing the absorbed energy) and the frequencies for which the emission is to be predicted. The corresponding keywords are nnabs and nnemit. In the training run, the ini file must include the keyword nnmake to instruct SOC to train a neural network. Otherwise the calculations (RT simulation, solving of dust emission) still proceed with the original, probably much larger number of frequencies that is needed to resolve both absorption and emission as functions of wavelength. Optionally, one can limit the full stochastic heating calculation to a subset of cells, thus also providing a smaller training set for the neural network optimisation.

Below is an example of a possible use of these keyword.

nnabs  0.3 0.6 1.0 2.0 5.0 # absorptions [um] 
nnemit 3.4 4.6 25.0 100.0 250.0 500.0 # emission [um] 
nnmake test nnthin 11 

The radiation field is described through absorptions at five wavelengths between 0.3 µm and 5.0 µm. The wavelengths should be selected so that they give a good idea of the changes in the radiation field. Short wavelengths are sensitive to variations at the model surface and close to the radiation sources, longer wavelengths are better in describing fields at high optical depths. In this example, the trained network will predict emission at six wavelengths from 3.4 µm to 500 µm. In the case of neural networks, one should use a smaller set of wavelengths, and there is currently no option to replace nnemit with the full list of wavelengths. A separate network will be trained for each dust component and saved to disk. The string argument of the nname keyword can be used to separate networks trained for the the same dust components but for different conditions. The argument of the keyword nnthin means that one will use only very 11:th cell for the training. The default value is 1 (using every cell). If the value of nnthin is very large, the training set may fail to sample extreme conditions in the model, thus leading to less precise predictions for such regions.

When one is using an already trained network in further calculations, the keyword nnmake is replaced by nnsolve, followed by the string argument used with the keyword nnmake above.

nnabs   0.3  0.6  1.0  2.0  5.0              # absorptions [um]
nnemit  3.4  4.6  25.0  100.0  250.0  500.0  # emission [um]
nnsolve test
mapum   3.4  4.6  25.0  100.0  250.0  500.0  # maps [um]
FITS    1

The keywords nnabs and nnemit must also be included, with the same list of wavelengths as when the library was created. We include in the above example also the mapum and FITS keywords, so that the run would produce FITS files for the emission, at all the same wavelengths as specified with nnemit.

During the run with nnsolve, only the wavelengths given by nnabs should be included in the radiative transfer calculations. At the moment this means that the dust descriptions (keywords optical and dsc) and the description of the constant radiation sources (keywords background and pointsource) must be changed, so that the associated files include only that set of wavelengths. Instead of calling ASOC.py directly, it is therefore advisable to use the ASOC_driver.py script, which makes these changes automatically. The user only has to change between the nnmake and nnsolve keywords, as networks are either being trained or used to predict emission. When the neural network is trained, the program will also save some diagnostic figures nnfit_<dust>.png that plot the neural network predictions against the emission in the training set, as well as the relative rms error of those predictions (up to nine output wavelengths only).

When one uses ASOC_driver.py, either without neural networks or with nnmake, a single run does the radiative transfer simulations, solves the emission (with the potential training of the neural networks), and saves spectra (if so instructed by the ini file). The one exception is the combination of nnmake and nnthin # with argument greater than one. Because the emission is in this case not calculated for all cells, no maps will be written. In this case, the maps need to be created by a separate run, possibly then using the created neural networks.