SOC is used for the modelling of scattering and thermal emission of interstellar dust. The name was originally an acronym for Scattering with OpenCL, which refers to the fact that the program is implemented using OpenCL .
The program was used in the code comparison published in Gordon et al. 2017 . However, since then the main program has been translated from C++ to Python and some additional features have been added. The initial Python version was described in Juvela 2019 . Because the main computations are carried out in OpenCL kernels that are compiled to machine code, the use of Python in the main program did not result in any significant degradation of the run times. On the contrary, the main program is now much shorter and allows easier modifications and tests of alternative implementation details.
This page describes the use of the Python version of SOC. Our goal has been to keep SOC simple so that it only concentrates on the actual radiative transfer calculations. SOC needs only minimal information of the dust models, which is given through input files. SOC can calculate dust temperatures for grains that are in equilibrium with the radiation field but the emission of stochastically heated grains is handled outside SOC (for example, using DustEM ). The script SOC.py is used for calculations of dust temperatures and emission and there is a separate script SOCS.py for calculations of light scattering (using the forced first scattering and peel-off methods). A development version of the program (called ASOC.py and ASOCS.py) can be found at GitHub . The GitHub directory contains an example how to make a 3D model, run SOC, and plot the calculated surface brightness and dust temperatures.
The parameter file (ini-file) is what is given to SOC on the command line and from where SOC extracts all information about other input files and the parameters of the run. It is a text file where each line starts with a keyword that is possibly followed by one or more parameters of that keyword. Lines that do not start with a known keyword are discarded and one can also use the “#” character to indicate a comment. The following list contains most of the common keywords with their arguments (before colon) with a short explanation.
In the descriptions we refer to the total number of cells in the model as CELLS and the number of simulated frequencies as NFREQ. When talking about data arrays, for example array [CELLS, NFREQ] corresponds to a storage order where the index over frequencies runs faster.
absorbed filename
gives the name for the file that will contain
the number of absorbed photons in each cell and frequencyali
turns on accelerated lambda iterations (only for dust
emission, in calculations with multiple iterations, not for
stochastically heated grains)background filename [k]
specifies the isotropic background where
the intensities are read from file filename, optionally scaled by a
factor of kbgpackets #
specifies the number of photon packages to be
simulated (per iteration and per frequency) from the background (see
background and hpbg)cellpackets #
specifies the number of photon packages to be
simulated (per iteration and per frequency) from dust within the model
volumeCLE
tells SOC to calculate dust emission on device instead of in
the host Python code (much faster; only for equilibrium dust)CLT
tells SOC to calculate dust temperatures on device instead of
in the host Python code (much faster; only for equilibrium dust)cload filename
tells SOC to read absorptions from the specified
file that has been written in a previous SOC run using the csave
option. In this case the simulation of “constant” radiation sources
(everything apart from the dust medium itself) will be omitted.
Initial run with csave can be combined with subsequent SOC runs with
cload to avoid the unnecessary overhead of resimulating the
absorptions caused by those constant radiation sources.cloud filename
gives the name of the cloud file (containing
densities and information of the grid hierarchy)colden filename
asks for column density maps (corresponding to
the actual maps being calculated) to be saved to file named based on
filename – the keyword colden has been replaced with savetaucsave filename
tells SOC to save absorptions to the given file;
the file will contain the absorptions that result from all radiation
sources except for the dust medium itself; the file can be read in
subsequent SOC runs by using the cload keywordDEFS
to ease experimentation with different kernels, this can be
used to add definitions that are passed to the kernel at compile time;
this is a string that should follow the format “-D definition” (or any
strings understood by the compiler)density #
specifies additional scaling for the density values
that are read from the cloud file (default is 1.0, no rescaling)device string
specifies the device (in some versions devices) to
be used; c stands for CPU (the default) and g for GPU. The string could
thus be ‘c’ or ‘g’, or ‘cg’ (to allow either CPU or GPU to be chosen).
One can also give a second optional parameter, which chooses a particular
device based on string in its name. For example “device c Intel” or
“device g GTX”. The alternative way to choose a particular device
is to use the platform keyword.diffpack #
gives the number of photon packages (per frequency)
that are used to simulate emission from the diffuse emission sources
specified with diffuse; this is separate from the dust emission from
the mediumdiffuse filename [#]
specifies a file with additional sources of
diffuse radiation (photons/cm3/Hz); this is a plain binary file
containing CELLS and NFREQ (two int32 numbers) followed by CELLS
float32 numbers for the actual emission; see keyword diffpack; note:
this diffuse emission is completely separate from the emission that is
calculated for the actual dust mediumdirections theta phi
specifies the direction towards the observer
that is used when calculating the the output maps; theta is the angle
from positive Z axis and phi the rotation around the +Z axis, the
angle increasing from +X towards +Y. For direction 90 0, the observer is in the
direction of the positive X axis and the maps are oriented so that
cloud Y axis is right and the Z axis up. For direction 0 0, observer
is in the direction of positive Z axis and in the maps Y is right
and X is down. For direction 90 90, the observer is in the
direction of the positive Y axis and in the maps the cloud X axis
points left and the cloud Z axis up.dsc filename
specifies the name for the scattering function file;
there should be either a single line with dsc or as many instances as
there are dust componentsemitted filename
specifies the name for a file where the
emissions (number of emitted photons per cell and frequency) are storedemweight #
for numbers #>0, generates photon packages from the
dust medium according to a weighting scheme (currently options
enweight 1 and emweight 2 exist)EXEC
executes the remainder of the line in a shell at the time
when the ini-file is read. Because of obvious security concerns, this
is by default commented out in the code (see the file pySOCAux.py)fits [ra de]
save spectra to FITS file instead of a plain
binary file. Ra and de are optional image centre coordinates [degrees]
used in the file header.ffs #
turns on (#=1) or off (#=0) the method of forced first
scattering. This applies only to the SOCS.py script where the forced
first scattering method is by default on.gridlength #
the size of a root grid cell in parsecshpbg filename # #
specifies a Healpix file that is used to
describe the surface brightness of the background sky (instead of or
in addition to (check!) an isotropic background); the first optional
numerical parameter is a scaling factor applied to the intensity
values in the file (default=1) and the second optional parameter, if
>0, tells SOC to use weighted sampling when generating photon packages
from the backgrounditerations #
number of iterations (when dust emission is
simulated and the self-coupling between this emission and dust
temperature is significant)loadtemp filename
in case of equilibrium dust (and only a single
dust population), recalculates emission based on dust temperatures in
the given file; this is used mainly to save disk space, storing CELLS
temperature values instead of directly the CELLS×NFREQ emission valueslocal #
overrides the default size of OpenCL work groups; the
defaults are 8 for CPU and 32 for GPUmapping nx ny step
specifies the number of pixels and the pixel
size of the output maps; nx and ny are the map dimensions and step the
pixel size in units of the root grid cells (=in units of gridlength)mapum # # ...
list wavelengths [µm] for which (and only for which)
maps should be written; this is used together with the FITS keyword, to
produce separate FITS files for the surface brightness at the listed
wavelengthsmirror xXyYzZ
changes simulation so that photon packages
that would exit the model volume are reflected back on selected
boundaries; in the argument string (1-6 characters long), lower-case
letters refer to lower boundaries and upper-case letters to upper
boundaries along the corresponding coordinate axesmmapabs I
selects the absorption data to be handled using
a normal file (I=0) or as a memory mapped file (I=1, default).
Option I=1 requires less memory (one frequency at a time) but
option I=0 is significantly faster, if data for all frequencies
can be kept in the main memory during the run.mmapemit I
selects the emission data to be handled using
a normal file (I=0) or as a memory mapped file (I=1, default).
Option I=1 requires less memory but option I=0 is faster, if data
for all frequencies can be kept in the main memory.nnabs #
for runs involving neural networks, #
is the list
of wavelengths at which the intensities are used as the inputs for the
networks.nnemit #
for runs involving neural networks, #
is the list
of wavelengths for which the networks provides intensity predictionsnnmake string
instructs SOC to create a mapping from absorptions
(at wavelengths specified by nnabs
) to emission
(at wavelengths specified by nnemit
). The trained networks
are saved to files consisting of the string provided by the keyword
nnemit
and the name of the dust component.nnsolve string
instructs SOC to use neural networks to
predict the emission (at wavelengths specified by nnemit
), based
on RT simulations at wavelengths specified by nnabs
.nnthin #
instructs SOC to solve the emission only for every #:th cell,
to be used in the training of neural networkds (with keyword nnmake
)noabsorbed
in dust emission calculations with
equilibrium-temperature dust (no stochastically heated grains), the
integral of absorbed energy in each cell can be calculated on the fly,
without saving the per-frequency values to a file; by using this
keyword, one can in that case skip the writing of the large
(CELLS×NFREQ×4 bytes) file of absorptionsnomap
skip the writing of intensity and polarisation mapsnosolve
skip the solving of dust temperatures and dust
emission inside SOC (even when calculating a single dust species with
grains in equilibrium temperature; used for example when SOC should
write spectrum files based on emission that was already calculated in
a previous run or with an external program)optical filename [abundance-file]
this specifies the name of the
file containing the dust optical parameters (see Dust Files below);
the second optional parameters gives the name of a binary file of
[CELLS] float32 numbers that specifies the relative abundance of the
dust in every cell (the default would be 1.0 for every cell)optishalf
when calculations include multiple dust components with
variable abundances, we need to pass to the kernel large arrays that
contain absorption and scattering cross section separately for each
cell – with this keyword these are passed as half-precision floating
point numbers (16 bits) allowing larger models to be calculated
(especially on GPUs) but with some loss of accuracy (saving 4×CELLS
bytes of device memory)platform #
select a specific OpenCL platform; by default SOC uses
the first available OpenCL platform, based on the CPU vs. GPU
selection made with the keyword device. One can also give an
optional second integer argument, which chooses a particular device
from the given platform. The default is “platform 0 0”, while
“platform 1 2” would choose the second platform and its third
device (e.g. if one had multiple GPUs)p0 #
if keyword polred is not used, specifies the intrinsic
polarisation fraction (= polarisation fraction for uniform
plane-of-the-sky magnetic field); note that the keyword is letter “p”
followed by zero (0, not o)pointsource x y z filename [scale]
adds a point-like radiation
source to the model; the position is given by (x,y,z) in units of the
root grid, filename refers to the luminosity file (see section
Radiation Field below), and scale is an optional scaling (a floating
point number) for the source luminosity; there can be many lines with
this keywordpolmap Bx By Bz
asks SOC to write out (I, Q, U) maps using the
three data files given as the arguments; these are hierarchical files
with the same structure as the density file and contain the components
of the magnetic field vectors for each cellpolred filename
file of polarisation reduction factors R for each
cell; the format is a plain binary file with the number of cells
(int32) followed by CELLS floating point numbers (float32); the input
file format could be changed in the code, in pySOCR.py (look for
USER.file_polred) also to a hierarchical file with the same format as
the density filepsmethod #
alternative schemes for the simulation of point
sources when they are located outside the model volume; the default is
to simulate photon packages isotropically; method 1 sends packages
only in the 2 pi solid angle that contains the cloud (a small
improvement in SNR); method 2 sends photon packages only towards
visible cloud surface elements but the weighting between visible cloud
sides is not quite optimal; method 5 sends photon packages towards a
cone that is aligned with one of the coordinate axes and contains the
cloud (less optimal if the point source position is far from the cloud
centre when both are projected onto the plane perpendicular to the
centre direction of the cone); currently psmethod 2 is still the
recommended one, especially if the point sources is far outside the
model volumepspackets #
specifies the number of photon packages (per
frequency and iteration) that are used to simulate emission from point
sourcesprefix <string>
specifies the prefix attached to the names of
some output filesreference
turns on the use of a reference field (only for dust
emission with multiple iterations, in practice not for stochastically
heated grains)remit min_um max_um
limits the produced emission file (see
keyword emitted) to contain data only for the given wavelength range,
specified with lower and upper limits in micrometers; used mainly to
reduce the size of the output files when surface brightness data are
needed only for a limited set of wavelengths (note: in case of
stochastically heated grains, the helper script A2E_MABU.py will also
read this option)saveint #
saves intensity values for all cells; if parameter # is
equal to 1, SOC saves to ISRF.DAT intensities estimated from photons
entering a cell (two int32 numbers CELLS, NFREQ followed by float32
array [CELLS, NFREQ]); with value 2 it saves to the file also the net
vector sum using additional three vector components (file begins with
three int32 numbers CELLS, NFREQ, 4 that are followed by the float32
array [CELLS, NFREQ, 4]); with parameter value 3 SOC saves intensities
that are calculated based on the absorptions within the cell
(effective value for the intensity over the cell volume, not the
average intensity on the cell boundary), the file starts with two
int32 numbers (CELLS, NFREQ) followed by the float32 array [NFREQ,
CELLS] of intensity values in cgs unitssavetau filename float
save map of optical depth to the named
file for the wavelength specified in units of µm. If the wavelength is
omitted, a map of the column density is saved instead. The map (pixel
size, number of pixels) is the same as for regular maps (see mapping).seed #
specifies a floating point seed value for the random
number generation; if the provided value is negative, the seed value
is taken from the clocksimum # #
specifies in units of µm the wavelength range used in
the simulations (can be used for example to skip long wavelengths that
do not contribute to heating); default range is 0.003µm-3msingleabu
in the special case of calculation having two dust
components for which the sum of relative abundances is one, this
advises SOC to store internally only an array for the abundances of
the first dust component, the second having implicitly abundances one
minus the abundance of the first dust (saving 4×CELLS bytes of host
memory)temperature filename
in case of calculations with a single
equilibrium temperature dust, specifies a file where the dust
temperatures of all cells are stored (the hierarchical file format is
the same as for the cloud density file)wavelengths # #
specifies on optional smaller wavelength range
for the output spectrum files (resulting in smaller files, if spectra
are needed only for a smaller wavelength range); the arguments are the
minimum and maximum wavelength in units of µmWe describe here the format for the density file that uses a
hierarchical grid. This is a binary file where all values are either
4 byte integers or 4 byte floats. The file begins with five
integers: [Nx, Ny, Nz, levels, cells]
. Here Nx-Nz are the
dimensions of the cartesian root grid, levels is the number of
hierarchy levels and cells the total number of cells in the model.
This is followed, for each level in the grid hierarchy, by the
number of cells on the current level and the vector of density
values for the cells on that level. The file thus in a sense
consists of separate density vectors, one for each hierarchy level
of the disretisation. We follow the C-order where the index of
the X-axis is running fastest.
If the whole hierarchy only contains the root grid, the file thus
starts with the numbers [Nx, Ny, Nz, 1, NX×Ny×Nz, Nx×Ny×Nz]
,
where 1 tells that there is only one level. In this case the number
Nx×Ny×Nz appears twice, first as the total number of cells, then as
the number of cells on the first level. These six integers are
followed by Nx×Ny×Nz floating point numbers. Note that if you write
the density cube from a 3D python array, the last index is running
fastest so that for SOC the Python array element n[k,j,i] would
correspond to coordinates x=i, y=j, and z=k.
If the cloud contains multiple hierarchy levels, the density value of
a refined cell should be replaced with a link to its first subcell
that is part of the density vector for the next hierarchy level. If
the first subcell is the k
:th cell in the vector containing all
cells of the next level, then the density value of the parent cell is
replaced by the bit representation of the integer number –k
. The
fact that it is negative tells SOC that it cannot be a density value
and it is thus interpreted this way as a way to find the child cells.
Note that in bit representation the value is still integer, not a
floating point value like the rest of the elements in the density
vector. Thus, when working with Python, one cannot simply take a
negative value m
from the density vector of floats, multiply it
by -1 and assume that int(-m)
is the index of the child cell.
You have to convert it to integer bit representation using Python
ctypes library, something like
ctypes.c_int.from_buffer(ctypes.c_float(m)).value
. The converse
transformation from integer to floating point number would be
ctypes.c_float.from_buffer(ctypes.c_int(i)).value
.
The density values are typically density of hydrogen atoms per cm³ or density of hydrogen molecules per cm³. The correct unit is determined by the dust model, relative to which density the dust opacities are defined.
SOC concentrates on the actual radiative transfer (RT) simulations and was written to be dust model agnostic. Many things that are later needed to compute dust temperatures are not needed during the RT phase, even when model includes several possibly stochastically heated dust populations. Therefore the input file for dust properties can be quite simple and is made only slightly more complex by being backwards compatible with the format used for the CRT program.
We refer to this file format as the “simple” format. Here is an example of the beginning of the file
that in the SOC ini file is specified using the optical
keyword.
eqdust
1.00000e-07 # number density of grains
1.00000e-04 # grain size [cm]
184 # number of frequencies
# freq g Qabs Qsca
2.99792e+09 0.00000 6.22363e-15 2.97781e-27
3.67645e+09 0.00000 7.83909e-15 6.73353e-27
4.50854e+09 0.00000 9.87451e-15 1.51629e-26
5.52896e+09 0.00000 1.24392e-14 3.40209e-26
6.78033e+09 0.00000 1.56729e-14 7.67112e-26
8.31493e+09 0.00000 1.97494e-14 1.73108e-25
...
The first line is always eqdust
(format assuming that grains are at some equilibrium temperature,
ignoring parameters needed to solve temperature distributions of stochastically heated grains). The
second line contains the ratio x between the number of grains and the number of particles, whose number
density is given in the cloud file. The third row is the radius a of the grains [cm] and the forth row
the number of frequencies in the main table. The main table below the four-line header has four
columns: frequency, asymmetry parameter g of the scattering function, absorption efficiency Qabs
and scattering efficiency Qsca. The actual cross section for absorption per unit density is thus
pi×a²×Qabs×x and similarly the cross section for scattering per unit density is pi×a²×Qsca×x and only
the values of these products are used. In other words, the fact that a and x are listed in the header
is superfluous because those could as well be set to a value of one and the true cross sections per
unit density given directly in the third and fourth column of the main table. Also, for real dust
models the cross sections are integrals over size distributions so the dust does not have a single
value for a. For historical reasons, the format is as described above. However, in future we may
allow input files that contain just the main table of four columns – or just three columns because also
the g parameter is superfluous when the scattering function is in any case specified separately (see
below).
In the ini-file keyword dsc
is used to specify the file that contains data on the discrete
scattering function. The file contains two data tables. The first one is discretised linearly as a
function of the cosine of the scattering angle theta (corresponding to linspace(-1.0,+1.0, bins)
where bins is the number of bins). For each value of cos(theta), one lists the scattering probability
per unit solid angle (not per unit step in theta or in cosine of theta!). This is repeated for every
frequency used in the simulation. The second half of the file includes the cumulative probability
functions of cos(theta) that provide a mapping from values [0,1] to [-1,+1]. SOC uses this directly to
generate random scattering directions. This is also simply a vector of bins elements that implicitly
correspond to linear discretisation of the input range [0,1] and that is repeated for every frequency
included in the simulation. All in all, the scattering function file consists of bins×NFREQ+bins×NFREQ
floats of 4 bytes each.
When runs include several dust species with varying abundances, SOC must have information of the dust
opacities and scattering properties separately for each dust component. In that case the ini-file
should just have one line with optical
and one line with dsc
keyword per each dust
component (of course also specifying dust-specific abundances). If one has many dust components but
their relative abundances do not vary over the model volume, it is better (=faster) to combine them
into a single dust species with a single scattering function.
Although SOC itself has not knowledge of stochastic heating, one can use it for calculations of stochatically heated grains (see the section below ). In that connection other programs use a more complete file format for the dust populations, including the information of the grain size distributions and grain enthalpies. For historical reasons, this is called the “gs” format.
Here is an example of a “gs” file for astrosilicates
gsetdust
prefix aSilx
nstoch -1
optical gs_aSilx.opt
enthalpies gs_aSilx.ent
sizes gs_aSilx.size
The file starts with the line gsetdust
, stating that the file confirms to the “gs” format.
Otherwise the main content consists of references to files for optical properties,
enthalpies and sizes.
The file of optical properties (here gs_aSilx.opt
) begins with
50 800 # NSIZE, NFREQ
3.00000e-04 # SIZE [um]
# FREQ Qabs Qsca g .... 800 lines ....
2.99792e+09 1.91850e-12 2.51210e-31 0.00000e+00
3.18935e+09 2.18650e-12 3.21210e-31 0.00000e+00
3.39301e+09 2.49200e-12 4.10730e-31 0.00000e+00
3.60965e+09 2.84020e-12 5.25190e-31 0.00000e+00
3.84015e+09 3.23700e-12 6.71550e-31 0.00000e+00
The file thus contains a table (freq, Qabs, Qsca, g) for each size (here 50) and
a frequency (here 800 frequencies). Each section begins with the grain size given in micrometers.
The enthalpy file (here gs_aSilx.ent
) contains
50 # NSIZE
3.00000e-04 # first size [um]
3.71040e-04 # second size [um]
4.58910e-04 # ...
...
8.08530e+00
1.00000e+01 # 50th size [um]
30 # NTEMP
1.00000e-01 # first temperatures [K]
1.45221e-01
2.10902e-01
...
3.44271e+03
5.00035e+03 # 30th temperature [K]
5.46893e-23 1.44984e-22 4.41055e-22 ...
1.03467e-22 2.74296e-22 8.34432e-22 ...
2.24147e-19 6.86570e-19 1.95759e-22 ...
...
After stating the size and temperature grids, the final array in the file (one row per size, one column per temperature) contains the actual enthalpy values (per grain, in cgs units).
Finally, the size distribution is contained in a file like gs_aSilx.size
:
3.28016e-10 # GRAIN_DENSITY
25 128 # NSIZE NE
# SIZE [um] S_FRAC Tmin [K] Tmax [K]
4.00000e-03 4.62849e-01 4.000e+00 2.500e+03
5.18223e-03 2.48623e-01 4.000e+00 2.223e+03
6.71389e-03 1.33550e-01 4.000e+00 1.978e+03
8.69824e-03 7.17378e-02 4.000e+00 1.759e+03
1.12691e-02 3.85346e-02 4.000e+00 1.564e+03
1.45997e-02 2.06992e-02 4.000e+00 1.391e+03
1.89148e-02 1.11188e-02 4.000e+00 1.237e+03
2.45053e-02 5.97255e-03 4.000e+00 1.100e+03
3.17480e-02 3.20821e-03 4.000e+00 9.787e+02
4.11314e-02 1.72332e-03 4.000e+00 8.705e+02
5.32882e-02 9.25698e-04 4.000e+00 7.742e+02
6.90380e-02 4.97247e-04 4.000e+00 6.885e+02
8.94427e-02 2.67101e-04 4.000e+00 6.124e+02
1.15878e-01 1.43476e-04 4.000e+00 5.446e+02
1.50127e-01 7.70693e-05 4.000e+00 4.844e+02
1.94498e-01 4.13985e-05 4.000e+00 4.308e+02
2.51984e-01 2.07848e-05 4.000e+00 3.832e+02
3.26460e-01 8.00860e-06 4.000e+00 3.408e+02
4.22949e-01 1.85190e-06 4.000e+00 3.031e+02
5.47955e-01 1.67059e-07 4.000e+00 2.696e+02
7.09907e-01 2.78308e-09 4.000e+00 2.397e+02
9.19727e-01 2.36276e-12 4.000e+00 2.132e+02
1.19156e+00 1.12940e-17 4.000e+00 1.896e+02
1.54374e+00 7.13642e-27 4.000e+00 1.687e+02
2.00000e+00 1.02343e-42 4.000e+00 1.500e+02
The main information is the fraction (S_FRAC) of grains per size bin. The table also includes hints for the minimum and maximum temperature that should be included in the calculations of size distributions. However, the actual methods may ignore these prescriptions.
DustEM is a modelling tool that includes descriptions for a number of dust model. One can use these dust models in SOC but this requires a conversion to the “simple” or “gs” formats described above. However, the GitHub SOC directory contains a file DustLib.py with routines that assist in this conversion.
import os, sys
sys.path.append('/home/mika/starformation/SOC/')
import DustLib
from DustLib import *
# Specify the DustEM directory
DustLib.DUSTEM_DIR = '/home/mika/tt/dustem4.2_web'
# Specify the frequency grid = frequencies used in SOC simulations
FREQ = loadtxt('freq.dat')
# DUSTEM_FILE = 'GRAIN_C11_aSilx.DAT' # aSilx component only
DUSTEM_FILE = 'GRAIN_DL07.DAT' # this contains five dust components
# DustEM may use same dust name multiple times, with different size distributions.
# Routine write_DUSTEM_files assigns each dust component a unique name and writes
# files dustem_<dust name>.dust for use below.
NEWNAME = write_DUSTEM_files(DUSTEM_FILE) # returns unique dust names
# Write "simple" dust files for each dust component
DEDUST = []
for name in NEWNAME:
DUST = DustemDustO('dustem_%s.dust' % name, force_nsize=200)
DEDUST.append(DUST)
write_simple_dust([DUST,], FREQ, filename='%s_simple.dust' % name, dscfile='%s.dsc' % name)
# Write one "simple" dust file for the combination of all dust components
# One can use this in SOC, if the relative abundances of the dusts are constant.
write_simple_dust(DEDUST, FREQ, filename='combined_simple.dust', dscfile='combined.dsc')
# Write dust files of the "gs" type, one for each dust component
write_A2E_dustfiles(NEWNAME, DEDUST, NE_ARRAY=128*ones(len(DEDUST)))
In the above example, we export the dust model described in the DustEM file
GRAIN_DL07.DAT
, which includes five dust components. For each dust
component, the above script writes one file in the “simple” format, one file
in the “gs” format, and one file for the discretised scattering function.
Because the DustEM file has two entries named “Gra” (associated with
different grain size distributions), we have to rename one of those using a
unique name. Thus, the script produces the files listed below.
# "gs" dust files
gs_PAH0_DL07.dust
gs_PAH1_DL07.dust
gs_Gra.dust
gs_Gra_copy1.dust # renamed Gra -> Gra_copy1
gs_aSil.dust
# "simple" dust files
PAH0_DL07_simple.dust
PAH1_DL07_simple.dust
aSil_simple.dust
Gra_simple.dust
Gra_copy1_simple.dust
# scattering functions
PAH0_DL07.dsc
PAH1_DL07.dsc
aSil.dsc
Gra.dsc
Gra_copy1.dsc
# all dust components combined
combined_simple.dust
combined.dsc
One would use the above files in different connections:
combined_simple.dust
and combined.dsc
.optical
).In the case of stochastically heated grains, the combination of radiative
transfer and dust emission calculations can be automated with the
[ASOC_driver.py](#shg)
script. That would get as input an ini file with
references to “gs”-type dust files.
Radiation field can be produced by isotropic and non-isotropic background sources and point-like radiation sources that reside inside or outside the actual model volume.
Specification of the intensity of an isotropic background component
requires a binary file that lists the intensity in cgs units for each
frequency (in the order of increasing frequency) included in the
simulation. The frequency grid must be identical to the one specified
by the keyword optical
. It is thus a plain binary file with
NFREQ 4 byte floating point numbers.
Anisotropic background is specified using a Healpix-type binary file.
At the moment the resolution of the background map is hardcoded to
correspond to healpix maps with size NSIDE=64 (49152 pixels over the
sky, each pixel ~55 arcmin in size) and in the RING order. The file
simply lists the pixel values (4 byte floats) for 49152 pixels on the
first frequency, followed by similar vectors for the remaining
frequencies in the order of increasing frequency. The total size of
the file is thus always NFREQ×49152×4 bytes. For example, if the
simulation uses 100 frequencies (as specified by the optical
keyword), the healpix file for the background intensity is still only
18.8 MB in size. The background file can be prepared in Python for
example using the healpy library.
For point sources one of course specifies luminosity (in cgs units, erg/s/Hz) instead of intensity. The file is a plain binary file with NFREQ 4 byte floats, one number per simulated frequency, in the order of increasing frequency. The frequency grid must again be identical to the one specified by the optical keyword.
SOC.py
(and the development version ASOC.py
)saves the dust
emissio maps to binary files map_dir_#.bin
where # is a running
index of the observer direction, corresponding to the inifile
directions keyword. Each file begins with the map dimensions NX
and NY
in pixels (2×int32), followed by the data cube that in
Python corresponds to an array with dimensions (NFREQ, NY, NX)
.
Here NFREQ
is the number of frequencies that usually is the same
as specified by the keyword optical but may be limited by the keyword
wavelengths
.
If the output maps are calculated in the healpix format, the files
have names map_dir_0_H.bin
. The file begins with four int32
numbers: NSIDE, -1, NFREQ, LEVELS
. NSIDE
is the NSIDE
parameter of the Healpix maps, the second number is a dummy negative
value, NFREQ
is the number of frequencies and levels is the
number of levels in the hierarchical discretisation of the model
cloud. The rest of the file contains the allsky healpix maps that as a
Python array corresponds to an array (NFREQ, 12×nside×nside)
of
float32 numbers.
The scattered light images computed with SOCS.py
(and
ASOCS.py
) are stored in binary files called outcoming.dat
that also contain the different observer directions within the same
file. The file begins with the number of pixels in the y and x
directions and the number of frequencies (3 × int32). These are
followed by the list of frequencies (float32 values). The rest of the
file forms a data cube (float32 values) that in Python has dimensions
(nfreq, ndir, ny, nx)
: the number of frequencies, the number of
observer directions, and the dimensions of each scattered radiation
map.
In this section we look at basic SOC runs that involve a model with Cartesian spatial discretisation and a single dust population, calculations ignoring the stochastic heating of the dust grains. This would be appropriate mainly for the FIR and longer wavelengths >100 µm - with the caveat that the long-wavelength emission will be slightly overestimated because stochastic heating would move some of that radiated energy to shorter wavelengths.
The format of the cloud file was described above . The writing of the cloud file is shown more conceretely in the example below.
When there is only a single dust component and stochastic heating is not taken into account, the dust temperatures and dust emission can be calculated inside SOC. Thus, a single SOC run will do all three steps: simulate the radiation field, recomputed dust temperatures and emission, and write the results. Depending on the parameters in the [ini file]{#file_ini}, the output files may include both surface brightness maps and the 3D distribution of the dust equilibrium temperature.
To optimise the actual runs, it is possible split them to smaller steps. With the ini file keywords
nosolve
and nomap
one can first do radiation field simulation only. Even further, one can
separate the simulations of the dust reemission from the simulations of other (immutable) radiation
sources. For example, one could first simulate radiation from the diffuse background and point sources,
and save the result with the csave
keyword. If one needs to iterate the calculations to find the
dust equilibrium temperatures (and one does not know beforehand how many iterations might be needed),
one can then remove the nosolve
keyword, add the cload
keyword to read in the precomputed
part of the radiation field. On all subsequent iterations, one then needs to simulate only the photons
reemitted by the dust. After all, changes in the dust temperature do not affect the energy input from
the other radiation sources (as long as the dust optical properties do not depend on temperature). Once
the dust temperatures have converged, a separate SOC run can be done to write the spectra (a run with
zero iterations, after removing the nomap
keyword). When there is only a single,
equilibrium-temperature dust component (i.e. no stochastic heating), one can safely use they keyword
noabsorbed
to save some disk space.
Below is complete example of a basic model run.
In the following example we create a 3D model for a cloud, calculate the dust emission with SOC, and plot a surface brightness map and a cross section of the dust temperature in the model. We are here ignoring the stochastic heating so that each cell does have a well-defined single temperature value. The files used in this example can be found in the GitHub SOC directory , in the file soc_example.zip .
SOC uses models with octree discretisation. However, here we use only the root grid and create a regular Cartesian grid of 64³ cells, all with the same density.
from numpy import *
N = 64 # model with N^3 cells
fp = open('tmp.cloud', 'w')
asarray([N, N, N, 1, N*N*N], int32).tofile(fp) # NX, NY, NZ, LEVELS, CELLS
asarray([N*N*N,], int32).tofile(fp) # cells on the first (and only) level
n = ones((N,N,N), float32) # density cube
n.tofile(fp)
fp.close()
We define the run-time parameters with a text file tmp.ini. As described above, this consists of keywords, followed by one or more parameters. The commented file looks like this:
gridlength 0.01 # root grid cells have a size of 0.01 pc each
cloud tmp.cloud # density field (reference to a file)
mapping 64 64 1.0 # output 64x64 pixels, pixel size == root-grid cell size
density 1.0e4 # scale values read from tmp.cloud
seed -1.0 # random seed for random numbers
directions 0.0 0.0 # observer in direction (theta,phi)
optical tmp.dust # dust optical parameters
dsc tmp.dsc 2500 # dust scattering function
bgpackets 999999 # photon packages simulated from the background
background bg_intensity.bin # background intensity at the
iterations 1 # one iteration is enough
prefix tmp # prefix for output files
absorbed absorbed.data # save absorptions to a file
emitted emitted.data # save dust emission to a file
noabsorbed # actually, we integrate absorptions on the fly and skip
the *file*
temperature tmp.T # save dust temperatures
device g # run calculations on a GPU
CLT # temperature calculations done on the device
CLE # emission calculations done on the device
The format is free, the keyword must just be the first string on a line and the arguments are separated by at least by one space (or tab). One can add comments after the arguments or also as separate lines starting with ‘#’.
Above the keyword density scales the raw density values read from the cloud file with a constant. With the keyword mapping, we define that the number of pixels in the putput maps will be the same as the projected number of cells in the model, 64×64, and the pixel size is equal to the model cell size. The keyword direction put the observer in the direction of the positive z axis. We ask the calculations to be performed on a gpu; if the device line is commented out, the default is to use the CPUs.
Before we can run SOC, we must prepare three files in addition the tmp.cloud that contains the density distribution. These are tmp.dust (listing dust absorption and scattering cross sections for the selected frequencies), tmp.dsc (containing discretised scattering functions), and here also bg_intensity.bin. The file bg_intensity.bin contains the intensities of the external radiation field for the frequencies listed in tmp.dust. The files were described above and examples of these files are included in soc_example.zip, in the SOC directory in our GitHub page. .
The actual radiative transfer calculations are run from the command line:
ASOC.py my.ini
Note that the latest version of the program is called ASOC.py, not SOC.py. The run should take only some seconds. It will produce a few files, including the file map_dir_00.bin for the dust-emitted intensities and the file tmp.T for the dust temperatures.
The following script shows how the data in the files can be read and plotted. For convenience, we assume that we have a separate text file freq.dat that simply lists the frequencies that are included in the calculations - that is thus identical to the first column in tmp.dust, after skipping the first four lines.
from matplotlib.pylab import *
freq = loadtxt('freq.dat') # read the frequencies
nfreq = len(freq) # ... number of frequencies
figure(1, figsize=(8,2.8))
fp = open('map_dir_00.bin', 'rb') # open the surface-brightness file
dims = fromfile(fp, int32, 2) # read map dimensions in pixels
S = fromfile(fp, float32).reshape(nfreq, dims[0], dims[1]) # read intensities
S *= 1.0e-5 # convert surface brightness from Jy/sr to MJy/sr
fp.close()
ifreq = argmin(abs(freq-2.9979e8/250.0e-6)) # Choose frequency closest to 250um
ax = subplot(121)
imshow(S[ifreq,:,:]) # plot the 250um surface brightness map
title("Surface brightness")
colorbar()
text(1.34, 0.5, r'$S_{\nu} \/ \/ \rm (MJy \/ sr^{-1})$', transform=ax.transAxes,
va='center', rotation=90)
fp = open('tmp.T', 'rb') # open the file with dust temperatures
NX, NY, NZ, LEVELS, CELLS, CELLS_LEVEL_1 = fromfile(fp, int32, 6)
T = fromfile(fp, float32).reshape(NZ, NY, NX) # T is simply a 64x64x64 cell cube
ax = subplot(122)
imshow(T[NZ//2, :, :]) # plot cross section through the model centre
title("Dust temperature")
colorbar()
text(1.34, 0.5, r'$T_{\rm dust} \/ \/ \rm (K)$', transform=ax.transAxes, va='center', rotation=90)
show()
The code produces the following figure with the 250µm surface brightness map and a cut through the dust temperature cube. These are not very exciting because the density was constant everywhere. However, the plot does show the expected increase of surface brightness towards the borders and the drop in temperature towards the model centre.
In this section we discuss more complex use cases for the SOC program. Compared to the above basic case, these may include:
The use of hierarchical discretisation in principle only means that one adds more hierarchy levels in
the cloud file
. Otherwise the main consideration is the number of photons packages in
the simulation. This affects especially the simulation of the background radiation entering the model
volume. Each additional level of refinement means that the smallest cells have a factor of four smaller
projected area. If the number of background photon packages (keyword bgpackets
) is kept constant,
the number of rays hitting those smallest cells goes by a factor of four and the noise in the dust
temperature estimates in principle doubles. With deeper hierarchies, one will also need many more
photon packages. One can also use the keyword split
, which means that simulated photon packages
are split into four each time the refinement level increases along the photon path. This does not
prevent the noise from increasing at higher refiment levels but the increase is hopefully only linear
instead of exponential. However, the behaviour will vary from model to model. In particular, if the
refined regions are optically thick, most photon packages will scatter away from those regions and the
temperature uncertainties will be high. Note that the photon splitting can create a very large number
of rays, making the calculations slow relative to the actual value of the bgphot
parameter.
Another affected parameter is the number of pixels in the surface brightness maps. Even if the maps are
finally needed at lower resolution, there is little use to make a model with many hierarchy levels
unless the emitted radiation is sampled at a corresponding resolution. However, while the cloud model
uses octree discretisation to reduce the amount of memory used, the output files use a fixed pixel
size. When the map pixel size is set to match the smallest model cell size, the map files can become
large. For example, for a model with a root grid of 512×512×512 cells and six levels of refinement, a
single map would 4 GB - potentially more than the whole 3D model cloud file itself. It is advisable to
limit the number of frequencies in the map files to the absolute minimum using keywords
wavelengths
or libmap
.
When there are multiple dust populations with spatially varying abundances, dust emission is no longer
computed inside the SOC program itself. SOC is still used to (1) simulate the radiation field and (2)
to write the dust emission spectra. However, between these two steps one needs to run another script
A2E_MABU.py
, which reads the radiation field information written by SOC, splits the number of
absorptions between the dust species, solves the temperatures and emission for each dust species
separately, and finally adds the results into a single file describing the total dust emission (in
each cell and at each frequency). The steps are thus:
nosolve
and nomap
in the ini file, one runs
ASOC.py my.ini. SOC will make the radiation field simulation and save
information about the absorbed energy into the file specified by the keyword absorbed
(e.g. absorbed.data)emitted.data
. Internally, A2E_MABU.py
will call the
script A2E.py
to solve the emission for each dust component separately.iterations 0
and removing nomap
. The
second SOC run ASOC.py my.ini will then produce the map files.The script ASOC_driver.py automates the above. Thus, instead of the normal ASOC.py my.ini one can execute ASOC_driver.py my.ini and this should work irrespective of the number of dust components.
Even if the calculations involved only a single dust component, if one solves the stochastic
heating of the grains, the calculation of dust temperatures and emission cannot be done inside
SOC. In fact, SOC only understands dust files of the simple type
, listing
absorption and scattering cross sections but no information about size distributions or
enthalpies. Like in the case of multiple dust components above, the solution is to delegate
the dust emission calculation to the A2E_MABU.py
script.
In the case of stochastic heating, we need actually two descriptions of the dust model. The “simple” file is used by SOC. The temperature and emission calculations use the [“gs” type]{#dust_gs}, which includes those additional descriptions of size distributions and enthalpies. These more complete dust files can be exported for example from DustEM using existing scripts.
The role of A2E_MABU.py
is to loop over different dust components. Inside the script,
the emission of a each single dust component is computed with the script A2E.py
. One
could run A2E.py
also directly, but that makes sense only if there is only one dust
component, i.e. the absorptions are no divided between different dust components. The call to
A2E.py
takes the form
A2E.py some_dust.solver absorbed.data emitted.data
In other words, the script reads and writes similar absorption and emission files as SOC. The first
parameter some_dust.solver
is a precomputed file (created automatically in the
ASOC_driver.py script) that speeds up the actual calculations of the dust temperature
distributions and dust emission (for this particular dust component). That file is prepared with yet
another script A2E_pre.py
.
A2E_pre.py some_dust.dust freq.dat some_dust.solver
The inputs include the dust file (of the “gs” type) and a file listing the same frequencies
that are going to be used in the radiation field simulations. The result of the A2E_pre.py
run is the file some_dust.solver
.
The full calculations can thus involve quite a few steps:
A2E.py
to solve the emission for each of them, and sum the results to get the total
emission (e.g. as the file emitted.data), taking into account the possibly spatially
varying abudances of each of the dust componentsFortunately, these are already included in the script ASOC_driver.py
. Thus, all one needs to do
is to run
ASOC_driver.py my.ini
ASOC_driver.py
will then carry out the
initial SOC simulation, call A2E_MABU.py
to compute the emission, and will finish with another
SOC run to write the spectrum files. Note than A2E_MABU.py
can solve emission for both
stochastically heated and equilibrium-temperature grains.
Neural networks can be used to approximate dust emission based on information of absorptions at a small number of wavelengths. Especially in the case of stochastically heated grains, this provides a faster (but approximate) way to calculate the emission, after simulating the radiation field only at a few wavelengths. The use of neural networks with SOC requires the installation of the Pytorch machine learning library.
The calculations with neural networks include two steps. During the training run, one solves the emission from stochastically heated grains in the normal way. This provides the inputs for a neural network that then learns the mapping from the local radiation field intensity at the set of reference frequencies to the resulting dust emission at another set of output frequencies. During subsequent runs, the RT calculations can then be carried using only the small set of reference frequencies, and neural network then converts the information about the absorbed energy directly into estimates of the dust emission. The RT step is much faster because of the smaller number of frequencies, and the use of the neural networks is also much faster than the full calculations of stochastic heating and emission.
The neural network fit is mainly valid for the range of conditions found in the training set. Neural networks are able to give a good representation of the emission vs. absorption mapping already with a small number of connections, which suggests that also the training sets do not have to be very large. Traning sets of tens of thousands of absorption-emission pairs might be enough, compared to the potentially millions of cells in the examined models. This suggests further possibilities for speed-up, as the training data can be extracted from a lower resolution model (cutting down the cost of both the RT simulation and the calculations of stochastic heating), or using a full-resolution model (with full RT simulation) but only a representative subset of cells (i.e. skipping the full calculations of stochastic heating for most of the cells). Usually the neural networks provide a good description of the emission over the full range of the radiation fields found in the training data. The fit may be valid (with increasing errors) also some way outside this range. Thus, if one is calculating a grid of models or especially if one is optimising a cloud model, the network does not have to be retrained on every step.
When one is using neural networks, the ini file must specify both a set of the reference
frequencies (probing the absorbed energy) and the frequencies for which the emission is to
be predicted. The corresponding keywords are nnabs
and nnemit
. In the training
run, the ini file must include the keyword nnmake
to instruct SOC to train a neural
network. Otherwise the calculations (RT simulation, solving of dust emission) still proceed
with the original, probably much larger number of frequencies that is needed to resolve
both absorption and emission as functions of wavelength. Optionally, one can limit the
full stochastic heating calculation to a subset of cells, thus also providing a smaller
training set for the neural network optimisation.
Below is an example of a possible use of these keyword.
nnabs 0.3 0.6 1.0 2.0 5.0 # absorptions [um]
nnemit 3.4 4.6 25.0 100.0 250.0 500.0 # emission [um]
nnmake test nnthin 11
The radiation field is described through absorptions at five wavelengths between 0.3 µm and
5.0 µm. The wavelengths should be selected so that they give a good idea of the changes in
the radiation field. Short wavelengths are sensitive to variations at the model surface and
close to the radiation sources, longer wavelengths are better in describing fields at high
optical depths. In this example, the trained network will predict emission at six
wavelengths from 3.4 µm to 500 µm. In the case of neural networks, one should use a smaller
set of wavelengths, and there is currently no option to replace nnemit
with the full
list of wavelengths. A separate network will be trained for each dust component and saved
to disk. The string argument of the nname
keyword can be used to separate networks
trained for the the same dust components but for different conditions. The argument of the
keyword nnthin
means that one will use only very 11:th cell for the training. The
default value is 1
(using every cell). If the value of nnthin
is very large,
the training set may fail to sample extreme conditions in the model, thus leading to less
precise predictions for such regions.
When one is using an already trained network in further calculations, the
keyword nnmake
is replaced by nnsolve
, followed by the string
argument used with the keyword nnmake
above.
nnabs 0.3 0.6 1.0 2.0 5.0 # absorptions [um]
nnemit 3.4 4.6 25.0 100.0 250.0 500.0 # emission [um]
nnsolve test
mapum 3.4 4.6 25.0 100.0 250.0 500.0 # maps [um]
FITS 1
The keywords nnabs
and nnemit
must also be included, with the same list of
wavelengths as when the library was created. We include in the above example also the
mapum
and FITS
keywords, so that the run would produce FITS files for the
emission, at all the same wavelengths as specified with nnemit
.
During the run with nnsolve
, only the wavelengths given by nnabs
should be
included in the radiative transfer calculations. At the moment this means that the dust
descriptions (keywords optical
and dsc
) and the description of the constant
radiation sources (keywords background
and pointsource
) must be changed, so
that the associated files include only that set of wavelengths. Instead of calling
ASOC.py
directly, it is therefore advisable to use the ASOC_driver.py
script,
which makes these changes automatically. The user only has to change between the
nnmake
and nnsolve
keywords, as networks are either being trained or used to
predict emission. When the neural network is trained, the program will also save some
diagnostic figures nnfit_<dust>.png
that plot the neural network predictions against
the emission in the training set, as well as the relative rms error of those predictions
(up to nine output wavelengths only).
When one uses ASOC_driver.py
, either without neural networks or with nnmake
, a
single run does the radiative transfer simulations, solves the emission (with the potential
training of the neural networks), and saves spectra (if so instructed by the ini file). The
one exception is the combination of nnmake
and nnthin #
with argument greater
than one. Because the emission is in this case not calculated for all cells, no maps will
be written. In this case, the maps need to be created by a separate run, possibly then
using the created neural networks.