`disco.core.cnn_inference`

This module defines the DiscoNet convolutional neural network and the predict_with_cnn() inference function used to generate prior estimates of disk inclination and position angle from a FITS image patch.

Architecture

DiscoNet is a residual convolutional encoder with a multi-layer perceptron head. It accepts a 3-channel \(128 \times 128\) tensor and, as loaded for inference (n_out=5), returns five scalar outputs. Only the first three are decoded into geometric parameters in the current inference path. Note that the class default is n_out=6.

class disco.core.cnn_inference.ResBlock(ch)

A residual block consisting of two \(3 \times 3\) convolutional layers with batch normalisation and ReLU activations, plus a skip connection.

\[\text{out} = \text{ReLU}\!\left(x + \text{BN}(\text{Conv}(\text{ReLU}(\text{BN}(\text{Conv}(x)))))\right)\]

Parameters: ch (int) – Number of input and output channels.

forward(x)

Parameters: x (torch.Tensor) – Input feature map.
Returns: Output feature map of the same shape as the input.
Return type: torch.Tensor
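
The residual formula above can be sketched in PyTorch as follows. This is a minimal illustration rather than the module's actual source: the use of padding=1 (so that the skip connection shapes match) and bias=False in the convolutions are assumptions, as are the attribute names.

```python
import torch
from torch import nn


class ResBlock(nn.Module):
    """Sketch of a residual block: two 3x3 conv + BN layers plus a skip connection."""

    def __init__(self, ch: int) -> None:
        super().__init__()
        # Assumption: padding=1 keeps H and W unchanged so x and the branch can be added.
        self.conv1 = nn.Conv2d(ch, ch, kernel_size=3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(ch)
        self.conv2 = nn.Conv2d(ch, ch, kernel_size=3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(ch)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        y = torch.relu(self.bn1(self.conv1(x)))  # ReLU(BN(Conv(x)))
        y = self.bn2(self.conv2(y))              # BN(Conv(...))
        return torch.relu(x + y)                 # ReLU(x + branch)


block = ResBlock(32).eval()
with torch.no_grad():
    out = block(torch.randn(2, 32, 16, 16))
print(out.shape)
```

Because the final ReLU is applied after the addition, the block output is non-negative and has the same shape as its input.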

class disco.core.cnn_inference.DiscoNet(n_out=6)

Residual convolutional encoder with five downsampling stages and an MLP regression head. At inference time the model is loaded with n_out=5.

Architecture summary:

Stage	Layer(s)	Output channels
`stem`	Conv 3×3, BN, ReLU	32
`enc1`	ResBlock(32) + Conv 3×3 stride 2	64
`enc2`	ResBlock(64) + Conv 3×3 stride 2	128
`enc3`	ResBlock(128) + Conv 3×3 stride 2	256
`enc4`	ResBlock(256) + Conv 3×3 stride 2	512
`enc5`	ResBlock(512) + Conv 3×3 stride 2	512
`pool`	AdaptiveAvgPool2d(4×4)	512 × 4 × 4 = 8192
`head`	Linear(8192→1024), ReLU, Dropout(0.45), Linear(1024→512), ReLU, Dropout(0.30), Linear(512→n_out)	n_out

Parameters: n_out (int) – Number of output scalars. Default: 6. The CLI pipeline loads the model with n_out=5.

forward(x)

Parameters: x (torch.Tensor) – Input tensor of shape (N, 3, H, W).
Returns: Output tensor of shape (N, n_out).
Return type: torch.Tensor
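
To see how the spatial size evolves through the stages in the architecture table, the following sketch applies the standard convolution output-size formula to each stage. It assumes every 3×3 convolution uses padding=1, which the table does not state; the helper names conv_out and trace_shapes are illustrative only.

```python
def conv_out(size: int, kernel: int = 3, stride: int = 1, padding: int = 1) -> int:
    """Standard convolution output-size formula for one spatial dimension."""
    return (size + 2 * padding - kernel) // stride + 1


def trace_shapes(h: int = 128) -> list:
    """Return (stage, channels, spatial size) for a square h x h input."""
    # Stage names and output channels are copied from the architecture table.
    stages = [("enc1", 64), ("enc2", 128), ("enc3", 256), ("enc4", 512), ("enc5", 512)]
    size = conv_out(h)  # stem: stride 1, spatial size unchanged
    trace = [("stem", 32, size)]
    for name, ch in stages:
        # The ResBlock keeps the size; the stride-2 convolution halves it.
        size = conv_out(size, stride=2)
        trace.append((name, ch, size))
    return trace


trace = trace_shapes(128)
for name, ch, size in trace:
    print(f"{name}: {ch} x {size} x {size}")

flat = 512 * 4 * 4  # features entering the head after pooling to 4 x 4
```

Under that padding assumption, a \(128 \times 128\) input already reaches \(4 \times 4\) after enc5, so the adaptive pooling only changes the feature map for other input sizes; in either case the head receives 8192 features.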

Output Encoding

The network outputs are encoded as follows (as documented in the training checkpoint outputs field):

Index	Symbol	Encoding
0	`incl/90`	Inclination normalised to [0, 1] (multiply by 90° to recover degrees)
1	`sin2PA`	\(\sin(2\phi)\) — PA encoded as double-angle sine
2	`cos2PA`	\(\cos(2\phi)\) — PA encoded as double-angle cosine
3	`dx/0.14`	Centre x-offset normalised by 0.14 arcsec (not used in inference)
4	`dy/0.14`	Centre y-offset normalised by 0.14 arcsec (not used in inference)

The position angle is decoded as:

\[\hat{\phi} = \left(\frac{1}{2}\arctan_2\!\left(\hat{y}_1,\, \hat{y}_2\right) \times \frac{180}{\pi}\right) \bmod 180°\]

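Position angle is only defined modulo 180°, so \(\phi\) and \(\phi + 180°\) describe the same orientation. The double-angle encoding maps both to the same \((\sin 2\phi, \cos 2\phi)\) point, which keeps the regression target continuous across the \(0° / 180°\) boundary.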
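
A small numerical check illustrates the continuity argument. The helper encode_pa below is hypothetical (it is not part of the module) and simply evaluates the double-angle pair for a position angle given in degrees.

```python
import numpy as np


def encode_pa(pa_deg: float) -> np.ndarray:
    """Hypothetical helper: return [sin(2*phi), cos(2*phi)] for a PA in degrees."""
    phi = np.radians(pa_deg)
    return np.array([np.sin(2.0 * phi), np.cos(2.0 * phi)])


# 1 deg and 179 deg differ by 178 deg as raw numbers, but by 2 deg in orientation.
near_wrap = np.linalg.norm(encode_pa(179.0) - encode_pa(1.0))

# 10 deg and 190 deg describe the same orientation.
same_axis = np.linalg.norm(encode_pa(10.0) - encode_pa(190.0))

# 0 deg and 90 deg are the most different orientations possible.
orthogonal = np.linalg.norm(encode_pa(0.0) - encode_pa(90.0))

print(near_wrap, same_axis, orthogonal)
```

Orientations that are close on the sky stay close in the encoded space, whereas a direct regression on \(\phi\) would treat 1° and 179° as nearly maximally different.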

Inference Function

disco.core.cnn_inference.predict_with_cnn(data, header, pixel_scale, cx, cy, search_rad, model)

Generate a prior estimate of disk inclination and position angle from a FITS image patch using a pre-loaded DiscoNet model.

Preprocessing:

A rectangular crop of half-width \(1.5 \times r_{\rm search} / \delta_{\rm pix}\) pixels is extracted around (cx, cy). The crop is zero-padded if it extends beyond the image boundary.
The crop is resampled to \(128 \times 128\) pixels using scipy.ndimage.zoom (order 1).
Intensity normalisation: clipped to \([p_1, p_{99.9}]\) then rescaled to [0, 1].
A beam map (2D elliptical Gaussian at the image centre, normalised to peak 1) is constructed for the second channel.
A scalar map filled with \(\text{clip}(b_{\rm maj} / \text{FOV}, 0, 1)\) is used as the third channel.

The resulting 3-channel tensor of shape (1, 3, 128, 128) is forwarded through the model in eval mode with gradient computation disabled.

Output decoding:

\[\begin{aligned}\hat{i} &= \text{clip}(\hat{y}_0 \times 90°,\ 0°,\ 85°)\\ \hat{\phi} &= \left(\frac{\arctan_2(\hat{y}_1,\, \hat{y}_2)}{2} \times \frac{180}{\pi}\right) \bmod 180°\end{aligned}\]
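
The decoding step can be sketched in NumPy as below. The function name decode_outputs is hypothetical; it takes the raw output vector for a single image and applies the two formulas above, ignoring any outputs beyond the first three.

```python
import numpy as np


def decode_outputs(y):
    """Hypothetical helper: map raw network outputs to (inclination, PA) in degrees."""
    y = np.asarray(y, dtype=float)
    incl = float(np.clip(y[0] * 90.0, 0.0, 85.0))
    pa = float(np.degrees(0.5 * np.arctan2(y[1], y[2])) % 180.0)
    return incl, pa


# Example: outputs corresponding to i = 45 deg and PA = 30 deg (2 * PA = 60 deg).
two_phi = np.radians(60.0)
incl, pa = decode_outputs([0.5, np.sin(two_phi), np.cos(two_phi), 0.0, 0.0])

# An over-range inclination output is clipped to the 85 deg ceiling.
incl_clipped, _ = decode_outputs([1.2, 0.0, 1.0])
print(incl, pa, incl_clipped)
```

Because arctan2 uses only the ratio and signs of its arguments, the decoded position angle does not require the predicted sine and cosine pair to have unit norm.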

Parameters:

data (numpy.ndarray) – 2D FITS image array (float32).
header (dict) – FITS header (used for BMAJ, BMIN, BPA).
pixel_scale (float) – Pixel scale in arcseconds per pixel.
cx (float) – Centroid column coordinate in pixels.
cy (float) – Centroid row coordinate in pixels.
search_rad (float) – Search radius in arcseconds defining the crop region.
model (DiscoNet) – Pre-loaded DiscoNet model in evaluation mode.

Returns:

(cnn_incl, cnn_pa) — estimated inclination (degrees) and position angle (degrees).

Return type:

tuple[float, float]
