## Abstract

This study offers an analytical estimation model for radiative scattering at the nanoscale. The study focuses on isolated nanowires with arbitrary-shape cross sections and uses predictive geometric features and statistical regression to model the wavelength-dependent light-particle interaction. This work proposes to estimate the radiative properties of nanowires from engineered geometric features, potentially leading to new understanding of how geometric attributes impact light scattering at the nanoscale. A predictive model is designed and tested for estimating radiative scattering around nanowires. Random polygon-shaped cross sections with high degrees-of-freedom are chosen to train and test the models. The derived model successfully explains scattering across out-of-sample synthetic plasmonic objects with an R-squared of 0.90.

## 1 Introduction

Optical characterization and design of subwavelength objects are challenging problems due to the complexity and computational cost of modeling. Nevertheless, the significant engineering advantages recently offered by subwavelength particles in radiation applications such as cellular imaging [1], cancer therapy [2], optical antennas [3], photovoltaic cells [4], and more, have urged optics and nanoscience researchers to develop systematic ways of studying nanoscale radiation. Despite the growth in computational resources, optical nanoscale design remains an intensive, repetitive numerical process involving the solution of Maxwell's partial differential equations. Analytical solutions are widely missing except in limited cases involving well-shaped (spherical, cylindrical, etc.) isolated objects.

### 1.1 Summary of Contributions.

This study proposes an analytical model for the characterization and prediction of radiative response (scattering) around isolated nanowires. This is done in three steps: (1) training data generation: a large dataset of nanowires with arbitrary shape cross sections is generated as the training data. A random 2D polygon generation model is proposed and used to model arbitrary cross sections; (2) predictive feature extraction: we use a common set of geometric features (shape descriptors) to characterize the physical cross section shapes. These geometric features can be used as independent variables in a fitting framework to derive predictive models for the radiative response of nanoparticles in the face of incident light/radiation; and (3) model fitting: feature selection and regression techniques are used to obtain reliable predictive models. These models can reliably replace full-fidelity time-consuming electromagnetic simulations, and efficiently predict radiative scattering provided that the simulation parameters are close to those used in training. Otherwise, such models can still be useful in generalized applications with different settings if accuracy is not crucial, e.g., educational purposes or rapid (surrogate-model) optimizations. In addition, the provided analytical models can be helpful in more rigorous investigations of the effects of geometric shapes in scattering properties, e.g., in inverse design.

### 1.2 Limitations of the Study.

Our study is limited to 2D radiation and is therefore focused on nanowires (not nanoparticles). For data generation and shape description, we use randomly generated polygons; every real-world 2D shape can be approximated to within a required error by a polygon of appropriate order.^{2} Intuitive features derived from polygon shapes can be as simple as area and eccentricity, or as complicated as high-order metrics of convexity, elongation, etc. The geometric shape features are computed using image characterization methods adopted from image recognition, with nuances to incorporate the interaction of light and objects. The work is limited to metallic (silver) objects. A more comprehensive study shall include a larger material basis, covering both other plasmonic and dielectric materials, and generalize the learning algorithms to include material properties as features (or normalization parameters). The work is also limited to isolated, compact, finite, homogeneous particles.

### 1.3 Background and Motivation.

The optical response of isolated nanoparticles and nanowires to radiation is a highly complex function of the physical properties such as particle size, shape and material, as well as the characteristics of the incident light (e.g., wavelength, polarization, angle, etc.). The radiative response profile (i.e., spectral reflectance, absorbance, scattering) can be characterized by multiple metrics, including the response strength, bandgap, stop band, bandwidth, resonance/peak frequency, number, location and range of peaks, etc. These characteristics are intertwined and adjustable via the choice of geometry and material. The tunability of the radiative spectrum makes the geometry design an attractive method for unique devices that utilize nanoparticles and nanowires [5–7]. In some simplified cases, e.g., when the particle size is significantly smaller than the wavelength, the quasi-static assumption holds and the impact of the particle shape can be approximated via various shape factors [3,8–10]. Mie’s theory provides an analytical solution to Maxwell's equations in the case of spherical shapes [11], with a few limited extensions to non-spherical shapes [12,13].

Studying non-spherical/non-circular objects with sizes comparable to the wavelength requires using computational Maxwell's equation solvers, such as the finite element method, finite difference method, and discrete dipole approximation. Recently, shape-dependent optical response of nanoparticles has been studied by many researchers by considering common shapes such as triangles, cylinders, spheroids, and cuboids [14–19], as well as more exotic forms such as bipyramids [20], star and flower shapes [21], bowl and dumbbell shapes [22]. The research on shape-dependent optical properties has led to novel approaches for shape exploration in radiation. For instance, several studies have utilized structural topology optimization to determine suitable nanoscale shapes for optical problems [23,24]. In Refs. [25–30], the efficiency of solar cells was optimized using light-trapping structures with the help of topology optimization. In Ref. [31], photonic crystal waveguides were optimized for desired dispersion properties using similar approaches.

Data-driven approaches are another class of techniques that have recently gained popularity in nano-photonics study and design, with the aim of alleviating the computational burden of modeling [32–43]. The shape geometry in these studies is expressed either by some mathematical parametrization, such as side lengths [44,45] and vertex coordinates in the case of polygons [46] or by pixel representation, analogous to the field of image recognition [47,48]. Though limited, the effect of the shape features on the optical properties has also been studied in certain cases (e.g., Refs. [49,50]). For example, in Ref. [49], the emissivity of particles with four different shapes (rectangular and triangular prisms, sphere, cylinder) was predicted using simple geometric features, namely area-to-volume ratio, shortest, middle, and longest dimensions. Nevertheless, the generality of particle shapes and optical property prediction based on intuitive geometric features are widely missing in the existing literature. We can therefore assert that the state-of-the-art data-driven modeling of nanoscale radiation and scattering is limited and incomplete. This work is a preliminary attempt to fill some of that knowledge gap.

The rest of the paper is organized as follows: first, a quick overview of light scattering theory is presented. Then, the study framework for nanowires with arbitrary-shape cross sections is elaborated, including the shape generation methodology, predictive geometric features for shape characterization, feature selection, and regression analysis. Finally, numerical results are presented and discussed using test cases, and conclusions are drawn.

## 2 Light Scattering at Nanoscale

*Scattering* results from the interaction of the electromagnetic field with objects of different permittivity distributions and with the interfaces between these objects [51]. Scattering and absorption together comprise *extinction*. A common metric to measure light extinction is the *optical cross section*, defined as the power extincted, scattered, or absorbed by the object normalized by the irradiance, i.e.,

$$C_{\ell} = \frac{P_{\ell}}{I_0}, \qquad \ell \in \{e, s, a\}$$

where *P* is power, *I*_{0} is irradiance, and *C* is the optical cross section. The subscripts *e*, *s*, and *a* stand for extinction, scattering, and absorption, respectively. These cross sections are wavelength-dependent quantities that can differ from the geometric cross sections, and can be up to orders of magnitude larger with selective designs, particularly at small dimensions.

When the particle is much smaller than the wavelength, the quasi-static approximation holds and the solution is known as *Rayleigh* scattering. Another condition that results in an explicit solution is symmetry in the object's cross section: Mie's theory provides analytical solutions for sphere- and cylinder-like objects [52,53]. When the particle size is comparable to or smaller than the wavelength and the shape is non-spherical, computational methods are required for accurate electromagnetic modeling. Finite difference time domain (FDTD) is one of the widely used methods to model electromagnetic field components due to its simplicity and accuracy [54–58]. FDTD solves Maxwell's equations on a discrete spatial and temporal grid, composed of a collection of so-called Yee's cells.^{3} A Yee's cell carries the electric field vectors along its edges and the magnetic field vectors perpendicular to its faces. Once the electric and magnetic fields are calculated on the solution domain, the radiation power crossing a surface **S** is computable from the Poynting flux, i.e.,

$$P = \int_{\mathbf{S}} \left( \mathbf{E} \times \mathbf{H} \right) \cdot d\mathbf{S}$$
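The FDTD leapfrog update can be sketched in one dimension. The following is a minimal normalized-units illustration, not the paper's 2D dispersive-material solver; the grid size, step count, and Gaussian source parameters are arbitrary choices for demonstration:

```python
import numpy as np

def fdtd_1d(n_cells=200, n_steps=250, src=100):
    """Leapfrog E/H updates on a staggered (Yee) 1D grid, normalized units.

    Uses the 1D "magic" Courant number (c*dt/dx = 1), so the scheme is
    stable; a soft Gaussian source injects a pulse at cell `src`.
    """
    ez = np.zeros(n_cells)      # E-field samples at integer nodes
    hy = np.zeros(n_cells - 1)  # H-field samples offset by half a cell
    for t in range(n_steps):
        hy += ez[1:] - ez[:-1]        # H update from the spatial curl of E
        ez[1:-1] += hy[1:] - hy[:-1]  # E update from the spatial curl of H
        ez[src] += np.exp(-((t - 30.0) / 10.0) ** 2)  # soft Gaussian source
    return ez
```

In the actual simulations, a 2D Yee grid with a dispersive silver permittivity model and field monitors around the nanowire would replace this toy setup.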

## 3 Model Framework Description

This study aims to establish a relationship between the shape of nanowire cross sections and their radiative scattering using a statistical approach with engineered geometric features as inputs. The procedure is summarized in Fig. 1. First, random polygons are generated using the methodology described in the next section. Then, electromagnetic simulations are performed to find the target outputs (scattering cross sections) for the generated shapes. Meanwhile, the values of the geometric features are calculated for the generated polygons (Sec. 5). The dataset containing the features (*X*-values) and the targets (*y*-values) is then supplied to regression models for training. Finally, the prediction performance is evaluated using (untouched) validation and test sets.

### 3.1 Wavelength, Dimensions, and Materials.

For the current study, we consider radiation of wavelengths between 400 nm and 700 nm, a narrowband range that encompasses the peak of the solar irradiance. The studied nanowire cross sections are characterized by polygons with dimensions on the same order as the wavelength; we consider the polygon radius to be less than 50 nm. For data generation, the object material is taken to be silver (Ag).

## 4 Construction of Random Shapes

The arbitrary shapes are represented by *polygons* due to their geometric diversity and ease of construction. A polygon is a closed shape in *R*^{2} that can be fully characterized via its adjacent vertex coordinates. For a shape to be a valid polygon, the segments connecting adjacent vertices must not intersect. Our random polygon generation involves the following steps:

1. Randomly select the number of vertices, *N*_{v}, between 3 and $N_{v}^{\max} = 10$.
2. Divide the coordinate system into *N*_{v} angular segments (to avoid intersection).
3. In each segment, randomly select a radius (between 5 nm and 50 nm) and an angular position for the vertex.

The *i*th vertex is denoted by *V*_{i}, with polar coordinates *r*_{i} and Ψ_{i}, where *i* = 1, …, *N*_{v} and *N*_{v} is the number of vertices in the given polygon. All polygons in our database are represented with equisized vectors. To do so, we use a different mathematical representation in which the polygon boundary is divided into *n*_{d} boundary points *P*_{j} (with *n*_{d} = 360 in this study), where *j* is the index of a boundary point. Figure 2 shows the schematic of a polygon cross section of a hypothetical nanowire. The point *G* indicates the centroid of the polygon, the location of which is calculated as

$$g_x = \frac{1}{n_d}\sum_{j=1}^{n_d} x_j, \qquad g_y = \frac{1}{n_d}\sum_{j=1}^{n_d} y_j$$

where *n*_{d} is the number of boundary points and (*x*_{j}, *y*_{j}) are the coordinates of boundary point *P*_{j}.
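The generation steps and boundary discretization can be sketched as follows. This is a minimal NumPy illustration; the function names and the uniform sampling details are our assumptions, not necessarily the authors' exact implementation:

```python
import numpy as np

def random_polygon(rng, n_v_max=10, r_min=5.0, r_max=50.0):
    """Steps 1-3 above: pick N_v, split the plane into N_v angular
    segments, and draw one vertex (radius in nm, angle) per segment,
    which guarantees a simple (non-self-intersecting) polygon."""
    n_v = int(rng.integers(3, n_v_max + 1))
    seg = 2.0 * np.pi / n_v
    radii = rng.uniform(r_min, r_max, n_v)
    angles = seg * np.arange(n_v) + rng.uniform(0.0, seg, n_v)
    return np.column_stack((radii * np.cos(angles), radii * np.sin(angles)))

def resample_boundary(vertices, n_d=360):
    """Divide the polygon boundary into n_d equally spaced points P_j."""
    closed = np.vstack((vertices, vertices[:1]))
    seg_len = np.linalg.norm(np.diff(closed, axis=0), axis=1)
    cum = np.concatenate(([0.0], np.cumsum(seg_len)))
    s = np.linspace(0.0, cum[-1], n_d, endpoint=False)
    return np.column_stack((np.interp(s, cum, closed[:, 0]),
                            np.interp(s, cum, closed[:, 1])))

rng = np.random.default_rng(0)
poly = random_polygon(rng)            # vertex coordinates V_i
boundary = resample_boundary(poly)    # boundary points P_j
centroid = boundary.mean(axis=0)      # centroid G, averaged over boundary points
```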

## 5 Predictive Geometric Features

Arbitrary shapes can be characterized using techniques adopted from object recognition where predictive features are used in classification. The available methods differ in whether they use the object boundary (*boundary-based* methods) or the interior points of the shapes (*region*-*based* methods) [60]. Since the polygons in this study have no holes, the boundary-based methods seem more appropriate. Some of the most widely used approaches for shape representation are polygonal approximations, interrelation evaluation, moments, transforms, chain codes [61], beam angle statistics [62], shape context [63], chord distributions [64], Fourier and wavelet transforms of shape signatures [65,66], and other techniques [67]. Moments can be realized in various ways such as invariant moments [68,69], Zernike moments [70], and more. One of the approaches to quantifying the geometric features is the bounding rectangles [71,72] that can help in calculating orientation and elongation related features. Additionally, simple descriptors, such as area, perimeter, compactness, eccentricity, and perimeter-area ratios, can also be included [73]. For the detailed classification of the shape representation methods and specific examples, interested readers are referred to the comprehensive reviews in the literature [60,67,74–79].

Most of the aforementioned representation methods characterize a shape via a distribution, i.e., a vector over the *n*_{d} boundary points. To reduce the number of inputs, we only consider the scalar geometric descriptors among the mentioned methods. Furthermore, contrary to the field of object recognition, the scale of the shapes and their orientation with respect to the light direction are important discriminators in the estimation of radiative properties; therefore, we must include hybrid features to characterize the directional and local properties of shapes vis-à-vis the light radiation. For example, the geometric cross section of a particle with respect to a ray of light depends on the view angle, and the symmetry of the particle along the radiation axis and perpendicular to it can carry meaningful, distinct predictive information. In summary, we consider the following criteria in our feature engineering:

- *Shape boundary is sufficient:* the shapes in this study are represented by straight lines with no holes inside.
- *Features are scalars:* descriptors expressed as distributions are not considered, to limit the feature vector size.
- *Orientation matters:* in the current study, the orientation of shapes is important because the light arrives from a single direction. Two polygons that can be rotationally converted to one another are considered different shapes.
- *Scale is important:* size certainly changes the scattering response; the scattering of light by a polygon-shaped object differs from that of the same polygon scaled up or down. Therefore, there must be features characterizing the scale of particles. On the other hand, features whose primary purpose is other than scale must be scale-independent.

| Symbol | Description |
| --- | --- |
| *N*_{v} | Number of vertices |
| *p*_{n} | Normalized perimeter |
| *A* | Area |
| ξ_{e} | Eccentricity |
| ξ_{c} | Compactness |
| ξ_{r} | Rectangularity |
| ξ_{α} | Coverage angle from an observer far from the shape |
| ξ_{v} | Ratio of the number of vertices visible to an observer to the total number of vertices |
| ϕ_{1} | 1st moment invariant |
| ϕ_{2} | 2nd moment invariant |
| ϕ_{3} | 3rd moment invariant |
| ϕ_{e,1} | 1st-order elongation |
| ϕ_{e,4} | 4th-order elongation |
| l_{y} | Ratio of the perimeter of the bounding rectangle aligned with the x-axis to the actual shape perimeter |
| τ_{x,max} | Ratio of the shape extent along the y-axis to the maximum shape extent along any direction |
| τ_{y,max} | Ratio of the shape extent along the x-axis to the maximum shape extent along any direction |
| τ_{x,avg} | Ratio of the shape extent along the y-axis to the average shape extent along any direction |
| τ_{y,avg} | Ratio of the shape extent along the x-axis to the average shape extent along any direction |
| γ | Orientation angle |
| ρ_{+x} | Directional perimeter in the vicinity of the +x direction |
| ρ_{+y} | Directional perimeter in the vicinity of the +y direction |
| ρ_{−x} | Directional perimeter in the vicinity of the −x direction |
| ρ_{−y} | Directional perimeter in the vicinity of the −y direction |
| s_{+x} | Directional sharpness in the vicinity of the +x direction |
| s_{+y} | Directional sharpness in the vicinity of the +y direction |
| s_{−x} | Directional sharpness in the vicinity of the −x direction |
| s_{−y} | Directional sharpness in the vicinity of the −y direction |
| c | Convexity of the overall shape |
| c_{+x} | Directional convexity in the vicinity of the +x direction |
| c_{+y} | Directional convexity in the vicinity of the +y direction |
| c_{−x} | Directional convexity in the vicinity of the −x direction |
| c_{−y} | Directional convexity in the vicinity of the −y direction |
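Several of the simple scalar descriptors listed above can be computed directly from the boundary points. The sketch below uses common textbook definitions of area, perimeter, compactness, and eccentricity, which may differ in detail from the authors' exact formulations:

```python
import numpy as np

def area(pts):
    """Polygon area via the shoelace formula (pts: (n, 2) boundary, in order)."""
    x, y = pts[:, 0], pts[:, 1]
    return 0.5 * abs(np.dot(x, np.roll(y, -1)) - np.dot(y, np.roll(x, -1)))

def perimeter(pts):
    """Total boundary length of the closed polygon."""
    closed = np.vstack((pts, pts[:1]))
    return float(np.linalg.norm(np.diff(closed, axis=0), axis=1).sum())

def compactness(pts):
    """4*pi*A/p^2: equal to 1 for a circle, smaller for elongated shapes."""
    return 4.0 * np.pi * area(pts) / perimeter(pts) ** 2

def eccentricity(pts):
    """From the eigenvalues of the second-moment matrix of boundary points."""
    c = pts - pts.mean(axis=0)
    lam = np.linalg.eigvalsh(c.T @ c / len(c))   # lam[0] <= lam[1]
    return float(np.sqrt(1.0 - lam[0] / lam[1]))
```

The directional features (ρ, s, c_{±x,±y}) build on the same boundary-point representation, restricted to the neighborhood of each axis direction.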

## 6 Dataset

The regression target is chosen as the logarithm of the scattering coefficient,

$$y = \log\left(\frac{C_{\mathrm{scat}}}{A}\right)$$

where *C*_{scat} is the scattering cross section and *A* is the physical area. We refer to *C*_{scat}/*A* as the *scattering coefficient*. The logarithm of the scattering coefficient provides a normalized target distribution. The distributions of different target choices are compared in Fig. 3, and the scatter plots of the individual features with respect to the target are presented in Fig. 4. The absolute Pearson intercorrelation of the features is calculated and graphically demonstrated, after applying a hierarchical clustering approach, in Fig. 5. The heat map represents the 0-to-1 absolute correlation matrix of the features, while the tree chart on the left of the figure shows the distance-based hierarchical clustering.
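The feature intercorrelation matrix of Fig. 5 takes only a few lines to compute. The sketch below uses synthetic stand-in columns (the real inputs are the geometric features above); the stand-in names and distributions are purely illustrative:

```python
import numpy as np

def abs_corr(X):
    """Absolute Pearson correlation matrix (0-to-1) of the feature columns."""
    return np.abs(np.corrcoef(X, rowvar=False))

# synthetic stand-ins: two strongly correlated shape-like features and one
# independent feature (playing the role of the wavelength)
rng = np.random.default_rng(1)
a = rng.normal(size=500)
b = 2.0 * a + 0.1 * rng.normal(size=500)
lam = rng.normal(size=500)
R = abs_corr(np.column_stack((a, b, lam)))
```

Feeding the distance matrix 1 − R (in condensed form) to `scipy.cluster.hierarchy.linkage` would then produce the dendrogram ordering used for the heat map.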

### 6.1 Remarks.

Note that the wavelength *λ* is the feature least correlated with everything else, which is expected. The decision to include wavelength as a feature is to reduce the computational cost of learning and to avoid having to set up a multi-output regression framework. The other feature at the outermost leaf of the hierarchy is the orientation angle, *γ*. As shown in Fig. 4, *γ* has a V-shaped relationship with the target, unlike the other features. Although the number of vertices (*N*_{v}) is also expected to have a small correlation with the rest of the features, it clusters with the 3rd moment invariant (*ϕ*_{3}) and has a small nonzero correlation with a few other features, such as convexity (*c*) and the ratio of visible vertices (*ξ*_{v}). Convexity and the 1st moment invariant (*ϕ*_{1}) are also highly correlated. The features belong to one of two main clusters: the first cluster includes features that characterize local aspects of the shape, such as the orientation, sharpness, and convexity-related directional features, and the ratio of the shape extent in the *x* and *y* directions to the average extent. The second cluster includes features that explain the more holistic characteristics of the shape, such as the simple, moment-based, and sharpness-related features.

## 7 Data Fitting

We use a variety of models with different degrees of complexity for data fitting. We choose linear regression as the simple model, and regression trees, extreme gradient boosted trees (XGBoost), and neural networks as the more complex nonlinear regression models. XGBoost combines weak learners into a strong learner, with built-in regularization reducing the chance of overfitting [80]. Linear models are generally good choices when high-quality features are available without the need for additional normalization or transformation; otherwise, they will considerably underfit. Even with quality features, linear models can quickly become unreliable in the presence of feature collinearity. Nonlinear models, on the other hand, are prone to overfitting and need careful regularization and cross-validation during training.

The model performance is monitored during training using a *validation* set, and training is terminated if the validation-set error starts increasing. We also deploy a ten-fold cross-validation scheme to alternate the validation set within the overall set and then average the model parameters to reduce any potential data bias. The model hyperparameters (e.g., the size and configuration of the neural network, the depth of the tree, etc.) are determined using a grid-search hyperparameter optimization with separate cross-validation. The ultimate performance of every regression model is then assessed using an untouched *test set*. The data modeling and fitting tasks of the current work were implemented in Python (v. 3.6) using scikit-learn (v. 0.22) and the XGBoost package (v. 1.3.0) [80,81].
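The fitting pipeline can be sketched as follows. This is a minimal illustration with synthetic placeholder data, scikit-learn's gradient boosting standing in for XGBoost, and a 5-fold grid search instead of the paper's ten-fold scheme:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import GridSearchCV, train_test_split

# placeholder data standing in for (geometric features, log scattering coefficient)
rng = np.random.default_rng(2)
X = rng.uniform(size=(400, 5))
y = np.sin(3.0 * X[:, 0]) + X[:, 1] ** 2 + 0.05 * rng.normal(size=400)

# hold out an untouched test split for the final assessment
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

# grid-search the hyperparameters with internal cross-validation
search = GridSearchCV(
    GradientBoostingRegressor(random_state=0),
    param_grid={"max_depth": [2, 3], "n_estimators": [100, 200]},
    cv=5,
    scoring="r2",
)
search.fit(X_tr, y_tr)

# evaluate the selected model on the untouched test split
r2_test = r2_score(y_te, search.predict(X_te))
```

With the actual dataset, `xgboost.XGBRegressor` and a neural network would be swapped in as additional estimators under the same grid-search scaffold.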

## 8 Results and Discussion

### 8.1 Training Results.

Table 8 summarizes the training results. The training and validation set errors are shown together with the coefficients of determination, or *R-squared* (*R*^{2}). Figure 6 shows the scatter plots of the target and predicted outputs for each predictor. The linear regression algorithm has the highest error and lowest *R*^{2}, as expected, since the input-output relationship is highly nonlinear. Regression trees show an *R*^{2} larger than 0.85, yet significantly smaller than those of XGBoost and neural networks. One possible reason for the poorer performance of regression trees is overfitting: as the tree structure gets complex, the generalization capability of regression trees degrades.^{4}

| Method | MSE (training) | *R*^{2} (training) | MSE (validation) | *R*^{2} (validation) |
| --- | --- | --- | --- | --- |
| Regression trees | 0.00558 | 0.997 | 0.355855 | 0.866 |
| XGBoost regression | 4.07 × 10^{−7} | 1.000 | 0.127011 | 0.952 |
| Neural networks | 0.075895 | 0.972 | 0.12748 | 0.952 |
| Linear regression | 1.247262 | 0.539 | 1.312303 | 0.506 |

Note: MSE: mean-squared error.

XGBoost and neural networks perform the best for predicting the output according to the validation error. Although the smaller training error of XGBoost compared to neural networks suggests overfitting, the inherent regularization of XGBoost results in a small validation error. On the other hand, neural networks also perform similarly on the validation set and the larger training error is due to the Bayesian regularization scheme employed during neural network training. As a result of the training, XGBoost and neural networks can be considered the best predictors.

### 8.2 Test Results.

The test dataset cases are entirely new to the predictors; thus, the test results reflect the real-world performance of the predictors. Based on the previous section, XGBoost and neural networks were chosen as the best-performing predictors and are used for the test cases in this section. Figure 7 compares the targets with the predictions obtained by XGBoost (Fig. 7(a)) and neural networks (Fig. 7(b)). The error rates and *R*^{2} values of these predictors are very close, in agreement with the training and validation results. The test *R*^{2} values are slightly lower than those of the validation cases, as expected. The performance of the final predictive models can be further improved by employing adaptive sampling techniques in which new training points are added from the high-error regions, possibly through an adversarial data-generation/training framework.

In Fig. 8, the scattering cross section, *C*_{s}, is compared over the wavelength range under consideration for three different cases. The examples are limited in number for the sake of brevity. Note that in these plots, the target, the log scattering coefficient (log *C*_{s}/*A*), is converted back to the scattering cross section, *C*_{s}, for visualization purposes.

The discrepancy between the targets and the predictions is more visible at the peak locations of the scattering spectra. Discrepancies at large scattering values are expected because samples with large outputs are slightly underrepresented in the data. One possible way to improve the fit at the peaks is to create a supplementary predictor that estimates the peak locations and peak values of each case; this could work as a refinement step on top of the prediction.

## 9 Conclusion

In this study, the optical characteristics of nanowires with arbitrary-shape cross sections were modeled using data-driven techniques. The arbitrary shapes (randomly generated polygons) were characterized by predictive shape descriptors describing unique features of each shape. These features include simple properties, such as area and eccentricity, and more complex features representing sharpness, convexity, and other advanced geometric properties. The predictive shape features were used in regression frameworks to estimate optical scattering, namely the log scattering coefficient. Among the different techniques, XGBoost and neural network regression yielded the smallest validation errors. The test-set experiments demonstrated that the predictors perform well out-of-sample, i.e., on entirely new (test) cases. The final analytical predictive models are the XGBoost and feedforward neural network models.

Although there are discrepancies between the targets and the predictions, the general trend of the scattering was predicted very closely by the devised models. The peak (spectral) locations are estimated closely, even though there is some mismatch in the peak values in some cases. The performance of the final predictive models can be further improved by employing adaptive sampling techniques in which new training points are added from the high-error regions, possibly through an adversarial data-generation/training framework. This shall be a subject of future study.

## Footnotes

^{2}This assumes that the studied shape is sufficiently smooth, i.e., does not have sporadic significant spatial variations, which is a reasonable assumption.

^{3}Named after Kane Yee, who developed the FDTD method.

^{4}This can potentially be alleviated with random forests, but XGBoost is generally known to outperform them.

## Acknowledgment

This work was supported by the US National Science Foundation (Grant No. CBET-2103008).

## Conflict of Interest

There are no conflicts of interest.

## Data Availability Statement

The datasets generated and supporting the findings of this article are obtainable from the corresponding author upon reasonable request.

## Nomenclature

- *ℓ* = optical index (*a*, *e*, or *s* for absorption, extinction, or scattering, respectively)
- *A* = area
- *C* = optical cross section
- *G* = polygon centroid vector
- **E** = electric field
- **H** = magnetic field
- *r*_{i} = polar coordinate radius
- *p*_{n} = normalized perimeter
- *n*_{d} = number of boundary points
- *C*_{s} = scattering cross section
- *I*_{0} = irradiance
- *N*_{v} = number of polygon vertices
- *P*_{j} = boundary point
- *V*_{i} = vertex point
- *g*_{x}, *g*_{y} = coordinates of the polygon centroid
- *x*_{i}, *y*_{i} = point coordinates
- *P*_{a}, *P*_{e}, *P*_{s} = power (absorbed, extincted, and scattered)

### Greek Symbols

## References

*Computational Electrodynamics, Vol. 28*