Link to paper

The full paper is available here.

You can also find the paper on PapersWithCode here.

Abstract

  • Finding antenna designs that meet frequency requirements and are optimal is a critical component in designing next generation hardware.
  • The process is non-trivial because the objective function is nonlinear and sensitive to design changes.
  • EM simulations are slow and expensive with commercial software.
  • CZP is a sample-efficient and accurate surrogate model to estimate scattering coefficients in the frequency domain of a given 2D planar antenna design.
  • CZP uses a novel image-based representation for antenna topology and attention-based neural network architectures.
  • CZP outperforms baselines in terms of test loss and can find 2D antenna designs with only 40k training samples.

Paper Content

Introduction

  • Next generation of Metaverse computing devices offer exciting possibilities for immersion in the digital world
  • Antenna design is challenging due to physical volume constraints and demand for broader spectrum coverage
  • Devices contain up to 10-20 antennas in a small form factor
  • WiFi 6E extends the WiFi high band to the entire 6GHz spectrum
  • Computational antenna design relies on full-wave electromagnetic simulations
  • Process is expensive in terms of simulation and engineer time
  • Two approaches to antenna surrogate modeling: approximate physics-driven simulation or data-driven methods
  • Leverage machine learning to efficiently learn a surrogate model
  • Antenna search space is high dimensional and relationship between antenna topology and resonance profile is highly non-linear
  • Proposed novel image-based representation of antennas geometries
  • Resonance characteristic of an antenna can be written analytically as the ratio of two complex-valued polynomials
  • Proposed transformer architecture to tokenize the image representation and predict the complex valued zeros and poles of the S scattering matrix

Antenna preliminaries

  • Defined key notions and concepts
  • Introduced antenna design problem
  • Presented derivation based on EM solvers
  • Theoretical innovation of CZP architecture

Antenna parameterization

  • We consider 2D planar antennas to demonstrate an approach
  • An antenna design consists of a substrate, ground plane, discrete port, and front-side metallic patches
  • Combining all these specifications, we have an overall design choice vector which determines the antenna’s frequency response
  • Antenna engineers aim to find the right design choice so that specific antenna design targets are met
  • We consider an example antenna with an FR-4 substrate and 5 front-side metallic patches
  • The relationship between an antenna’s topology and its responses is governed by Maxwell’s equations
  • We can solve Maxwell’s equations by discretizing the space and converting the PDEs into ODEs
  • We can compute the scattering coefficients S 11 (ω) by its definition
  • We can also compute S 11 (ω) by directly performing a temporal Fourier transform for ODEs
  • S 11 (ω) has an analytic form as the quotient of two complex polynomials with respect to frequency ω
  • We can train a neural network to predict the zeros and poles, as well as a global constant c 0 from material properties h

Network architecture

  • Proposed model includes a novel image representation for a 2D antenna
  • Image representation inspired by mesh representations used by EM simulators
  • Image-based transformer architecture predicts zeros and poles of scattering coefficients

Image representation

  • Mesh-based finite element methods are used in many simulation tools.
  • Mesh representations use the fact that an antenna’s resonance characteristics are related to its topology.
  • Images can be used to learn a surrogate model as they contain spatial information.
  • Adaptive meshing enables the simulation of systems unsolvable by traditional discretization methods.
  • A three channel image representation is proposed to explicitly represent the boundaries and corners of substrate.

Surrogate model

  • Proposed architecture for surrogate model to predict zeros and poles from image representation
  • Augment image with two additional channels of linearly spaced x and y coordinates
  • CNN takes augmented image as input and generates feature maps
  • Filter-based tokenizer generates visual tokens to capture semantics such as relative boundary and corner locations
  • Visual tokens passed through multi-layer transformer encoder
  • Fully connected layer and non-linearity used to predict constant, zeros and poles

Model training

  • Training CZP models requires no direct supervision of c 0 (h), zeros z(h), and poles p(h).
  • Estimated S 11 (ω) is computed with c 0 , z, and p to match the ground truth.
  • Model is trained via back-propagation to minimize Mean Squared Error (MSE).
  • Frequency range is [0.2 − 7.0]GHz at increments of 0.1 (69 dimensions).
  • Shrinkage loss variant of MSE is used to reduce error on crucial parts of the scattering coefficients.

Experiments

  • Proposed image representation is better than coordinate-based inputs and binary image input.
  • CZP formulation outperforms raw prediction with same transformer architecture.
  • Transformer architecture outperforms CNN and MLP.
  • CZP generalizes well to unseen antenna designs.

Surrogate modeling

  • 48K samples used, 90% for training, 10% for testing, 10% of training set for validation
  • Experiments demonstrate effectiveness of novel image representation and proposed CZP when using transformer architecture
  • Coordinate-based method concatenates normalized bottom-left x, y coordinate of each patch with onehot vector
  • Tokens projected into 256 dimensional vector by 2-layer MLP with hidden layer of width 256
  • CZP improves over raw prediction by minimum of 4.6% and maximum of 28.0%
  • Increasing transformer depth from 2 to 8 layers improves raw prediction for image and coordinate representations by 29.5% and 35.9%
  • CZP and image representation benefit least from increasing complexity of model, raw prediction and coordinate representation benefit most
  • 8-layer transformer with image input is 40%+ improvement on baselines

Optimization

  • Proposed model can be used by optimization procedure to find antenna configurations with specific resonance characteristics
  • CZP is more robust than raw prediction because it is smooth
  • Antenna design framed as reinforcement learning problem
  • State and action of Markov Decision Process defined
  • Reward given at final timestep
  • Optimization done using Soft Actor Critic
  • CZP outperforms raw prediction in terms of success rate
  • CZP is more robust to less data

Conclusion

  • Proposed a novel surrogate model architecture for antenna design optimization
  • Derived a theoretical form for the S 11 scatter coefficients
  • Proposed a novel image representation inspired by existing mesh-based simulation techniques
  • Demonstrated experimentally that the proposed model and image representation are significant advances
  • Proposed model had significantly higher utility in terms of success rate for optimization of antenna design than baselines
  • Future work will involve solving more complicated 2D problems and generalizing the proposed model and image representation to 3D antenna
  • Future work will involve exploring other tokenization schemes and applying to linear PDEs in general
  • ODE in the form of Eqn. 3 is diagonalizable and the logarithm of modulus of the scattering coefficients log |S 11 (ω)| can be written as a ratio of two complex polynomials
  • Patch p 1 determines the location of the discrete port
  • Image representation outperforms coordinates and CZP outperforms raw prediction
  • Ablations of architectural components of the proposed model and baselines
  • After all patches have been placed, the coordinates are converted to the image representation and the surrogate model predicts the frequency response log |S 11 (ω)|
  • Compute reward components for each resonance target
  • Double-sided Fourier transform and symmetric signal extension
  • Comparison of predicted frequency response with an 8 layer transformer and image input on two test set examples
  • Attention maps learned by the spatial attention component of the proposed transformer architecture