Artificial Neural Network for Partial Differential Equations: From learning mapping function to learning operator

by Nirav Vasant Shah

Artificial Neural Networks (ANNs) have shown promising capabilities for solving Partial Differential Equations (PDEs). ANN-based approaches can be combined with conventional methods and can give accurate predictions even with noisy data. Despite several known computational advantages of ANN-based approaches over conventional approaches, there is limited understanding of “why the ANN-based methods work or do not work”. In contrast, we do know “what an ANN is able to learn”.
An ANN can act as a universal function approximator, i.e. it can learn a mapping function from inputs to outputs. In classical approaches, an ANN is used to identify a mapping between two finite-dimensional spaces. With a proper architecture and a properly defined loss function, an ANN can compute the solution field in a computationally efficient manner by learning from data. ANN-based approaches can be modified to utilize knowledge of the governing PDE [7]. However, such methods require training the ANN for each new instance of a parameter or coefficient. Additionally, such networks might have limited generalization capabilities. The error of an ANN can be divided into three components: approximation error, optimization error and generalization error. The universal approximation theorem guarantees only a small approximation error; it does not consider the optimization error and the generalization error [5].
Alternatively, ANNs can be used to learn non-linear continuous operators. Some recent approaches have focused on learning mappings between infinite-dimensional spaces, i.e. learning the operator. The Deep Operator Network (DeepONet) [5] can learn an operator accurately from a relatively small dataset and has shown promising generalization capabilities. It uses a “Branch” sub-network for encoding the input function and a “Trunk” sub-network for encoding the locations at which the output function is evaluated. The Graph Kernel Network (GKN) [3] has also been used to learn mappings between infinite-dimensional spaces. GKN uses an iterative architecture, which includes learning a kernel by a neural network. The approach is discretization invariant and is able to train and to generalize on different meshes. The Fourier Neural Operator (FNO) [4] is another approach for learning mappings between two infinite-dimensional spaces. FNO also uses iterative updates, replacing the kernel integral operator by a convolution operator in Fourier space. It is shown that if the operator is approximated properly, the error will be constant at any resolution of the data. For any new instance of a coefficient or parameter of the governing equation, GKN and FNO only require a forward pass of the ANN.
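
The branch/trunk structure of DeepONet can be illustrated with a minimal NumPy sketch. The layer sizes, the number of sensors and the random, untrained weights are illustrative assumptions and not the configuration used in [5]. The sketch only shows how the two encodings are combined into an output.

```python
import numpy as np

rng = np.random.default_rng(0)

def mlp_params(sizes):
    """Random (untrained) weights and zero biases for a fully connected network."""
    return [(rng.normal(0.0, 1.0 / np.sqrt(m), size=(m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def mlp(params, x):
    """Forward pass: tanh on the hidden layers, linear output layer."""
    for W, b in params[:-1]:
        x = np.tanh(x @ W + b)
    W, b = params[-1]
    return x @ W + b

m, p = 50, 32                      # number of sensors, width of the latent space
branch = mlp_params([m, 64, p])    # encodes the input function sampled at m sensors
trunk = mlp_params([1, 64, p])     # encodes a location y of the output function

def deeponet(u_sensors, y):
    """Approximate G(u)(y) by sum_k branch_k(u) * trunk_k(y).

    u_sensors: (batch, m) input functions sampled at the fixed sensors
    y:         (n_points, 1) locations where the output function is evaluated
    returns:   (batch, n_points)
    """
    b = mlp(branch, u_sensors)     # (batch, p)
    t = mlp(trunk, y)              # (n_points, p)
    return b @ t.T

sensors = np.linspace(0.0, 1.0, m)
u = np.stack([np.sin(2 * np.pi * sensors), np.cos(2 * np.pi * sensors)])
y = np.linspace(0.0, 1.0, 200)[:, None]
out = deeponet(u, y)               # one output function per input function
print(out.shape)
```

Because the trunk network takes the location as an input, the same network can be queried at any set of output points without retraining.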


Illustrations of the problem setup and architectures of DeepONets [5]

The full architecture of neural operator [4]

ANNs have been used as function approximators for model order reduction of parametric PDEs [2], in areas such as thermomechanical problems of industrial interest [8, 9] and complex problems in computational fluid dynamics [6]. One of the key challenges for deep learning-based model order reduction techniques is to extract more information from the high-fidelity solutions in a non-intrusive manner. Non-intrusive methods are convenient, especially when the high-fidelity solution is computed with commercial software. The extension of the Physics Informed Neural Network [7] to the Physics Reinforced Neural Network [1] is an example of modifying ANN-based approaches for application to model order reduction. Similarly, it is important to identify opportunities to extend operator learning to model order reduction techniques.

Relative error for velocity and pressure [6]

Temperature and displacement profile for thermomechanical problem [9]


[1] W. Chen, Q. Wang, J. S. Hesthaven, and C. Zhang. Physics-informed machine learning for reduced-order modeling of nonlinear problems. Journal of Computational Physics, 446:110666, 2021.
[2] J.S. Hesthaven and S. Ubbiali. Non-intrusive reduced order modeling of nonlinear problems using neural networks. Journal of Computational Physics, 363:55-78, 2018.
[3] Z. Li, N. Kovachki, K. Azizzadenesheli, B. Liu, K. Bhattacharya, A. Stuart, and A. Anandkumar. Neural operator: Graph kernel network for partial differential equations, 2020.
[4] Z. Li, N. Kovachki, K. Azizzadenesheli, B. Liu, K. Bhattacharya, A. Stuart, and A. Anandkumar. Fourier neural operator for parametric partial differential equations, 2021.
[5] L. Lu, P. Jin, and G. E. Karniadakis. DeepONet: Learning nonlinear operators for identifying differential equations based on the universal approximation theorem of operators, 2020.
[6] F. Pichi, F. Ballarin, G. Rozza, and J. S. Hesthaven. An artificial neural network approach to bifurcating phenomena in computational fluid dynamics, 2021.
[7] M. Raissi, P. Perdikaris, and G. E. Karniadakis. Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational Physics, 378:686-707, 2019.
[8] N. V. Shah, M. Girfoglio, P. Quintela, G. Rozza, A. Lengomin, F. Ballarin, and P. Barral. Finite element based model order reduction for parametrized one-way coupled steady state linear thermomechanical problems, 2021.
[9] N. V. Shah, M. Girfoglio, and G. Rozza. Thermomechanical modelling for industrial applications, 2021.

Mold-steel heat flux estimation in Continuous Casting

by Umberto Morelli

In my first post on this blog, I gave a brief overview of the motivation and objectives of my PhD project. Briefly, we could summarize it as the estimation of the heat flux going from the solidifying steel to the mold during continuous casting. This heat flux is necessary in the process to cool down the liquid steel and trigger its solidification. As for the motivation, the reasons for doing this are basically to control the process and, eventually, to know what is happening inside the mold, which would otherwise be just a black box.

(credits: Danieli,

The first idea could be a hardware solution: we could equip the mold with some sensors on the “hot” face and measure the heat flux. Unfortunately, this is not possible. The hot face is not a place for sensors because of the critical temperature values and the abrasive effect of the sliding metal. However, casting molds are actually equipped with some kind of sensors.
In particular, they have thermocouples buried inside the mold plate a few centimeters inward with respect to the hot face. And these temperature measurements are gonna be the data of our problem.

So the situation is the following: we have a set of temperature measurements and a mold thermal model (partial differential equation with some boundary conditions including the boundary heat flux). We want to combine these two ingredients to estimate the boundary heat flux. If we call the heat flux g, we have that the thermal model is a function F(g) that, given a heat flux g, provides the mold temperature, i.e.

g → F(g) → T

where T is the mold temperature field. This is a classical “forward” problem, in which we are given a boundary condition and we provide, by solving some kind of model equations, the state of a system. However, our situation is somewhat the opposite. We know the temperature at some points of our domain, Ť, and want to know g. So we have something like

Ť → ? → g → F(g) → T

Then we are looking for an operator that allows us to go from Ť to g.
To do this we use an optimization approach. We decide that we are looking for a ĝ that minimizes the difference between the measured temperature and the computed one, i.e.

Find ĝ such that (F(ĝ) – Ť) is minimized

There are several tools to solve such a minimization problem. However, all of them require several computations of the forward problem F(ĝ), which in our case is computationally expensive and requires several minutes. Since the objective is to control the casting process, we want to have the solution in “real-time”, i.e. in a time short enough (approximately one second) to allow us to detect problems and react in time.

To find the solution of this problem in real-time, we exploit the affinity of the direct problem, i.e. the fact that the temperature depends on g in an affine way. In fact, assume that g can be parameterized as

g = a1φ1+ a2φ2+…+aNφN

for some basis functions φ1, φ2,…,φN. Then, the solution of the direct problem is given by

T = a1F(φ1)+a2F(φ2)+…+aNF(φN)+C

where C is something that we must add to obtain the correct solution.
Skipping all the mathematical details, this affinity allows us to find the minimizer of (F(ĝ) – Ť) in a very simple way, i.e. by solving a linear system having the dimension of the parameterization of the heat flux g. Moreover, to assemble the linear system, we only have to compute the direct problem for all the parameterization basis F(φ1), F(φ2),..,F(φN).

These last computations are expensive but can be done once and for all at the beginning, during an offline phase. In the very fast online phase, we then only solve the small linear system, which is an inexpensive computation.
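
The offline-online idea can be sketched in a few lines of Python. Everything concrete here is an assumption made for illustration: the expensive thermal solver is replaced by a hypothetical random affine map `forward`, the basis functions are cosines, and the constant term C is realized as the solution for zero heat flux, which is subtracted from each basis solve. It is a sketch of the principle, not the method of the publication.

```python
import numpy as np

rng = np.random.default_rng(1)
n_flux, n_meas, N = 40, 12, 5   # flux grid points, thermocouples, basis size

# Hypothetical stand-in for the expensive thermal solver: an affine map from a
# discretized heat flux g to the temperatures at the thermocouples.
K = rng.normal(size=(n_meas, n_flux))
T0 = rng.normal(size=n_meas)    # temperatures obtained with zero heat flux

def forward(g):
    return K @ g + T0

# Basis functions phi_1..phi_N on the hot face (assumed: low-order cosines).
x = np.linspace(0.0, 1.0, n_flux)
Phi = np.stack([np.cos(i * np.pi * x) for i in range(N)], axis=1)

# Offline phase: N + 1 expensive forward solves, done once.
C = forward(np.zeros(n_flux))
Theta = np.stack([forward(Phi[:, i]) - C for i in range(N)], axis=1)
A = Theta.T @ Theta             # small N x N matrix of the linear system

def estimate_flux(T_meas):
    """Online phase: solve the N x N least-squares system for the weights a."""
    rhs = Theta.T @ (T_meas - C)
    a = np.linalg.solve(A, rhs)
    return a, Phi @ a           # weights and reconstructed heat flux

# Synthetic check: generate measurements from known weights and recover them.
a_true = np.array([2.0, -1.0, 0.5, 0.0, 0.3])
T_meas = forward(Phi @ a_true)
a_hat, g_hat = estimate_flux(T_meas)
print(np.round(a_hat, 6))
```

The online cost depends only on N and on the number of thermocouples, not on the size of the thermal model, which is what makes a response time of about one second plausible.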

In this case too, the offline-online decomposition, which is the core of model order reduction, allows us to achieve real-time performance in the solution of a problem that in the classical setting is very time consuming. I invite the interested reader to have a look at our recent publication on the subject at the link.

Would you let AI make your resource scheduling decisions?

The US Air Force, PwC, and Deutsche Bahn have recently employed AI to help navigate challenging decision-making in resource scheduling. Moreover, 85% of the Fortune 500 companies have been doing so for many years now.

The most complicated resource scheduling problems involve scarce and expensive resources. They also require navigating through murky waters of unclear business objectives and nontransparent agendas.

“We have too many options!”

I recently met a director of a large company, who struggled with implementing a decision automation system for its planning departments. In their work, they needed to coordinate the plans for three different types of resources. Hence, there were millions of possible assignment options. Moreover, the plan for each resource originated from a different team. As a result, it took a lot of coordination to arrive at an acceptable global solution.

Many scheduling managers deal with such situations every day. For example, in order to make the complexity manageable, they divide the planning processes into waterfall-like stages. As a result, siloed planning teams emerge which frequently lose sight of the global objective of the planning (if there is one). Instead, they focus on polishing their own plan, frequently at the expense of others.

What to do next?

After months of effort, the best the director could achieve was a simple rule-based system. Moreover, when it was implemented, the planners would reject its suggestions. They would rather “work around the system” than “work with the system”.

To help him move ahead, I suggested that he answer two questions:

  1. What would we like to achieve with the new system?
  2. Which paradigm of decision support would work best for us?

The answer to the first of these questions depends on the context of each individual project. For example, we could aim at maximizing profitability, minimizing costs, or maintaining high customer service quality. In most contexts, it would be a combination of all of these. I will devote a separate post just to this topic.

Choosing the right paradigm

To answer the second question, we need to know more about the paradigms of decision support systems and the criteria in which they work best.

Decision support paradigms

Recently, Gartner suggested a very useful classification of decision support/automation paradigms. They differ in the degree to which humans and machines are involved.

A comparison of decision support systems. Source: Gartner/LinkedIn.

Decision automation systems take advantage of predictive/prescriptive analytics tools to take an automated decision in a fast, scalable, and consistent manner. In particular, human involvement is low – it is usually limited to setting the decision guidelines at the beginning of the system’s operations or intervention in exceptional cases.

Decision augmentation systems try to extract the best from both the human and the machine actor. In such systems, machines use advanced techniques to come up with numerous suggestions and recommendations. The human decision-maker then reviews and validates them.

Decision support systems rely on the knowledge, experience, and intuition of the human decision-maker. In turn, the role of the machine in such a system is limited to providing visualizations, alerts, and other forms of insight required in the decision-making process.

Selection criteria

To decide which of the paradigms to use, Gartner suggests assessing the decision-making context against two criteria:

  1. Time – the period between observing the need for a decision and the decision itself. It could be anything from microseconds to months or years.
  2. Complexity – following the so-called Cynefin framework, we could classify the decisions as:
    • Simple, which boil down to a clear cause and effect relationship,
    • Complicated, which require expertise or analysis to identify cause and effect,
    • Complex, which involve multiple relationships and interdependencies,
    • Chaotic, which have unknown causes and effects, with unclear or dynamic interdependencies.

We can look up our decision-making context on the chart below.

Decision Assessment Model by Gartner.

Using AI for resource scheduling decisions

This brings us back to the initial question – should we let AI decide about our resource schedules?

Decisions concerning resource schedules take many forms and have varying degrees of complexity. For example, they range from choosing a vehicle from the available pool for the next transportation task (simpler) to determining the requirements for the new vehicle fleet (more complicated and sometimes complex).

These decisions also have varying degrees of urgency – from days or weeks for long-term plans to minutes for emergency rescheduling. That places many such decisions in the Decision Augmentation zone, although longer-term planning may require more involvement from human decision-makers.

As I mentioned in the beginning, 85% of the Fortune 500 companies use some kind of AI support in decision-making.

This entry was originally published on We invite you to bookmark this address for more content on mathematical optimization and AI in planning!

My Ph.D. Life

by Onkar Jadhav

I adore mathematics. There is mathematics everywhere: from how much pocket money I should save per day to buy that expensive toy as a kid to solving real-world problems as an adult. It is not just that mathematics helps solve problems; it is fun and intriguing as well. Consider Pi, the famous irrational number: its decimal expansion keeps on going, forever, without ever repeating. If, as is widely conjectured but not yet proven, Pi is a normal number, this string of decimals contains every possible number combination: our birthdates, mobile numbers, bank account numbers, etc. are all in there somewhere. If you converted these decimals into letters, they would encompass your entire life story from beginning to end, everything we ever say or do. Everything: all contained in the ratio of the circumference to the diameter of a circle. Now, isn’t that something? Anyway, my point is that I was always intrigued and fascinated by mathematics and wanted to apply my computational science background to solve real industrial problems.
Maybe, that’s the reason my luck brought me to the forefront of cutting-edge research funded by the MSCA, the EU’s flagship funding program. The project rendered me a platform to explore one of the intriguing and challenging finance industry problems. I commenced my work at MathConsult GmbH, located in Linz, Austria. I worked on my problem one step at a time by learning new financial concepts, literature surveys for new model reduction techniques, and new optimization techniques.
During my first year, I traveled around Europe to attend planned courses, workshops, etc., and side by side, I had to concentrate on my research as well. It was a bit challenging, although never boring. I learned a lot during those courses and had the amazing opportunity to meet different people while traveling through the breathtakingly beautiful cities of Europe. It is safe to say that I also learned a great deal about time management, as I had to juggle efficiently between conferences, workshops and the other duties that come with being a researcher. And then there was Covid. Honestly, I miss it now!
In my second year, I started my work at the Technical University of Berlin, in a new city sprawling with life, culture, art, music, food, and, well, Maths. This was just the motivation I needed. Berlin is a vibrant city and has everything for everyone. However, I couldn’t enjoy it for long, as the pandemic hit us all hard. Consequently, the second year was more relaxed in terms of traveling, and I could concentrate more on my research. I got some excellent results, which were recently published in our manuscript [1]. Our method, tested on industrial data of different financial instruments, shows excellent results and has potential applications in historical and Monte-Carlo value-at-risk calculations.
All in all, I am really enjoying my Ph.D. life: that feeling when your code works, rush for deadlines, and especially the importance of caffeine!

[1] A. Binder, O. Jadhav, and V. Mehrmann. Model order reduction for the simulation of parametric interest rate models in financial risk analysis. J. Math. Industry, 11:1–34, 2021.

The ESR Group and Onkar pointing his finger

Research from home, a practical guide

Things have changed, to say the least. As an MSCA Fellow, the first years of my research project consisted of travelling through Europe, with seminars, training courses and conferences spread over multiple countries. But then COVID-19 struck and we’ve been confined to our homes for more than a year now. The funny thing about doing a PhD in the ROMSOC project is that most of us are quite independent and don’t have an urgent necessity to work in an office.

With a charged laptop and noise cancelling headphones, I can work wherever, whenever. This turned out to not be completely true though. There seems to be a reason that we never abandoned offices. At first this newfound freedom felt great. I could go for a run whenever I felt like it and be back at work with no lost time. If I wanted to make my favourite slow cooked dish, rendang, I would just prepare it in the morning and stir whilst waiting for a simulation run. Technically, as mathematicians we were already quite far from following a dress-code, but I feel that writing a paper in your pyjamas at the faculty might be a bridge too far. But when you work, eat and relax in the same room for months on end, the lines do start to blur. Therefore, some caution is needed to keep your spirits up and those stress levels down.

Going further into the lockdown, I found my schedule drifting off: waking up a little bit later each day, consequently working a bit longer, then not working that bit longer and feeling under-productive. Although I had almost complete freedom in choosing when I would do whatever I wanted, I am still a mathematician. As mathematicians, we are always thinking in structures and how to optimise them. More and more I started to experiment with structuring my day and kept the parts that I liked in place. I ended up rising at 7, starting work at 7:30, taking an hour of exercise and lunch at 12, then working till 6 and making sure to relax in the evening, to go right back at it again the next day. In the end this is not that different from a regular office schedule; it seems that I reinvented the wheel.

I hope you enjoyed this instalment of our ROMSOC blog. Please find the references below to see the results we have produced since my last blog post [1, 2]. To follow the progress of me and my fellow researchers, please keep an eye on the ROMSOC website and follow us on Twitter and Facebook!

Marcus Bannenberg

[1] MWFM Bannenberg, A Ciccazzo, and M Günther. Reduced order multirate schemes for coupled differential-algebraic systems. 2021.
[2] MWFM Bannenberg, F Kasolis, M Günther, and M Clemens. Maximum entropy snapshot sampling for reduced basis modelling. Preprint, 2020.

MSCA Fellow and the Lockdown

In order to fully understand what it means to be an MSCA Fellow in the Lockdown, it is important to understand what it meant to be an MSCA Fellow before. Having a Ph.D. position in an MSCA funded project is considered as some kind of “special opportunity” among the young scientists, but almost no-one who hasn’t experienced it can tell exactly why, and what makes it so special.

Coming from a pure science background and from an economically challenging region, I didn’t have much travel experience or many inter-field connections before my Ph.D. program in a European Industrial Doctorate. During an interview, my supervisor warned me that I would have to travel all around Europe to attend the different courses and interact with professionals from different applied or industrial backgrounds.

My first reaction was a polite chuckle: how in the world could such a thing require a warning, when it sounded like an enormous advantage of the project?! Little did I know that this would be the beginning of eighteen months of my life which were simultaneously some of the most amazing, but also some of the most stressful.

For those first eighteen months of the project, I felt like an anchor-less ship, always in motion. Almost every month, I had to travel somewhere. Sometimes it was a one-week course or training, sometimes a conference or a workshop, sometimes it was a visit back home to friends and family, sometimes it was a complete relocation from one country to another, but there was always somewhere to go, some travel to plan.

And yet, I was not alone. At all those courses I was meeting ten other fellows who had the same experience, the same amount of travel, the same feeling of constant motion. The same experience of being in so many towns and hearing so many languages, that even remembering which word was “hello” in a particular place was becoming a challenge. This is what was making us so “special”, this is what was simultaneously our curse and our blessing.

The world became smaller for us. Everything in Europe was within arm’s length, just one step to take. We had the feeling of having everything within reach. There was no place we couldn’t go to, no knowledge or professional opinion that we could need and not be able to obtain. Everything was accessible!

And then suddenly came the pandemic and the lockdown. If our favorite food is not sold in our closest supermarkets, it is not accessible anymore. If we are not in the same town as our family, we can’t see them for months. Even if we live next to our offices and universities, they are out of reach. Even if we live next to our favorite leisure places, their doors are shut tight. We might not be disconnected, but we are isolated!

So how does an MSCA fellow feel under the lockdown? I think like an explorer whose ship is damaged in the storm and who is stuck in a remote village, who knows that other explorers are still out there, fighting the storm, and who is waiting for them to come to aid. Just like we are waiting for our fellow scientists, doctors, and pharmacists, who have been on the front lines of the battle against the pandemic, to defeat it and bring the world within reach once again!

About the Author

Giorgi Rukhaia is an early-stage researcher (ESR3) in the ROMSOC project. He is affiliated with the Institut national de recherche en informatique et en automatique (INRIA) in Paris, France and working in collaboration with Signify (former Philips Lighting), the Netherlands. In his research project he is working on an Optimal Transportation computational approach and the development of numerical algorithms for the inverse free-form optical surfaces design using extended sources.

The curious case of modeling acoustics in fluid flow

The significant contributions and problems posed by peers over time have all contributed to the development of scientific knowledge. Being curious about the history of scientific evolution can generally help develop a new organic perspective. It provides an invaluable lesson by leading us through the reasoning and choices along the path of scientific progress. The last couple of blog posts (viz., Origin and need for coupled models in acoustics – ROMSOC and Mathematical modelling acoustic measurements – ROMSOC) have shed light on the motivation and approach followed in coupled systems modelling in the acoustics industry, while this post intends to delve deep into the world of modeling acoustics in fluid flow. Since describing fluid flow models coupled with acoustics in ‘simpler’ terms is definitely hard, I assume the reader is somewhat familiar with fluid flow models.

Almost always, fluid flow in mathematical models begins with the Navier-Stokes (NS) equations and involves some reformulation or restructuring and simplification, with or without a change in the frame of reference. Lighthill and Curle (1950s-60s) pioneered the development in this regard, especially in the field of aeroacoustics, providing models for jet noise prediction with the NS equations rearranged in the form of a linear wave equation for a fluid medium at rest. The model primarily assumes an equivalent-source analogy, suggesting that the source term may always be modeled approximately. Subsequent works by Ffowcs Williams, Hawkings, Phillips and Lilley extended this approach by restructuring the NS equations into a more general form of convective wave equation. These models could help predict sound generation in simple geometries and flow profiles, but complex cases required by the industry were still out of reach. A multitude of empirical and semi-empirical approaches for specific applications followed, although they failed to provide a holistic analogy for flow acoustics.
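
For reference, Lighthill's analogy is usually written as an inhomogeneous wave equation for the density fluctuation. The notation below is an assumption of this note and not taken from the post: ρ' = ρ − ρ0 and p' = p − p0 are fluctuations about the fluid at rest, c0 is the ambient speed of sound, u_i the velocity components and τ_ij the viscous stress tensor.

```latex
\frac{\partial^2 \rho'}{\partial t^2} - c_0^2 \nabla^2 \rho'
  = \frac{\partial^2 T_{ij}}{\partial x_i \, \partial x_j},
\qquad
T_{ij} = \rho u_i u_j + \left(p' - c_0^2 \rho'\right)\delta_{ij} - \tau_{ij}
```

The left-hand side is the wave operator for a medium at rest, and the Lighthill stress tensor T_ij on the right-hand side plays the role of the equivalent source that has to be modeled.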

Another favored modelling approach was to simplify the much harsher NS equations and to try to tame a simpler set of equations, hoping that they cater to the complex cases. The Euler equations, for one, are simplified NS equations with the effects of viscosity and heat transfer omitted. Convenient linear approximations of the density, pressure and velocity fields around the mean flow give the Linearized Euler Equations (LEE), an ironic name since the Euler equations themselves are non-linear. These equations received interest especially in fluid-structure interaction problems, since they could capture refraction effects and reflections at solid boundaries (especially relevant for the aero-industry); however, they are often used for 2D geometries and behave well only for low-amplitude acoustic perturbations. The model is still sensitive to small changes, and there are further simplified LEE models, e.g. the Bogey-Bailly-Juve model, which ignores some more terms to make the model simpler.

I would like to shed light on one of the lesser-known models – Galbrun’s equation. This less-used model utilizes a different form of the NS equations within a mixed frame of reference. It assumes a known stationary underlying Eulerian flow and models the Lagrangian fluctuations of acoustics. The displacement formulation of it is given as,
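
A form commonly quoted in the literature is reproduced here under assumed notation: w denotes the Lagrangian displacement perturbation, v0 the velocity of the underlying flow, ρ0, c0 and p0 its density, speed of sound and pressure, and the right-hand side is taken as homogeneous (no sources).

```latex
\rho_0 \frac{D^2 \mathbf{w}}{Dt^2}
  - \nabla\!\left(\rho_0 c_0^2 \, \nabla\cdot\mathbf{w}\right)
  + (\nabla\cdot\mathbf{w})\,\nabla p_0
  - (\nabla\mathbf{w})^{T}\,\nabla p_0 = 0,
\qquad
\frac{D}{Dt} = \frac{\partial}{\partial t} + \mathbf{v}_0\cdot\nabla
```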

Here, ρ0 and c0 are the density and speed of sound of the underlying flow with a pressure field p0, while D/Dt is the material derivative. The model offers advantages in coupling problems, especially the compatibility of boundary conditions using Lagrangian terms, and offers an option to modularize the problems of fluid flow and the acoustic fluctuations. It also comes with a few setbacks: the equation is notoriously unstable with traditional finite elements and needs ‘regularization’ for effective use, as was demonstrated by Bonnet-Ben Dhia and others [REFs]. Addressing these challenges could provide a powerful tool for integrating acoustics in a flowing medium, helping develop better noise predictions in fluid flow, and the ROMSOC project hopes to address this for real-world applications by providing open-access knowledge and tools. Keep following this blog for more information on the diverse set of mathematical technologies that ROMSOC hopes to develop for industry.



Howe, M. (1998). Acoustics of Fluid-Structure Interactions (Cambridge Monographs on Mechanics). Cambridge: Cambridge University Press. doi:10.1017/CBO9780511662898

Goldstein, M. E. (2003). A generalized acoustic analogy. Journal of Fluid Mechanics, 488, 315–333.

Billson, M., Eriksson, L.-E., & Davidson, L. (2005). Acoustic Source Terms for the Linearized Euler Equations in Conservative Form. AIAA Journal, 43(4), 752–759.

Maeder, M. (2020). 90 Years of Galbrun’s Equation: An Unusual Formulation for Aeroacoustics and Hydroacoustics in Terms of the Lagrangian Displacement. 39.



About the author:

Ashwin Nayak is an early-stage researcher (ESR2) in the Reduced Order Modeling Simulation Optimization and Coupled methods (ROMSOC) project. He is affiliated with the Technological Institute of Industrial Mathematics (ITMATI), Spain and working in collaboration with Microflown Technologies B.V., Netherlands. His research aims to develop coupled mathematical models to enable acoustical measurements in fluid-porous media interactions.

What makes a solver a good one?

Solving large linear systems in real-time applications

In many practical examples solving large linear systems in real-time is the key issue. This is the case, e.g., for our ROMSOC project “Real Time Computing Methods for Adaptive Optics”, where we are dealing with a rapidly changing atmosphere of the earth. Real-time usually refers to a time frame within milliseconds. For our specific test configuration, it is 2 ms.

Before talking about efficient solvers for linear systems, we need to define what a linear system is and what large means. A linear system is a system of linear equations Ax = b, where A is a matrix of size n × n, b is the right-hand side vector of size n and x is the desired solution. In this context, large refers to an n of up to several hundred thousand. The larger the size of the matrix, the harder the system is to solve in real-time. Note that in our setting we are dealing with a dense square matrix A.

Now that we have stated the mathematical problem formulation, we need to fix some performance criteria to compare the different ways to solve a linear system. What does it mean to have a good or efficient solver? There are plenty of ways to solve such systems, either directly or iteratively. However, there are some quantities that can be checked in advance, no matter what specific solver you want to analyze, in order to be able to choose a suitable one. In the end, efficient always refers somehow to the run-time of the algorithm. But the run-time heavily depends on the hardware and the implementation, which is usually not available at a preliminary stage. Hence, it would be nice to have some criteria that can be checked in advance, before implementing the whole method.

In mathematics, solvers are often compared using the computational complexity, which simply counts the number of operations needed to solve the problem. Usually, this quantity is given asymptotically in terms of n. This becomes clearer if we look at some examples: adding two vectors of length n is of linear complexity O(n), because we need to perform n additions to obtain the result. A matrix-vector multiplication is of quadratic complexity O(n²). However, this criterion does not take into account whether a matrix is sparse or dense. Note that we say a matrix is sparse if it has a considerable number of zero entries. A more precise measure in this case is the number of floating point operations (FLOPs). When performing a dense matrix-vector multiplication, we require 2n² − n FLOPs to get the solution, whereas a sparse matrix-vector multiplication needs only about 2·nnz − n FLOPs. Here, nnz refers to the number of non-zero entries of A.

The above two quantities are suitable for a first, theoretical performance analysis. A next step is to look at the parallelization possibilities of the method. We call a method parallelizable if the whole algorithm, or at least its main parts, can be executed in parallel, i.e., without a dependency on intermediate results. The big advantage here is that the run-time is considerably reduced while the number of FLOPs stays the same. Most real-time applications require massively parallel algorithms in order to be able to solve the problem in the desired time frame. The efficiency of parallelization depends on the hardware in use. GPUs have an enormous computational throughput compared to CPUs and perform extremely well for highly parallelizable algorithms.

Some applications can also benefit a lot from a so-called matrix-free representation, i.e., the matrix A is formulated as a linear function rather than as a sparse or dense matrix. Let us illustrate this on the example of the matrix A = vvᵀ.
Saving the matrix requires n2 units of memory and performing a matrix vector multiplication with a vector x needs 2n2-n FLOPs. However, if we save only the vector v instead of the whole matrix this requires just n units of storage and multiplying x first by vT and then by v needs only 3n FLOPs. In this example we used the term memory usage, which denotes the units of memory required to store a matrix or vector. This is another important criterion, since the memory of hardware is limited and storing matrices can become quite memory intensive.
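The rank-one example can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the function names (dense_rank_one, dense_matvec, matrix_free_matvec, flops_dense, flops_matrix_free) and the test vectors are made up for this sketch, and the FLOP formulas count 2n - 1 operations for the inner product plus n for the scaling.

```python
def dense_rank_one(v):
    """Assemble A = v v^T explicitly; this stores n*n numbers."""
    return [[vi * vj for vj in v] for vi in v]


def dense_matvec(A, x):
    """Dense product A x: one inner product of length n per row."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def matrix_free_matvec(v, x):
    """Apply A = v v^T without forming A: s = v^T x, then s * v."""
    s = sum(vi * xi for vi, xi in zip(v, x))
    return [s * vi for vi in v]


def flops_dense(n):
    """n multiplications and n - 1 additions for each of the n rows."""
    return 2 * n * n - n


def flops_matrix_free(n):
    """2n - 1 FLOPs for v^T x plus n multiplications for the scaling."""
    return 3 * n - 1


# Hypothetical example data, chosen only for illustration.
v = [1.0, 2.0, 3.0]
x = [4.0, 5.0, 6.0]

y_dense = dense_matvec(dense_rank_one(v), x)
y_free = matrix_free_matvec(v, x)

print(y_dense, y_free)
print(flops_dense(len(v)), flops_matrix_free(len(v)))
```

Both routes give the same vector, but the matrix-free version never stores more than the n entries of v, and its operation count grows linearly rather than quadratically in n.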

Altogether, we have listed five criteria by which you can judge how well a solver is suited to your application: computational complexity, number of FLOPs, parallelizability, the possibility of a matrix-free representation, and memory usage. Nevertheless, in the end you have to implement and optimize the algorithm for your specific hardware, which can become a crucial and time-consuming task.

About the author

My name is Bernadett Stadler and I studied Industrial Mathematics at the Johannes Kepler University in Linz. Since May 2018 I have been part of the ROMSOC program as a PhD student. I am involved in a cooperation between the university in Linz and the company Microgate in Bolzano. My research addresses the improvement of the image quality of extremely large telescopes. The goal is to enable a sharper view of more distant celestial objects.

Optimal shape design of air ducts

Shape optimization has proved indispensable, from both a theoretical and a practical point of view, in many real-world applications [1, 2, 4], such as drag reduction of aircraft and the design of pipelines, bridges, engine components, implants for blood vessels, etc. In many optimal design problems in engineering or the biomedical sciences, the optimal shape of a component or device is determined by minimizing suitable objectives that depend on a fluid flow. In the automotive industry in particular, the optimal shape design of several components of combustion engines, such as air ducts, pistons, crankshafts, valves, etc., is crucial for optimizing their performance. Additionally, the possible designs are restricted by geometric constraints. To describe this problem mathematically, we use the stationary Navier-Stokes equations (NSEs)

to model the air flow inside a tube illustrated by the following figure:

Here f and f_in are the given source density and inflow profile, u and p are the velocity and the kinematic pressure, and n is the unit outward normal vector. The uniformity of the flow leaving the outlet is an important design criterion for automotive air ducts, since it enhances the efficiency of distributing the air flow [2]. To achieve this criterion, we minimize a cost functional capturing the distance between the velocity (or only its normal component) and a given desired velocity at the outlet. Another quantity engineers want to minimize is the power dissipated by air ducts (and by fluid dynamic devices in general) [2]. This dissipated power can be computed as the net inward flux of energy through the boundary of the considered tube. Taking both objectives into account, we consider a mixed cost functional formed as a convex combination of the two cost functionals above. A typical PDE-constrained shape optimization problem [1, 4] then consists of finding an admissible shape that minimizes the mixed cost functional subject to the given Navier-Stokes equations. Once the optimization problem is set up, the optimal shape of the tube can be analyzed with suitable methods based on shape derivatives. Furthermore, flows at large Reynolds numbers are known to be unstable and computationally challenging. We will explore the world of turbulence models [3] to enhance our configuration for the shape optimization problem in the next blog post.
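For orientation, the setting described above can be written down in one common form. The following is only a sketch under assumptions not stated in the text: the kinematic viscosity \nu, the domain \Omega with boundary parts \Gamma_in, \Gamma_wall and \Gamma_out, the desired outlet velocity u_d, the weight \gamma, and the specific choice of boundary conditions (no-slip walls and a do-nothing outlet) are all introduced here for illustration. The dissipated-power functional follows the form used in [2].

```latex
\begin{aligned}
-\nu \Delta u + (u \cdot \nabla) u + \nabla p &= f
    && \text{in } \Omega, \\
\nabla \cdot u &= 0
    && \text{in } \Omega, \\
u &= f_{\mathrm{in}}
    && \text{on } \Gamma_{\mathrm{in}}, \\
u &= 0
    && \text{on } \Gamma_{\mathrm{wall}}, \\
-\nu \, \partial_n u + p \, n &= 0
    && \text{on } \Gamma_{\mathrm{out}},
\end{aligned}
\qquad
\begin{aligned}
J_1(\Omega) &= \frac{1}{2} \int_{\Gamma_{\mathrm{out}}} |u - u_d|^2 \, ds, \\
J_2(\Omega) &= - \int_{\partial \Omega}
    \Big( p + \tfrac{1}{2} |u|^2 \Big) \, u \cdot n \, ds, \\
J(\Omega) &= (1 - \gamma) \, J_1(\Omega) + \gamma \, J_2(\Omega),
    \qquad \gamma \in [0, 1].
\end{aligned}
```

In this notation J_1 measures outlet uniformity, J_2 is the net inward flux of energy through the boundary, and the shape optimization problem is to minimize J over admissible shapes \Omega subject to the flow equations.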


  1. Mohammadi, Bijan; Pironneau, Olivier. Applied shape optimization for fluids. Second edition. Numerical Mathematics and Scientific Computation. Oxford University Press, Oxford, 2010. xiv+277 pp. ISBN: 978-0-19-954690-9.
  2. Othmer, C. A continuous adjoint formulation for the computation of topological and surface sensitivities of ducted flows. Internat. J. Numer. Methods Fluids 58 (2008), no. 8, 861–877.
  3. Pope, Stephen B. Turbulent flows. Cambridge University Press, Cambridge, 2000. xxxiv+771 pp. ISBN: 0-521-59886-9.
  4. Sokolowski, Jan; Zolesio, Jean-Paul. Introduction to shape optimization. Shape sensitivity analysis. Springer Series in Computational Mathematics, 16. Springer-Verlag, Berlin, 1992. ii+250 pp. ISBN: 3-540-54177-2.

About the author

Hong Nguyen is an Early-Stage Researcher within the ROMSOC project and a Ph.D. student at the Weierstrass Institute for Applied Analysis and Stochastics (WIAS) in Berlin. He is working in collaboration with Math.Tec GmbH in Vienna (Austria) on optimal shape design of air ducts in combustion engines.

Artificial Neural Network and Data-driven techniques: Scientific Computing in the era of emerging technologies

by Nirav Vasant Shah

Rapidly changing technological and scientific environments pose challenges that require “adaptability” of our knowledge to the skills of the future. One of the most important emerging technologies of recent times is Artificial Intelligence (AI). The Artificial Neural Network (ANN), a foundation of AI, has entered the field of scientific computing and is set to become a new paradigm in modeling and computation. The concept of the ANN is inspired by the massive, hierarchical neural network in the human brain. The salient feature of the ANN is its ability to mimic the analytical relationship between input and output. This is made possible by adjusting the (hyper)parameters of the ANN during the training procedure against known input and output pairs. The use of ANNs is giving rise to a new field called data-driven techniques. The aim of data-driven techniques is to allow the system to learn and to train from the data rather than to create and solve a system of equations.
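The idea of adjusting parameters against known input and output pairs can be illustrated with the smallest possible network, a single linear neuron y = w*x + b trained by gradient descent on the mean squared error. This is a toy sketch, not a method taken from the text: the data, the function name train, the learning rate and the number of steps are all assumptions chosen for illustration.

```python
# Known input/output pairs; the hidden relationship is y = 2x + 1.
xs = [0.0, 0.25, 0.5, 0.75, 1.0]
ys = [2.0 * x + 1.0 for x in xs]


def train(xs, ys, learning_rate=0.5, steps=2000):
    """Fit a single linear neuron y = w*x + b by gradient descent on the MSE."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Prediction error of the current model on every training pair.
        errors = [(w * x + b) - y for x, y in zip(xs, ys)]
        # Gradients of the mean squared error with respect to w and b.
        grad_w = 2.0 / n * sum(e * x for e, x in zip(errors, xs))
        grad_b = 2.0 / n * sum(errors)
        # Move the parameters against the gradient.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


w, b = train(xs, ys)
print(w, b)
```

The model is never told the formula y = 2x + 1; it recovers values of w and b close to 2 and 1 purely from the data. A real ANN stacks many such units with non-linear activation functions, but the training loop follows the same pattern.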

Neural network schematic

Data-driven techniques have brought alternative approaches to the field of numerical analysis. Classical numerical analysis methods usually require solving a system of equations based on an alternative form of the governing physical equations. In data-driven techniques, the system instead aims to compute the solution field as the minimizer of a loss function. As an example, one interesting variant of data-driven techniques is Physics Informed Deep Learning, described in [3], [4]. In this approach, the ANN is trained to minimize the residual of the governing conservation equation. It was demonstrated that such an ANN can be used either to compute the solution field or to learn the parameters of the system. These successful results showed that Physics Informed Deep Learning can become an alternative to classical methods.
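The loss function used in this approach can be sketched as follows, in the spirit of [3]. The notation is an assumption made here for illustration: u_\theta denotes the network with parameters \theta, \mathcal{N} the (possibly non-linear) spatial differential operator of the governing equation u_t + \mathcal{N}[u] = 0, (t_u^i, x_u^i, u^i) the N_u data points from initial and boundary conditions, and (t_f^j, x_f^j) the N_f collocation points at which the residual is evaluated.

```latex
\begin{aligned}
f_\theta(t, x) &:= \partial_t u_\theta(t, x) + \mathcal{N}[u_\theta](t, x), \\
\mathcal{L}(\theta) &=
    \frac{1}{N_u} \sum_{i=1}^{N_u}
        \big| u_\theta(t_u^i, x_u^i) - u^i \big|^2
    + \frac{1}{N_f} \sum_{j=1}^{N_f}
        \big| f_\theta(t_f^j, x_f^j) \big|^2 .
\end{aligned}
```

The first term fits the known data, the second penalizes the residual of the governing equation, so a minimizer of \mathcal{L} approximately satisfies both.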

In Model Order Reduction, another area of scientific computing, various data-driven approaches have been proposed as an alternative to projection-based approaches [2]. The biggest difference between projection-based and data-driven approaches is the non-intrusive nature of the latter. Non-intrusive methods are preferable in the context of industrial applications. In addition, they avoid assembling the system of equations, which contributes to a significant speedup of the reduced order model.

Despite the promising results and important advantages, data-driven approaches, in their current form, cannot become a fully reliable alternative to classical methods such as the Finite Element Method. The classical methods have a rigorous mathematical background and have been successfully applied to problems of varying degrees of complexity. Besides, as a recent SIAM News article [1] explained, deep learning methods face some major challenges in the context of scientific computing. For example, the generation of data and the training of an ANN can be very expensive, and the integration of deep learning libraries with data from other open-source computing libraries may remain a challenging task. It remains to be seen whether developments in mathematical theory, algorithmic efficiency and hardware capabilities will make data-driven techniques robust and more acceptable within the scientific computing community. For an Early Stage Researcher, this is a matter of clear professional choice: whether to invest in a quickly emerging field with many known unknowns, or to opt for classical approaches because of their reliability and the availability of references. It is a choice between laying the foundation for the skills of the future and following a seemingly sustainable and reliable career path.

[1] Aparna Chandramowlishwaran. Artificial intelligence and high-performance computing: The drivers of tomorrow’s science. SIAM News, October 2020.

[2] Jan S. Hesthaven and Stefano Ubbiali. Non-intrusive reduced order modeling of nonlinear problems using neural networks. Journal of Computational Physics, 363:55–78, 2018.

[3] Maziar Raissi, Paris Perdikaris, and George Em Karniadakis. Physics informed deep learning (part i): Data-driven solutions of nonlinear partial differential equations. arXiv preprint arXiv:1711.10561, 2017.

[4] Maziar Raissi, Paris Perdikaris, and George Em Karniadakis. Physics informed deep learning (part ii): Data-driven discovery of nonlinear partial differential equations. arXiv preprint arXiv:1711.10566, 2017.

About the author

Nirav Vasant Shah is an Early-Stage Researcher (ESR10) within the ROMSOC project. He is a PhD student at the Scuola Internazionale Superiore di Studi Avanzati di Trieste (SISSA) in Trieste (Italy). He is working in collaboration with ArcelorMittal, the world’s leading steel and mining company, in Asturias (Spain) and Technological Institute for Industrial Mathematics (ITMATI) in Santiago de Compostela (Spain) on the mathematical modelling of thermo-mechanical phenomena arising in blast furnace hearth with application of model reduction techniques.