“ChatGPT for spreadsheets” helps solve difficult engineering challenges quickly | MIT News

Many engineering challenges come down to the same headache: too many knobs to turn and too few chances to test them. Whether you’re tuning a power grid or designing a safer vehicle, each test can be expensive, and there may be hundreds of variables that could matter.
Consider car safety design. Engineers must integrate thousands of components, and many design choices can affect how a car performs in a crash. Classic optimization tools can start to struggle when searching for the best combination.
MIT researchers have developed a new approach that rethinks how a classic method, known as Bayesian optimization, can be used to solve problems with hundreds of variables. In tests on realistic engineering-style benchmarks, such as power system optimization, the approach found top solutions 10 to 100 times faster than commonly used methods.
Their technique uses a foundation model trained on tabular data that automatically identifies the variables that matter most for improving performance, repeating the process to home in on better and better solutions. Foundation models are large artificial intelligence systems trained on vast, general datasets. This allows them to adapt to different applications.
The researchers’ tabular foundation model does not need to be constantly retrained as it works toward a solution, which increases the efficiency of the optimization process. The technique also delivers greater speedups on more complex problems, so it could be particularly useful for applications such as materials development or drug discovery.
“Modern AI and machine learning models can fundamentally change the way engineers and scientists build complex systems. We have come up with a single algorithm that can not only solve high-dimensional problems, but is also reusable, so it can be applied to many problems without the need to start everything from scratch,” said Rosen Yu, a graduate student in computational science and engineering and the lead author of the paper.
Yu was joined on the paper by Cyril Picard, a former MIT postdoc and research scientist, and Faez Ahmed, an associate professor of mechanical engineering and a core member of the MIT Center for Computational Science and Engineering. The research will be presented at the International Conference on Learning Representations.
Improving a proven method
When scientists want to solve a multifaceted problem but have expensive ways to evaluate success, such as car crash testing to determine how good each design is, they often use a tried-and-true method called Bayesian optimization. This iterative approach finds the best configuration of a complex system by building a surrogate model that helps estimate what should be tested next while considering the uncertainty of its predictions.
But the surrogate model must be retrained after each iteration, which can quickly become computationally inefficient when the space of possible solutions is too large. In addition, scientists need to build a new model from scratch whenever they want to deal with a different situation.
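The loop described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the researchers' implementation: the surrogate is a small one-dimensional Gaussian process written from scratch, the objective `expensive_test` is a hypothetical stand-in for a costly evaluation, and the "upper confidence bound" rule is just one common way to balance predicted performance against uncertainty. Note that the surrogate is rebuilt from all the data at every iteration, which is the cost the paragraph above refers to.

```python
import math

def rbf(a, b, length=0.2):
    """Squared-exponential kernel between two scalar inputs."""
    return math.exp(-0.5 * ((a - b) / length) ** 2)

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def gp_posterior(xs, ys, x_new, noise=1e-6):
    """Posterior mean and standard deviation of a zero-mean GP at x_new."""
    K = [[rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(xs)]
         for i, a in enumerate(xs)]
    k_star = [rbf(a, x_new) for a in xs]
    alpha = solve(K, ys)       # K^-1 y
    v = solve(K, k_star)       # K^-1 k*
    mean = sum(k * a for k, a in zip(k_star, alpha))
    var = rbf(x_new, x_new) - sum(k * w for k, w in zip(k_star, v))
    return mean, math.sqrt(max(var, 0.0))

def expensive_test(x):
    """Hypothetical costly evaluation (assumption); its best input is x = 0.7."""
    return -(x - 0.7) ** 2

def bayes_opt(n_iter=10, kappa=2.0):
    """Maximize expensive_test on [0, 1] with a GP surrogate and a UCB rule."""
    xs = [0.0, 1.0]                          # initial designs
    ys = [expensive_test(x) for x in xs]
    grid = [i / 100 for i in range(101)]     # candidate designs
    for _ in range(n_iter):
        def ucb(x):
            # The surrogate is recomputed from all data collected so far.
            mean, std = gp_posterior(xs, ys, x)
            return mean + kappa * std        # prediction plus uncertainty bonus
        x_next = max(grid, key=ucb)          # decide what to test next
        xs.append(x_next)
        ys.append(expensive_test(x_next))    # run the expensive evaluation
    best = max(range(len(xs)), key=lambda i: ys[i])
    return xs[best], ys[best]
```

Calling `bayes_opt()` returns the best design found and its score. In this toy setting there is a single variable; the difficulty the article describes arises when the same loop has to cope with hundreds of them.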
To address both of these shortcomings, the MIT researchers used a generative AI system known as a tabular foundation model as the surrogate model within a Bayesian optimization algorithm.
“A tabular foundation model is like a ChatGPT for spreadsheets. The input and output of these models are tabular data, which in the engineering domain is much more commonly seen and used than language,” said Yu.
Like large language models such as ChatGPT, Claude, and Gemini, the model is pretrained on a large amount of tabular data. This makes it well equipped to handle a range of prediction problems. Furthermore, the model can be deployed as is, without the need for any retraining.
To make their system more accurate and efficient for optimization, the researchers used a technique that allows the model to identify features of the design space that will have the greatest impact on the solution.
“A car may have 300 design criteria, but not all of them drive the best design when you’re trying to optimize certain safety parameters. Our algorithm can intelligently choose the most important features to focus on,” said Yu.
It does this by using the tabular foundation model to estimate which variables (or combinations of variables) most influence the outcome.
It then focuses the search on the variables that have the greatest effect instead of wasting time exploring everything equally. For example, if increasing the size of the front crumple zone significantly improved the car’s safety rating, that feature likely played a role in the improvement.
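The idea of ranking variables by influence and then searching only along the important ones can be illustrated with a short sketch. Everything here is an assumption made for illustration: `surrogate_predict` is a hypothetical stand-in for a model's prediction (built so that only variables 3 and 17 matter much), and the finite-difference sensitivity score is a generic screening technique, not necessarily the one used in the paper.

```python
import random

def surrogate_predict(x):
    """Hypothetical surrogate (assumption): variables 3 and 17 dominate."""
    return 5.0 * x[3] - 3.0 * x[17] ** 2 + 0.01 * sum(x)

def sensitivity_scores(predict, points, step=1e-3):
    """Average absolute finite-difference slope of the prediction per variable."""
    dim = len(points[0])
    scores = [0.0] * dim
    for p in points:
        base = predict(p)
        for j in range(dim):
            bumped = list(p)
            bumped[j] += step                 # nudge one variable at a time
            scores[j] += abs(predict(bumped) - base) / step
    return [s / len(points) for s in scores]

def top_variables(scores, k):
    """Indices of the k variables with the largest sensitivity scores."""
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)[:k]

def perturb_important(point, important, rng, scale=0.1):
    """Propose a new design that changes only the high-impact variables."""
    candidate = list(point)
    for j in important:
        candidate[j] += rng.uniform(-scale, scale)
    return candidate

rng = random.Random(0)
dim = 300                                     # e.g. 300 design variables
points = [[rng.random() for _ in range(dim)] for _ in range(20)]
scores = sensitivity_scores(surrogate_predict, points)
important = top_variables(scores, k=2)        # variables worth searching over
candidate = perturb_important(points[0], important, rng)
```

Here the screening step singles out the two influential variables among 300, and new candidate designs differ from the starting point only in those coordinates, so no evaluations are spent varying the other 298.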
Bigger problems, better solutions
One of their biggest challenges was finding the best tabular foundation model for the job, Yu said. Then they had to connect it to a Bayesian optimization algorithm in such a way that it could identify the most influential design features.
“Finding the most dominant dimension is a well-known problem in mathematics and computer science, but coming up with a method that takes full advantage of the properties of a tabular foundation model was a real challenge,” said Yu.
With the algorithmic framework in place, the researchers tested their method by comparing it to five state-of-the-art algorithms.
On 60 benchmark problems, including real-world scenarios like power grid design and vehicle crash testing, their method found the best solution between 10 and 100 times faster than other algorithms.
“When the optimization problem gets larger, our algorithm really shines,” Yu added.
But their method did not come out ahead on every problem, such as robot path planning. This may indicate that such a scenario is not well represented in the model’s training data, Yu said.
In the future, the researchers want to study methods that can improve the performance of tabular foundation models. They also want to apply their method to problems with thousands or even millions of dimensions, such as the design of a submarine.
“At a high level, this work points to a broader change: using foundation models not just for perception or language, but as algorithmic engines within scientific and engineering tools, allowing classical methods such as Bayesian optimization to scale to regimes that were previously impractical,” Ahmed said.
“The approach presented in this work, using a pretrained foundation model and Bayesian optimization, is a smart and promising way to reduce the heavy data requirements of simulation-based design. Overall, this work is a practical and powerful step to make advanced design optimization more accessible and easy to apply in real-world settings,” said Wei Chen, the Wilson-Cook Professor in Engineering Design and chair of the Department of Mechanical Engineering at Northwestern University, who was not involved in this study.



