Digital Twins Bridge the Data Gap for Deep Learning

By Aki Fujimura

As companies embark on DL projects that put their data to work, they must protect that data; digital twins offer a key to success.

In today’s world, data is king. The most highly valued companies in the world, whether Amazon, Apple, Facebook, Google, Walmart, or Netflix, have one thing in common: data is their most valuable asset. All of these companies have put that data to work using deep learning (DL). No matter what business you’re in, your data is your most valuable asset. You need to protect that asset by doing your own DL. The most important ingredient for DL success is having enough of the right kinds of data. That’s where digital twins come in.

A digital twin is a digital replica of an actual physical process, system, or device. Most importantly, digital twins can be the key to success for DL projects, especially those that involve processes that are dangerous, expensive, or time-consuming.

The promise of deep learning

By now, nearly every industry — including semiconductor manufacturing — has recognized the potential of DL to create strategic advantage. DL employs neural networks to perform advanced pattern-matching. DL has been applied to such varied fields as facial and speech recognition, medical image analysis, bioinformatics, and materials inspection. In semiconductor manufacturing, DL has already been applied in areas such as defect classification. Most leading companies are scrambling to gain an advantage on this promising new playing field.

As companies start to explore DL and how it can help them, many are finding two things: first, it’s easy to get to a DL prototype, and second, it’s harder to get from “good prototype” results to “production-quality” results. With all of the low- to no-cost DL platforms, tools, and kits available today, initial development for DL applications is very quick and relatively easy in comparison to conventional application development. However, productizing DL applications isn’t any easier — and can be harder — than productizing conventional applications. The reason for this is data. Having enough data — and enough of the right kinds of data — is very often the difference between a DL application that doesn’t deliver production-quality results and one that revolutionizes the way you approach a particular problem.

The DL data gap

DL is based on pattern-matching, which is “programmed” by presenting neural networks with data that represent a target to be matched. Masses of data train a network to recognize the target (and to know when it’s not the target).

DL is incredibly powerful for quickly producing prototypes and proofs of concept. But the real advantage of DL isn’t the speed of development; it’s the fact that it unlocks the power of data to do things that can’t be done any other way.

The success of any DL application depends on the depth and breadth of the data set used in training. If the training data set is too small, too narrow, or too “normal,” a DL approach will not do better than standard techniques — in fact, it might do worse. It’s important to train a network with data representing all important states or presentations, in sufficient volumes for the network to learn to capture the correct essence of the problem at hand.

The difficulty for some fields, such as autonomous driving or semiconductor manufacturing, is that some of the most serious anomalous conditions occur (thankfully) very rarely. However, if you want a DL application to recognize a child darting in front of a car — or a fatal photomask error — you have to train the networks with a multitude of these scenarios, which don’t exist in any great volume in the real world (Figure 1). Digital twins are the only way to create enough anomalous data to properly train the networks to recognize these conditions.

Figure 1. Illustration of a normal distribution curve with standard deviation. In semiconductor manufacturing, as with driving, “outlier” events are very rare, but neural networks must be trained on them just as thoroughly as on normal cases, because worst-case incidents result in chip failure; good average performance isn’t enough.
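To get a feel for how scarce such outliers are, here is a minimal Python sketch. It assumes, purely for illustration, that the quantity of interest follows a normal distribution as in Figure 1; real defect rates need not be Gaussian. The function names and the target of 1,000 outlier examples are hypothetical.

```python
import math


def tail_probability(k_sigma: float) -> float:
    """Two-sided probability that a normally distributed sample lies
    more than k_sigma standard deviations from the mean."""
    return math.erfc(k_sigma / math.sqrt(2))


def samples_needed(k_sigma: float, wanted_outliers: int) -> int:
    """Expected number of real-world samples you would have to collect
    to observe wanted_outliers events beyond k_sigma."""
    return math.ceil(wanted_outliers / tail_probability(k_sigma))


# Illustrative target: 1,000 outlier examples for training (an assumption).
for k in (2, 3, 4):
    p = tail_probability(k)
    n = samples_needed(k, 1000)
    print(f"{k}-sigma: probability {p:.2e}, about {n:,} samples for 1,000 outliers")
```

Under this assumption, events beyond three sigma occur in roughly 0.27% of samples, so collecting a thousand of them from real production would take hundreds of thousands of observations. That is the gap synthetic data is meant to fill.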

Digital twins bridge the gap

Digital twins — virtual representations of actual processes, systems, and devices — are a key tool for creating the right amount of the right kind of data to train DL networks successfully. Last July, I was part of a TechTALK session at SEMICON West 2019 hosted by Dave Kelf of Breker Verification Systems, Inc., titled, “Applied AI in Design-to-Manufacturing.” In this panel session, I outlined the concept of using digital twins in semiconductor manufacturing. You can read an article covering this panel, written by the late and sorely missed Randy Smith for Semiwiki.

There are several reasons to use digital twins to create DL training data:

  • You may be in a position where the data you work with belongs to your customers, so you can’t use it for DL training.
  • You may be in a position where the resources you need to create the data you need for DL are fully committed to customer projects.
  • You have developed DL applications and found that you need specific data to tune and train your neural networks to the required level of accuracy, but the cost of using mask shop/fab resources to create that data is prohibitive.
  • You know that you will not be able to find enough anomalous data to train your DL networks adequately. This last case is nearly universal.

Ideally, to maintain full control over the data, you need three digital twins: a digital twin of the process/equipment that precedes yours in the manufacturing flow to provide input data for the simulation of your own process; a digital twin of your own process/equipment; and a digital twin of the process/equipment that follows yours in the manufacturing flow so that you can feed your output downstream for validation.
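The three-twin arrangement can be sketched as a simple chain. The Python below is only an illustration of the data flow, not a description of any real twin: the `TwinChain` class, the toy linear "process," and every numeric value are hypothetical stand-ins.

```python
import random
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class TwinChain:
    """Chains three twins: upstream input, your own process, downstream check."""
    upstream: Callable[[random.Random], float]  # produces input for your process
    own: Callable[[float], float]               # models your process/equipment
    downstream: Callable[[float], bool]         # validates your output

    def generate(self, n: int, seed: int = 0) -> List[Tuple[float, float, bool]]:
        """Return n (input, output, passed_validation) training rows."""
        rng = random.Random(seed)
        rows = []
        for _ in range(n):
            x = self.upstream(rng)
            y = self.own(x)
            rows.append((x, y, self.downstream(y)))
        return rows


# Toy stand-ins (all values are made up for illustration).
chain = TwinChain(
    upstream=lambda rng: rng.gauss(100.0, 2.0),          # incoming feature width
    own=lambda width: width * 0.98 + 1.0,                # hypothetical process bias
    downstream=lambda width: abs(width - 99.0) <= 4.0,   # hypothetical tolerance
)

rows = chain.generate(1000)
failed = sum(1 for _, _, ok in rows if not ok)
print(failed, "of", len(rows), "samples failed downstream validation")
```

Because every stage is synthetic, none of the generated rows depend on customer data or on mask shop/fab time.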

At the 2019 SPIE Photomask Technology conference, D2S presented a paper [1] demonstrating the creation of two digital twins, a scanning electron microscope (SEM) digital twin and a curvilinear inverse lithography technology (ILT) digital twin, using DL techniques (Figure 2 shows the output of the SEM digital twin). While the output of digital twins in general is not accurate enough for manufacturing, these digital twins have been used for both training and validating DL neural networks. Importantly, these digital twins were generated by DL, rather than through simulation. This is an example of using DL as a tool to generate the data needed to do other DL, and it demonstrates the compounding benefits of investing in DL.

Figure 2. Two examples comparing mask SEM images generated by the SEM digital twin with the corresponding real SEM images. The image intensity along a horizontal cutline at the same location is shown as well. Not only do the images look very similar, but the signal response at the edges is similar as well.

A roadmap to DL success

All of this may sound like a lot of work — why not use a consulting company that will do DL for you? Because, remember, data is king! Protect that data and do DL yourself. Thankfully, there is an established path to success for you to follow.

First, you need to identify a project where DL will have an impact. You do need to choose carefully: DL is pattern-matching, so you need to pick something that falls into that realm. Image-based applications, such as defect categorization, are obvious matches. Less obvious, but very powerful, is an application such as automatic discovery from machine logs. All of the equipment in the fab creates masses of operational data, which is rarely referenced until something goes wrong. Instead of using this valuable data merely as a diagnostic tool after the fact, you could monitor it across the fab on an ongoing basis and train DL applications to flag patterns that precede problems, so you can identify and correct issues before they have an impact, saving downtime.
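One way to turn machine logs into training data for this kind of early warning is to label each window of readings by whether a problem follows shortly afterward. The Python sketch below is a hypothetical illustration; the function name, window length, look-ahead horizon, and sample readings are all assumptions, not details from any real fab system.

```python
from typing import List, Sequence, Tuple


def label_windows(
    readings: Sequence[float],
    failure_indices: Sequence[int],
    window: int = 3,
    horizon: int = 2,
) -> List[Tuple[List[float], bool]]:
    """Slide a fixed-size window over the log readings. A window is labeled
    True when a failure occurs within the next `horizon` readings after it."""
    failures = set(failure_indices)
    examples = []
    for end in range(window, len(readings) + 1):
        features = list(readings[end - window:end])
        precedes_failure = any(i in failures for i in range(end, end + horizon))
        examples.append((features, precedes_failure))
    return examples


# Made-up log: a slow drift (2, 3) precedes a failure at index 5.
readings = [1, 1, 1, 2, 3, 9, 1, 1]
examples = label_windows(readings, failure_indices=[5])
for features, label in examples:
    print(features, "precedes failure" if label else "normal")
```

The labeled windows would then be fed to a network so it learns the signatures that tend to appear before a problem, rather than the problem itself.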

Mycronic, for example, disclosed during an eBeam Initiative lunchtime talk at the 2020 SPIE Advanced Lithography Conference how the company put DL to work using data from its machine log files to predict anomalies like “mura” (uneven brightness effects that are annoying to the human eye, but that are notoriously difficult for image-processing algorithms to detect) on flat-panel display (FPD) masks.

In general, tedious and error-prone processes that human operators perform, but that are difficult to automate with traditional algorithms, are good candidates for deep learning. Whether through visual inspection or otherwise, a human professional examining one specific situation in these problems has a high probability of performing the task correctly. But presented with many instances of similar situations, humans make mistakes and become increasingly unreliable. DL, given one particular situation, may not do as well as a human can. But whatever its probability of success is for one situation, it holds for unlimited instances over unlimited time. Humans make more mistakes as the volume of situations and/or the time spent on the task increases; DL’s probability of success does not degrade with volume or time.
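The argument can be made concrete with a little arithmetic. The Python sketch below compares expected error counts under invented numbers: a human who starts at 99% accuracy and loses 0.01 percentage points per task, against a network fixed at 97%. The linear fatigue model and every figure here are assumptions chosen only to illustrate the point.

```python
def expected_errors_constant(p_success: float, n_tasks: int) -> float:
    """Expected errors when the success probability never changes."""
    return n_tasks * (1.0 - p_success)


def expected_errors_fatigue(
    p_start: float, decay_per_task: float, n_tasks: int, p_floor: float
) -> float:
    """Expected errors when the success probability drops linearly with each
    task performed, but never below p_floor (a hypothetical fatigue model)."""
    total = 0.0
    for i in range(n_tasks):
        p = max(p_floor, p_start - decay_per_task * i)
        total += 1.0 - p
    return total


n = 1000
human = expected_errors_fatigue(p_start=0.99, decay_per_task=0.0001, n_tasks=n, p_floor=0.80)
network = expected_errors_constant(p_success=0.97, n_tasks=n)
print(f"human: {human:.2f} expected errors, network: {network:.2f} expected errors")
```

With these made-up inputs the human is better on the first task but accumulates roughly twice as many errors over a thousand tasks.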

Help to bridge the gap to DL success

Once you’ve identified a DL project, there are various resources available that can put you on the path to success while still enabling you to maintain strict control of your own data. If you’re new to DL and would like comprehensive support for your pilot DL project(s), you can join the Center for Deep Learning in Electronics Manufacturing (CDLe, www.cdle.ai). CDLe is an alliance of industry leaders designed to pool talent and resources to advance the state of the art in DL for our unique problem space, and to accelerate the adoption of DL in each of our companies’ products to improve our respective offerings for our customers.

If you’ve already started down the road with your DL projects but have encountered issues due to the DL data gap, D2S can help you to build the digital twins you need to augment and tune your data sets for DL success.

Aki Fujimura is chairman and CEO of D2S.
