A study on sturdiness

In the spring, we installed Kide on a customer's sheeting line. The ramp-up phase was somewhat laborious due to problems in installation and the wide variety of sheets the mill was producing. This fall, everything finally seemed to be OK. We had been able to train the system with almost all product flavors, and it seemed to do a decent job. The system controlled the sheet sorter just fine and had been running steadily for a couple of months. So we decided it was about time to roll in our acceptance procedure and declare the project formally closed.

The first 200 sheets went fine, and we were at our performance target of 95% correct classifications compared to human evaluators. Then, all of a sudden, the system started to reject almost half of the sheets for edge cracks. None of us had a clue what was happening. Needless to say, the customer was not too convinced. Neither were we, as nothing, really nothing, should ever cause such behavior.

Back in the office, we reran the experiment with a simulator over and over but could not reproduce the problem. No matter what distortions we added to the images, the system just didn't fail. Then, suddenly, we saw an incorrectly classified sheet. The system reported an edge crack, but the simulated sheet was fine, just with a bit of artificial distortion. "A bit" may, however, be an understatement. When we tracked the failure down to the code, we found that four preconditions had to be met simultaneously: 1) part of the image background must be lighter than the rest (there was simulated white dust); 2) the edge of the sheet must be close to this area (we simulated a product moving sideways); 3) contrast must be bad (the simulated sheet was black); 4) there must be a large dark defect on the sheet (we added simulated dirt). If all of these occurred simultaneously, our real-time edge tracker gave unreliable estimates for a while. Happily, the problem was quite easily solved once found.
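To illustrate why a failure like this is so hard to hit by adding distortions one at a time, here is a minimal sketch in Python. It is not our simulator or our edge tracker: the distortion names and the toy_edge_tracker_ok function are made up for the example, and the toy simply assumes that the tracker misbehaves only when all four conditions are active at once. The point is the exhaustive sweep over on/off combinations.

```python
from itertools import product

# Hypothetical names for the four preconditions described in the post.
DISTORTIONS = (
    "light_background_patch",  # simulated white dust
    "edge_near_patch",         # product drifting sideways
    "low_contrast",            # black sheet
    "large_dark_defect",       # simulated dirt
)


def toy_edge_tracker_ok(active):
    """Toy stand-in for the real system (an assumption, not the actual code).

    Returns False only when every distortion is active at the same time.
    """
    return not all(name in active for name in DISTORTIONS)


def find_failing_combinations(check, distortions=DISTORTIONS):
    """Try every on/off combination of distortions and collect the failures."""
    failing = []
    for flags in product((False, True), repeat=len(distortions)):
        active = frozenset(
            name for name, enabled in zip(distortions, flags) if enabled
        )
        if not check(active):
            failing.append(active)
    return failing


failing = find_failing_combinations(toy_edge_tracker_ok)
print(len(failing), "of", 2 ** len(DISTORTIONS), "combinations fail")
for combo in failing:
    print(sorted(combo))
```

In this toy, only one of the 16 combinations fails, and every test that switches on three or fewer distortions passes. With real, continuously varying distortions instead of simple on/off switches, the search space is of course far larger.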

What did we learn, then? At least the fact that in industry, halting a production line or rejecting good material is not an option. An inspection system must work at a nearly 100% performance level 24/7. A single unexplained failure will render the system unreliable and cost an unacceptable amount of money. The challenge is that there is no way to simulate everything in a lab. It takes a heap of tuning on mill floors to get a product to a level that is required for serious applications such as on-line control of a production line.

(Posted by Topi)