We had a large, very noisy dataset that we knew contained important information. However, we didn’t know how many types of information were present, nor how much of each type. We therefore had to identify the distinct types of information with exquisite sensitivity and without preconceived notions about what was present.
The data: The dataset consisted of ~669,000 X-ray absorption spectra taken from a sample of a Roman urn. Because the goal was to identify the presence and location of all iron oxides and any unexpected contaminants, it was crucial not to bias the analysis by assuming which oxides were present. Commonly used methods fail this test.
The noise: The raw data contained a great deal of random noise. The normal approach is to pre-process the data, a step that inevitably loses information. Because DQC works with the raw data, no time is lost figuring out how to pre-process it, and there is no danger of discarding unexpected information.
The result: DQC accurately, and with exquisite sensitivity, identified the different iron oxides present in the sample. In fact, it separated out 68 spectra belonging to an unexpected contaminant from the almost 669,000 spectra in the dataset. This kind of sensitivity surpasses the state of the art and demonstrates a new way of approaching this sort of problem.