Computer Vision News - March 2019

Focus on: PyOD
by Assaf Spanier

Outlier detection, also known as anomaly detection, refers to identifying rare occurrences, observations or, in the most general sense, data points that differ markedly from the general population. Ground truth labeling is scarce, especially for outlier detection tasks. Outlier detection is especially valuable in fields that process big data. Industry applications include financial fraud detection (Ahmed et al.), mechanical failure detection (Shin et al.), network infiltration detection (Garcia-Teodoro et al.) and pathology detection in medical imaging (Baur et al.).

An outlier is any data point that is very different from the rest of the observations in a data set. Some examples:

A student who is 1.85 m tall in an 8th-grade class where all the other students are between 1.55 m and 1.70 m.

A client’s purchasing patterns are analyzed, and while most of his purchases are under $100, a single purchase of over $20,000 shows up out of nowhere. What happened? Is this a legitimate transaction, an unusual purchase, a change in behavior, or is something wrong?

What is the reason for this outlier? There are many reasons for the occurrence of outliers. Perhaps there was an error in data entry or a measurement error. Incorrect data can even be given on purpose: individuals who don’t want to reveal real data about themselves may enter made-up data (for instance, into online forms). Of course, we must remember that outliers may also represent real, unusual occurrences.

Why do we need to discover outliers? Outliers can dramatically affect the results of our analysis and our statistical models. Let’s look at the following example to see what happens to our model when outliers are included in our dataset versus when they have been discovered and removed from it.
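The effect of a single outlier on a summary statistic can be sketched with the purchase scenario described above. This is a minimal illustration, not the article's own example and not PyOD code: the purchase amounts are made up, the helper `flag_outliers` is a hypothetical name, and the detection rule (a modified z-score based on the median absolute deviation, with the commonly used 3.5 cutoff) is an assumption chosen only because it needs nothing beyond the Python standard library.

```python
import statistics

# Hypothetical purchase amounts in dollars (made up for illustration):
# most are under $100, and one is over $20,000.
purchases = [23.5, 41.0, 12.99, 87.25, 56.4, 33.1, 20000.0, 64.75, 18.2, 95.0]


def flag_outliers(values, threshold=3.5):
    """Return the values whose modified z-score exceeds the threshold.

    The score is built from the median and the median absolute deviation
    (MAD), which are much less sensitive to extreme values than the mean
    and standard deviation. The rule and the cutoff are assumptions.
    """
    median = statistics.median(values)
    mad = statistics.median([abs(v - median) for v in values])
    if mad == 0:
        return []  # no spread around the median, so this rule cannot score points
    return [v for v in values if 0.6745 * abs(v - median) / mad > threshold]


outliers = flag_outliers(purchases)
cleaned = [v for v in purchases if v not in outliers]

mean_with = statistics.mean(purchases)
mean_without = statistics.mean(cleaned)
median_with = statistics.median(purchases)

print(f"flagged as outliers:  {outliers}")
print(f"mean with outlier:    {mean_with:.2f}")
print(f"mean without outlier: {mean_without:.2f}")
print(f"median with outlier:  {median_with:.2f}")
```

With these numbers, the single large purchase pulls the mean far above any typical purchase, while the median barely moves. Whether the flagged point should be removed or investigated is a separate question, since it may be a real and meaningful event.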