ICCV Daily - Friday

In the end, they hope to increase recall so that the precision-recall curve rises. This would allow them to approach more difficult cases. They could bring in very powerful systems and classifiers based on deep learning and convolutional networks.
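
For readers less familiar with the terms, the short Python sketch below shows how precision and recall trade off as a detection threshold is lowered. The scores and labels are made-up illustration values, and the function name `precision_recall` is hypothetical; none of it comes from Marius and Emanuela's system.

```python
def precision_recall(scores, labels, threshold):
    """Precision and recall when every score >= threshold counts as a detection."""
    predicted = [s >= threshold for s in scores]
    tp = sum(1 for p, y in zip(predicted, labels) if p and y)
    fp = sum(1 for p, y in zip(predicted, labels) if p and not y)
    fn = sum(1 for p, y in zip(predicted, labels) if not p and y)
    # Conventions for empty denominators: no detections means no false alarms.
    precision = tp / (tp + fp) if tp + fp else 1.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical detector confidences and ground-truth labels (True = real object).
scores = [0.95, 0.90, 0.80, 0.60, 0.40, 0.20]
labels = [True, True, False, True, False, True]

# Sweeping the threshold from strict to lenient traces out a precision-recall curve.
for threshold in (0.85, 0.50, 0.10):
    p, r = precision_recall(scores, labels, threshold)
    print(f"threshold={threshold:.2f}  precision={p:.2f}  recall={r:.2f}")
```

A strict threshold keeps only the most confident detections, so precision is high but many objects are missed. Raising recall without giving up that precision is what pushes the curve upward.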

On the practical side, their method has the advantage of working very quickly. Even in MATLAB, it runs at about three or four frames per second, and it could easily run in real time in a C++ implementation. It also does not use any pre-learned features, while still obtaining state-of-the-art results when compared to other approaches which may look more complex.

In essence, they try to focus on the problem, not on a specific system or specific technique. They want to find out what they can learn by simply watching videos. The next step would be to achieve better performance on more complex datasets. Because the datasets in this field are constantly changing, they want to employ all sorts of neural nets and test them on the most complex datasets available. That means bringing the state of the art to the next level. Marius feels extremely optimistic about this idea. Together with Emanuela, he plans to dedicate at least three to five years to it.

When asked why she loves this work so much, Emanuela responds, “I think a solution to this problem would describe how we are understanding the world. It’s how children learn their world basically. They see the object move in front of them, and they learn how to recognize it the next time.” She explains how this method teaches ways to find an object in the next frame without knowing what the object is. Whether a car or something else, it doesn’t matter. You just need to know that it is an object, and you learn how it acts in real life.

Marius compares their approach to that of a fisherman. A fisherman must use certain cues to catch a fish without knowing everything about the fish or where to find it. Imagine a fisherman at a river for the first time, looking into the water and watching for movement. Perhaps he throws something into the river to help find the fish. After he catches the fish, he looks at it and learns even more about it.

Through this learning process, the fisherman gains a better understanding of the fish, its behaviors, and the cues to look for to find it. In the end, catching the fish becomes an easy task because he knows much more about how to do it.

In a similar way, their method gives them high precision and good-quality features that they can harvest. With every iteration, it improves. In the end, they obtain really hard positive and negative cases and reach human-level performance and beyond.

To find out more about Marius and Emanuela’s work, visit their poster today (Friday) at ICCV 2017.

Marius and Ema
