Computer Vision News

● Task 4 visually aligns matched phrases with their annotations, such as part-of-speech tags, so that the user’s hypothesis can be compared with alternative, structural ones. The set of annotation data should be easily extensible. [Goals 2 and 3]
● Task 5 offers a general interface that lets any RNN model and any text-like dataset be used, with the aim of reaching a wide audience, generating a wider discussion that compares various models, and producing crowd knowledge. [Goal 3]

This list of tasks provided a guideline for the design of LSTMVis, with tasks 1-4 defining the system’s user interface. We will first describe the user interface, and then demonstrate its application to a couple of example cases.

Using the LSTMVis user interface:

Select View allows the user to interactively select a range of text that represents a hypothesis they have about the model. Use the mouse to highlight a text region, then set an activation threshold, shown as a red line, which refines the selected hidden state pattern; the matching hidden state patterns are drawn as bold blue lines. A ‘heatmap’ showing how many of the selected hidden states are ‘on’ for each word helps users find interesting text ranges to select.

Match View provides ‘evidence’ for or against the hypothesis. It returns a “search result” of the most relevant or similar phrases (text ranges), ranked by how similar their hidden state patterns are to the pattern the user selected in Select View.

Formally, assume that the user has selected a threshold $\ell$ with hidden states $S_1$ and has not limited the selection further to the right or left. The Match View of LSTMVis ranks all candidate ranges from the dataset, each starting at time $a$ and ending at time $b$, with the following two steps:

1. Select all hidden states that are “on” for the given range: $S_2 = \{ c \in \{1 \cdots D\} : h_{t,c} \geq \ell \text{ for all } a \leq t \leq b \}$
2. Find the candidates with the most overlapping states $|S_1 \cap S_2|$, taking into account the inverse of the number of additional “on” cells, $-|S_1 \cup S_2|$, and the candidate length $b - a$.

The user interface is consistent and intuitive; the matched text ranges returned are of similar length to the selected one.

Use case: The following example uses the Word Model of the Wall Street Journal dataset (a 2x650 LSTM language model trained on the Wall Street Journal, annotated with gold-standard part-of-speech tags).
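The two-step ranking that Match View performs can be sketched in a few lines of plain Python. This is only an illustrative sketch under stated assumptions, not the LSTMVis implementation: the function names `on_cells` and `rank_matches`, the `max_len` cap on candidate length, the brute-force enumeration of candidates, and the exact tie-break order (fewer cells in the union first, then lengths closest to the selection) are all hypothetical choices made for the example.

```python
from typing import List, Set, Tuple

Range = Tuple[int, int]  # (start time a, end time b), both inclusive


def on_cells(hidden: List[List[float]], a: int, b: int, threshold: float) -> Set[int]:
    """Step 1: indices of cells whose activation stays >= threshold for all t in a..b."""
    n_cells = len(hidden[0])
    return {
        c for c in range(n_cells)
        if all(hidden[t][c] >= threshold for t in range(a, b + 1))
    }


def rank_matches(hidden: List[List[float]], selected: Range, threshold: float,
                 max_len: int, top_k: int = 3) -> List[Range]:
    """Step 2: rank candidate ranges against the user's selection.

    Assumed ordering (hypothetical): most overlapping "on" cells first,
    then fewest cells in the union, then length closest to the selection.
    """
    sel_a, sel_b = selected
    s1 = on_cells(hidden, sel_a, sel_b, threshold)  # cells "on" in the selection
    sel_len = sel_b - sel_a
    scored = []
    for a in range(len(hidden)):
        # max_len is the largest number of time steps a candidate may span.
        for b in range(a, min(a + max_len, len(hidden))):
            if (a, b) == (sel_a, sel_b):
                continue  # skip the selection itself
            s2 = on_cells(hidden, a, b, threshold)
            overlap = len(s1 & s2)
            if overlap == 0:
                continue  # nothing in common with the selected pattern
            key = (-overlap, len(s1 | s2), abs((b - a) - sel_len))
            scored.append((key, (a, b)))
    scored.sort()  # ascending on the key, so the best match comes first
    return [candidate for _, candidate in scored[:top_k]]


# Toy hidden states: 6 time steps (rows) by 3 cells (columns).
toy_hidden = [
    [0.9, 0.8, 0.1],
    [0.9, 0.7, 0.2],
    [0.1, 0.1, 0.9],
    [0.8, 0.9, 0.1],
    [0.9, 0.9, 0.2],
    [0.1, 0.2, 0.1],
]
best = rank_matches(toy_hidden, selected=(0, 1), threshold=0.5, max_len=2)
print(best)  # [(3, 4), (0, 0), (1, 1)]
```

In the toy data, cells 0 and 1 are “on” over the selected steps 0..1, and the range 3..4 shows the same pattern with the same length, so it ranks first. A real tool would need something faster than this exhaustive scan over every candidate range, and would probably also filter out candidates that overlap the selection itself.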

Computer Vision News - March 2018