Learning Saliency Maps to Explain Deep Time Series Classifiers
P. Parvatharaju, R. Doddaiah, T. Hartvigsen, E. Rundensteiner
CIKM ’21 (21.7% acceptance rate), QLD, Australia
Explainable classification is essential to high-impact settings where practitioners require evidence to support their decisions. However, state-of-the-art deep learning models lack transparency in how they make their predictions. One increasingly popular solution is attribution-based explainability, which finds the impact of input features on the model’s predictions. While this is popular for computer vision, little has been done to explain deep time series classifiers. In this work, we study this problem and propose PERT, a novel perturbation-based explainability method designed to explain deep classifiers’ decisions on time series. PERT extends beyond recent perturbation methods to generate a saliency map that assigns importance values to the timesteps of the instance-of-interest. First, PERT uses a novel Prioritized Replacement Selector to learn which alternative time series from a larger dataset are most useful to perform this perturbation. Second, PERT mixes the instance with the replacements using a Guided Perturbation Strategy, which learns to what degree each timestep can be perturbed without altering the classifier’s final prediction. These two steps jointly learn to identify the fewest and most impactful timesteps that explain the classifier’s prediction. We evaluate PERT using three metrics on nine popular datasets with two black-box models. We find that PERT consistently outperforms all five state-of-the-art methods. Using a case study, we also demonstrate that PERT succeeds in finding the relevant regions of the input time series.
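The general idea of learning a per-timestep mask by perturbation can be illustrated with a minimal sketch. This is not the PERT implementation: it assumes a hypothetical toy logistic classifier (`w`, `b`, `predict_proba`), a single fixed all-zero replacement series `r` in place of the Prioritized Replacement Selector, and a simple sparsity penalty `lam`; all names and hyperparameters are illustrative.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical toy classifier (an assumption, not a model from the paper):
# a logistic model that only looks at timesteps 5..9.
T = 20
w = np.zeros(T)
w[5:10] = 1.0
b = -5.0

def predict_proba(x):
    """Probability of the positive class for a length-T series."""
    return sigmoid(w @ x + b)

def learn_saliency(x, r, lam=0.5, lr=1.0, steps=500):
    """Learn a mask m in (0, 1)^T by gradient descent.

    The perturbed series is m * x + (1 - m) * r. The loss is
    -log p(perturbed) + lam * sum(m): keep the prediction while
    keeping as few original timesteps as possible.
    """
    theta = np.full(T, 2.0)  # mask logits; start by keeping most of x
    for _ in range(steps):
        m = sigmoid(theta)
        x_pert = m * x + (1.0 - m) * r
        p = predict_proba(x_pert)
        # Analytic gradient of the loss with respect to m for this toy model.
        grad_m = -(1.0 - p) * w * (x - r) + lam
        # Chain rule through the sigmoid parameterisation of the mask.
        theta -= lr * grad_m * m * (1.0 - m)
    return sigmoid(theta)

# Instance of interest: a bump on timesteps 5..9 over a flat baseline.
x = np.full(T, 0.5)
x[5:10] = 2.0
r = np.zeros(T)  # assumed replacement series (fixed, not learned)

saliency = learn_saliency(x, r)
print(np.round(saliency, 2))
```

In this toy setting the learned mask stays high only on the timesteps the classifier actually uses and decays toward zero elsewhere, while the perturbed series keeps its original predicted class.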
Differential Learning using Neural Network Pruning
P. Parvatharaju, S. Murthy
CoRR'21 * (in submission)
The idea of linear flow in a deep neural network, i.e., each node in a layer being connected through a weight Wij to every node in the following layer, is limiting compared with the way humans think. It constrains DNNs as they process data and emulate relationships in higher dimensions. Replacing the linear flow with a gradient-based decision process combined with skip connections gives the network the ability to develop much deeper constructs from minimal data. We propose a novel idea of extending previously used short-circuits into differentiable functions, essentially solving “What to feed?”. To tackle the problem of “When to stop?”, we propose an early stopping criterion based on Information Transfer derived from the differentiable short-circuits.