# Harmonic Loss Trains Interpretable AI Models

This is the GitHub repository for the paper "Harmonic Loss Trains Interpretable AI Models" [arXiv] [Twitter] [Github].

Harmonic Demo

## What is Harmonic Loss?

- The harmonic logit $d_i$ is defined as the $\ell_2$ distance between the weight vector $\mathbf{w}_i$ and the input (query) $\mathbf{x}$: $d_i = \|\mathbf{w}_i - \mathbf{x}\|_2$.

- The probability $p_i$ is computed using the harmonic max function:

$$p_i = \frac{1/d_i^n}{\sum_j 1/d_j^n},$$

where $n$ is the harmonic exponent, a hyperparameter that controls the heavy-tailedness of the probability distribution.

- Harmonic loss achieves (1) nonlinear separability, (2) fast convergence, (3) scale invariance, and (4) interpretability by design; these properties are not available with cross-entropy loss.
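The definitions above can be sketched in a few lines of NumPy. This is an illustrative sketch, not the repository's implementation: the function names `harmonic_max` and `harmonic_loss` and the `eps` stabilizer are our own, and the loss is assumed to be the negative log-likelihood of the target class under the harmonic max probabilities.

```python
import numpy as np

def harmonic_max(x, W, n=2.0, eps=1e-9):
    """Harmonic max: class probabilities from l2 distances.

    x: input (query) vector, shape (d,)
    W: weight matrix, one class weight vector per row, shape (k, d)
    n: harmonic exponent (larger n -> sharper, less heavy-tailed distribution)
    eps: small constant (our addition) to avoid division by zero when x == w_i
    """
    d = np.linalg.norm(W - x, axis=1)   # harmonic logits d_i = ||w_i - x||_2
    inv = 1.0 / (d ** n + eps)          # 1 / d_i^n
    return inv / inv.sum()              # p_i = (1/d_i^n) / sum_j (1/d_j^n)

def harmonic_loss(x, W, target, n=2.0):
    """Negative log-likelihood of the target class under harmonic max."""
    p = harmonic_max(x, W, n)
    return -np.log(p[target])
```

Note that, unlike a softmax over dot products, the probabilities depend only on distances, so rescaling $\mathbf{x}$ and all $\mathbf{w}_i$ together leaves the ranking of classes unchanged, and the class whose weight vector is nearest to the query always gets the highest probability.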

## Reproducing results

Download the results from the following link: Link

Figure 1: `toy_points.ipynb`

Figures 2, 3, 7: `notebooks/final_figures.ipynb`

Figure 4: `notebooks/case_study_circle.ipynb`

Figure 5: `notebooks/mnist.ipynb`

Figure 6: `GPT2/function_vectors.ipynb`