PCA power-iteration does one more iteration than asked (off-by-1) #1910

jucor · 2025-02-09T13:02:40Z

Reporting here a subtle internal math bug for tracking only.
To be clear: this does not affect the convergence of the PCA: the PCA is still valid at the end once it converges :)

But if you ask for N power iterations, the current Clojure code does N+1 iterations instead. This is due to the way the loop is written in:

polis/math/src/polismath/math/pca.clj

Lines 50 to 56 in 2ed917e

    
    (loop [iters iters start-vector start-vector last-eigval 0]
      (let [product-vector (xtxr data start-vector)
            eigval (matrix/length product-vector)
            normed (matrix/normalise product-vector)]
        (if (or (= iters 0) (= eigval last-eigval))
          normed
          (recur (dec iters) normed eigval))))))

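The let bindings are evaluated on every pass through the loop, including the pass where iters is 0.
And on line 55 we return normed upon termination, that is, the result of that final evaluation of the let block, which already contains one more product than requested.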

If we wanted to fix it to do exactly iters iterations, and not iters+1, line 55 should return start-vector instead of normed.
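To make the off-by-one concrete, here is a minimal Python sketch of the same loop shape. It is not the polis code or the port in #1893: the function names `power_iteration_as_is` and `power_iteration_exact` are hypothetical, and the `xtxr` helper is assumed to compute X^T (X r), which is only a guess from its name. The `applied` counter tracks how many products are folded into the returned vector. The sketch also assumes the product vector is never all zeros.

```python
import math


def xtxr(data, vec):
    """Assumed meaning of xtxr: X^T (X r), for a list-of-rows matrix."""
    xr = [sum(x * v for x, v in zip(row, vec)) for row in data]
    return [sum(row[j] * xr[i] for i, row in enumerate(data))
            for j in range(len(vec))]


def norm(vec):
    return math.sqrt(sum(v * v for v in vec))


def power_iteration_as_is(data, iters, start_vector):
    """Same shape as the quoted loop: returns normed on termination."""
    applied = 0  # products already folded into start_vector
    last_eigval = 0
    while True:
        # Evaluated on every pass, including the one where iters == 0.
        product_vector = xtxr(data, start_vector)
        eigval = norm(product_vector)
        normed = [v / eigval for v in product_vector]
        if iters == 0 or eigval == last_eigval:
            return normed, applied + 1  # normed holds one extra product
        iters, start_vector, last_eigval = iters - 1, normed, eigval
        applied += 1


def power_iteration_exact(data, iters, start_vector):
    """Variant from the issue: returns start_vector on termination."""
    applied = 0
    last_eigval = 0
    while True:
        product_vector = xtxr(data, start_vector)
        eigval = norm(product_vector)
        normed = [v / eigval for v in product_vector]
        if iters == 0 or eigval == last_eigval:
            return start_vector, applied  # the last product is discarded
        iters, start_vector, last_eigval = iters - 1, normed, eigval
        applied += 1


if __name__ == "__main__":
    data = [[2.0, 0.0], [0.0, 1.0]]
    _, n_as_is = power_iteration_as_is(data, 3, [1.0, 1.0])
    _, n_exact = power_iteration_exact(data, 3, [1.0, 1.0])
    print(n_as_is, n_exact)
```

With this toy matrix and iters set to 3, the first variant reports 4 applied products and the second reports 3. Note that the second variant still computes the extra product before discarding it, and with iters set to 0 it returns the start vector untouched (not normalised), so a real fix might restructure the loop rather than only swap the return value.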

I suggest we do NOT fix this for now, until I have full clarity on how partial-pca is computed. The extra iteration will matter most when we use a small number of iterations.

I see that

polis/math/src/polismath/math/conversation.clj

Lines 703 to 708 in 2ed917e

    
    (defn partial-pca
      "This function takes in the rating matrix, the current pca and a set of row indices and
      computes the partial pca off of those, returning a lambda that will take the latest PCA
      and make the update on that in case there have been other mini batch updates since started"
      [mat pca indices & {:keys [n-comps iters learning-rate]
                          :or {n-comps 2 iters 10 learning-rate 0.01}}]

defaults to 10 iterations, but I am unclear how often that default is used, knowing that

polis/math/src/polismath/math/conversation.clj

Lines 136 to 151 in 2ed917e

    
    (def base-conv-update-graph
      "Base of all conversation updates; handles default update opts and does named matrix updating"
      {:opts'       (plmb/fnk [opts]
                      "Merge in opts with the following defaults"
                      ;; TODO Answer and resolve this question:
                      ;; QUESTION Does it make senes to have the defaults here or in the config.edn or both duplicated?
                      (merge {:n-comps 2 ; does our code even generalize to others?
                              :pca-iters 100
                              :base-iters 100
                              :base-k 100
                              :max-k 5
                              :group-iters 100
                              ;; These three in particular we should be able to tune quickly
                              :max-ptpts 100000
                              :max-cmts 10000
                              :group-k-buffer 4}

defines a default of 100 for :pca-iters.

As for the Python port in #1893, for now I am reproducing the behaviour bug-for-bug; see 47c217b.

jucor mentioned this issue Feb 10, 2025

[Work in Progress DO NOT MERGE] Port math library from clojure to python #1893

Draft
