Labels

_fuxi (75) _IV (146) _misc (5) {610610 (30) algo (1) automatedTrading (8) banking/economy (3) book (14) c++misc (125) c++real (15) c++STL/java_container (7) cppTemplate (1) db (13) DB_tuning (4) deepUnder (1) dotnet (69) eTip (17) excelVBA (12) finance+sys (34) financeMisc (24) financeRisk (2) financeTechMisc (4) financeVol (21) finmath (17) fixedIncome (25) forex (16) IDE (24) invest (1) java (43) latency (4) LinearAlgebra (3) math (30) matlab (24) memoryMgmt (11) metaPrograming (2) MOM (15) msfm (1) murex (4) nofx (11) nosql (3) OO_Design (1) original_content (4) scriptUnixAutosys (19) SOA (7) socket/stream (15) sticky (1) subquery+join (2) swing (32) sybase (6) tech_orphan (12) tech+fin_career (30) telco (11) thread (21) timeSaver (13) tune (10) US_imm (2) US_misc (2) windoz (20) z_algo+dataStructure (4) z_arch (2) z_c#GUI (30) z_career (10) z_career]US^Asia (2) z_careerBig20 (1) z_careerFinanceTech (11) z_FIX (6) z_forex (31) z_hib (2) z_ikm (7) z_inMemDB (3) z_j2ee (10) z_oq (14) z_php (1) z_py (26) z_quant (4) z_skillist (3) z_spr (5)

Saturday, October 5, 2013

bak - strength of correlation



"uncorrelated", "strongly correlated" ... I hear these terms as basic concepts. Good to get some basic feel

1) One of the first "sound bites" is the covariance vs correlation definitions. I like http://en.wikipedia.org/wiki/Covariance_and_correlation. Between 2 series of data (X and Y), cov can be a very large num (like 580,189,272billion), which can't possibly reveal the strength of correlation between X and Y. In contrast, the correlation number (say, 0.8) is dimentionless, and
has a value between -1 and 1. This is intuitive.
* highly correlated means close to +1 or -1
* uncorrelated means 0.

2) Below I feel these are 2 similar definitions of the corr. Formally, this number is the Linear correlation between two variables X and
Y.



A) For the entire population,


          


B) For a sample taken from the population,

        


B2) The formula above is identical to the r definition on P612 [[Prem Mann]] 
        
           , where SS stands for sum-of-sqaures

B3) An equivalent (equally useful) formula is 
        SSxy =


3) r-squared is a standard measure of the goodness of a linear regression model