Generalized Linear Models in Python

Is it possible to fit a Generalized Linear Model for a Gamma-distributed random variable in Python, or will I have to settle for R? If it is possible, please point me to tutorials, guides, documentation, or examples.

    
asked by Lucy_in_the_sky_with_diamonds 10.06.2017 at 03:46

1 answer

Hello, yes: GLMs with a Gamma family can be fitted in Python using statsmodels. Here is an example for a variable y with 18 values drawn from a Gamma distribution, generated first in R:

set.seed(1)
y = rgamma(18,10,.1)
print(y)
 [1]  76.67251 140.40808 138.26660 108.20993  53.46417 110.61754 119.11950 113.57558  85.82045  71.96892
[11]  76.81693  86.00139  93.62010  69.49795 121.99775 114.18707 125.43608 120.63640
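
If you would rather draw the sample in Python as well, a rough equivalent of rgamma(18, 10, .1) can be sketched with NumPy. Two assumptions to keep in mind: R's third positional argument is the rate, while NumPy's gamma sampler takes a scale (1/rate), and the seed used here is arbitrary, so the numbers will not reproduce the R values above.

```python
import numpy as np

# Assumption: the seed is arbitrary. NumPy and R use different generators,
# so seed 1 here does not reproduce R's set.seed(1) draws.
rng = np.random.default_rng(1)

shape = 10.0
rate = 0.1  # rgamma(18, 10, .1) passes the rate as its third argument

# NumPy parameterizes the Gamma distribution by scale = 1 / rate.
y_sim = rng.gamma(shape=shape, scale=1.0 / rate, size=18)
print(y_sim)
```

The rest of this answer keeps using the values printed by R so that both fits work on the same data.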

The output that R generates for an intercept-only Gamma GLM is as follows:

summary(glm(y~1,family=Gamma))

Call:
glm(formula = y ~ 1, family = Gamma)

Deviance Residuals: 
     Min        1Q    Median        3Q       Max  
-0.57898  -0.24017   0.07637   0.17489   0.34345  

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept) 0.009856   0.000581   16.96 4.33e-12 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for Gamma family taken to be 0.06255708)

    Null deviance: 1.1761  on 17  degrees of freedom
Residual deviance: 1.1761  on 17  degrees of freedom
AIC: 171.3

Number of Fisher Scoring iterations: 4

Now let's do the same in Python:

import numpy as np
import statsmodels.api as sm

# Intercept-only model: the design matrix is a single column of ones.
x = np.repeat(1,18)
# The same 18 values that R printed above.
y = [76.67251,140.40808,138.26660,108.20993,53.46417,110.61754,
 119.11950,113.57558,85.82045,71.96892,76.81693,86.00139,
 93.62010,69.49795,121.99775,114.18707,125.43608,120.63640]

Fitting the model and printing its summary gives the following output:

print(sm.GLM(y,x, family=sm.families.Gamma()).fit().summary())

                 Generalized Linear Model Regression Results                  
==============================================================================
Dep. Variable:                      y   No. Observations:                   18
Model:                            GLM   Df Residuals:                       17
Model Family:                   Gamma   Df Model:                            0
Link Function:          inverse_power   Scale:                        0.062556
Method:                          IRLS   Log-Likelihood:                -83.656
Date:                dom, 20 may 2018   Deviance:                       1.1761
Time:                        17:12:44   Pearson chi2:                     1.06
No. Iterations:                     6   Covariance Type:             nonrobust
==============================================================================
                 coef    std err          z      P>|z|      [0.025      0.975]
------------------------------------------------------------------------------
const          0.0099      0.001     16.963      0.000       0.009       0.011
==============================================================================

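Comparing the two outputs, the results agree: the coefficient (0.009856 in R, 0.0099 in Python), the dispersion or scale (0.06255708 vs. 0.062556) and the deviance (1.1761 in both) all match, and both fits use the inverse link. The Python summary table simply displays fewer digits. Note also that R labels the test statistic t while statsmodels labels it z, although the value (16.96) is the same.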
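
As a sanity check on both outputs, this intercept-only case can also be worked by hand. With the inverse link, the fitted mean is the sample mean, so the coefficient is 1/mean(y); the dispersion is the Pearson chi-square divided by the residual degrees of freedom; and the standard error follows from the IRLS weights, which equal mu^2 for every observation. The sketch below uses only the standard library, and the variable names are mine, not taken from either package.

```python
import math

y = [76.67251, 140.40808, 138.26660, 108.20993, 53.46417, 110.61754,
     119.11950, 113.57558, 85.82045, 71.96892, 76.81693, 86.00139,
     93.62010, 69.49795, 121.99775, 114.18707, 125.43608, 120.63640]

n = len(y)
mu = sum(y) / n   # fitted mean of an intercept-only model
coef = 1.0 / mu   # inverse link: eta = 1 / mu

# Pearson chi-square with the Gamma variance function V(mu) = mu**2.
pearson = sum((yi - mu) ** 2 for yi in y) / mu ** 2
scale = pearson / (n - 1)  # dispersion, using n - 1 = 17 residual df

# Var(coef) = scale / (sum of IRLS weights); each weight is mu**2 here.
std_err = math.sqrt(scale / (n * mu ** 2))
stat = coef / std_err      # the t (R) or z (statsmodels) statistic

print(f"coef    = {coef:.6f}")
print(f"std err = {std_err:.6f}")
print(f"stat    = {stat:.2f}")
print(f"scale   = {scale:.6f}")
```

Up to rounding, these should land on the same values that the R summary reports for the estimate, standard error, t value and dispersion.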

    
answered by 21.05.2018 at 00:17