Skip to content

Differences in R2 and tss compared to GLM #179

Closed
@mdkeehan

Description

@mdkeehan

Hello,
Thank you for writing this package I wanted to use it as a go faster drop in replacement for the GLM.jl package. As a test I tried fitting the same model using both GLM and FixedEffectsModels and noticed differences in the R^2 reported. I have tried to replicate the difference in a minimum working example which I have attached. I have only been able to show a small difference which comes down to the following: @formula(obs~0+year+treatment) vs @formula(obs~0+year+fe(treatment)). It may be that my understanding of the the statistical model difference between the two formula is incomplete in which case a faq or a note in the documentation may help someone in the future. Here is my small working example which I hope might be the basis for a test.

using CUDA
using FixedEffectModels, GLM
using DataFrames, CategoricalArrays
using Test

@testset "Compare FE to GLM" begin
    treatmentD = ["A","A","A","B","B","B","B"]
    yearD = [0,1,2,       0,1,2,3]

    testframe=DataFrame(treatment = treatmentD, year = yearD,
                    obs = rand(length(treatmentD)) )

    testframe[!,:treatment]=categorical(testframe[:,:treatment])

    ols1=fit(LinearModel,@formula(obs~0+year+treatment),testframe)

    fesol=reg(testframe,@formula(obs~0+year+fe(treatment)),save=true)
    @test isapprox(r2(fesol),r2(ols1))
    @test isapprox(adjr2(fesol),adjr2(ols1))
    @test isapprox(coef(fesol)[1],coef(ols1)[1])
    @test isapprox(residuals(fesol),residuals(ols1))

    nofesol=reg(testframe,@formula(obs~0+year+treatment),save=true)
    @test isapprox(r2(nofesol),r2(ols1))
    @test isapprox(adjr2(nofesol),adjr2(ols1))
    @test isapprox(coef(nofesol),coef(ols1))
    @test isapprox(residuals(nofesol,testframe),residuals(ols1))

    @test isapprox(nofesol.tss, fesol.tss) #this fails
    @test isapprox(fesol.tss, nulldeviance(ols1))

    @test isapprox(nofesol.rss, fesol.rss)
    @test isapprox(nofesol.rss, deviance(ols1))

    nofesolgpu=reg(testframe,@formula(obs~0+year+treatment),method=:gpu, double_precision=true,save=true)
    @test isapprox(r2(nofesolgpu),r2(ols1))
    @test isapprox(adjr2(nofesolgpu),adjr2(ols1))

    fesolgpu=reg(testframe,@formula(obs~0+year+fe(treatment)),method=:gpu, double_precision=true,save=true)
    @test isapprox(r2(fesolgpu),r2(ols1))
    @test isapprox(adjr2(fesolgpu),adjr2(ols1))

    @test isapprox(r2(fesolgpu),r2(fesol))
    @test isapprox(adjr2(fesolgpu),adjr2(fesol))
    @test isapprox(coef(fesolgpu),coef(fesol))
    
    @test isapprox(r2(nofesolgpu),r2(nofesol))
    @test isapprox(adjr2(nofesolgpu),adjr2(nofesol))
    @test isapprox(coef(fesolgpu),coef(fesol))
end

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions