glm() in R #203
Comments
Correct me if I'm wrong, but I think the reason R displays all but one of the variable's levels in your output is that, if the variable is ordinal, R picks the first level as the reference category by default. You can change this category with the contrasts function, and I remember trying this with the mutate function too, so it should also be possible via the tidyverse. I did this in my seminar paper for "Introduction to R". I'm going to look this up and get back to you!
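To illustrate the reference-category point, here is a minimal sketch on simulated data. The column names `didVote` and `dutyVote` follow the thread, but the level labels and values are made up. It uses base R's `relevel()` as one way to pick the reference, which is not necessarily what was used in the seminar paper. Note that R's default coding is treatment contrasts (first level as reference) for unordered factors; ordered factors get polynomial contrasts instead.

```r
# Hypothetical toy data: names follow the thread, values are simulated.
set.seed(1)
lvls <- c("disagree", "neutral", "agree")
d <- data.frame(
  dutyVote = factor(sample(lvls, 200, replace = TRUE), levels = lvls)
)

# Simulate a binary outcome whose probability rises with approval.
p <- unname(c(disagree = 0.4, neutral = 0.6, agree = 0.8)[as.character(d$dutyVote)])
d$didVote <- rbinom(200, size = 1, prob = p)

# Default: the first level ("disagree") is the reference and gets no coefficient.
fit <- glm(didVote ~ dutyVote, data = d, family = binomial)
names(coef(fit))
# "(Intercept)" "dutyVoteneutral" "dutyVoteagree"

# Choose a different reference level and refit.
d$dutyVote2 <- relevel(d$dutyVote, ref = "agree")
fit2 <- glm(didVote ~ dutyVote2, data = d, family = binomial)
names(coef(fit2))
# "(Intercept)" "dutyVote2disagree" "dutyVote2neutral"
```

Each remaining coefficient is then the difference in log-odds between that level and the reference level.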
Alright, I've found the code I was looking for: As far as interpretation is concerned, in my
Hey, Florian. About interpretation through plots: a scatterplot is undoubtedly convenient to interpret, but I ran a stepwise logistic regression on my glm, so R prints only the results of my glm and no plot. You could of course plot this afterwards, but it is still inevitable to interpret something like R-squared. P.S.: Are you maybe familiar with PCA/EFA (German: Faktorenanalyse)? I ran into another, smaller issue there.
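On plotting a fitted glm afterwards: one common option is to plot predicted probabilities over a grid of predictor values. The sketch below uses simulated data and assumes, purely for illustration, that `dutyVote` is a numeric score from 1 to 5. It is not the model from this issue.

```r
# Simulated, hypothetical data: dutyVote as a 1-5 approval score.
set.seed(7)
n <- 300
dutyVote <- sample(1:5, n, replace = TRUE)
didVote  <- rbinom(n, size = 1, prob = plogis(-1.5 + 0.6 * dutyVote))

fit <- glm(didVote ~ dutyVote, family = binomial)

# Predicted probability of voting at each approval score.
grid <- data.frame(dutyVote = 1:5)
grid$p_hat <- predict(fit, newdata = grid, type = "response")

plot(grid$dutyVote, grid$p_hat, type = "b", ylim = c(0, 1),
     xlab = "dutyVote (approval score)",
     ylab = "Predicted P(didVote = 1)")
```

`type = "response"` returns probabilities; the default `type = "link"` would return log-odds.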
Thanks for the thoroughly described issue @DOC-fau and for the generous suggestions @FWisniewski44! I think @FWisniewski44 hit the nail on the head with the reference category. Two more pointers that you might find interesting:
The statistics stuff, obviously, is a huge rabbit hole, and is a bit out of scope for the class. If you want to dive all in, I can't recommend Richard McElreath's Statistical Rethinking enough. There's also a free YouTube lecture series where he goes through the book.
And yes @DOC-fau I actually do PCA/EFA all the time, so hit me up with any questions! In some ways, it's easier than glm, because it's not inferential statistics.
Hey @DOC-fau, I found the code I was looking for that did the job for me in my seminar paper and helped me a lot with interpreting my I think this could work for you, too, if you want to visualise your findings in a cool and simple, yet clear way. As far as numbers are concerned, I used the And also thanks @maxheld83 for the advice on tidymodels and R. McElreath's book (and the advice on building/interpreting models in general!). If the content is on YouTube, too, that's always a big plus... 😄
Thank you for your input @maxheld83 and @FWisniewski44! In my case I ran a PCA before my regression to reduce the number of independent variables that I suspected of influencing my dependent variable (didVote). I assumed that some variables, like "trust in parliament" and "trust in government", have similar effects on didVote. After the PCA gave me some factors, I built indices like "Trust in political system" and put them into my glm(). Some indices and variables were significant and had good VIF values afterwards. So I'm wondering whether my thinking and approach are right, especially because I occasionally got a warning like "glm.fit: fitted probabilities numerically 0 or 1 occurred" that vanished after I excluded a variable (age).
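A note on that message: it is a warning rather than an error, and it often appears when a predictor (or a combination of predictors) separates the outcome perfectly or almost perfectly. Whether `age` does that in the data discussed here is unknown. The sketch below only reproduces the warning on a hypothetical toy example so the pattern is recognisable.

```r
# Hypothetical toy data in which x separates the outcome perfectly:
# every y = 0 has x <= 4 and every y = 1 has x >= 5.
x <- c(1, 2, 3, 4, 5, 6, 7, 8)
y <- c(0, 0, 0, 0, 1, 1, 1, 1)

# Collect the warnings instead of letting them print.
warned <- character(0)
fit <- withCallingHandlers(
  glm(y ~ x, family = binomial),
  warning = function(w) {
    warned <<- c(warned, conditionMessage(w))
    invokeRestart("muffleWarning")
  }
)

warned              # warning messages raised while fitting
range(fitted(fit))  # fitted probabilities pushed towards 0 and 1
```

A quick check on real data is a cross-tabulation such as `table(predictor, outcome)`: cells with zero counts hint at separation, in which case the affected coefficients and standard errors are not trustworthy.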
This is my first issue on GitHub. Hopefully I'm doing this right. :)
I'm currently working on my first R project (an analysis of people who didn't vote in 2017). After working in SPSS I'm still a little spoiled and inexperienced. My glm() gives me this output (some variables are omitted for simplicity):
With a dependent variable didVote (NAs are dropped), which indicates whether one voted in the BTW 2017:
and an example of one of my independent variables, dutyVote, which captures approval of the statement "In a democracy it is everyone's duty to vote regularly." (I'm assuming that dutyVote is metric (Likert scale).):
It might be very difficult to interpret this output without context, but my question is more theoretical.
First, I'd like to know why my glm() doesn't print every item but only the three you can see above. Is it due to missing significance of some items?
Second, I'd like to know if the following interpretation would be correct:
"The higher the approval of 'In a democracy it is everyone's duty to vote regularly.', the more likely it is that one voted in 2017."
How would you interpret this correlation if some items are insignificant?
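For the interpretation question, it may help to recall that coefficients of a binomial `glm()` are on the log-odds scale, so exponentiating them gives odds ratios. The sketch below uses simulated data and assumes `dutyVote` is entered as a numeric 1 to 5 score; that assumption is for illustration only and is not confirmed for the data in this issue.

```r
# Simulated, hypothetical data: higher dutyVote raises the chance of voting.
set.seed(42)
n <- 500
dutyVote <- sample(1:5, n, replace = TRUE)
didVote  <- rbinom(n, size = 1, prob = plogis(-1 + 0.5 * dutyVote))

fit <- glm(didVote ~ dutyVote, family = binomial)

# Coefficients are log-odds; exponentiate to obtain odds ratios.
odds_ratios <- exp(coef(fit))

# Wald confidence intervals, exponentiated to the odds-ratio scale.
ci <- exp(confint.default(fit))

cbind(odds_ratio = odds_ratios, ci)
```

An odds ratio above 1 for `dutyVote` means each additional point of approval multiplies the odds of having voted by that factor, which matches the proposed reading when the coefficient is positive. If the interval for a term includes 1, the data are compatible with no association for that term, so it should not be read as evidence of an effect in either direction.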
This is only an example of my glm(). I will probably recode some variables, because I'm not positive about the scales of measurement (e.g. dutyVote).
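The scale of measurement matters for the output because it decides how many terms R creates for a predictor; significance plays no part in which terms are printed. The sketch below shows the three codings for a hypothetical four-level item (the level labels are made up), using R's default contrast settings.

```r
# Hypothetical four-level item, coded three different ways.
lvls <- c("disagree", "neutral", "agree", "fully_agree")
x_unordered <- factor(rep(lvls, 5), levels = lvls)
x_ordered   <- factor(rep(lvls, 5), levels = lvls, ordered = TRUE)
x_numeric   <- as.integer(x_ordered)

# Numeric (metric): a single slope.
colnames(model.matrix(~ x_numeric))
# "(Intercept)" "x_numeric"

# Unordered factor: k - 1 dummies against the first level.
colnames(model.matrix(~ x_unordered))
# "(Intercept)" "x_unorderedneutral" "x_unorderedagree" "x_unorderedfully_agree"

# Ordered factor: k - 1 polynomial terms (linear, quadratic, cubic).
colnames(model.matrix(~ x_ordered))
# "(Intercept)" "x_ordered.L" "x_ordered.Q" "x_ordered.C"
```

So a factor with k levels always yields k - 1 coefficients, and `str()` or `class()` on the column shows which of these codings `glm()` will apply.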
Many results of a previous search don't use a metric level of measurement. If you want to reproduce my glm(), I could attach my R script and the data frame.
Thanks for any answer! :)