Comments on Babies Learning Language: "Mixed effects models: Is it time to go Bayesian by default?" (post by Michael Frank)

When you remove compilation time, brms will be faster than rstanarm on almost any multilevel model, because the Stan code can be hand-tailored to the user's input. For any non-trivial multilevel model, estimation will take a few minutes, and at that time frame brms will usually be faster even when compilation time is included. Why do people continue to think the likelihood is implemented in a more optimal way in rstanarm?
-- Anonymous (2018-03-12 16:09)

That would be great (collaboration)! We have two completed meta-analyses, and we are working on a third one. Why not visit us in Potsdam sometime this year, and we can talk and plan this out? If this becomes more widespread in psycholinguistics, people will adopt the standards you folks have developed.
-- Shravan Vasishth (2018-03-02 20:50)

Henrik, sorry I didn't realize your contributions to bridgesampling (writing fast)! This is very helpful detail on the technical issues as well. I will have to read more on this, thank you!
-- Michael Frank (2018-03-02 13:58)

The bridgesampling R package developed by Quentin Gronau, myself, and E. J. Wagenmakers (https://cran.r-project.org/package=bridgesampling) indeed allows one to calculate Bayes factors for all models fitted in Stan "without tears". It also works directly with models fitted with brms and rstanarm. However, IMHO these packages currently do not allow one to specify priors that are appropriate for Bayes-factor-based model selection. So whereas I agree that we have solved the technical problem of how to get Bayes factors, the conceptual problem of specifying adequate priors remains. But I hope there will be progress on this front this year.

Furthermore, even when one is interested only in estimation or measurement, there is a somewhat subtle issue in a Bayesian setting with categorical covariates that have more than two factor levels. For most coding schemes, the priors do not have the same effect on all factor levels. For example, for contr.sum the prior is more diffuse for the last level than for all the others. For contr.helmert the prior becomes gradually more diffuse for each subsequent factor level, with the exception of the first two.
Thus, to make sure that the results are not biased and do not depend on which factor level is 'first' or 'last', a different type of coding scheme must be used. One such orthonormal coding scheme was developed by Rouder, Morey, Speckman, and Province (2012, JMP) and is, AFAIK, the one used internally by the BayesFactor package.
-- Henrik Singmann (2018-03-02 13:22)

Thanks - let me know if you would like to collaborate on this. In principle, it would not be too difficult to import and visualize data of a different type. The difficulty is really in getting the meta-analytic data! Each of the MAs for development is essentially a full, labor-intensive paper being written by one of the collaborators...
-- Michael Frank (2018-03-02 10:46)

Yes, sorry, I was actually thinking of Barr's comment on Twitter when I wrote this, which suggests that's the only reason:

https://twitter.com/dalejbarr/status/969573499349159941

But it's Twitter; I can't hold it against Dale if he didn't mean that either.
-- Shravan Vasishth (2018-03-02 10:42)

Yeah, that MetaLab thing is simply amazing. Why isn't this the norm in all fields now? I want to do something similar with interference studies, but I need to find some time to create such an online tool.
-- Shravan Vasishth (2018-03-02 10:40)
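[One way to see what Henrik means above by an orthonormal coding scheme (Rouder, Morey, Speckman, & Province, 2012) is to construct one in base R. This is a sketch of the general construction (a centering matrix followed by QR decomposition), not necessarily the exact matrix the BayesFactor package uses internally:]

```r
# Centered, orthonormal contrasts for a k-level factor: every contrast sums to
# zero, and the contrasts are mutually orthogonal with unit length, so a
# symmetric prior treats all factor levels the same (no special 'last' level).
k <- 4
centering <- diag(k) - 1 / k            # projects level means onto deviations from the mean
Q <- qr.Q(qr(centering))[, 1:(k - 1)]   # orthonormal basis of that (k-1)-dimensional space

round(colSums(Q), 12)    # all zero: each contrast is centered
round(crossprod(Q), 12)  # identity matrix: the contrasts are orthonormal

# Q can then be assigned as a contrast matrix, e.g. contrasts(d$group) <- Q,
# where d$group is a factor with k levels (hypothetical names).
```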
Rant away, Shravan! This is my cause too.

My lab has taken a number of steps to try to work on these issues. First, we preregister the key analyses for essentially every study we do. This is critical for frequentist statistics, but I believe it is important for avoiding this inflation issue regardless of estimation method. Second, we have been actively working, through projects like MetaLab (http://metalab.stanford.edu) and ManyBabies (http://manybabies.stanford.edu), to compare the state of the literature to unbiased, large-scale estimates of key effects.
-- Michael Frank (2018-03-02 10:00)

To be honest, I wasn't sure whether you are among the group of psychologists I am referring to. I'm relieved to hear you are not! But if you do care what the magnitude of the effect is, and not just that it is significant, then when power is low it simply doesn't matter what model one fits: maximal, minimal, whatever. This is because any significant effect in a low-power study is *guaranteed* to be an overestimate (I mean a 100% probability of being 2-7 times larger than the true effect). Why would anyone care to publish that in a top journal as big news? Yet that is exactly what Cognition and JML routinely publish. I elaborate on this point, the great need to focus on whether the effect is accurately estimated and the importance of paying attention to the imprecision of the estimate, in this paper: https://psyarxiv.com/hbqcw
I did fit maximal models throughout :)

Given that most published studies in psycholinguistics (I don't know about child language acquisition) are heavily underpowered (see Appendix B of http://www.ling.uni-potsdam.de/~vasishth/pdfs/JaegerEngelmannVasishthJML2017.pdf for interference studies), I don't know why anyone even cares whether the model is maximal or not. Whatever we are finding out from the data is misleading us, either by giving us an exaggerated effect size, or the wrong sign, or both. I wish the authors of the "Keep It Maximal" paper had sent a high-impact message like "Keep It Powerful". Right now, people are continuing to run underpowered studies and publish them in Cognition and JML, worrying about whether their model is maximal or not when lack of power is the real problem. Meehl and Cohen came and went, and had no impact on psychology.

I know you probably know all this; I am just engaging in a start-of-weekend rant.
-- Shravan Vasishth (2018-03-02 09:56)

I can't tell if you're including me in "when psychologists say." I care *what the effect is* - which to me means, what is my best estimate of its magnitude. I'm not really that interested in |t| > 2 or whatever.

I was also convinced years ago by Andrew Gelman's arguments about the importance of including theoretically meaningful but non-significant predictors in models. If you know the true generative structure of the data, you should model it, even if you don't necessarily have the power to make strong decision-theoretic inferences about the importance of every single predictor in the model fit.
-- Michael Frank (2018-03-02 09:12)
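[Michael's point above about modeling the generative structure can be made concrete with a small simulation. This is an illustrative sketch, not an analysis from the post: lme4 is assumed, and all names and numbers are invented, though the column names rt, c, and subj follow the formulas used later in the thread:]

```r
# Illustrative sketch: simulate data whose true generative structure includes
# by-subject variation in the effect of condition c, then fit models with and
# without that (possibly "non-significant") random slope.
library(lme4)

set.seed(1)
n_subj  <- 30
n_trial <- 20                                          # trials per condition per subject
subj <- factor(rep(seq_len(n_subj), each = 2 * n_trial))
cond <- rep(c(-0.5, 0.5), times = n_subj * n_trial)    # centered condition code c
slope_dev <- rnorm(n_subj, sd = 30)                    # true by-subject slope deviations
rt <- 400 + 10 * cond + slope_dev[as.integer(subj)] * cond +
  rnorm(n_subj * 2 * n_trial, sd = 80)
d <- data.frame(rt = rt, c = cond, subj = subj)

# Intercept-only model vs. the model matching the generative structure:
m_min <- lmer(rt ~ c + (1 | subj), data = d)
m_max <- lmer(rt ~ c + (1 + c | subj), data = d)

# The maximal model's standard error for the fixed effect of c reflects the
# slope variation; the intercept-only model's SE tends to be too small here.
summary(m_max)$coefficients
```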
Hi Henrik,

Good points regarding rstanarm - I will go back and do some comparisons!

Agreed about hypothesis testing. There is Wagenmakers' method for computing Bayes factors "without tears", but I haven't experimented with it. Do you have a view? Personally, the research I do is more about measurement and less about hypothesis testing, so I often care about measuring an effect with some precision rather than testing whether it is "there."
-- Michael Frank (2018-03-02 09:10)

I hope the post above suggests that I am certainly not *only* switching because I want to fit a maximal model. Rather, I am switching because I stuck with frequentist LMEMs while thinking "well, at least they're easy to fit." Now that that advantage is gone, I'm very happy to go Bayesian on philosophical grounds!
-- Michael Frank (2018-03-02 09:08)

Doug, when psychologists say "c has an effect", they mean that the p-value is less than 0.05, or that the absolute t is greater than 2.
I think that by "c has an effect" you mean "including it as a fixed effect in the model"?
-- Shravan Vasishth (2018-03-02 09:04)

Not only that: if the *only* reason you are switching to brms or rstanarm is that you want to fit a maximal model no matter what, Julia will fit the models much faster. Dave Kleinschmidt can probably help psychologists get started with that; I think he's active on the Julia MixedModels GitHub repo.
-- Shravan Vasishth (2018-03-02 08:57)

I would like to add two different thoughts to this discussion:

1. If the goal is to fit a model that is essentially equivalent to one that could be estimated with lme4::lmer or lme4::glmer, there is IMHO not much to be gained by using brms over rstanarm; rather, rstanarm seems preferable. For those cases that simply require Bayesian estimation (i.e., regularization via prior distributions on top of the regularization provided by the hierarchical structure), rstanarm::stan_lmer and rstanarm::stan_glmer implement exactly the same likelihood (minus the prior) as their lme4 equivalents. And in comparison to brms, rstanarm is faster for two reasons: first, rstanarm does not require compilation, and second, the likelihood is implemented in a somewhat more optimal way (I am sure about the first of these reasons and kind of sure about the second).

2. One problem that is not yet fully solved in a Bayesian framework is hypothesis testing (there are good arguments for Bayes factors, but these are not easily available with appropriate priors for such models). In cases in which categorical covariates (i.e., factors) have only two levels, as in the current example, this is not a big problem: one can simply inspect the posterior to learn about the difference between the factor levels. Of course, appropriate orthogonal contrasts (e.g., contr.sum) need to be used if the categorical covariates are in an interaction (in the same way that 0 should be meaningful if age is a continuous covariate). However, when a factor has more than two levels, simply inspecting the posterior often becomes difficult, especially if the variable is in an interaction. The reason is that a categorical covariate with k levels is transformed into k-1 parameters; for example, a factor with three levels is transformed into two parameters. If the intercept is the grand mean, then each parameter contains information pertaining to more than one factor level, so simply inspecting the posteriors of those parameters does not provide much easily interpretable information.
-- Henrik Singmann (2018-03-01 13:04)

As a psychologist, the thing I care about in many of my studies is estimating the effect of c that is general across a population of participants. So assuming an effect of c by subject, drawn from a normal distribution, seems to me quite reasonable. Then I want the person-independent effect of c in the fixed effect.
In part, the reasonableness of the random slope comes from the fact that I don't typically have the precision to measure each subject's c effect very well, so I will see some variation through measurement error even if the true effect of c is zero. But in general, for any psychological manipulation, it's pretty safe to assume some variation...
-- Michael Frank (2018-03-01 10:12)

I'll contribute one - thanks!
-- Michael Frank (2018-03-01 10:08)

I was checking in the documentation for the MixedModels package (http://dmbates.github.io/MixedModels.jl/latest/) whether I had an example of fitting a model in both lme4 and MixedModels. I don't at present. If you or another researcher could suggest a data set and model to use for demonstration, I would be happy to write up a description of how to fit such models in Julia.
-- Douglas Bates (2018-03-01 09:57)

The point that is lost in discussions of maximal models is the relationship between interactions and main effects. It is often difficult to make sense of a significance test for a main effect in the presence of a non-negligible interaction. In a model of the form

rt ~ 1 + c + (1|subj) + (1|item)

testing whether c is significant is a test of the condition effect taking into account variation between subjects and between items. On the other hand, fitting the model

rt ~ 1 + c + (1 + c|subj) + (1 + c|item)

assumes that subjects and items have an effect both on the overall level of the response and on the change with condition. What does the main effect of c mean in such a case? You are asking "given that we know that c has an effect that varies from subject to subject and from item to item, does c have an effect?" The answer is obviously "yes": you have assumed that it has an effect.

This is why statisticians tend to structure tests from higher order to lower order: if you conclude that you have a non-negligible interaction, you don't test for the main effect.
-- Douglas Bates (2018-03-01 09:51)
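[The two formulas Doug contrasts above can be written out as lme4 calls. This is a sketch assuming a data frame d with columns rt, c, subj, and item (d is a hypothetical name; the others are the placeholder names from the comment):]

```r
# The two random-effect structures Doug contrasts, as lme4 calls:
library(lme4)

m_int <- lmer(rt ~ 1 + c + (1 | subj) + (1 | item), data = d)          # random intercepts only
m_max <- lmer(rt ~ 1 + c + (1 + c | subj) + (1 + c | item), data = d)  # plus random slopes for c

# A likelihood-ratio comparison of the two random-effect structures
# (anova() refits both models with ML before comparing):
anova(m_int, m_max)
```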
Thanks very much for the comments, Doug.

Regarding fitting models in Julia: I will try that out - it seems like a very interesting option. As I mentioned in the post, I'm mostly concerned here with what to teach as a default for students doing very straightforward experiments. I'll have to play around, but I worry that moving data into an entirely new language might not be an ideal solution (knowing that there are other benefits to Julia that caused you and others to switch).

Regarding HLM vs. LMEM: yes, of course, you are right. I was trying to simplify and went too far in that direction. In part because lme4 has - thanks to your work - made these distinctions unnecessary for most users, I don't tend to emphasize the differences between nested and crossed effects in the way I would if choices on this front required users to switch software packages!
-- Michael Frank (2018-03-01 09:45)

One minor point of terminology: a mixed-effects model with random effects for subject and for item is not a hierarchical linear model. "Hierarchical" means that the grouping factors for the random effects are nested; that is, they form a hierarchy. One of the basic design objectives of lme4 was to be able to fit models with crossed (each subject is exposed to each item) or partially crossed (each student is taught over time by one or more different teachers) random effects.
-- Douglas Bates (2018-03-01 09:34)

There are many issues of statistical modeling in this post that I would debate, but I would mention first that if lme4 users are concerned about convergence, they should try the MixedModels package for Julia. Combined with the RData or RCall packages, it is possible to take data from R and fit the model in Julia fairly easily.
-- Douglas Bates (2018-03-01 09:28)