Sunday, December 19, 2010

Lessons Observed: Learning Bayesian Methods

I've been working with one of my students on a project that involves identifying a proper probability distribution and parameters for a fairly complex and diverse data set. As we did our literature review, one thing that was very unsatisfying was that many published papers either used data that would be unavailable at the time it was needed, or employed magic numbers (arbitrarily chosen constants) as part of their method. During that review we discovered applications of Bayesian methods, but neither of us had any experience using them. At the same time, my PhD student had a problem that we uncovered during his proposal presentation: he needed another course. Solution: an independent study on Bayesian methods with the three of us.

We used Carlin and Louis, Bayesian Methods for Data Analysis, as our basic text, with Albert, Bayesian Computation with R, as a supplement. The alternative to Carlin and Louis would be Gelman et al., Bayesian Data Analysis. We chose Carlin and Louis because it seemed more technical, while Gelman et al. is aimed at social scientists (as opposed to the mathematical disciplines we came from). (Note: all of these require some programming in R.)

While doing this we were also trying out various Markov chain Monte Carlo (MCMC) toolkits. The strongest programmer worked with MCMCpack, the least experienced student used WinBUGS, and I used JAGS.

Lessons learned:

1. For an independent study, I should be more forceful about making students do the exercises. By the time we were done, I had implemented many of the models, but I don't think my students had.

2. Carlin was good to work with. I got the instructor's solutions guide directly from him (although I did not use it), and I identified a problem in one of the data files for one of the case studies.

3. Of the three toolkits, JAGS was the only one we got to work well. We had a hard time formulating models in MCMCpack. WinBUGS would work, but only for interactive use: if you called it from R, it would open its own window to do its work, which is a lot of overhead. We needed something that could be used as a callable library, because we had to apply this to thousands of cases.
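As a sketch of the kind of batch workflow this pointed us toward (the model and variable names here are purely illustrative, assuming the rjags package as the R interface to JAGS):

```r
library(rjags)  # R interface to JAGS; assumes JAGS itself is installed

# An illustrative model: normal likelihood with vague priors.
model_string <- "
model {
  for (i in 1:N) {
    y[i] ~ dnorm(mu, tau)
  }
  mu  ~ dnorm(0, 1.0E-6)
  tau ~ dgamma(0.001, 0.001)
}
"

fit_one_case <- function(y) {
  jm <- jags.model(textConnection(model_string),
                   data = list(y = y, N = length(y)),
                   n.chains = 2, quiet = TRUE)
  update(jm, 1000)                          # burn-in
  coda.samples(jm, c("mu", "tau"), n.iter = 5000)
}

# Because everything runs in-process, looping over thousands of
# data sets needs no GUI and no per-case window:
# results <- lapply(list_of_cases, fit_one_case)
```

This in-process, no-window behavior is exactly what made JAGS usable for us where WinBUGS was not.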

4. There was a benefit to involving my students in learning this field: because I knew nothing about it, I could model for them the process of learning a new field of knowledge.

Outcomes

1. The project is turning out to be successful. We're doing a comparative performance evaluation now, and our approach does considerably better than the other methods in the literature. The fact that Bayesian methods blend expert knowledge and historical data in a systematic way gives the approach considerable face validity.
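A toy illustration of that blending (the numbers are hypothetical, not from our project): expert opinion encoded as a conjugate Beta prior combines with historical count data in closed form, so the posterior is a systematic compromise between the two.

```r
# Expert opinion encoded as a Beta(2, 8) prior (prior mean 0.2);
# historical data: 30 successes out of 100 trials (proportion 0.3).
a <- 2; b <- 8
k <- 30; n <- 100

# Conjugate update: posterior is Beta(a + k, b + n - k)
post_a <- a + k
post_b <- b + n - k

# Posterior mean lands between the prior mean and the data
# proportion, weighted by their relative amounts of information.
post_mean <- post_a / (post_a + post_b)
```

The more historical data there is, the more the posterior is dominated by the data; with little data, the expert prior carries more weight.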

2. The student I was working with is going back to her home university (outside the U.S.) with the expectation that she will introduce Bayesian methods to the faculty and other grad students in her statistics department.

All in all, I think this experience was successful. Not that I am now an expert in Bayesian methods, but the work has produced very good results that I expect to see running on live data in the near future, along with some insights into the situations where Bayesian methods are more useful than in most applications.