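Daniela Witten became an assistant professor at 26 and was named by Forbes as one of the 30 standouts under 30 in science. Both of her parents are professors at Princeton, her sister is an assistant professor at Princeton, her father won the Fields Medal, and her husband is one of Facebook's early employees.
Summary: research is a game for people who are both smart and rich.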
The Lasso Page is maintained by the inventor of the lasso and lists the most important references on it.
The book Elements of Statistical Learning (pdf) describes the lasso in detail.
Lasso in R: lars (Least Angle Regression, Lasso and Forward Stagewise) and glmnet (Lasso and elastic-net regularized generalized linear models). Note: the lars() function from the lars package is probably much slower than glmnet() from glmnet.
Adaptive lasso in R
adaptive.lasso function in lqa package (Penalized Likelihood Inference for GLMs)
adalasso function in parcor package (Regularized estimation of partial correlation matrices)
Graphical lasso in R (glasso: Graphical lasso estimation of Gaussian graphical models)
The joint graphical lasso paper
Joint graphical lasso in R (JGL: Performs the Joint Graphical Lasso for sparse inverse covariance estimation on multiple classes)
I want to compute the p-value from the joint cumulative distribution of an n-dimensional order statistic.
One efficient way is to use the following recursive formula.
However, the facts are (or would be):
In statistics, Monte Carlo simulation is a “quick” way to evaluate complicated formulas. By “quick”, I mean that I can see the results without knowing or deriving the “ugly” math. Computationally, it is actually a very “slow” method.
Anyway, the R function is here.
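This is not the R function mentioned above, just a rough Python sketch of the “quick to write, slow to run” point. It assumes the n values are i.i.d. Uniform(0, 1) draws and that the thresholds r are sorted in ascending order; both are my assumptions, since the post does not spell them out. The recursion in joint_cdf_exact is one known recursion for this uniform case and may not be the exact formula the post refers to. The function names are made up for illustration.

```python
import math
import random


def joint_cdf_exact(r):
    """P(U_(1) <= r[0], ..., U_(n) <= r[n-1]) for n iid Uniform(0,1) draws.

    Assumes r is sorted in ascending order with values in [0, 1].
    Uses V_0 = 1, V_k = sum_{i=1..k} (-1)^(i-1) * V_{k-i} / i! * r_{n-k+1}^i,
    and returns n! * V_n.
    """
    n = len(r)
    v = [1.0]
    for k in range(1, n + 1):
        rk = r[n - k]  # r_{n-k+1} in 1-based notation
        v.append(sum((-1) ** (i - 1) * v[k - i] / math.factorial(i) * rk ** i
                     for i in range(1, k + 1)))
    return math.factorial(n) * v[n]


def joint_cdf_monte_carlo(r, n_sim, rng):
    """Estimate the same probability by brute-force simulation."""
    n = len(r)
    hits = 0
    for _ in range(n_sim):
        u = sorted(rng.random() for _ in range(n))
        if all(u[i] <= r[i] for i in range(n)):
            hits += 1
    return hits / n_sim


r = [0.2, 0.5]
print(joint_cdf_exact(r))
print(joint_cdf_monte_carlo(r, 20000, random.Random(1)))
```

For n = 2 the probability can be checked by hand: it is 2 * r1 * r2 - r1^2, so both numbers should land near 0.16 for the thresholds above. The simulation needs thousands of draws to get two decimal places right, while the recursion is a handful of multiplications.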
I have been doing some research on coexpression networks. “Coexpression” means that genes have similar expression profiles across different conditions or tissues. In the network, genes are nodes, and a “coexpression” relationship between two genes is represented as an edge. Coexpressed genes may be involved in similar pathways or biological processes.
In a small part of my research, I am testing some algorithms for detecting coexpression relationships. One way to test an algorithm is simulation. In an ideal (simple) case, the expression values of two coexpressed genes can be treated as following a bivariate normal distribution. Generating expression values for such a gene pair, or for a group of genes with a given correlation coefficient, then amounts to simulating a multivariate normal distribution. The MASS library in R has a function, mvrnorm, that does this, but it requires a covariance matrix.
The function below first generates the covariance matrix so that mvrnorm can be used. Because we only know the correlation coefficient, i.e. the coexpression relationship (degree), the mean and variance of each gene’s expression profile are randomly generated in the function. Then the matrix can be calculated as follows.
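The original R function is not reproduced here. As a stand-in, here is a minimal Python sketch of the same idea, under my own assumptions: every pair of genes shares one correlation coefficient rho, the covariance between genes i and j is rho * sd_i * sd_j, and the sampling step that mvrnorm would do is replaced by a hand-written Cholesky factorization. The function names and the ranges used for the random means and standard deviations are hypothetical.

```python
import math
import random


def covariance_from_correlation(rho, sds):
    """Cov[i][j] = rho * sd_i * sd_j off the diagonal, sd_i**2 on it."""
    n = len(sds)
    return [[sds[i] * sds[j] * (1.0 if i == j else rho) for j in range(n)]
            for i in range(n)]


def cholesky(a):
    """Lower-triangular L with L * L^T = a (a must be positive definite)."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low


def simulate_expression(n_samples, means, cov, rng):
    """Draw n_samples rows from a multivariate normal with the given means and cov."""
    low = cholesky(cov)
    n = len(means)
    rows = []
    for _ in range(n_samples):
        z = [rng.gauss(0.0, 1.0) for _ in range(n)]
        rows.append([means[i] + sum(low[i][k] * z[k] for k in range(i + 1))
                     for i in range(n)])
    return rows


def pearson(x, y):
    """Sample Pearson correlation between two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


rng = random.Random(1)
n_genes, rho = 3, 0.8
means = [rng.uniform(5.0, 10.0) for _ in range(n_genes)]  # hypothetical range
sds = [rng.uniform(0.5, 2.0) for _ in range(n_genes)]     # hypothetical range
data = simulate_expression(5000, means, covariance_from_correlation(rho, sds), rng)
print(pearson([row[0] for row in data], [row[1] for row in data]))
```

The printed sample correlation between the first two simulated genes should be close to the requested rho. One caveat of using a single shared rho: the matrix is only positive definite when rho lies between -1/(n_genes - 1) and 1, so strongly negative values will make the Cholesky step fail.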
A couple of months ago, I moved the website engine from WordPress to Octopress. Many errors turned up in previous posts due to format incompatibility, especially in some Perl scripts.
Please read carefully before using any code or script, and leave a comment if you find some “terrible” error.
Thank you!
[update20130808] now only perl howto related posts
[update20150101] All format issues are (putatively) resolved.
I had been using 000webhost.com to host my website until several days ago, when I noticed my website had been suspended for “violating 20%+ CPU usage limit for more than 1000 times.”
The 000webhost server is good and stable. Most importantly, it’s free. Now I have to move to another good, free web hosting service. GitHub is a good choice, but GitHub does not support WordPress. I tried to move the website to GitHub before, but I was not comfortable writing blog posts in Markdown.
It’s difficult to find another free service that supports WordPress, and lots of people say a static blogging engine is much better than WordPress. Looks like I will stay here for a while.
(to be continued)
Name                Machine cost  Read length (bases)  Cost per megabase
Illumina MiSeq      US$125,000    500                  14–70 cents
Illumina HiSeq      US$690,000    300                  4–5 cents
PacBio RS           US$695,000    4,575                $2–17
Ion Torrent PGM     US$49,000     400                  60 cents–$5
Ion Torrent Proton  US$224,000    200                  1–9 cents
