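Daniela Witten became an assistant professor at 26 and was named by Forbes as one of its 30 Under 30 in science. Both of her parents are professors at Princeton, her sister is an assistant professor at Princeton, her father won the Fields Medal, and her husband was an early Facebook employee.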
The Lasso Page is maintained by the inventor of the lasso and lists the most important references for it.
The book Elements of Statistical Learning (pdf) describes the lasso in detail.
Lasso in R: the lars package (Least Angle Regression, Lasso and Forward Stagewise) and the glmnet package (Lasso and elastic-net regularized generalized linear models). Note: the lars() function from lars is probably much slower than glmnet() from glmnet.
Adaptive lasso in R
adaptive.lasso function in lqa package (Penalized Likelihood Inference for GLMs)
adalasso function in parcor package (Regularized estimation of partial correlation matrices)
Graphical lasso in R (glasso: Graphical lasso, estimation of Gaussian graphical models)
The joint graphical lasso paper
Joint graphical lasso in R (JGL: Performs the Joint Graphical Lasso for sparse inverse covariance estimation on multiple classes)
I want to compute the P-value from the joint cumulative distribution of an n-dimensional order statistic.
One efficient way is to use the following recursive formula.
However, the facts are (or would be):
In statistics, Monte Carlo simulation is a “quick” way to evaluate complicated formulas. By “quick” I mean that I can see the results without knowing or deriving the “ugly” math. Computationally, it is actually a very slow method.
Anyway, the R function is here.
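A minimal sketch of a Monte Carlo estimate of this kind, not the original function. It assumes the null distribution is n independent Uniform(0, 1) values and that the quantity of interest is P(U_(1) <= b_1, ..., U_(n) <= b_n) for a vector of upper bounds `b`; the function name, the bounds and the number of simulations are all illustrative choices.

```r
# Monte Carlo estimate of P(U_(1) <= b[1], ..., U_(n) <= b[n]) for the
# order statistics of n independent Uniform(0, 1) draws.
# ASSUMPTION: the uniform null and the bound vector `b` are illustrative.
mc_order_stat_cdf <- function(b, n_sim = 1e4) {
  n <- length(b)
  # Each replicate: draw n uniforms, sort them, check every bound at once.
  hits <- replicate(n_sim, all(sort(runif(n)) <= b))
  mean(hits)  # proportion of simulated samples that satisfy all bounds
}

set.seed(1)
mc_order_stat_cdf(c(0.5, 1))  # exact value for this case is 1 - 0.5^2 = 0.75
```

The estimate has standard error sqrt(p(1 - p) / n_sim), so small P-values need a large `n_sim`, which is where the slowness comes from.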
I have been doing some research on co-expression networks. “Co-expression” means that genes have similar expression profiles across different conditions or tissues. In the network, genes are nodes, and a co-expression relationship between two genes is represented as an edge. Co-expressed genes may be involved in similar pathways or biological processes.
In a small part of my research, I am testing some algorithms for detecting co-expression relationships. One way to test an algorithm is simulation. In an ideal (simple) case, the expression values of two co-expressed genes can be treated as bivariate normal. Generating expression values for such a gene pair, or for a group of genes, given a correlation coefficient then amounts to simulating from a multivariate normal distribution. The MASS library in R has a function, mvrnorm, to do that, but it requires a covariance matrix.
The function below first generates the covariance matrix needed by mvrnorm. Because we only know the correlation coefficient, i.e. the degree of co-expression, the mean and variance of each gene’s expression profile are randomly generated inside the function. The matrix can then be calculated as follows.
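A minimal sketch of the idea, not the original listing. It assumes one correlation coefficient `rho` shared by every gene pair and uses the identity Sigma[i, j] = rho * sd_i * sd_j (with Sigma[i, i] = sd_i^2). The function name and the sampling ranges for the means and standard deviations are hypothetical.

```r
# Build a covariance matrix from a single target correlation `rho` shared by
# all gene pairs, with per-gene means and standard deviations drawn at random.
# ASSUMPTION: the sampling ranges for means and sds are arbitrary choices.
make_coexpr_params <- function(n_genes, rho,
                               mean_range = c(5, 10), sd_range = c(0.5, 2)) {
  # An equicorrelation matrix is positive definite only for
  # -1 / (n_genes - 1) < rho < 1.
  stopifnot(n_genes >= 2, rho > -1 / (n_genes - 1), rho < 1)
  mu  <- runif(n_genes, mean_range[1], mean_range[2])
  sds <- runif(n_genes, sd_range[1], sd_range[2])
  R <- matrix(rho, n_genes, n_genes)  # correlation matrix
  diag(R) <- 1
  Sigma <- outer(sds, sds) * R        # Sigma[i, j] = R[i, j] * sd_i * sd_j
  list(mu = mu, Sigma = Sigma)
}

set.seed(1)
p <- make_coexpr_params(n_genes = 3, rho = 0.8)
if (requireNamespace("MASS", quietly = TRUE)) {
  # 200 simulated samples (rows) for 3 co-expressed genes (columns)
  expr <- MASS::mvrnorm(n = 200, mu = p$mu, Sigma = p$Sigma)
  round(cor(expr), 2)  # off-diagonal entries should be near 0.8
}
```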
A couple of months ago, I moved the website engine from WordPress to Octopress. Many errors appeared in earlier posts because of format incompatibilities, especially in some Perl scripts.
Please read carefully before using any code or script, and leave a comment if you find some “terrible” error.
[update-2013-08-08] Only the Perl how-to posts still have format issues.
[update-2015-01-01] All format issues are (putatively) resolved.
I had been using 000webhost.com to host my website until several days ago, when I noticed my website had been suspended for “violating 20%+ CPU usage limit for more than 1000 times.”
The 000webhost server is good and stable. Most importantly, it’s free. Now I have to transfer to another good, free web hosting service. GitHub is a good choice, but GitHub does not support WordPress. I tried to transfer the website to GitHub before, but I am not comfortable writing blog posts in Markdown.
It’s difficult to find another free service that supports WordPress, and lots of people say static blogging engines are much better than WordPress. Looks like I will stay here for a while.
TechCrunch: With $6.25M In Tow, Bina Technologies Wants To Bring Big Data Insight To Genomic Sequencing
(to be continued)
| Name | Machine cost | Read length (bases) | Cost per megabase |
|---|---|---|---|
| Illumina MiSeq | US$125,000 | 500 | 14–70 cents |
| Illumina HiSeq | US$690,000 | 300 | 4–5 cents |
| Ion Torrent PGM | US$49,000 | 400 | 60 cents–$5 |
| Ion Torrent Proton | US$224,000 | 200 | 1–9 cents |