R Code: Bayesian LASSO for Absorption Wavelength Prediction of Microbial Rhodopsins
Description
Microbial rhodopsins are photoreceptive membrane proteins, which are used as molecular tools in optogenetics.
We introduced a machine learning (ML)-based experimental design method for screening rhodopsins that are likely to be red-shifted from representative rhodopsins in the same subfamily.
This page provides our R code for fitting a Bayesian LASSO model whose input is a set of amino acid sequences and whose output is their absorption wavelengths. The sequences in the dataset must be aligned so that they all have the same length.
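To illustrate why the sequences must have equal length, here is a minimal sketch of turning aligned sequences into a numeric feature matrix. It assumes a one-hot (binary) encoding with one column per position and observed residue. That encoding, the function name one_hot_encode, and the toy sequences are illustrative assumptions, not necessarily what the distributed code does.

```r
# Hypothetical toy alignment: three sequences of equal length.
seqs <- c(a = "MKTA", b = "MRTA", c = "MKSG")

# One-hot encode aligned sequences (illustrative helper, not part of the
# distributed code): one binary column per (position, observed residue) pair.
one_hot_encode <- function(seqs) {
  if (length(unique(nchar(seqs))) != 1) {
    stop("Sequences must be aligned to the same length.")
  }
  # Rows are sequences, columns are alignment positions.
  chars <- do.call(rbind, strsplit(seqs, ""))
  blocks <- lapply(seq_len(ncol(chars)), function(j) {
    residues <- sort(unique(chars[, j]))
    # Logical indicator matrix converted to integers (0/1).
    block <- outer(chars[, j], residues, "==") + 0L
    colnames(block) <- paste0("pos", j, "_", residues)
    block
  })
  do.call(cbind, blocks)
}

X <- one_hot_encode(seqs)
print(dim(X))
```

Each row of X then has exactly one nonzero entry per alignment position, which is only well defined when every sequence has the same number of positions. In practice the sequences would be read from an alignment file (for example with read.alignment from 'seqinr') rather than typed inline.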
The code depends on three R packages, 'seqinr', 'pracma', and 'monomvn', which can be installed with
> install.packages(c("seqinr","pracma","monomvn"))
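If some of these packages may already be installed, a small check such as the following sketch avoids reinstalling them. The helper name missing_packages is hypothetical, and the install step is only triggered in an interactive session.

```r
# Packages named on this page.
required <- c("seqinr", "pracma", "monomvn")

# Return the subset of `pkgs` that cannot be loaded from the current
# library paths (illustrative helper, not part of the distributed code).
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

to_install <- missing_packages(required)
if (length(to_install) > 0) {
  message("Missing packages: ", paste(to_install, collapse = ", "))
  # Only install when a user is present to confirm mirrors and prompts.
  if (interactive()) install.packages(to_install)
}
```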
In the R interpreter, you can run the example code with
> source("example.R")
This example draws 500 samples from the Bayesian LASSO posterior distribution using 100-dimensional features, which are obtained by applying PCA to the original 432-dimensional features.
Note that, to keep the computation fast, this setting differs from the one used in the paper (see example.R to reproduce the paper's setting).
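The overall workflow (PCA reduction followed by posterior sampling) can be sketched as follows. This is not the content of example.R: it uses small synthetic data in place of the rhodopsin dataset, base R's prcomp for the PCA step, and blasso from 'monomvn' with RJ = FALSE and normalize = FALSE. The data sizes, the burn-in of 100 samples, and these argument choices are all assumptions made for illustration.

```r
set.seed(1)

# Hypothetical synthetic data standing in for the real features and wavelengths.
n <- 60   # number of sequences
p <- 40   # original feature dimension (432 in the example on this page)
k <- 10   # reduced dimension (100 in the example on this page)
X <- matrix(rbinom(n * p, 1, 0.3), n, p)
beta_true <- c(rep(5, 3), rep(0, p - 3))
y <- 520 + drop(X %*% beta_true) + rnorm(n, sd = 2)

# Project the features onto the first k principal components.
pca <- prcomp(X, center = TRUE, scale. = FALSE)
Z <- pca$x[, seq_len(k), drop = FALSE]

if (requireNamespace("monomvn", quietly = TRUE)) {
  # Draw 500 posterior samples; RJ = FALSE disables model selection moves.
  fit <- monomvn::blasso(Z, y, T = 500, RJ = FALSE,
                         normalize = FALSE, verb = 0)

  # Discard an assumed burn-in of 100 samples.
  keep <- 101:500

  # One predicted wavelength per (posterior sample, sequence) pair:
  # intercept sample plus the linear term for that sample.
  lin <- fit$beta[keep, , drop = FALSE] %*% t(Z)
  pred <- sweep(lin, 1, fit$mu[keep], "+")

  # Posterior predictive mean and spread for each sequence.
  y_hat <- colMeans(pred)
  y_sd <- apply(pred, 2, sd)
  print(head(cbind(observed = y, predicted = y_hat, sd = y_sd)))
}
```

Keeping the per-sample predictions in pred, rather than only their mean, is what makes it possible to estimate quantities such as the posterior probability that a sequence exceeds a given wavelength.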
Implementation
The code can be downloaded here.
References
- Keiichi Inoue, Masayuki Karasuyama, Ryoko Nakamura, Masae Konno, Daichi Yamada, Kentaro Mannen, Takashi Nagata, Yu Inatsu, Hiromu Yawo, Kei Yura, Oded Béjà, Hideki Kandori, Ichiro Takeuchi, Exploration of natural red-shifted rhodopsins using a machine learning-based Bayesian experimental design, Communications Biology, to appear.
This page is managed by Masayuki Karasuyama (karasuyama [at] nitech.ac.jp),
Department of Computer Science, Nagoya Institute of Technology, Japan