R Code: Bayesian LASSO for Absorption Wavelength Prediction of Microbial Rhodopsins


Description

Microbial rhodopsins are photoreceptive membrane proteins, which are used as molecular tools in optogenetics. We introduced a machine learning (ML)-based experimental design method for screening rhodopsins that are likely to be red-shifted from representative rhodopsins in the same subfamily.

This page provides our R code for fitting a Bayesian LASSO model whose inputs are amino acid sequences and whose outputs are absorption wavelengths. The sequences in the dataset must be aligned so that they all have the same length. The code depends on three R packages, 'seqinr', 'pracma', and 'monomvn', which can be installed with

> install.packages(c("seqinr","pracma","monomvn")) 
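Because the model requires aligned sequences of equal length, a quick sanity check before fitting may be useful. The following is a minimal sketch in base R and is not part of the distributed code; the toy sequences and the helper name check_aligned are hypothetical.

```r
# Hypothetical toy alignment (gaps written as "-"). Real data would be
# loaded from an aligned FASTA file, e.g. with
# seqinr::read.fasta(file, seqtype = "AA").
seqs <- c(a = "MK-LLV", b = "MKALLV", c = "MR-LIV")

# Return TRUE when every sequence has the same number of characters.
check_aligned <- function(x) {
  length(unique(nchar(x))) == 1
}

check_aligned(seqs)                  # all three entries have length 6
check_aligned(c(seqs, d = "MKAL"))   # one shorter entry breaks the alignment
```

If the check fails, the sequences need to be re-aligned (or padded with gap characters) before they are turned into features.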

In the R interpreter, you can run the example with

> source("example.R")

This example draws 500 samples from the posterior distribution of the Bayesian LASSO using 100-dimensional features, obtained by applying PCA to the original 432-dimensional features. Note that, to keep the computation fast, this setting differs from the one used in our paper (see example.R to reproduce the paper's setting).
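The overall flow described above (PCA to 100 dimensions, then 500 posterior samples) can be sketched as follows. This is an illustrative outline under stated assumptions, not the contents of example.R: the data are synthetic stand-ins, and the choices RJ = FALSE, normalize = FALSE, and the burn-in length are made here only for the sketch.

```r
library(monomvn)  # provides blasso()

set.seed(1)

# Synthetic stand-in for the real data (assumption): n sequences encoded as
# 432 binary features, and one absorption wavelength (nm) per sequence.
n <- 120
X <- matrix(rbinom(n * 432, 1, 0.3), nrow = n)
y <- 540 + as.vector(X[, 1:5] %*% c(10, -8, 6, -4, 2)) + rnorm(n, sd = 2)

# Reduce the features to 100 dimensions with PCA.
# prcomp() centers the columns by default, so the scores have mean zero.
pca <- prcomp(X)
Z <- pca$x[, 1:100]

# Draw 500 posterior samples with the Bayesian LASSO Gibbs sampler.
# RJ = FALSE disables reversible-jump model selection, and
# normalize = FALSE keeps coefficients on the scale of Z
# (both are choices for this sketch only).
fit <- blasso(Z, y, T = 500, RJ = FALSE, normalize = FALSE, verb = 0)

# Discard an arbitrary burn-in and summarize by posterior means.
keep <- 101:500
beta_hat <- colMeans(fit$beta[keep, ])
mu_hat <- mean(fit$mu[keep])

# Predicted wavelengths for the training sequences.
y_hat <- mu_hat + as.vector(Z %*% beta_hat)
```

New sequences would have to be projected with the same PCA rotation (for example via predict(pca, newdata)) before the posterior-mean coefficients are applied to them.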


Implementation

Download is here

References


This page is managed by Masayuki Karasuyama (karasuyama [at] nitech.ac.jp)

Department of Computer Science, Nagoya Institute of Technology, Japan