LRRsearch is a service-based application that identifies leucine-rich repeat (LRR) regions in primary protein sequences. It uses a position-specific scoring matrix (PSSM), based on the highly conserved segment (HCS) of the LRR region, to find statistically significant LRRs in a sequence. The PSSM is a matrix of score values constructed from the eleven-residue HCS of more than 2300 non-redundant LRR motifs. These motifs were extracted semi-automatically from two publicly available databases: LRRML and the in-house NLR family database, NLRdb. The server uses Asynchronous JavaScript and XML (AJAX) to provide an asynchronous web application. LRRsearch accepts a protein sequence in a single format and returns a list of the LRR region(s) present in that sequence. As an additional feature, the LRRsearch web server is integrated with the LRRML BLAST search tool.
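To make the idea of a PSSM concrete, the following minimal Python sketch builds a log-odds matrix from a handful of aligned 11-residue fragments. It is an illustration only, not the LRRsearch implementation: the function name build_pssm, the three toy motifs, the pseudocount and the uniform background frequencies are all assumptions made for this example.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues

def build_pssm(motifs, pseudocount=1.0, background=None):
    """Build a log-odds PSSM from equal-length motifs of standard residues.

    Returns one dict per motif position, mapping residue -> log2 odds score.
    """
    if not motifs:
        raise ValueError("at least one motif is required")
    width = len(motifs[0])
    if any(len(m) != width for m in motifs):
        raise ValueError("all motifs must have the same length")
    if background is None:
        # Assumption: uniform background; real tools usually use observed
        # amino acid frequencies instead.
        background = {aa: 1.0 / len(AMINO_ACIDS) for aa in AMINO_ACIDS}

    total = len(motifs) + pseudocount * len(AMINO_ACIDS)
    pssm = []
    for pos in range(width):
        counts = Counter(m[pos] for m in motifs)
        column = {}
        for aa in AMINO_ACIDS:
            # The pseudocount keeps unseen residues from scoring -infinity.
            freq = (counts.get(aa, 0) + pseudocount) / total
            column[aa] = math.log2(freq / background[aa])
        pssm.append(column)
    return pssm

# Toy input (made up for illustration, not taken from LRRML or NLRdb).
toy_motifs = ["LTSLDLSNNQL", "LKELNLSHNKI", "LRELDLSSNRL"]
pssm = build_pssm(toy_motifs)
print(len(pssm), "positions x", len(pssm[0]), "residues")
```

With 11-residue motifs the result has 11 positions and 20 residues per position. Residues that are frequent at a position, such as leucine at the first position of the toy motifs, receive positive scores, and rare ones receive negative scores.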

NOD-like receptors (NLRs) are essential intracellular pattern recognition receptors (PRRs) that play a key role in innate immunity by detecting conserved microbial motifs collectively known as pathogen-associated molecular patterns (PAMPs). Structurally, NLRs are large multi-domain proteins with a tripartite architecture: an N-terminal protein-protein interaction domain composed of a PYD (pyrin domain), CARD (caspase activation and recruitment domain) or BIR (baculovirus 'inhibitor of apoptosis' repeat) domain; a central nucleotide-binding oligomerization (NOD) or NACHT domain; and a C-terminal leucine-rich repeat (LRR) domain. Based on their N-terminal domain, NLR proteins are subdivided into five families: NLRA, NLRB1, NLRC, NLRP and NLRX1. Of these five families, NLRC and NLRP are further classified into sub-groups. The C-terminal domain, commonly termed the LRR domain, is composed of repeating stretches of 20-30 amino acids that are unusually rich in the hydrophobic amino acid leucine.


LRRsearch - LRR prediction tool:

The LRRsearch application implements a PSSM to identify LRR motifs in protein sequences. LRR-HCS regions were collected from all resources available in LRRML and from NLRdb sequences annotated by different web servers (SMART, Pfam, PROSITE), yielding 2341 unique LRR-HCS motifs. A user-defined sequence is scanned with an 11-residue sliding window, and the 11 × 20 matrix is used to score each fragment. Finally, statistically significant fragments are extracted using the Benjamini-Hochberg (BH) procedure. For more information on LRRsearch, refer to the paper by Bej et al.
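The scan-and-filter steps described above can be sketched in Python as follows. This is a hedged illustration under stated assumptions, not the LRRsearch code: TOY_PSSM is a hand-made 11 x 20 matrix rather than the real one, the function names are hypothetical, and the p-values come from an empirical null built from a random sequence, because the way LRRsearch derives its p-values is not described here. Only the Benjamini-Hochberg step follows the standard published procedure.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
WINDOW = 11  # length of the LRR highly conserved segment

# Toy 11 x 20 matrix (hypothetical scores, NOT the LRRsearch PSSM):
# +2 for the consensus residue at a conserved position, -1 for any other
# residue there, and 0 at variable positions.
CONSENSUS = {0: "L", 3: "L", 5: "L", 8: "N", 10: "L"}
TOY_PSSM = [
    {aa: (2.0 if CONSENSUS.get(pos) == aa else (-1.0 if pos in CONSENSUS else 0.0))
     for aa in AMINO_ACIDS}
    for pos in range(WINDOW)
]

def score_windows(sequence, pssm):
    """Score every window of len(pssm) residues; return (start, fragment, score)."""
    width = len(pssm)
    hits = []
    for start in range(len(sequence) - width + 1):
        fragment = sequence[start:start + width]
        # Unknown residues (e.g. 'X') contribute 0 in this sketch.
        score = sum(column.get(res, 0.0) for column, res in zip(pssm, fragment))
        hits.append((start, fragment, score))
    return hits

def empirical_p_values(scores, null_scores):
    """Assumed null model: share of null scores at least as high as each score."""
    n = len(null_scores)
    return [(1 + sum(1 for s0 in null_scores if s0 >= s)) / (1 + n) for s in scores]

def benjamini_hochberg(p_values, alpha=0.05):
    """Return one boolean per p-value: True where BH rejects the null."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * alpha / m:
            cutoff_rank = rank  # keep the largest rank that passes (step-up)
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        rejected[idx] = rank <= cutoff_rank
    return rejected

# Usage with a made-up query and a null built from one random sequence.
rng = random.Random(0)
null_seq = "".join(rng.choice(AMINO_ACIDS) for _ in range(2000))
null_scores = [score for _, _, score in score_windows(null_seq, TOY_PSSM)]

query = "AAALTSLDLSNNQLAAA"
hits = score_windows(query, TOY_PSSM)
p_values = empirical_p_values([score for _, _, score in hits], null_scores)
for (start, fragment, score), keep in zip(hits, benjamini_hochberg(p_values)):
    if keep:
        print(start + 1, fragment, score)  # 1-based start position
```

The BH step controls the false discovery rate across all windows of a sequence, which matters because a protein of length n yields n - 10 overlapping 11-residue fragments, each of which is tested separately.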

Click here to see the flow chart of LRR motif search using LRRsearch.


NLRdb - NOD-like receptor family database:

NLRdb provides a non-redundant, freely available, easy-to-use database of the NLR protein family. All protein information in NLRdb was retrieved from publicly available databases at the National Center for Biotechnology Information (NCBI) and the UniProt Knowledgebase (UniProtKB). Redundant entries were eliminated using different alignment techniques. To date, NLRdb contains 421 proteins from the different NLR families, along with various key information. These sequences are distributed across the five NLR families, as illustrated in the following figure:

NLR family classification


Click here to see the flow chart of the methodology and process in NLRdb.