The Problem of Multiple Isoforms

Arp2/3 Subunits in the FASTA File Format – Problems Encountered

Arp2/3 is a multi-subunit protein complex comprising seven subunits. The UniProt accession numbers, subunit names, abbreviations and numbers of amino acids are tabulated below (accessed from UniProt; follow the accession numbers to access the original pages).

UniProt Accession   Protein Name                                    Protein Abbreviation(s)   Number of Amino Acids
P61160              Actin-related protein 2                         ACTR2, ARP2               394
P61158              Actin-related protein 3                         ACTR3, ARP3               418
O15145              Actin-related protein 2/3 complex subunit 3     ARPC3, ARC21              178
P59998              Actin-related protein 2/3 complex subunit 4     ARPC4, ARC20              168
O15143              Actin-related protein 2/3 complex subunit 1B    ARPC1B, ARC41             372
O15511              Actin-related protein 2/3 complex subunit 5     ARPC5, ARC16              151
O15144              Actin-related protein 2/3 complex subunit 2     ARPC2, ARC34, PRO2446     300

A complication arose when searching for the Arp2/3 subunit sequences in FASTA format. UniProt returned a single amino acid sequence per subunit, whereas NCBI’s Protein sequence database returned multiple isoforms for some subunits. This is most likely because UniProt entries are curated to reconcile and harmonize submitted data, whilst NCBI acts as a less processed repository of submitted records.

FASTA sequences (accessed from NCBI Protein: sequence database): download the file here.
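
As a quick check on how many isoforms a downloaded file contains, the FASTA records can be listed and measured with a few lines of Python. The following is a minimal sketch only: the function name `parse_fasta` and the two short records are hypothetical placeholders, not the real NCBI entries.

```python
def parse_fasta(text):
    """Return a dict mapping each FASTA header (without '>') to its sequence."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            header = line[1:]
            records[header] = []
        elif header is not None:
            records[header].append(line)  # a sequence may span several lines
    return {name: "".join(parts) for name, parts in records.items()}


# Hypothetical toy records (NOT real Arp2/3 sequences).
example = """>toy_protein isoform A
MKTAYNKMWY
LLRPE
>toy_protein isoform B
MKTAYLLRPE
"""

sequences = parse_fasta(example)
for name, seq in sequences.items():
    print(name, len(seq))
```

Run on the real download, a listing like this would make it easy to see which subunits have more than one record and how the isoform lengths differ.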

To begin to understand the complication posed by NCBI’s list of multiple isoforms, multiple sequence alignments were run on the protein sequences in FASTA format. Three alignments were run, one for each protein in which isoforms are observed: I. Actin-related protein 2 (isoform A; isoform B), II. Actin-related protein 2/3 complex subunit 4 (isoform A; isoform B; isoform C), and III. Actin-related protein 2/3 complex subunit 5 (isoform 1; isoform 2). The results show regions of high conservation, which most likely represent domains intrinsic to protein function. The alignments were produced with MultAlin.

Multiple Sequence Alignment Results Using: MultAlin

• Actin-related protein 2 (ARP2)
Figure 1. Results of the MultAlin multiple sequence alignment of ARP2 isoform A and ARP2 isoform B.

The polypeptide sequence is 100% conserved between the two ARP2 isoforms except for a short stretch of five amino acids in the 50-60 region, present in isoform A and absent from isoform B. At the nucleotide level this corresponds to either (I) a 15-nucleotide insertion in isoform A, or (II) a 15-nucleotide deletion in isoform B. One interpretation is that the region is not critical to protein function (and has thus tolerated mutation). However, the more likely explanation is that the extra residues add functionality to isoform A. The additional residues include asparagine (Asn, N), lysine (Lys, K) and methionine (Met, M). Of these, lysine is positively charged at physiological pH, so the insertion adds charge to the region, suggesting that it may contribute to a binding site in isoform A. The flat file corresponding to ARP2 isoform A should shed more light on this hypothesis.
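
The reasoning above (two sequences identical apart from one contiguous stretch, and five residues corresponding to 15 nucleotides) can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the helper `find_single_indel` is a hypothetical name, the two sequences are invented toy strings rather than the real ARP2 isoforms, and the approach only handles the simple case of a single insertion or deletion.

```python
def find_single_indel(seq_a, seq_b):
    """Locate one contiguous insertion that turns the shorter sequence into the longer.

    Returns (index, inserted_residues) with a 0-based index into the longer
    sequence, or None if the sequences differ by more than a single indel.
    """
    longer, shorter = (seq_a, seq_b) if len(seq_a) >= len(seq_b) else (seq_b, seq_a)
    diff = len(longer) - len(shorter)

    # Length of the common prefix.
    prefix = 0
    while prefix < len(shorter) and longer[prefix] == shorter[prefix]:
        prefix += 1

    # After skipping `diff` residues in the longer sequence, the rest must match.
    if longer[prefix + diff:] != shorter[prefix:]:
        return None
    return prefix, longer[prefix:prefix + diff]


# Hypothetical toy sequences (NOT the real ARP2 isoforms).
toy_isoform_a = "ACDEFNKMWYGHIKL"
toy_isoform_b = "ACDEFGHIKL"

result = find_single_indel(toy_isoform_a, toy_isoform_b)
index, inserted = result
codon_length = 3  # three nucleotides encode one amino acid
print("extra residues:", inserted)
print("starts at position:", index + 1)  # 1-based, as in alignment views
print("nucleotides involved:", len(inserted) * codon_length)
```

When the flanking residues repeat, the exact position of an indel is ambiguous; this sketch reports the right-most equivalent placement.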

• Actin-related protein 2/3 complex subunit 4 (ARC20)

Figure 2. Multiple sequence alignment of ARC20 isoform A, ARC20 isoform B and ARC20 isoform C.

The region 110-187 shows 100% conservation across the three isoforms, whereas the region 20-110 shows no conservation (the amino acid sequences differ entirely).
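
Conserved regions of this kind can also be read off programmatically once the aligned sequences are available. The sketch below is a rough illustration, not the method MultAlin uses: `conserved_runs` is a hypothetical helper, the three aligned strings are invented toy data, and a column is counted as conserved only if every sequence has the same residue there (gap characters `-` never count).

```python
def conserved_runs(aligned):
    """Return (start, end) pairs, 1-based and inclusive, of fully conserved columns."""
    if len({len(seq) for seq in aligned}) != 1:
        raise ValueError("aligned sequences must be non-empty and of equal length")

    runs = []
    start = None
    for position, column in enumerate(zip(*aligned), start=1):
        conserved = len(set(column)) == 1 and column[0] != "-"
        if conserved and start is None:
            start = position                      # a conserved run begins
        elif not conserved and start is not None:
            runs.append((start, position - 1))    # the run ended one column ago
            start = None
    if start is not None:
        runs.append((start, len(aligned[0])))     # run extends to the last column
    return runs


# Hypothetical toy alignment (NOT the real ARC20 isoforms).
toy_alignment = [
    "MKTAYLLRPE",
    "MQSGHLLRPE",
    "MWDCFLLRPE",
]

runs = conserved_runs(toy_alignment)
print(runs)
```

Applied to the exported MultAlin alignment, a listing like this would give exact boundaries for the conserved and non-conserved regions rather than values read from the figure.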

• Actin-related protein 2/3 complex subunit 5 (ARC16)

Figure 3. Multiple sequence alignment of ARC16 isoform 1 and ARC16 isoform 2.

Again, a small region of conservation is apparent.

At this stage, it is hoped that research into protein function (i.e. using ‘flat files’) will shed light on the functional differences between the isoforms encountered, helping to elucidate the evolutionary pressures that may account for the slight structural variation across subunit isoforms.
