This project investigates Variants of Unknown Significance (VUS) in the BCS1L gene associated with Björnstad Syndrome — an extremely rare autosomal recessive mitochondrial disorder. Using evolutionary conservation analysis across 1000 homologous sequences, we developed a Python-based algorithm to classify each VUS as pathogenic or benign.
Björnstad Syndrome (BJS) is an extremely rare autosomal recessive mitochondrial disorder first described by Professor Roar Björnstad in 1965. It is caused by missense mutations in the BCS1L gene on chromosome 2q34–36, which encodes a mitochondrial chaperone protein responsible for assembling respiratory chain Complex III.
When BCS1L is mutated, Complex III assembly is impaired and production of reactive oxygen species (ROS) increases. Hair follicles and inner ear cells are particularly sensitive to mitochondrial dysfunction, which leads to the two hallmark features of the syndrome:
- 🦻 Sensorineural hearing loss — typically congenital and bilateral
- 💇 Pili torti — twisted, brittle hair shafts, often leading to alopecia
- Retrieved BCS1L protein sequence (isoform a) from ClinVar and gnomAD
- Performed BLASTp searches with hit limits of 100, 500, and 1000 homologous sequences
- Selected 1000-hit BLASTp results for optimal species diversity and E-value reliability
- Aligned 1000 homologous sequences using MUSCLE algorithm in MEGA11
- Constructed Neighbor-Joining (NJ) phylogenetic tree for evolutionary analysis
- Rerooted the tree using FigTree for improved interpretation
- Retrieved 34 VUS positions from ClinVar and gnomAD databases
- Mapped each VUS to aligned sequence positions using a custom Python algorithm
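
The mapping step in the last item can be sketched as follows. This is a minimal illustration, assuming the reference BCS1L sequence is one row of the alignment and that VUS positions are 1-based residue numbers in the ungapped reference. The function name `residue_to_column` and the toy sequence are hypothetical and are not taken from the repository.

```python
def residue_to_column(aligned_ref: str, residue_pos: int, gap: str = "-") -> int:
    """Return the 0-based alignment column that holds the residue_pos-th
    (1-based) non-gap residue of the aligned reference sequence."""
    if residue_pos < 1:
        raise ValueError("residue_pos is 1-based and must be >= 1")
    seen = 0  # number of non-gap residues encountered so far
    for col, char in enumerate(aligned_ref):
        if char != gap:
            seen += 1
            if seen == residue_pos:
                return col
    raise ValueError(
        f"reference has only {seen} residues; position {residue_pos} is out of range"
    )


# Toy aligned reference row (not real BCS1L data): gaps shift later residues.
toy_reference = "M-PL--SD"
for position in (1, 2, 4):
    print(position, "->", residue_to_column(toy_reference, position))
```

Counting only non-gap characters keeps the ClinVar/gnomAD residue numbering independent of how many gap columns MUSCLE inserted into the reference row.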
For each amino acid position, the algorithm:
- Reads the 1000-hit FASTA alignment file
- Counts the frequency of each amino acid at every position
- Computes a conservation score (CS) = count / total sequences
- Records the most frequent amino acid with its score (CS1) and the second most frequent amino acid with its score (CS2) at each position
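
The steps above could look roughly like the sketch below. It is not the repository's `conservation_score.py`. The function names are hypothetical, and it assumes that all aligned sequences have equal length and that gap characters are counted like any other symbol, which the description does not specify.

```python
from collections import Counter


def read_fasta(path):
    """Parse a FASTA alignment file into a list of (header, sequence) tuples."""
    records, header, chunks = [], None, []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                if header is not None:
                    records.append((header, "".join(chunks)))
                header, chunks = line[1:], []
            else:
                chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def conservation_scores(sequences):
    """Return one (aa1, CS1, aa2, CS2) tuple per alignment column, where
    CS = count / total sequences. Assumption: gaps count as a symbol."""
    if not sequences:
        raise ValueError("no sequences given")
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("aligned sequences must all have the same length")
    total = len(sequences)
    results = []
    for col in range(length):
        counts = Counter(seq[col] for seq in sequences)
        top = counts.most_common(2)
        aa1, n1 = top[0]
        # A fully conserved column has no second amino acid.
        aa2, n2 = top[1] if len(top) > 1 else (None, 0)
        results.append((aa1, n1 / total, aa2, n2 / total))
    return results


# Toy alignment (not real BCS1L data).
toy_alignment = ["MKA", "MKC", "MRC", "MKC"]
for column, row in enumerate(conservation_scores(toy_alignment)):
    print(column, row)
```

For a real run, the sequences would come from the alignment file, for example `[seq for _, seq in read_fasta("alignment.fasta")]`, where the file name is a placeholder.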
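Three thresholds (t1 = 0.9, t2 = 0.7, t3 = 0.1) determine the classification of each variant:
- If the variant amino acid is the most conserved amino acid at its position:
  - CS1 > t1 → Benign
  - CS1 > t2 → Benign
  - CS1 < t2 → Pathogenic
- If the variant amino acid is not the most conserved amino acid at its position:
  - CS1 > t1 → Pathogenic
  - Variant is the second most conserved amino acid and CS2 > t2 → Benign
  - Variant is the second most conserved amino acid and CS2 > t3 → Pathogenic
  - Variant is the second most conserved amino acid and CS2 < t3 → Benign
  - Otherwise → Pathogenic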
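
The decision rules above can be written as a single function. The sketch below is one reading of those rules, not the repository's `pathogenicity_classifier.py`. The function name and argument layout are hypothetical. The rules do not say what happens when a score equals a threshold, so this sketch assumes such cases fall through to Pathogenic.

```python
def classify_variant(variant, aa1, cs1, aa2, cs2, t1=0.9, t2=0.7, t3=0.1):
    """Classify a substituted amino acid as 'Benign' or 'Pathogenic'.

    variant  : amino acid introduced by the VUS
    aa1, cs1 : most conserved amino acid at the position and its score
    aa2, cs2 : second most conserved amino acid and its score
    """
    if variant == aa1:
        # CS1 > t1 and CS1 > t2 both give Benign, so a single test on t2
        # covers both. Assumption: CS1 == t2 is treated as Pathogenic.
        return "Benign" if cs1 > t2 else "Pathogenic"

    if cs1 > t1:
        return "Pathogenic"

    if variant == aa2:
        # Note: CS1 >= CS2 and CS1 + CS2 <= 1 imply CS2 <= 0.5, so with
        # t2 = 0.7 this first branch cannot fire; it is kept to mirror the rules.
        if cs2 > t2:
            return "Benign"
        if cs2 > t3:
            return "Pathogenic"
        if cs2 < t3:
            return "Benign"

    # Covers every remaining case, including CS2 == t3 (assumption).
    return "Pathogenic"


# Toy inputs (not real BCS1L positions).
print(classify_variant("K", "K", 0.95, "R", 0.03))
print(classify_variant("R", "K", 0.60, "R", 0.30))
print(classify_variant("R", "K", 0.80, "R", 0.05))
```

Keeping the thresholds as keyword arguments makes it easy to test how sensitive the 34 classifications are to the choice of t1, t2, and t3.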
Out of 34 VUS positions analyzed from gnomAD and ClinVar:
| Classification | Count |
|---|---|
| ✅ Benign | 8 |
| Pathogenic | 26 |
Validation: the algorithm was run on 10 variants that already have a ClinVar classification, and 8 of the 10 (80%) were classified correctly.
BLASTp results for the 100-hit, 500-hit, and 1000-hit searches
1000-hit MUSCLE sequence alignment
Neighbor-Joining phylogenetic tree rerooted in FigTree
Amino acid conservation scores across BCS1L positions
Conservation scores with 0.5 threshold for region comparison
Pathogenic and benign VUS positions across the BCS1L gene
| Tool | Purpose |
|---|---|
| BLASTp | Homologous sequence search |
| MEGA11 | Multiple sequence alignment (MUSCLE) & phylogenetic tree construction |
| FigTree | Phylogenetic tree visualization & rerooting |
| gnomAD / ClinVar | VUS data retrieval |
| Python | Conservation score algorithm & pathogenicity classification |
| RStudio | Visualization of conservation scores and E-values |
- Python 3.x
- MEGA11
- FigTree
- R / RStudio
# Clone the repository
git clone https://github.com/gizemdogafiliz/Bjornstad-Syndrome-BCS1L-Gene-VUS-Pathogenicity-Classifier.git
# Navigate to the directory
cd Bjornstad-Syndrome-BCS1L-Gene-VUS-Pathogenicity-Classifier
# Run conservation score calculation
python conservation_score.py
# Run pathogenicity classifier
python pathogenicity_classifier.py

- `conservation_scores.tsv`: Conservation scores for each amino acid position
- `pathogenic_or_benign.tsv`: Pathogenicity classification for each VUS
- Björnstad, R. (1965). Pili torti and sensory-neural loss of hearing. Proceedings of the 17th Meeting of the Northern Dermatological Society.
- Hinson, J. T., et al. (2007). Missense mutations in the BCS1L gene as a cause of the Björnstad syndrome. New England Journal of Medicine, 356(8), 809–819.
- Bénit, P., Lebon, S., & Rustin, P. (2008). Respiratory-chain diseases related to complex III deficiency. Biochimica et Biophysica Acta.
- Calvo, S. E., & Mootha, V. K. (2010). The mitochondrial proteome and human disease. Annual Review of Genomics and Human Genetics, 11, 25–44.
- Richards, S., et al. (2015). Standards and guidelines for the interpretation of sequence variants. Genetics in Medicine, 17(5), 405–424.
- Tamura, K., Nei, M., & Kumar, S. (2004). Prospects for inferring very large phylogenies by using the neighbor-joining method. PNAS.
- Kinene, T., et al. (2016). Rooting trees, methods for. Encyclopedia of Evolutionary Biology.
Gizem Doğa Filiz, Yıldız Zeynep Şensan, Alpay Emir Aktan, Zeynep Tuana Anıç
Course: ENS 210 — Computational Biology
Institution: Sabancı University
Instructor: Asst. Prof. Dr. Ogün Adebali


