When you are finished with this program, please copy and paste “Extra Credit Question 3 of the Proctored Final Exam is complete and ready for grading.” in the text box for this question. Program 3: Annotate Variants (5 Points Extra Credit Possible) After identifying shared variants, in order to determine if one of them might be causing the phenotype, it’s necessary to figure out which gene harbors each of the shared mutations. Your task is to take a file formatted the same as your output in Question 2 (list of shared variants) and determine which gene each of the mutations is from. You will not be given the exact file you created in Question 2, just a file that’s formatted the same: four columns, where column 1 is the chromosome, column 2 is the chromosome position, column 3 is the reference allele, and column 4 is the variant, or mutated, allele. You will also be provided with a gene annotations file, which has four columns: chromosome, start position (inclusive), stop position (inclusive), and gene name. Your task is to determine which gene each variant is located in and create a new file exactly the same as the shared variants file except it will have another column with gene name, or “no gene” if the mutation isn’t located in a known gene. To be located in a gene, a mutation should be located on the same chromosome and in a position within the range defined by the genes file. Your program should accept three files from the command line (in the following order): shared variants file, output file where you’ll write your new file, and the genes file. It is possible that more than one mutation will be in the same gene, some mutations will not be located in a gene, and not all genes from the gene annotations file will be used. Following is an example, assuming I have the two following files: shared_variants.txt: chr1 3675 A Ghr1 3789 T Gchr11 55 T C gene_annotations.txt: chr1 3700 6000 GeneAchr2 3300 10000 GeneBchr2 11000 12000 GeneCchr11 55 4500 GeneD Example #1 If I execute the following command: python studentcode.py shared_variants.txt gene_annotations.txt annotated.txt. Your program should create the following file, annotated.txt: annotated.txt (all uppercase and tab-delimited): CHR1 3675 A G NO GENE CHR1 3789 T G GENEA CHR11 55 T C GENED
Blog
Ground meats are allowed on a mechanical soft diet.
Ground meats are allowed on a mechanical soft diet.
A patient with irritable bowel syndrome (IBS) is unaware of…
A patient with irritable bowel syndrome (IBS) is unaware of what is causing the disorder. What is a recommendation that they should do in order to find the culprit?
Which major economic problem did President Truman face immed…
Which major economic problem did President Truman face immediately after the Second World War?
Why did President Truman veto the McCarran Internal Security…
Why did President Truman veto the McCarran Internal Security Act?
Water accounts for ________ of adult body weight.
Water accounts for ________ of adult body weight.
Match the type of fluid loss to its description:
Match the type of fluid loss to its description:
What were the main U.S. foreign policy concerns following th…
What were the main U.S. foreign policy concerns following the Second World War and why?
Whose support did Truman lose because of postwar policies in…
Whose support did Truman lose because of postwar policies intended to control prices and wages?
Which statement BEST describes why amine I has a lower boili…
Which statement BEST describes why amine I has a lower boiling point than II and III? Amine Boiling point (°C) I. 3 II. CH3CH2NHCH3 36 III. CH3CH2CH2NH2 48