One way to reduce adverse impact of an assessment method is…

One way to reduce adverse impact of an assessment method is through ___________________, which assigns the same score to applicants who score within a given range on the assessment (for example, 90-100 = A, 80-89 = B, 70-79 = C) and then compares applicants only on that score.

Erin is a more senior employee at the local bank. Recently,…

Erin is a more senior employee at the local bank. Recently, she has started helping one of the new hires, April, learn the ins and outs of how to do her job well. Erin and April are involved in a reciprocal relationship aimed at career development. Erin provides April with career guidance and psychosocial support, while April provides Erin with a fresh perspective on a job she’s always loved. What internal method is being leveraged here?

This is a short answer question that can be answered in one…

This is a short answer question that can be answered in one to two sentences. After 12 years, Japanese scientists succeeded in isolating a new species of microorganism from mud gathered from the bottom of the Sea of Japan. It was difficult to isolate and culture because this single-celled microbe lives under very inhospitable environmental conditions (high pressure, extreme cold, no oxygen, and very little organic matter). The scientists named it Prometheoarchaeum syntrophicum. Transmission electron microscopy revealed that it did not have a nucleus. What scientific evidence was MOST LIKELY used to phylogenetically categorize this single-celled microbe in the domain Archaea as opposed to the domain Bacteria? (Be as specific as possible.)

What are my 10 conclusions? Can I describe each in a sentence or…

What are my 10 conclusions? Can I describe each in a sentence or two, with 1-2 examples or ideas from class?
Move from counterfeit to ___ communities
Move from consumption to ___
Move from sensationalism to ___
Move from global to ___
Move from “calling out” to ___
Move from politics to ___
Move from interventions to ___
Move from explanation to ___
Move from programs to ___
Move from systems to ___

One of the biggest challenges in genetics is to determine th…

One of the biggest challenges in genetics is to determine the relationship between genetic variants and phenotypes. To learn more about these relationships, researchers often sequence the DNA of individuals and then analyze the DNA variant data. For this exam, you will be provided with DNA variant data for four individuals from the same family: a mother, a father, a son, and a daughter.

The DNA variant data are provided in Variant Call Format (VCF) files. As we’ve discussed in the course, VCF files contain multiple “metadata” or header lines that each start with one or more # characters. The remaining lines contain variant data, with each variant listed on a separate line. The CHROM column (labeled CHR in the example files) indicates the chromosome name, the POS column (labeled POSITION in the example files) indicates the chromosome position, the REF column indicates the reference allele, and the VAR column indicates the alternative (variant or mutated) allele. So, if I say an “A” is mutated to a “T”, then “A” will appear in the REF column and “T” in the VAR column. The FILTER column indicates whether or not each variant passed the quality-control test. Variants that passed the quality-control test have a value of PASS. Variants that failed the quality-control test have a value of NO PASS.

The variant-data columns are tab-delimited. Below are two (small) example VCF files. (Even though the columns may not line up perfectly on the printed page, the variant columns are separated by single tabs.) You do not need to do any error checking on command lines, and all files are tab-delimited. You cannot assume that the characters in a file will be consistently uppercase or consistently lowercase.

Example VCF files (note that on the computer, the columns may not line up perfectly visually, even though they are still separated by a single tab):

VCF_file1.vcf
##header line 1
##other stuff that you don’t have to know
##another header line
##blah blah blah
#CHR      POSITION  REF  VAR  FILTER
chr1      3675      a    g    PASS
chr1      3789      T    G    pass
chr7      787879    T    C    NO PASS
chr7      787882    C    A    PASS
CHR10     6321      A    C    PASS
chr11     55        T    C    PASS

VCF_file2.vcf
##header garbage
##other stuff that you don’t have to know and is really annoying
##another header line
##blah blah blah
##who thought of this file format anyway
#Chr      POSITION  REF  VAR  FILTER
chr1      3675      A    G    PASS
chr1      3789      T    G    PASS
chr7      787879    T    C    PASS
chr7      787883    C    A    PASS
chr11     55        T    C    PASS
chr22     54321     G    C    NO PASS
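
As a rough illustration of the file layout described above, the following sketch skips the header lines, splits each variant line on tabs, and uppercases every field so that mixed-case entries compare equal. The function name read_variants and the in-memory example list are hypothetical, chosen only for this illustration; this is one possible approach, not a required one.

```python
def read_variants(lines):
    """Yield (chr, position, ref, var, filter) tuples from VCF-style lines.

    Header lines (starting with '#') and blank lines are skipped, and all
    fields are uppercased because the files may mix upper and lower case.
    """
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = [field.strip().upper() for field in line.rstrip("\n").split("\t")]
        yield tuple(fields[:5])


# A few lines in the style of VCF_file1.vcf (hypothetical in-memory copy).
example = [
    "##header line 1\n",
    "#CHR\tPOSITION\tREF\tVAR\tFILTER\n",
    "chr1\t3675\ta\tg\tPASS\n",
    "chr1\t3789\tT\tG\tpass\n",
    "chr7\t787879\tT\tC\tNO PASS\n",
]
parsed = list(read_variants(example))
```

Normalizing case at read time means later comparisons (for example, checking whether FILTER equals PASS) do not need to handle "pass" and "PASS" separately.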

When you are finished with this program (and ready to move o…

When you are finished with this program (and ready to move on to the extra credit program, if you want), please copy and paste “Question 2 of the Proctored Final Exam is complete and ready for grading.” in the text box for this question.

Program 2: Find Shared Variants

If multiple people carry the same DNA variant as well as the same phenotype (for example, a disease), it may be that this variant caused the phenotype. Your task is to search multiple VCF files and identify the variants that are shared across all of those VCF files. For a variant to be considered shared, the same variant line (apart from differences in letter case) must appear in all the files and the value in the FILTER column must be PASS in all the files.

As an illustration, suppose you were looking at VCF_file1.vcf and VCF_file2.vcf (shown above). You would want to find the following three shared variants:

chr1      3675      A    G    PASS
chr1      3789      T    G    PASS
chr11     55        T    C    PASS

These variants are shared because the same line appears in both input files and the value in the FILTER column is PASS in both files. A fourth variant (chr7, 787879) appears in both files. However, in the first file, the FILTER value is NO PASS, so this variant does not count as a shared variant.

Write a Python script that uses sys.argv to accept the following five arguments:

1. The name of the mother’s VCF file.
2. The name of the father’s VCF file.
3. The name of the daughter’s VCF file.
4. The name of the son’s VCF file.
5. The name of an output file that your code will need to create.

Your Python script should search the four VCF files and identify the variants that are shared across all four individuals (mother, father, daughter, son). After identifying the shared variants, write the data to the specified output file. This should be a tab-delimited file with four columns that correspond to the CHR, POS, REF, and VAR columns in the VCF files.

The output file should look the same as the modified VCF files used in this final, except there should be no metadata lines or header lines, all output should be uppercase, and it should not include the FILTER column. You should write the variants to the output file in the same order they appear in the first input file. (In the example below, that is the order the variants appear in VCF_file1.vcf.) All columns in the output file are tab-delimited.

For example, if the server were to execute your code (using only the two VCF files, VCF_file1.vcf and VCF_file2.vcf, for brevity):

python studentcode.py VCF_file1.vcf VCF_file2.vcf output.txt

Expected output (tab-delimited and all uppercase):

CHR1      3675      A    G
CHR1      3789      T    G
CHR11     55        T    C

You may assume we will always give you exactly four VCF files.
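
The steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not a reference solution: the helper names passing_variants, find_shared, and main are hypothetical, every argument except the last is treated as an input VCF file, and comparison is done on uppercased fields so that "a g PASS" matches "A G PASS".

```python
def passing_variants(path):
    """Return a file's PASS variants as uppercase (chr, pos, ref, var) tuples, in file order."""
    variants = []
    with open(path) as handle:
        for line in handle:
            if not line.strip() or line.startswith("#"):
                continue  # skip blank lines and metadata/header lines
            fields = [f.strip().upper() for f in line.rstrip("\n").split("\t")]
            if len(fields) >= 5 and fields[4] == "PASS":
                variants.append(tuple(fields[:4]))
    return variants


def find_shared(vcf_paths):
    """Return variants that pass in every file, ordered as in the first file."""
    first = passing_variants(vcf_paths[0])
    others = [set(passing_variants(path)) for path in vcf_paths[1:]]
    return [v for v in first if all(v in seen for seen in others)]


def main(argv):
    # Assumption: argv[1:-1] are the input VCF files and argv[-1] is the output file.
    *vcf_paths, out_path = argv[1:]
    with open(out_path, "w") as out:
        for variant in find_shared(vcf_paths):
            out.write("\t".join(variant) + "\n")
```

In a script, main would be called with sys.argv (after importing sys). Keeping the first file's variants in a list preserves the required output order, while the other files are held as sets so that each membership check is fast.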

When you are finished with this program, please copy and pas…

When you are finished with this program, please copy and paste “Extra Credit Question 3 of the Proctored Final Exam is complete and ready for grading.” in the text box for this question.

Program 3: Annotate Variants (5 Points Extra Credit Possible)

After identifying shared variants, in order to determine if one of them might be causing the phenotype, it’s necessary to figure out which gene harbors each of the shared mutations. Your task is to take a file formatted the same as your output in Question 2 (list of shared variants) and determine which gene each of the mutations is from. You will not be given the exact file you created in Question 2, just a file that’s formatted the same: four columns, where column 1 is the chromosome, column 2 is the chromosome position, column 3 is the reference allele, and column 4 is the variant, or mutated, allele.

You will also be provided with a gene annotations file, which has four columns: chromosome, start position (inclusive), stop position (inclusive), and gene name. Your task is to determine which gene each variant is located in and create a new file exactly the same as the shared variants file except it will have another column with the gene name, or “no gene” if the mutation isn’t located in a known gene. To be located in a gene, a mutation must be on the same chromosome as the gene and at a position within the range defined by the genes file.

Your program should accept three files from the command line (in the following order): the shared variants file, the genes file, and the output file where you’ll write your new file. It is possible that more than one mutation will be in the same gene, some mutations will not be located in a gene, and not all genes from the gene annotations file will be used.

Following is an example, assuming I have the two following files:

shared_variants.txt:
chr1   3675  A     G
chr1   3789  T     G
chr11  55    T     C

gene_annotations.txt:
chr1        3700        6000        GeneA
chr2        3300        10000       GeneB
chr2        11000       12000       GeneC
chr11       55          4500        GeneD

Example #1

If I execute the following command:

python studentcode.py shared_variants.txt gene_annotations.txt annotated.txt

Your program should create the following file, annotated.txt (all uppercase and tab-delimited):

CHR1        3675  A     G     NO GENE
CHR1        3789  T     G     GENEA
CHR11       55    T     C     GENED
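
The lookup described above can be sketched roughly as follows. The function names load_genes, gene_for, and annotate are hypothetical, and the sketch takes the three file paths as explicit parameters rather than reading the command line. It also assumes that when gene ranges overlap, the first matching gene in the annotations file is reported, which the task description does not specify.

```python
def load_genes(path):
    """Read the gene annotations file into (chromosome, start, stop, name) tuples."""
    genes = []
    with open(path) as handle:
        for line in handle:
            if not line.strip():
                continue
            chrom, start, stop, name = [f.strip().upper() for f in line.rstrip("\n").split("\t")]
            genes.append((chrom, int(start), int(stop), name))
    return genes


def gene_for(chrom, pos, genes):
    """Return the first gene containing pos on chrom (both ends inclusive), else 'NO GENE'."""
    for gene_chrom, start, stop, name in genes:
        if gene_chrom == chrom and start <= pos <= stop:
            return name
    return "NO GENE"


def annotate(variants_path, genes_path, out_path):
    """Copy each variant line to out_path in uppercase, with a gene-name column appended."""
    genes = load_genes(genes_path)
    with open(variants_path) as src, open(out_path, "w") as out:
        for line in src:
            if not line.strip():
                continue
            fields = [f.strip().upper() for f in line.rstrip("\n").split("\t")]
            fields.append(gene_for(fields[0], int(fields[1]), genes))
            out.write("\t".join(fields) + "\n")
```

Positions are converted to integers before comparison; comparing them as strings would order "55" after "4500" and give wrong range checks.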