# 2vcf reference v3.0 released

the v3.0 release of the 2vcf reference is, as always, our best attempt yet at balancing coverage against accuracy and efficiency. this release takes a targeted approach to refining the list of included reference sites: rather than starting from the publicly available Illumina manifests for the genotyping arrays used by the public genotyping companies, we took actual call sets from users and compiled a list of the sites that have actually been observed. we make no claim about the completeness of the reference, since some sites are deliberately left out, but we do claim that the reference VCF is as small as we can make it while still giving good coverage of the 23andme and Ancestry.com marker sets.

download v3 from https://openb.io/2vcf/2vcf-v2.0.vcf.gz

```
wget https://openb.io/2vcf/2vcf-v2.0.vcf.gz
```

the version 3.0 reference, like the v2.0 reference, is based on the dbSNP build 151 VCF. the reference contains 1,006,190 loci across 25 contigs.

contig name | marker count |
---|---|
1 | 80,376 |
2 | 80,791 |
3 | 65,915 |
4 | 58,058 |
5 | 58,936 |
6 | 66,479 |
7 | 53,733 |
8 | 51,765 |
9 | 45,063 |
10 | 52,686 |
11 | 49,940 |
12 | 49,279 |
13 | 37,676 |
14 | 32,182 |
15 | 29,788 |
16 | 31,777 |
17 | 28,228 |
18 | 29,389 |
19 | 19,853 |
20 | 24,959 |
21 | 14,052 |
22 | 14,723 |
X | 27,239 |
Y | 2,862 |
MT | 441 |
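
to check the per-contig counts above against a downloaded copy, something like the following python sketch could be used. it is not part of 2vcf: the function names are made up for illustration, and it assumes the file is a gzip-compressed VCF whose first tab-separated column is the contig name, as in the VCF format.

```python
import gzip
from collections import Counter


def count_markers_per_contig(lines):
    """Count VCF data records per contig (CHROM is the first tab-separated column)."""
    counts = Counter()
    for line in lines:
        # skip meta-information lines, the column header, and blank lines
        if line.startswith("#") or not line.strip():
            continue
        contig = line.split("\t", 1)[0]
        counts[contig] += 1
    return counts


def count_markers_in_file(path):
    """Open a gzip-compressed VCF as text and count its records per contig."""
    with gzip.open(path, "rt") as handle:
        return count_markers_per_contig(handle)
```

calling `count_markers_in_file` on the downloaded file and summing the values of the result gives the total number of loci, which can be compared with the figure quoted above.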

there is a small class of markers for which 23andme included calls, but whose chromosome assignment disagrees with dbSNP. since we were unable to get help from 23andme and unable to make sense of the situation, those sites were excluded from the reference.

RSID | 23andme chromosome | dbSNP b151 chromosome |
---|---|---|
rs10106770 | 8 | 2 |
rs1140961 | 1 | 6 |
rs1140965 | 1 | 3 |
rs11857958 | 15 | 5 |
rs11861001 | 16 | 4 |
rs11942835 | 4 | 3 |
rs12043679 | 1 | 13 |
rs12496398 | 3 | 4 |
rs12804886 | 11 | 8 |
rs12914236 | 15 | 1 |
rs1347505 | Y | X |
rs1347507 | Y | X |
rs1435909 | Y | X |
rs17863175 | 7 | 15 |
rs2125843 | 8 | 3 |
rs2129709 | Y | X |
rs2215794 | Y | 1 |
rs2220162 | Y | X |
rs2229051 | 1 | 6 |
rs2229625 | 2 | X |
rs2352696 | Y | X |
rs2433989 | Y | X |
rs2437511 | Y | X |
rs2452115 | Y | X |
rs2452335 | Y | X |
rs2496951 | Y | X |
rs2522620 | Y | X |
rs2522676 | Y | X |
rs2524623 | Y | X |
rs2524749 | Y | X |
rs2524797 | Y | X |
rs2524862 | Y | X |
rs2525234 | Y | X |
rs2557841 | Y | X |
rs2558153 | Y | X |
rs2562967 | Y | X |
rs2563090 | Y | X |
rs2563145 | Y | X |
rs2563212 | Y | X |
rs2563488 | Y | X |
rs2563845 | Y | X |
rs2563850 | Y | X |
rs2574085 | Y | X |
rs2574595 | Y | X |
rs2578863 | Y | X |
rs2580641 | Y | X |
rs2750380 | Y | X |
rs2750610 | Y | X |
rs2750816 | Y | X |
rs2751061 | Y | X |
rs2751444 | Y | X |
rs2751615 | Y | X |
rs2751964 | Y | X |
rs2754895 | Y | X |
rs2754899 | Y | X |
rs2754935 | Y | X |
rs2760594 | Y | X |
rs2766317 | Y | X |
rs2771511 | Y | X |
rs2771662 | Y | X |
rs2771666 | Y | X |
rs2774569 | Y | X |
rs2882725 | Y | X |
rs3021087 | MT | 1 |
rs35680999 | MT | 1 |
rs3749270 | 3 | 6 |
rs3853041 | Y | X |
rs401949 | Y | X |
rs4714901 | 6 | 3 |
rs61774271 | 5 | 1 |
rs8896 | MT | 1 |
rs9785828 | Y | 8 |
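
for anyone who wants to look for this kind of disagreement in their own data, here is a rough python sketch of one way to do the comparison. it is an illustration only, not the procedure we used to build the table. the function names are hypothetical, and it assumes a tab-separated raw call file with the rsid in the first column and the chromosome in the second, `#` comment lines in both files, a VCF with CHROM in column 1 and ID in column 3, and the same contig naming in both files.

```python
def load_reference_chromosomes(vcf_lines):
    """Map each VCF record ID (e.g. an rsid) to the contig it appears on."""
    chrom_by_rsid = {}
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        chrom, rsid = fields[0], fields[2]
        chrom_by_rsid[rsid] = chrom
    return chrom_by_rsid


def find_chromosome_mismatches(call_lines, chrom_by_rsid):
    """Return (rsid, call chromosome, reference chromosome) for each disagreement."""
    mismatches = []
    for line in call_lines:
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        rsid, call_chrom = fields[0], fields[1]
        ref_chrom = chrom_by_rsid.get(rsid)
        # only report rsids present in the reference but on a different contig
        if ref_chrom is not None and ref_chrom != call_chrom:
            mismatches.append((rsid, call_chrom, ref_chrom))
    return mismatches
```

since the sites listed above were excluded from the v3.0 reference, reproducing them would require comparing against a VCF that still contains them, such as the dbSNP build itself.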

if you have any suggestions about the anomalous markers, please file an issue at the 2vcf github site. we appreciate your help in improving the utility, and welcome any other suggestions or issues you may have with the tool.