(最后更新：2017/10/01)

Lepmap3 遗传图谱构建流程

Lepmap3基本流程

运行前数据准备

vcf文件

vcf文件必须有likelihood信息，lepmap会用likelihood信息进行分析

pedigree文件

pedigree文件包含了样本名及家系等相关信息，pedigree文件为tab分隔的文本文件，示例如下：


CHR	POS	F	F	F	F	F	F
CHR	POS	female	male	progeny_1	progeny_2	progeny_3	progeny_4
CHR	POS	0	0	male	male	male	male
CHR	POS	0	0	female	female	female	female
CHR	POS	2	1	0	0	0	0
CHR	POS	0	0	0	0	0	0

每行涵义如下

line 1：家系（Family）名称
line 2：样本名称
line 3：为father的名称，对应第二行第四列
line 4：为mother的名称，对应第二行第三列
line 5：性别，1 male, 2 female, 0 unknown
line 6：表型，一般用不到。can be 0 for all individuals, this is not currently used

其他软件输出格式转换为Lepmap3格式

JoinMap loc files:

1	awk -f locsingle.awk file.loc\|awk -f loc2genotypes.awk\|awk -f genotypes2post.awk \|java -cp Lep-MAP3/bin ... data=- ...

Genotype assay data (in tab separated affx_genotypes.txt):

SNP   IND1 IND2 IND3 IND4   ...
AX-01 AA   AB   BB   NoCall ...
AX-02 AB   AA   AA   BB     ...
...

1	awk -f affx2post.awk affx_genotypes.txt\|java -cp Lep-MAP3/bin ParentCall2 data=pedigree.txt posteriorFile=- ...

运行方法

ParentCall2

该步骤是为了call亲本的基因型（如果已经有，可以矫正亲本基因型，如果没有，可以补缺失亲本基因型），也就是猜测亲本的基因型，~~并且可以查看性染色体的marker~~。支持Grandparents信息，以及半同胞家系。

The module ParentCall2 is used to call (possible missing or erroneous) parental genotypes and markers on sex chromosomes (XY or ZW system, assumes known offspring sex). Grandparents and half-sib familes are also supported (especially the grandparents should be provided if there is data on them).

1	java ParentCall2 data=pedigree.txt vcfFile=input.vcf removeNonInformative=1\|gzip >data.call.gz

Calling markers on autosomes and on X chromosome (X-)

1	java ParentCall2 data=... XLimit=2 ...

Calling markers on autosomes and on Z chromosome (Z-)

1	java ParentCall2 data=... ZLimit=2 ...

Filtering2

该步骤可以根据偏分离（dataTolerance）和缺失率过滤标记（missingLimit）

1	java Filtering2 data=data1.call dataTolerance=0.001 >data_f.call

SeparateChromosomes2

该步骤可以将标记进行分群

1
2
3

java SeparateChromosomes2 data=data_f.call lodLimit=5 >map5.txt
java SeparateChromosomes2 data=data_f.call lodLimit=5 theta=0.05 >map5.5.txt
java SeparateChromosomes2 data=data.call lodLimit=5 distortionLod=1 >map5.nofilt.txt

(parameter numThreads utilises multiple cores)

The size distribution of linkage groups can be obtained like this:

1	sort map5.txt\|uniq -c\|sort -n

JoinSingles2All

该步骤可以将上一步中未分到连锁群的标记重新插入到已有的连锁群中

The JoinSingles2All module assigns singular markers to existing LGs by computing LOD scores between each single marker and markers from the existing LGs. It needs a map file as input parameter and then it outputs a new map file with possible more markers assigned to LGs.

1 2	java -cp bin/ JoinSingles2All map=map5.txt data=data_f.call lodLimit=4 >map5_js.txt zcat data_f.call.gz\|java -cp bin/ JoinSingles2All map=map5.txt data=- lodLimit=3 lodDifference=2 >map5_js.txt

(parameter numThreads utilises muliple cores)

(iterated joinSingles2All yields same result as iterating it until no markers can be added)

1	java -cp bin/ JoinSingles2All map=map5.txt data=data_f.call lodLimit=4 iterate=1 >map5_js_iterated.txt

The size distribution of linkage groups can be obtained like this:

1	cut -f 1 map5_js.txt\|sort\|uniq -c\|sort -n

OrderMarkers2

对每个连锁群内的标记进行排序，以最大化likelihood

OrderMarkers2 orders the markers within each LG by maximizing the likelihood of the data given the order. Either a map file (map=map_file) or output of previous run of OrderMarkers2 (evaluateOrder=order_file) have to be provided (only the first column (the marker name or lg name) is read from these files).

1 2	java -cp bin/ OrderMarkers2 map=map5_js.txt data=data_f.call chromosome=1 >order1.txt java -cp bin/ OrderMarkers2 evaluateOrder=order1.txt data=data_f.call >order1.2.txt

已知标记顺序，计算遗传距离

1	java -cp bin/ OrderMarkers2 evaluateOrder=order2.txt data=data_f.call improveOrder=0 >order1.3.txt

已知标记顺序，计算中性图距离

1	java -cp bin/ OrderMarkers2 evaluateOrder=order1.txt data=data_f.call improveOrder=0 sexAveraged=1 >order1.SA.txt

无交叉减数分裂achiasmatic meiosis (父本中无重组交换):

1	java -cp bin/ OrderMarkers2 map=map.txt data=data_f.call recombination1=0

无交叉减数分裂achiasmatic meiosis (母本中无重组交换):

1	java -cp bin/ OrderMarkers2 map=map.txt data=data_f.call recombination2=0

分染色体排序

1
2
3

java -cp bin/ OrderMarkers2 map=mapBig.txt data=dataBig.call chromosome=1 >order1.1.txt
...
java -cp bin/ OrderMarkers2 map=mapBig.txt data=dataBig.call chromosome=N >orderN.1.txt

LMPlot

绘制点图
for every chromosome :

1
2
3

java -cp bin/ OrderMarkers2 map=map.txt data=data_f.call outputPhasedData=1 chromosome=1 >order_wp_1.txt
java -cp bin/ LMPlot order_wp_1.txt >order_wp_1.dot
xdot.py order_wp_1.dot

or:

1
2
3

java -cp bin/ OrderMarkers2 evaluateOrder=order1.txt data=data_f.call outputPhasedData=1 >order_wp_1.txt
java -cp bin/ LMPlot order_wp_1.txt >order_wp_1.dot
xdot.py order_wp_1.dot

Bioinformatics dōjō

LepMap3的使用方法说明