Physical maps are built with molecular biology techniques, such as fingerprinting, that examine DNA molecules directly to determine the positions of sequence features. The physical distance between landmarks is measured in base pairs.

Recommendations

Summary

For data managers who want to display physical maps in browsers:
  1. Use the FPC format for physical map raw data
  2. Use the GFF3 format for data integration

 

1. Raw data format and submission process

    • File formats: We recommend the FPC file format (generated by the FPC and LTC software).

2. Data integration file format

GFF3 is the recommended format for integrating data into GBrowse to display physical maps. See the “Genome annotations” recommendation for more details about the GFF3 format. See also how to produce GFF3.

Warning: GFF2 to GFF3 conversion

Converting a file from GFF2 to GFF3 format is problematic for several reasons. There are several GFF2 to GFF3 converters available on the web, but each makes specific assumptions about the GFF2 data that limit its applicability. GMOD does not endorse (or disparage) any particular converter. If you have GFF2 data from an external source, and they do not also provide it in GFF3 format, then you may be stuck with GFF2.

If the GFF2 file does not use Sequence Ontology (SO) terms in column 3, the feature types must be translated to SO terms as part of the conversion.
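The type translation described above can be sketched as a simple lookup on column 3. This is a minimal illustration, not a complete converter: the function name translate_type and the mapping table GFF2_TO_SO are hypothetical, and the SO terms chosen for each GFF2 type are assumptions that should be checked against the Sequence Ontology for your own data.

```python
# Hypothetical GFF2 type -> SO term table. The entries are illustrative
# assumptions; verify each term against the Sequence Ontology before use.
GFF2_TO_SO = {
    "Contig": "contig",
    "Clone": "BAC",
    "Marker": "genetic_marker",
}


def translate_type(line, table=GFF2_TO_SO):
    """Return a GFF line with column 3 replaced by its mapped SO term."""
    line = line.rstrip("\n")
    # Leave comments, directives and blank lines untouched.
    if not line.strip() or line.startswith("#"):
        return line
    cols = line.split("\t")
    if len(cols) < 8:
        raise ValueError(f"expected at least 8 tab-separated columns: {line!r}")
    if cols[2] not in table:
        # Fail loudly rather than silently passing a non-SO type through.
        raise ValueError(f"no SO term mapped for type {cols[2]!r}")
    cols[2] = table[cols[2]]
    return "\t".join(cols)
```

Raising an error on unmapped types is a deliberate choice here: it forces every type in the GFF2 file to be reviewed instead of leaking untranslated values into the GFF3 output.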

Another big problem is that GFF2 supports only one level of feature nesting. While you can certainly reproduce this minimal nesting in GFF3, it would be better to also convert your feature representations to be multi-level at the time you migrate the data to GFF3.
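In GFF3, multi-level nesting is expressed with the ID and Parent attributes in column 9. The sketch below rewrites a GFF2-style attribute string (key "value" pairs) into GFF3 key=value syntax and optionally attaches a feature to a parent. The function names are hypothetical, and treating a BAC as a child of its contig is an assumption made only for illustration; choose the hierarchy that fits your data.

```python
import re

# GFF2 attributes look like: key "value"; key "value"
GFF2_ATTR = re.compile(r'(\w+)\s+"([^"]*)"')

# Tags with a predefined meaning in GFF3; other tags are lowercased because
# GFF3 reserves tags that start with an uppercase letter.
RESERVED = {"ID", "Name", "Alias", "Parent", "Target", "Gap", "Derives_from",
            "Note", "Dbxref", "Ontology_term", "Is_circular"}


def escape(value):
    """Percent-encode the characters that are special in GFF3 column 9."""
    for ch in "%;=&,\t":  # '%' comes first so it is not escaped twice
        value = value.replace(ch, f"%{ord(ch):02X}")
    return value


def gff2_attrs_to_gff3(attr_text, feature_id=None, parent_id=None):
    """Convert a GFF2 attribute string to GFF3, adding ID/Parent if given."""
    pairs = []
    if feature_id is not None:
        pairs.append(("ID", feature_id))
    if parent_id is not None:
        pairs.append(("Parent", parent_id))
    for key, value in GFF2_ATTR.findall(attr_text):
        pairs.append((key if key in RESERVED else key.lower(), value))
    return ";".join(f"{key}={escape(value)}" for key, value in pairs)


attrs = 'BAC "TaaCsp3BFhA_0290A06"; Name "TaaCsp3BFhA_0290A06"; Contig_hit "110"'
print(gff2_attrs_to_gff3(attrs, feature_id="TaaCsp3BFhA_0290A06",
                         parent_id="ctg110"))
```

For the Parent link to be valid, the parent feature (here the contig) needs a matching ID attribute on its own line.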

Convert data format
You can convert different formats to GFF3 using the Bioconvert tool.

https://bioconvert.readthedocs.io/en/master/_images/conversion.png

Most popular tools

Physical map building

Visualization tools

Example

GFF3 sample of the 3B physical map browser:

ctg110	assembly	contig	1	1041601	.	.	.	Sequence "ctg110"; Name "ctg110"
ctg110	FPC	BAC	820801	938401	.	.	.	BAC "TaaCsp3BFhA_0290A06"; Name "TaaCsp3BFhA_0290A06"; Contig_hit "110" 
ctg110	FPC	BAC	835201	912001	.	.	.	BAC "TaaCsp3BFhA_0130L06"; Name "TaaCsp3BFhA_0130L06"; Contig_hit "110" 
ctg110	FPC	BAC	261601	468001	.	.	.	BAC "TaaCsp3BFhA_0117E07"; Name "TaaCsp3BFhA_0117E07"; Contig_hit "110" 
ctg110	FPC	BAC	55201	327601	.	.	.	BAC "TaaCsp3BFhA_0111D21"; Name "TaaCsp3BFhA_0111D21"; Contig_hit "110" 
ctg110	FPC	marker	808801	808801	.	.	.	marker "Ta#S32641420-3B"; Name "Ta#S32641420-3B"; Contig_hit "ctg110 - 1" (TaaCsp3BFhA_0347M21)
ctg110	FPC	marker	345601	345601	.	.	.	marker "Xcfp1207-3B"; Name "Xcfp1207-3B"; Contig_hit "ctg110 - 1" (TaaCsp3BFhA_0017C12)
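As a rough illustration of how such records can be read, the sketch below splits a tab-separated line into its nine columns and derives distances in base pairs from the coordinates. The helper names parse_feature and feature_length are hypothetical, and the code assumes tab-delimited input with 1-based, inclusive start and end coordinates.

```python
def parse_feature(line):
    """Split one tab-separated GFF line into a dict of named columns."""
    cols = line.rstrip("\n").split("\t", 8)
    if len(cols) < 8:
        raise ValueError(f"expected at least 8 tab-separated columns: {line!r}")
    return {
        "seqid": cols[0],
        "source": cols[1],
        "type": cols[2],
        "start": int(cols[3]),
        "end": int(cols[4]),
        "attributes": cols[8] if len(cols) > 8 else "",
    }


def feature_length(feature):
    """Length in base pairs, assuming 1-based inclusive coordinates."""
    return feature["end"] - feature["start"] + 1


bac = parse_feature(
    'ctg110\tFPC\tBAC\t820801\t938401\t.\t.\t.\tName "TaaCsp3BFhA_0290A06"')
marker_a = parse_feature(
    'ctg110\tFPC\tmarker\t808801\t808801\t.\t.\t.\tName "Ta#S32641420-3B"')
marker_b = parse_feature(
    'ctg110\tFPC\tmarker\t345601\t345601\t.\t.\t.\tName "Xcfp1207-3B"')

print(feature_length(bac))                    # 117601
print(marker_a["start"] - marker_b["start"])  # 463200
```

The second value is the physical distance between the two markers on contig ctg110, expressed in base pairs.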

 

 

Additional information

Workflow for physical map data:
Workflow_phys_map

 

Written by: WDI working group
Published on: 02 October 2014
Updated on: 09 July 2015