
Output files description
========================
  
File Format (.vhy)
------------------

The primary output of the vhybridize script is in .vhy format.  This
simple file format is designed to be trivially parsable from
Python.  Each non-comment line is a tuple describing a probe hit,
except for the first line, which gives the format version.  No
backward-compatible parser is provided, but the format is simple
enough that one can either write a converter or use a legacy
distribution if a large data set in an older format needs to be
processed.  A probe hit record has the form:

  (start, stop, probe, %length, %identity, probe_length)

The tuple is valid Python syntax, so you can parse it with
eval().  Be aware that, even though this is convenient, it poses a
security risk: eval() executes whatever expression a line contains.
It follows that you should only use eval() on data that you generated
yourself or that comes from a trusted source.
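
For data of uncertain origin, one possible alternative to eval() is
ast.literal_eval() from the Python standard library, which accepts only
literals and rejects function calls and other expressions.  The sketch
below assumes that the probe name is written as a quoted string and
that the other fields are plain numbers.  The function name parse_hit
and the sample record are made up for illustration.

```python
import ast


def parse_hit(line):
    """Parse one probe hit line into a tuple without executing code."""
    record = ast.literal_eval(line.strip())
    if not isinstance(record, tuple):
        raise ValueError("not a probe hit record: %r" % line)
    return record


# Hypothetical record; the values are invented for illustration.
sample = "(120, 145, 'probe_A', 100.0, 96.0, 25)"
start, stop, probe, pct_length, pct_identity, probe_length = parse_hit(sample)
```

A line that contains anything other than a literal, such as a function
call, makes ast.literal_eval() raise ValueError instead of running it.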

Lines that start with ">" are optional.  When supplied, such a line
defines a comment for the block of lines that runs from the following
line until the next ">".  This is similar to the Fasta format,
except that in Fasta it is possible to merge and split lines without
changing the meaning of the file.  Here the format is line oriented,
and merging or splitting lines will result in data loss.

Comments can be inserted anywhere using "#", with Python semantics.
When new information is added to the tuple, a reasonable effort is
made to add it at the end of the tuple.  Code that does not index
fields relative to the end of the tuple should be able to move to
new file formats without pain.  For example, don't do:

  percent_id = rec[-2] # wrong

but do:

  percent_id = rec[4]  # right
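
The rules of this section can be combined into a small reader.  The
following is only a sketch under stated assumptions, not the parser
used by the package: it assumes that the version line is the first
line that is neither blank nor a full-line comment and keeps it as raw
text, and it assumes that probe names are quoted strings.  The function
name read_vhy, the version string and the sample data are hypothetical.

```python
import ast


def read_vhy(lines):
    """Return (version_line, blocks) from an iterable of .vhy lines.

    blocks is a list of (comment, hits) pairs; comment is None for
    hits that appear before any ">" line.
    """
    version = None
    blocks = []
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank line or full-line comment
        if version is None:
            version = line  # assumption: kept as raw text, not interpreted
            continue
        if line.startswith(">"):
            blocks.append((line[1:].strip(), []))
            continue
        if not blocks:
            blocks.append((None, []))
        # literal_eval ignores a trailing "# ..." comment, as Python does.
        blocks[-1][1].append(ast.literal_eval(line))
    return version, blocks


# Hypothetical file content, invented for illustration.
sample = """\
1.0
> contig_1
(120, 145, 'probe_A', 100.0, 96.0, 25)
(300, 325, 'probe_B', 100.0, 92.0, 25)  # trailing comment
> contig_2
(10, 35, 'probe_A', 100.0, 100.0, 25)
"""
version, blocks = read_vhy(sample.splitlines())
for comment, hits in blocks:
    # Index from the start of the tuple so that appended fields are harmless.
    print(comment, [hit[4] for hit in hits])
```

Because the reader works line by line, it relies on each record staying
on its own line, as the format requires.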


File Format (.vhx)
------------------

The tools in this package also support the .vhx format.  In .vhx all
meta information is stripped and only the order of the probes is
kept.  That is, a .vhx file contains only a list of probes.  It is
possible, though optional, to separate the probe names with a comma
(,).  An optional leading + sign means a forward hit of the probe,
while a leading - sign means a hit on the reverse complement.  It is
possible to include more than one ordered list of probes by defining
sections with lines that start with ">", just like in the .vhy and
Fasta formats.
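
As an illustration, a .vhx file could be read along the following
lines.  This is a sketch and not the package's own reader.  It assumes
that probe names are separated by commas and/or whitespace, that the
sign is attached directly to the name, and that a probe without a sign
has an unspecified orientation (stored as None).  The function name
read_vhx and the sample data are hypothetical.

```python
import re


def read_vhx(lines):
    """Return a list of (section_name, probes) pairs.

    Each probe is a (strand, name) pair where strand is "+", "-",
    or None when no sign was given.
    """
    sections = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            sections.append((line[1:].strip(), []))
            continue
        if not sections:
            sections.append((None, []))
        # Assumption: names are separated by commas and/or whitespace.
        for token in re.split(r"[,\s]+", line):
            if not token:
                continue  # e.g. after a trailing comma
            if token[0] in "+-":
                strand, name = token[0], token[1:]
            else:
                strand, name = None, token
            sections[-1][1].append((strand, name))
    return sections


# Hypothetical file content, invented for illustration.
sample = """\
> contig_1
+probe_A, -probe_B, probe_C
> contig_2
-probe_A +probe_C
"""
sections = read_vhx(sample.splitlines())
```

Each section keeps its probes in file order, which is the only
information a .vhx file carries besides the optional orientation.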

