NAME
substitch.pl -- Split/merge stitch files into/out of stitch files
SYNOPSIS
substitch.pl --split 5 allchromosomes.stitch  # split big stitch into 5 roughly equal chunks
substitch.pl --project allspecies.seqs sub.anchors  # project some anchors into a different coordinate space (as long as the stitch component sequences match)
OPTIONS
--verbose => makes output more verbose
--faketfidf => fake tfidf scores based on score stat in file
Note on split: This program does not claim to produce an optimal splitting. It
tries a couple of heuristics, refines the results, and picks the best arrangement
it has found so far. Technically this is a variation on the traditional
"trunk packing problem," which is (at least in the abstract case)
NP-hard, if I remember 15-251 correctly. This particular variety of trunk
packing, however, seems like it should be solvable faster (worst case some n^k
dynamic programming, I think, but I'm betting this way is faster and tons
easier to write for 90% of the cases out there). If anyone reading this goes
"You moron, this has been solved a thousand times already," please
let me know how: krisp@dna.bio.keio.ac.jp
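
To give a feel for the kind of heuristic the note above describes, here is a minimal sketch of one common approach: sort the pieces largest first and always drop the next piece into the currently lightest chunk. This is not the code substitch.pl actually runs; the function names (split_greedy, chunk_totals) and the sequence names and sizes are made up for illustration, and the sketch is in Python rather than Perl.

```python
import heapq

def split_greedy(sizes, k):
    """Assign named pieces to k chunks: largest piece first, lightest chunk first.

    sizes: dict mapping a (hypothetical) sequence name to its size.
    Returns a list of k lists of names.
    """
    heap = [(0, i) for i in range(k)]   # (running total, chunk index)
    chunks = [[] for _ in range(k)]
    # Sort by size descending; break ties by name so the result is deterministic.
    for name, size in sorted(sizes.items(), key=lambda kv: (-kv[1], kv[0])):
        total, i = heapq.heappop(heap)  # lightest chunk so far
        chunks[i].append(name)
        heapq.heappush(heap, (total + size, i))
    return chunks

def chunk_totals(sizes, chunks):
    """Total size of each chunk, in chunk order."""
    return [sum(sizes[name] for name in chunk) for chunk in chunks]

# Made-up sizes, purely for illustration.
sizes = {"chr1": 8, "chr2": 7, "chr3": 6, "chr4": 5, "chr5": 4}
chunks = split_greedy(sizes, 2)
totals = chunk_totals(sizes, chunks)
print(chunks, totals)
```

On this toy input the greedy pass ends up with chunk totals of 17 and 13, while a perfect 15/15 split exists (chr1 + chr2 versus chr3 + chr4 + chr5). That gap is why a refinement step after the first pass, as the note mentions, can be worth having.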