天涯海角


Latest PICARD-tools installation (from v1.124)

PICARD Homepage

http://broadinstitute.github.io/picard/

https://github.com/broadinstitute/picard

I am not going to describe how excellent and useful this toolset is here; I just want to celebrate that the authors have finally integrated all the tools into one package, so you no longer need to look up the jar name every time you want a tool. The old routine was tedious: set PICARDHOME or PICARD_HOME to the package root directory, find the jar you want, and run java -jar PACKAGE_NAME [options] ….

Now I am happy to see that all the functions are integrated into one package, 'picard.jar', so you just call

$> java -jar $PICARDHOME/picard.jar

Here is how I install PICARD-tools and make it simple to use.

 

Installation (Linux):

 

1. Create the directory where you want to install PICARD-tools

$> mkdir -p $HOME/programs/picard/v1.124/x86_64

$> cd $HOME/programs/picard/v1.124/x86_64

 

2. Download the software from github

$> wget https://github.com/broadinstitute/picard/releases/download/1.124/picard-tools-1.124.zip

$> unzip picard-tools-1.124.zip

$> ls

picard-tools-1.124

$> mv picard-tools-1.124 bin

$> cd bin

$> ls -1

htsjdk-1.124.jar
libIntelDeflater.so
picard.jar
picard-lib.jar
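
As a quick sanity check after unpacking, a small shell function along these lines can confirm that the jars from the listing above are really in place. This is only a sketch: check_picard_dir is a name I made up, the file list is simply the 1.124 listing shown above, and libIntelDeflater.so is left out on the assumption that it is an optional native library.

```shell
#!/bin/sh
# check_picard_dir DIR (hypothetical helper): return 0 if DIR contains the
# jar files from the 1.124 listing, otherwise name what is missing on stderr.
check_picard_dir() {
    dir=$1
    status=0
    for f in picard.jar picard-lib.jar htsjdk-1.124.jar; do
        if [ ! -f "$dir/$f" ]; then
            echo "missing: $dir/$f" >&2
            status=1
        fi
    done
    return $status
}
```

For the layout used in this post you would call it as check_picard_dir "$HOME/programs/picard/v1.124/x86_64/bin".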

 

3. [optional] Create a wrapper script to call picard.jar directly

$> vim picard

# Press i on your keyboard to enter Vim insert mode

# Paste the following into the terminal:

——————————-file border—do not copy this line——————————-

#!/bin/sh

# Resolve the directory that holds this script, following symlinks
CurDir=$(cd "$(dirname "$(readlink -f "$0")")" && pwd)

java -jar "$CurDir/picard.jar" "$@"

——————————file border—–do not copy this line—————————–

# Press Esc, then type :wq to save and exit

# This creates a file with the content shown above; any other text editor, such as gedit, works just as well.

$> chmod +x ./picard

# Then add the directory containing this picard file to your PATH by editing ~/.bashrc or /etc/profile, or in some other way.

$> export PATH=$HOME/programs/picard/v1.124/x86_64/bin:$PATH

In my case I run the export line above so that the setting takes effect immediately in the current shell. Remember to change the picard path to yours.
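
An export typed at the prompt only lasts for the current shell. To make it permanent, one option is to append the same line to ~/.bashrc, but only once. The sketch below does that; add_path_line is a hypothetical helper name of mine, and the directory you pass it should be your own install path.

```shell
#!/bin/sh
# add_path_line RC_FILE DIR (hypothetical helper): append an
# 'export PATH="DIR:$PATH"' line to RC_FILE unless that exact line is
# already present, so running it repeatedly is harmless.
add_path_line() {
    rc_file=$1
    dir=$2
    line="export PATH=\"$dir:\$PATH\""
    touch "$rc_file"
    # -x: match the whole line, -F: treat the pattern as a fixed string
    grep -qxF "$line" "$rc_file" || printf '%s\n' "$line" >> "$rc_file"
}
```

With the paths from this post the call would be add_path_line "$HOME/.bashrc" "$HOME/programs/picard/v1.124/x86_64/bin"; open a new terminal (or source ~/.bashrc) afterwards.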

 

4. test

$> picard

USAGE: PicardCommandLine <program name> [-h]

Available Programs:
————————————————————————————–
Fasta:                                           Tools for manipulating FASTA, or related data.
    CreateSequenceDictionary                     Creates a SAM or BAM file from reference sequence in fasta format
    ExtractSequences                             Extracts intervals from a reference sequence, writing them to a FASTA file
    NormalizeFasta                               Normalizes lines of sequence in a fasta file to be of the same length

————————————————————————————–
Illumina Tools:                                  Tools for manipulating data specific to Illumina sequencers.
    CheckIlluminaDirectory                       Asserts the validity of the data in the specified Illumina basecalling data
    CollectIlluminaBasecallingMetrics            Given an Illumina basecalling and a lane, produces per-lane-barcode basecalling metrics
    CollectIlluminaLaneMetrics                   Collects Illumina lane metrics for the given basecalling analysis directory
    CollectIlluminaSummaryMetrics                Collects summary metrics according to Illumina specifications.
    ExtractIlluminaBarcodes                      Tool to determine the barcode for each read in an Illumina lane
    IlluminaBasecallsToFastq                     Generate fastq file(s) from data in an Illumina basecalls output directory
    IlluminaBasecallsToSam                       Generate a SAM or BAM file from data in an Illumina basecalls output directory
    MarkIlluminaAdapters                         Reads a SAM or BAM file and rewrites it with new adapter-trimming tags

————————————————————————————–
Interval Tools:                                  Tools for manipulating Picard interval lists.
    BedToIntervalList                            Converts a BED file to an Picard Interval List.
    IntervalListTools                            General tool for manipulating interval lists
    LiftOverIntervalList                         Lifts over an interval list from one reference build to another
    ScatterIntervalsByNs                         Writes an interval list based on splitting the reference by Ns

————————————————————————————–
Metrics:                                         Tools for reporting metrics on various data types.
    CalculateHsMetrics                           Calculates Hybrid Selection-specific metrics for a SAM or BAM file
    CollectAlignmentSummaryMetrics               Produces from a SAM or BAM a file containing summary alignment metrics
    CollectBaseDistributionByCycle               Program to chart the nucleotide distribution per cycle in a SAM or BAM file.
    CollectGcBiasMetrics                         Collects information about GC bias in the reads in the provided SAM or BAM
    CollectHiSeqXPfFailMetrics                   Classify PF-Failing reads in a HiSeqX Illumina Basecalling directory into various categories.
    CollectInsertSizeMetrics                     Writes insert size distribution metrics for a SAM or BAM file
    CollectJumpingLibraryMetrics                 Produces jumping library metrics for the provided SAM/BAMs
    CollectMultipleMetrics                       A “meta-metrics” calculating program that produces multiple metrics for the provided SAM/BAM
    CollectOxoGMetrics                           Collects metrics quantifying the CpCG -> CpCA error rate from the provided SAM/BAM
    CollectQualityYieldMetrics                   Collects a set of metrics that quantify the quality and yield of sequence data from the provided SAM/BAM
    CollectRnaSeqMetrics                         Produces RNA alignment metrics for a SAM or BAM file
    CollectRrbsMetrics                           Collects metrics about bisulfite conversion for RRBS data
    CollectTargetedPcrMetrics                    Produces Targeted PCR-related metrics given the provided SAM/BAM
    CollectWgsMetrics                            Writes whole genome sequencing-related metrics for a SAM or BAM file
    EstimateLibraryComplexity                    Estimates library complexity from the sequence of read pairs
    MeanQualityByCycle                           Writes mean quality by cycle for a SAM or BAM file
    QualityScoreDistribution                     Charts quality score distributions for a SAM or BAM file

————————————————————————————–
Miscellaneous Tools:                             A set of miscellaneous tools.               
    BaitDesigner                                 Designs baits or oligos for hybrid selection reactions.
    FifoBuffer                                   FIFO buffer used to buffer input and output streams with a customizable buffer size

————————————————————————————–
SAM/BAM:                                         Tools for manipulating SAM, BAM, or related data.
    AddCommentsToBam                             Adds comments to the header of a BAM file
    AddOrReplaceReadGroups                       Replaces read groups in a BAM or SAM file with a single new read group
    BamIndexStats                                Generates index statistics from a BAM file
    BamToBfq                                     Create BFQ files from a BAM file for use by the Maq aligner.
    BuildBamIndex                                Generates a BAM index (.bai) file
    CalculateReadGroupChecksum                   Creates a hash code based on the read groups (RG) in the SAM or BAM header.
    CheckTerminatorBlock                         Asserts the provided gzip file’s (e.g., BAM) last block is well-formed; RC 100 otherwise
    CleanSam                                     Cleans the provided SAM/BAM, soft-clipping beyond-end-of-reference alignments and setting MAPQ to 0 for unmapped reads
    CompareSAMs                                  Compares two input SAM or BAM files
    DownsampleSam                                Down-sample a SAM or BAM file to retain a random subset of the reads
    FastqToSam                                   Converts a fastq file to an unaligned BAM or SAM file
    FilterSamReads                               Creates a new SAM or BAM file by including or excluding aligned reads
    FixMateInformation                           Ensure that all mate-pair information is in sync between each read and its mate pair
    GatherBamFiles                               Concatenates one or more BAM files together as efficiently as possible
    MarkDuplicates                               Examines aligned records in the supplied SAM or BAM file to locate duplicate molecules.
    MarkDuplicatesWithMateCigar                  Examines aligned records in the supplied SAM or BAM file to locate duplicate molecules.
    MergeBamAlignment                            Merges alignment data from a SAM or BAM with data in an unmapped BAM file
    MergeSamFiles                                Merges multiple SAM or BAM files into one file
    ReorderSam                                   Reorders reads in a SAM or BAM file to match ordering in reference
    ReplaceSamHeader                             Replace the SAMFileHeader in a SAM file with the given header
    RevertOriginalBaseQualitiesAndAddMateCigar   Reverts the original base qualities and adds the mate cigar tag to read-group BAMs
    RevertSam                                    Reverts SAM or BAM files to a previous state
    SamFormatConverter                           Convert a BAM file to a SAM file, or a SAM to a BAM
    SamToFastq                                   Converts a SAM/BAM into a FASTQ
    SortSam                                      Sorts a SAM or BAM file
    SplitSamByLibrary                            Splits a SAM or BAM file into individual files by library
    ValidateSamFile                              Validates a SAM or BAM file
    ViewSam                                      Prints a SAM or BAM file to the screen

————————————————————————————–
VCF/BCF:                                         Tools for manipulating VCF, BCF, or related data.
    FilterVcf                                    Hard filters a VCF.
    GatherVcfs                                   Gathers multiple VCF files from a scatter operation into a single VCF file
    GenotypeConcordance                          Calculates the concordance between genotype data for two samples in two different VCFs
    MakeSitesOnlyVcf                             Creates a VCF bereft of genotype information from an input VCF or BCF
    MergeVcfs                                    Merges multiple VCF or BCF files into one VCF file or BCF
    RenameSampleInVcf                            Rename a sample within a VCF or BCF.
    SortVcf                                      Sorts one or more VCF files
    SplitVcfs                                    Splits an input VCF or BCF file into two VCF or BCF files
    UpdateVcfSequenceDictionary                  Takes a VCF and a second file that contains a sequence dictionary and updates the VCF with the new sequence dictionary.
    VcfFormatConverter                           Converts a VCF file to a BCF file, or BCF to VCF
    VcfToIntervalList                            Converts a VCF or BCF file to a Picard Interval List.

————————————————————————————–

# To use a tool, give its name after picard, for example:

$> picard SamToFastq

# This shows the option help for SamToFastq

……
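
To give an idea of what a real call looks like: Picard 1.x tools take their options as KEY=VALUE pairs. The sketch below wraps two typical calls in a tiny dry-run helper so you can preview the command lines before anything is executed. The run helper and the sample file names are made up for illustration; the option names are the ones I know from the 1.x releases, so check the help output of your version before relying on them.

```shell
#!/bin/sh
# run CMD...: echo the command; execute it only when DRY_RUN is empty.
run() {
    echo "+ $*"
    [ -n "${DRY_RUN:-}" ] || "$@"
}

# Preview only; set DRY_RUN= (empty) to really execute the commands.
DRY_RUN=1

# Hypothetical input file; replace with your own BAM.
IN_BAM=sample.bam

# Sort the alignments by coordinate.
run picard SortSam INPUT="$IN_BAM" OUTPUT=sample.sorted.bam SORT_ORDER=coordinate

# Convert the sorted BAM back to paired FASTQ files.
run picard SamToFastq INPUT=sample.sorted.bam FASTQ=sample_1.fastq SECOND_END_FASTQ=sample_2.fastq
```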

 

WORKS. DONE. I am a genius…… Ho………HighFive…….

Please leave a comment if you find this helpful.

Web genetic software

Haphazardly and sporadically updated by Dave McDonald, Dept. Zoology, University of Wyoming, Laramie, WY 82071-3166
http://www.uwyo.edu/dbmcd/mcd.html
dbmcd@uwyo.edu (307)-766-3012 Please send suggestions for updates, corrections, etc.

[Another much larger list is at: http://www.nslij-genetics.org/soft/ Seems to contain many programs that are not of intense interest to those whose primary interest is population genetics of natural populations…]

Program list

AFLPOP 1.1.xls –see Duchesne

AFLP-SURV – see Vekemans

API-CALC see Ayres

Arlequin — see Excoffier

Assignment tests — see Paetkau, also Cornuet.

Beast – see Drummond

BLAST — NIH site for finding related DNA sequences http://www.ncbi.nlm.nih.gov/BLAST/

Bottleneck – see Cornuet

CAIC –- see Purvis

Cervus — see Marshall

ClustalX — sequence alignment software
http://inn-prot.weizmann.ac.il/software/ClustalX.html (Mac, Windows, Linux et al.)

CPC — see Phillips Common Principal Components

Delrious (relatedness Mathematica notebook)– see Stone

DISPAN – see http://mep.bio.psu.edu/readme.html

distruct – see Rosenberg

Excel Microsatellite Toolkit – see Park

famoz Mol Ecol Notes online Aug-03

FSTAT — see Goudet

GDA — see Lewis

GENALEX – see Peakall

GENECLASS — see Cornuet

GENEPOP — see Raymond and Rousset

GeneStat — see Lewis.

Genetix – see Montpellier group.

Genographer – see Benham, James MSU

GeoDis — see Posada

Gimlet – see Valière

Hickory – see Holsinger

Identix (French) – see Belkhir

IMMANC — see Rannala.

Kinship — see Goodnight and Queller.

LEA — see Langella

MARK – see Ritland

McMantell — see McDonald.

MEGA — see Kumar

Micsat — see Wilson.

Microsat — see Stanford (Goldstein, Shriver et al.)

Migrate — see Beerli.

MISAT — see Nielsen.

MS Tools (Excel macro utility)– see Park

MSA (microsatellite analyzer) — see Schlotterer

Parentage — see Emery et al.

Partition — see Dawson and Belkhir Bayesian approach

PartitionML — see Belkhir

PCAGen — see Goudet.

PDAP – see Garland

PHYLIP — see Felsenstein.

POPGENE – see

Populations — see Langella

PowerSSR — see Liu

Relatedness 5.07— see Goodnight and Queller.

RSTCalc— see Goodman (Goudet’s FSTAT also)

Sample size calculator – see York Univ. stats web

SPAGeDi — see Hardy

Structure — see Pritchard

SWEEP_BOTT (requires C) – see Galtier

TextWrangler – utility converts Mac OSX carriage returns to Unix line feeds; see Felsenstein.

TFPGA — see Miller.

TreeExplorer — see Kumar

TreeView — see Page.

WINAMOVA — see Michalakis and Excoffier.

Measures produced:
Alignment of DNA sequences: ClustalX
Allelic richness: FSTAT

Cavalli-Sforza distances: PHYLIP, TFPGA (?)

Dominant marker analyses: AFLPOP, AFLP-Survey, Hickory
F-statistics: FSTAT, GDA, GenePop, GeneStat, Genetix.
Gene diversity (D): GeneStat, TFPGA, Genetix.
Gene frequencies: (from genotypic data) FSTAT, Relatedness, others.
GST: GeneStat, FSTAT, TFPGA.
Hardy-Weinberg fit: GenePop, FSTAT, TFPGA, Arlequin.
Independent contrasts: CAIC

K (number of pops.): Structure

Mantel tests: Genetix, TFPGA, McMantell.
Ne (effective pop. size): Migrate, Misat.
Nei’s distance (’72, ’78): AFLP-Survey, GeneStat, GDA, FSTAT, TFPGA (PHYLIP: Nei’s 1972 only).
Nei and Li (1979): RestDist in PHYLIP (useful for individual-based trees with AFLP data)

Nested clade: GeoDis
PCA: Principal comp. analysis w/ PCP, PCAGen, MiniTab
Phylo-independent contrast: CAIC, PDAP

Relatedness: Relatedness, Identix, SPAGeDi, Delrious, MARK, AFLP-Surv
RST: FSTAT, Genetix, RSTCalc.
Rogers’ distance: TFPGA, GeneStat.
Theta (θ): GDA, FSTAT. (Cockerham & Weir F-stat)
Tree diagrams: TreeView
θ (F-statistic): PowerSSR, GDA, FSTAT, Genetix. (Cockerham & Weir)
θ = 4Neµ: Migrate, Misat

Alphabetical, by author/programmer.

________________________________________________________________

Ayres, K.L. and A.D.J. Overall. 2004. API-CALC 1.0: a computer program for calculating the average probability of identity allowing for substructure, inbreeding and the presence of close relatives. Mol. Ecol. Notes 4: 315-318.

http://www.rdg.ac.uk/statistics/genetics/

API-CALC v. 1.0 Windows 19-Nov-04
Calculates the probability of identification in the presence of given degrees of relatedness in the population, inbreeding and coancestry

________________________________________________________________

Beerli, P.

http://popgen.csit.fsu.edu/ miggui 0.8 Graphical interface for Mac OSX10.2+

http://evolution.genetics.washington.edu/lamarc/migrate.html

Migrate Version 2.0.6 26-May-05 √ Macintosh
Requires repeat numbers for input if using SMM. Takes huge amount of time (days)

Beerli, P. 1998. Estimation of migration rates and population sizes in geographically structured populations. Pp. 39-53 in Advances in molecular ecology (G. Carvalho, ed.). NATO-ASI workshop series. IOS Press, Amsterdam.

Beerli, P. and J. Felsenstein. 1999. Maximum likelihood estimation of migration rates and effective population numbers in two populations using a coalescent approach. Genetics 152:763-773.

________________________________________________________________

Belkhir, K., Castric, V., and F. Bonhomme. 2002. IDENTIX, a software to test for relatedness in a population using permutation methods.

Identix
http://www.univ-montp2.fr/~genetix/labo.htm#programmes (actual download)
http://www.univ-montp2.fr/%7Egenetix/identix_ms.pdf (downloads PDF)
Site is clunky and download is unclear, but…..
See “Identix.pdf” Input can be “Import” of GenePop inputs created in MS Tools
(a word of warning: you need to understand a little French; logiciels means software, téléchargement means download. Good luck)

Partition ML http://www.univ-montp2.fr/%7Egenetix/partitionml.htm

ftp://162.38.181.25/pub/partitionml.zip (actual download)

________________________________________________________________

Benham, James. MT State U. (Hordeum Sequencer User group) james_benham@hmc.edu

Genographer 1.6. Java 11-Sep-04 √
This program will read in data from an ABI 3700, 3100, 377 or 373, CEQ 2000 or SCF and reconstruct them into a gel image which is straightened and sized. Bins can be defined easily and viewed as thumbnails, which allows for a fairly quick and easy way of scoring a gel.
http://hordeum.oscs.montana.edu/genographer/

________________________________________________________________

Cornuet, J.-M.

http://www.montpellier.inra.fr/URLB/geneclass/geneclass.html

GENECLASS 1.0.02 Windows 26-Oct-04 √
GeneClass is a program for assignation and exclusion using molecular markers
(similar to, but more diverse than Paetkau assignment testing; includes Bayesian approach)
Has not been updated since 1999

http://www.ensam.inra.fr/URLB/bottleneck/ bottleneck.html

BOTTLENECK 1.2.02 Windows 28-Aug-01 √
Uses GENEPOP input format
Detecting recent effective population size reductions from allele data frequencies

From URLB/: Pop100gene is a small tool for population genetics that computes various kinds of information.

________________________________________________________________

Dawson, K.J., and K. Belkhir. A Bayesian approach to the identification of panmictic populations and the assignment of individuals. 2001. Genetical Research 78: 59-77.

http://www.genetix.univ-montp2.fr/partition/partition.htm
Improved link for Partition Windows 20-Oct-02 Uses same input format as Genetix

(you need to understand a little French; logiciels means software, téléchargement means download. Good luck)

PartitionML under Belkhir is a maximum likelihood approach

________________________________________________________________

DeWoody, U. of Georgia (Avise lab.) now at Purdue dewoody@fnr.purdue.edu

Parentage and exclusion programs. Matlab and Excel.

http://www.genetics.uga.edu/popgen/parentage.html

________________________________________________________________

Drummond, Alexei and Andrew Rambaut , University of Oxford
BEAST v1.0.3 2002-2003 Bayesian Evolutionary Analysis Sampling Trees
Windows, Mac, Linux (uses Java virtual machine) Aug-03 √
package for evolutionary inference from molecular sequences
BEAST uses a complex and powerful input format (specified in XML) to describe the evolutionary model. This has advantages in terms of flexibility in that the developers of BEAST do not have to try and predict every analysis that researchers may wish to perform and explicitly provide an option for doing it. However, this flexibility means it is possible to construct models that don’t perform well under the Markov chain Monte Carlo (MCMC) inference framework used. We cannot test every possible model that can be used in BEAST. There are two solutions to this: Firstly, we supply a range of recipes for commonly performed analyses that we know should work in BEAST and provide example input files for these (although, the actual data can also produce unexpected behaviour). Secondly, we provide advice and tools for the diagnosis of problems and suggestions on how to fix them:
http://evolve.zoo.ox.ac.uk/Beast/

________________________________________________________________

Duchesne, P., and L. Bernatchez. 2002. AFLPOP : A computer program for simulated and real population allocation based on AFLP data. Molecular Ecology Notes. 3: 380-383.
Assignment and other utilities for AFLP data.

http://www.bio.ulaval.ca/louisbernatchez/downloads.htm
AFLPOP Excel (with macros) 11-Sep-04 √

________________________________________________________________

Emery, A.M., I.J. Wilson, S. Craig, P.R. Boyle and R. Noble. 2001. Assignment of paternity groups without access to parental genotypes: multiple mating and development plasticity in squid. Molecular Ecology: 10: 1265-1278.

http://maths.abdn.ac.uk/~ijw/downloads/download.htm
Parentage Windows, Unix 20-Oct-02 √ PDF help file “parentage.pdf”

________________________________________________________________

Excoffier

http://anthropologie.unige.ch/arlequin/
Arlequin 2.000 Windows, Macintosh, Java 2-Nov-00 √
Does many kinds of analyses for many kinds of molecular data (RFLPs, AFLPs, minisatellites, sequences, microsatellites, allozymes et al.) Very difficult to get input correct. May work better via MSTools intermediary

________________________________________________________________

Felsenstein, Joe, U of Washington

http://evolution.genetics.washington.edu/
PHYLIP (many systematics, tree-building and pop. gen. routines)
Windows, PowerMac, C etc. For nice tree diagram program from PHYLIP treefile outputs see TreeView by Page.
Cite by: Felsenstein, J. 1995. PHYLIP (phylogeny inference package), version 3.57 manual. U. of Washington, Seattle.
[Infile notes: If you use PHYLIP on OSX you will likely find that you get an infile memory error due to carriage return issues (ASCII/ISO). I use a Mathematica routine, but you may want to download the free TextWrangler utility from http://www.barebones.com/products/textwrangler/download.shtml
Open the file in TextWrangler and click the 5th icon in the menu bar (the little page symbol) and then click Unix. Save and quit. File will now be acceptable to PHYLIP. You can also “Save as” MS-DOS with line breaks EACH TIME you modify the file. ]
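
If you prefer the command line to TextWrangler, the same line-ending fix can be sketched with standard Unix tools. This is only an illustration, not part of PHYLIP: cr_to_lf is a name made up here, and it assumes the problem really is classic-Mac (CR) or DOS (CRLF) line endings.

```shell
#!/bin/sh
# cr_to_lf IN OUT (hypothetical helper): write a copy of IN to OUT with
# Unix line feeds only.
cr_to_lf() {
    cr=$(printf '\r')
    # 1) sed drops a CR that sits directly before a line feed (DOS CRLF)
    # 2) tr turns every remaining bare CR (classic Mac) into a line feed
    sed "s/$cr\$//" "$1" | tr '\r' '\n' > "$2"
}
```

For example, cr_to_lf infile.orig infile writes a converted copy named infile, which is the file name PHYLIP programs look for by default.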
LAMARC (Kuhner, Yamato, Beerli)

________________________________________________________________

Galtier, Nicolas, U Montpellier, France

http://biology.ucr.edu/people/faculty/Garland/PDAP.html
SWEEP_BOTT √ ANSI C (just learn C and a little French over a weekend, and you’re on your way….)
dedicated to the detection of bottlenecks and selective-sweeps from a coalescence-based maximum-likelihood analysis of DNA sequence polymorphism data
Galtier N., Depaulis F., and Barton N.H. 2000. Detecting bottlenecks and selective sweeps from DNA sequence polymorphism. Genetics 155: 981-987.

__________________________________________

Garland, T., Jr. UC Riverside

http://biology.ucr.edu/people/faculty/Garland/PDAP.html
PDAP √ For Windows/DOS
PHENOTYPIC DIVERSITY ANALYSIS PROGRAMS; Independent contrasts etc.
Garland, T., Jr., A. W. Dickerman, C. M. Janis, and J. A. Jones. 1993. Phylogenetic analysis of covariance by computer simulation. Systematic Biology 42:265-292.

________________________________________________________________

Goodnight, K. & Queller

http://www.bioc.rice.edu/Keck2.0/labs/
Relatedness 5.07 √ For PowerMac (resolves problem with System 8.5)
Kinship 1.3 √ 2-Nov-00
Queller, D.C., and Goodnight, K.F. 1989. Estimating relatedness using genetic markers. Evol. 43, 258-275.

________________________________________________________________

Goodman, Simon

http://helios.bto.ed.ac.uk/evolgen/rst/rst.html
RST Calc 2.2 Windows 25-Nov-99 √
Cite by: Goodman, S.J. 1997. RST Calc: a collection of computer programs for calculating estimates of genetic differentiation from microsatellite data and determining their significance. Mol. Ecol. 6: 881-885.

________________________________________________________________

Goudet: (see also Raymond & Rousset GENEPOP) Lausanne University, Switzerland
Cite by: Goudet J. 1995. FSTAT Version 1.2: a computer program to calculate F-statistics.
J. Heredity 86: 485-486.

FSTAT Version 2.9.3.2 Windows 11-Sep-01 √ Nov-04
http://www2.unil.ch/popgen/softwares/fstat.htm
Assesses several variance-based measures (Theta, θ, of Weir and Cockerham),
RST of Goodman, allelic richness. Tests for HWE, various other things.

PCAGEN Windows Principal components analysis of gene frequency data.
http://www2.unil.ch/popgen/softwares/pcagen.htm 28-Nov-99 √

________________________________________________________________

Hardy, O.J. :

http://www.ulb.ac.be/sciences/lagev/spagedi.html
SPAGeDi Version 1.2d Windows 30-Aug-04√
Computes various statistics describing relatedness or differentiation between individuals or populations by pairwise comparisons, and analyzes how these values are related to geographical distances, 1°) in a way similar to a spatial autocorrelation analysis, 2°) by linear regressions (the slopes of these regressions can be used to obtain indirect estimates of gene dispersal distance parameters such as neighborhood size). The statistics computed include Fst, Rst, Ds (Nei’s standard genetic distance), and (delta mu)² (Goldstein and Pollock 1997) for analyses at the population level and, for analyses at the individual level, pairwise kinship, relatedness and fraternity coefficients as well as Rousset’s distance between individuals and a kinship analogue based on allele size. Jackknife over loci gives approximate standard errors, and permutations of locations, individuals or genes provide ad hoc tests. In addition, the actual variance of these statistics can be estimated following the method of Ritland (2000), providing a measure necessary for marker-based inference of the heritability or Qst of quantitative traits.

________________________________________________________________

Holsinger, Kent, and Paul Lewis, U. of Conn.

http://darwin.eeb.uconn.edu/hickory/hickory.html
Hickory V. 1.03 Windows 1-Jun-05√
The software implements the Bayesian method described in Holsinger (1999) for estimating F-statistics from co-dominant marker data and the method described in Holsinger et al. (2002) for estimating F-statistics from dominant marker data. It also includes routines to allow posterior comparisons as described in Holsinger and Wallace (2004).

Holsinger, K. E. 1999. Analysis of genetic diversity in geographically structured populations: a Bayesian perspective. Hereditas 130:245–255.

Holsinger, K. E., and L. E. Wallace. 2004. Bayesian approaches for the analysis of population genetic structure: an example from Platanthera leucophaea (Orchidaceae). Molecular Ecology 13:887-894.

Holsinger, K. E., P. O. Lewis, and D. K. Dey. 2002. A Bayesian approach to inferring population structure from dominant markers. Molecular Ecology 11:1157-1164.

________________________________________________________________

Kumar, S.: (Authors: Sudhir Kumar, Koichiro Tamura, Ingrid Jakobsen, Masatoshi Nei)

http://www.megasoftware.net/
MEGA v. 4.0 β version Windows 7-Apr-07 √
“…the goal of the MEGA (Molecular Evolutionary Genetics Analysis) software project has been to make useful methods of comparative sequence analysis easily accessible to the scientific community…”

TreeExplorer (embedded in MEGA) has a great feature that allows one to collapse branches in individual-based trees to show homogeneous clades

Nei, M. and S. Kumar. 2000. Molecular Evolution and Phylogenetics. Oxford Univ. Press, NY. ISBN 0195135857

________________________________________________________________

Langella, O Olivier.Langella@pge.cnrs-gif.fr

http://www.pge.cnrs-gif.fr/bioinfo/wini386/populations.exe

http://www.pge.cnrs-gif.fr/bioinfo/ (then look at the [cryptic] list of software choices in the banner near the top of the screen; for the others you need to understand a little French; logiciels means software)
Populations 1.2.24 Apr-02 Windows . Population genetic software (individuals or populations distances, phylogenetic trees).

http://www.pge.cnrs-gif.fr/bioinfo/lea/index.php?lang=en
LEA Likelihood based estimation of admixtures.

Langella O., L. Chikhi, and M. Beaumont. 2001 LEA (Likelihood-based estimation of admixture) : a program to simultaneously estimate admixture and the time since admixture Molecular Ecology Notes 1(4): 357-358.

(you need to understand a little French; logiciels means software, téléchargement means download. Good luck)

________________________________________________________________

Lewis, P.O.

http://lewis.eeb.uconn.edu/lewishome/software.html
GDA v. 1.0d13 Windows 28-Aug-01 √. Calculates many of the genetic estimates found in Weir, B.S. 1996. Genetic Data Analysis. Sinauer, Sunderland, MA.

GeneStat. Old DOS-based program for analysis of codominant, allelic marker data. D. McD. has this on diskette and computers.
Cite by: Lewis, P. O., and R. Whitkus. 1989. GENESTAT for microcomputers. ASPT Newsletter 2: 15-16.

________________________________________________________________

Liu, J.

http://www.stat.ncsu.edu/~kliu2/index.htm
PowerSSR Windows 8-Apr-02 √. PowerSSR is a comprehensive set of statistical methods for discrete genetic data analysis, designed especially for microsatellite data analysis. From the NC State Weir lab. (Website not active Oct-02)

________________________________________________________________

Marshall, T.

http://helios.bto.ed.ac.uk/evolgen/cervus/cervusregister.html
CERVUS 1.0 Windows 25-Nov-99
Cite by: Marshall, TC, Slate, J, Kruuk, L and Pemberton, JM (1998) Statistical confidence for likelihood-based paternity inference in natural populations. Mol. Ecol. 7: 639-655.

________________________________________________________________

McDonald, D.B., University of Wyoming

McDonald, D.B., W.K. Potts, J.W. Fitzpatrick, and G.E. Woolfenden. 1999. Contrasting genetic structures in sister species of North American scrub-jays. Proc. Royal Soc. London B. 266: 1117-1125.

dbmcd@uwyo.edu
McMantell — Macintosh (System 9/Classic) application that conducts a Mantel test (correspondence between genetic and geographic distances). A separate randomization procedure compares strength of correlation across two sets of matrices. That is, in addition to asking whether genetic distance correlates with geographic distance, one can ask whether the correlations in data set one are stronger than those in set two.
Also available as a Mathematica notebook by request to McD, (requires expensive Mathematica software, so worth it only if you already use Mathematica: http://www.wolfram.com/)

________________________________________________________________

Michalakis & Excoffier

http://acasun1.unige.ch/LGB/Software/Windoze/amova
WINAMOVA

or ftp from acasun1.unige.ch/pub/comp/win/amova

________________________________________________________________

Miller, Mark P. Tools for Population Genetic Analyses TFPGA ASU post-doc
Mark.Miller@cnr.usu.edu

http://bioweb.usu.edu/mpmbio/

20-Nov-01 √
Includes Mantel test, AMOVA Prep. Major overlap with other programs

________________________________________________________________

Montpellier group Genetix 4.05 Belkhir K., P. Borsa, L. Chikhi, N. Raufaste, and F. Bonhomme

http://www.univ-montp2.fr/~genetix/genetix/genetix.htm
20-Jun-06 √
Includes F-statistics, permutation tests for HWE, Mantel (you need to understand a little French; logiciels means software, téléchargement means download. Good luck)
See also Partition, which uses same data input format

Belkhir K., Borsa P., Chikhi L., Raufaste N. & Bonhomme F. 1996-2002 GENETIX 4.04, logiciel sous Windows TM pour la génétique des populations. Laboratoire Génome, Populations, Interactions, CNRS UMR 5000, Université de Montpellier II, Montpellier (France).

________________________________________________________________

NCSU Power SSR Windows program for microsatellite analysis

http://www.stat.ncsu.edu/~kliu2/download.htm
28-Feb-02 √
17 distance measures, 4 hierarchy levels

________________________________________________________________

Nielsen, Rasmus

http://www.biom.cornell.edu/Homepages/Rasmus_Nielsen/misat/misat.1.0.exe

http://ib.berkeley.edu/labs/slatkin/rasmus/MISAT.1.0.hqx
Misat 1.0 for PowerMac or (newer) Windows October-01 √
Calculates 4Nµ. See McD. ln-regression approach to resolving multi-locus, multi-pop. output.

Nielsen, Rasmus. 1997. A likelihood approach to population samples of microsatellite alleles. Genetics 146: 711-716

________________________________________________________________

Paetkau/Brzustowski

http://www.biology.ualberta.ca/jbrzusto/Doh.html
Assignment testing on the web. Uses Titterington et al. 1981 for correction (ref in Msat refs.doc)

Based on Paetkau et al. 1995. Mol. Ecol. 4: 347 (Msat refs.doc)

________________________________________________________________

Page, Rod.

http://taxonomy.zoology.gla.ac.uk/rod/treeview.html
TreeView (1.6.6 Sep-01). A nice application for producing images of trees. Works especially well from treefiles generated by PHYLIP. Windows, Macintosh.
OS X version (0.2.0) available Apr-02
Cite by: Page, R.D.M. 1996. TREEVIEW: An application to display phylogenetic trees on personal computers. Computer Applications in the Biosciences 12: 357-358.

________________________________________________________________

Park, Stephen. Animal Genomics Lab, University College, Dublin, Ireland

spark@ucd.ie;

http://animalgenomics.ucd.ie/sdepark/ms-toolkit/
Excel Microsatellite Toolkit v. 3.1 Windows (only) Dec-01
An Excel spreadsheet toolkit for data conversion etc. The tools available allow you to check data, format data for population genetics programs (Arlequin, Microsat, Genepop, Fstat etc) and perform a number of basic calculations. The updated version allows you to:
work with haploid data as well as diploid data
save input files for genetics programs direct from Excel with no need for further editing
access help using the new help file
choose which loci and populations in your dataset to work with
calculate allele-sharing index of Chakraborty and Jin. 1993.

________________________________________________________________________________

Peakall, R., and Smouse, P. E. (2001) GenAlEx V5: Genetic Analysis in Excel. Population genetic software for teaching and research. Australian National University, Canberra, Australia.

http://www.anu.edu.au/BoZo/GenAlEx/
Excel-based package for teaching and analysis for a range of markers and problems.
Mac or Windows versions 8-Nov-04 √

_______________________________________________

Phillips, P.C., and S.J. Arnold. 1999. Hierarchical comparison of genetic variance-covariance matrices. I. Using the Flury hierarchy. Evol. 53: 1506-1515.

http://www.uoregon.edu/~pphil/programs/cpc/cpc.htm
CPC – Common Principal Component Analysis Program
Mac, Linux, DEC, or Windows versions 20-Oct-02 √
Reference: Flury, B. 1988. Common Principal Components and Related Multivariate Methods. Wiley, New York

________________________________________________________________

Posada, D. (Templeton method)

http://bioag.byu.edu/zoology/crandall_lab/programs.htm
GeoDis: nested clade analyses of Templeton. Mol. Ecol. 9: 487-488.

________________________________________________________________

Pritchard, J.K., M. Stephens, and P. Donnelly. 2000. Inference of population structure using multilocus genotype data. Genetics 155: 945-959.

Falush, D., M. Stephens, and J.K. Pritchard. 2007. Inference of population structure using multilocus genotype data: dominant markers and null alleles. Mol. Ecol. Notes 7: 574–578.

http://pritch.bsd.uchicago.edu/ Windows 27-Jul-04 √

(V. 2.2 incorporates explicit accommodation of AFLP/dominant marker data. Since the manual is impenetrable for Mac installation and for file input format, I have put some tips below.)

PDF of ms. as well as help file “Struct ReadMe.pdf” and “Pritchard.pdf”. Best not to give population information to Structure a priori.

________________________________________________________________

Purvis, A. Imperial College, UK

http://www.bio.ic.ac.uk/evolve/software/caic/ v. 2.6.9 Mar-02 Macintosh 18-Oct-05 √

Conducts comparative analysis by independent contrasts.

Purvis, A., and A. Rambaut (1995) Comparative analysis by independent contrasts (CAIC): an Apple Macintosh application for analysing comparative data. Computer Applications in the Biosciences (CABIOS) 11: 247-251.

Felsenstein, J. 1985. Phylogenies and the comparative method. Am. Nat. 125: 1-15.

Pagel, M.D. 1992. A method for the analysis of comparative data. J. theor. Biol. 156: 431-442.

________________________________________________________________

Rannala, Bruce. SUNY-Stony Brook

http://allele.bio.sunysb.edu/software.html

All the programs below are for Windows 95/NT

BMDC MLE of allele age Version 2.1 25-Nov-99 √

Slatkin, M., and B. Rannala. 1997. Estimating the age of alleles by use of intraallelic variability. Am. J. Human Genetics 60: 447-458.

PMLE estimation of gene flow Version 2.0 25-Nov-99 √

Rannala, B., and J. A. Hartigan. 1996. Estimating gene flow in island populations. Genetical Res. 67:147-158.

IMMANC Version 5.0 Nov-25-99 √ Joanna Mountain.

Rannala, B., and J.L., Mountain. 1997. Detecting immigration by using multilocus genotypes. PNAS USA 94: 9197-9201.

________________________________________________________________

Raymond M. & Rousset F, 1995. GENEPOP (version 3.3): population genetics software for exact tests and ecumenicism. J. Heredity, 86:248-249

Hardy-Weinberg test, differentiation, linkage disequilibrium

Very clunky DOS-based format!!!

ftp://ftp.cefe.cnrs-mop.fr/pub/pc/msdos/genepop

GENEPOP 3.1d Windows 28-Aug-01 √
Includes tests from the following references.

Garnier-Gere P and Dillmann C, 1992. A computer program for testing pairwise linkage disequilibria in subdivided populations. J Heredity 83:239.

Goudet J, Raymond M, De Meeüs T and Rousset F, 1996. Testing differentiation in diploid populations. Genetics 144:1933-1940.

Raymond M and Rousset F, 1995. An exact test for population differentiation. Evolution 49:1280-1283.

Rousset F and Raymond M, 1995. Testing heterozygote excess and deficiency. Genetics 140:1413-1419.

Rousset F, 1996. Equilibrium values of measures of population subdivision for stepwise mutation processes. Genetics 142:1357-1362.

Rousset F, 1997. Genetic differentiation and estimation of gene flow from F-statistics under isolation by distance. Genetics 145:1219-1228.

________________________________________________________________

Ritland, K. 2005. Multilocus estimation of pairwise relatedness with dominant markers. Mol. Ecol. 14: 3157-3165.

Assesses relatedness for dominant markers such as AFLP

http://www.genetics.forestry.ubc.ca/ritland/programs.html (not the same as listed in the paper!)

MARK (updated Sep-04) Windows 1-Oct-05 √

________________________________________________________________

Rosenberg, Noah noahr@usc.edu

http://www.cmb.usc.edu/people/noahr/distruct.html 26-May-05
Sun, Linux, Windows
distruct is a program that can be used to graphically display results produced by the genetic clustering program structure. The figures produced by distruct display individual membership coefficients in the same form as used in Genetic structure of human populations Science 298:2381-2385 (2002). Various options enable the user to control left-to-right printing order of populations, bottom-to-top printing order of clusters, colors, and other graphical details.

________________________________________________________________

Stanford (Lynch, Goldstein, Shriver)

http://hpgl.stanford.edu/projects/microsat/ 27-Nov-02
Microsat 1.5b software does many microsatellite distance analyses (e.g., dm2 of Goldstein), but may have errors (e.g., in DSW of Shriver et al.). Macintosh

________________________________________________________________

Schlotterer

http://i122server.vu-wien.ac.at/MSA/MSA_download.html 11-Oct-04
Microsatellite Analyzer software does many microsatellite distance analyses Macintosh or Windows

________________________________________________________________

Stone, J., and M. Björklund.

Delrious is a computer program that accepts as input data representing codominant, single-locus, diploid molecular markers (e.g., microsatellites), applies to them the algorithm described in Lynch, M. & K. Ritland. 1999. Genetics 152:1753-1766, and returns delta D and relatedness r estimates. Delrious can implement bootstrap and jackknife resampling procedures to provide confidence measures. Syntax templates for running analyses are contained in a notebook distributed with Delrious. Delrious uses Mathematica (Wolfram Research, Inc. 2000) as a software platform and can be run under Linux, Macintosh, Microsoft Windows, or Unix operating systems.
http://www.zoo.utoronto.ca/stone/DELRIOUS/delrious.htm

________________________________________________________________

Tufto, J., S. Engen, and K. Hindar, 1996. Inferring patterns of migration from gene frequencies under equilibrium conditions. Genetics 144:1911-1921.

Tufto, J., A. F. Raybould, K. Hindar, and S. Engen, 1998. Analysis of genetic structure and dispersal patterns in a population of sea beet. Genetics in press.

http://www.ed.ac.uk/~jarlet/migration/ 25-Nov-99
S-Plus software required

________________________________________________________________

Valière, N. 2003. GIMLET: a computer program for analysing genetic individual identification data. Mol. Ecol. Notes 2: 377-379.
Laboratoire de Biométrie et de Biologie Evolutive, Université de Lyon, France

http://pbil.univ-lyon1.fr/software/Gimlet/gimlet%20frame1.html 18-Nov-04
Gimlet v. 1.3.2 software addresses issues in individual identification (e.g., forensic matching) where errors may result from low-quality DNA (e.g., from hair, feces). Among its abilities are: estimating error rates, false alleles and allelic dropout, finding matches, estimating kinship and providing several basic measures of genetic variability.

________________________________________________________________

Vekemans, X., T. Beauwens, M. Lemaire, and I. Roldan-Ruiz. 2002. Data from amplified fragment length polymorphism (AFLP) markers show indication of size homoplasy and of a relationship between degree of homoplasy and fragment size. Mol. Ecol. 11: 139-151.
Laboratoire de Génétique et d’Ecologie Végétales, Université Libre de Bruxelles, Belgium

http://www.ulb.ac.be/sciences/lagev/aflp-surv.html 16-Jun-05
AFLP-SURV v. 1.0. AFLP-SURV estimates genetic diversity and population genetic structure from population samples analysed with AFLP or RAPD methods and computes genetic distance matrices between populations (Nei’s distance and a 1-r based measure, as developed or modified by Lynch and Milligan 1994). The program starts by estimating allelic frequencies at each marker locus in each population, assuming the markers are dominant and have only two alleles (a dominant marker allele coding for the presence of a band at a given position, and a recessive null allele coding for the absence of the band). Relies heavily on methods developed in Lynch and Milligan (1994).
Lynch, M. and B.G. Milligan 1994. Analysis of population genetic structure with RAPD markers. Mol Ecol. 3: 91-99. * Indiana http://www.bio.indiana.edu/facultyresearch/faculty/Lynch.html
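To make the two-allele dominant-marker assumption concrete, here is a minimal Python sketch of the simplest estimator I know of, the square-root method. It additionally assumes Hardy-Weinberg proportions, so that individuals lacking the band are recessive homozygotes and their fraction estimates q squared. The function name is hypothetical, and this is not necessarily the estimator AFLP-SURV itself applies; its own methods may differ and correct for bias.

```python
from math import sqrt


def dominant_marker_frequencies(band_scores):
    """Estimate allele frequencies at one dominant locus in one population.

    band_scores: list of 1 (band present) or 0 (band absent), one entry
    per individual. Assumes Hardy-Weinberg proportions (my assumption),
    so the band-absent fraction estimates q**2.
    """
    if not band_scores:
        raise ValueError("need at least one scored individual")
    absent_fraction = band_scores.count(0) / len(band_scores)
    q = sqrt(absent_fraction)  # recessive null allele (no band)
    p = 1.0 - q                # dominant marker allele (band present)
    return p, q, 2.0 * p * q   # third value: expected heterozygosity


# Example: four individuals, one of which lacks the band.
p, q, expected_het = dominant_marker_frequencies([1, 1, 1, 0])
```

With small samples this square-root estimate is biased, which is one reason dedicated programs are worth using for real data.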

________________________________________________________________

Wilson, I.J., and D.J. Balding. 1998. Genealogical inference from microsatellite data. Genetics 150: 499-510.

http://www.maths.abdn.ac.uk/~ijw/downloads/download.htm

Micsat. Written in C, with a Windows executable available 20-Oct-02 √

________________________________________________________________

Notes and scraps….

/mac/development/source/macstarterpascal1.0.cpt.hqx

168 12/19/93 BinHex4.0,Compact1.51

A simple application shell for THINK Pascal. Uses a window class to provide basic window behavior: dragging, changing size, zooming, closing and vertical and horizontal scroll bars. Detailed knowledge of the THINK Class Library not required.

A source for computer software on the following topics: genetic linkage analysis, marker mapping, linkage disequilibrium mapping, and pedigree drawing.
http://www.nslij-genetics.org/soft/

Check on the following

Sequence Navigator

Mac seq.app

PRIMER!

http://www.math.yorku.ca/SCS/Demos/power/

The site above will do a sample size analysis for you simply and quickly, giving you an idea of the sample size needed to detect differences at an error level that you choose. You have to set the parameters: for instance, if you want to detect a difference of 0.8 (the effect size) between two populations at some error level (say, 0.05), the program tells you that you will need to sample X number of individuals.
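The same kind of calculation can be roughed out in a few lines of Python. This is only a sketch under stated assumptions, not the method the site above uses: it takes a two-sided, two-sample comparison of means, a standardized effect size, a normal approximation, and a power of 0.80 (the power is my assumption; the text above specifies only the effect size and the error level). The function name is hypothetical.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate individuals needed per population (normal approximation).

    effect_size: difference between the two population means, in units
    of the common standard deviation.
    """
    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)  # two-sided error level
    z_power = standard_normal.inv_cdf(power)
    return ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


# Effect size 0.8 at error level 0.05, as in the example above.
print(sample_size_per_group(0.8))
```

An exact calculation based on the t distribution gives slightly larger numbers, so treat the result as a lower bound for planning.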

__________________________________________________________________________________________________________________

Installing Structure 2.2 on Mac OSX.

1. Open a Terminal (UNIX interface) shell:

a. Find and open Terminal.app

b. Click “New shell” in the “File” menu

2. After the default prompt (something like “mcdg4-2:~ mcd$”), type (or better, paste) the full path name for where the Structure program is housed. Note that UNIX treats spaces and certain other characters in folder (directory) names as special, so an unquoted path containing them produces an error message saying that the file or directory does not exist; avoid such characters in folder names, or quote the path. On my Intel Powerbook the path name is

“/Applications/ScienceApps/GeneticsPrograms/Structure222/structure”

3. Save the “new shell” as something like “StructureTerminalStarter”.

4. Keep that shell in the same folder that has the Structure 2.2 app.

5. Create aliases for the shell and put them wherever you actually want to start using Structure 2.2.

6. So far, I can’t seem to get the Mac version to run jobs (k = 1,2,3,….) or even to allow me to go back to an existing project (both of which work on the Windows version). That means it’s much less convenient, forcing me to run jobs one setup at a time….

The input file (since the manual is impenetrable) for AFLP data:

Line 1: list of n Loci

Line 2: list of n zeros

Lines 3 & 4 (duplicates): first “data lines”

Optional ID; optional population number; Extra column for Ord 1-166 with 5 digits missing for the 5 longnose suckers;

Population number codes: 1=B; 2=BF; 3=BW; 4=F; 5=FW; 6=W; 7=WS (Laramie River). Then 1 for fragment present, 0 for fragment absent.

101 103 105 106 108 113 114

0 0 0 0 0 0 0

20 UA2_2n AB1_E1M2_01_A01 1 0 0 0 1 0 0

20 UA2_2n AB1_E1M2_01_A01 1 0 0 0 1 0 0

24 MK_Cow AB13_E1M2_01_E02 0 0 0 0 0 0 0

24 MK_Cow AB13_E1M2_01_E02 0 0 0 0 0 0 0

23 MKWyeth AB14_E1M2_01_F02 1 0 1 0 0 0 1

23 MKWyeth AB14_E1M2_01_F02 1 0 1 0 0 0 1

* Thanks to Amy Blair of CSU for this tip
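The layout above (a line of locus labels, a line of zeros, then each individual's data line written twice) is easy to generate with a short script. Here is a minimal Python sketch, assuming space-separated columns as in the example lines; the function name and the representation of each individual as a (leading columns, band scores) pair are hypothetical, and the leading columns are simply copied through as given.

```python
def structure_aflp_lines(loci, individuals):
    """Build the lines of a Structure input file for AFLP data.

    loci: list of locus labels, e.g. [101, 103, 105].
    individuals: list of (leading_columns, band_scores) pairs, where
    leading_columns holds the ID/population columns copied as-is and
    band_scores holds one 1 (present) or 0 (absent) per locus.
    """
    lines = [
        " ".join(str(label) for label in loci),  # Line 1: the n loci
        " ".join("0" for _ in loci),             # Line 2: n zeros
    ]
    for leading_columns, band_scores in individuals:
        if len(band_scores) != len(loci):
            raise ValueError("each individual needs one score per locus")
        fields = [str(c) for c in leading_columns] + [str(s) for s in band_scores]
        row = " ".join(fields)
        lines.extend([row, row])                 # each data line appears twice
    return lines


loci = [101, 103, 105, 106, 108, 113, 114]
individuals = [
    ((20, "UA2_2n", "AB1_E1M2_01_A01"), [1, 0, 0, 0, 1, 0, 0]),
    ((24, "MK_Cow", "AB13_E1M2_01_E02"), [0, 0, 0, 0, 0, 0, 0]),
]
file_lines = structure_aflp_lines(loci, individuals)
```

Writing `"\n".join(file_lines)` to a text file gives something in the same shape as the example; check it against your own Structure settings before running a long job.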

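Downloads for several commonly used RNA analysis programs

1. RNA draw 1.1 b

Description: RNAdraw: an integrated program for RNA secondary structure calculation and analysis under 32-bit Microsoft Windows

RNA draw 1.1b2 is RNA secondary structure analysis software: it analyzes and draws the secondary structure of an RNA primary sequence. Its most interesting feature is that most functions are reached through the right mouse button.

Download: http://www.ibioo.com/soft/biosoft/20070806/1664.html

2. RNAstructure 4.5

Description: The Windows version of the Unix program mfold. Enter or load an RNA primary sequence and it predicts a secondary structure diagram by free energy minimization. An excellent program.

Download: http://www.ibioo.com/soft/biosoft/20070806/1665.html

3. Vienna RNA Package 1.64

Description: The University of Vienna's package for RNA secondary structure prediction and comparison. Running it under Windows requires Cygwin, which emulates a Linux environment. The authors also provide the C source files, so anyone interested can compile it under Windows themselves.

Download: http://www.ibioo.com/soft/biosoft/20070806/1666.html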