org.bdgenomics.adam

package rdd

Type Members

  1. class ADAMContext extends Serializable with Logging

  2. class ADAMRDDFunctions[T] extends Serializable with Logging

  3. trait ADAMSaveAnyArgs extends SaveArgs

  4. abstract class ADAMSequenceDictionaryRDDAggregator[T] extends Serializable with Logging

    A class that provides functions to recover a sequence dictionary from a generic RDD of records.

    T

    Type contained in this RDD.

  5. class ADAMSpecificRecordSequenceDictionaryRDDAggregator[T] extends ADAMSequenceDictionaryRDDAggregator[T]

    A class that provides functions to recover a sequence dictionary from a generic RDD of records that are defined in Avro. This class assumes that the reference identification fields are defined inside of the given type.

    T

    A type defined in Avro that contains the reference identification fields.

    Note

    Avro classes that have specific constraints around sequence dictionary contents should not use this class. Examples include ADAMRecords and ADAMNucleotideContigs.

  6. class Coverage extends Serializable

    A base is 'covered' by a region set if any region in the set contains the base itself.

    The 'coverage regions' of a region set are the unique, disjoint, non-adjacent, minimal set of regions which contain every covered base, and no bases which are not covered.

    The Coverage class calculates the coverage regions for a given region set.
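
    The definition of coverage regions above can be illustrated with a minimal sketch on plain Scala collections. It is not the Coverage class itself: the `CoveredRegion` type, the half-open `[start, end)` convention, and the `coverageRegions` name are assumptions made for this example.

```scala
// Hypothetical stand-in for a genomic region: half-open interval [start, end) on a contig.
final case class CoveredRegion(contig: String, start: Long, end: Long)

object CoverageSketch {
  // Merge overlapping or adjacent regions into a minimal, disjoint, non-adjacent set.
  def coverageRegions(regions: Seq[CoveredRegion]): Seq[CoveredRegion] = {
    // Sort so that regions that can merge are neighbours.
    val sorted = regions.sortBy(r => (r.contig, r.start, r.end))
    sorted.foldLeft(List.empty[CoveredRegion]) {
      case (last :: rest, r) if last.contig == r.contig && r.start <= last.end =>
        // r overlaps or touches the most recent merged region: extend it.
        last.copy(end = math.max(last.end, r.end)) :: rest
      case (acc, r) =>
        // r starts a new, separate coverage region.
        r :: acc
    }.reverse
  }
}
```

    Because the intervals are half-open, a region starting exactly where the previous one ends is adjacent to it and is merged, which matches the "non-adjacent" requirement in the definition.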

  7. case class GenomeBins(binSize: Long, seqLengths: Map[String, Long]) extends Serializable with Product

    Partition a genome into a set of bins.

    Note that this class will not tolerate invalid input, so filter in advance if you use it.

    binSize

    The size of each bin in nucleotides.

    seqLengths

    A map containing the length of each contig.
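
    As a rough illustration of fixed-size binning, the sketch below shows the arithmetic for a given `binSize` and `seqLengths`. The helper names and the 0-based position convention are assumptions for this example and are not taken from the GenomeBins API.

```scala
object GenomeBinsSketch {
  // Number of bins needed to cover a contig of the given length (ceiling division).
  def binsOnContig(length: Long, binSize: Long): Long =
    (length + binSize - 1) / binSize

  // Index, within its contig, of the bin that holds a 0-based position.
  def binWithinContig(pos: Long, binSize: Long): Long =
    pos / binSize

  // Total number of bins across every contig in the genome.
  def totalBins(binSize: Long, seqLengths: Map[String, Long]): Long =
    seqLengths.values.map(binsOnContig(_, binSize)).sum
}
```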

  8. case class GenomicPositionPartitioner(numParts: Int, seqLengths: Map[String, Long]) extends Partitioner with Logging with Product with Serializable

    GenomicPositionPartitioner partitions ReferencePosition objects into separate, spatially-coherent regions of the genome.

    This can be used to organize genomic data for computation that is spatially distributed (e.g. GATK and Queue's "scatter-and-gather" for locus-parallelizable walkers).

    numParts

    The number of equally-sized regions into which the total genomic space is partitioned; the total number of partitions is numParts + 1, with the "+1" resulting from one extra partition that is used to capture null or UNMAPPED values of the ReferencePosition type.

    seqLengths

    A map from sequence name to sequence length, covering every sequence in the genome.
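
    The numParts + 1 layout can be sketched without Spark as follows. This is one plausible scheme, not the library's implementation: ordering contigs by name, modelling an unmapped position as `None`, and the helper names are all assumptions for this example.

```scala
object PositionPartitionSketch {
  // Start offset of each contig in a single linear coordinate space.
  // Contigs are ordered by name here; that ordering is an assumption.
  def offsets(seqLengths: Map[String, Long]): Map[String, Long] = {
    val names = seqLengths.keys.toSeq.sorted
    val starts = names.scanLeft(0L)((acc, n) => acc + seqLengths(n))
    names.zip(starts).toMap
  }

  // Partition index for a (contig, 0-based position) pair; None models an unmapped position.
  def partitionFor(pos: Option[(String, Long)],
                   numParts: Int,
                   seqLengths: Map[String, Long]): Int =
    pos match {
      case None =>
        numParts // the extra "+1" partition for null or unmapped values
      case Some((contig, p)) =>
        val total = seqLengths.values.sum
        val global = offsets(seqLengths)(contig) + p
        // Scale the linear position into one of numParts equally-sized regions.
        ((global * numParts) / total).toInt
    }
}
```

    Mapped positions land in partitions 0 to numParts - 1 in genome order, so neighbouring loci usually share a partition, while unmapped values are kept apart in partition numParts.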

  9. case class GenomicRegionPartitioner(partitionSize: Long, seqLengths: Map[String, Long], start: Boolean = true) extends Partitioner with Logging with Product with Serializable

  10. class InstrumentedADAMAvroParquetOutputFormat extends InstrumentedOutputFormat[Void, IndexedRecord]

  11. class PairingRDD[T] extends Serializable

    PairingRDD provides simple helper methods that take an RDD (presumably one whose values are in some reasonable or intelligible order within and across partitions) and return paired or windowed views on that list of items.

    T

    The type of the values in the RDD.
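
    To show what "paired" and "windowed" views mean, here is a small sketch on a local `Seq` rather than an RDD. The method names `pairs` and `windows` are assumptions for this example; a distributed version would also have to handle items that straddle partition boundaries, which this sketch ignores.

```scala
object PairingSketch {
  // Each item paired with its immediate successor.
  def pairs[T](items: Seq[T]): Seq[(T, T)] =
    items.zip(items.drop(1))

  // All full-width sliding windows; a trailing partial window is dropped.
  def windows[T](items: Seq[T], width: Int): List[Seq[T]] =
    items.sliding(width).filter(_.length == width).toList
}
```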

  12. case class ReferencePartitioner(sd: SequenceDictionary) extends Partitioner with Product with Serializable

    Repartitions objects that are keyed by a ReferencePosition or ReferenceRegion into a single partition per contig.

  13. trait RegionJoin extends AnyRef

  14. case class ShuffleRegionJoin(sd: SequenceDictionary, partitionSize: Long) extends RegionJoin with Product with Serializable

Value Members

  1. object ADAMContext extends Serializable

  2. object BroadcastRegionJoin extends RegionJoin

    Contains multiple implementations of a 'region join', an operation that joins two sets of regions based on the spatial overlap between the regions.

    Different implementations have different performance characteristics, and new implementations will likely be added in the future. See the notes on each individual method for more details.
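
    For reference, the semantics of a region join can be written as a naive nested-loop join over local collections. This is only a sketch of the operation's meaning, not one of the implementations in this object: the `Interval` type, its half-open `[start, end)` convention, and the `join` signature are assumptions for this example.

```scala
// Hypothetical region type: half-open interval [start, end) on a contig.
final case class Interval(contig: String, start: Long, end: Long) {
  // Two intervals overlap if they share a contig and at least one base.
  def overlaps(other: Interval): Boolean =
    contig == other.contig && start < other.end && other.start < end
}

object RegionJoinSketch {
  // Emit every (left value, right value) pair whose regions overlap.
  def join[A, B](left: Seq[(Interval, A)], right: Seq[(Interval, B)]): Seq[(A, B)] =
    for {
      (lr, a) <- left
      (rr, b) <- right
      if lr.overlaps(rr)
    } yield (a, b)
}
```

    This version compares every left region with every right region, so it is quadratic in the input size; practical implementations differ mainly in how they avoid that cost.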

  3. object GenomicPositionPartitioner extends Serializable

  4. object GenomicRegionPartitioner extends Serializable

  5. object PairingRDD extends Serializable

  6. package contig

  7. package features

  8. package fragment

  9. package read

  10. package variation
