Class Sample
This class extends Attributable to inherit functionality for managing attributes. It provides fields and methods to store and manipulate variant calls, alleles, and other sample-specific data. Each instance of this class is uniquely identified by its name.
-
Field Summary
Fields (modifier and type, field, description):
- alleles: A map that assigns features to their corresponding alleles.
- final String name: The name or internal identifier of this sample.
- static final Pattern variantCallPattern: Regular expression pattern to match variant call strings.
- variantCalls: Hierarchical map structure to store variant calls.
Fields inherited from class datastructure.Attributable:
sampleOccurrence
-
Constructor Summary
Constructors -
Method Summary
Methods (modifier and type, method, description):
- int getAlleleCount(): Retrieves the number of alleles in this sample.
- getAlleles(): Retrieves the entries of the alleles map for this sample.
- static String getReferenceOfCall(String call): Extracts the reference base character from the starting position of a variant call string.
- getVariantCalls(String contig): Retrieves the variant calls for the specified contig in this sample.
- protected void setAllele: Associates a specific allele with a feature in this sample.
- toString(): Converts this sample to its string representation.
Methods inherited from class datastructure.Attributable:
addAttributeIfAbsent, addAttributesIfAbsent, attributesAsString, attributesAsString, clearAttributes, extendAttribute, extendAttributes, getAttribute, getAttributeAsCollection, getAttributeOrDefault, getAttributes, hasAttribute, hasAttributes, removeAttribute, setAttribute, setAttributes
-
Field Details
-
name
The name or internal identifier of this sample.
This field uniquely identifies the sample within the context of the application. It is a final field, so its value cannot change once it has been assigned during construction of the Sample instance.
-
variantCalls
Hierarchical map structure to store variant calls.
This map organizes variant calls in a hierarchical structure:
- First level: The key is the name of the contig (Contig.name).
- Second level: The key is the position of the variant on the contig.
- Third level: The value is a string representing the variant call, formatted as CALL_INDEX;DP;GQ;REF_0:ALT_0:AD_0:PL_0,... with the following fields:
  - CALL_INDEX: The numeric index of the alternative allele, optionally prefixed with f (low frequency) or x (low coverage).
  - DP: The read depth at the variant site.
  - GQ: The genotype quality score.
  - REF_0:.:AD_0:PL_0: The reference allele, a placeholder (`.`), the allele depth, and the phred-scaled likelihood.
  - REF_1:ALT_1:AD_1:PL_1,...: One or more alternate alleles, each with their respective reference allele, alternate allele, allele depth, and phred-scaled likelihoods.
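The field layout of a call string can be illustrated by splitting it on its separators. The following is a minimal sketch, not code from the class: the name CallStringFields is hypothetical, and it assumes that DP and GQ are always plain integers, as in the example string 1;13;99;TTC:.:0:585,TTC:T--:13:0.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration only; not part of the documented class.
final class CallStringFields {
    final String callIndex;      // CALL_INDEX, e.g. "1", or with a prefix such as "f1" or "x1"
    final int depth;             // DP (assumed to be an integer)
    final int genotypeQuality;   // GQ (assumed to be an integer)
    final List<String> alleles;  // one "REF:ALT:AD:PL" entry per allele, reference first

    CallStringFields(String call) {
        // A limit of 4 keeps the comma-separated allele entries together in the last field.
        String[] fields = call.split(";", 4);
        if (fields.length != 4) {
            throw new IllegalArgumentException("Expected 4 ';'-separated fields: " + call);
        }
        callIndex = fields[0];
        depth = Integer.parseInt(fields[1]);
        genotypeQuality = Integer.parseInt(fields[2]);
        alleles = Arrays.asList(fields[3].split(","));
    }
}
```

For the example string, the last field splits into two allele entries: the reference entry TTC:.:0:585 and one alternate entry TTC:T--:13:0.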
-
alleles
A map that assigns features to their corresponding alleles.
This Map stores the relationship between feature names and their associated allele identifiers. The keys represent the names of the features, and the values represent the unique identifiers of the alleles. This structure is used to track which allele is associated with each feature in the sample.
-
variantCallPattern
Regular expression pattern to match variant call strings.
This pattern is designed to parse variant call strings in the format described for variantCalls. A call string is not itself a VCF record; its DP, GQ, AD, and PL fields are named after the corresponding VCF genotype keys. The expected format includes fields separated by semicolons (`;`), with the following structure:
- CALL_INDEX: The numeric index of the alternative allele, optionally prefixed with `f` (low frequency) or `x` (low coverage).
- DP: The read depth at the variant site.
- GQ: The genotype quality score.
- REF_0:.:AD_0:PL_0: The reference allele, a placeholder (`.`), the allele depth, and the phred-scaled likelihood.
- REF_1:ALT_1:AD_1:PL_1,...: One or more alternate alleles, each with their respective reference allele, alternate allele, allele depth, and phred-scaled likelihoods.
Example match:
1;13;99;TTC:.:0:585,TTC:T--:13:0
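The regular expression itself is not shown in this documentation. As a rough illustration of how such a pattern could look, the sketch below uses an assumed expression that accepts an optional f or x prefix, three numeric fields, and a non-empty allele block. The class name VariantCallPatternSketch and the expression are hypothetical and may differ from the pattern the class actually uses.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch; the real expression behind variantCallPattern is not documented here.
final class VariantCallPatternSketch {
    // Groups: 1 = optional prefix (f or x), 2 = allele index, 3 = DP, 4 = GQ, 5 = allele entries.
    static final Pattern VARIANT_CALL =
            Pattern.compile("^([fx]?)(\\d+);(\\d+);(\\d+);(.+)$");

    public static void main(String[] args) {
        Matcher m = VARIANT_CALL.matcher("1;13;99;TTC:.:0:585,TTC:T--:13:0");
        if (m.matches()) {
            System.out.println("prefix  = '" + m.group(1) + "'");
            System.out.println("index   = " + m.group(2));
            System.out.println("DP      = " + m.group(3));
            System.out.println("GQ      = " + m.group(4));
            System.out.println("alleles = " + m.group(5));
        }
    }
}
```

Capturing the prefix separately from the numeric index keeps the low-frequency and low-coverage markers easy to test for without re-parsing the first field.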
-
-
Constructor Details
-
Sample
Constructs a new Sample instance with the specified name and initial capacity for the alleles map.
This constructor initializes a Sample object with the given name and allocates a HashMap for the alleles field with the specified initial capacity. The name field is set to the provided name, and the superclass constructor is invoked to initialize inherited properties.
- Parameters:
  name - The name of the sample, used as its unique identifier.
  capacity - The expected initial capacity of the alleles map.
-
-
Method Details
-
setAllele
Associates a specific allele with a feature in this sample.
This method updates the alleles map by setting the sequence type (allele) for the specified feature. The feature is identified by its name, and the allele is identified by its unique identifier.
- Parameters:
  featureName - The name of the feature (Feature.name) to associate with the allele.
  alleleUid - The unique identifier of the allele (SequenceType.name) to set for the feature.
-
getAlleles
Retrieves the entries of the alleles map for this sample.
This method returns a collection view of the mappings contained in the alleles map. Each entry in the collection represents a feature name and its associated allele identifier. Modifications to the returned collection are reflected in the underlying map.
- Returns:
  A Collection of Map.Entry objects representing the entries in the alleles map.
-
getAlleleCount
public int getAlleleCount()
Retrieves the number of alleles in this sample.
This method returns the size of the alleles map, that is, the number of features that have an allele assigned in this sample. This corresponds to the number of non-reference alleles present in the sample.
- Returns:
  The number of alleles in this sample.
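The interplay of the constructor's capacity argument, setAllele, getAlleles, and getAlleleCount can be sketched with a plain HashMap. This is an illustration under assumptions, not the class itself: the name AlleleMapSketch is hypothetical, both keys and values are assumed to be String, and the view is assumed to be the map's entry set.

```java
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the allele bookkeeping of a sample; types are assumed.
final class AlleleMapSketch {
    // feature name -> allele identifier
    private final Map<String, String> alleles;

    AlleleMapSketch(int capacity) {
        // The capacity only sizes the backing table; the map still grows on demand.
        alleles = new HashMap<>(capacity);
    }

    void setAllele(String featureName, String alleleUid) {
        // A second call for the same feature replaces the earlier allele.
        alleles.put(featureName, alleleUid);
    }

    Collection<Map.Entry<String, String>> getAlleles() {
        // entrySet() is a live view: changes made through it write to the map.
        return alleles.entrySet();
    }

    int getAlleleCount() {
        return alleles.size();
    }
}
```

Because the returned collection is a view, calling setValue on one of its entries or clearing the collection changes the state of the sketch object. Callers that need a snapshot would have to copy the entries first.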
-
getVariantCalls
Retrieves the variant calls for the specified contig in this sample.
This method returns a TreeMap containing the variant calls for the given contig. The keys in the map represent the positions of the variants on the contig, and the values are the corresponding variant call strings. If no variant calls exist for the specified contig, an empty TreeMap is returned.
- Parameters:
  contig - The name of the contig to retrieve the variant calls for.
- Returns:
  A TreeMap where the keys are variant positions and the values are variant call strings.
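A lookup with this behaviour can be sketched on top of a nested map. The sketch makes several assumptions that the documentation does not confirm: the outer map is a HashMap, positions are Integer keys, and the class name VariantCallLookupSketch and the helper addVariantCall are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch of the nested contig -> position -> call structure.
final class VariantCallLookupSketch {
    private final Map<String, TreeMap<Integer, String>> variantCalls = new HashMap<>();

    // Hypothetical helper: stores a call string at a position on a contig.
    void addVariantCall(String contig, int position, String call) {
        variantCalls.computeIfAbsent(contig, key -> new TreeMap<>()).put(position, call);
    }

    // Returns the calls of the contig sorted by position, or an empty map if there are none.
    TreeMap<Integer, String> getVariantCalls(String contig) {
        TreeMap<Integer, String> calls = variantCalls.get(contig);
        return calls != null ? calls : new TreeMap<>();
    }
}
```

Returning an empty TreeMap instead of null lets callers iterate over the result without a null check. A TreeMap as the inner level keeps the calls ordered by position.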
-
getReferenceOfCall
Extracts the reference base character from the starting position of a variant call string.
This method processes a variant call string in the format described for variantCallPattern and retrieves the reference base character from the starting position. Within the call string, fields are separated by semicolons, commas, and colons.
Example call string:
1;13;99;TTC:.:0:585,TTC:T--:13:0
- Parameters:
  call - The variant call string to process.
- Returns:
  The reference base character from the starting position of the specified call.
- Throws:
  ArrayIndexOutOfBoundsException - If the call string does not conform to the expected format.
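One way to obtain this behaviour is to split the call string on its three separators in turn. The following is a minimal sketch under that assumption; the actual implementation is not shown in this documentation, and the class name ReferenceOfCallSketch is hypothetical.

```java
// Hypothetical sketch; it mirrors the documented behaviour, not the actual source.
final class ReferenceOfCallSketch {
    static String getReferenceOfCall(String call) {
        // CALL_INDEX;DP;GQ;<allele entries>: index 3 holds the allele entries.
        // A string with fewer than four fields fails here with ArrayIndexOutOfBoundsException.
        String alleleEntries = call.split(";")[3];
        // The first comma-separated entry describes the reference allele.
        String referenceEntry = alleleEntries.split(",")[0];
        // The first colon-separated value of that entry is the reference sequence.
        String referenceSequence = referenceEntry.split(":")[0];
        // Only the base at the starting position is returned.
        return String.valueOf(referenceSequence.charAt(0));
    }
}
```

For the example call string, the reference entry is TTC:.:0:585, its sequence is TTC, and the returned base is T.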
-
toString
Converts this sample to its string representation.
This method generates a string representation of the sample, including its name and attributes. The attributes are formatted as key-value pairs separated by an equals sign (`=`) and delimited by semicolons (`;`). If the last character of the generated string is a semicolon, it is removed to ensure proper formatting.
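The attribute formatting and the trailing-semicolon cleanup can be sketched as below. The overall layout is an assumption: this documentation does not state how the name and the attributes are joined, so the tab separator, the class name SampleToStringSketch, and the method format are all hypothetical.

```java
import java.util.Map;

// Hypothetical sketch of the described formatting; the real layout may differ.
final class SampleToStringSketch {
    static String format(String name, Map<String, String> attributes) {
        // Assumed layout: name, a tab, then key=value pairs delimited by ';'.
        StringBuilder sb = new StringBuilder(name).append('\t');
        for (Map.Entry<String, String> entry : attributes.entrySet()) {
            sb.append(entry.getKey()).append('=').append(entry.getValue()).append(';');
        }
        // Drop the trailing delimiter left behind by the loop, if there is one.
        if (sb.length() > 0 && sb.charAt(sb.length() - 1) == ';') {
            sb.setLength(sb.length() - 1);
        }
        return sb.toString();
    }
}
```

Appending the delimiter after every pair and trimming once at the end keeps the loop free of first-element or last-element checks.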
-