Class VariantInformation
Note: the actual alternative content is not stored in this class.
This class represents a nucleotide variant, including its reference base content, type (e.g., SNV, insertion, deletion), and occurrences in samples and alleles. It provides methods to determine the type of the variant, check its canonical or padded canonical status, and manage occurrences in samples and features.
-
Nested Class Summary
Nested Classes
- static enum VariantInformation.Type: Enum representing the type of variant.
-
Field Summary
Fields
- occurrence: A mapping of occurrences of this variant in samples and alleles.
- final String reference: The reference base content of this variant.
- final VariantInformation.Type type: The type of this variant (e.g., SNV, insertion, deletion).
-
Constructor Summary
Constructors
- protected VariantInformation(String referenceContent, String alternativeContent): Constructor for VariantInformation.
-
Method Summary
- protected void addAlleleOccurrence(String featureName, String alleleUid): Adds an allele occurrence to this variant for a specific feature.
- protected void addFeatureOccurrence(String name): Adds a feature occurrence to this variant.
- protected void addSampleOccurrence(String name): Adds a sample occurrence to this variant.
- getFeatureOccurrence(): Retrieves the features associated with this variant.
- getReferenceBaseString(boolean strip): Retrieves the reference base content of this variant.
- getSampleOccurrence(): Retrieves the occurrences of this variant in samples.
- boolean hasOccurrence(String name): Checks whether this variant has an occurrence in a specific feature.
- boolean hasOccurrence(String of, String name): Checks whether this variant has an occurrence in a sample or allele.
- static boolean isCanonicalVariant(String referenceContent, String alternativeContent): Determines whether a variant is canonical.
- static boolean isDeletion(String alt): Determines whether a variant is a deletion based on its alternative content.
- static boolean isDeletion(String ref, String alt, boolean padded): Determines whether a variant is a deletion, i.e., either the reference base content is a string of any length of Constants.baseSymbols and the alternative base content is a single base of Constants.baseSymbols followed by Constants.gapString symbols matching the reference content's length (padded canonical), or the reference base content is a string of any length of Constants.baseSymbols and the alternative content is a single base of Constants.baseSymbols (un-padded canonical).
- static boolean isInsertion(String alt): Determines whether a variant is an insertion based on its alternative content.
- static boolean isInsertion(String ref, String alt, boolean padded): Determines whether a variant is an insertion, i.e., either the alternative base content is a string of any length of Constants.baseSymbols and the reference base content is a single base of Constants.baseSymbols followed by Constants.gapString symbols matching the alternative content's length (padded canonical), or the reference base content is a single base of Constants.baseSymbols and the alternative content is a string of any length of Constants.baseSymbols (un-padded canonical).
- static boolean isPaddedCanonicalVariant(String referenceContent, String alternativeContent): Determines whether a variant is padded canonical.
- static boolean isSubstitution(String alt): Determines whether a given alternative base content represents a substitution.
- static boolean isSubstitution(String ref, String alt): Determines whether a variant is a substitution; i.e., both the reference and alternative base content match a single base of Constants.baseSymbols.
Methods inherited from class datastructure.Attributable
addAttributeIfAbsent, addAttributesIfAbsent, attributesAsString, attributesAsString, clearAttributes, extendAttribute, extendAttributes, getAttribute, getAttributeAsCollection, getAttributes, hasAttribute, hasAttributes, removeAttribute, setAttribute, setAttributes
-
Field Details
-
reference
The reference base content of this variant. -
occurrence
A mapping of occurrences of this variant in samples and alleles.
The `occurrence` map is structured as follows:
- The key is either Constants.$Attributable_samplesOccurrence (representing sample occurrences) or the name of a Feature (representing feature occurrences).
- The value is a HashSet containing names of Sample or SequenceType associated with the key.
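The map layout described above can be sketched as follows. This is a minimal illustration, not the class's actual implementation: the class name OccurrenceSketch is hypothetical, and the key "samples" is an assumed stand-in for the value of Constants.$Attributable_samplesOccurrence, which is not given on this page.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;

class OccurrenceSketch {
    // Assumed stand-in for Constants.$Attributable_samplesOccurrence.
    static final String SAMPLES_KEY = "samples";

    // Key: SAMPLES_KEY or a feature name; value: sample names or allele identifiers.
    final Map<String, HashSet<String>> occurrence = new HashMap<>();

    void addSampleOccurrence(String name) {
        occurrence.computeIfAbsent(SAMPLES_KEY, k -> new HashSet<>()).add(name);
    }

    void addAlleleOccurrence(String featureName, String alleleUid) {
        occurrence.computeIfAbsent(featureName, k -> new HashSet<>()).add(alleleUid);
    }

    boolean hasOccurrence(String of, String name) {
        HashSet<String> names = occurrence.get(of);
        return names != null && names.contains(name);
    }

    public static void main(String[] args) {
        OccurrenceSketch v = new OccurrenceSketch();
        v.addSampleOccurrence("sample1");
        v.addAlleleOccurrence("geneA", "allele1");
        System.out.println(v.hasOccurrence("samples", "sample1")); // true
        System.out.println(v.hasOccurrence("geneA", "sample1"));   // false
    }
}
```

Keeping sample and feature occurrences in one map means a single lookup method can serve both cases, at the cost of reserving one key for samples.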
-
type
The type of this variant (e.g., SNV, insertion, deletion).
-
-
Constructor Details
-
VariantInformation
Constructor for VariantInformation.
Initializes the variant with its reference and alternative content, determining its type (SNV, insertion, or deletion) based on the provided content.
This constructor checks if the reference and alternative content match any padded canonical content type. If they do not, an IllegalArgumentException is thrown.
Parameters:
referenceContent - The reference base content of the variant.
alternativeContent - The alternative base content of the variant.
Throws:
IllegalArgumentException - If the reference and alternative content do not match any padded canonical content type.
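The type determination described for the constructor might look roughly like the sketch below. It is an illustration under assumptions, not the actual constructor: the class TypeSketch, the method classify, and the enum constant names are hypothetical, and "ACGTN" and "-" are assumed stand-ins for Constants.baseSymbols and Constants.gapString.

```java
class TypeSketch {
    // Hypothetical constant names; the real VariantInformation.Type values may differ.
    enum Type { SNV, INSERTION, DELETION }

    // Assumed stand-in for a single base of Constants.baseSymbols.
    static final String BASE = "[ACGTN]";

    // Returns the padded canonical type of (ref, alt) or throws if none matches.
    static Type classify(String ref, String alt) {
        if (ref.matches(BASE) && alt.matches(BASE)) {
            return Type.SNV;
        }
        if (ref.length() == alt.length()) {
            // Padded insertion: reference is one base followed by gaps.
            if (ref.matches(BASE + "-+") && alt.matches(BASE + "{2,}")) {
                return Type.INSERTION;
            }
            // Padded deletion: alternative is one base followed by gaps.
            if (ref.matches(BASE + "{2,}") && alt.matches(BASE + "-+")) {
                return Type.DELETION;
            }
        }
        throw new IllegalArgumentException(
            "Not a padded canonical variant: " + ref + " > " + alt);
    }

    public static void main(String[] args) {
        System.out.println(classify("A", "T"));     // SNV
        System.out.println(classify("A--", "ATG")); // INSERTION
        System.out.println(classify("ATG", "A--")); // DELETION
    }
}
```

In this sketch an un-padded pair such as ("A", "ATG") is rejected with an IllegalArgumentException, matching the statement that only padded canonical content is accepted.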
-
-
Method Details
-
isSubstitution
Determines whether a variant is a substitution; i.e., both the reference and alternative base content match a single base of Constants.baseSymbols.
Parameters:
ref - The reference base content.
alt - The alternative base content.
Returns:
true if the variant is a substitution, false otherwise.
-
isSubstitution
Determines whether a given alternative base content represents a substitution.
A substitution is defined as a single base from the set of valid nucleotide symbols defined in Constants.baseSymbols.
Parameters:
alt - The alternative base content to check.
Returns:
true if the alternative content represents a substitution, false otherwise.
-
isInsertion
Determines whether a variant is an insertion, i.e.,
- either the alternative base content is a string of any length of Constants.baseSymbols and the reference base content is a single base of Constants.baseSymbols followed by Constants.gapString symbols matching the alternative content's length (padded canonical),
- or the reference base content is a single base of Constants.baseSymbols and the alternative content is a string of any length of Constants.baseSymbols (un-padded canonical).
Parameters:
ref - The reference base content.
alt - The alternative base content.
padded - Whether the variant is padded by gap symbols.
Returns:
true if the variant is an insertion, false otherwise.
-
isInsertion
Determines whether a variant is an insertion based on its alternative content.
This method checks if the alternative base content represents an insertion. An insertion is defined as a string of at least two consecutive bases from the set of valid nucleotide symbols defined in Constants.baseSymbols.
Parameters:
alt - The alternative base content to check.
Returns:
true if the alternative content represents an insertion, false otherwise.
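Both isInsertion overloads can be approximated with regular expressions. The sketch below is one possible reading of the descriptions, not the library's code: the class InsertionSketch is hypothetical, "ACGTN" and "-" are assumed stand-ins for Constants.baseSymbols and Constants.gapString, and the alternative content is required to hold at least two bases, following the single-argument definition.

```java
class InsertionSketch {
    // Assumed stand-ins for Constants.baseSymbols and Constants.gapString.
    static final String BASE = "[ACGTN]";
    static final String GAP = "-";

    // Alternative content alone: two or more consecutive bases.
    static boolean isInsertion(String alt) {
        return alt.matches(BASE + "{2,}");
    }

    static boolean isInsertion(String ref, String alt, boolean padded) {
        if (!isInsertion(alt)) {
            return false;
        }
        if (padded) {
            // Reference is one base plus gaps, padded to the alternative's length.
            return ref.matches(BASE + GAP + "+") && ref.length() == alt.length();
        }
        // Un-padded: reference is a single base.
        return ref.matches(BASE);
    }

    public static void main(String[] args) {
        System.out.println(isInsertion("A--", "ATG", true));  // true
        System.out.println(isInsertion("A", "ATG", false));   // true
        System.out.println(isInsertion("A", "ATG", true));    // false
    }
}
```

The same insertion of "TG" after "A" is thus written as ("A--", "ATG") in padded form and ("A", "ATG") in un-padded form.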
-
isDeletion
Determines whether a variant is a deletion, i.e.,
- either the reference base content is a string of any length of Constants.baseSymbols and the alternative base content is a single base of Constants.baseSymbols followed by Constants.gapString symbols matching the reference content's length (padded canonical),
- or the reference base content is a string of any length of Constants.baseSymbols and the alternative content is a single base of Constants.baseSymbols (un-padded canonical).
Parameters:
ref - The reference base content.
alt - The alternative base content.
padded - Whether the variant is padded by gap symbols.
Returns:
true if the variant is a deletion, false otherwise.
-
isDeletion
Determines whether a variant is a deletion based on its alternative content.
This method checks if the alternative base content represents a deletion. A deletion is defined as a string that starts with a valid nucleotide base (from Constants.baseSymbols) followed by one or more gap symbols (defined in Constants.gapString).
Parameters:
alt - The alternative base content to check.
Returns:
true if the alternative content represents a deletion, false otherwise.
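The deletion checks mirror the insertion checks with the roles of reference and alternative swapped. The following is a hedged sketch only: the class DeletionSketch is hypothetical, "ACGTN" and "-" are assumed stand-ins for Constants.baseSymbols and Constants.gapString, and the reference is assumed to hold at least two bases so that a deletion cannot be confused with a substitution.

```java
class DeletionSketch {
    // Assumed stand-ins for Constants.baseSymbols and Constants.gapString.
    static final String BASE = "[ACGTN]";
    static final String GAP = "-";

    // Alternative content alone: one base followed by one or more gaps.
    static boolean isDeletion(String alt) {
        return alt.matches(BASE + GAP + "+");
    }

    static boolean isDeletion(String ref, String alt, boolean padded) {
        // Assumption: the reference holds at least two bases.
        if (!ref.matches(BASE + "{2,}")) {
            return false;
        }
        if (padded) {
            // Alternative is one base plus gaps, padded to the reference's length.
            return isDeletion(alt) && alt.length() == ref.length();
        }
        // Un-padded: alternative is a single base.
        return alt.matches(BASE);
    }

    public static void main(String[] args) {
        System.out.println(isDeletion("ATG", "A--", true)); // true
        System.out.println(isDeletion("ATG", "A", false));  // true
        System.out.println(isDeletion("ATG", "A-", true));  // false
    }
}
```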
-
isCanonicalVariant
Determines whether a variant is canonical.
A variant is canonical if it is:
- a single nucleotide variant (SNV) (isSubstitution(java.lang.String, java.lang.String)),
- an un-padded canonical insertion (isInsertion(java.lang.String, java.lang.String, boolean)), or
- an un-padded canonical deletion (isDeletion(java.lang.String, java.lang.String, boolean)).
Parameters:
referenceContent - The reference base content.
alternativeContent - The alternative base content.
Returns:
true if the variant is canonical, false otherwise.
-
isPaddedCanonicalVariant
Determines whether a variant is padded canonical.
A variant is padded canonical if it is:
- a single nucleotide variant (SNV) (isSubstitution(java.lang.String, java.lang.String)),
- a padded canonical insertion (isInsertion(java.lang.String, java.lang.String, boolean)), or
- a padded canonical deletion (isDeletion(java.lang.String, java.lang.String, boolean)).
Parameters:
referenceContent - The reference base content.
alternativeContent - The alternative base content.
Returns:
true if the variant is padded canonical, false otherwise.
-
addSampleOccurrence
Adds a sample occurrence to this variant.
Parameters:
name - The name of the sample to add.
-
addFeatureOccurrence
Adds a feature occurrence to this variant.
Parameters:
name - The name of the feature to add.
-
addAlleleOccurrence
Adds an allele occurrence to this variant for a specific feature.
Parameters:
featureName - The name of the feature.
alleleUid - The unique identifier of the allele to add.
-
hasOccurrence
Checks whether this variant has an occurrence in a sample or allele.
Parameters:
of - Either samples or the name of a Feature.
name - The name of the sample or allele to check for.
Returns:
true if the sample or allele is associated with this variant, false otherwise.
-
hasOccurrence
Checks whether this variant has an occurrence in a specific feature.
Parameters:
name - The name of the feature to check for.
Returns:
true if the feature is associated with this variant, false otherwise.
-
getSampleOccurrence
Retrieves the occurrences of this variant in samples.
Returns:
A Collection of sample names.
-
getFeatureOccurrence
Retrieves the features associated with this variant.
Returns:
A Collection of feature names.
-
getReferenceBaseString
Retrieves the reference base content of this variant.
Parameters:
strip - Whether to strip gap symbols from the reference base content.
Returns:
The reference base content of this variant.
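The effect of the strip flag can be illustrated with a small sketch. It assumes that stripping simply removes every gap symbol from the stored reference; the class ReferenceSketch is hypothetical and "-" is an assumed stand-in for Constants.gapString.

```java
class ReferenceSketch {
    // Assumed stand-in for Constants.gapString.
    static final String GAP = "-";

    final String reference;

    ReferenceSketch(String reference) {
        this.reference = reference;
    }

    // Returns the reference content, optionally without gap symbols.
    String getReferenceBaseString(boolean strip) {
        return strip ? reference.replace(GAP, "") : reference;
    }

    public static void main(String[] args) {
        ReferenceSketch insertion = new ReferenceSketch("A--");
        System.out.println(insertion.getReferenceBaseString(false)); // A--
        System.out.println(insertion.getReferenceBaseString(true));  // A
    }
}
```

Under this assumption, stripping only changes the result for padded insertions, since the reference of an SNV or a deletion contains no gap symbols.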
-