Class VCFProcessor
- All Implemented Interfaces: Closeable, AutoCloseable
The VCFProcessor class is responsible for processing Variant Call Format (VCF) files.
This class handles the analysis of VCF files, including the extraction of variant data, imputation of contigs, and integration of the processed data into the storage system. It provides methods to analyze VCF files, process variant contexts, and track statistics such as the number of processed, ignored, and filtered variant calls.
-
Nested Class Summary
Nested Classes
Modifier and Type: static final record
Description: Represents an upstream deletion affecting a sample.
-
Constructor Summary
Constructors
Constructor: VCFProcessor(List<Path> paths, Storage storage, boolean imputeContigs)
Description: Constructs a new VCFProcessor instance for processing VCF files.
-
Method Summary
Methods (modifier and type, method, description)
void close()
long getFilteredCallsCount(): Retrieves the total number of filtered variant calls.
long getIgnoredCallsCount(): Retrieves the total number of ignored variant calls.
long getProcessedCallsCount(): Retrieves the total number of processed variant calls.
long getRealignedCallsCount(): Retrieves the total number of realigned variant calls.
Collection<String> getSamples: Retrieves an unmodifiable collection of sample identifiers.
int loadVariantCallsFromStorage(): Loads variant calls from the storage and processes them.
void processFiles: Analyzes the provided VCF files and processes their variant data.
void updateVariants(): Updates the variants in the storage by processing variant calls from the cache.
-
Constructor Details
-
VCFProcessor
Constructs a new VCFProcessor instance for processing VCF files.
This constructor initializes the processor with the specified list of VCF file paths, a storage object for managing genomic data, and a flag indicating whether to impute contigs from the VCF files. The `imputeContigs` flag determines whether contigs are inferred from the VCF data and added to the storage.
-
-
Method Details
-
close
public void close()
- Specified by: close in interface AutoCloseable
- Specified by: close in interface Closeable
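Because the class implements Closeable, it can be managed with try-with-resources. The following is a minimal sketch of that pattern only: `CloseableProcessorSketch` and its `process()` method are hypothetical stand-ins, and what the real `close()` releases is not stated in this documentation.

```java
import java.io.Closeable;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for VCFProcessor; only the Closeable contract is modeled.
class CloseableProcessorSketch implements Closeable {
    private final List<String> events;

    CloseableProcessorSketch(List<String> events) {
        this.events = events;
    }

    // Placeholder for real work such as processing files.
    void process() {
        events.add("process");
    }

    @Override
    public void close() {
        // Assumption: a real implementation would release its resources here.
        events.add("close");
    }

    // Runs the stand-in inside try-with-resources and returns the recorded event order.
    static List<String> run() {
        List<String> events = new ArrayList<>();
        try (CloseableProcessorSketch processor = new CloseableProcessorSketch(events)) {
            processor.process();
        }
        return events;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

With this pattern, `close()` runs automatically when the block exits, including when an exception is thrown inside it.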
-
processFiles
Analyzes the provided VCF files and processes their variant data.
This method iterates through the list of VCF file paths, creates temporary indexed VCF files, and processes their variant contexts. If the `imputeContigs` flag is set, it infers contigs from the VCF files and adds them to the storage. The method processes variant contexts either for specific features in the storage or for all variants if no features are defined.
- Throws: IOException - If an I/O error occurs during file operations or VCF processing.
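How contigs are inferred is not described here. One plausible reading is that contig names are collected from the CHROM column of the data records. The sketch below illustrates that idea under this assumption; `ContigImputationSketch` and `imputeContigs(List<String>)` are hypothetical names, and the actual class may rely on header lines or a VCF library instead.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper; not part of the documented API.
class ContigImputationSketch {

    // Collects distinct contig names from the first (CHROM) column of VCF data lines,
    // preserving the order of first appearance.
    static Set<String> imputeContigs(List<String> vcfLines) {
        Set<String> contigs = new LinkedHashSet<>();
        for (String line : vcfLines) {
            // Skip empty lines, meta lines (##...) and the column header (#CHROM...).
            if (line.isEmpty() || line.startsWith("#")) {
                continue;
            }
            int tab = line.indexOf('\t');
            contigs.add(tab < 0 ? line : line.substring(0, tab));
        }
        return contigs;
    }

    public static void main(String[] args) {
        List<String> lines = List.of(
                "##fileformat=VCFv4.2",
                "#CHROM\tPOS\tID\tREF\tALT",
                "chr1\t100\t.\tA\tT",
                "chr2\t200\t.\tG\tC",
                "chr1\t300\t.\tC\tG");
        System.out.println(imputeContigs(lines));
    }
}
```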
-
getProcessedCallsCount
public long getProcessedCallsCount()
Retrieves the total number of processed variant calls.
- Returns: The total number of processed variant calls as a `long`.
-
getRealignedCallsCount
public long getRealignedCallsCount()
Retrieves the total number of realigned variant calls.
- Returns: The total number of realigned variant calls as a `long`.
-
getIgnoredCallsCount
public long getIgnoredCallsCount()
Retrieves the total number of ignored variant calls.
This method returns the count of variant calls that were ignored during processing. A variant call may be ignored for reasons such as missing data, being classified as a reference call, or lacking sufficient information for analysis.
- Returns: The total number of ignored variant calls as a `long`.
-
getFilteredCallsCount
public long getFilteredCallsCount()
Retrieves the total number of filtered variant calls.
This method returns the count of variant calls that were filtered out during processing. Filtering may occur due to criteria such as low coverage, low frequency, or other conditions defined in the program.
- Returns: The total number of filtered variant calls as a `long`.
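The distinction between processed, ignored, and filtered calls can be illustrated with a small counter. This is only a sketch: `CallCounterSketch`, the thresholds, the use of "." to mark a reference or missing call, and the assumption that the three categories are disjoint are all hypothetical, since the documentation does not state the actual rules.

```java
import java.util.Arrays;

// Hypothetical counter; the real classification rules are not documented here.
class CallCounterSketch {
    static final int MIN_DEPTH = 10;          // hypothetical coverage threshold
    static final double MIN_FREQUENCY = 0.5;  // hypothetical frequency threshold

    long processed;
    long ignored;
    long filtered;

    // Classifies one call. A null or "." alternate allele models a missing
    // or reference call (assumption).
    void count(String alt, int depth, double frequency) {
        if (alt == null || alt.equals(".")) {
            ignored++;
        } else if (depth < MIN_DEPTH || frequency < MIN_FREQUENCY) {
            filtered++;
        } else {
            processed++;
        }
    }

    // Returns {processed, ignored, filtered} for four example calls.
    static long[] demo() {
        CallCounterSketch counter = new CallCounterSketch();
        counter.count(".", 30, 1.0);  // reference call: ignored
        counter.count("T", 3, 1.0);   // low coverage: filtered
        counter.count("T", 30, 0.1);  // low frequency: filtered
        counter.count("T", 30, 0.9);  // passes both checks: processed
        return new long[] {counter.processed, counter.ignored, counter.filtered};
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(demo()));
    }
}
```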
-
getSamples
Retrieves an unmodifiable collection of sample identifiers.
This method returns a collection of all sample identifiers currently indexed in the variant call cache. The returned collection is unmodifiable, ensuring that the underlying data cannot be altered.
- Returns: An unmodifiable `Collection` of `String` objects representing the sample identifiers.
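The practical effect of an unmodifiable collection can be shown with the standard `Collections.unmodifiableCollection` wrapper. Whether the real method uses this wrapper is an assumption; `UnmodifiableSamplesSketch` and `index(String)` are hypothetical names used for illustration.

```java
import java.util.Collection;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical stand-in for the sample index inside the variant call cache.
class UnmodifiableSamplesSketch {
    private final Set<String> samples = new LinkedHashSet<>();

    // Registers a sample identifier internally.
    void index(String sample) {
        samples.add(sample);
    }

    // Exposes a read-only view; callers cannot add or remove identifiers.
    Collection<String> getSamples() {
        return Collections.unmodifiableCollection(samples);
    }

    // Returns true if an attempt to modify the returned view is rejected.
    static boolean rejectsMutation() {
        UnmodifiableSamplesSketch sketch = new UnmodifiableSamplesSketch();
        sketch.index("sampleA");
        try {
            sketch.getSamples().add("sampleB");
            return false;
        } catch (UnsupportedOperationException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(rejectsMutation());
    }
}
```

Note that a wrapper of this kind is a view: identifiers indexed later through the owning object still become visible to holders of the returned collection.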
-
updateVariants
public void updateVariants()
Updates the variants in the storage by processing variant calls from the cache.
This method iterates through all samples and contigs in the cache, processes their variant calls, and adds the resolved variants to the storage. It handles complex cases such as deletions, insertions, and mixed InDels, ensuring that the variants are stored in a canonical format.
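The documentation does not define the canonical format. One common convention for InDels is to trim bases shared by the reference and alternate alleles, first from the right and then from the left, while keeping at least one base in each allele. The sketch below shows that convention only; `AlleleTrimSketch`, `canonical`, and the `pos:REF>ALT` output string are hypothetical and may differ from what the class actually does.

```java
// Hypothetical allele-trimming helper; not the documented implementation.
class AlleleTrimSketch {

    // Trims shared trailing and then shared leading bases, keeping at least
    // one base per allele, and shifts the position by the trimmed prefix length.
    static String canonical(int pos, String ref, String alt) {
        int end = 0;
        while (ref.length() - end > 1 && alt.length() - end > 1
                && ref.charAt(ref.length() - 1 - end) == alt.charAt(alt.length() - 1 - end)) {
            end++;
        }
        ref = ref.substring(0, ref.length() - end);
        alt = alt.substring(0, alt.length() - end);

        int start = 0;
        while (ref.length() - start > 1 && alt.length() - start > 1
                && ref.charAt(start) == alt.charAt(start)) {
            start++;
        }
        return (pos + start) + ":" + ref.substring(start) + ">" + alt.substring(start);
    }

    public static void main(String[] args) {
        System.out.println(canonical(100, "GCAT", "GAT")); // deletion of C after G
        System.out.println(canonical(5, "ATTG", "ATCG"));  // reduces to a single-base change
    }
}
```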
-
loadVariantCallsFromStorage
public int loadVariantCallsFromStorage()
Loads variant calls from the storage and processes them.
This method iterates through all contigs and samples in the variant call cache (`vcCache`), retrieves the associated variants, and processes their variant calls. Each variant call string is parsed into a `VariantCall` object and passed to the `processVariantCall` method for further processing.
The method ensures that only valid contigs and samples present in the storage are processed. It handles the relationship between contigs, samples, and variants, and updates the storage with the processed variant calls.
- Returns: The total number of loaded variant calls as an `int`.
-