pytximport.importers

Importing functions for different transcript quantification tools.

The functions in this module are intended primarily for internal use, but they are exposed for advanced users who want to call them directly.

Functions

parse_dataframe(transcript_dataframe, id_column, ...)

Parse a DataFrame with the transcript-level expression.

read_inferential_replicates_kallisto(file_path)

Read inferential replicates from a kallisto quantification file.

read_inferential_replicates_piscem(file_path)

Read inferential replicates from a piscem quantification file.

read_inferential_replicates_salmon(file_path[, ...])

Read inferential replicates from a salmon quantification file.

read_kallisto(file_path[, id_column, counts_column, ...])

Read a kallisto quantification file.

read_piscem(file_path[, id_column, counts_column, ...])

Read a piscem-infer quantification file.

read_rsem(file_path[, id_column, counts_column, ...])

Read an RSEM quantification file.

read_salmon(file_path[, id_column, counts_column, ...])

Read a salmon quantification file.

read_tsv(file_path, id_column, counts_column, ...[, ...])

Read a quantification file in tab-separated (TSV) format.

Package Contents

pytximport.importers.parse_dataframe(transcript_dataframe, id_column, counts_column, length_column, abundance_column=None, recompute_counts=False)

Parse a DataFrame with the transcript-level expression.

Parameters:
  • transcript_dataframe (pd.DataFrame) – The DataFrame with the transcript-level expression.

  • id_column (str) – The column name for the transcript id.

  • counts_column (str) – The column name for the counts.

  • length_column (str) – The column name for the length.

  • abundance_column (Optional[str], optional) – The column name for the abundance. Defaults to None.

  • recompute_counts (bool, optional) – Whether inferential replicates will be used to recompute counts and abundances. If true, the counts and abundances will not be read from the file. Defaults to False.

Returns:

The transcript-level expression.

Return type:

TranscriptData

pytximport.importers.read_inferential_replicates_kallisto(file_path)

Read inferential replicates from a kallisto quantification file.

Parameters:

file_path (Union[str, Path]) – The path to the quantification file.

Returns:

The inferential replicates.

Return type:

InferentialReplicates

pytximport.importers.read_inferential_replicates_piscem(file_path)

Read inferential replicates from a piscem quantification file.

Parameters:

file_path (Union[str, Path]) – The path to the quantification file. The file should be a .quant file that is colocated with the inferential replicates file (.infreps.pq).

Returns:

The inferential replicates.

Return type:

InferentialReplicates
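
The colocation requirement can be checked before the reader is called. The sketch below assumes that the replicate file shares the stem of the .quant file (for example, sample.quant next to sample.infreps.pq). That naming rule and the helper name infreps_path_for are assumptions made for illustration and are not part of pytximport.

```python
from pathlib import Path
from typing import Union


def infreps_path_for(quant_path: Union[str, Path]) -> Path:
    """Return the assumed sibling replicate file for a piscem .quant file."""
    quant_path = Path(quant_path)
    if quant_path.suffix != ".quant":
        raise ValueError(f"expected a .quant file, got {quant_path.name!r}")
    # Assumption: same directory and stem, with the suffix swapped for .infreps.pq.
    return quant_path.with_suffix(".infreps.pq")


def has_infreps(quant_path: Union[str, Path]) -> bool:
    """Report whether the assumed replicate file exists on disk."""
    return infreps_path_for(quant_path).is_file()
```

Calling has_infreps first gives a clearer failure than a missing-file error raised from inside the reader.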

pytximport.importers.read_inferential_replicates_salmon(file_path, aux_dir_name='aux_info')

Read inferential replicates from a salmon quantification file.

Parameters:
  • file_path (Union[str, Path]) – The path to the quantification file.

  • aux_dir_name (Literal["aux_info", "aux"], optional) – The name of the aux directory. Defaults to “aux_info”.

Returns:

The inferential replicates.

Return type:

InferentialReplicates
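
As a rough illustration of how file_path and aux_dir_name relate, the sketch below builds the location where salmon usually writes bootstrap replicates: a bootstrap/bootstraps.gz file inside the auxiliary directory next to the quantification file. That layout is an assumption based on salmon's usual output, and the helper name salmon_bootstrap_path is hypothetical. This reference does not state which files pytximport actually opens.

```python
from pathlib import Path
from typing import Union


def salmon_bootstrap_path(
    file_path: Union[str, Path],
    aux_dir_name: str = "aux_info",
) -> Path:
    """Return the assumed bootstrap file for a salmon quantification file."""
    # The two accepted names mirror the documented Literal["aux_info", "aux"].
    if aux_dir_name not in ("aux_info", "aux"):
        raise ValueError(f"unsupported aux_dir_name: {aux_dir_name!r}")
    # Assumption: replicates live in <sample dir>/<aux dir>/bootstrap/bootstraps.gz.
    return Path(file_path).parent / aux_dir_name / "bootstrap" / "bootstraps.gz"
```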

pytximport.importers.read_kallisto(file_path, id_column='aux/ids', counts_column='est_counts', length_column='aux/eff_lengths', abundance_column=None, inferential_replicates=False, recompute_counts=False)

Read a kallisto quantification file.

Parameters:
  • file_path (Union[str, Path]) – The path to the quantification file.

  • id_column (str, optional) – The column name for the transcript id. Defaults to “aux/ids”.

  • counts_column (str, optional) – The column name for the counts. Defaults to “est_counts”.

  • length_column (str, optional) – The column name for the length. Defaults to “aux/eff_lengths”.

  • abundance_column (Optional[str], optional) – The column name for the abundance. Defaults to None.

  • inferential_replicates (bool, optional) – Whether to read inferential replicates. Defaults to False.

  • recompute_counts (bool, optional) – Whether inferential replicates will be used to recompute counts and abundances. If true, the counts and abundances will not be read from the file. Defaults to False.

Returns:

The transcript-level expression.

Return type:

TranscriptData

pytximport.importers.read_piscem(file_path, id_column='target_name', counts_column='ecount', length_column='eeln', abundance_column='tpm', inferential_replicates=False, recompute_counts=False)

Read a piscem-infer quantification file.

Parameters:
  • file_path (Union[str, Path]) – The path to the quantification file.

  • id_column (str, optional) – The column name for the transcript id. Defaults to “target_name”.

  • counts_column (str, optional) – The column name for the counts. Defaults to “ecount”.

  • length_column (str, optional) – The column name for the length. Defaults to “eeln”.

  • abundance_column (str, optional) – The column name for the abundance. Defaults to “tpm”.

  • inferential_replicates (bool, optional) – Whether to read inferential replicates. Defaults to False.

  • recompute_counts (bool, optional) – Whether inferential replicates will be used to recompute counts and abundances. If true, the counts and abundances will not be read from the file. Defaults to False.

Returns:

The transcript-level expression.

Return type:

TranscriptData

pytximport.importers.read_rsem(file_path, id_column='transcript_id', counts_column='expected_count', length_column='effective_length', abundance_column='TPM', gene_level=False)

Read an RSEM quantification file.

Parameters:
  • file_path (Union[str, Path]) – The path to the quantification file.

  • id_column (str, optional) – The column name for the transcript ID. Defaults to “transcript_id”.

  • counts_column (str, optional) – The column name for the counts. Defaults to “expected_count”.

  • length_column (str, optional) – The column name for the length. Defaults to “effective_length”.

  • abundance_column (str, optional) – The column name for the abundance. Defaults to “TPM”.

  • gene_level (bool, optional) – Whether the quantification is at the gene level. Defaults to False.

Returns:

The transcript-level expression.

Return type:

TranscriptData
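
The gene_level flag could be derived from the file name, as sketched below. The sketch assumes RSEM's usual output naming (.genes.results and .isoforms.results) and that gene-level files carry a gene_id column. Both are assumptions about RSEM output rather than statements from this reference, and the helpers rsem_kwargs and read_rsem_auto are hypothetical. Whether read_rsem adjusts id_column itself when gene_level is True is not stated here.

```python
from pathlib import Path
from typing import Any, Dict, Union


def rsem_kwargs(file_path: Union[str, Path]) -> Dict[str, Any]:
    """Choose read_rsem keyword arguments from the file name (assumed naming)."""
    gene_level = Path(file_path).name.endswith(".genes.results")
    return {
        # Assumption: gene-level files identify rows by gene_id.
        "id_column": "gene_id" if gene_level else "transcript_id",
        "gene_level": gene_level,
    }


def read_rsem_auto(file_path: Union[str, Path]):
    """Call read_rsem with the inferred arguments."""
    # Imported lazily so the sketch can be loaded without pytximport installed.
    from pytximport.importers import read_rsem

    return read_rsem(file_path, **rsem_kwargs(file_path))
```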

pytximport.importers.read_salmon(file_path, id_column='Name', counts_column='NumReads', length_column='EffectiveLength', abundance_column='TPM', aux_dir_name='aux_info', inferential_replicates=False, recompute_counts=False)

Read a salmon quantification file.

Parameters:
  • file_path (Union[str, Path]) – The path to the quantification file.

  • id_column (str, optional) – The column name for the transcript id. Defaults to “Name”.

  • counts_column (str, optional) – The column name for the counts. Defaults to “NumReads”.

  • length_column (str, optional) – The column name for the length. Defaults to “EffectiveLength”.

  • abundance_column (str, optional) – The column name for the abundance. Defaults to “TPM”.

  • aux_dir_name (Literal["aux_info", "aux"], optional) – The name of the aux directory. Defaults to “aux_info”.

  • inferential_replicates (bool, optional) – Whether to read inferential replicates. Defaults to False.

  • recompute_counts (bool, optional) – Whether inferential replicates will be used to recompute counts and abundances. If true, the counts and abundances will not be read from the file. Defaults to False.

Returns:

The transcript-level expression.

Return type:

TranscriptData
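
A call with explicit options might look like the sketch below. The default values are copied from the signature above. The helper names salmon_options and load_salmon are hypothetical, and the wrapper is only one possible way to guard against misspelled keyword arguments.

```python
from pathlib import Path
from typing import Any, Dict, Union

# Defaults copied from the documented read_salmon signature.
SALMON_DEFAULTS: Dict[str, Any] = {
    "id_column": "Name",
    "counts_column": "NumReads",
    "length_column": "EffectiveLength",
    "abundance_column": "TPM",
    "aux_dir_name": "aux_info",
    "inferential_replicates": False,
    "recompute_counts": False,
}


def salmon_options(**overrides: Any) -> Dict[str, Any]:
    """Merge overrides into the defaults, rejecting unknown option names."""
    unknown = set(overrides) - set(SALMON_DEFAULTS)
    if unknown:
        raise TypeError(f"unknown read_salmon option(s): {sorted(unknown)}")
    return {**SALMON_DEFAULTS, **overrides}


def load_salmon(file_path: Union[str, Path], **overrides: Any):
    """Call read_salmon with the merged options."""
    # Imported lazily so the sketch can be loaded without pytximport installed.
    from pytximport.importers import read_salmon

    return read_salmon(file_path, **salmon_options(**overrides))
```

For example, load_salmon("sample/quant.sf", inferential_replicates=True) would request the replicates while leaving the column names at their defaults.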

pytximport.importers.read_tsv(file_path, id_column, counts_column, length_column, abundance_column=None, recompute_counts=False)

Read a quantification file in tab-separated (TSV) format.

Parameters:
  • file_path (Union[str, Path]) – The path to the quantification file.

  • id_column (str) – The column name for the transcript id.

  • counts_column (str) – The column name for the counts.

  • length_column (str) – The column name for the length.

  • abundance_column (Optional[str], optional) – The column name for the abundance. Defaults to None.

  • recompute_counts (bool, optional) – Whether inferential replicates will be used to recompute counts and abundances. If true, the counts and abundances will not be read from the file. Defaults to False.

Returns:

The transcript-level expression.

Return type:

TranscriptData
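
To make the role of the column parameters concrete, the sketch below picks the named columns out of a small tab-separated table using only the standard library. It is a stand-in for illustration, not pytximport's implementation. The function name select_columns and the output keys are hypothetical, since the structure of TranscriptData is not described in this reference.

```python
import csv
import io
from typing import Dict, List, Optional, TextIO


def select_columns(
    handle: TextIO,
    id_column: str,
    counts_column: str,
    length_column: str,
    abundance_column: Optional[str] = None,
) -> Dict[str, List]:
    """Collect the requested columns from a tab-separated file with a header row."""
    reader = csv.DictReader(handle, delimiter="\t")
    result: Dict[str, List] = {"transcript_id": [], "counts": [], "length": []}
    if abundance_column is not None:
        result["abundance"] = []
    for row in reader:
        result["transcript_id"].append(row[id_column])
        result["counts"].append(float(row[counts_column]))
        result["length"].append(float(row[length_column]))
        if abundance_column is not None:
            result["abundance"].append(float(row[abundance_column]))
    return result


# A two-transcript table using salmon-style column names.
example = io.StringIO(
    "Name\tLength\tEffectiveLength\tTPM\tNumReads\n"
    "tx1\t1000\t850.0\t12.5\t40\n"
    "tx2\t500\t350.0\t3.0\t4\n"
)
table = select_columns(
    example, "Name", "NumReads", "EffectiveLength", abundance_column="TPM"
)
```

The same four column names would be passed as id_column, counts_column, length_column, and abundance_column when calling read_tsv on a file with this header.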