spine.data.batch.IndexBatch

class spine.data.batch.IndexBatch(data: Any | Sequence[Any], spans: Sequence[int] | Any, counts: Sequence[int] | Any | None = None, single_counts: Sequence[int] | Any | None = None, batch_ids: Sequence[int] | Any | None = None, batch_size: int | None = None, default: Any | None = None)

Batched index with the necessary methods to slice it.

spans

(B) Per-entry parent spans used to build the batch offsets. This is the same quantity as the parser-side span and may be required when serializing unwrapped indexes for later rebatching.

Type:

Union[np.ndarray, torch.Tensor]

offsets

(B) Offsets between successive indexes in the batch, computed from the cumulative sum of spans.

Type:

Union[np.ndarray, torch.Tensor]

single_counts

(I) Number of index elements per index in the index list. This is the same as counts if the underlying data is a single index.

Type:

Union[np.ndarray, torch.Tensor]
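The relation between spans and offsets described above can be sketched in a few lines. This is only an illustration, not the library's implementation: it uses plain Python lists rather than np.ndarray or torch.Tensor, the helper name offsets_from_spans is hypothetical, and it assumes the cumulative sum is exclusive (the first entry starts at offset 0).

```python
from itertools import accumulate


def offsets_from_spans(spans):
    """Return one offset per batch entry from the per-entry spans.

    ASSUMPTION: the offset of entry b is the sum of the spans of all
    preceding entries, so the first entry has offset 0.
    """
    # Prepend 0 to the running sum, then drop the final total
    return [0, *accumulate(spans)][:len(spans)]


# Hypothetical batch of three entries with parent spans 4, 2 and 5
spans = [4, 2, 5]
offsets = offsets_from_spans(spans)
print(offsets)  # [0, 4, 6]
```

Under this assumption, adding offsets[b] to an index local to entry b gives its position in the batched parent.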

Attributes:
batch_ids

Returns the batch ID of each index in the list.

full_batch_ids

Returns the batch ID of each element in the full index list.

full_counts

Returns the total number of elements in each batch entry.

full_index

Returns the index combining all sub-indexes, if relevant.

index

Alias for the underlying data stored.

index_ids

Returns the ID of the index in the list each element belongs to.

index_list

Alias for the underlying data list stored.

shape

Shape of the underlying data.

splits

Boundaries needed to split the data into its constituents.

Methods

get_counts(batch_ids, batch_size)

Finds the number of elements in each entry, provided a batch ID list.

get_edges(counts)

Finds the edges between successive entries in the batch.

merge(index_batch)

Merge this index batch with another.

split()

Breaks up the index batch into its constituents.

to_numpy()

Cast underlying index to a np.ndarray and return a new instance.

to_tensor([dtype, device])

Cast underlying index to a torch.Tensor and return a new instance.
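The bookkeeping behind get_counts and get_edges can be illustrated with a dependency-free sketch. The helper names count_per_entry and edges_from_counts are hypothetical stand-ins, not the library methods, and the edge layout (B + 1 boundaries with a leading 0) is an assumption.

```python
from itertools import accumulate


def count_per_entry(batch_ids, batch_size):
    """Count how many items carry each batch ID in [0, batch_size)."""
    counts = [0] * batch_size
    for batch_id in batch_ids:
        counts[batch_id] += 1
    return counts


def edges_from_counts(counts):
    """Return the boundaries between successive entries.

    ASSUMPTION: B + 1 edges, starting at 0 and ending at the total count.
    """
    return [0, *accumulate(counts)]


# Hypothetical batch of three entries; entry 1 is empty
batch_ids = [0, 0, 2, 2, 2]
counts = count_per_entry(batch_ids, batch_size=3)
edges = edges_from_counts(counts)
print(counts)  # [2, 0, 3]
print(edges)   # [0, 2, 2, 5]
```

Passing batch_size explicitly is what allows empty entries (here entry 1) to appear with a count of zero.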

__init__(data: Any | Sequence[Any], spans: Sequence[int] | Any, counts: Sequence[int] | Any | None = None, single_counts: Sequence[int] | Any | None = None, batch_ids: Sequence[int] | Any | None = None, batch_size: int | None = None, default: Any | None = None) → None

Initialize the attributes of the class.

Parameters:
  • data (Union[np.ndarray, torch.Tensor, List[Union[np.ndarray, torch.Tensor]]]) – Simple batched index or list of indexes

  • spans (Union[List[int], np.ndarray, torch.Tensor]) – (B) Per-entry parent spans used to derive offsets

  • counts (Union[List[int], np.ndarray, torch.Tensor], optional) – (B) Number of indexes in each batch entry

  • single_counts (Union[List[int], np.ndarray, torch.Tensor], optional) – (I) Number of index elements per index in the index list. This is the same as counts if the underlying data is a single index.

  • batch_ids (Union[List[int], np.ndarray, torch.Tensor], optional) – (I) Batch ID of each index in the list. If not specified, each count is assumed to correspond to a specific batch entry.

  • batch_size (int, optional) – Number of entries in the batch. Must be specified along with batch_ids

  • default (Union[np.ndarray, torch.Tensor], optional) – Empty-index prototype used when initializing an empty index list


data: Any | Sequence[Any]
counts: Any
single_counts: Any
edges: Any
spans: Any
offsets: Any
batch_size: int
property index: Any

Alias for the underlying data stored.

Returns:

Underlying index

Return type:

Union[np.ndarray, torch.Tensor]

property index_list: Sequence[Any]

Alias for the underlying data list stored.

Returns:

Underlying index list

Return type:

List[Union[np.ndarray, torch.Tensor]]

property full_index: Any

Returns the index combining all sub-indexes, if relevant.

Returns:

Complete concatenated index

Return type:

Union[np.ndarray, torch.Tensor]

property index_ids: Any

Returns the ID of the index in the list each element belongs to.

Returns:

List of index IDs for each element

Return type:

Union[np.ndarray, torch.Tensor]
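As an illustration of how index_list, single_counts, full_index and index_ids relate to one another, the sketch below rebuilds them from a small hypothetical index list. It uses plain Python lists and is not the library's code.

```python
# Hypothetical index list: three indexes of varying length
index_list = [[3, 7], [1], [4, 5, 9]]

# Number of elements in each index of the list
single_counts = [len(index) for index in index_list]

# Concatenation of all sub-indexes into one flat index
full_index = [element for index in index_list for element in index]

# ID of the parent index, repeated once per element it contains
index_ids = [i for i, n in enumerate(single_counts) for _ in range(n)]

print(single_counts)  # [2, 1, 3]
print(full_index)     # [3, 7, 1, 4, 5, 9]
print(index_ids)      # [0, 0, 1, 2, 2, 2]
```

Each position in index_ids names the index in the list that supplied the element at the same position in full_index.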

property full_counts: Any

Returns the total number of elements in each batch entry.

Returns:

Number of elements in each batch entry

Return type:

Union[np.ndarray, torch.Tensor]
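One way to picture full_counts is as the sum of single_counts over the indexes that belong to each batch entry. The sketch below makes that assumption explicit; the helper name total_elements_per_entry is hypothetical and plain lists stand in for arrays.

```python
def total_elements_per_entry(single_counts, batch_ids, batch_size):
    """Sum the element counts of all indexes belonging to each entry.

    ASSUMPTION: batch_ids[i] is the batch entry of index i, and
    single_counts[i] is the number of elements in index i.
    """
    full_counts = [0] * batch_size
    for n, batch_id in zip(single_counts, batch_ids):
        full_counts[batch_id] += n
    return full_counts


# Hypothetical batch: indexes 0 and 1 sit in entry 0, index 2 in entry 1
single_counts = [2, 1, 3]
batch_ids = [0, 0, 1]
full_counts = total_elements_per_entry(single_counts, batch_ids, batch_size=2)
print(full_counts)  # [3, 3]
```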

property batch_ids: Any

Returns the batch ID of each index in the list.

Returns:

Batch ID array, one per index in the list

Return type:

Union[np.ndarray, torch.Tensor]

property full_batch_ids: Any

Returns the batch ID of each element in the full index list.

Returns:

Complete batch ID array, one per element

Return type:

Union[np.ndarray, torch.Tensor]
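The difference between batch_ids (one value per index) and full_batch_ids (one value per element) can be sketched as a repeat operation. This is an assumption about the relationship, written with plain lists; the helper name expand_batch_ids is hypothetical.

```python
def expand_batch_ids(batch_ids, single_counts):
    """Repeat each index's batch ID once per element in that index."""
    return [b for b, n in zip(batch_ids, single_counts) for _ in range(n)]


# Hypothetical batch: three indexes with 2, 1 and 3 elements
batch_ids = [0, 0, 1]
single_counts = [2, 1, 3]
full_batch_ids = expand_batch_ids(batch_ids, single_counts)
print(full_batch_ids)  # [0, 0, 0, 1, 1, 1]
```

With NumPy arrays, the same operation would typically be written as np.repeat(batch_ids, single_counts).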

split() → list[Any] | list[list[Any]]

Breaks up the index batch into its constituents.

Returns:

List of list of indexes per entry in the batch

Return type:

List[List[Union[np.ndarray, torch.Tensor]]]
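A rough picture of what splitting into constituents means: cut the index list at the boundaries implied by the per-entry counts. The sketch below is an assumption-laden illustration with plain lists. The helper name split_by_counts is hypothetical, and no offset shift is applied to the index values, which the library may or may not do.

```python
from itertools import accumulate


def split_by_counts(index_list, counts):
    """Group a flat list of indexes into one sub-list per batch entry.

    ASSUMPTION: indexes are stored contiguously by entry, so counts[b]
    consecutive indexes belong to entry b.
    """
    edges = [0, *accumulate(counts)]
    return [index_list[lo:hi] for lo, hi in zip(edges[:-1], edges[1:])]


# Hypothetical batch: two indexes in entry 0, one index in entry 1
index_list = [[3, 7], [1], [4, 5, 9]]
counts = [2, 1]
per_entry = split_by_counts(index_list, counts)
print(per_entry)  # [[[3, 7], [1]], [[4, 5, 9]]]
```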

merge(index_batch: IndexBatch) → IndexBatch

Merge this index batch with another.

Parameters:

index_batch (IndexBatch) – Other index batch object to merge with

Returns:

Merged index batch

Return type:

IndexBatch

to_numpy() → IndexBatch

Cast underlying index to a np.ndarray and return a new instance.

Returns:

New IndexBatch object with an underlying np.ndarray index.

Return type:

IndexBatch

to_tensor(dtype: Any = None, device: Any = None) → IndexBatch

Cast underlying index to a torch.Tensor and return a new instance.

Parameters:
  • dtype (torch.dtype, optional) – Data type of the tensor to create

  • device (torch.device, optional) – Device on which to put the tensor

Returns:

New IndexBatch object with an underlying torch.Tensor index.

Return type:

IndexBatch