EmbeddingsDivision package

Submodules

EmbeddingsDivision.embeddings_division

class EmbeddingsDivision.embeddings_division.EmbeddingsDivision(model_name, device='cpu')

Bases: object

create_inner_model(model_class: AutoModelForCausalLM, model_name: str)

Generates a specialized model class by inheriting from the specified base model and initializes it with the provided model name. The resulting model is loaded with pretrained weights via the from_pretrained method.

Args:

model_class (AutoModelForCausalLM): The Hugging Face model class to be extended.

model_name (str): The name or path of the pretrained model.

Returns:

AutoModelForCausalLM: An instance of the custom model class with loaded weights.
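The pattern described above, subclassing a base class at call time and instantiating the subclass through from_pretrained, can be sketched without any external dependency. This is only an illustration of the pattern, not the library's implementation: BaseModel, create_custom_model, and the is_custom attribute are hypothetical stand-ins, and the from_pretrained shown here merely records the name instead of loading weights.

```python
class BaseModel:
    """Stand-in for a Hugging Face model class (hypothetical, not the real API)."""

    def __init__(self, name):
        self.name = name

    @classmethod
    def from_pretrained(cls, model_name):
        # A real from_pretrained would load weights from the Hub or from disk;
        # this stand-in only records the name so the sketch runs offline.
        return cls(model_name)


def create_custom_model(model_class, model_name):
    """Subclass model_class at call time and instantiate it via from_pretrained."""

    class CustomModel(model_class):
        # Hypothetical marker attribute; what the real library adds is not shown here.
        is_custom = True

    # from_pretrained is inherited, so cls inside it is CustomModel.
    return CustomModel.from_pretrained(model_name)


model = create_custom_model(BaseModel, "my-model")
```

Because from_pretrained is a classmethod, calling it on the subclass yields an instance of the subclass, which is what lets the custom behavior ride along with the loaded weights.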

create_tuned_model(model_class: AutoModelForCausalLM, model_name: str)

Generates a specialized model class by inheriting from the specified base model and initializes it with the provided model name. The resulting model is loaded with pretrained weights via the from_pretrained method.

Args:

model_class (AutoModelForCausalLM): The Hugging Face model class to be extended.

model_name (str): The name or path of the pretrained model.

Returns:

AutoModelForCausalLM: An instance of the custom model class with loaded weights.

divide_embeddings(ratio: float)

Splits the model’s embedding layer into two new embedding layers: the first covers the specified ratio of the original vocabulary, and the second covers the remainder.

Args:

ratio (float): A floating-point value strictly between 0 and 1, indicating the proportion of the vocabulary allocated to the first embedding layer.

Raises:

ValueError: If the ratio is not strictly between 0 and 1.

This method replaces the original embedding layer with two new embedding layers, transfers the corresponding weight data from the original embedding to each of the new layers, and updates the model’s forward pass. The original embedding layer is deleted afterward to free resources.
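The splitting and weight-transfer steps can be illustrated with a framework-free sketch that uses plain lists in place of embedding weight tensors. It rests on several assumptions that the text above does not confirm: that the split index is floor(vocab_size * ratio), that the first layer takes the low token ids, and that lookups for the second layer subtract the split index. The names split_embedding_rows and lookup are hypothetical.

```python
def split_embedding_rows(weight, ratio):
    """Split a vocab_size x dim weight table into two tables by ratio."""
    if not 0 < ratio < 1:
        raise ValueError("ratio must be strictly between 0 and 1")
    split = int(len(weight) * ratio)  # assumed rounding rule: floor
    # Copy the rows so the original table can be discarded afterwards.
    first = [row[:] for row in weight[:split]]
    second = [row[:] for row in weight[split:]]
    return first, second


def lookup(token_id, first, second):
    """Route a token id to whichever table holds its row."""
    if token_id < len(first):
        return first[token_id]
    # Ids at or past the split are re-based to index the second table.
    return second[token_id - len(first)]


# Toy table: 10 tokens, 2 dimensions; row i is [i, 10 * i].
weight = [[float(i), float(10 * i)] for i in range(10)]
first, second = split_embedding_rows(weight, 0.3)
del weight[:]  # the original rows are no longer needed
```

With a vocabulary of 10 and a ratio of 0.3, the first table holds 3 rows and the second holds 7, and every token id still resolves to the row it had before the split.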

forward_change()

modified_forward(input_ids: LongTensor | None = None, attention_mask: Tensor | None = None, position_ids: LongTensor | None = None, past_key_values: Cache | List[FloatTensor] | None = None, inputs_embeds: FloatTensor | None = None, labels: LongTensor | None = None, use_cache: bool | None = None, output_attentions: bool | None = None, output_hidden_states: bool | None = None, return_dict: bool | None = None, cache_position: LongTensor | None = None, num_logits_to_keep: int = 0, **kwargs)

original_forward(*args, **kwargs)

scheduler_hook(grad, row)

Module contents

EmbeddingsDivision library v0.1.0

The Apache 2.0 License Copyright © Dmitrii Kuzmin