EmbeddingsDivision package

Submodules

EmbeddingsDivision.embeddings_division

class EmbeddingsDivision.embeddings_division.EmbeddingsDivision(model_name, device='cpu')[source]

Bases: object

create_inner_model(model_class: AutoModelForCausalLM, model_name: str)[source]

Generates a specialized model class by inheriting from the specified base model and initializes it with the provided model name. The resulting model is loaded with pretrained weights via the from_pretrained method.

Args:

model_class (AutoModelForCausalLM): The Hugging Face model class to be extended.

model_name (str): The name or path of the pretrained model.

Returns:

AutoModelForCausalLM: An instance of the custom model class with loaded weights.
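
The subclass-then-load pattern described above can be sketched as follows. This is only an illustration, not the library's implementation: StubCausalLM is a hypothetical stand-in for a Hugging Face model class so the example runs without downloading weights, and create_inner_model_sketch is an assumed name for a free function mirroring the method.

```python
class StubCausalLM:
    """Hypothetical stand-in for a Hugging Face model class."""

    def __init__(self, name_or_path):
        self.name_or_path = name_or_path

    @classmethod
    def from_pretrained(cls, name_or_path):
        # A real model class would load pretrained weights here.
        return cls(name_or_path)


def create_inner_model_sketch(model_class, model_name):
    """Derive a subclass of model_class and instantiate it via from_pretrained."""

    class InnerModel(model_class):
        # Customizations (for example, a replaced forward pass) would go here.
        pass

    return InnerModel.from_pretrained(model_name)


# "some-model-name" is a placeholder, not a real checkpoint.
model = create_inner_model_sketch(StubCausalLM, "some-model-name")
```

Because the returned object is an instance of a subclass, it still passes isinstance checks against the base class while leaving room for overridden behavior.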

create_tuned_model(model_class: AutoModelForCausalLM, model_name: str)[source]

Generates a specialized model class by inheriting from the specified base model and initializes it with the provided model name. The resulting model is loaded with pretrained weights via the from_pretrained method.

Args:

model_class (AutoModelForCausalLM): The Hugging Face model class to be extended.

model_name (str): The name or path of the pretrained model.

Returns:

AutoModelForCausalLM: An instance of the custom model class with loaded weights.

divide_embeddings(ratio: float)[source]

Splits the model’s embedding layer into two new embedding layers, each sized according to the specified ratio of the original vocabulary size.

Args:

ratio (float): A floating-point value strictly between 0 and 1, indicating the proportion of the vocabulary allocated to the first embedding layer.

Raises:

ValueError: If the ratio is not strictly between 0 and 1.

This method replaces the original embedding layer with two new embedding layers, transfers the corresponding weight data from the original embedding to each of the new layers, and updates the model’s forward pass. The original embedding layer is deleted afterward to free resources.
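
A minimal PyTorch sketch of this kind of split is shown below. It rests on assumptions the documentation does not confirm: that the first layer receives the first int(vocab_size * ratio) rows and the second layer the remainder, and that lookups are routed by comparing token ids against the split point. The helper names split_embedding and lookup are hypothetical and not part of the library.

```python
import torch
import torch.nn as nn


def split_embedding(embedding: nn.Embedding, ratio: float):
    """Split one embedding table into two along the vocabulary axis."""
    if not 0.0 < ratio < 1.0:
        raise ValueError("ratio must be strictly between 0 and 1")
    vocab_size, dim = embedding.weight.shape
    split = int(vocab_size * ratio)  # assumed rounding rule
    if split == 0:
        raise ValueError("ratio is too small for this vocabulary size")
    first = nn.Embedding(split, dim)
    second = nn.Embedding(vocab_size - split, dim)
    with torch.no_grad():
        # Copy the matching rows of the original table into each new layer.
        first.weight.copy_(embedding.weight[:split])
        second.weight.copy_(embedding.weight[split:])
    return first, second


def lookup(first: nn.Embedding, second: nn.Embedding, input_ids: torch.Tensor):
    """Route each token id to the table that holds its row."""
    split = first.num_embeddings
    in_first = input_ids < split
    # Clamp so each table only ever sees indices that are valid for it.
    first_out = first(input_ids.clamp(max=split - 1))
    second_out = second((input_ids - split).clamp(min=0))
    return torch.where(in_first.unsqueeze(-1), first_out, second_out)


original = nn.Embedding(8, 4)
first, second = split_embedding(original, 0.25)
ids = torch.tensor([[0, 1, 2, 7]])
combined = lookup(first, second, ids)
```

In this sketch the combined lookup reproduces the original layer's output for every token id, which is the property a modified forward pass would need to preserve.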

forward_change()[source]

modified_forward(input_ids: LongTensor | None = None, attention_mask: Tensor | None = None, position_ids: LongTensor | None = None, past_key_values: Cache | List[FloatTensor] | None = None, inputs_embeds: FloatTensor | None = None, labels: LongTensor | None = None, use_cache: bool | None = None, output_attentions: bool | None = None, output_hidden_states: bool | None = None, return_dict: bool | None = None, cache_position: LongTensor | None = None, num_logits_to_keep: int = 0, **kwargs)[source]

original_forward(*args, **kwargs)[source]

scheduler_hook(grad, row)[source]

Module contents

EmbeddingsDivision library v0.1.0