Base Classes

BaseDeltaConfig

class BaseDeltaConfig(modified_modules=None, exclude_modules=None, unfrozen_modules=['deltas'], common_structure=False, backbone_class=None, backbone_checkpoint_name=None, backbone_hash=None)[source]

Base class for all configuration classes. Handles a few parameters common to all delta models’ configurations as well as methods for loading/downloading/saving configurations.

Class attributes (overridden by derived classes):

  • delta_type (str) – the name of the delta modules, used to create the correct AutoConfig.

Parameters
  • modified_modules (List[str], optional, defaults to None) –

The list of keys used to determine which modules you want to modify. OpenDelta will take every module that ends with one of the provided keys as a modification target. When no value is given, i.e. modified_modules=None, the delta module will use its corresponding default modified modules. Taking DistilBertModel with a classifier on top as an example:

    Note

    Examples: When adding delta to DistilBertModel,

    1. Setting it to ["0.attention.out_lin"] adds delta modules to the attention output of distilbert’s layer 0, i.e., distilbert.transformer.layer.0.attention.out_lin.

    2. Setting it to ["attention.out_lin"] adds delta modules to every layer’s attention.out_lin.

  • exclude_modules (List[str], optional, defaults to None) –

    Modules whose keys begin with one of these strings will be excluded from modification. Note that currently only plain text (no regular expressions) is supported.

  • unfrozen_modules (List[str], optional, defaults to ["deltas"]) –

    The modules that are unfrozen during training in freeze_module(), which include the ones newly introduced as delta modules and the ones that are originally part of the model but set to trainable (requires_grad=True) to train together with the delta modules. OpenDelta will take every module that ends with one of the provided keys, together with all its sub-modules and parameters, as trainable. A combined usage sketch follows this parameter list.

    Note

    Examples: When adding delta to DistilBertModel,

    1. setting this argument to ["bias"] will make all bias terms tunable.

    2. setting this argument to ["attention"] will make all parameters in all attention modules tunable.

    3. setting this argument to ["deltas"] will make all parameters in the newly introduced delta modules tunable.

    4. setting this argument to ["classifier"] will make all parameters in the classifier tunable.

    5. setting this argument to ["3.ffn.lin2", "deltas", "classifier"] will make all parameters in the third layer’s feed-forward network’s second linear layer, the delta modules, and the classifier modules tunable.

  • common_structure (bool, optional, defaults to False) – Whether to use the common structure mapping of the transformer model when designating modified_modules and unfrozen_modules.

  • backbone_class (str, optional, defaults to None) – The name of the backbone model’s class, e.g. RobertaForMaskedLM. Saving this information lets the users know explicitly on which backbone the delta model was trained.

  • backbone_checkpoint_name (str, optional, defaults to None) – The specific checkpoint of the model. Ideally, it should be the URL from which the checkpoint can be downloaded. However, we do not force the user to specify a downloadable URL here.

  • backbone_hash (str, optional, defaults to None) – The MD5 hash of the backbone model. It is calculated from the string representation of the model and the sequential expansion of all the parameters in the model. When loading a delta checkpoint in strict mode, the hash of the backbone model is compared to the hash in this config.
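
Below is a minimal sketch of how these parameters are typically supplied through a concrete delta model such as AdapterModel (the checkpoint name and key suffixes are illustrative, and whether a given subclass accepts every parameter may vary):

from transformers import AutoModelForSequenceClassification
from opendelta import AdapterModel

backbone = AutoModelForSequenceClassification.from_pretrained("distilbert-base-uncased")

# Add adapters to every layer's attention output projection, skip layer 5,
# and keep the deltas and the classifier head trainable after freezing.
delta_model = AdapterModel(
    backbone_model=backbone,
    modified_modules=["attention.out_lin"],
    exclude_modules=["distilbert.transformer.layer.5"],
    unfrozen_modules=["deltas", "classifier"],
)
delta_model.freeze_module(set_state_dict=True)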

classmethod from_finetuned(finetuned_delta_path: Union[str, PathLike], **kwargs) BaseDeltaConfig[source]

Instantiate a BaseDeltaConfig (or a derived class) from a finetuned delta module configuration.

Parameters
  • finetuned_delta_path (str or os.PathLike) –

    This can be either:

    • a string, the model id of a finetuned delta model configuration hosted inside a model repo on deltahub.co. Valid model ids can be located at the root-level, like bert-base-uncased, or namespaced under a user or organization name, like dbmdz/bert-base-german-cased.

    • a path to a directory containing a configuration file saved using the BaseDeltaConfig.save_finetuned() method, e.g., ./my_model_directory/.

    • a path or url to a saved configuration JSON file, e.g., ./my_model_directory/configuration.json.

  • cache_dir (str or os.PathLike, optional) – Path to a directory in which a downloaded pretrained delta model configuration should be cached if the standard cache should not be used.

Example:

delta_config = AdapterConfig.from_finetuned("thunlp/FactQA_T5-large_Adapter", backbone_model=t5)

save_finetuned(save_directory: Union[str, PathLike], **kwargs)[source]

Save a configuration object to the directory save_directory, so that it can be re-loaded using the BaseDeltaConfig.from_finetuned() class method.

Parameters
  • save_directory (str or os.PathLike) – Directory where the configuration JSON file will be saved (will be created if it does not exist).

  • push_to_hub (bool, optional, defaults to False) –

    Whether or not to push your model to the Hugging Face model hub after saving it.

    Warning

    1. An error will be raised if you have not configured access to the Hugging Face Model Hub.

    2. Using push_to_hub=True will synchronize the repository you are pushing to with save_directory, which requires save_directory to be a local clone of the repo you are pushing to if it’s an existing folder. Pass along temp_dir=True to use a temporary directory instead.

  • kwargs – Additional keyword arguments.
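
A sketch of the save/reload round trip, assuming delta_config is an AdapterConfig instance such as the one obtained from from_finetuned() above (the directory name is illustrative):

# Save the configuration so it can be reloaded later; the directory is
# created if it does not exist.
delta_config.save_finetuned("./my_delta_directory/")
reloaded_config = AdapterConfig.from_finetuned("./my_delta_directory/")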

classmethod from_dict(config_dict: Dict[str, Any], **kwargs) BaseDeltaConfig[source]

Instantiate a BaseDeltaConfig from a python dictionary of parameters.

Parameters
  • config_dict (Dict[str, Any]) – Dictionary that will be used to instantiate the configuration object. Such a dictionary can be retrieved from a pretrained checkpoint by leveraging the get_config_dict() method.

  • kwargs (Dict[str, Any]) – Additional parameters from which to initialize the configuration object.

Returns

The configuration object instantiated from those parameters.

Return type

BaseDeltaConfig

to_dict() Dict[str, Any][source]

Serializes this instance to a Python dictionary.
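
A sketch of a dictionary round trip, assuming delta_config is an AdapterConfig instance from the earlier examples:

# Serialize to a plain dict and rebuild an equivalent config from it.
config_dict = delta_config.to_dict()
restored_config = AdapterConfig.from_dict(config_dict)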

DeltaBase

class DeltaBase(backbone_model: Module, modified_modules: Optional[List[str]] = None, exclude_modules: Optional[List[str]] = None, unfrozen_modules: Optional[List[str]] = None, interactive_modify: Optional[Union[bool, int]] = False, common_structure: Optional[bool] = False, backend: Optional[str] = 'hf')[source]

This is the base class for all delta models. It provides four simple but effective functionalities for building the delta model:

  1. addressing a module inside the backbone model using a minimal description key.

  2. providing an interface for modifying and inserting modules which keeps the docs/IO the same as the module before modification.

  3. passing a pseudo input to determine the intermediate dimension of the delta models.

  4. freezing a part of the model parameters according to keys.

It also provides a unified interface for model loading and saving.

Class attributes (overridden by derived classes):

  • delta_type (str): the name of the delta modules, used to create the correct opendelta.AutoDeltaModel.

  • config_class (BaseDeltaConfig): The corresponding configuration class.

Parameters
  • backbone_model (nn.Module, required) – The backbone model that the delta models are built upon. The modifications to the backbone model are made in place.

  • modified_modules (List[str], optional, defaults to None) –

    The modules subjected to update.

    Note

    Leaving this argument as None will make the delta model fall back to the default setting, which adds the delta modules at the positions experimented with in the paper. In this setting, the common structure mapping is loaded to address the corresponding modules.

  • exclude_modules (List[str], optional, defaults to None) – Modules whose keys begin with one of these strings will be excluded from modification. Note that currently only plain text (no regular expressions) is supported.

  • unfrozen_modules (List[str], optional, defaults to None) – The modules that are not frozen when freezing the main part of the model.

  • registration_name (str, optional, defaults to "deltas") – The root name of the delta models when attached to the backbone model.

  • common_structure (bool, optional, defaults to False) – Whether to use the common structure mapping to specify the modified_modules, i.e., if common_structure=True, a common key such as ["attn"] is used for the attention module in different models. We DO NOT recommend setting common_structure to True yourself unless you are using deltas across multiple backbones and don’t want to modify the code.

  • interactive_modify (bool or int, optional, defaults to False) – Whether to use interactive modification. Setting it to an int specifies the port of the web server. A construction sketch follows this parameter list.
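
DeltaBase is not instantiated directly; concrete subclasses such as LoraModel are. A minimal construction sketch (model name illustrative):

from transformers import AutoModelForSequenceClassification
from opendelta import LoraModel

backbone = AutoModelForSequenceClassification.from_pretrained("distilbert-base-uncased")

# Modifications are made in place on `backbone`; with modified_modules=None,
# the subclass falls back to its default modification positions.
delta_model = LoraModel(backbone_model=backbone)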

config_class

alias of BaseDeltaConfig

forward(*args, **kwargs) RuntimeError[source]

Warning

Removed method. A delta model should be attached to a backbone model and cannot forward any data by itself. Please use the backbone model’s forward function after attaching the delta model to the backbone.

classmethod from_config(config: Union[BaseDeltaConfig, dict], backbone_model: Module, check_hash=True, **kwargs)[source]

Initialize a delta model from a config object or a dict containing the configs. To temporarily change a value in the config, pass it through kwargs. If the config has a backbone model’s hash, which means it is a finetuned delta model’s config, then we compare the hash recorded in the config with the newly calculated one to ensure the finetuned delta model was trained on the passed backbone_model. Pass check_hash=False to disable the check.

Parameters
  • config (BaseDeltaConfig or dict) – The config used to initialize the delta model.

  • backbone_model (nn.Module) – The backbone model. Modifications will be made in place in the backbone model.

  • check_hash (bool, defaults to True) – Whether to check the hash of the backbone model against the hash recorded in the config.

  • kwargs – Any configurations that are passed to update the config object.
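
A usage sketch, assuming delta_config and backbone from the earlier examples; AutoDeltaModel dispatches to the concrete subclass via delta_type in the config:

from opendelta import AutoDeltaModel

# Pass check_hash=False when the backbone differs from the one recorded
# in the finetuned delta config.
delta_model = AutoDeltaModel.from_config(delta_config, backbone_model=backbone, check_hash=False)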

add_all_delta_to_backbone(backbone: Module, modified_modules: List[str]) Module[source]

The main function to add delta models to the backbone model based on the modified_modules.

Parameters
  • backbone (nn.Module, required) – The backbone model; the modifications are made in place.

  • modified_modules (List[str], optional, defaults to None) – Leaving this argument as None will make the delta model fall back to the default setting, which adds the delta modules at the positions experimented with in the paper. In this setting, the common structure mapping is loaded to address the corresponding modules.

Returns

The modified backbone model.

Return type

nn.Module

update_module(module: Module, key: str)[source]

Update a module specified by key. The method is reimplemented in each specific delta model.

freeze_module(module: Optional[Module] = None, exclude: Optional[List[str]] = None, set_state_dict: Optional[bool] = True)[source]

Freeze the parameters of the backbone model, leaving the parameters listed in exclude untouched. Delta modules are identified via the _is_delta attribute, because they may share parameters with the main model (e.g., bias terms).

Parameters
  • module (nn.Module, optional, defaults to None) – The module of which some parts are frozen. If left as None, the function will use self.backbone_model as the module to be frozen.

  • exclude (List[str], optional, defaults to ["deltas"]) – The parameters that should not be frozen. Defaults to all the delta parameters.

  • set_state_dict (bool, optional, defaults to True) – Whether to set the backbone model’s state dict to contain only the parameters that still require grad.

  • prefix (str, optional, defaults to "") – A parameter used for the recursive freezing. Should not be changed from its default value of "".
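
A typical call, continuing the earlier sketches:

# Freeze everything except the delta modules and the classifier head; with
# set_state_dict=True, backbone.state_dict() afterwards contains only the
# parameters that still require grad.
delta_model.freeze_module(exclude=["deltas", "classifier"], set_state_dict=True)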

find_key(key: str, target_list: List[str])[source]

Check whether any target string is contained in the key or matches the tail of the key, i.e., whether the key ends with one of the targets.

Parameters
  • key (str) – The key (name) of a submodule in an ancestor module. E.g., model.encoder.layer.0.attention

  • target_list (List[Union[str, re.Pattern]]) – The target list that we try to match key with. E.g., [“attention”]

Returns

True if the key matches the target list.

Return type

bool
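
A sketch of the expected matching behavior (module keys illustrative):

delta_model.find_key("distilbert.transformer.layer.0.attention.out_lin",
                     ["attention.out_lin"])  # True: the key ends with the target
delta_model.find_key("distilbert.transformer.layer.0.ffn.lin1",
                     ["attention.out_lin"])  # False: no target matches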

find_module(root_module: Module, key: str)[source]

Find a module using a key and the root module. Returns the parent module, the child module’s name, and the child module itself.

Parameters
  • root_module (nn.Module) – The root module in which to find the submodule.

  • key (str) – The relative key to the root module.

Returns

  • A reference to the parent module of the target module, mainly for substituting the target module.

  • The key of the target module relative to its parent module.

  • Target module.

Return type

(nn.Module, str, nn.Module)
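
A usage sketch, continuing the earlier examples (key illustrative):

# Locate layer 0's attention output projection together with its parent,
# e.g. in preparation for substituting it.
parent, child_name, child = delta_model.find_module(
    backbone, "distilbert.transformer.layer.0.attention.out_lin"
)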

replace_module(parent_module: Module, child_name: str, child_module: Module, new_module: Module, delta_name: Optional[str] = 'delta')[source]

Replace a module’s child module with new_module (a delta module). Used by delta methods based on direct replacement, such as opendelta.delta_modules.lora.LoraModel.

Parameters
  • parent_module (nn.Module) – The parent module of the replacement.

  • child_name (str) – The child module’s name, i.e., parent_module.child_name gives us child_module.

  • child_module (nn.Module) – The original child module.

  • new_module (nn.Module) – The delta module.

  • delta_name (str, optional, defaults to "delta") – The name of the delta module, used for recording. Note that parent_module.delta_name WILL NOT give you the delta module.
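
A sketch of direct replacement, using an identity-preserving stand-in for a real delta module (in practice new_module would wrap child with extra trainable parameters):

import torch.nn as nn

parent, child_name, child = delta_model.find_module(
    backbone, "distilbert.transformer.layer.0.attention.out_lin"
)
# Stand-in "delta": wraps the original child and behaves exactly like it.
new_module = nn.Sequential(child)
delta_model.replace_module(parent, child_name, child, new_module, delta_name="delta")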

modify_module(module: Module)[source]

Modify the inside parameters of a module. This method will be reimplemented in different derived classes if needed.

insert_module(module, method='sequential', delta_module=None, delta_name='delta', strict=False, _delta_info=None)[source]

Insert a module (that did not previously exist in the code base) before/after a module. Specifically, it modifies the forward function of the original module to first pass the arguments into the new module’s forward function and then pass the result into the original one. The new module can also be inserted after the original module with a similar mechanism.

When implementing the new module, researchers should be aware of the arguments of the original module’s forward function.

Parameters
  • module – (nn.Module): The (sub)module into which a delta module is inserted.

  • method – (str, optional, defaults to "sequential"): Whether the delta module is inserted sequentially or in parallel.

  • delta_module – (DeltaBase): The delta module to be inserted.

  • delta_name – (str, optional, defaults to "delta"): The name of the delta in the backbone module.

  • strict – (bool, optional): Whether to prohibit modifying an already modified module.

  • _delta_info (Dict, optional) – Used in attach() to reattach a delta module to the backbone. The info of the original delta is passed through _delta_info.

insert_sequential_module(module, delta_module=None, delta_name='delta', strict=False, _delta_info=None)[source]

Insert a module (that did not previously exist in the code base) before/after a module. Specifically, it modifies the forward function of the original module to first pass the arguments into the new module’s forward function and then pass the result into the original one. The new module can also be inserted after the original module with a similar mechanism.

When implementing the new module, researchers should be aware of the arguments of the original module’s forward function.

Parameters
  • module – (nn.Module): The (sub)module into which a delta module is inserted.

  • delta_module – (DeltaBase): The delta module to be inserted.

  • delta_name – (str, optional, defaults to "delta"): The name of the delta in the backbone module.

  • strict – (bool, optional): Whether to prohibit modifying an already modified module.

  • _delta_info (Dict, optional) – Used in attach() to reattach a delta module to the backbone. The info of the original delta is passed through _delta_info.
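
A conceptual sketch with a hypothetical delta module; per the description above, its forward must keep the input/output contract of the original module (real delta modules in the library may follow additional conventions):

import torch
import torch.nn as nn

class ScaleDelta(nn.Module):
    """Hypothetical delta: rescales the hidden states it receives."""
    def __init__(self):
        super().__init__()
        self.scale = nn.Parameter(torch.ones(1))

    def forward(self, hiddens):
        # Must return something the original module's forward can consume.
        return hiddens * self.scale

parent, name, ffn = delta_model.find_module(backbone, "distilbert.transformer.layer.0.ffn")
delta_model.insert_sequential_module(ffn, delta_module=ScaleDelta(), delta_name="scale_delta")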

insert_parallel_module(module, delta_module=None, delta_name='delta', strict=False, _delta_info=None)[source]

Insert a module (that did not previously exist in the code base) in parallel with a module. Specifically, it modifies the forward function of the original module to first pass the arguments into the delta module’s forward function and set the result aside, then combine it with the output of the backbone module.

When implementing the new module, researchers should be aware of the arguments and keywords of the original module’s forward function.

Parameters
  • module – (nn.Module): The (sub)module into which a delta module is inserted.

  • delta_module – (DeltaBase): The delta module to be inserted.

  • delta_name – (str, optional, defaults to "delta"): The name of the delta in the backbone module.

  • strict – (bool, optional): Whether to prohibit modifying an already modified module.

  • _delta_info (Dict, optional) – Used in attach() to reattach a delta module to the backbone. The info of the original delta is passed through _delta_info.
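
A conceptual sketch in the spirit of LoRA: the delta branch receives the same input as the module, and its output is combined with the module’s output (hidden size 768 assumes distilbert-base; real delta modules in the library may follow additional conventions):

import torch
import torch.nn as nn

class LowRankBranch(nn.Module):
    """Hypothetical parallel branch: x -> x @ A @ B, added to the module's output."""
    def __init__(self, dim=768, rank=8):
        super().__init__()
        self.A = nn.Parameter(torch.randn(dim, rank) * 0.01)
        self.B = nn.Parameter(torch.zeros(rank, dim))  # zero init: starts as a no-op

    def forward(self, hiddens):
        return hiddens @ self.A @ self.B

parent, name, out_lin = delta_model.find_module(
    backbone, "distilbert.transformer.layer.0.attention.out_lin"
)
delta_model.insert_parallel_module(out_lin, delta_module=LowRankBranch(), delta_name="lowrank_delta")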

set_active_state_dict(module: Module)[source]

Modify the state_dict function of the model (by default, the backbone model) to return only the tunable part.

Parameters

module (nn.Module) – The module modified. The modification is in-place.
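
A usage sketch, continuing the earlier examples (checkpoint path illustrative):

import torch

# After this call, state_dict() returns only the tunable parameters, so the
# saved checkpoint contains just the delta, not the full backbone.
delta_model.set_active_state_dict(backbone)
torch.save(backbone.state_dict(), "delta_only.ckpt")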

log(module=None, delta_ratio=True, trainable_ratio=True, visualization=True, cuda_memory=True)[source]

Log and visualize the result of applying the delta. Possible options are trainable_ratio, visualization, delta_ratio, and cuda_memory.

Parameters
  • delta_ratio (bool, optional) – Whether computing the ratio of parameters in the delta modules.

  • trainable_ratio (bool, optional) – Whether computing the ratio of trainable parameters.

  • visualization (bool, optional) – Whether visualize the parameter information of the modified backbone.
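
A typical call, continuing the earlier examples:

# Print a structure visualization of the modified backbone together with the
# ratios of delta and trainable parameters.
delta_model.log(delta_ratio=True, trainable_ratio=True, visualization=True)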

attach(module: Optional[Module] = None, reset_state_dict=True)[source]

Reattach the delta modules to the backbone. Note that this method cannot be used to create new delta modules. Instead, a DeltaBase.detach() should precede this method.

Parameters

module (object, optional, default to None) – The backbone module that we reattach the deltas to.

detach(module: Optional[Module] = None, reset_state_dict=True)[source]

Detach the delta module from the backbone. The delta module is not deleted, but temporarily turned off. Use DeltaBase.attach() to reattach the delta model to the backbone.

Parameters

module (object, optional, default to None) – The backbone module that we detached the deltas from.
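
A detach/attach round trip, continuing the earlier examples:

# Temporarily switch the deltas off, e.g. to evaluate the clean backbone,
# then switch them back on.
delta_model.detach()
# ... run the unmodified backbone here ...
delta_model.attach()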