Utilities for parsing configuration from YAML files and merging them with argparse CLI args (See Getting started for a concrete example).

For this I use OmegaConf and I have extended it with integration for argparse.

This way we can tweak models and run investigative experiments with weird configurations from the CLI, without polluting our configuration files.

When a configuration shows promise, we can turn it into a configuration file with a detailed description and set it in stone for reproducibility.

The whole process is transparent if you follow the conventions.

generate_example_config(parser, output_file, args=None)

generate_example_config Generate an example configuration from the parser's arguments and save it to output_file. It builds on parse_config, which parses a provided YAML config file and command line args and merges them.

During experimentation we ideally want a configuration file with the model and training configuration, while still being able to run quick experiments using command line args. parse_config allows you to double dip, by overriding values in a YAML config file through user provided command line arguments.

The precedence for merging is: default cli args values < config file values < user provided cli args. In practice:

  • if you don't include a value in your configuration it will take the default value from the argparse arguments
  • if you provide a cli arg (e.g. run the script with --bsz 64) it will override the value in the config file

Note that we use an extended OmegaConf instance to achieve this (see slp.config.omegaconf.OmegaConf).


Parameters:

  • parser (ArgumentParser): The argument parser you want to use
  • output_file (str): Configuration file name or file descriptor to save example configuration
  • args (Optional[List[str]]): Optional input sys.argv style args. Useful for testing. Use this only for testing. By default it uses sys.argv[1:]
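
As a rough illustration of what dumping an example configuration could look like, the sketch below collects every argparse destination, including options whose default is None, and writes them to either a file name or an open file object. It uses only the standard library and emits JSON (which most YAML parsers also accept) instead of going through OmegaConf. The name dump_example_config and the --dropout option are hypothetical and not part of slp.

```python
import argparse
import io
import json
from typing import IO, List, Optional, Union


def dump_example_config(
    parser: argparse.ArgumentParser,
    output_file: Union[str, IO[str]],
    args: Optional[List[str]] = None,
) -> None:
    """Hypothetical stand-in: write all parser values, None included, as JSON."""
    # With args=None argparse falls back to sys.argv[1:].
    values = vars(parser.parse_args(args))
    text = json.dumps(values, indent=2, sort_keys=True)
    if isinstance(output_file, str):
        # A file name was given: open it ourselves.
        with open(output_file, "w") as fd:
            fd.write(text + "\n")
    else:
        # An already-open file object was given: write to it directly.
        output_file.write(text + "\n")


parser = argparse.ArgumentParser("My cool model")
parser.add_argument("--hidden", dest="model.hidden", type=int, default=20)
# No default here, so the value is None and still shows up in the dump.
parser.add_argument("--dropout", dest="model.dropout", type=float)

buffer = io.StringIO()
dump_example_config(parser, buffer, args=[])
example = json.loads(buffer.getvalue())
```

Keeping the None entries means the generated file lists every available option, which is the point of an example configuration.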

Source code in slp/config/
def generate_example_config(
    parser: argparse.ArgumentParser,
    output_file: str,
    args: Optional[List[str]] = None,
) -> None:
    """parse_config Parse a provided YAML config file and command line args and merge them

    During experimentation we want ideally to have a configuration file with the model and training configuration,
    but also be able to run quick experiments using command line args.
    This function allows you to double dip, by overriding values in a YAML config file through user provided command line arguments.

    The precedence for merging is as follows
       * default cli args values < config file values < user provided cli args


       * if you don't include a value in your configuration it will take the default value from the argparse arguments
       * if you provide a cli arg (e.g. run the script with --bsz 64) it will override the value in the config file

    Note we use an extended OmegaConf instance to achieve this (see slp.config.omegaconf.OmegaConf)

        parser (argparse.ArgumentParser): The argument parser you want to use
        output_file (Union[str, IO]): Configuration file name or file descriptor to save example configuration
        args (Optional[List[str]]): Optional input sys.argv style args. Useful for testing.
            Use this only for testing. By default it uses sys.argv[1:]
    config = parse_config(parser, None, include_none=True), output_file)

make_cli_parser(parser, datamodule_cls)

make_cli_parser Augment an argument parser for slp with the default arguments

Default arguments for training, logging, optimization etc. are added to the input {parser}. If you use make_cli_parser, the following command line arguments will be included

Parameters:

  • parser (ArgumentParser): A parent argument parser to be augmented
  • datamodule_cls (LightningDataModule): A data module class that injects arguments through the add_argparse_args method

Returns:

  • argparse.ArgumentParser: The augmented command line parser
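
To make the augmentation pattern concrete, here is a minimal sketch using plain argparse. ToyDataModule and augment_parser are hypothetical stand-ins, and the dest names and default values are assumptions chosen for illustration; the actual arguments added by make_cli_parser are not listed here.

```python
import argparse


class ToyDataModule:
    """Hypothetical stand-in for a data module class."""

    @classmethod
    def add_argparse_args(
        cls, parent_parser: argparse.ArgumentParser
    ) -> argparse.ArgumentParser:
        # Inject data-related options into the parser it receives.
        parent_parser.add_argument(
            "--bsz", dest="data.batch_size", type=int, default=32
        )
        return parent_parser


def augment_parser(
    parser: argparse.ArgumentParser, datamodule_cls
) -> argparse.ArgumentParser:
    """Sketch: add shared options, then let the data module add its own."""
    parser.add_argument("--lr", dest="optim.lr", type=float, default=1e-3)
    return datamodule_cls.add_argparse_args(parser)


parser = argparse.ArgumentParser("My cool model")
parser.add_argument("--hidden", dest="model.hidden", type=int)
parser = augment_parser(parser, ToyDataModule)
args = vars(parser.parse_args(["--bsz", "64", "--lr", "0.01"]))
```

The model-specific options stay in the caller's parser, while the shared and data options are layered on top of it.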


>>> import argparse
>>> from import PLDataModuleFromDatasets
>>> parser = argparse.ArgumentParser("My cool model")
>>> parser.add_argument("--hidden", dest="model.hidden", type=int)  # Create parser with model arguments and anything else you need
>>> parser = make_cli_parser(parser, PLDataModuleFromDatasets)
>>> args = parser.parse_args(args=["--bsz", "64", "--lr", "0.01"])
Source code in slp/config/
def make_cli_parser(
    parser: argparse.ArgumentParser, datamodule_cls: pl.LightningDataModule
) -> argparse.ArgumentParser:
    """make_cli_parser Augment an argument parser for slp with the default arguments

    Default arguments for training, logging, optimization etc. are added to the input {parser}.
    If you use make_cli_parser, the following command line arguments will be included

parse_config(parser, config_file, args=None, include_none=False)

parse_config Parse a provided YAML config file and command line args and merge them

During experimentation we ideally want a configuration file with the model and training configuration, while still being able to run quick experiments using command line args. This function allows you to double dip, by overriding values in a YAML config file through user provided command line arguments.

The precedence for merging is: default cli args values < config file values < user provided cli args. In practice:

  • if you don't include a value in your configuration it will take the default value from the argparse arguments
  • if you provide a cli arg (e.g. run the script with --bsz 64) it will override the value in the config file

Note that we use an extended OmegaConf instance to achieve this (see slp.config.omegaconf.OmegaConf).
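
The precedence rule can be sketched with plain nested dictionaries. This is only an illustration of the merge order, not the OmegaConf-based implementation; deep_merge and merge_with_precedence are hypothetical helper names, and the sample values are made up.

```python
import copy
from typing import Any, Dict


def deep_merge(base: Dict[str, Any], override: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of base where values from override win, recursively."""
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are mappings: merge them key by key.
            merged[key] = deep_merge(merged[key], value)
        else:
            # Otherwise the override simply replaces the base value.
            merged[key] = copy.deepcopy(value)
    return merged


def merge_with_precedence(
    cli_defaults: Dict[str, Any],
    file_values: Dict[str, Any],
    cli_user: Dict[str, Any],
) -> Dict[str, Any]:
    # default cli args values < config file values < user provided cli args
    return deep_merge(deep_merge(cli_defaults, file_values), cli_user)


cli_defaults = {"model": {"hidden": 20}, "bsz": 32}
file_values = {"model": {"hidden": 100}, "random_value": "hello"}
cli_user = {"bsz": 64}  # as if the script was run with --bsz 64
merged = merge_with_precedence(cli_defaults, file_values, cli_user)
```

Here model.hidden comes from the file, bsz from the command line, and random_value is carried over from the file because nothing overrides it.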


Parameters:

  • parser (ArgumentParser): The argument parser you want to use
  • config_file (Union[str, IO]): Configuration file name or file descriptor
  • args (Optional[List[str]]): Optional input sys.argv style args. Useful for testing. Use this only for testing. By default it uses sys.argv[1:]

Returns:

  • Union[omegaconf.listconfig.ListConfig, omegaconf.dictconfig.DictConfig]: The parsed configuration as an OmegaConf DictConfig object


>>> import argparse
>>> import io
>>> from slp.config.config_parser import parse_config
>>> mock_config_file = io.StringIO('''
... model:
...   hidden: 100
... ''')
>>> parser = argparse.ArgumentParser("My cool model")
>>> parser.add_argument("--hidden", dest="model.hidden", type=int, default=20)
>>> cfg = parse_config(parser, mock_config_file)
{'model': {'hidden': 100}}
>>> type(cfg)
<class 'omegaconf.dictconfig.DictConfig'>
>>> cfg = parse_config(parser, mock_config_file, args=["--hidden", "200"])
{'model': {'hidden': 200}}
>>> mock_config_file = io.StringIO('''
... random_value: hello
... ''')
>>> cfg = parse_config(parser, mock_config_file)
{'model': {'hidden': 20}, 'random_value': 'hello'}
Source code in slp/config/
def parse_config(
    parser: argparse.ArgumentParser,
    config_file: Optional[Union[str, IO]],
    args: Optional[List[str]] = None,
    include_none: bool = False,
) -> Union[ListConfig, DictConfig]:
    """parse_config Parse a provided YAML config file and command line args and merge them

    During experimentation we want ideally to have a configuration file with the model and training configuration,
    but also be able to run quick experiments using command line args.
    This function allows you to double dip, by overriding values in a YAML config file through user provided command line arguments.

    The precedence for merging is as follows
       * default cli args values < config file values < user provided cli args


       * if you don't include a value in your configuration it will take the default value from the argparse arguments
       * if you provide a cli arg (e.g. run the script with --bsz 64) it will override the value in the config file

    Note we use an extended OmegaConf instance to achieve this (see slp.config.omegaconf.OmegaConf)

SPECIAL_TOKENS Special Tokens for NLP applications

Default special tokens values and indices (compatible with BERT):

* [PAD]: 0
* [MASK]: 1
* [UNK]: 2
* [BOS]: 3
* [EOS]: 4
* [CLS]: 5
* [SEP]: 6
* [PAUSE]: 7
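
One way to hold such a token table in code is an enumeration whose definition order gives the indices. The class below is a sketch with a hypothetical name (SpecialTokensSketch) and a hypothetical to_index helper; it mirrors the list above but is not the slp definition.

```python
from enum import Enum
from typing import Dict


class SpecialTokensSketch(Enum):
    """Hypothetical container; members are listed in index order."""

    PAD = "[PAD]"
    MASK = "[MASK]"
    UNK = "[UNK]"
    BOS = "[BOS]"
    EOS = "[EOS]"
    CLS = "[CLS]"
    SEP = "[SEP]"
    PAUSE = "[PAUSE]"

    @classmethod
    def to_index(cls) -> Dict[str, int]:
        # Enum iteration follows definition order, so enumerate() yields 0..7.
        return {token.value: index for index, token in enumerate(cls)}


token_to_index = SpecialTokensSketch.to_index()
```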


OmegaConfExtended Extended OmegaConf class, to include argparse style CLI arguments

Unfortunately, the original authors are not interested in providing integration with argparse, so we have to get by with this extension.

from_argparse(parser, args=None, include_none=False) staticmethod

from_argparse Static method to convert argparse arguments into OmegaConf DictConfig objects

We parse the command line arguments and separate the user provided values and the default values. This is useful for merging with a config file.
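
One possible way to tell user provided values apart from defaults with plain argparse is sketched below. It is not necessarily how from_argparse does it: split_cli_args is a hypothetical helper, it returns flat dictionaries rather than DictConfig objects, and it assumes a parser without required arguments or append-style actions. Dropping user provided keys from the defaults is also a choice made for this sketch.

```python
import argparse
from typing import Any, Dict, List, Optional, Tuple

_UNSET = object()  # sentinel marking "not provided on the command line"


def split_cli_args(
    parser: argparse.ArgumentParser, args: Optional[List[str]] = None
) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Return (user provided values, remaining default values)."""
    # Parsing an empty argument list yields every destination with its default.
    defaults = vars(parser.parse_args([]))
    # argparse only fills in a default when the attribute is missing, so a
    # namespace pre-filled with the sentinel keeps untouched options marked.
    marked = argparse.Namespace(**{dest: _UNSET for dest in defaults})
    parsed = vars(parser.parse_args(args, namespace=marked))
    user = {dest: value for dest, value in parsed.items() if value is not _UNSET}
    remaining = {dest: value for dest, value in defaults.items() if dest not in user}
    return user, remaining


parser = argparse.ArgumentParser("My cool model")
parser.add_argument("--hidden", dest="model.hidden", type=int, default=20)
parser.add_argument("--lr", type=float, default=1e-3)

user, remaining = split_cli_args(parser, ["--hidden", "100"])
```

Comparing parsed values against the defaults would not be enough, because a user may explicitly pass a value that equals the default; the sentinel still records that case as user provided.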


Parameters:

  • parser (ArgumentParser): Parser for argparse arguments
  • args (Optional[List[str]]): Optional input sys.argv style args. Useful for testing. Use this only for testing. By default it uses sys.argv[1:]

Returns:

  • Tuple[omegaconf.dictconfig.DictConfig, omegaconf.dictconfig.DictConfig]: (user provided cli args, default cli args) as a tuple of omegaconf.DictConfigs


>>> import argparse
>>> from slp.config.omegaconf import OmegaConfExtended
>>> parser = argparse.ArgumentParser("My cool model")
>>> parser.add_argument("--hidden", dest="model.hidden", type=int, default=20)
>>> user_provided_args, default_args = OmegaConfExtended.from_argparse(parser, args=["--hidden", "100"])
>>> user_provided_args
{'model': {'hidden': 100}}
>>> default_args
>>> user_provided_args, default_args = OmegaConfExtended.from_argparse(parser)
>>> user_provided_args
>>> default_args
{'model': {'hidden': 20}}
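
The outputs above show a dotted dest such as model.hidden ending up as a nested key. A minimal sketch of that mapping with plain dictionaries follows; nest_dotted is a hypothetical helper, and it assumes no key is used both as a leaf and as a parent.

```python
import argparse
from typing import Any, Dict


def nest_dotted(flat: Dict[str, Any]) -> Dict[str, Any]:
    """Turn {'model.hidden': 100} into {'model': {'hidden': 100}}."""
    nested: Dict[str, Any] = {}
    for dotted_key, value in flat.items():
        *parents, leaf = dotted_key.split(".")
        node = nested
        for part in parents:
            # Create intermediate dictionaries on demand.
            node = node.setdefault(part, {})
        node[leaf] = value
    return nested


parser = argparse.ArgumentParser("My cool model")
parser.add_argument("--hidden", dest="model.hidden", type=int, default=20)
parser.add_argument("--bsz", type=int, default=32)

flat_args = vars(parser.parse_args(["--hidden", "100"]))
nested_args = nest_dotted(flat_args)
```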
Source code in slp/config/
def from_argparse(
    parser: argparse.ArgumentParser,
    args: Optional[List[str]] = None,
    include_none: bool = False,
) -> Tuple[DictConfig, DictConfig]:
    """from_argparse Static method to convert argparse arguments into OmegaConf DictConfig objects

    We parse the command line arguments and separate the user provided values and the default values.
    This is useful for merging with a config file.

from_yaml(file_) staticmethod

Alias for OmegaConf.load. OmegaConf.from_yaml got removed at some point, so this brings it back.


Parameters:

  • file_ (Union[str, pathlib.Path, IO[Any]]): file to load or file descriptor

Returns:

  • Union[omegaconf.dictconfig.DictConfig, omegaconf.listconfig.ListConfig]: The loaded configuration

Source code in slp/config/
def from_yaml(
    file_: Union[str, pathlib.Path, IO[Any]]
) -> Union[DictConfig, ListConfig]:
    """Alias for OmegaConf.load
    OmegaConf.from_yaml got removed at some point. Bring it back

