All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog.

- Fixed `num_training_steps` for lightning 1.7.
- Changed all static methods `add_*_args` to standard form `add_argparse_args`.
- Deprecated strategies based on DataParallel as in `pytorch-lightning` and added MPS accelerator.
- Fixed deprecated classes in lightning 1.7.
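
As a rough illustration of the `add_argparse_args` convention mentioned above, the sketch below shows a static method that registers a component's options on a shared `argparse.ArgumentParser` and returns nothing. The class `ExampleOptimizerConfig`, the helper `build_parser` and the flag names are hypothetical and are not taken from this library.

```python
import argparse


class ExampleOptimizerConfig:
    """Hypothetical component, used only to illustrate the convention."""

    @staticmethod
    def add_argparse_args(parser: argparse.ArgumentParser) -> None:
        # Register this component's options on the shared parser.
        # The flag names are assumptions made for this example.
        parser.add_argument("--learning_rate", type=float, default=1e-3)
        parser.add_argument("--weight_decay", type=float, default=0.0)


def build_parser() -> argparse.ArgumentParser:
    # Each component adds its own arguments to one parser.
    parser = argparse.ArgumentParser()
    ExampleOptimizerConfig.add_argparse_args(parser)
    return parser
```

Calling `build_parser().parse_args(["--learning_rate", "0.01"])` would then yield a namespace holding both options, with `weight_decay` left at its default.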

- Moved `pre_trained_dir` hyperparameter from `Defaults` to `TransformersModelCheckpointCallback`.
- Fixed `JsonboardLogger` with `pytorch-lightning>=1.6`.

- Fixed steps computation when `max_steps` is not provided by the user.
- Added `JsonboardLogger`.
- Added some tests for automatic steps computation with `deepspeed`.
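
To make the idea of automatic steps computation concrete, here is a minimal sketch of how the number of optimizer steps could be estimated when `max_steps` is not set. It assumes the count depends only on dataset size, batch size, number of devices, gradient accumulation and epochs. The function `estimate_training_steps` and its parameters are hypothetical, and the library's actual logic may differ.

```python
def _ceil_div(a: int, b: int) -> int:
    # Integer ceiling division, avoiding floating point rounding.
    return (a + b - 1) // b


def estimate_training_steps(
    dataset_size: int,
    batch_size: int,
    num_devices: int = 1,
    accumulate_grad_batches: int = 1,
    max_epochs: int = 1,
) -> int:
    """Hypothetical helper: estimate optimizer steps when max_steps is unset."""
    # Samples assigned to each device in one epoch (assumes even sharding).
    samples_per_device = _ceil_div(dataset_size, num_devices)
    # Batches per device; the last partial batch is kept (drop_last=False).
    batches_per_epoch = _ceil_div(samples_per_device, batch_size)
    # One optimizer step every `accumulate_grad_batches` batches.
    steps_per_epoch = _ceil_div(batches_per_epoch, accumulate_grad_batches)
    return steps_per_epoch * max_epochs
```

Using integer ceiling division at every stage keeps the estimate exact for the assumed setup, which is the kind of detail that matters when a scheduler depends on the total number of steps.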

- Fixed `TransformersMapDataset` parameters and adapter loading.
- Removed `CompressedDataModule`.
- Added `RichProgressBar` with `global_step` logging.
- Replaced the deprecated `transformers` `AdamW` inside optimizers with the `torch` implementation.
- Fixed typos.

- Removed update `TransformersModelCheckpointCallback`.
- `TransformersModel.num_training_steps` is now a function and not a property anymore + fix.
- Updated tests to use the new `accelerator` and `strategy` signature for defining the training hardware to be used.
- Fixed check on shuffle in `SuperDataModule`.
- Completely removed the metrics package; all metrics are now available in the `torchmetrics` library.

- Fixed package publication.

- Added `trainer` as second positional argument of every DataModule.
- Renamed `MapDataset` to `TransformersMapDataset`.
- Fixed typo about default shuffling in `SuperDataModule` and `CompressedDataModule`.

- Added `SortingLanguageModeling` technique and tests.
- Added `SwappingLanguageModeling` technique and tests.
- Added `add_argparse_args` method to `SuperAdapter` to allow adding parameters to the CLI.
- Fixed a typo that prevented `AdapterDataModule` from receiving the `collate_fn` argument.
- Fixed typos in `imports`.
- Refactored `datamodules` section.

- Added `get_dataset` method to `AdaptersDataModule` to facilitate creation of datasets from adapters.
- Dropped support for `drop_last` in every dataloader: lightning uses `False` everywhere by default.
- Fixed `TransformersModel.num_training_steps`, which in some cases returned slightly wrong numbers due to rounding.
- Fixed `whole_words_tail_mask` in `language_modeling`, which was not working correctly.
- Improved testing of `models` and `language_models`.

- Added tests for `optimizers` package.
- Fixed some imports.

- Fixed some calls to super method in optimizers and schedulers.

- Fixed `metrics` package imports and added tests.

- Added `LineAdapter` to read files line by line.
- Every `add_*_specific_args` method should now return nothing.
- Added `predict` capability to `AdaptersDataModule`.
- Added `predict` capability to `CompressedDataModule`.
- Added `do_predict()` and `predict_dataloader()` to `SuperDataModule`.
- Added `do_preprocessing` init argument to `MapDataset` and `TransformersIterableDataset` to optionally skip the preprocessing function defined in the `Adapter`.
- Added check over tokenizer type in `whole_word_tails_mask()`.
- Added functions `get_optimizer`, `get_scheduler`, `num_training_steps` and corresponding CLI parameters to `TransformersModel` to allow for more flexible definition of optimizers and schedulers.
- Added optimizer wrappers to be instantiated through CLI parameters. You can still use your own optimizer in `configure_optimizers` without problems.
- Added scheduler wrappers to be instantiated through CLI parameters. You can still use your own scheduler in `configure_optimizers` without problems.
- (Re)Added metrics package with `HitRate`. However, this will likely be moved to `torchmetrics` in the next releases.
- Changed `hparams` attribute of every class (`models`, `adapters`, `datamodules`, `optimizers`, `schedulers`, `callbacks` and `datasets`) to `hyperparameters` to avoid conflict with new lightning `hparams` getters and setters.
- Changed logic of `TransformersModelCheckpointCallback` since the training loop has changed in `pytorch-lightning` v1.4.
- Removed `TransformersAdapter` because it was too specific and useless.
- General refactoring of classes. Cleaned and removed unused imports. Also refactored some tests.

- Added `CompressedDataModule` based on `CompressedDataset`.
- Added `CompressedDataset` based on `CompressedDictionary`.
- Removed `IterableDataset`.
- Metrics have been moved to the `torchmetrics` library (#81).
- Removed losses package because it has been empty for months.

- Language models do not modify `inputs` anymore (#74).
- All `Language Models` now have a generic `probability` parameter (signature of all language models has been updated).
- Improved efficiency of `ElectraAdamW`.
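
The two language modeling changes above (inputs are no longer modified, and a generic `probability` parameter is accepted) can be pictured with a small sketch. It uses plain Python lists rather than tensors. The function `mask_tokens`, its arguments and the `-100` ignore label are assumptions made for illustration, not this library's API.

```python
import random
from typing import List, Tuple


def mask_tokens(
    input_ids: List[int],
    mask_token_id: int,
    probability: float = 0.15,
    seed: int = 0,
) -> Tuple[List[int], List[int]]:
    """Hypothetical masking routine; returns (masked_ids, labels)."""
    rng = random.Random(seed)
    # Work on a copy so the caller's list is left untouched.
    masked = list(input_ids)
    # -100 marks positions that a loss would ignore (an assumption here).
    labels = [-100] * len(input_ids)
    for i, token in enumerate(input_ids):
        # Mask each position independently with the given probability.
        if rng.random() < probability:
            masked[i] = mask_token_id
            labels[i] = token
    return masked, labels
```

Returning new lists instead of editing `input_ids` in place means the same batch can safely be reused, for example by another masking technique.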