Update all non-major dependencies - abandoned #198
Open
renovate[bot] wants to merge 2 commits into
Autoclosing Skipped
This PR has been flagged for autoclosing. However, it is being skipped due to the branch being already modified. Please close/delete it manually or report a bug if you think this is in error.
Edited/Blocked Notification
Renovate will not automatically rebase this PR, because it does not recognize the last commit author and assumes somebody else may have edited the PR. You can manually request rebase by checking the rebase/retry box above.
This PR contains the following updates:
- v3.2.0 -> v3.4.0
- 3.8.0 -> 3.9.6
- 5.6.4 -> 5.9.3
- ==4.3.3 -> ==4.9.2
Release Notes
pre-commit/pre-commit-hooks
v3.4.0
Features
- file-contents-sorter: Add `--unique` argument
- check-vcs-permalinks: Add `--additional-github-domain` option
- `destroyed-symlinks` to detect unintentional symlink-breakages on windows.
v3.3.0
Features
- file-contents-sorter: add `--ignore-case` option for case-insensitive sorting
- check-added-large-files: add `--enforce-all` option to check non-added files as well
- fix-byte-order-marker: new hook which fixes UTF-8 byte-order marker.
Deprecations
- `check-byte-order-marker` is now deprecated for `fix-byte-order-marker`
timothycrosley/isort
v5.9.3
- `--from-first` CLI flag shouldn't take any arguments.
v5.9.2
- `isort --check --atomic` against Cython files.
- `__init__.py` files during placement.
v5.9.1
v5.9.0
- `__pypackages__` directories by default.
- `reverse_sort` when `force_sort_within_sections` is true PyCQA/isort#1726): isort ignores reverse_sort when force_sort_within_sections is true.
Goal Zero (Tickets related to aspirational goal of achieving 0 regressions for remaining 5.0.0 lifespan):
v5.8.0
- `-j`) now defaults to number of CPU cores if no value is provided.
- `--overwrite-in-place` to ensure same file handle is used after sorting.
- `--extend-skip` and `--extend-skip-glob`.
v5.7.0
- `isort.file`.
huggingface/transformers
v4.9.2
v4.9.2: Patch release
v4.9.1
v4.9.1: Patch release
Fix barrier for SM distributed #12853 (@sgugger)
v4.9.0
v4.9.0: TensorFlow examples, CANINE, tokenizer training, ONNX rework
ONNX rework
This version introduces a new package, `transformers.onnx`, which can be used to export models to ONNX. Contrary to the previous implementation, this approach is meant as an easily extendable package where users may define their own ONNX configurations and export the models they wish to export.
CANINE model
Four new models are released as part of the CANINE implementation: `CanineForSequenceClassification`, `CanineForMultipleChoice`, `CanineForTokenClassification` and `CanineForQuestionAnswering`, in PyTorch.
The CANINE model was proposed in CANINE: Pre-training an Efficient Tokenization-Free Encoder for Language Representation by Jonathan H. Clark, Dan Garrette, Iulia Turc, John Wieting. It’s among the first papers that train a Transformer without using an explicit tokenization step (such as Byte Pair Encoding (BPE), WordPiece, or SentencePiece). Instead, the model is trained directly at a Unicode character level. Training at a character level inevitably comes with a longer sequence length, which CANINE solves with an efficient downsampling strategy, before applying a deep Transformer encoder.
Compatible checkpoints can be found on the Hub: https://huggingface.co/models?filter=canine
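To make "trained directly at a Unicode character level" concrete, here is a minimal sketch using only the Python standard library. It is an illustration of the idea, not CANINE's actual preprocessing: the function names are hypothetical, and special tokens and the downsampling step are not shown. Whitespace splitting is used only as a crude stand-in for a subword tokenizer.

```python
from typing import List


def encode_as_code_points(text: str) -> List[int]:
    """Map each character to its Unicode code point; no vocabulary file is needed."""
    return [ord(ch) for ch in text]


def decode_code_points(ids: List[int]) -> str:
    """Invert encode_as_code_points."""
    return "".join(chr(i) for i in ids)


text = "Tokenization-free encoders read characters."
char_ids = encode_as_code_points(text)

# Crude stand-in for a word/subword tokenizer, only to compare sequence lengths.
word_pieces = text.split()

# The character-level sequence is much longer than the word-level one,
# which is the cost the release notes say CANINE addresses with downsampling.
print(len(char_ids), len(word_pieces))
```

The round trip through `decode_code_points` is lossless for any string, which is what removes the need for an explicit tokenization step in this toy setting.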
Tokenizer training
This version introduces a new method to train a tokenizer from scratch based on an existing tokenizer configuration.
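A rough sketch of how such training might be wired up, assuming the method in question is `train_new_from_iterator(text_iterator, vocab_size)` on fast tokenizers (the release notes above do not name it). The helpers `batch_iterator` and `retrain_tokenizer` are hypothetical names introduced here for illustration; the block itself does not import `transformers`, so any object exposing that method can be passed in.

```python
from typing import Iterator, List


def batch_iterator(texts: List[str], batch_size: int = 1000) -> Iterator[List[str]]:
    """Yield the corpus as successive batches of strings."""
    for start in range(0, len(texts), batch_size):
        yield texts[start:start + batch_size]


def retrain_tokenizer(old_tokenizer, texts: List[str], vocab_size: int):
    """Train a new tokenizer that reuses old_tokenizer's configuration.

    Assumption: old_tokenizer exposes
    train_new_from_iterator(text_iterator, vocab_size).
    """
    return old_tokenizer.train_new_from_iterator(batch_iterator(texts), vocab_size)


# Usage (requires `transformers` and network access; "gpt2" is only an
# example checkpoint name, not one taken from the release notes):
# from transformers import AutoTokenizer
# old = AutoTokenizer.from_pretrained("gpt2")
# new = retrain_tokenizer(old, ["some text", "more text"], vocab_size=5000)
```

Feeding the corpus in batches keeps memory use bounded when the texts come from a large dataset rather than an in-memory list.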
TensorFlow examples
The `TFTrainer` is now entering deprecation - and it is replaced by `Keras`. With version v4.9.0 comes the end of a long rework of the TensorFlow examples, for them to be more Keras-idiomatic, clearer, and more robust.
TensorFlow implementations
HuBERT is now implemented in TensorFlow:
Breaking changes
When `load_best_model_at_end` was set to `True` in the `TrainingArguments`, having a different `save_strategy` and `eval_strategy` was accepted but the `save_strategy` was overwritten by the `eval_strategy` (the option to keep track of the best model needs to make sure there is an evaluation each time there is a save). This led to a lot of confusion with users not understanding why the script was not doing what it was told, so this situation will now raise an error indicating to set `save_strategy` and `eval_strategy` to the same values, and in the case that value is `"steps"`, `save_steps` must be a round multiple of `eval_steps`.
General improvements and bugfixes
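As an aside on the `load_best_model_at_end` breaking change described above, the rule can be sketched as a standalone check. This is an illustration of the stated rule only, not the library's implementation: the function name `check_best_model_settings`, its defaults, and the error messages are hypothetical.

```python
def check_best_model_settings(
    load_best_model_at_end: bool,
    save_strategy: str,
    eval_strategy: str,
    save_steps: int = 500,
    eval_steps: int = 500,
) -> None:
    """Raise ValueError if the settings violate the rule from the release notes."""
    if not load_best_model_at_end:
        return  # the rule only applies when the best model is tracked

    if save_strategy != eval_strategy:
        raise ValueError(
            "load_best_model_at_end requires save_strategy and eval_strategy "
            f"to match, got {save_strategy!r} and {eval_strategy!r}"
        )

    if save_strategy == "steps":
        if save_steps <= 0 or eval_steps <= 0:
            raise ValueError("save_steps and eval_steps must be positive")
        # Every save must coincide with an evaluation.
        if save_steps % eval_steps != 0:
            raise ValueError(
                f"save_steps ({save_steps}) must be a round multiple of "
                f"eval_steps ({eval_steps})"
            )


# Accepted: saves happen at every second evaluation.
check_best_model_settings(True, "steps", "steps", save_steps=1000, eval_steps=500)
```

Under this sketch, mixing `"epoch"` with `"steps"`, or using `save_steps=750` with `eval_steps=500`, raises an error instead of silently overriding one setting.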
- `--log_level` feature #12365 (@bhadreshpsavani)
- `print` statement with `logger.info` in QA example utils #12368 (@bhadreshpsavani)
- `einsum` in Albert's attention computation #12394 (@mfuntowicz)
- `push_to_hub` #12391 (@patrickvonplaten)
- `Repository` import to the FLAX example script #12501 (@LysandreJik)
- `model_kwargs` when loading a model in `pipeline()` #12449 (@aphedges)
- `_mask_hidden_states` to avoid double masking #12692 (@mfuntowicz)
- `config.mask_feature_prob > 0` #12705 (@mfuntowicz)
- `list` type of `additional_special_tokens` in `special_token_map` #12759 (@SaulLu)
- `cls` and checkpoint #12619 (@europeanplaice)
- `datasets_modules` ImportError with Ray Tune #12749 (@Yard1)
- `save_steps=0|None` and `logging_steps=0` #12796 (@stas00)
v4.8.2
Patch release: v4.8.2
v4.8.1
v4.8.1: Patch release
v4.8.0
v4.8.0 Integration with the Hub and Flax/JAX support
Integration with the Hub
Our example scripts and Trainer are now optimized for publishing your model on the Hugging Face Hub, with Tensorboard training metrics, and an automatically authored model card which contains all the relevant metadata, including evaluation results.
Trainer Hub integration
Use --push_to_hub to create a model repo for your training and it will be saved with all relevant metadata at the end of the training.
Other flags are:
- `push_to_hub_model_id` to control the repo name
- `push_to_hub_organization` to specify an organization
Visualizing Training metrics on huggingface.co (based on Tensorboard)
By default if you have `tensorboard` installed the training scripts will use it to log, and the logging traces folder is conveniently located inside your model output directory, so you can push them to your model repo by default.
Any model repo that contains Tensorboard traces will spawn a Tensorboard server, which makes it very convenient to see how the training went! This Hub feature is in Beta so let us know if anything looks weird :)
See this model repo
Model card generation
The model card contains info about the datasets used, the eval results, ...
Many users were already adding their eval results to their model cards in markdown format, but this is a more structured way of adding them which will make it easier to parse and e.g. represent in leaderboards such as the ones on Papers With Code!
We use a format specified in collaboration with [PaperswithCode](https://github.com/huggingface/huggingface_hub/blame/main/modelcard.md), see also this repo.
Model, tokenizer and configurations
All models, tokenizers and configurations have a revamped `push_to_hub()` method as well as a `push_to_hub` argument in their `save_pretrained()` method. The workflow of this method is changed a bit to be more like git, with a local clone of the repo in a folder of the working directory, to make it easier to apply patches (use `use_temp_dir=True` to clone in temporary folders for the same behavior as the experimental API).
Flax/JAX support
Flax/JAX is becoming a fully supported backend of the Transformers library with more models having an implementation in it. BART, CLIP and T5 join the already existing models; find the whole list here.
General improvements and bug fixes
Configuration
📅 Schedule: At any time (no schedule defined).
🚦 Automerge: Disabled by config. Please merge this manually once you are satisfied.
♻ Rebasing: Renovate will not automatically rebase this PR, because other commits have been found.
👻 Immortal: This PR will be recreated if closed unmerged. Get config help if that's undesired.
This PR has been generated by WhiteSource Renovate. View repository job log here.