Use git lfs to pull only specific folders or files
Some Git LFS repositories hold files so large that you may only want to download a few folders or files from them.
One example is the WenetSpeech4TTS dataset, hosted at https://huggingface.co/datasets/Wenetspeech4TTS/WenetSpeech4TTS/tree/main
1. Clone with GIT_LFS_SKIP_SMUDGE=1
To clone the repository without the large files, fetching just their pointers, run:
GIT_LFS_SKIP_SMUDGE=1 git clone git@hf.co:datasets/Wenetspeech4TTS/WenetSpeech4TTS
The directory structure is as follows:
WenetSpeech4TTS
|
|____ Premium
| |__ Premium_md5check.txt
| |__ WenetSpeech4TTS_Premium_0.tar.gz
| |__ ...
|
|____ Standard
| |__ Standard_md5check.txt
| |__ WenetSpeech4TTS_Standard_0.tar.gz
| |__ ...
|
|____ Basic
| |__ Basic_md5check.txt
| |__ WenetSpeech4TTS_Basic_0.tar.gz
| |__ WenetSpeech4TTS_Basic_1.tar.gz
| |__ ...
|
|____ Rest
| |__ Rest_md5check.txt
| |__ WenetSpeech4TTS_Rest_0.tar.gz
| |__ ...
|
|____ Filelists
| |__ Premium.lst
| |__ ...
|
|____ DNSMOS
| |__ Premium_DNSMOS.lst
| |__ ...
|
|____ Testset
| |__...
|
|____ README.md
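After a clone with GIT_LFS_SKIP_SMUDGE=1, it can be useful to confirm that a large file really is only a pointer. The following is a minimal sketch, not part of Git LFS itself: `is_lfs_pointer` is a hypothetical helper name, and it assumes the standard pointer format, a small text file whose first line starts with `version https://git-lfs.github.com/spec/v1`.

```shell
# Hypothetical helper: succeed (exit status 0) if the file looks like a
# Git LFS pointer rather than real content.
# Assumption: pointer files start with the standard LFS spec line.
is_lfs_pointer() {
    # Read only the first bytes so that huge archives are never scanned in full.
    head -c 200 "$1" 2>/dev/null | head -n 1 \
        | grep -q '^version https://git-lfs.github.com/spec/v1'
}

# Example usage inside the clone (path taken from the tree above):
#   is_lfs_pointer Standard/WenetSpeech4TTS_Standard_0.tar.gz && echo "pointer only"
```

Running `git lfs ls-files` inside the clone gives a similar overview: it marks fully downloaded objects with `*` and pointer-only files with `-`.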
2. Pull data with --include option
According to the git-lfs help output:
$ git lfs pull -h
git lfs pull [options] [<remote>]
Download Git LFS objects for the currently checked out ref, and update
the working copy with the downloaded content if required.
This is equivalent to running the following 2 commands:
git lfs fetch [options] [<remote>]
git lfs checkout
Options:
* -I <paths> --include=<paths>:
Specify lfs.fetchinclude just for this invocation; see "Include and exclude"
* -X <paths> --exclude=<paths>:
Specify lfs.fetchexclude just for this invocation; see "Include and exclude"
You can specify the --include or -I flag (they are aliases of each other) to pull only specific files or paths.
For example, to pull only the file "WenetSpeech4TTS_Standard_0.tar.gz" in the "Standard" folder, run:
git lfs pull --include "Standard/WenetSpeech4TTS_Standard_0.tar.gz"
Or, to pull only files with the ".tar.gz" extension, run:
git lfs pull --include "*.tar.gz"
When either --include or --exclude is given, Git LFS pulls only files that match an include pattern and do not match any exclude pattern. For more information on these filters, see the "Include and exclude" section of the git-lfs-fetch documentation.
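The same option can also cover a whole folder, or several paths at once, since --include accepts a comma-separated list of patterns. Below is a minimal sketch under stated assumptions: `lfs_pull_only` is a hypothetical wrapper name, and the `Folder/**` pattern is assumed to match everything under that folder (pattern matching details can differ between Git LFS versions, so test it on a small folder first).

```shell
# Hypothetical wrapper: pull only the given paths or patterns in one call.
# It joins its arguments with commas, the separator --include expects.
lfs_pull_only() {
    local IFS=,                      # "$*" now joins the arguments with ","
    git lfs pull --include="$*"
}

# Example usage inside the clone (paths taken from the tree above):
#   lfs_pull_only "Filelists/**"
#   lfs_pull_only "Filelists/**" "Standard/WenetSpeech4TTS_Standard_0.tar.gz"
```

The second call would expand to `git lfs pull --include="Filelists/**,Standard/WenetSpeech4TTS_Standard_0.tar.gz"`; files outside those patterns stay as pointers.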