This guide provides step-by-step instructions to set up the ImageNet dataset for your machine learning projects using two different sources: Hugging Face and Kaggle.
-
Clone the ImageNet Dataset Repository
Use the following git command to clone the ImageNet dataset. You will be prompted to enter a password; use your Hugging Face access token as the password.
git clone https://HUGGINGFACE_ACCESS_TOKEN@huggingface.co/datasets/imagenet-1k
Replace HUGGINGFACE_ACCESS_TOKEN with your actual Hugging Face access token.
-
Install pigz
sudo apt-get install pigz
-
Decompress the Dataset
bash ./decompress.bash
-
Organize Dataset Files
Run the Python script to create the necessary folders and move the dataset files into the folders for their respective categories.
python create_folders_and_mv.py
Follow the detailed guide on Using the ImageNet Dataset with PyTorch.
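To make the "Organize Dataset Files" step more concrete, here is a minimal sketch of the kind of work such a script might do. It is not the contents of create_folders_and_mv.py. It assumes that each decompressed image file name ends with its synset ID after the last underscore (for example, something like n01440764_10026_n01440764.JPEG); the function names and that naming convention are assumptions made for illustration, so check them against your own files before relying on it.

```python
import shutil
from pathlib import Path
from typing import Dict


def synset_from_filename(name: str) -> str:
    """Return the token after the last underscore of the file stem.

    Assumption: that token is the class (synset) ID, e.g. 'n01440764'.
    """
    stem = Path(name).stem
    return stem.rsplit("_", 1)[-1]


def organize_by_synset(src_dir: str, dst_dir: str) -> Dict[str, int]:
    """Move every .JPEG file in src_dir into dst_dir/<synset>/.

    Returns the number of files moved per synset.
    """
    counts: Dict[str, int] = {}
    src = Path(src_dir)
    dst = Path(dst_dir)
    # Materialize the listing first so that moving files does not
    # interfere with the directory iteration.
    for image in sorted(src.glob("*.JPEG")):
        synset = synset_from_filename(image.name)
        class_dir = dst / synset
        class_dir.mkdir(parents=True, exist_ok=True)
        shutil.move(str(image), str(class_dir / image.name))
        counts[synset] = counts.get(synset, 0) + 1
    return counts
```

A call such as organize_by_synset("extracted/train", "imagenet/train") would leave one folder per class under the destination; both paths are hypothetical placeholders.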
-
To set up the dataset from Kaggle instead, run:
bash setup_kaggle.bash
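Whichever source you use, it can help to sanity-check the result before training. The sketch below counts class folders and images in one split and reports anything unexpected. It assumes the setup leaves each split as a directory of per-class subfolders; that layout and the function names are assumptions rather than something the scripts above are confirmed to produce. ImageNet-1k has 1000 classes, with 1,281,167 training images and 50,000 validation images.

```python
from pathlib import Path
from typing import Dict, List, Optional

IMAGE_SUFFIXES = {".jpeg", ".jpg"}


def count_images_per_class(split_dir: str) -> Dict[str, int]:
    """Map each class subfolder name to the number of JPEG files inside it."""
    counts: Dict[str, int] = {}
    for class_dir in sorted(Path(split_dir).iterdir()):
        if class_dir.is_dir():
            counts[class_dir.name] = sum(
                1
                for f in class_dir.iterdir()
                if f.is_file() and f.suffix.lower() in IMAGE_SUFFIXES
            )
    return counts


def check_split(
    split_dir: str,
    expected_classes: int = 1000,
    expected_images: Optional[int] = None,
) -> List[str]:
    """Return a list of problems found; an empty list means the checks passed."""
    counts = count_images_per_class(split_dir)
    problems: List[str] = []
    if len(counts) != expected_classes:
        problems.append(
            f"expected {expected_classes} classes, found {len(counts)}"
        )
    empty = [name for name, n in counts.items() if n == 0]
    if empty:
        problems.append(f"{len(empty)} class folders contain no images")
    total = sum(counts.values())
    if expected_images is not None and total != expected_images:
        problems.append(f"expected {expected_images} images, found {total}")
    return problems
```

For example, check_split("imagenet/train", expected_images=1_281_167) and check_split("imagenet/val", expected_images=50_000) should both return an empty list on a complete copy; the paths here are hypothetical.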