Commented out the comments #4

Open
wants to merge 1 commit into
base: master
Commented out the comments
asaf-c committed Dec 1, 2019
commit 6ecc0789a7a458da49362ed6db43068e6a9b7a94
6 changes: 3 additions & 3 deletions archive/code/pullrobots.sh
@@ -27,7 +27,7 @@ if [[ $NOS =~ ^[0-9]+$ ]]; then
exit 1
fi

-#Get the current list of Alexa sites (Updated daily)
+# Get the current list of Alexa sites (Updated daily)
echo ""
echo "Downloading the top websites file…"
echo ""
@@ -39,10 +39,10 @@ rm *.zip
sed 's/.*,//g' top-1m.csv > tocomma.csv
sed 's/,//g' tocomma.csv > domains.txt

-Take a certain number of them to work on
+# Take a certain number of them to work on
head -n $NOS domains.txt > $DATE-top$NOS-domains.txt

-Pull the robots.txt file from each
+# Pull the robots.txt file from each
echo ""
echo "Downloading the robots.txt file for each site…"
echo ""