aurynn: ok. here we go: first part: for a in $( seq 1 34 ); do lynx -dump http://www.ted.com/index.php/talks/list/page/$a > $a.html; done
aurynn: it fetches lists of talks - 34 pages of them
stores each page as .html (which is misleading, as the lynx dump is plain text, not HTML)
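to avoid the misleading extension, a minimal variant of the same loop (same URL, same 34 pages, just a .txt name) could look like this - note the later steps would then have to glob *.txt instead of *.html:
# fetch the 34 list pages; lynx -dump output is plain text, so call the files .txt
for a in $( seq 1 34 ); do
    lynx -dump "http://www.ted.com/index.php/talks/list/page/$a" > "list-$a.txt"
done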
aurynn: 2nd part: cat *.html | perl -ne 'print if s{^\s*\d+\.\s+(http://www.ted.com/index.php/talks/[^/]+\.html)\s*$}{$1\n}' | sort | uniq > big.list
aurynn: it generates a list of all the talk pages.
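roughly the same list can be built with grep alone - a sketch; unlike the perl one-liner it picks up talk URLs anywhere in the dumps, not only on the numbered reference lines, so the result may differ slightly:
# pull every talk URL out of the dumps and de-duplicate
grep -ohE 'http://www\.ted\.com/index\.php/talks/[^/[:space:]]+\.html' *.html | sort -u > big.list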
aurynn: cat big.list | while read URL; do OUTPUT=$( echo $URL | sed 's#.*/#out/#;s#html$#txt#' ); lynx -dump "$URL" > "$OUTPUT"; echo $URL; done
it fetches every talk page and stores the dump as a .txt file under out/
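the sed substitution writes into out/, so that directory has to exist first; a sketch of the same loop with the mkdir and a small pause between requests (the sleep is my addition, not part of the original one-liner):
mkdir -p out
while read URL; do
    OUTPUT=$( echo "$URL" | sed 's#.*/#out/#;s#html$#txt#' )
    lynx -dump "$URL" > "$OUTPUT"
    echo "$URL"
    sleep 1   # not in the original; just to be gentle to ted.com
done < big.list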
for a in *.txt; do grep -q -E '\[[0-9]+\]Video to desktop \(Zipped MP4\)' $a || echo $a; done
this lists all .txt files which don't have a "Video to desktop" link (two talks; both seem to be someone singing)
i removed these .txt files
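the removal can also be folded into the same test - a sketch reusing the grep from above:
# delete every .txt that has no "Video to desktop (Zipped MP4)" link
for a in *.txt; do
    grep -q -E '\[[0-9]+\]Video to desktop \(Zipped MP4\)' "$a" || rm -- "$a"
done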
finally: for a in *.txt; do POS=$( grep -E '\[[0-9]+\]Video to desktop \(Zipped MP4\)' $a | sed 's/^.*\[//;s/\].*//' ); grep -E "^[[:space:]]*$POS\.[[:space:]]http" $a | sed "s/.*http/wget -O $a.zip http/;s/txt.zip/zip/"; done > runme.sh
this generates runme.sh, which wgets all the videos and stores each one as a .zip file
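then just run it:
bash runme.sh
# or make it executable first:
chmod +x runme.sh && ./runme.sh
adding wget's -c option to the generated lines would let interrupted downloads resume instead of starting over.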