michaelg/dev: links for common-crawl-extractor