Senior Web Crawler Engineer, Data
- BSc/MSc in Computer Science or a related field.
- 2+ years of experience with web crawling systems: designing, developing, and maintaining scalable web crawlers, with solid experience bypassing anti-crawling mechanisms using techniques such as custom headers, cookies, proxies, and simulated logins.
- Familiarity with web crawling frameworks such as Scrapy and Selenium is a huge plus.
- Experience with Spark, Hadoop YARN, or similar distributed computation frameworks is a plus.
- Large-scale web crawling platform – design, implement, and maintain scalable web crawlers that collect large-scale data from the web and ingest it into our platform. You will build the core modules and tech stack of the crawling platform and solve challenging problems such as bypassing anti-crawling measures.
- Crawler monitoring – design and implement our internal platform for monitoring all web crawlers, integrating crawler metrics with other monitoring data on our big data platform.
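For candidates wondering what the anti-crawling countermeasures above look like in practice, here is a minimal sketch of header and proxy rotation. All pool values and names (`USER_AGENTS`, `PROXIES`, `build_request_kwargs`) are hypothetical placeholders, not part of this role's actual stack:

```python
import itertools
import random

# Hypothetical pools; a real crawler would load these from config
# or a proxy-rotation service.
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36",
]
PROXIES = ["http://proxy1.example:8080", "http://proxy2.example:8080"]

# Round-robin over the proxy pool.
_proxy_cycle = itertools.cycle(PROXIES)

def build_request_kwargs(url: str) -> dict:
    """Return keyword arguments for an HTTP client call (e.g. requests.get)
    with a randomly chosen User-Agent and the next proxies in the pool."""
    return {
        "url": url,
        "headers": {"User-Agent": random.choice(USER_AGENTS)},
        "proxies": {"http": next(_proxy_cycle), "https": next(_proxy_cycle)},
        "timeout": 10,
    }
```

Production crawlers layer more on top of this (cookie persistence, simulated logins, per-domain rate limits), but the rotation pattern is the same.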