JSoup.connect Throws 403 Error While Apache.httpclient Is Able To Fetch The Content
I am trying to parse the HTML dump of any given page. I used HTML Parser and also tried JSoup for parsing. I found useful functions in Jsoup, but I am getting a 403 error while calling D
Solution 1:
The working solution is as follows (thanks to Angelo Neuschitzer for the reminder to post it as a solution):
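import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;
// Send a browser-like User-Agent; without it the server answered with 403.
Document doc = Jsoup.connect(url).userAgent("Mozilla").get();
// Select every <cite> element ("cite" is what HTML.Tag.CITE.toString() evaluates to).
Elements links = doc.getElementsByTag("cite");
for (Element link : links) {
    System.out.println(link.text());
}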
So, setting a browser-like userAgent does the trick :)
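For anyone wondering why the header matters, here is a minimal sketch that uses only JDK classes (no Jsoup). It starts a throwaway local server that answers 403 unless the User-Agent starts with "Mozilla", then requests it with and without that header. The server, its filtering rule, and the names `UserAgentDemo`, `startPickyServer` and `probe` are made up for illustration; real sites apply their own rules, so treat this as a picture of the mechanism rather than of any particular site.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URI;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public class UserAgentDemo {

    // Hypothetical stand-in for a picky site: 200 for "Mozilla..." agents, 403 otherwise.
    static HttpServer startPickyServer() throws Exception {
        // Port 0 lets the OS pick any free port.
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        server.createContext("/", exchange -> {
            String ua = exchange.getRequestHeaders().getFirst("User-Agent");
            boolean browserLike = ua != null && ua.startsWith("Mozilla");
            byte[] body = (browserLike ? "<cite>ok</cite>" : "Forbidden")
                    .getBytes(StandardCharsets.UTF_8);
            exchange.sendResponseHeaders(browserLike ? 200 : 403, body.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(body);
            }
        });
        server.start();
        return server;
    }

    // Requests the local server with the given User-Agent (null keeps the JDK default)
    // and returns the HTTP status code.
    static int probe(String userAgent) throws Exception {
        HttpServer server = startPickyServer();
        try {
            int port = server.getAddress().getPort();
            URL url = URI.create("http://127.0.0.1:" + port + "/").toURL();
            HttpURLConnection conn = (HttpURLConnection) url.openConnection();
            if (userAgent != null) {
                conn.setRequestProperty("User-Agent", userAgent);
            }
            int status = conn.getResponseCode();
            conn.disconnect();
            return status;
        } finally {
            server.stop(0);
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("default agent: " + probe(null));
        System.out.println("Mozilla agent: " + probe("Mozilla"));
    }
}
```

The first call goes out with the JDK's default agent string and is rejected by the sketch server, while the second one passes. That is the same effect `userAgent("Mozilla")` has in the Jsoup call above, assuming the real site filters on that header.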