Jan. 27th, 2017

dennisgorelik: (Default)
PostJobFree crawler found web page that causes fatal crash in AngleSharp parser:
using AngleSharp.Parser.Html;
.....
string pageHtml = LoadUrlContent("http://onestop.fiu.edu/financial-aid/loans/")
var parser = new HtmlParser();
var document = parser.Parse(pageHtml);
document.QuerySelectorAll("a"); // Fatal crash: "An unhandled exception of type 'System.StackOverflowException' occurred in AngleSharp.dll".

We cannot catch that exception and it simply restarts the whole process (PostJobFreeService Windows service).
That is very frustrating.

In development environment that crash is not always reproducible.
When we run code above in test - it just works.
But if we run the same code under Visual Studio debugger - it crashes with 'System.StackOverflowException'.

Update:
https://github.com/AngleSharp/AngleSharp/issues/523
AngleSharp library maintainers noticed that problematic page contains a lot of "<content /><content /><content /><content />" attributes.
view-source:http://onestop.fiu.edu/financial-aid/loans/

Obviously it is not an excuse to fail. Hopefully their latest build would fix the problem.

Profile

dennisgorelik: (Default)
Dennis Gorelik

July 2017

S M T W T F S
      1
2345678
9 101112 131415
16171819202122
23242526272829
3031     

Most Popular Tags

Style Credit

Expand Cut Tags

No cut tags
Page generated Jul. 20th, 2017 08:42 pm
Powered by Dreamwidth Studios