Jan. 27th, 2017

dennisgorelik: (Default)
PostJobFree crawler found web page that causes fatal crash in AngleSharp parser:
using AngleSharp.Parser.Html;
.....
string pageHtml = LoadUrlContent("http://onestop.fiu.edu/financial-aid/loans/")
var parser = new HtmlParser();
var document = parser.Parse(pageHtml);
document.QuerySelectorAll("a"); // Fatal crash: "An unhandled exception of type 'System.StackOverflowException' occurred in AngleSharp.dll".

We cannot catch that exception and it simply restarts the whole process (PostJobFreeService Windows service).
That is very frustrating.

In development environment that crash is not always reproducible.
When we run code above in test - it just works.
But if we run the same code under Visual Studio debugger - it crashes with 'System.StackOverflowException'.

Update:
https://github.com/AngleSharp/AngleSharp/issues/523
AngleSharp library maintainers noticed that problematic page contains a lot of "<content /><content /><content /><content />" attributes.
view-source:http://onestop.fiu.edu/financial-aid/loans/

Obviously it is not an excuse to fail. Hopefully their latest build would fix the problem.

Profile

dennisgorelik: (Default)
Dennis Gorelik

September 2017

S M T W T F S
      12
34567 8 9
1011 12131415 16
17181920212223
24252627282930

Most Popular Tags

Style Credit

Expand Cut Tags

No cut tags
Page generated Sep. 21st, 2017 08:47 am
Powered by Dreamwidth Studios