dennisgorelik: (Default)
Today the Moon's shadow is going to cross the US from the Pacific coast to the Atlantic coast in about 90 minutes.

It takes a passenger jet about 3.5 hours to fly the same distance (e.g. from San Francisco to New York).

We (in Ponte Vedra, FL) are going to see only a partial (~90%) solar eclipse.
San Francisco is going to get about 75% of the eclipse, roughly 90 minutes ahead of us.
dennisgorelik: (Default)
I think it's more likely that your "great programmers" simply understand the difference between the same functionality and accidentally similar functionality. The latter is where you have two use cases that are very similar, so you spend all this time deduplicating. Then one of the use cases changes... The correct response would be to duplicate the code again, because the two use cases are no longer similar. In reality, they should never have been combined in the first place. They weren't the same; they were only accidentally similar.
But instead what you usually see is minor tweaks to the common functions. Pass in a flag here, tweak the inputs there, add an if statement over yonder... And before you know it, it's all a terrible tangled mess that is full of branches and technical debt. The two use cases have the same functions, but don't even follow the same branches within the functions.
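
A minimal sketch of that difference, using hypothetical function names and formats that are not from the original discussion. The first two functions stay separate because their use cases only happened to look alike; the third shows the flag-driven "shared" version that the paragraph above warns about.

```python
# Hypothetical example: two use cases that were once identical by accident.

def format_invoice_total(amount: float) -> str:
    """Invoice use case: exact amount with two decimals."""
    return f"${amount:,.2f}"


def format_report_total(amount: float) -> str:
    """Report use case: later changed to show whole thousands."""
    return f"${amount / 1000:,.0f}K"


def format_total(amount: float, for_report: bool = False) -> str:
    """The tangled alternative: one function, one more flag per use case."""
    if for_report:
        return f"${amount / 1000:,.0f}K"
    return f"${amount:,.2f}"


print(format_invoice_total(1234.5))
print(format_report_total(1250000))
```

With the separate functions, a change to the report format cannot break invoices. With the flagged version, every new requirement adds another branch that both callers have to reason about.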
dennisgorelik: (Default)
Waterfall sightseeing in Yemen on Google Maps

1) Garbage.
2) Men's clothes.
3) Women's clothes.

Yemen GDP per Capita: $990/year
dennisgorelik: (Default)
A victim of propaganda beats a Kremlin propagandist. Russia, 21st century.

The attacker's page:

The full NTV episode:

The man who attacked Razvozzhaev is wearing a T-shirt with the inscription "Oplot". That is the name of an organization created in Ukraine that opposed "Euromaidan" and took a pro-Russian position.

Alexander Orlov, the detainee: "Listen, I am ready to offer an apology and monetary compensation. Let's make a deal. I didn't mean to. I was just walking along, he said something nasty to me, I swung my hand at him, and that's all."
dennisgorelik: (Default)
I was surprised today to find that very few people understand what falsifiability means and how to apply that useful evaluation tool.

Of course, I knew pretty well that an average person has no idea what Popper's criterion of falsifiability is.
But most of my online friends do not understand what "falsifiability" means either.

It is a pity, because the falsifiability criterion is a very powerful tool that allows us to quickly separate potentially useful theories from pseudo-scientific scams.

Hopefully my friends have a better understanding of what testability is (the meanings of "testability" and "falsifiability" overlap a lot).
dennisgorelik: (Default)
About half a year ago my dentist told me that there is decay below my tooth bridge, and therefore the bridge needs to be redone (at a cost of around $4,000). She insisted that tooth decay is not reversible.
When I asked her if a tooth can heal itself, she said that it is possible, but only if the decay has not reached below the enamel.
In the following months I started to pay more attention to flossing and rinsing around that problem area, and yesterday I went to another dentist for a cleaning and dental exam.

The new dentist found decay below a bridge and recommended replacing it (at a cost of $4,814). He insisted that tooth decay is not reversible. When I asked him if a tooth can heal itself, he said that it is possible, but only if the decay has not reached below the enamel.

The good news for me is that the new dentist did not find any problems with the bridge that my previous dentist wanted to re-do.
And my previous dentist did not find any significant problems with the bridge that the new dentist wants to fix.

In addition to replacing the bridge, the new dentist suggested pulling my 4 wisdom teeth ($236 each) and redoing a filling ($350).

The question is: how do I know which recommendation has merit and which is just a money grab?
dennisgorelik: (Default)
There will be no pension. No old age either, for that matter. Nor any jobs. There will be a virtual reality approved by Roskomnadzor and the Russian Orthodox Church, where you will be taking Stalingrad forever.
dennisgorelik: (Default)
Short names vs unique names
It is a good practice to use shorter method names, because long names are harder to read.
It is cleaner to call:

But then we end up with multiple "Save()" methods in different classes. For example:

Problem with non-unique names
If we search our codebase for "Save(" - we would find a lot of methods and method calls. Only some of them would relate to the functionality we actually want to research (for example, we may want to research where "Candidate Save()" functionality is used because we consider refactoring or deleting it).

Plain text search vs code references
Visual Studio can find all references to a specific method: right-click the method name and select "Find All References".
So the problem of non-unique method names is solved, right?
Not quite.
Visual Studio is not able to track method calls that are made from aspx and ashx files.
Visual Studio is also not able to find method references in comments.
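
For comparison, here is a rough sketch of the plain-text search that does see aspx files, ashx files, and comments. The function name `find_text` and the method name `SaveCandidate(` used in the note below are hypothetical illustrations, not names from our codebase.

```python
from pathlib import Path


def find_text(root, needle, extensions=(".cs", ".aspx", ".ashx")):
    """Return (path, line number, line) for every line that contains needle."""
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix.lower() in extensions:
            text = path.read_text(encoding="utf-8", errors="replace")
            for number, line in enumerate(text.splitlines(), start=1):
                if needle in line:
                    hits.append((str(path), number, line.strip()))
    return hits
```

Searching for a unique name such as `SaveCandidate(` would return only the relevant lines, including the ones inside aspx markup and comments. Searching for `Save(` would return every `Save()` in every class, which is the problem described above.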

ReSharper vs vanilla Visual Studio
ReSharper actually is able to find method references in aspx and ashx files, and even in comments. Until Visual Studio 2015 that worked fine. But as the ReSharper team and codebase aged, and Visual Studio switched to the new Roslyn compiler, the ReSharper team was not able to keep up and delivered only a barely working resource hog that is practically unusable with newer versions of Visual Studio (too slow).

Get rid of aspx and ashx files?
It is actually pretty easy to avoid ashx handlers and use standard C# classes that implement the IHttpHandler interface instead.
But what about aspx pages: can we get rid of them too and use only standard C# HttpHandlers?
If we could do that, then we would be able to rely on the "Find All References" feature again.
Unfortunately, getting rid of aspx pages is not that simple. We would have to reimplement a lot of functionality that aspx provides.
For example:
- Page PostBack support would be gone.
- Ability to nicely combine HTML code and aspx controls alongside each other would be gone.
- HTML syntax validation would be gone (no HTML syntax validation for C# strings in Visual Studio).

If it ain't broke - don't fix it
Even though converting existing ashx files into standard C# classes (where Visual Studio is able to track all method references) is a pretty straightforward operation, such a conversion is not without its own problems.
- Conversion takes developers' time.
- Code replacement could introduce silly bugs.
- Moving code from class to class makes navigating "svn blame" a little bit trickier.
So if an ashx handler has been working in the solution for many years already, does it make sense to touch it now?

The benefits of code refactoring
In spite of the "If it ain't broke - don't fix it" rule, cleaning up code is still needed. If we do not keep the code clean (do not delete unneeded parts and do not clear up confusing things such as hidden references), then our codebase becomes extremely hard to maintain. Fixing a bug would introduce other bugs. Features would be very hard to add without adding bugs.

It depends
There is no single solution that can be applied to all situations. In software development we consider multiple problems and constantly weigh pros and cons against each other.
For example, out of 11 remaining ashx files, we:
- Deleted one file because we do not use it (the "Reduce amount of code when possible" principle).
- Will migrate one file to a standard C# HttpHandler, because today during refactoring a developer missed a method call from that ashx file.
- Kept the other 9 ashx files as is (the "If it ain't broke - don't fix it" principle).

What are your examples of balancing problems against each other?
dennisgorelik: (Default)
One of our existing air-conditioners is about to die soon.
So we need a replacement. But which one?

Currently we need a smaller unit for the second floor.

Carrier system
One of the AC installer companies recommends a "Carrier" system:
Cooling Capacity : 22000
Heating Capacity : 21600
SEER : 14.5
EER : 12
HSPF : 8.2

Heat Pump Carrier 25HCE424A003
- 10-year parts warranty, limited to the original purchaser upon timely registration (otherwise 5 years).

Air Handler Carrier FV4CNF002L00
- Variable Speed

Auxiliary Heater Carrier CE0501N05

Thermostat Carrier TP-WEM01
- Wifi Capable.
- 5 year warranty if registered in a timely manner.
Price (including installation): $5,699

That choice seems a little bit pricey for a small AC unit. But what do you think?

Most homeowners report spending between $3,710 and $7,140 to have air conditioning installed. This price is more typical of a central A/C unit installation rather than a window central air conditioner addition which typically averages about $300.
dennisgorelik: (Default)
A couple of years ago security pen testers found a clickjacking bug in Google API Explorer:
Google did pay out a $1,337 bounty.
“The idea behind the exploit is to frame the page where that button was, and make the frame transparent.”

Here is a demo of what a hijacking page setup looks like:
The hijacker's web site content that invites the user to click somewhere:
<p><input type="button" value="Click to see cats' videos"></p>
<style>
iframe {
position: absolute;
top: 0; left: 0;
filter: alpha(opacity=50);
opacity: 0.50;
}
</style>
<iframe src=""></iframe>
Note that Wikipedia's security team made a conscious choice to allow clickjacking of their home page, because there is nothing at risk there.
But if you click "Log in" (or replace "" with "" in the demo html above) - you would notice that Wikipedia login page is not rendered in the iframe.

How did Wikipedia do that?
I opened the Fiddler2 debugging proxy and found out that "" returns this HTTP header:
X-Frame-Options: DENY
But "" page does NOT render that header.

How to prevent clickjacking?
Extra experimenting showed that, and use "X-Frame-Options: SAMEORIGIN" uses "X-Frame-Options: DENY" (the same as Wikipedia login page).
There are three possible directives for X-Frame-Options:
X-Frame-Options: DENY
X-Frame-Options: SAMEORIGIN
X-Frame-Options: ALLOW-FROM
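
As a rough illustration of how a site could send one of these directives on every response, here is a minimal WSGI middleware sketch. The names `add_frame_options`, `hello_app`, and `protected_app` are made up for this example; a real site would more likely set the header in its web server or framework configuration.

```python
def add_frame_options(app, policy="SAMEORIGIN"):
    """Wrap a WSGI app so that every response carries X-Frame-Options."""
    def wrapped(environ, start_response):
        def start_with_header(status, headers, exc_info=None):
            # Drop any existing X-Frame-Options value, then add ours.
            headers = [(k, v) for k, v in headers
                       if k.lower() != "x-frame-options"]
            headers.append(("X-Frame-Options", policy))
            return start_response(status, headers, exc_info)
        return app(environ, start_with_header)
    return wrapped


def hello_app(environ, start_response):
    """Tiny placeholder application used only for this demo."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"hello"]


# Same policy as the Wikipedia login page described above.
protected_app = add_frame_options(hello_app, policy="DENY")
```

The wrapper keeps the framing decision in one place, so switching a whole site between DENY and SAMEORIGIN is a one-argument change.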

What is the best practice for using "X-Frame-Options"?
I am trying to decide what "X-Frame-Options:" should I use for
Is the flexibility of iframes worth the security risk?
Should we support web sites that include our content in their own iframes?
Such iframe support has both pros and cons...

Why is the secure option not the default choice in browsers?
Why do you think browsers (such as Google Chrome and Firefox) do not assume "X-Frame-Options: SAMEORIGIN" by default?
If allowing your page content to be loaded into a parent page's iframe is inherently insecure, then such risky behavior should be explicitly requested, right?
dennisgorelik: (Default)
How to create a VM on Hyper-V
Yesterday I learned how to set up a new virtual machine from scratch on Hyper-V.
Then I wrote an instruction for that:

1) Install "Hyper-V" (if not installed yet)

2) Launch "Hyper-V Manager".

3) Create Virtual Network Switch (if not created yet)
- Right-click "RONAM" -> "Virtual Switch Manager".
- "New virtual network switch" -> "External" -> click [Create Virtual Switch].
- Make sure that "External network" is selected.
- Make sure that "Intel(R) Ethernet Connection (2) I219-LM" is selected.
- Click [Apply].
- "Pending changes may disrupt network connectivity" popup would show up. Click [Yes]. Wait for ~10 seconds.

4) Right-click "RONAM" and select "New" -> "Virtual Machine".

5) "Specify Name and Location"
Name: "BaseVM"
Location: "C:\VM"

6) "Specify Generation"
Select "Generation 2" radiobutton.

7) "Assign Memory"
Startup memory: 4096 MB 

8) "Configure Networking"
Connection: "VS2017Switch"

9) "Connect Virtual Hard Disk"
Select "Create a virtual hard disk" radiobutton.
Size: 80 GB.

10) "Installation Options" sub-tab
Select "Install An Operating System From A Boot CD/DVD-ROM".
Image file (.iso): "C:\install\en_windows_server_2016_x64_dvd_9718492.iso"

11) "Summary"
Click [Finish].

12) Right-click "BaseVM" -> "Start".
That would start Windows installation.
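
The key choices from steps 4 to 9 could also be kept as an editable script. The sketch below only builds a PowerShell `New-VM` command line as a string and does not run it. The VHD path is an assumption (the wizard picks its own default), and attaching the installation ISO from step 10 is left out.

```python
def build_new_vm_command(name, location, memory_mb, switch,
                         vhd_path, vhd_gb, generation=2):
    """Return a PowerShell New-VM command line mirroring the wizard choices."""
    parts = [
        "New-VM",
        f'-Name "{name}"',
        f'-Path "{location}"',
        f"-Generation {generation}",
        f"-MemoryStartupBytes {memory_mb}MB",
        f'-SwitchName "{switch}"',
        f'-NewVHDPath "{vhd_path}"',
        f"-NewVHDSizeBytes {vhd_gb}GB",
    ]
    return " ".join(parts)


command = build_new_vm_command(
    name="BaseVM",
    location=r"C:\VM",
    memory_mb=4096,
    switch="VS2017Switch",
    vhd_path=r"C:\VM\BaseVM.vhdx",  # hypothetical path, not from the steps above
    vhd_gb=80,
)
print(command)
```

Like the written steps, the script records only the non-obvious decisions: generation, memory, switch, and disk size.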

Then I asked K. (a developer on my team) to test my instruction by creating a new VM from scratch, and to fix the instruction if needed.

The "fix"
K. successfully created a new VM and "fixed" the instruction by adding small details to it. That made the instruction about 25% longer.
Formally, that "fixed" instruction is correct: clicking the "Next" button is one of the likely steps that a person would take in order to go through the new VM settings.
However, including "Click [Next]" steps in the instruction makes it worse, not better, and here is why:
1) The longer the instruction, the harder it is to read, understand, and follow.
It is also harder to review and modify a longer instruction.
2) The obvious steps in an instruction distract the reader from the non-obvious steps, such as "which 'Generation' option to choose" and "how much RAM to allocate".
3) Any user who is going to create a new VM from scratch does not really need instructions on how to operate a wizard.

Understand your readers
Here is a rule that K. violated by "fixing" the instruction:
Do NOT include steps that are obvious to all likely readers of the instruction: if the reader is able to reliably figure out an omitted step in a few seconds, then that step does not belong in the instruction.
However, if a step is obvious to only some readers and not to others, then it should be included in the instruction (because figuring out a non-obvious step takes much more time than skipping an obvious one).

Understand your goals
Another important consideration when writing an instruction is to clearly understand its purpose. I knew why I needed that instruction:
1) To help developers on my team quickly familiarize themselves with setting up Hyper-V VMs.
2) In the future, to remind me and other developers of the key steps we used in creating our VMs.
3) To have a source file that we can edit to reflect key choices in the configuration of our VMs.

K. did not have these goals in mind and actually thought that such a "VM setup" instruction is useless, but puffed it up anyway. According to his belief, all instructions must be written in such a way that even a dummy would be able to follow them. The mistake here is to assume that the dummy (who does not know how to navigate a wizard) would actually read our instruction.
dennisgorelik: (Default)
About half a year ago Samsung released two pretty fast SSDs:
Samsung 960 EVO
$479.99 for 1TB

Samsung 960 PRO
$579.99 for 1TB

As you can see, the PRO version is exactly $100 more expensive than the EVO.

Is it worth it?

According to specification:
960 EVO sequential read is up to 3.2GB/second.
960 PRO sequential read is up to 3.5GB/second.

However, the reality is about 40% slower than the advertised specification:
On my home server I got about 2GB/second sequential read for the 960 EVO, and about 1.8GB/second for the 960 PRO.
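
The "about 40%" figure can be checked with simple arithmetic on the numbers above. A small sketch (the helper name `shortfall_percent` is made up for this example):

```python
def shortfall_percent(measured_gb_per_s, advertised_gb_per_s):
    """How far the measured speed falls below the advertised speed, in percent."""
    return (1 - measured_gb_per_s / advertised_gb_per_s) * 100


evo_shortfall = shortfall_percent(2.0, 3.2)
pro_shortfall = shortfall_percent(1.8, 3.5)
print(f"960 EVO: {evo_shortfall:.1f}% below spec")
print(f"960 PRO: {pro_shortfall:.1f}% below spec")
```

With these inputs the EVO comes out at roughly 37.5% below spec and the PRO at roughly 48.6% below spec, so "about 40%" is a fair summary for the EVO and a bit generous for the PRO.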

To benchmark my SSDs I copied several files (~80GB) to nul in Far Manager.
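
A similar measurement can be sketched in a few lines of script. This is not what I ran (I used Far Manager), just an illustration of the idea: read a file sequentially, discard the data, and divide bytes by elapsed time. The function name and the 8MB chunk size are arbitrary choices.

```python
import time


def sequential_read_gb_per_s(path, chunk_size=8 * 1024 * 1024):
    """Read a file front to back, discard the data, return GB/second."""
    total_bytes = 0
    start = time.perf_counter()
    with open(path, "rb", buffering=0) as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            total_bytes += len(chunk)
    elapsed = time.perf_counter() - start
    return total_bytes / elapsed / 1e9
```

The file has to be much larger than RAM (or freshly written after a reboot), otherwise the operating system cache serves the reads and the result says nothing about the SSD.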

I used this motherboard (which is quite modern):
ASUS Motherboard, (PRIME Q270M-C/CSM)


Do you know what could be the reason why I cannot get the promised 3.2GB/second?
And why does the PRO perform slower than the cheaper EVO?

I even swapped the PRO and EVO between the NVMe slots on my motherboard, but the results were consistent: the PRO was slower than the EVO.

Update (thanks to mugunin):
Finally, a benchmark that looks similar to what I measured (sequential read):
In our 2MB sequential benchmark, the Samsung 960 EVO recorded the best results in read with 2,308.5MB/s—even beating out the 960 Pro. On writes, it came in second with 1,660.9MB/s, only losing to the Pro version.
dennisgorelik: (Default)
Santhosh claims in his resume that he is a web developer.
The interview showed that he is probably a web or graphic designer, but not a developer.
He mentioned JavaScript, but when I started to talk with him about a specific task that could be implemented in JavaScript, he quickly gave up.
All people on his team have the "Senior System Analyst" title, but from my understanding, Santhosh is a junior in his role.
When describing his accomplishments, Santhosh used the word "we". Sometimes "we" meant him, and sometimes "we" meant his team.
The Skype audio connection was good (which is not typical for Skype calls to India). That is probably because Santhosh worked (on the bench) from his employer's office.
2000 rupees per day ($30/day = ~$700/month).

Unfortunately, nothing that Santhosh can do would make a meaningful contribution to PostJobFree: we need mostly backend work (middle tier, parser, SQL queries and database design) or solid UX. Santhosh did not show signs of knowledge in either area.
So I told Santhosh that his skills do not match what I am looking for and asked him if he had any questions for me.
He did not have any questions.
A few minutes later he messaged me:
Santhosh: Hi is any possibility to give one task related to Ui Ux Design and see if I didn't complete we can drop for further or else we will continue as well
Ui Ux/Front-end Development
Dennis: I do not have tasks suitable for your skills
This was a 27-minute interview.
I should learn to recognize such mismatches much faster.
dennisgorelik: (Default)
A couple of days ago I interviewed Volodymyr from Ukraine.
Volodymyr promised to work 6 days per week, 14 hours per day, for about $1500/mo.
His expertise is in writing "data processing" code.

I asked Volodymyr to give me examples of the input and output of his processes.
Volodymyr said that the input could be anything.
I asked him to be more specific, so I could understand.
Volodymyr kept insisting that it could be any data.
I asked what kind of business problem that process solves.
Volodymyr kept insisting that it does not matter.
Eventually we both gave up in frustration.
I wrote to Volodymyr: "your skills probably would not work out for working with me -- I simply would not be able to communicate with you clearly".
Volodymyr replied:
This is a content of one column of one row of more than 1000000 rows which I use as input data : "2025050201401014016060 6090305025050201401014016060609030507014010901303016014".
If this is interested for You - try to understand what is this.
Your knowleges in programming is so low.
At first, You need to understand what is a main tasks of programming.
At second, You need to choose a tasks which You will solve and decide for why You need it.
You absolutely not understanding bases of programming.
When You will have enough skills in programming You will stop ask "an examples of data you are working".
I think - speaking skills of Russian, English or other languages for speaking about nothing - is just spent time. 
I`m usually very busy. 
And don`t want spent time.All Your conclusions is - big mistake.
I don`t want spent time for nothing.
dennisgorelik: (Default)
ElasticSearch team defends the bloat in ElasticSearch Percolator 5.4
If you're not interested in ranking you can easily turn it off, by wrapping the percolate query in a constant_score query.
The percolator tries to tag the queries automatically based on the containing query terms. However it can't do this for all percolator queries, because the percolator doesn't know how to extract meaningful information during indexing for all queries. This is a work in progress and will get better over time. It already has shown a significant performance improvement for cases where the percolator was able to analyze the percolator query correctly at index time.
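
For reference, the wrapper they describe would look roughly like the request body below, written here as a Python dict and based on my reading of the 5.x query DSL. The field name `query`, the document type `job`, and the document contents are placeholders, not our real mapping.

```python
import json

# Plain percolate query: matching alerts come back scored and ranked.
percolate_query = {
    "percolate": {
        "field": "query",          # placeholder: field that stores the alert queries
        "document_type": "job",    # placeholder: type of the percolated document
        "document": {"title": "Senior Python developer"},
    }
}

# The suggested workaround: wrap it in constant_score to switch scoring off.
request_body = {
    "query": {
        "constant_score": {
            "filter": percolate_query
        }
    }
}

print(json.dumps(request_body, indent=2))
```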

1) Funny how, in order to turn off an unneeded feature, application developers have to create an extra wrapper around their query.

2) "Work in progress" did not stop the ElasticSearch team from breaking backward compatibility and forcing their users to rewrite their legacy code in favor of the "work in progress" ElasticSearch 5.4.

3) "A significant performance improvement" is not quantified, and the cases where that improvement happened are not described.

See also: ElasticSearch Percolator Bloat - part 1
dennisgorelik: (Default)
Kirsha Danilov has started work at the Demidov factories:

Subscribe and follow the news. It should be interesting.

dennisgorelik: (Default)
I think these sanctions would end up being mostly symbolic and would have no real effect. Which is exactly how it should be, considering that some limited lobbying of US elections by other countries is a good thing (it keeps countries together).

The overwhelmingly bipartisan vote of 97-2 sent a message to Vladimir Putin that lawmakers on both sides of the aisle are serious about punishing Russia for its actions last year -- and sent a message to Trump that they're serious about ensuring that those sanctions stay in place until Congress is ready to lift them.

German Chancellor Angela Merkel's spokesman ... said it was "strange" that sanctions intended to punish Russia for alleged interference in the U.S. elections could also trigger penalties against European companies.
dennisgorelik: (Default)
Early ElasticSearch History
Back in 2010 Shay Banon created the first version of ElasticSearch.
Over the years the product matured.
In November 2012, the ElasticSearch team received $10M in Series A funding.
Then in February 2013 they received $24M in Series B funding.
That helped them to produce the very robust ElasticSearch 1.0 (2014-02-12) and then ElasticSearch 1.6 (2015-06-09), which we currently use.

$70M bloat
June 2014 - $70M Series C funding.
Shay Banon became CEO and excused himself from active involvement in development and in communicating with customers.
That is where the bloat began.
It looks like the ElasticSearch team decided that since they have so much money, they can do pretty much whatever they want.
So they broke backward compatibility of their percolator by squeezing the percolator into the standard format of an ElasticSearch index.

What is percolator?
The ElasticSearch percolator performs the reverse of a standard ElasticSearch query.
A standard ElasticSearch query allows our job seekers to find matching jobs.
The percolator allows job seekers to use their job search query in order to create a job alert.
Then when, in the future, a new job is posted (by somebody else), the percolator is able to find all matching job alerts that job seekers created. That allows us to notify the owners of these matching alerts about the new matching job (within a minute of receiving the job).
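
The reversal is easier to see in a toy model. The sketch below is not how ElasticSearch works internally. It treats a query as a list of words that must all appear in the job text, and the data is made up.

```python
def matches(query_terms, job_text):
    """A query matches when every one of its terms appears in the job text."""
    words = set(job_text.lower().split())
    return all(term in words for term in query_terms)


def search_jobs(query_terms, jobs):
    """Standard search: one query is run against many stored jobs."""
    return [job_id for job_id, text in jobs.items()
            if matches(query_terms, text)]


def percolate(job_text, alerts):
    """Percolation: one new job is run against many stored queries (alerts)."""
    return [alert_id for alert_id, terms in alerts.items()
            if matches(terms, job_text)]


jobs = {"job1": "Senior Python developer in Florida",
        "job2": "Registered nurse in Texas"}
alerts = {"alice": ["python", "developer"], "bob": ["nurse"]}

print(search_jobs(["python", "developer"], jobs))
print(percolate("Junior Python developer wanted", alerts))
```

In the first call the jobs are stored and the query is new. In the second call the queries are stored and the job is new.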

Differences between standard search query and percolator query
Because of the reverse nature of the percolator, it functions very differently from a standard search query:
1) A standard search query should normally produce only 10 results (users are unlikely to read more) and support paging.
The percolator always wants to get all matching alerts (also known as "percolator queries"), not just 10 of them, because every job seeker wants to get notified about new jobs matching their job alert.
2) Standard search ranks search results based on the quality of the match (and then orders the results by descending rank). Such ranking does NOT make sense for the percolator (because every job seeker wants to get notified anyway).

Why use standard search index format for percolator?
So why did the ElasticSearch team decide to break backward compatibility and merge the percolator into the standard search index format?
This is their excuse:
Prior to 5.0, all percolator queries need to be executed on this in-memory index in order to verify whether the query matches. So the idea is that the less queries that need to be verified by the in-memory index the faster the percolator executes.
On my first reading of that ambiguous claim I thought that ElasticSearch would be able to automatically detect which percolator queries are OK to skip, and so would effectively improve percolator performance.

What actually happened
We spent a few days setting up a proper experiment and found out that the ElasticSearch 5.4 percolator is 3 times slower than the ElasticSearch 1.6 percolator (or in other words, ElasticSearch percolator performance degrades proportionally to the version number).

The correct interpretation of that "less queries that need to be verified" claim is that in ElasticSearch 5.4 the application developer has an option to tag percolator queries (alerts), and then write code that helps the percolator skip alerts that have no chance of being triggered by the document we percolate.
But the problem is that it is very hard to come up with such an "alert skipping" algorithm. The percolator is valuable in the first place exactly because of its ability to determine which alerts match and which do not!
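
To show what such tagging means in the simplest possible case, here is a toy sketch. It assumes every alert is a non-empty list of required words, tags each alert with its first word, and verifies only the alerts whose tag appears in the new job. None of this is ElasticSearch code, and the names are made up.

```python
from collections import defaultdict


def build_tag_index(alerts):
    """Tag each alert with its first required term: term -> set of alert ids."""
    index = defaultdict(set)
    for alert_id, terms in alerts.items():
        index[terms[0]].add(alert_id)
    return index


def percolate_with_tags(job_text, alerts, index):
    """Verify only alerts whose tag occurs in the job; skip all the others."""
    words = set(job_text.lower().split())
    candidates = set()
    for word in words:
        candidates |= index.get(word, set())
    matched = [alert_id for alert_id in sorted(candidates)
               if all(term in words for term in alerts[alert_id])]
    return matched, len(candidates)


alerts = {"a1": ["python", "developer"],
          "a2": ["nurse"],
          "a3": ["python", "manager"]}
index = build_tag_index(alerts)
matched, verified_count = percolate_with_tags("python developer wanted", alerts, index)
print(matched, verified_count)
```

This only works because the toy alerts are plain "all of these words" queries. As soon as alerts contain phrases, negations, wildcards, or ranges, picking a safe tag is exactly the hard part described above.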

The summary
The $70M Series C funding encouraged the ElasticSearch team to break backward compatibility, produce useless features (such as paging and ranking in the percolator), and degrade performance 3x.

Next: ElasticSearch Percolator Bloat - the Defense


dennisgorelik: (Default)
Dennis Gorelik

August 2017

Page generated Aug. 21st, 2017 09:56 am
Powered by Dreamwidth Studios