Monday, April 13, 2020

COVID-19: Australia turning the corner

Australia's new COVID cases are plummeting:


... and the cumulative number of cases is flattening, per this logarithmic scale:


These charts and others are courtesy of the state-owned Australian Broadcasting Corporation; they are updated daily.

We're in lockdown, with non-essential business closed, and people working from home where possible.  But in so many ways, the whole world is in a strange place right now - a singularity for which we may never have a similar experience.  The solutions are well known: test, test, test, trace known cases, and isolate.  But testing depends on the number and quality of test kits, which are quite variable around the world - as are both political and social will.

Among the top 30 countries by GDP, our death rate - the number that matters the most - is 20th, one better than China, which implemented faster and more draconian lockdowns.  The 10 worst death rates per million are:


Rank  Country      Deaths/1M pop
1     Spain        368
2     Italy        329
3     Belgium      311
4     France       221
5     Netherlands  160
6     UK           156
7     Switzerland  128
8     Sweden       89
9     USA          67
10    Iran         53
:
:
20    Australia    2


However, what I have learnt from reading between the lines on the numbers is that comparisons are fraught.  Australia, for example, shares no borders with other countries, which is a great advantage.  New Zealand: even more isolated.  The numbers of cases and deaths reported are quite variable, depending on the country and how much testing they do.  Sweden is another exception, as they did not lock down, which is something they have come to regret.

My friend Bill says you should look at the US by individual State.  That puts his State, Oregon, way down with only one death per million.  New York is at the top, with 32 deaths per million, followed by New Jersey with 17.  Also above Australia and China's 2 deaths per million are: Louisiana, Michigan, Connecticut, then 11 more.


Actual numbers will never be certain.  I expect that a few years after it's over, analysts will get closer to true numbers by extrapolating from existing death rates in the previous few years - and we will be surprised to find the countries whose COVID-related deaths are much higher than reported, as not all COVID deaths are known cases.  This will clearly be so for countries like Indonesia, India and Iran, but may also surprise us in countries like the USA, China and the UK.
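As a rough illustration of that idea, here is a minimal sketch of an "excess deaths" calculation: take the average deaths over a few earlier years as the expected baseline, subtract it from the deaths observed in the pandemic period, and compare the result with the officially reported COVID deaths.  The function names are my own, the simple average is an assumption (real analyses adjust for population growth, seasonality and trends), and no real figures are used here.

```python
from statistics import mean


def excess_deaths(prior_year_deaths, observed_deaths):
    """Deaths above the baseline, where the baseline is the plain
    average of the same period in earlier years (a simplification)."""
    expected = mean(prior_year_deaths)
    return observed_deaths - expected


def unexplained_deaths(prior_year_deaths, observed_deaths, reported_covid_deaths):
    """Excess deaths not accounted for by officially reported COVID deaths.
    A large positive value hints at under-reporting (or at indirect deaths)."""
    return excess_deaths(prior_year_deaths, observed_deaths) - reported_covid_deaths
```

To use it, pass the all-cause death counts for the same months in each earlier year, the all-cause count for the pandemic period, and the reported COVID deaths for that period.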

I expect that at least in part, the eventual numbers per country will depend on social cohesion, and willingness to heed lockdowns and isolations.  Governments can mandate or recommend, but it's up to the people as a whole whether they do the right thing.  Complacency is literally a killer.  I see it well in evidence in Sydney - although we have done well by and large by clamping down reasonably early.

How will the world look when we come out of this?  I don't know what permanent changes will be wrought, but I'm pretty sure it will take several years for economies to recover, and many already-disadvantaged people will be far, far worse off.  Expect to see a longer tail of mortality from other poverty-related causes, particularly in less developed economies and pockets of the US.


Sunday, April 12, 2020

A few insights into Blog comments spam

People have been spamming blog comments for years.  From manual beginnings, people developed spambots to automate the process, which understandably resulted in a huge increase in spam traffic.  In earlier days, the intention was a mixture of attempts to build traffic to legitimate sites (for both manual traffic and search engine optimisation), and various scams including pump and dump.

Many blogs responded by turning off comments altogether; in my case I've vetted comments before publishing.

In more recent years, the spam comments have become apparently innocuous, and don't even try to link to other web sites.  This left me wondering what was going on, but I found it difficult to find out.

So with my previous post, I set up a honeypot to gather spam comments over time, with the intention of analysing them to understand better what was going on.  This entry will give some limited insights; maybe I will add to it when I know more.  The following is a first pass report on the results.

I posted the honeypot in August 2019, and spambot comments flowed in for about two and a half months, before abruptly stopping.  There's an implication that they're all from the same source - or using the same mechanism.

I analysed about 100 comments through a Natural Language Processing framework.  This is a form of Machine Learning (which is popularly referred to as Artificial Intelligence, although I don't think it's an accurate term).  It wasn't able to tell me that much.  Amongst other things, it returned a high positive sentiment score through sentiment analysis.  This was fairly obvious already.  To get past spam vetting systems, the spambots intentionally fed relatively upbeat phrases.  They were mostly quite general comments, either about liking the blog or asking for help with their own blog (again, no links).  But it was possible to tell in the first instance that it was spam simply because there was no specific reference to the subject matter in the blog post.  To make this clear, in the honeypot I requested comments to include a specified word to flag that the commenter had read the post, which of course is beyond the capability of automated commenting tools.
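The specified-word check is simple enough to automate.  Here is a minimal sketch, not the code I actually ran: the function names are made up for illustration, and the keyword is whatever word the post asks commenters to include.

```python
import re


def mentions_keyword(comment, keyword):
    """True if the comment contains the keyword as a whole word,
    ignoring case."""
    pattern = r"\b" + re.escape(keyword) + r"\b"
    return re.search(pattern, comment, flags=re.IGNORECASE) is not None


def split_comments(comments, keyword):
    """Separate comments into (likely_human, likely_spam) lists based
    only on whether they include the requested keyword."""
    likely_human = [c for c in comments if mentions_keyword(c, keyword)]
    likely_spam = [c for c in comments if not mentions_keyword(c, keyword)]
    return likely_human, likely_spam
```

A comment that misses the word is not proven to be spam, of course; a real reader may simply have skipped the instruction.  It is a first filter, not a verdict.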

The only thing I've really gotten from the NLP system so far is that very frequently the comments are very close variants on each other - in groups of two, three or more.  It's as if someone put together three sentences, made some variants on a few keywords/phrases, and then got the spambot to switch around the words frequently enough so as to specifically avoid getting caught by automated processes that blocked groups of identical comments.
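Grouping near-identical comments does not need a full NLP framework.  The sketch below is an assumption about how one might do it, not what the framework did: it compares comments word by word using Python's standard difflib module and puts each comment into the first group whose leading comment is similar enough.  The 0.7 threshold and the function names are arbitrary choices for illustration.

```python
import re
from difflib import SequenceMatcher


def normalise(comment):
    """Lower-case the comment and split it into a list of words."""
    return re.findall(r"[a-z0-9']+", comment.lower())


def similarity(a, b):
    """Word-level similarity between two comments, from 0.0 to 1.0."""
    return SequenceMatcher(None, normalise(a), normalise(b)).ratio()


def group_near_duplicates(comments, threshold=0.7):
    """Greedy grouping: each comment joins the first group whose first
    member is at least `threshold` similar, else it starts a new group."""
    groups = []
    for comment in comments:
        for group in groups:
            if similarity(comment, group[0]) >= threshold:
                group.append(comment)
                break
        else:
            groups.append([comment])
    return groups
```

Groups with two or more members are the interesting ones: they are the "variants on a theme" that an exact-match filter would let through.  Greedy grouping depends on the order of the comments, which is fine for a hundred or so but is only a rough cut.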

So it looks like it's an arms race between sets of automated tools, a battle to infuse comments on the one hand, and to deflect them on the other.  What hasn't been answered yet to my satisfaction is why the spambots are still running but are not delivering weblink payloads.  My only guess is as before: that the spambots are being used to pinpoint blogs/news sites that allow unfiltered comments to get through.  My feeling is that there must be more to it than that, so suggestions are welcome.

PS: Will a new post get those spambots started again on this blog?  Let's see.