Discussion:
Off-topic concerning a web vote debacle
Joel Palmius
2008-03-14 12:27:13 UTC
This is just an interesting story that is doing the rounds on Swedish news
sites today, with a mod_survey opinion added at the end.

In Sweden, the national television shows daylight information every Friday.
It's an animated sun which rises with the times at which dawn arrives in
three major cities, and then sets with the times at which night starts. In a
country as far north as Sweden this is interesting since dawn and dusk vary
heavily over the year. In the northernmost city, the sun does not rise at
all for a period during the winter.

Now, however, the guys at the national television had discovered that the
three cities were geographically distributed in a rather unbalanced way,
leaving a large part of Sweden without relevant daylight information. Thus
they started a web poll on which city should be added to the info to make
the geographical distribution a bit better.

To start with this worked rather well, and the local newspapers in the
candidate cities campaigned to get their readers to visit the web poll and
vote for their city.

Yesterday the poll was shut down, though, since it was discovered that two
cities were rising drastically in the results, with some 150 votes per
minute. Needless to say, some locals had discovered how to script the
voting.

This merits some technical analysis. Web votes have lately been gaining
ground heavily in newspapers: the results are reported as facts and complete
articles are built around them. No heed is paid to the shakiness of the
underlying technology.

First: the current poll had a uniqueness policy based on session cookies.
It did not even store the cookies on disk; all you had to do to vote again
was to restart the web browser. Every single noob and their grandmother
could figure this out.

Second: No IP-based uniqueness policy was implemented. This is in itself a
rather logical choice, what with NAT and everything: many voters can share
one public IP address.

Third: No overload protection was installed, meaning the system swallowed
as many hits as it could handle per second without protesting.

These things taken together were in practice an open invitation to a
scripted vote "attack".
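
To illustrate the third point, here is a minimal sketch of what overload protection could look like: a sliding-window limiter that refuses hits once a key has been seen too often within a time window. The class name, the parameters and the choice of key (an IP, a poll option, or one global key) are all assumptions made for this illustration; nothing here is taken from the poll system or from mod_survey.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most max_hits accepted requests per key within window seconds."""

    def __init__(self, max_hits, window, clock=time.monotonic):
        self.max_hits = max_hits
        self.window = window
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self.hits = defaultdict(deque)  # key -> timestamps of accepted hits

    def allow(self, key):
        now = self.clock()
        recent = self.hits[key]
        # Forget accepted hits that have fallen out of the window.
        while recent and now - recent[0] >= self.window:
            recent.popleft()
        if len(recent) >= self.max_hits:
            return False  # over the limit: refuse instead of swallowing the hit
        recent.append(now)
        return True
```

With a single global key and a limit well below 150 per minute, a scripted burst like the one described would have been refused or at least noticed early, although a limiter alone does nothing about duplicate votes arriving slowly.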

So what do you do to avoid these things? In mod_survey, the cookie-based
uniqueness policy isn't even implemented: there's no point since you can't
trust anything coming from a client. The uniqueness policies implemented
are IP-based and authentication-based. Neither would have worked well in
this scenario.

The best you could have done currently would have been to make a login form
where you have to register with a valid email address (stored in a
database) in order to be allowed to vote. This would have been cumbersome
but secure. It would also be technically feasible with the current
mod_survey release. It would, however, have resulted in very few votes
since no one would bother to go through the whole procedure.

Some day in the future, I will have to address this. My current idea to
solve the problem is a combination of cookie-based and IP-based policies:
uniqueness is primarily based on a disk-stored permanent cookie. However,
to avoid scripted attacks, a server-side key would also be stored (a hash
based on the combination of browser string and IP number), which would
live for, say, ten minutes. Thus the NAT problem would be solved (all you
would have to do is wait a few minutes before being allowed to vote if
someone with a shared IP and the same browser string had voted).
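
The server-side half of this idea could be sketched roughly as follows. This is only an illustration of the description above, not mod_survey code: the class name, the use of SHA-256 and the in-memory dictionary are assumptions, and the permanent-cookie check is left out entirely.

```python
import hashlib
import time

LOCK_SECONDS = 600  # "say, ten minutes"


class RecentVoterLock:
    """Short-lived server-side keys derived from IP number and browser string."""

    def __init__(self, ttl=LOCK_SECONDS, clock=time.monotonic):
        self.ttl = ttl
        self.clock = clock
        self.expires = {}  # key -> time at which the lock ends

    @staticmethod
    def key(ip, user_agent):
        # Hash of the combination of IP number and browser string.
        return hashlib.sha256(f"{ip}\n{user_agent}".encode("utf-8")).hexdigest()

    def try_vote(self, ip, user_agent):
        now = self.clock()
        # Drop locks whose lifetime has run out.
        self.expires = {k: t for k, t in self.expires.items() if t > now}
        k = self.key(ip, user_agent)
        if k in self.expires:
            return False  # same IP and browser string voted within the last ttl seconds
        self.expires[k] = now + self.ttl
        return True
```

Two people behind the same NAT with identical browsers would collide, but only until the lock expires; with different browser strings they would not collide at all.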

Stay tuned, some day my fingers will itch, and I'll implement this.

// Joel

Sent by Joel Palmius <***@miun.se>
to survey-discussion
Clemens Gruber
2008-03-14 14:27:17 UTC
Hi, this is not a solution to the problem Joel describes. :-) But I
can describe what I did as a workaround to control inputs:

We had a questionnaire that only specific people should answer. They got a
password by e-mail. But if you use only htaccess protection, you can
vote more than once. So I needed the password or login in the data set.

Therefore I misused the CASEROUTE function:

<TEXT NAME="pin"
      CAPTION="Please enter your PIN"
      MAXLEN="8" />

<CASEROUTE SWITCH="pin" DEFAULT="pin-error.survey">
  <CASE VALUE="01020304" CONTINUE="introduction.survey" />
  <CASE VALUE="05060708" CONTINUE="introduction.survey" />
  <CASE VALUE="09101112" CONTINUE="introduction.survey" />
  ...
</CASEROUTE>


This way, users with wrong passwords got an error message and could not
fill out the form. If they guess the URL of the second page
(introduction.survey) correctly, this is no problem: there is no data in
the variable "pin" even if they fill out the survey.

In a final step I purged all data sets with no entry (or the missing-value
entry 999) in pin, and I deleted all multiple data sets identified by the
same pin, except the last.
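
The final clean-up step could look roughly like the sketch below. It assumes the data sets have already been read into a list of dictionaries in submission order, with the PIN stored as a string; the function name and the field layout are made up for the illustration and say nothing about how mod_survey actually stores its data.

```python
def clean_responses(rows, pin_field="pin", missing=("", "999")):
    """Drop rows without a usable PIN and keep only the last row per PIN."""
    last_by_pin = {}
    for row in rows:
        pin = (row.get(pin_field) or "").strip()
        if pin in missing:
            continue  # no PIN entered, or the missing-value code
        # Remove any earlier row for this PIN first, so that the result is
        # ordered by each PIN's last submission.
        last_by_pin.pop(pin, None)
        last_by_pin[pin] = row
    return list(last_by_pin.values())
```

Keeping the last data set rather than the first matches the description above; swapping that rule only requires skipping PINs that are already present instead of overwriting them.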

Clemens
Sent by Clemens Gruber <***@uni-osnabrueck.de>
to survey-discussion
Joel Palmius
2008-03-14 16:29:08 UTC
Hehe, did you check out the UNIQUE parameter on <ANSWER> under <SECURITY>?
It does exactly that: prevents re-voting in a htaccess-based login. :-)

(it stores a server-side key with the login and checks if it is there
before allowing someone to submit)
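
As a rough illustration of the general idea of a server-side key per login (and explicitly not a description of how mod_survey implements UNIQUE), one marker file per login can be created with an exclusive open, so that checking for the key and storing it happen in one step. The function name and the directory layout are assumptions.

```python
import hashlib
import os
import tempfile


def claim_submission(key_dir, login):
    """Return True the first time `login` submits, False on every later attempt."""
    # Hash the login so that it is always safe to use as a file name.
    name = hashlib.sha256(login.encode("utf-8")).hexdigest()
    path = os.path.join(key_dir, name)
    try:
        # O_EXCL makes the open fail if the key file already exists, so two
        # simultaneous submissions cannot both pass the check.
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o600)
    except FileExistsError:
        return False
    os.close(fd)
    return True


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as key_dir:
        print(claim_submission(key_dir, "alice"))  # True
        print(claim_submission(key_dir, "alice"))  # False
```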

Anyway, the local newspaper is going to interview me tonight on the topic
of "how people hacked the daylight vote". I find the choice of terminology
rather amusing. :-)

// Joel
Sent by Joel Palmius <***@miun.se>
to survey-discussion
Florian Hars
2008-03-17 16:53:09 UTC
Post by Joel Palmius
However to avoid scripted attacks, a server-side key would be
stored (a hash based on the combination of browser string and and the IP
number)
How does that help against

for i in $(seq 1000 2000); do
    for j in $(seq 4000 5000); do
        wget --user-agent="Mozilla/4.0 (compatible; MSIE 7.0; TOB 6.05; Windows NT 5.1; .NET CLR 2.0.4$i; .NET CLR 3.0.0$j.30; .NET CLR 1.1.4322)" \
             --post-data=whatever http://example.com/survey/whatever
    done
done

- Florian
Sent by Florian Hars <***@bik-gmbh.de>
to survey-discussion
Joel Palmius
2008-03-17 17:48:43 UTC
It wouldn't, but at least it wouldn't be as easy as clearing out the
cookies and voting again.

It's an interesting problem. The only obviously working alternative I know
of at the moment is a pure IP-based lock. This will, however, accidentally
lock out people who share that IP (i.e. big organizations with NAT, long-term
surveys answered by people on dialup with dynamic IP allocation...). To
break that, an attacker would have to go as far as IP spoofing, which is
beyond what a script kiddie could manage.
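
The trade-off can be made concrete with a small simulation. It is only a sketch: the addresses come from the documentation ranges, the browser strings are invented, and the helper names are hypothetical. It counts how many votes get through when the uniqueness key is the hash of IP plus browser string, compared with the IP alone.

```python
import hashlib


def ua_ip_key(ip, user_agent):
    """Key built from IP number and browser string together."""
    return hashlib.sha256(f"{ip}\n{user_agent}".encode("utf-8")).hexdigest()


def ip_only_key(ip, user_agent):
    """Key for a pure IP-based lock; the browser string is ignored."""
    return ip


def count_accepted(requests, key_func):
    """Count requests whose key has not been seen before."""
    seen = set()
    accepted = 0
    for ip, user_agent in requests:
        key = key_func(ip, user_agent)
        if key not in seen:
            seen.add(key)
            accepted += 1
    return accepted


# One machine sending a different browser string with every request,
# in the spirit of the wget loop earlier in the thread.
scripted = [("192.0.2.10", f"Mozilla/4.0 (compatible; build {i})") for i in range(1000)]

# Five colleagues behind one NAT address with identical browsers.
office = [("198.51.100.7", "Mozilla/4.0 (compatible; MSIE 7.0)") for _ in range(5)]
```

Here `count_accepted(scripted, ua_ip_key)` accepts all 1000 scripted votes, while `count_accepted(scripted, ip_only_key)` accepts one; the price is that `count_accepted(office, ip_only_key)` also accepts only one of the five legitimate voters.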

Which means that the only alternatives for voting by an unknown population
are currently to either have voting with some kind of authentication (i.e.
register with a valid mail address), accept being flooded by duplicates, or
accept losing people who are accidentally locked out.

Suggestions for better protection schemes are welcome. :-)

// Joel
Sent by Joel Palmius <***@miun.se>
to survey-discussion
