

Clement York-Kee So
(Associate Professor, School of Journalism and Communication, The Chinese University of Hong Kong)
Jennifer So-Kuen Chan
(Lecturer, Department of Statistics and Actuarial Science, The University of Hong Kong)
 

Note: This article represents the views of the authors and not those of their respective universities.

 

Hong Kong sees numerous rallies and protests, large and small, each year. For large-scale events such as the June 4 Candle Light Vigil and the anti-Basic Law Article 23 demonstration held in December 2002, the public is keen to know the number of participants. The organizers usually provide a figure and the police sometimes do so as well, but the two figures generally differ widely. People tend to think that the organizers' figures are overestimates and that the police figures are too conservative. As a result, we can only guess at a number somewhere between the two, or simply take their average in the hope of arriving at a better estimate.

 

Is it possible to calculate accurately the number of participants in these large-scale gatherings? If so, how? The question is not only of interest to the general public but is also a challenging topic for academics. A group of scholars who had earlier re-analyzed the compendium of public submissions on Basic Law Article 23 wished to study this topic, and they used the July 1 protest as a case study. The research team comprised Ma Ngok from the Hong Kong University of Science and Technology, Sammy Chiu from the Hong Kong Baptist University, Boris Choi, Robert Chung and Jennifer Chan from the University of Hong Kong, and Chan Kin Man and Clement So from the Chinese University of Hong Kong. Ten students also helped out as volunteers. This article is based on that study and reports on its design, its fieldwork and the lessons learned. It is hoped that the exercise will be of some reference value for similar work in the future.

 

The Challenge of Calculation

 

Why have so few people attempted a scientific count of protesters? There are several possible reasons. First, the numbers are so large that counting by hand at the scene is impossible. Second, people in a march move along at different speeds, so a fixed grid method cannot be used for counting. Third, if the marching crowd is large and the march takes a long time, we cannot sample just a few time intervals for counting, as the error would be huge. Fourth, if the crowd marches along different routes and people join and leave at different times and locations, accurate counting becomes impossible.

 

Despite these problems, a scientific count of protesters is not an insurmountable task. With limited resources, the research group decided to adopt the following method. Choose two bridges across Hennessy Road, which is the main route of the march. Place a video camera on each bridge and tape the march from above. Use a systematic sampling method: in each hour, four one-minute segments (e.g., at 4pm, 4:15pm, 4:30pm and 4:45pm) are taped, giving a sampling ratio of 1/15. After each hour of taping, the tape is sent immediately to a location with viewing facilities, where volunteers count the heads in frozen frames on a large screen. The research group planned to finish all the head counting on the night of July 1.

 

Whether this method is feasible depends on the following assumptions: (1) There is only one marching route and every participant walks past the taping point. (2) Within every 15-minute segment the flow of people is roughly constant, so that the whole segment can be projected from the sampled minute.
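The projection implied by assumption (2) can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the research group's actual procedure: the function name `project_total` and the head counts below are hypothetical, chosen purely to show how each sampled minute is scaled up to its 15-minute segment.

```python
def project_total(sampled_minute_counts, minutes_per_segment=15):
    """Scale each sampled one-minute head count up to its whole segment.

    Assumes, as in assumption (2), that the flow within each segment is
    roughly constant, so one sampled minute stands for all the minutes
    of that segment.
    """
    return sum(count * minutes_per_segment for count in sampled_minute_counts)


# Hypothetical head counts for the four sampled minutes of one hour
# (illustrative values only, not data from the study).
example_counts = [350, 400, 380, 420]
print(project_total(example_counts))  # 23250
```

If the flow changes sharply within a segment, the single sampled minute no longer represents it well, and the projection inherits that error.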

 

The research group used two taping points, namely the bridge at the junction of Hennessy Road and Percival Street, and the bridge at the junction of Hennessy Road and O'Brien Road. Two taping points were chosen in order to adjust for people who reached only one of the points, so that the results from the two could be cross-referenced.

 

Actual Difficulties

 

In the actual operation, 7 scholars and 10 students took part in the fieldwork. The first difficulty was that counting heads was simply not an easy task, even on a viewing screen of more than 30 inches. Freezing the frame degraded the picture resolution, and counting was time-consuming: one minute of taped footage took close to half an hour to count. As the march lasted for about six hours, until 9 o'clock, it was virtually impossible to complete the count that evening.

 

The second difficulty was related to the location and angle of the video camera. Choosing the right location is important, as the view has to capture the largest possible number of protesters, and ideally several different taping locations should be secured. The camera must face the scene straight ahead and be placed at the center of the bridge, so that no problem arises if the marchers use both sides of the road. The pedestrian walkways should be captured as well. Counting is more convenient if some "reference points" such as lampposts and traffic lights can be found on screen.

 

The research group had no such experience. We found out later that the bridge at the junction of Hennessy Road and Percival Street overlooked a curve instead of a straight road. There were also see-through glass panels on the bridge, which adversely affected the taping. The research group thus decided to abandon the information gathered from this point. In addition, the group had put the video camera on one side of the bridge instead of in the middle, yet the marching crowd later took both sides of the street. It was difficult to move the camera during the march as there were too many people on the bridge.

 

The third difficulty was that sampling every 15 minutes turned out not to be a good decision, because the flow of people changed quite rapidly. For example, at around 4pm about 300 to 400 people passed under the O'Brien Road bridge in a minute, and such figures differed greatly from one 15-minute sample to the next. After 7pm, the flow sometimes reached 1,300 people per minute, and there could be a difference of 200 people from one minute to another. The ideal approach is to tape every minute of the whole march. After reviewing the tapes, the sampling interval can then be chosen more appropriately according to how quickly the flow changes. For example, if the rate of change is fast, a sample of 1 minute from every 10 minutes should be taken instead.

 

The fourth difficulty was the inadequate number of locations chosen. The research group chose only two points. But some people left the scene after they emerged from Victoria Park and never made it to Hennessy Road. Others joined the march in Admiralty, so they never appeared on the tapes and were not counted. The best arrangement is to set up more taping locations, including Victoria Park, Admiralty, Central, and even right outside the Government's Central Office Buildings.

 

The fifth difficulty was that the research group did not anticipate that there would be another marching route, westbound along Gloucester Road. The research group had no video camera at Gloucester Road and thus had no count of this crowd. After some effort, we were able to ascertain the flow of people and the time of the march along this route. The research group also used the figures released by the police and the event organizer as a reference. Combining these with data from an opinion poll conducted after July 1 and the data from our own head counting, the research group was able to calculate the total number of July 1 protesters. This is of course not an ideal method. However, given the limited resources and other constraints, the research group did its best.

 

Projected Results

 

On July 1, the research group systematically taped the whole march from the bridge crossing the junction of Hennessy Road and O'Brien Road. The projected number of people passing that point was 264,000. The research group originally planned to announce the result that night but then decided not to do so. The reason was that only a portion of the July 1 participants passed through the taping point while many others did not, including those who only showed up in Victoria Park, those who did not finish the whole march, those who took an alternative marching route, and those who joined the march at Admiralty or Central. Thus, the research group decided to project the total number of participants only after obtaining more data.

 

A corresponding effort by the research group was to do a community-wide random telephone poll from July 2 to 5. The poll successfully interviewed 2,206 adults aged 18 or above. The response rate was 68.7% and among the respondents 489 said they participated in the July 1 protest. They answered the following questions: (1) During the march, did you pass through the road next to the Southorn Playground in Wan Chai? (2) At 6pm that day, were you among the marching crowd, including those at or near Victoria Park? Among the 489 respondents, 57.1% said they passed through Hennessy Road next to the Southorn Playground, and 75.4% were on the streets at 6pm that day.

 

Moreover, the research group gathered information about the Gloucester Road marching route from callers to a radio phone-in program and from the first-hand observations of its own members. From around 5:30pm that day, there was a march westbound along Gloucester Road. The last batch of people on that route left Victoria Park close to 6pm. Assuming that the marching crowd took up two car lanes and that the flow was about seven people per second (the same rate of flow as on Hennessy Road at similar timeslots), there were about 13,000 people on this route in this half-hour timeslot.
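The Gloucester Road figure follows directly from the assumed flow rate and duration. A minimal sketch of that arithmetic is given below; the function name `crowd_from_flow` is a hypothetical helper introduced only for illustration, while the inputs (seven people per second, half an hour) come from the paragraph above.

```python
def crowd_from_flow(people_per_second, duration_minutes):
    """Estimate a crowd as a constant flow rate multiplied by duration."""
    return people_per_second * duration_minutes * 60


# Seven people per second over the half hour from about 5:30pm to 6pm.
gloucester_road = crowd_from_flow(people_per_second=7, duration_minutes=30)
print(gloucester_road)  # 12600, which the article rounds to about 13,000
```

The estimate is only as good as the constant-flow assumption: a rate of six or eight people per second would shift the result by about 1,800 people either way.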

 

With the above data, the research group could project the total number of participants in the July 1 protest. The research group used two different methods of calculation. The first method combines the head counts from the O'Brien Road bridge with the public opinion poll data, so as to cover those who marched along Gloucester Road, those who stayed only in Victoria Park, and those who joined from Admiralty or Central and never set foot on Hennessy Road.

 

The research group divided the figure of 264,000 (the head count from the bridge) by 57.1% (proportion A) and arrived at an estimated total of 462,000 protesters. At the 95% confidence level, the error margin for proportion A is plus or minus 4.5 percentage points. Dividing 264,000 by the upper and lower limits of proportion A gives a range for the total number of people of between 429,000 and 502,000.
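For readers who wish to see where an error margin of this size comes from, the sketch below applies the textbook normal-approximation formula for a proportion from a simple random sample. This is an assumption on our part: the article does not state which formula the research group used, and the function name `margin_of_error` is hypothetical. With a proportion of 57.1% among 489 respondents, the textbook formula gives roughly 4.4 percentage points, close to the 4.5 quoted above; the small gap may reflect rounding or a different calculation method.

```python
import math


def margin_of_error(p, n, z=1.96):
    """Normal-approximation margin of error for a sample proportion.

    p: observed proportion, n: sample size, z: critical value
    (1.96 corresponds to a 95% confidence level).
    Assumes simple random sampling, which may differ from the
    method actually used in the study.
    """
    return z * math.sqrt(p * (1 - p) / n)


# Proportion A: 57.1% of the 489 respondents who joined the protest.
print(round(100 * margin_of_error(0.571, 489), 1))  # 4.4
```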

 

The second method is to use the police figure and supplement it with the missing data. Using aerial photography and the grid counting method, the police said that at 6pm on July 1 there were 350,000 people along the marching route, including those at Victoria Park and at the government's central office buildings. The research group assumed that this figure does not include those who had already finished the march before 6pm, those who had not yet made it to the gathering place, and those marching along Gloucester Road.

 

The research group obtained a base figure of 363,000 by adding the 13,000 people on Gloucester Road to the 350,000 counted by the police. This figure of 363,000 was divided by 75.4% (proportion B) to obtain an estimated total of 481,000 protesters. Again at the 95% confidence level, the error margin for proportion B is plus or minus 3.9 percentage points. Thus, the range of the total number of protesters is between 458,000 and 508,000.
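Both methods share the same arithmetic: a base count is divided by the proportion of protesters it is believed to cover, and the range is obtained by dividing by that proportion plus and minus its error margin. The sketch below reproduces the figures reported in this article; the function name `scale_up` is a hypothetical helper, and all inputs are taken from the text.

```python
def scale_up(base_count, proportion, margin):
    """Return (low, point, high) estimates of the total crowd.

    A larger covered proportion implies a smaller total, so the low
    end of the range uses proportion + margin and the high end uses
    proportion - margin.
    """
    low = base_count / (proportion + margin)
    point = base_count / proportion
    high = base_count / (proportion - margin)
    return low, point, high


def in_thousands(values):
    """Round each estimate to the nearest thousand for display."""
    return tuple(round(v / 1000) for v in values)


# Method one: bridge head count and proportion A (57.1% +/- 4.5 points).
print(in_thousands(scale_up(264_000, 0.571, 0.045)))  # (429, 462, 502)

# Method two: police figure plus Gloucester Road, and proportion B
# (75.4% +/- 3.9 points).
print(in_thousands(scale_up(350_000 + 13_000, 0.754, 0.039)))  # (458, 481, 508)
```

Note that the ranges reflect only the sampling error of the poll proportions; any error in the base counts themselves is not included.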

 

The results obtained from these two methods are close, which strengthens the research group's confidence in them. Furthermore, the results are similar to the figure of half a million stated by the organizer. The head-counting results from Hennessy Road are also similar to those of other independent groups that carried out the same exercise: both arrived at a figure close to 260,000. This serves as a piece of independent corroborating evidence.

 

Conclusion

 

The total number of participants in the protest as calculated by the research group matches the organizer's figure well. On the one hand, this may be because both figures are close to the actual one. On the other hand, it may be because, before July 1, the different parties were aware that various groups would be making calculations and estimates using different methods, and that these figures would be compared afterwards. This interactive process made them more cautious in their calculations, and they would have tried to stay away from casual over-estimation or under-estimation. As a result, the figures thus generated tend to be close to each other. The fact that different organizations did their own work and that their figures could be compared has a positive effect on the evaluation of the actual number of participants.

 

The research group has gained some experience from this field operation. In similar calculations in the future, the methodological pitfalls can be avoided and the inadequacies remedied. The research group believes that by using this tested statistical method, we can show the actual figure within certain error margins. After all, the function of these figures is to give the public an overall impression of the scale of the event. Whether it was 460,000, 480,000 or 500,000 does not make much difference in terms of political significance. The basic goal has been achieved. The important thing is that one should not wrongly estimate the figure to be 200,000 or 800,000.