Producing and using electricity wastes a great deal of energy: whether the primary source is coal, natural gas, nuclear, or anything else, much of it is lost in conversion to usable electricity. The electricity we consume at the end of the line gives us only about 1/3 of the total energy that existed at the beginning of the line. So when we conserve electricity, the true amount of energy saved is far greater than what we calculate from the rate per kilowatt-hour.
Heat energy may be the source of our electricity, and that electricity is in turn convenient for heating water or cooking a meal. Electricity generated from wind turns ceiling fans. We pay the price of wasted energy for the convenience of transport. Today we can’t carry coal to homes. Even homes in Newcastle use electricity for heating rather than their own coal.
We pay for the convenience in the form of heavy losses. When we save one watt at home, we save the equivalent of three watts at the source. We accept such an expense when we get something in return.
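The arithmetic behind the one-watt-to-three-watts claim can be sketched in a few lines. This is only an illustration: it assumes the rough 1/3 end-use fraction quoted above, and the function name `source_energy_saved` is made up for this example.

```python
# Assumption taken from the post: only about 1/3 of the primary energy
# reaches the consumer as usable electricity.
END_USE_FRACTION = 1 / 3


def source_energy_saved(end_use_saved, end_use_fraction=END_USE_FRACTION):
    """Primary energy avoided at the source for a saving made at the point of use."""
    if not 0 < end_use_fraction <= 1:
        raise ValueError("end_use_fraction must be in (0, 1]")
    return end_use_saved / end_use_fraction


saved_at_home = 1.0  # watts saved at home
# With a 1/3 end-use fraction this comes to about 3 watts at the source.
print(round(source_energy_saved(saved_at_home), 2))
```

A different end-use fraction can be passed as the second argument if a better figure for a particular grid is known.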
Consider the conversion activities of the Election Commission (EC) in Karnataka:
- Enumerators collect voter information by visiting homes. Voter data is recorded in Kannada, English, and a mix of the two alphabets. Apart from house addresses, they have no other information when they visit the homes.
- The data recorded on paper is entered into computers in Kannada by data entry operators.
- We have no evidence of quality control during manual collection or digital data entry. There does not seem to be any verification later either.
- Using transcription software, an English version of the voter list is generated. The data takes a journey: Kannada (physical – handwritten) -> Kannada (digital – data entry) -> English (digital – generated).
- The data in English is available for searching on the website: http://184.108.40.206/FinalRollsearch2011/Search.aspx
- The EC generates one voter list PDF document per booth (aka part). As of now, the voter lists of 27 constituencies of Bangalore are in English and Kannada. The remaining 197 have only a Kannada version.
- Besides using the search facilities on the CEO’s site, we can download the PDF files. What can we do with these PDF files?
- Read the soft copy, or print it and read the hard copy. When reading a soft copy, we can find an entry more easily with the search feature of the PDF reader.
- Extract text from the PDF so that it can be processed and uploaded to your own database. The extracted text is not faithful to the original in the PDF file. Data elements get mixed up. One of the reasons is the inconsistent fonts and data formats in the PDF files. We then have to write a program to recover the elements. The patterns of errors keep mounting with each new set of PDF files. The population of Karnataka is reported as 5.273 crore. If we assume that 60% of them are eligible voters, we have more than 3 crore voters. With data this large, created by ad hoc methods, the variety of errors in format and content is no surprise.
- The extracted data can be uploaded to a database and made available to the public on websites. The round trip is: database -> text -> PDF file -> text -> database.
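The voter estimate in the list above is simple arithmetic, and a short sketch makes the scale concrete. The 60% eligibility share is the assumption made in the post, not an official figure, and `estimate_voters` is a name invented for this example.

```python
CRORE = 10_000_000  # 1 crore = 10 million


def estimate_voters(population, eligible_share):
    """Rough voter count: population multiplied by the assumed eligible share."""
    if not 0 <= eligible_share <= 1:
        raise ValueError("eligible_share must be between 0 and 1")
    return population * eligible_share


population = 5.273 * CRORE  # reported population of Karnataka
voters = estimate_voters(population, 0.60)  # 60% is the post's assumption
print(f"{voters / CRORE:.2f} crore voters")  # prints "3.16 crore voters"
```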
What value do we add by these steps?
To analyse the data, we need a processable digital copy, which the EC refuses to give, though we can re-create the voter records with some work. With every conversion, there is possible attenuation and distortion.
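How losses compound over a chain of conversions can be illustrated with a small calculation. The sketch below assumes that each stage damages records independently of the others, and the per-stage accuracy figures are purely hypothetical placeholders, not measurements of the EC's process.

```python
import math


def surviving_fraction(stage_accuracies):
    """Fraction of records expected to pass every stage intact,
    assuming errors at each stage are independent."""
    accuracies = list(stage_accuracies)
    for accuracy in accuracies:
        if not 0 <= accuracy <= 1:
            raise ValueError("each accuracy must be between 0 and 1")
    return math.prod(accuracies)


# Hypothetical accuracies for illustration only; none of these are measured.
stages = {
    "handwritten -> Kannada data entry": 0.98,
    "Kannada -> English transcription": 0.98,
    "database -> PDF": 0.99,
    "PDF -> extracted text": 0.95,
    "extracted text -> database": 0.98,
}

overall = surviving_fraction(stages.values())
print(f"Records surviving all {len(stages)} stages: {overall:.1%}")
```

Even when every individual stage looks respectable, the product is lower than the weakest stage, which is the same effect as the conversion losses in the electricity example.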
This is like: a thermal power station produces electricity -> this electricity is used to generate heat -> the heat is used to generate electricity again. Such activities are acceptable as experiments in school laboratories, but not in an industry-standard production system. Elections are the production system of the democratic process, and one of huge dimensions.
- Should we inject inefficiency through possible loss and corruption of data with each conversion?
- Considering the importance of data, shouldn’t we have quality assurance and quality control practices? Shouldn’t we have external audits?
- Can the citizens be assured of the quality of these important records?
- Don’t we need a data standard and consistency?
Dr. Balu said in a talk yesterday that denying opportunity to a citizen is a crime. Here the EC is denying a large percentage of our population their right to vote, by being unprofessional and callous.
By making it difficult to access, sort and filter the data, the EC is denying information to those who do not know how to extract information from the current mess. It is a competitive advantage for those who are capable of doing it and have the patience and perseverance.
Don’t we often hear complaints about the “apathy of the educated urban middle class” from officials who themselves belong to this class, unless they have amassed wealth by corrupt means to become super-rich? To rub it in further, if we show them their errors, the authorities ignore us. If we offer help, they reject it. They reject us, who can still extract some sense from the pathetic data the authorities hoard.
A voter record is technically very simple, with very few fields. There is no ambiguity about the part of speech; we have only nouns and noun phrases for name, address, age, relationship, sex, and a few numbers.
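To show how small such a record is, here is a minimal sketch of one as a Python data class with a few basic checks. The field names, the sex codes, and the choice of `part_number` and `serial_number` as the "few numbers" are assumptions made for illustration; they are not the EC's actual schema.

```python
from dataclasses import dataclass
from typing import List

MIN_VOTING_AGE = 18              # voting age in India
ALLOWED_SEX = {"M", "F", "O"}    # assumed coding, for illustration only


@dataclass(frozen=True)
class VoterRecord:
    part_number: int     # booth (part) the record belongs to
    serial_number: int   # position within the part's list
    name: str
    relation_name: str   # name of the related person
    relationship: str    # e.g. "father", "husband"
    address: str
    age: int
    sex: str

    def problems(self) -> List[str]:
        """Return human-readable descriptions of obvious defects."""
        found = []
        if not self.name.strip():
            found.append("name is blank")
        if not self.address.strip():
            found.append("address is blank")
        if self.age < MIN_VOTING_AGE:
            found.append(f"age {self.age} is below {MIN_VOTING_AGE}")
        if self.sex not in ALLOWED_SEX:
            found.append(f"unknown sex code {self.sex!r}")
        return found


# Placeholder values, not real voter data.
record = VoterRecord(1, 42, "Sample Name", "Sample Relative", "father",
                     "12 Example Road", 17, "M")
print(record.problems())  # one problem: the age is below 18
```

Checks like these are cheap to run at data entry time, before a record ever reaches a PDF.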
In the next blog post, let us discuss how we can improve the quality of voter lists.