Information collected in the field has to be prepared before it can be analyzed. In this post we briefly outline some guidelines for preparing that information: coding, data file creation, consistency checking and weighting.
Once the data have been collected, codes are assigned to the response categories or alternatives of each question in the measuring instrument. With the categories coded, we proceed to develop the codebook. We will return to the design guidelines for the codebook in another post.
Coding can be done before or after the information is collected in the field. It can be done at questionnaire design time if the categories are pre-coded and the instrument has no open questions.
Post-coding, by contrast, consists of assigning a code to each of the answers that respondents actually gave to the open questions of the questionnaire during the interview. Post-coding is therefore done once all of the information has been collected.
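As a rough illustration of post-coding, the sketch below assigns a numeric code to each distinct answer given to an open question. The function name `build_postcodes` and the sample answers are hypothetical, and it assumes that answers matching after trimming and lowercasing belong to the same category; in practice, similar answers are usually grouped by hand first.

```python
from collections import Counter

def build_postcodes(answers, start=1):
    """Assign a numeric code to each distinct open-ended answer.

    Assumption: answers that are equal after trimming and lowercasing
    are the same category. Codes go to the most frequent answers first.
    """
    normalized = [a.strip().lower() for a in answers if a and a.strip()]
    counts = Counter(normalized)
    # Descending frequency, then alphabetical, so the code order is stable.
    ordered = sorted(counts, key=lambda text: (-counts[text], text))
    return {text: code for code, text in enumerate(ordered, start=start)}

# Hypothetical answers to an open question such as "Why did you choose this product?"
answers = ["Price", "quality", "price ", "Brand", "Quality", "price"]
codes = build_postcodes(answers)
coded = [codes[a.strip().lower()] for a in answers]

print(codes)
print(coded)
```

The resulting dictionary is the piece that would be added to the codebook for that open question.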
The results of the measuring instrument are transferred to a data matrix or database, with the codebook as support. Without the codebook the data cannot be coded, and the data matrix is meaningless. Creating the data file involves two steps: recording the file and defining it.
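To make the role of the codebook concrete, here is a minimal sketch of transferring questionnaire answers to a data matrix. The variables, categories and the missing-value code `9` are invented for the example, not taken from any particular study or statistical package.

```python
# Hypothetical codebook: variable name -> {response category: numeric code}
CODEBOOK = {
    "sex": {"male": 1, "female": 2},
    "satisfaction": {"low": 1, "medium": 2, "high": 3},
}
MISSING = 9  # assumed code for "no answer" or an answer not in the codebook

def encode_record(record, codebook=CODEBOOK, missing=MISSING):
    """Turn one questionnaire (a dict of answers) into a row of codes."""
    row = []
    for variable, categories in codebook.items():
        answer = record.get(variable)
        row.append(categories.get(answer, missing))
    return row

questionnaires = [
    {"sex": "female", "satisfaction": "high"},
    {"sex": "male", "satisfaction": None},  # respondent skipped the question
]
# One row per respondent, one column per variable in codebook order.
matrix = [encode_record(q) for q in questionnaires]
print(matrix)
```

Each row of the matrix only makes sense when read against the codebook, which is why the two have to be kept together.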
Recording the file, in turn, can be done in two ways: manually, i.e. with the computer keyboard, or mechanically, with optical scanners. Optical scanners read the codes and transcribe them at the same time.
The definition of the data file follows protocols specific to each statistical package. These protocols will be covered in detail in later posts.
Once the information has been collected in the field and the data file has been recorded and defined, guidelines or procedures must be established for quality control of the data before analysis. Here are some brief notes on consistency checking of quantitative data.
Consistency checking is without doubt a crucial, complex and laborious stage in preparing information for analysis.
The purpose of consistency checking is to ensure the quality of the information. It is carried out in two stages: the first once the data file has been recorded and defined, and the second through a systematic review of each variable and its relationship with the others.
Once the data file has been designed, i.e. the study variables and their formats have been defined, consistency checking begins with filter and quota controls.
The data must pass a filter control on the selection of the universe under study, to verify that all respondents belong to the target universe.
In addition, the data must pass through a quota control, in order to verify that the composition of the sample matches the sample design.
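The filter and quota controls described above could be sketched as follows. The universe rule (adults aged 18 or over), the quota variable and the planned quotas are assumptions made up for the example.

```python
from collections import Counter

def filter_control(respondents, in_universe):
    """Split respondents into those inside and outside the target universe."""
    valid = [r for r in respondents if in_universe(r)]
    rejected = [r for r in respondents if not in_universe(r)]
    return valid, rejected

def quota_control(respondents, variable, design_quotas):
    """Return observed minus planned counts for each quota cell."""
    observed = Counter(r[variable] for r in respondents)
    cells = set(design_quotas) | set(observed)
    return {
        cell: observed.get(cell, 0) - design_quotas.get(cell, 0)
        for cell in sorted(cells)
    }

# Hypothetical sample; the assumed universe is people aged 18 or over.
sample = [
    {"id": 1, "age": 34, "sex": "female"},
    {"id": 2, "age": 16, "sex": "male"},
    {"id": 3, "age": 52, "sex": "male"},
    {"id": 4, "age": 41, "sex": "female"},
    {"id": 5, "age": 29, "sex": "female"},
]
valid, rejected = filter_control(sample, lambda r: r["age"] >= 18)
# Assumed sample design: 2 women and 2 men.
deviations = quota_control(valid, "sex", {"female": 2, "male": 2})

print([r["id"] for r in rejected])
print(deviations)
```

A deviation of zero in every cell means the sample composition matches the design; positive or negative values show which cells are over- or under-represented.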
In the second stage, the variables are checked using descriptive statistics, frequency distributions of all variables and the relationships between them.
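A small sketch of this second stage: a frequency distribution for one variable, a check for codes outside the codebook, and one cross-variable rule. The variables, the valid codes and the rule (people who are not employed should report zero hours worked) are hypothetical.

```python
from collections import Counter

# Hypothetical valid codes: sex 1 = male, 2 = female; employed 1 = yes, 2 = no.
VALID_CODES = {"sex": {1, 2}, "employed": {1, 2}}

def frequencies(rows, variable):
    """Frequency distribution of one variable, sorted by code."""
    return dict(sorted(Counter(row[variable] for row in rows).items()))

def out_of_range(rows, valid_codes=VALID_CODES):
    """List (id, variable, value) for every code not allowed by the codebook."""
    problems = []
    for row in rows:
        for variable, allowed in valid_codes.items():
            if row[variable] not in allowed:
                problems.append((row["id"], variable, row[variable]))
    return problems

def cross_check(rows):
    """Assumed rule: respondents who are not employed must report 0 hours."""
    return [row["id"] for row in rows if row["employed"] == 2 and row["hours"] != 0]

rows = [
    {"id": 1, "sex": 1, "employed": 1, "hours": 40},
    {"id": 2, "sex": 3, "employed": 2, "hours": 0},   # sex code 3 does not exist
    {"id": 3, "sex": 2, "employed": 2, "hours": 35},  # not employed, yet 35 hours
    {"id": 4, "sex": 2, "employed": 1, "hours": 20},
]

print(frequencies(rows, "sex"))
print(out_of_range(rows))
print(cross_check(rows))
```

The frequency table exposes the stray code at a glance, while the cross-check catches a record whose individual values are all valid but contradict each other.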
These procedures and protocols are critical in the stage prior to analysis, and the consistency checks need to be carried out rigorously and systematically. Once all these steps have been completed, the information is ready to be submitted for analysis.
Finally, the sample obtained in the field should be proportional to the distribution of the universe or population. Therefore, before starting the analysis of the data, we must verify that each respondent in the sample has a weight proportional to the population distribution. The technique of assigning a weight to each respondent is called sample weighting.
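One common way to obtain such weights, shown here only as a sketch, is to divide each group's share in the population by its share in the sample. The function name, the strata and the population shares below are assumptions for the example; real studies often use more elaborate schemes.

```python
from collections import Counter

def proportional_weights(sample_strata, population_shares):
    """Weight per stratum = population share / sample share.

    Assumption: every stratum in the sample has a known population share.
    """
    n = len(sample_strata)
    counts = Counter(sample_strata)
    return {
        stratum: population_shares[stratum] / (counts[stratum] / n)
        for stratum in counts
    }

# Hypothetical sample of 10 respondents: 6 women and 4 men.
sample_strata = ["female"] * 6 + ["male"] * 4
# Assumed population distribution: half women, half men.
population_shares = {"female": 0.5, "male": 0.5}

weights = proportional_weights(sample_strata, population_shares)
# After weighting, the cases still add up to the original sample size.
weighted_total = sum(weights[s] for s in sample_strata)

print(weights)
print(weighted_total)
```

Over-represented groups receive a weight below 1 and under-represented groups a weight above 1, so the weighted sample reproduces the population distribution.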
In later posts we will develop these guidelines for preparing information for analysis in more detail. This post is intended to serve as an introduction to the workshop by Spas Data Analysis. In the first session of the workshop, by way of introduction, we will discuss this and other notes that are essential to the success or failure of any investigation. Until then.