Original studies
Published on November 5, 2025 | http://doi.org/10.5867/medwave.2025.10.3108
Construcción, validación de contenido y confiabilidad de cuestionario para evaluar discriminación laboral percibida en sector minero chileno: estudio de validación
Development, content validity, and reliability of a questionnaire to measure perceived workplace discrimination in the Chilean mining sector: A validation study
Abstract
Introduction: Perceived workplace discrimination is a relevant phenomenon in the field of occupational health, associated with multiple negative effects on work outcomes and on mental and physical health. However, the literature reveals significant limitations in its measurement, such as the absence of validated instruments for this purpose. This study aimed to develop, validate the content, and evaluate the reliability of a questionnaire designed to measure perceived workplace discrimination.
Methods: We designed a questionnaire based on a scoping review of the literature. The instrument consists of two sections. The first section includes four items aimed at assessing the prevalence of perceived workplace discrimination and the associated reasons. The second section comprises ten items designed to characterize the phenomenon using a Likert-type scale. Content validity was assessed by five experts using Aiken’s V coefficient with 95% confidence intervals. Internal consistency of the second section was evaluated through Cronbach’s alpha coefficient. The sample included 86 workers from the mining sector in Chile.
Results: The 14 items in the questionnaire obtained Aiken’s V values ranging from 0.8 to 1, indicating high clarity, pertinence, and relevance. The reliability analysis of the second section yielded a Cronbach’s alpha coefficient of 0.62, considered acceptable for exploratory stages. Opportunities for improvement were identified in some items to enhance internal cohesion in future applications.
Conclusions: The questionnaire demonstrates high content validity and moderate reliability, supporting its preliminary use in research on perceived workplace discrimination. Further studies are recommended to confirm its factor structure and strengthen its psychometric properties in larger and more diverse samples.
Main messages
- The measurement of perceived workplace discrimination has important methodological limitations, including the lack of validated instruments adapted to workplace contexts.
- This study designed and validated a new questionnaire, providing a comprehensive tool to characterize perceived workplace discrimination.
- Internal consistency was moderate, the sample size was limited, and validation was conducted in a single sector, which restricts the generalizability of the findings.
- This tool is a methodological contribution that enables the exploration of the prevalence and implications of perceived discrimination for workers’ mental health, helping to address previously identified limitations and to strengthen the foundations for future research in occupational health.
Introduction
The word “discriminate” comes from the Latin discriminare, which means to separate or distinguish, a meaning that in its etymological origin did not necessarily imply a moral judgment. The dictionary of the Royal Spanish Academy likewise lists it as synonymous with marginalize, exclude, and separate, among others. However, the dictionary also provides a second definition with a negative moral connotation, indicating unequal treatment of a person or group based on race, religion, politics, sex, age, physical or mental condition, among others [1]. It is precisely this unequal or unfair treatment that characterizes the definition of perceived discrimination. Thus, discrimination has been defined as the “process by which one or more members of a socially defined group are treated differently, and especially unfairly, because of belonging to that group” [2]. Likewise, perceived discrimination has been conceptualized as a behavioral manifestation of a negative attitude, judgment, or unfair treatment toward members of a group [3]. In the workplace, perceived discrimination refers specifically to unfair or negative treatment of employees based on individual characteristics unrelated to job performance [4].
Perceived workplace discrimination is a problem in the field of occupational health, given its negative impact on the well-being of workers. Numerous studies have documented that it adversely affects work outcomes and mental and physical health [5,6,7,8]. For example, it has been observed that both overt and subtle discrimination impair workers' psychological and physical health, affecting performance-related outcomes. Although subtle discrimination may be less obvious, its consequences are equally harmful [7]. Furthermore, it has been suggested that mechanisms such as work stress and perceived injustice explain the association between discrimination and adverse health and work performance outcomes [6].
Given the substantial interest in perceived discrimination over recent decades, numerous questionnaires have been developed to measure this construct. However, significant challenges remain in capturing the phenomenon as accurately as possible, particularly in occupational settings. The review by Shen and Dhanani [9] highlights multiple problems in measuring workplace discrimination, including the lack of consensus on its definition, uncertainty about who experiences or perpetrates it, heterogeneity in its forms and contexts, and biases inherent to self-report. These shortcomings hinder the precise characterization of the phenomenon and limit the design of effective interventions. Allen [10], in a non-occupational context, similarly highlights the lack of methodological consensus and the biases of minimization or hypervigilance, further noting that instrument reliability is uneven and that the form of discrimination (chronic vs acute) can significantly influence health outcomes. These difficulties are consistent with those described in the scoping review conducted by Bravo et al. [5], which evidenced that more than 90% of studies lack a clear definition of perceived workplace discrimination, that comprehensive, validated instruments are lacking, and that essential aspects—such as timing of occurrence, perpetrator, frequency, and the consequences of discrimination—are often omitted. These findings underscore the urgent need to develop valid and comprehensive instruments for measurement in workplace contexts.
In this context, this article reports the results of a study that designed and evaluated the content validity and reliability of a questionnaire to measure perceived workplace discrimination in all its complexity. This tool is a methodological contribution that allows for exploring its prevalence and implications for workers' mental health. It helps to overcome previously identified limitations and strengthens the foundations for future research in occupational health.
Methods
Design
This instrumental validation study, with a cross-sectional design and exploratory scope, sought to design and validate a questionnaire to measure perceived workplace discrimination in the mining sector in Chile. The instrument was developed based on a comprehensive review of the scientific literature and was refined by the research team. Subsequently, five external experts evaluated its content validity, providing observations that allowed the final version of the questionnaire to be optimized. The validation involved evaluating content validity by analyzing the clarity, pertinence, and relevance of the items using Aiken’s V coefficient, and assessing reliability using Cronbach’s alpha coefficient for internal consistency.
Population and sample
Content validation was carried out by five experts in organizational psychology, instrument design, and occupational health. To assess internal consistency, workers from a large mining company in Chile were recruited and responded via an online platform. The inclusion criteria were being over 18 years of age, being actively employed by the principal company or an associated contractor, and having experienced perceived workplace discrimination. All participants gave their informed consent before participating. Incomplete questionnaires were excluded to ensure the quality of the data analyzed.
The internal consistency analysis considered the 10 items characterizing workplace discrimination, with a Likert-type response format. Non-probability convenience sampling was used. The final sample included 86 workers, which is adequate for exploratory studies, given that a minimum of 50 participants is recommended to estimate coefficients such as Cronbach’s alpha [11].
Development of the instrument
The questionnaire was developed from a scoping review of the literature on perceived workplace discrimination previously published by this research team [5]. That review identified methodological limitations such as the absence of validated instruments and the need to address multiple characteristics of the phenomenon (Appendix 1).
Initially, 19 items were generated to achieve broad coverage of the construct (Appendix 2), following methodological recommendations for scale development [12]. The instrument incorporated the definition of perceived workplace discrimination as “unfair or negative treatment of workers based on individual characteristics or membership in a social group, unrelated to job performance.” Then, through author consensus, the questionnaire was reduced to 15 items by merging those related to frequency, severity, and relation to job performance (Appendix 3). After expert panel review, one item on verbal aggression was removed due to overlap with psychological aggression, yielding a final 14-item version (Appendix 4). The wording was refined, including the addition of the second-person formal pronoun “usted,” the differentiation between formal and informal settings, the merging of sex and gender, and the adjustment of impact scales. Open-ended items were also included to detail perceived effects. Although splitting items between observed and personally experienced events was considered, a single structure was retained for parsimony. In line with methodological recommendations for exploratory stages [12], the instrument was designed with a unidimensional approach, treating perceived workplace discrimination as a general construct.
Accordingly, the instrument was organized into two sections (Appendix 4). The first section includes four items aimed at assessing the prevalence and reasons for discrimination directed both at third parties and at the respondent. Possible reasons considered were sex/gender, sexual orientation, age, migrant status, ethnicity/race, physical appearance, disability, socioeconomic status, level or place of study, and, additionally, an “other” option allowing specification of a different reason. The second section, addressed to those who reported having experienced discrimination, comprises 10 items (items 5 to 14) that characterize the phenomenon in terms of perpetrator, settings, frequency, type, and effects. These 10 items were answered using a 4-point Likert-type scale coded numerically from 0 to 3:
- 0: strongly disagree.
- 1: disagree.
- 2: agree.
- 3: strongly agree.
The frequency item used the following coding:
- 0: a couple of times per year.
- 1: a couple of times per month.
- 2: a couple of times per week.
- 3: daily.
The items on health impact and job performance used:
- 0: it has not affected me.
- 1: mild.
- 2: severe.
- 3: very severe.
Accordingly, the total score for this section can range from 0 to 30 points, with higher values indicating greater severity of perceived discrimination. A four-point scale was chosen because, when a midpoint is included in five-category scales, many participants tend to select it. This can reduce the instrument’s ability to clearly distinguish respondents’ positions, as it dampens differences between responses and may hinder interpretation of results, especially in studies aiming to measure attitudes or opinions [13]. In summary, the first section serves as an initial screening, enabling the identification of individuals who report experiencing or observing discrimination and the reasons attributed to it. Meanwhile, the second section constitutes the scale that characterizes the phenomenon (perpetrator, settings, frequency, type, and effects) and serves as the questionnaire’s quantitative scale.
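The scoring rule described above (ten items, each coded 0 to 3, summed to a 0 to 30 total) can be sketched as follows. This is only an illustrative sketch: the function name `section_two_score` and the constants are hypothetical and are not part of the published instrument.

```python
from typing import Sequence

# Assumed constants for illustration, taken from the description in the text.
N_ITEMS = 10               # items 5 to 14 of the questionnaire
MIN_CODE, MAX_CODE = 0, 3  # numeric coding of each 4-point response


def section_two_score(responses: Sequence[int]) -> int:
    """Sum the ten item codes (0-3 each) into a 0-30 total score."""
    if len(responses) != N_ITEMS:
        raise ValueError(f"expected {N_ITEMS} responses, got {len(responses)}")
    for value in responses:
        if not MIN_CODE <= value <= MAX_CODE:
            raise ValueError(f"response code out of range: {value}")
    return sum(responses)
```

Rejecting incomplete or out-of-range response vectors mirrors the exclusion of incomplete questionnaires, so that every total is comparable on the same 0 to 30 range.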
Content validity through expert opinion
The content validation process involved five expert judges, each with 15 to 30 years of professional and academic experience. They included: a PhD in human resources psychology; a PhD in social and organizational behavior; a holder of a master’s degree in people management with a bachelor’s degree in social sciences; a PhD in statistics with a BA in mathematics; and a PhD in psychology.
The instrument was sent to the expert judges via a link distributed by email through the SurveyMonkey platform. Their task was to evaluate the questionnaire in detail, applying rigorous criteria to ensure its quality. To assess agreement among the expert judges, each item was rated on a 1 to 4 scale:
- 1: very inadequate
- 2: inadequate
- 3: adequate
- 4: very adequate
Three core criteria were considered: clarity, pertinence, and relevance. For this purpose, the judges were provided with the definition of each criterion and its associated question. Clarity refers to whether the item is easily understandable to respondents, without ambiguities. The question used was: “Is the item worded in a way that participants can easily understand?” Pertinence indicates whether the item effectively measures aspects of the construct of perceived workplace discrimination; the question was: “Do you consider the item appropriate for measuring the construct of perceived workplace discrimination?” Relevance evaluates whether the item is applicable and meaningful within the context of workplace discrimination; the question was: “Do you believe the item is applicable and meaningful for a population of workers in mining?”
After rating the three criteria (clarity, pertinence, and relevance), the expert judges could add comments on each item.
Subsequently, based on the expert-judgment results, Aiken’s V coefficient was calculated to determine content validity, using the following formula: $V = (\bar{X} - l)/k$
Where $\bar{X}$ denotes the sample mean of the expert judges’ ratings, $l$ is the lowest possible rating (in this case, 1), and $k$ is the range of possible values on the scale used (in this case, 3). Aiken’s V provides a measure ranging from 0 to 1. A value of 0 is obtained when all judges select the lowest possible rating; a value of 1 is obtained when all judges select the highest possible rating. Accordingly, Aiken’s V was calculated for each item, together with 95% confidence intervals following the methodology proposed by Penfield and Giacobbi. The null hypothesis was that the population value of Aiken’s V would be equal to or less than 0.6. Items whose estimated Aiken’s V had a 95% confidence interval that included values ≤ 0.6 were considered for revision or elimination, whereas items whose 95% confidence interval lay entirely above 0.6 were accepted as valid [14].
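As an illustration of the calculation just described, the sketch below computes Aiken’s V for one item and a score-type confidence interval in the form usually attributed to Penfield and Giacobbi. The ratings are hypothetical, the function names are made up for this example, and the code is not the authors’ Stata routine.

```python
from math import sqrt
from statistics import NormalDist, mean
from typing import Sequence, Tuple


def aiken_v(ratings: Sequence[int], lowest: int = 1, highest: int = 4) -> float:
    """Aiken's V = (mean rating - lowest rating) / range of the scale."""
    k = highest - lowest
    return (mean(ratings) - lowest) / k


def aiken_v_interval(
    ratings: Sequence[int],
    lowest: int = 1,
    highest: int = 4,
    confidence: float = 0.95,
) -> Tuple[float, float]:
    """Score confidence interval for V, with n judges and scale range k."""
    n = len(ratings)
    k = highest - lowest
    v = aiken_v(ratings, lowest, highest)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    nk = n * k
    centre = 2 * nk * v + z ** 2
    # max(...) guards against a tiny negative value from rounding when v == 1
    spread = z * sqrt(max(0.0, 4 * nk * v * (1 - v)) + z ** 2)
    denom = 2 * (nk + z ** 2)
    return (centre - spread) / denom, (centre + spread) / denom


# Hypothetical ratings from five judges on the 1-4 scale (not study data).
example_ratings = [4, 4, 4, 4, 3]
v = aiken_v(example_ratings)
lower, upper = aiken_v_interval(example_ratings)
item_accepted = lower > 0.6  # decision rule: interval entirely above 0.6
```

For these hypothetical ratings, V is roughly 0.93 with an interval of about 0.70 to 0.99, so the item would be accepted under the 0.6 rule.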
During the content validation phase, the experts made observations that helped optimize the instrument’s wording, content, and structure. Several items were refined, including the incorporation of the formal second-person pronoun “usted” in key questions and clarification of terms such as “formal” and “informal” settings. Regarding response options, the sex and gender categories were merged. Following the conceptual review, the verbal aggression item was removed due to overlap with psychological aggression. The health-impact and job-performance scales were also adjusted to differentiate severity levels better, and open-ended items were added to capture perceived effects. Finally, although the possibility of splitting items between experienced and observed discrimination was considered, a single structure was retained to preserve parsimony.
After incorporating the experts’ suggestions and the described refinement process, the final version of the questionnaire was structured (Appendix 4). It comprised 14 items distributed across two main parts: a section on the prevalence of perceived discrimination and a section characterizing discriminatory acts.
Reliability: internal consistency
The reliability of an instrument refers to the degree to which its measurements are consistent and free from random error [15]. One dimension is internal consistency, which assesses whether the items measuring the same construct are homogeneous [15]. Cronbach’s α (alpha), a weighted mean of inter-item correlations, is widely used to estimate internal consistency. Its formula is: $\alpha = \frac{k}{k-1}\left(1 - \frac{\sum_{i=1}^{k} S_i^2}{S_T^2}\right)$
In this formula, $k$ is the number of items in the scale, $S_i^2$ is the variance of item $i$, and $S_T^2$ is the variance of the total score.
Statistical analysis
Data analysis was performed using Stata version 16.01. For content validation, Aiken’s V coefficients with 95% confidence intervals were calculated to evaluate items in terms of clarity, pertinence, and relevance according to expert-judge assessments. For internal consistency, Cronbach’s α was calculated considering only the 10 items that characterize workplace discrimination (Appendix 4). These items were answered on 4-point Likert-type scales, all coded from 0 to 3. They were treated as ordinal variables in a single administration of the instrument, with a total score ranging from 0 to 30 points.
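The internal-consistency computation can be sketched in a few lines from the variance form of Cronbach’s α given in the reliability section. This is a minimal sketch under stated assumptions: the analysis itself was run in Stata, the function name `cronbach_alpha` is hypothetical, and the input is assumed to be one complete row of item scores per respondent.

```python
from statistics import variance
from typing import Sequence


def cronbach_alpha(rows: Sequence[Sequence[float]]) -> float:
    """Cronbach's alpha from a respondents-by-items table of scores.

    alpha = k / (k - 1) * (1 - sum of item variances / variance of totals)
    """
    if len(rows) < 2:
        raise ValueError("at least two respondents are required")
    k = len(rows[0])
    if k < 2:
        raise ValueError("at least two items are required")
    if any(len(row) != k for row in rows):
        raise ValueError("every respondent must answer every item")

    items = list(zip(*rows))                       # one tuple per item
    item_variance_sum = sum(variance(col) for col in items)
    total_variance = variance([sum(row) for row in rows])
    if total_variance == 0:
        raise ValueError("total scores have zero variance")
    return (k / (k - 1)) * (1 - item_variance_sum / total_variance)
```

Sample variances (denominator n - 1) are used for both the items and the total score; using population variances throughout would give the same α, because the ratio is unchanged.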
Results
Participant characteristics
Table 1 presents the characteristics of the 86 participants included in the analysis, all from the mining sector in Chile. The initial sample comprised 91 workers; however, five cases were excluded due to incomplete data. The mean age was 45 years (range: 26 to 65 years). Regarding sex, most participants were men, while approximately one-quarter were women. In terms of marital status, about half reported being married, whereas 35% reported being single. Concerning parenthood, 83% indicated they had children. A large proportion worked under a shift system, whereas 36% did not. With respect to job position, half performed field duties, followed by 33% who alternated between administrative and field roles. Finally, 72% of participants were employed by the owner company, and 28% worked for contractor firms. A descriptive analysis of item-response distributions is presented in Appendix 5.
Content validity
The assessment conducted by the expert judges led to optimized item wording. Accordingly, the questionnaire was reduced from 15 to 14 items after removing one that addressed verbal aggression due to conceptual overlap with the psychological aggression item. After two rounds of review, content validity was subsequently evaluated for the 14 final items.
The Aiken’s V coefficients, presented in Table 2, ranged from 0.8 to 1.0, indicating a high level of agreement among the judges regarding item clarity, pertinence, and relevance. Clarity exceeded 0.8 in all cases, with complete agreement for items 1, 2, and 4. Pertinence and relevance reached the maximum value for all items, with Aiken’s V equal to 1.0 across the board.
All 95% confidence intervals associated with Aiken’s V exceeded the cutoff for item review or removal. Taken together, these results indicate that the evaluated items satisfactorily met the established content validity criteria.
Reliability
Overall internal consistency of the questionnaire was assessed using Cronbach’s α, considering only the 10 items corresponding to the characterization of discriminatory acts. The value obtained was α = 0.62, indicating moderate internal consistency, which is considered acceptable in exploratory instrument-validation studies.
Additionally, item performance regarding internal consistency was examined using corrected item–total correlations and the impact on α if the item were deleted; results are shown in Table 3. Item–test correlations ranged from 0.31 (item 7) to 0.62 (item 13), while item–rest correlations ranged from 0.12 (item 7) to 0.49 (item 13). The “α if item deleted” analysis showed that removing item 5 would slightly increase α to 0.64, the highest value observed in this comparison. Deleting other items would not yield substantial improvements in internal consistency.
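The three item-level diagnostics named above (item-test correlation, item-rest correlation, and α if the item is deleted) can be illustrated as follows. The data are hypothetical, the helper names are invented for this sketch, and the code does not reproduce the values in Table 3.

```python
from math import sqrt
from statistics import mean, variance
from typing import Dict, List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation between two equally long score vectors."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def cronbach_alpha(rows: Sequence[Sequence[float]]) -> float:
    """Variance form of Cronbach's alpha (respondents in rows, items in columns)."""
    k = len(rows[0])
    item_variance_sum = sum(variance(col) for col in zip(*rows))
    total_variance = variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - item_variance_sum / total_variance)


def item_diagnostics(rows: Sequence[Sequence[float]]) -> List[Dict[str, float]]:
    """Per item: correlation with the total, with the rest, and alpha if deleted."""
    k = len(rows[0])
    totals = [sum(row) for row in rows]
    results = []
    for j in range(k):
        item = [row[j] for row in rows]
        rest = [t - x for t, x in zip(totals, item)]      # total without item j
        reduced = [[x for i, x in enumerate(row) if i != j] for row in rows]
        results.append({
            "item_test": pearson(item, totals),
            "item_rest": pearson(item, rest),
            "alpha_if_deleted": cronbach_alpha(reduced),
        })
    return results


# Hypothetical data: six respondents, four items coded 0-3 (not study data).
# The fourth item is deliberately unrelated to the other three.
example_rows = [
    [0, 1, 0, 3],
    [1, 1, 0, 2],
    [1, 2, 1, 0],
    [2, 2, 1, 1],
    [3, 3, 2, 2],
    [3, 2, 3, 0],
]
diagnostics = item_diagnostics(example_rows)
```

The item-rest correlation removes the item from the total before correlating, which avoids the inflation that comes from correlating an item with a sum that contains it. In the hypothetical data the fourth item has a negative item-rest correlation, and deleting it raises α well above the full-scale value.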
Discussion
This study represents a step forward in the preliminary validation of a questionnaire on perceived workplace discrimination in the Chilean mining sector. A rigorous approach was used for construction, content validation, and reliability, in line with methodological recommendations [12]. Notably, the development of this instrument allows inquiry into dimensions rarely addressed in previous questionnaires, such as identifying the perpetrator, distinguishing between formal and informal settings, and assessing effects on health and job performance. It also incorporates an operational definition of the construct, perceived reasons for discrimination, and the period during which it occurs, thereby enhancing its capacity to characterize the phenomenon in all its complexity. By contrast, many prior studies have used instruments not explicitly designed for workplace contexts, such as the Everyday Discrimination Scale [18], or instruments focused on a particular type of discrimination [19], limiting their ability to capture the diversity of experiences present in the workplace.
Expert-judge assessment showed high levels of content validity for clarity, pertinence, and relevance, with Aiken’s V ranging from 0.8 to 1. Previous studies have used cutoffs between 0.7 and 0.8 as indicative of adequate content validity, reflecting high inter-judge agreement [20,21]. A value ≥ 0.8 indicates high validity by expert judgment, though it does not by itself guarantee the instrument’s overall validity. To obtain more precise estimates of population values of Aiken’s V, 95% confidence intervals were employed, deeming items valid when their interval lay entirely above 0.6, following Penfield and Giacobbi’s approach [14]. This criterion, more stringent than the classic 0.5 threshold [22], sought to increase the precision of estimates and consensus among judges, thereby strengthening content validity.
Internal consistency analysis yielded a Cronbach’s α of 0.62, considered acceptable in exploratory-phase studies. Corrected item–total correlations indicated a reasonably homogeneous internal structure, although areas for improvement were observed that could be refined in future applications. Consistent with methodological recommendations for early stages of scale development [12], the questionnaire was designed under a unidimensional conceptualization of perceived workplace discrimination. Nonetheless, there is a recognized need for future factorial studies to explore potential multidimensional structures of the construct in broader and more diverse working populations.
This study has several limitations. First, although content validity demonstrated high levels of expert agreement, no cognitive interviews were conducted with the target population, a technically recommended step to ensure appropriate item interpretation [12]. Alongside quantitative evaluation via Aiken’s V, qualitative observations from experts were considered to refine wording, improve clarity, and ensure cultural and conceptual pertinence; however, this was not complemented by a formal, systematic qualitative analysis, which can be viewed as a minor limitation in strengthening the validation process.
Second, internal consistency was estimated exclusively with Cronbach’s α, which assumes unidimensionality and tau-equivalence across items. These conditions were not empirically verified due to the sample size, which was insufficient for exploratory factor analysis. This limits certainty about the instrument’s internal structure and homogeneity. Future studies should use larger samples to conduct factor analyses and should complement α with coefficients such as ω (omega), which provide more robust estimates under less restrictive assumptions. In particular, ω is advisable when one or more conditions for α are violated, such as essential tau-equivalence (e.g., unequal factor loadings), the use of ordinal scales with fewer than 5 points, the presence of correlated errors, or small sample sizes. Unlike α, ω can yield more accurate estimates even when items do not contribute equally to the total score, making it especially suitable when such limitations are detected [23].
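For reference, the contrast between α and ω can be made concrete with the usual single-factor expression for ω. The notation below ($\lambda_i$ for the factor loading of item $i$, $\psi_i$ for its unique variance) is an assumption of this sketch; no factor model was fitted in the present study.

```latex
\omega = \frac{\left(\sum_{i=1}^{k} \lambda_i\right)^{2}}
              {\left(\sum_{i=1}^{k} \lambda_i\right)^{2} + \sum_{i=1}^{k} \psi_i}
```

When all loadings are equal (tau-equivalence) and errors are uncorrelated, ω and α coincide; with unequal loadings, α tends to underestimate reliability while ω does not rely on that equality.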
Third, although 4-point Likert scales facilitated comprehension and forced respondents to take a position by removing a neutral option, they may have limited the instrument’s variability and sensitivity to capture nuances in perceptions of workplace discrimination. As DeVellis and Thorpe recommend [12], scales with five or more points could increase observed variance and improve precision, provided categories are clear, equidistant, and manageable. Future studies could explore this alternative, balancing psychometric sensitivity and cognitive load.
Finally, because validation was carried out exclusively among workers in the mining sector, the performance of the instrument in other work contexts is unknown. Nonetheless, its wording is sufficiently general to be applicable across diverse productive and occupational sectors without substantive adaptations, as it addresses cross-cutting dimensions such as type of discrimination, perpetrator, settings, frequency, reasons, and effects on health and performance.
The sample was a non-probabilistic convenience sample, and its split between owner-company and contractor workers was the inverse of the sector's: the sample comprised 72% owner-company employees and 28% contractor workers, whereas the sector is roughly 70% contractor workers and 30% owner-company employees [24]. This may limit the representativeness and generalizability of the findings. Applications in other populations are recommended to confirm psychometric properties via confirmatory factor analysis and additional reliability estimates. Future applications may also lead to the modification or removal of items with low contribution to the construct, thereby optimizing internal consistency and parsimony without losing the instrument’s capacity to characterize perceived workplace discrimination in all its complexity.
Conclusions
This study provides preliminary evidence of the validity and reliability of a questionnaire to measure perceived workplace discrimination in the Chilean mining sector. The results show high content validity, with expert evaluations supporting item clarity, relevance, and pertinence.
We recommend using the instrument in its entirety as an exploratory tool to describe and characterize perceived workplace discrimination. However, the second part cannot yet be considered a validated scale.