Henri Coandă Airport: Travelers’ Experiences Revealed Through AI Conversations
How can technology enhance qualitative research?
This question is answered by our recent study conducted with the help of an AI moderator. The moderator engaged in discussions with eight travelers about their experiences at Henri Coandă Airport in Bucharest. Each response revealed unique perspectives and highlighted both the strengths and weaknesses of this important Romanian air hub.
By integrating LLM (Large Language Model) technology into the research process, we gained a clear understanding of passengers’ perceptions in a very short time and with minimal resources. This is a great example of how technology can breathe new life into traditional qualitative research, providing rich, detailed insights rapidly. So, what did travelers have to say about their experiences, and what can you expect if you’re traveling through Henri Coandă Airport in Bucharest?
Travelers’ Pain Points
Crowding
One of the most frequently mentioned issues was overcrowding. “I felt rushed and anxious, like I might miss my flight due to the disorganized boarding and deboarding processes,” said one respondent. Long queues at security checkpoints were also cited as a source of frustration.
Facility Issues
Several travelers highlighted shortcomings in the airport’s facilities. Criticisms ranged from broken monitors, which forced one respondent to return to the duty-free area for flight information, to dirty and insufficient toilets. “The restrooms are small, dirty, and almost impossible to access with large luggage,” noted another participant.
Staff Attitude
Another sore point was the behavior of airport staff. One respondent remarked that “staff seemed to treat passengers like a nuisance rather than valuable customers.” A hurried and impolite demeanor was frequently mentioned.
Other Observations
High prices at restaurants, a lack of seating, and inconsistent indoor temperatures also made the list of grievances. One traveler suggested, “More buses are needed for passenger transport.”
Positive Aspects of the Airport
Organization and Efficiency
Some respondents appreciated that, despite the queues, they were managed quickly. “Everything is very clear, and the line at security doesn’t take long,” remarked a participant.
The lack of passport checks due to Schengen membership was noted as a benefit in streamlining traffic and reducing bottlenecks.
Temperature
Climate control was one of the praised aspects, with a respondent stating: “The temperature was perfect—not too hot, not too cold.”
How Henri Coandă Compares to Other Airports
Henri Coandă Airport was described as “modest” and “inferior” compared to other international airports. “It’s like the country cousin,” said one respondent, reflecting the lack of modern facilities. Other travelers unfavorably compared it to airports in London, emphasizing the lack of diverse restaurants, shops, and automated internal transport.
Restaurants: Prices and Preferences
Travelers expressed a desire for more affordable prices and varied menus. “Normal prices, not airport prices!” was a frequent comment. They also noted a lack of vegan options and traditional Romanian dishes.
Security and Check-In: A Mixed Bag
The security process drew both praise and criticism. While some appreciated the organization, others complained about cumbersome procedures and a lack of friendliness from staff. Regarding check-in, most respondents preferred online options, but delays in baggage handling were a significant issue.
What’s Next for Henri Coandă Airport?
This study highlights key areas where the airport can make significant improvements. From cleanliness and facilities to staff attitude and crowd management, there are numerous opportunities to transform Henri Coandă into a more welcoming and efficient air hub.
The Enigma of the Response Rate and How to Ensure Your Sample Doesn't Mislead You
A client asked me whether, with a 9% response rate, the resulting sample from surveying a customer database can still be considered representative.
Situation:
We have 1,400 unique contacts—customers who purchased online in 2023. The product is industrial, sold in large volumes but with low frequency, meaning customers return to purchase multiple times a year only if they are professionals or B2B buyers.
This is the first time the client has conducted this survey and has no prior information about the response rate. Some customers have purchased multiple times, while others have purchased only once. Their purchase experience may be recent or from some time ago.
I propose you take 5 minutes to read through my reasoning and the response I gave to the client, who is keen on doing things the right way.
The answer comes from practical realities, not just what statistics theory teaches us. We build our reasoning based on theory and validate or invalidate the sampling parameters to conclude whether the sample is “good.”
Finite Population Context:
It’s important to note that we are sampling from a finite population. We have a limited number of people we can approach and convince to respond to our survey. In infinite population sampling, volume helps significantly—if someone refuses, we move on to the next, like a bottomless bag.
Approach Method:
Phone, email, or SMS? If I send the survey invitation via email, do I have a guarantee that the message reaches everyone’s inbox? A phone approach is the best option because human interaction generates more responses. SMS ensures that the survey link reaches everyone in the contact base. The real question is who will respond.
Experience Status
How recent the customer’s purchase experience is affects both the response rate and the relevance of the answers.
There are newer and older contacts, and we will see who is more likely to respond to the survey.
Difficulty/Simplicity of the Questionnaire:
This factor greatly impacts the response rate and can lead to high abandonment rates. Customers may intend to complete the survey but give up midway.
Survey Invitation Recipients:
Should I pre-select contacts to send the invitation to, or should I send it to everyone at once? I can send the invitations in batches, provided these batches are randomly selected and maintain the structure of the contact database. Alternatively, I could use the purchase month/period as a criterion to build the batches. In both cases, I have a chance to monitor the response rate dynamics and adjust my approach.
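The batch idea above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the contact records, the "type" field, and the 1,000/400 split between one-time and repeat buyers are hypothetical, and `stratified_batches` is a helper name invented for this example. Contacts are shuffled within each group and then dealt out in turn, so every batch keeps roughly the same structure as the full database.

```python
import random
from collections import defaultdict

def stratified_batches(contacts, key, n_batches, seed=0):
    """Split contacts into random batches that mirror the database structure."""
    rng = random.Random(seed)
    by_stratum = defaultdict(list)
    for contact in contacts:
        by_stratum[key(contact)].append(contact)

    batches = [[] for _ in range(n_batches)]
    offset = 0  # keeps batch sizes balanced across strata
    for members in by_stratum.values():
        rng.shuffle(members)  # random order inside each stratum
        for i, contact in enumerate(members):
            batches[(offset + i) % n_batches].append(contact)
        offset += len(members)
    return batches

# Hypothetical database: 1,400 contacts, 400 of them repeat buyers.
contacts = [
    {"id": i, "type": "one_time" if i < 1000 else "repeat"}
    for i in range(1400)
]
batches = stratified_batches(contacts, key=lambda c: c["type"], n_batches=4)
for number, batch in enumerate(batches, start=1):
    repeat = sum(c["type"] == "repeat" for c in batch)
    print(f"batch {number}: {len(batch)} contacts, {repeat} repeat buyers")
```

The same function works with purchase month as the key, which corresponds to the second option mentioned above.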
Sample Objective and Tolerance for Error:
I want a robust sample that allows me to analyze data by sub-groups.
Let’s say I want a sample of 600 customers, which requires a response rate of roughly 40% (600 out of 1,400 is about 43%). Great! This also gives me a small sampling error of +/- 3%.
What are the alternatives? I could accept a larger error tolerance, say +/- 5%, and settle for a sample of 300, which requires a response rate of roughly 20% (about 21%).
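For readers who want to check these figures, here is a minimal sketch. It assumes the usual worst case for a proportion (p = 0.5), a 95% confidence level (z = 1.96), and a finite population correction for the 1,400 contacts. These assumptions are mine; with them, the calculation lands on roughly the errors quoted above.

```python
import math

def margin_of_error(n, population, p=0.5, z=1.96):
    """Maximum sampling error for a proportion, with finite population correction."""
    standard_error = math.sqrt(p * (1 - p) / n)
    fpc = math.sqrt((population - n) / (population - 1))
    return z * standard_error * fpc

POPULATION = 1400  # unique contacts in the database

for n in (600, 300):
    rate = n / POPULATION
    error = margin_of_error(n, POPULATION)
    print(f"n={n}: response rate needed {rate:.0%}, error +/- {error:.1%}")
```

Without the finite population correction the errors would be larger (about 4% and 5.7%), which is why sampling from a small, closed database is more forgiving than the textbook infinite-population case.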
Response Rate Objective:
It’s hard to decide without benchmarks, even approximate ones. I know how it feels—just like when I’m guessing the incidence rate of a product’s consumption.
Phone approach: If phone numbers are accurate, the response rate is around 30%.
Online panels: Typically have a response rate of around 10%.
Email approach: Response rates are lower due to spam/junk filters. For direct email marketing, response rates are about 1%.
When you have no idea, you can set a desired sample size. Ultimately, the final number of responses is the most important factor for the analysis you plan to conduct, and the response rate will result from this.
Target Population Profiling:
What variables are important for my business that are available in the database and can be used for minimal segmentation? It’s important to know my resources, understand who I’m asking for feedback, and anticipate who will respond. Profiling is critical for representativeness—ensuring the sample reflects the target population’s structure after selection.
Sample Profiling:
The descriptive variables of the target population also apply to the sample. I monitor daily progress and check for deviations between the population structure and the sample structure. I hope there’s no sub-group of customers stubbornly refusing to respond. If that happens, I need to get creative in motivating their participation.
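The daily check can be as simple as comparing shares. The sketch below uses made-up numbers (purchase recency as the profiling variable and a split of 126 responses) purely for illustration, and `structure_gaps` is a hypothetical helper, not part of any survey tool.

```python
def structure_gaps(population_counts, sample_counts):
    """Sample share minus population share for each sub-group."""
    population_total = sum(population_counts.values())
    sample_total = sum(sample_counts.values())
    gaps = {}
    for group, count in population_counts.items():
        population_share = count / population_total
        sample_share = sample_counts.get(group, 0) / sample_total
        gaps[group] = sample_share - population_share
    return gaps

# Hypothetical figures: database of 1,400 contacts, 126 responses so far.
population = {"last 3 months": 350, "4-6 months": 350, "7-12 months": 700}
sample = {"last 3 months": 63, "4-6 months": 32, "7-12 months": 31}

gaps = structure_gaps(population, sample)
for group, gap in gaps.items():
    print(f"{group}: {gap:+.1%}")
```

A large positive gap means the group is over-represented among respondents; a large negative gap points to the sub-group that needs extra attention.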
Survey Participant Rewards:
Are rewards a good strategy for increasing the response rate? If so, what kind of rewards should I offer?
In my opinion, rewards should be seen as tokens of appreciation, not incentives. They can be symbolic gestures, especially when surveying customers who’ve already had an experience with your business.
There’s no need to turn customer interaction into a condition or irresistible attraction, as it’s preferable for the results to reflect reality.
Monitoring Field Parameters:
Also known as a field report, this involves tracking statistics on the status of the contacts you invited to participate—who abandoned (and at what question), who completed. Unfortunately, you’ll never know how many messages landed in the inbox versus the promotions folder. For this reason, the SMS approach (with a link in the message) is better. By phone, you’ll know exactly who refused outright, who abandoned, and which phone numbers were incorrect.
Final Thoughts:
Returning to the challenge: Is a 9% response rate enough to ensure a representative sample?
Strictly in this context, a higher response rate is desirable—around 20%. It depends heavily on the reference point and what you’re comparing it to. The answer might lie in profiling. For example, you might find that most responses came from customers who purchased within the last three months. In this case, you can adjust the response rate calculation base and redefine the target population. This increases the response rate.
If we consider that the invitation was sent via email, then a 9% response rate could be considered good. The question remains whether the sample of 126 respondents is enough to answer the business objectives. Profiling will again provide the answer regarding representativeness.
Statistics remains fascinating for me because it allows us to craft the story of a representative sample in a given context. Even if I’m unlucky and don’t find a satisfying story that checks all the above boxes, I can be content with the opportunity to experiment and learn. 126 responses are not to be dismissed… Next time, I’m sure to craft a better story and know how to increase the response rate.
Sampling Error in Estimating a Mean Value
When we sample a subset from a population instead of surveying the entire population, we aim for the measured value in the sample to be as close as possible to the true value in the population. When we say "value," we refer either to an average (e.g., the average height of individuals in the population) or a proportion (e.g., the proportion of people who drink coffee daily). The difference between the measured value in the sample, let’s call it x′, and the value in the population, x, represents the sampling error.
Some samples may better reflect the data of the population they are drawn from, while others may not. This brings us to the concept of representativeness.
Let’s imagine that from a population of size N, we repeatedly draw different samples—let’s say k samples—each of size n, until we exhaust all individuals in the population. For each sample, the variable we measure will yield an average x1, x2... xk. If, after drawing each sample, we calculate the average of the sample means obtained, we will notice that as we add more values from successive samples to our calculation, the result converges closer to the true value in the population.
In the case of simple random sampling, the standard deviation of the sample mean x’, also referred to as the standard error, is smaller than the standard deviation of the measured variable in the population by a factor equal to the square root of the sample size, as shown in the formula below:
$$e = \frac{\sigma}{\sqrt{n}}$$
where e is the standard error, σ is the standard deviation of the variable in the population, and n is the sample size.
The error indicates the average deviation of the sample mean x’ from the true mean value x in the population from which the sample was drawn. It tells us the likely error we can expect when estimating the population mean x using the sample mean (x’). Since it is often the case that real data about an entire population is unknown, the population's standard deviation is also unknown. To calculate the sampling error, we rely on certain assumptions.
1. Sample means follow a normal distribution
Returning to our imagined exercise of drawing k samples, the working assumption in sampling is that if we were to plot all the sample means obtained for the variable x, their distribution would follow a normal curve. In other words, when we extract k samples, the means of the measured variable are symmetrically distributed around the population mean (x), with higher frequencies near x and lower frequencies as we move toward the tails of the distribution (resembling a bell-shaped curve or a hat viewed from the side).
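The thought experiment can be tried out with a short simulation. This is a sketch under assumptions: the population of 100,000 heights (mean 170 cm, standard deviation 10 cm) is invented, and the samples are drawn independently rather than until the population is exhausted. It compares the spread of the k sample means with the standard deviation of the population divided by the square root of n.

```python
import random
import statistics

rng = random.Random(42)

# Hypothetical population: 100,000 heights in cm.
population = [rng.gauss(170, 10) for _ in range(100_000)]
sigma = statistics.pstdev(population)

n, k = 100, 2000  # sample size and number of samples drawn
sample_means = [statistics.fmean(rng.sample(population, n)) for _ in range(k)]

observed_se = statistics.stdev(sample_means)  # spread of the k sample means
expected_se = sigma / n ** 0.5                # standard deviation / sqrt(n)

print(f"population mean:      {statistics.fmean(population):.2f}")
print(f"mean of sample means: {statistics.fmean(sample_means):.2f}")
print(f"observed standard error: {observed_se:.3f}")
print(f"expected standard error: {expected_se:.3f}")
```

The two standard errors should come out close to each other, and a histogram of `sample_means` would show the bell shape described above.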
2. How to read/interpret the distance or interval on the normal curve between the sample mean and the population mean
The probability that the true population mean falls within a certain interval depends only on the length of the interval t, measured in standard deviations. In our example—where we consider the distribution of sample means obtained after drawing k samples—the standard deviation represents the standard error.
Thus, we aim to establish an interval within which the sample mean (e.g., x13, the mean obtained from the 13th sample) falls, with a sufficiently high probability that the error is smaller than the length of the interval.
In practical experience, the lowest accepted probability is P = 95%, meaning there is at least a 95% chance that, when selecting a random sample, the mean value falls within the specified interval. Conversely, the value p (calculated as 1-P) indicates the probability of making an error.
How Do We Interpret the Data?
There is a 95% chance that a value derived from a sample deviates by less than 2 standard errors (more precisely, 1.96) from the true population mean. There is a 99% chance that the deviation is less than 2.6 standard errors, and a 90% chance that it is less than 1.65 standard errors.
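These multipliers come from the standard normal distribution and can be reproduced with Python's standard library. The helper name `t_value` is mine; the text rounds the results to 1.65, 1.96, and 2.6.

```python
from statistics import NormalDist

def t_value(confidence):
    """Number of standard errors that covers the given two-sided probability."""
    return NormalDist().inv_cdf(0.5 + confidence / 2)

for confidence in (0.90, 0.95, 0.99):
    print(f"P={confidence:.0%}: t={t_value(confidence):.3f}")
```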
How Do We Calculate the Standard Error in Practice?
Based on the principles already outlined, we can substitute the unknown standard deviation of the variable in the population with the standard deviation measured in the sample.
For example, we are interested in estimating the average height of the population. We conduct a survey on a sample of 800 people and find that the average height of the participants in the study is 176 cm, with a standard deviation of 17 cm.
We substitute these values into the formula mentioned above:
$$e = \frac{17}{\sqrt{800}} \approx 0.60 \text{ cm}$$
Returning to the values listed earlier, for P=95%, t=1.96. The true value in the population lies within the interval from 176 cm - 1.96*0.60 cm to 176 cm + 1.96*0.60 cm, meaning we are 95% confident that the true average height in the population is somewhere between 174.8 cm and 177.2 cm.
If we want to report the data at an even higher confidence level of 99%, we substitute into the formula again. For P=99%, t=2.6, and the true value in the population lies within the interval from 176 cm - 2.6*0.60 cm to 176 cm + 2.6*0.60 cm, meaning we are 99% confident that the true average height in the population is somewhere between 174.4 cm and 177.6 cm.
The maximum error thus increases—from 1.96*0.60 (1.2 cm) to 2.6*0.60 (1.56 cm).
We can also experiment with the sample size. From the formula, we can already deduce that as the sample size increases, the error e decreases.
Assuming the same average height of survey participants and the same standard deviation, but this time surveying 2000 people, the error e will be:
$$e = \frac{17}{\sqrt{2000}} \approx 0.38 \text{ cm}$$
Referring to a confidence level of 95%, t=1.96, we can say with 95% certainty that the true value in the population (the average height) lies somewhere within the interval 175.3 cm – 176.7 cm. The maximum error in this case is: 1.96*0.38=0.74cm.
In practice, it is often necessary to weigh the situation and decide whether reducing the error (in our case, from 1.2 cm to 0.74 cm) justifies increasing the sample size by 1200 people.
The answer might be affirmative if we aim to estimate averages within sub-populations (e.g., men and women). In this scenario, the sample size will no longer be 2000 people but will depend on the number of women and men included in the sample (let’s assume an equal distribution, with 1000 women and 1000 men).
If the average height among the women interviewed is 165 cm, with a standard deviation of 15 cm, the error will be:
$$e = \frac{15}{\sqrt{1000}} \approx 0.47 \text{ cm}$$
We can thus report that we are 95% confident that the average height among the population of women falls within the interval 164.1 cm – 165.9 cm. The maximum sampling error in this case is: 1.96*0.47=0.92 cm.
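The worked examples in this section can be re-run with a few lines of Python. The function name `mean_interval` is mine; the inputs are the figures used in the text (means, standard deviations, sample sizes, and the t values 1.96 and 2.6).

```python
import math

def mean_interval(mean, sd, n, t=1.96):
    """Standard error and confidence interval for a mean."""
    e = sd / math.sqrt(n)  # standard error
    return e, mean - t * e, mean + t * e

cases = [
    ("800 people, P=95%", 176, 17, 800, 1.96),
    ("800 people, P=99%", 176, 17, 800, 2.6),
    ("2000 people, P=95%", 176, 17, 2000, 1.96),
    ("1000 women, P=95%", 165, 15, 1000, 1.96),
]

for label, mean, sd, n, t in cases:
    e, low, high = mean_interval(mean, sd, n, t)
    print(f"{label}: e={e:.2f} cm, interval {low:.1f} to {high:.1f} cm")
```

Changing n in any of the cases shows how slowly the error shrinks: the sample has to be four times larger to cut the error in half.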
Bibliography: Rotariu, T. (coord.), Bădescu, G., Culic, I., Mezei, E., Mureşan, C., Metode statistice aplicate în ştiinţele sociale, Iaşi, Polirom, 1999.
In essence, I am answering the question, "Where can I find the eligible respondent?"
Bank employees can be reached via email, as the bank provides the contact list. Moreover, the bank has decided to invite all 2,800 employees, without applying any additional selection criteria.
Oncologists can be found in hospitals, whether public or private. Alternatively, they can be reached by phone, provided I have a database containing all oncologists in Romania. Email is also an option, but with the same condition—having access to a comprehensive database of all oncologists. If there are only a few hundred oncologists in reality, but my database contains only 100, then I cannot conduct probabilistic sampling.
SMEs with 10 or more employees are registered with the Trade Registry, where addresses and phone numbers are available. I would also know the locality/county where they operate. There are companies that sell such data, and even the Trade Registry has pricing per company. There are two options: either send an operator to the address or call the number listed in the database. It seems promising to conduct the study. We’ll see later what challenges arise—every target comes with its own set of difficulties.
If I need to conduct a study among the 18+ population nationwide, I have two options: a) Send an operator into the field—to fixed addresses (randomly selected in advance) or using the random route method. b) Randomly generate phone numbers.
If the target is 18-60-year-olds in urban areas who use the Internet, I have three options: a) Send an operator into the field—to fixed addresses or using the random route method. b) Randomly generate phone numbers. c) Send invitations to an online respondent panel.
5. What is the refusal rate for the target population?
You might think this is overcomplicating things, but yes, experience is needed to give an exact answer. I don’t expect exact figures. An approximate answer based on intuition is sufficient.
Bank employees: How enthusiastic will employees be about their managers’ idea to conduct a satisfaction survey? The exact answer can be provided by the bank commissioning the survey if they’ve undertaken similar initiatives in the past. If not exact, at least something similar. What is the level of employee engagement when it comes to actions initiated by HR/management?
Oncologists: How likely are oncologists to respond to an invitation to answer a few questions about a pharmaceutical company? Is this a vital issue for doctors—significant enough to motivate them professionally/personally? From personal experience, doctors are the busiest segment, often lacking time. Under what conditions would a doctor talk to an interviewer? How do sales representatives who approach doctors manage?
SME decision-makers: I’ll be speaking with decision-makers regarding technology/digital services. Is this target more available and willing to answer questions compared to oncologists? Perhaps yes, perhaps no.
6. What details do I have about the questionnaire?
What will the average interview duration be? Any approach is suitable for an interview lasting up to 20 minutes. As the duration increases, options become limited. For interviews exceeding 35 minutes, face-to-face (F2F) is preferable.
Are there visual materials to show the respondent? Are the questionnaire topics ordinary day-to-day matters, or do they address sensitive or complex issues? How willing will the respondent be to answer via phone or F2F?
7. What interview method is most suitable for reaching the target population?
The choice becomes somewhat automatic once I answer the questions in points 4, 5, and 6—where to find the eligible respondent and what method minimizes the refusal rate. Whether the questionnaire is simple or complex, short or long—these are details not to be overlooked.
8. I know the definition of the eligible respondent, the likelihood of finding them in the general population (incidence), whether profiling data is available, the sampling frame, and the expected refusal rate.
With this information, I have everything I need to choose the sampling method and design the methodology.
8 Steps to design a sample - Part I
In the first two articles (1, 2), we focused on providing arguments for why sampling for a face-to-face (F2F) study in Romania is conducted following the rules/methodology of multi-stratified sampling combined with cluster sampling. Essentially, we answered the question, “How do I reach the respondent?”
When I think about sampling, regardless of the study, I aim to answer the following questions, which are essentially the steps I follow mentally, like a logical framework:
1. Who is the eligible respondent?
The eligible respondent is equivalent to the target or target population, the audience of interest for the study.
To define this in detail, even to the point of visualizing it, I aim to assign values/meaning to basic socio-demographic variables (e.g., gender, age, residence type, or locality size—I'll sporadically refer to both concepts, as well as region). If, in addition to these socio-demographic variables (for which official statistics exist), other variables are added, we can already say the definition becomes more complex.
From this first point, we can determine that the sampling unit will be the population/consumer, household, company (or companies), schools/educational institutions, students, etc.
2. Do I know how many people/entities/units meet the criteria of the target population definition out of the total population?
In other terms, this refers to the incidence rate of the target population at a national level. It may seem surprising how important this information is... it's good when the incidence is high and challenging when it's low. We know this may seem confusing, but allow us to explain what a low or high incidence rate means:
A) Let's say we need to conduct a study among the employees of a bank to measure employee satisfaction. The bank has 3,000 employees, 2,800 of whom have email addresses and are relevant to the bank's action. The intention is to invite all 2,800 employees to the survey. Here, the target population is defined by the employee list deemed relevant by the bank, making the incidence rate 100%.
B) A pharmaceutical company producing oncology medications (regardless of specialization) aims to measure brand awareness and perception. The target population is all oncologists in Romania. Again, we’d have a 100% incidence rate because we need all oncologists. However, if we only needed dermatology oncologists, for example, the incidence rate would be different—calculated as the number of oncologists in the desired specialization divided by the total number of oncologists. This could result in an incidence rate below 20%, possibly closer to 10%.
C) A provider of advanced technology services wants to determine what it needs to do to increase brand awareness and, subsequently, its market share among SMEs (small and medium-sized enterprises) with more than 10 employees.
The SME category definition: "It consists of enterprises employing fewer than 250 people and having an annual turnover not exceeding 50 million euros and/or an annual balance sheet total not exceeding 43 million euros."
According to INS (Romanian National Institute of Statistics), in 2020, there were slightly over 600,000 companies in Romania, 91% of which were in the 0-9 employee segment, which, based on the client's target definition, should be excluded.
Distribution of companies by number of employees:
The SME definition also requires eliminating companies with 250 or more employees. This means we're left with at most 9% of companies, before even considering the turnover condition.
The incidence rate can dictate many decisions regarding the appropriate research methodology for the market study you're designing.
3. Do I have data about the target profile?
When the target population is defined as the national population aged 18+, I can certainly rely on data provided by INS.
Profiling data is important to know the distribution by region and locality size. No matter how hard you try to ensure probabilistic sampling and take all the precautions regarding accurate responses, errors are real and inevitable, whether stemming from sampling—selection methods for sampling points, respondents—or during the completion of the interview.
The refusal rate (refusal to participate in the survey) has a significant impact on sampling quality, as it can alter the target profile, risking inadequate coverage of the population of interest. Trust me, the refusal rate has increased significantly over time (a negative for researchers), and it's undoubtedly evolving even now, post-pandemic and amid economic and geopolitical uncertainty.
It varies greatly by gender (female vs. male), age group (young vs. older), and between Bucharest/large cities vs. rural or small urban areas.
You need these profiling data to verify the sample structure against socio-demographic variables, understand the size of deviations, and determine where (in which strata) they occur. You may need to consider weighting the data to align the sample with the official structure.
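As an illustration of what weighting means here, the sketch below computes simple post-stratification weights: each group's share in the population divided by its share in the sample. The shares and counts are hypothetical (they are not INS figures), and real studies usually weight on several variables at once.

```python
def poststratification_weights(population_shares, sample_counts):
    """Weight per group = population share / sample share.

    Assumes every group has at least one respondent in the sample.
    """
    n = sum(sample_counts.values())
    return {
        group: population_shares[group] / (count / n)
        for group, count in sample_counts.items()
    }

# Hypothetical example: women are over-represented in the sample.
population_shares = {"female": 0.52, "male": 0.48}
sample_counts = {"female": 600, "male": 400}

weights = poststratification_weights(population_shares, sample_counts)
for group, weight in weights.items():
    weighted = weight * sample_counts[group]
    print(f"{group}: weight {weight:.2f}, weighted count {weighted:.0f}")
```

After weighting, the sample structure matches the reference structure; the larger the weights, the more the deviations in the raw sample deserve a closer look.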
This relates to the representativeness of the sample—ensuring it reflects the target population structure. The goal is to have confidence (a high confidence level) that if you generated an infinite number of samples, following your chosen methodology, you'd arrive at the same results, with deviations within the maximum sampling error limits.
If the answer to this is "yes," you can rest assured. If not, it’s advisable to ensure early on that a source exists and is accessible (ideally from the proposal stage). If you're very unlucky and no source exists, it’s wise to budget separately to address this lack of data.
Why you should combine multi-stratified random sampling with cluster sampling when conducting fieldwork in Romania. Part II
Now that we’ve established the context, we can return to our discussion about multi-stratification. Stratification is employed when you can divide the population into sub-groups that differ from one another and are disjoint: they do not overlap, and the separation is very clear. It helps to break a large population or area into smaller, more manageable chunks. Region is a good variable for this, separating a population into smaller and clearly distinct parts. Regions are based on county composition. For example, Tulcea is part of Dobrogea, as a historical region, along with Constanţa; they are the only two counties that form Dobrogea.
There is one more variable that separates the population into distinct groups. It is easy to infer which one if you remember that there are many settlements, predominantly villages, that are assigned to towns or communes. The terms municipality, town, and village mark a difference based on the number of inhabitants, and they are a source of local pride. The variable is “settlement size”, with the levels rural, large town, medium town, and small town.
The capital, Bucureşti, is by itself a (historical) region and an independent level, as it comprises around 2 million inhabitants. The next largest town is Cluj-Napoca, with slightly under 300,000 people. Given this enormous difference between Bucureşti and the second largest city, it is clear that the capital deserves its own stratum.
A small town might have 3,000 inhabitants, the same as a village. Yet assignment to an AU or stratum, if we use settlement size, is decided by public/national authorities based on certain criteria. This is why the SIRUTA codes matter, as well as how those authorities choose to segment the territory. I will detail below the “settlement size” for various strata.
In any case, settlement size strata (or urbanization level, or rural/urban environment) may be used to meet whatever needs your study has, provided you have a database containing all settlements of Romania and the number of inhabitants for each settlement. What do you think: do these strata or subgroups, so diverse due to their AU assignment or number of inhabitants, appear in every region, or do some types of towns or communes appear only in certain regions?
Table 1 - Population distribution by region and settlement size – number of inhabitants*
How to read the table (Region by columns X Settlement size by rows):
Cell B2 states that in Ardeal, close to 600,000 people live in Large urban areas. Cell G5 shows how many people live in rural Dobrogea. Column H contains the total number of inhabitants for each stratum, while row 6 indicates the total number of inhabitants for each region.
*Note: keep in mind that the data are quite old, the source being INSSE 2015. My advice is to look at these data as an exercise in how we treat data so as to generate a representative sample.
Now let us look at the proportions for each cell in Table 1, population distribution by region and settlement size. Table 2 shows each cell as a percentage of the total population, 20 million inhabitants.
Table 2-Population distribution by region and settlement size - % of total
Bucureşti, column A, has a weight of 9% in the total population. The rural areas are home to 46% of our country's inhabitants. The largest urban stratum is Small urban, accounting for 18% of the total population. There are 2 regions with good coverage for this stratum, Moldova and Muntenia. Dobrogea has the fewest inhabitants living in small urban localities.
We now have plenty of information and two variables that can adequately separate Romania’s population. How can I know where my respondent is? Or, to rephrase that, suppose we had 100 field agents/interviewers (ideally 😉 ): where do we send them, to what settlement, on which street? How many of the 13,000 settlements must we visit? We are discussing a face-to-face study, this being the most complex method. We will address later on what happens for online panels or when using CATI; stratification is kept for those two methods as well, but it involves fewer steps.
Table 3 – Settlement distribution by region and settlement size – number of settlements*
*Note: Treat these data as an exercise, source INSSE 2015.
We can see that once we reach stratum 4, the number of settlements in each cell increases sharply. Obviously, when designing samples, we will not visit each and every one of the settlements. We will select one sample, although an indefinite number of samples might be generated. The solution is to create a sample based on clusters: groups of inhabitants from a homogeneous population, who all share the same traits regarding region and settlement size. Bingo, we have to extract population clusters from each cell of the table above. You might be wondering how many people this cluster must or can include. Before we answer that question, let us see what the distribution of questionnaires/respondents looks like for a sample of 1000 by region and settlement size.
Table 4.1. - Respondents spread by region and settlement size, N=1000
Table 4.1 states that we have to recruit 93 respondents in Bucureşti and 126 from rural Muntenia. While we can include 93 people from Bucureşti, making sure to visit all 6 sectors, it is impossible to recruit 126 people from the same village in Muntenia. Looking at Table 3, there are over 2,000 villages (some will be communes, some villages dependent on communes). Selecting just one village out of 2600 means covering only 0.04% of the region’s potential. The sample keeps its representativeness if we maintain a good territorial spread (it is desirable to cover all counties) and if the methods employed ensure its randomness. As you might have guessed, we are not yet at the stage of selecting the respondent; this is just the first step, selecting the sampling points (and, implicitly, the settlements) for every stratum/ cell.
This is where the cluster sampling method comes into play - to determine the number of sampling points, meaning settlements, and then to select the respondents for each point. For a given settlement we might have one or several sampling points; it very much depends on the number of settlements each stratum/ cell contains. For a better understanding, we’ll equate sampling point to address/ start point. At these addresses you’ll send your field agent to begin recruiting, rules in hand!
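To make the proportional spread of a 1000-respondent sample concrete, here is a minimal Python sketch. It uses only the population shares quoted in the text (Bucureşti 9%, rural 46%, Small urban 18%); the remaining 27% is lumped into a hypothetical "Other urban" stratum obtained by subtraction, and the largest-remainder rounding rule is an assumption, not necessarily how Table 4.1 was built.

```python
def allocate(sample_size, percent_shares):
    """Proportional allocation with largest-remainder rounding.

    percent_shares maps stratum name -> whole-number percentage;
    the percentages are expected to sum to 100.
    """
    quotas, remainders = {}, {}
    for stratum, pct in percent_shares.items():
        # Integer arithmetic avoids floating-point surprises.
        quotas[stratum], remainders[stratum] = divmod(sample_size * pct, 100)
    # Hand out any respondents lost to rounding, largest remainder first.
    leftover = sample_size - sum(quotas.values())
    for stratum in sorted(remainders, key=remainders.get, reverse=True)[:leftover]:
        quotas[stratum] += 1
    return quotas


# Shares quoted in the text; "Other urban" is a hypothetical catch-all (100 - 9 - 46 - 18).
shares = {"Bucureşti": 9, "Rural": 46, "Small urban": 18, "Other urban": 27}
quotas = allocate(1000, shares)
print(quotas)
```

With these rounded shares Bucureşti receives 90 respondents rather than the 93 shown in Table 4.1, presumably because the table was computed from unrounded population counts.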
Let us take Dobrogea as an example – a region which has only one Large urban settlement (Constanţa), only one Medium urban settlement (Tulcea), and 15 Small urban settlements. It is obvious that we conduct interviews in Constanţa and Tulcea, those being our only options. For Constanţa we have to find 14 respondents. Do we recruit them from a single sampling point, or from several? To ensure a better sample, it’s obvious several are needed. With a cluster of 7 participants, 2 sampling points would be used. Were we to employ a cluster of size 10, we’d end up with a cluster and almost a half, which is a bit tricky to handle, as it is better to have equally sized clusters. For the Small urban stratum, we’ve established there are 15 settlements, where we have to find 11 participants. We might use either one or two clusters, so I’d rather use 2 clusters in 2 different settlements.
What, essentially, does a cluster of size x mean? It means that, starting with the first address, the field agent/ interviewer employs a random selection rule to select the household and another random selection rule to select the participant from within said household, until they reach a number of contacts/ selections equal to the cluster size. (A contact/ selection does not necessarily mean a completed questionnaire/ interview, but we’ll discuss such matters at a later date.)
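The arithmetic behind the Dobrogea example can be sketched in a few lines of Python. The function names are hypothetical, and splitting a quota as evenly as possible across sampling points is an assumption about how one would handle the 11-respondent case.

```python
def clusters_needed(quota, cluster_size):
    """How many clusters of a given size a cell's quota corresponds to."""
    return quota / cluster_size


def split_quota(quota, points):
    """Split a quota across sampling points as evenly as possible."""
    base, extra = divmod(quota, points)
    return [base + 1] * extra + [base] * (points - extra)


# Constanţa: 14 respondents
print(clusters_needed(14, 7))   # 2.0 -> two full clusters
print(clusters_needed(14, 10))  # 1.4 -> "a cluster and almost a half"
print(split_quota(14, 2))       # [7, 7]

# Small urban Dobrogea: 11 respondents over 2 settlements
print(split_quota(11, 2))       # [6, 5]
```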
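The recruiting loop described above can be simulated roughly as follows. This is only a sketch: the text does not specify the selection rules, so the fixed-interval household rule (`step`), the uniform random choice of a household member, and the `route` data are all hypothetical stand-ins.

```python
import random


def recruit_cluster(route, cluster_size, step, rng):
    """Walk the route from the start address, contact every `step`-th
    household, and pick one eligible member at random in each, until
    `cluster_size` contacts have been made (or the route runs out)."""
    contacts = []
    for index in range(0, len(route), step):
        if len(contacts) == cluster_size:
            break
        household = route[index]
        if not household:  # nobody eligible at this address
            continue
        contacts.append((index, rng.choice(household)))
    return contacts


# Hypothetical route: 40 households, each with 1 to 4 eligible members.
rng = random.Random(2015)
route = [[f"h{h}_m{m}" for m in range(rng.randint(1, 4))] for h in range(40)]

cluster = recruit_cluster(route, cluster_size=7, step=3, rng=rng)
print(len(cluster))
print([index for index, _ in cluster])
```

A contact here is simply a selected person; as noted in the text, it does not imply a completed interview.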
Table 4.2. - Distribution of sampling points for region and settlement size, for cluster=7 respondents, N=1000
For a sample of 1000 respondents and a size 7 cluster, we will be working with 143 sampling points. Using a size 10 cluster, there would be 100 sampling points – a rather large discrepancy. You are probably wondering which approach would be best. A theorist would say that more sampling points is better, meaning a smaller cluster (7 in our experiment), because it ensures better spread, allowing for a higher chance to cover all counties and more settlements. Someone focused on cost optimization (fewer rural settlements in the sample, for lower travel expenses) while maintaining an adequate sample quality would favor a size 10 cluster. We could try a middle-of-the-road approach with a size 8 cluster, for 125 sampling points. In any case, for a sample of 1000 respondents I wouldn’t recommend a cluster smaller than 7 or larger than 10.
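The trade-off between cluster size and number of sampling points is simple arithmetic, sketched below in Python. The function name is hypothetical, and the calculation is done on the whole sample; a table built cell by cell, such as Table 4.2, may round within each cell.

```python
import math


def sampling_points(sample_size, cluster_size):
    """Number of sampling points needed, rounding up so that
    every respondent belongs to some cluster."""
    return math.ceil(sample_size / cluster_size)


for cluster_size in (7, 8, 10):
    print(cluster_size, sampling_points(1000, cluster_size))
```

For N=1000 this gives 143, 125 and 100 sampling points for clusters of 7, 8 and 10 respectively, matching the figures above.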
Conclusion
It is very important to be familiar with the country where you are conducting the survey/ study and understand the way its territory is organized.
Its area, as well as the average density and settlement spread provide valuable insight.
Combining stratified sampling with cluster sampling is ideal for any random/ probabilistic sample, regardless of the sample source. It helps in segmenting/ stratifying a population into smaller groups, more easily managed and contacted.
Why you should combine multi-stratified random sampling with cluster sampling when conducting fieldwork in Romania. Part I
Stratified sampling and cluster sampling are two of the four types of probabilistic sampling. I suggest we combine them... it might seem weird, yet it really helps with managing random samples.
One thing is certain about Romania and the way it manages population records: there is no way to draw a random sample of citizens from a database containing contact information for all Romanian residents, which is what you would need in order to claim proper probabilistic sampling, with each person having an equal chance of being selected, or at the very least with a selection probability you could compute for each individual drawn into the sample. Personally, I had the opportunity to collaborate with D.E.P.A.B.D. (The Directorate for Persons Record and Databases Management), which was responsible for randomly extracting addresses following an algorithm I supplied. I needed a sample of 5000 addresses for Romanian residents aged 50 and above. Even so, I had to design the sample as a multi-stratified, cluster-based one, extract the localities, and specify how many addresses I needed for each locality. The collaboration went reasonably well, notwithstanding the long time it took to complete. What surprised me, though, was that when I finally received the database of addresses after several weeks of waiting, I found that they had been unable to perform the sampling for some rural localities, where the concept of a street is a foreign one. I panicked... we eventually found a solution that maintained the conditions needed for a probabilistic sample for these administrative units as well, but it delayed us by 2 weeks.
Now, before going into detail on multi-stratification, I wish to highlight some particularities of Romania, the way it is managed and how its territory is organized. I will use existing data published by several authorities. I also have some data of my own, and I noticed discrepancies compared to what one might find on INSSE, provided you are patient and process their files. By the way, INSSE’s structuring of the files containing population details at settlement level is severely lacking. I could never figure out why the SIRUTA codes (a unique code for each settlement in Romania), managed by an entity which has a responsibility to organize the territorial management of the country, are not found throughout all the INSSE files, the latter preferring to use text documents. If you are lucky enough to find Excel documents, you can be sure you will find the same settlement written sometimes with diacritics, sometimes without, and when dealing with rural areas, you will only find data on communes (and not villages). One wonders... which is why I am stocking up on patience for the data from the census that is just beginning (named December 1st 2021), and I hope that they have learned how to create smart files.
A few data about Romania
Population: around 20 million
Area: 238,397 km²
Population density: 84.4 inh./km²
41 counties
7 historical regions (București, Ardeal, Banat/ Crișana/ Maramureș, Moldova, Muntenia, Oltenia, Dobrogea) or 8 micro-regions (NUTS 2) defined by INSSE, somewhat more balanced (Bucuresti – Ilfov, Nord-Vest, Centru, Nord – Est, Sud – Est, Sud – Muntenia, Sud - Vest Oltenia, Vest).
In 2016 there were 3181 territorial administrative units (according to a file published by Romanian authorities on Eurostat), called LAUs in European lingo. These AUs, short for administrative units, are meant to manage several settlements. They can be municipalities, towns, or communes. There are no self-managed villages. Villages, over 10,000 in total, are assigned for management to towns or to other villages, the latter serving as communes. Among the 3181 LAUs we only find those villages which act as communes. Apart from these, there are an additional 10 thousand or so villages, pardon my repetition, which are subordinated to a town or a commune.
I would also like to draw your attention to population density. I provided a figure above, but without comparing it with data from other countries one cannot say whether Romania is crowded or rather sparsely populated.
Below you’ll find a map from Eurostat. Blue areas are low density, people have plenty of available space and live apart, whereas orange areas are denser. As you can see, there’s plenty of space to go around in our country, Romania’s settlements are rather scattered.
Wall painting during the pandemic and what to expect come 2022
The two pandemic years differ in the reasons people chose to paint their walls. 2020 saw wall painting soar. Consumers, being at home for extended periods of time, mentioned not only cleaning (54%), sanitizing (35%) or the yellowing of their walls (25%), but also a desire to change the color palette or decor of their homes (22%). In 2021 there were fewer reasons for rejecting brands, while brand awareness and consideration stagnated or fell. Fewer people mentioned cleaning (47%) or redecorating (13%).
The pandemic changed habits, making people more involved in the home renovation process than before
In 2019, 35% of consumers painted their home by themselves, 33% did so alongside another family member, acquaintance, or professional painter, and 32% left it entirely for someone else to do. In 2021, 40% painted by themselves, 38% received help from someone else and only 22% were not at all involved in painting their house.
Regarding choices made, wall colour is selected together with the partner more frequently than the brand and neither decision needs validation from a professional. Those with higher household incomes are more prone to making decisions as a couple both with regards to brand as well as colour.
One of the washable paints with good momentum during the pandemic is Evrika. It started its communication campaign in 2020 to showcase its products and new packaging, and its brand image consolidated throughout these two years. Among consumers, this brand’s differentiators are connected to personality traits such as honesty and a free spirit, and its emotional benefit is linked to “it helps me express myself when I decorate”. Among professional wall painters Evrika stands out as a trusted brand, considering its functional advantages: “easy to use” and “allows the walls to breathe”.
The manager of the Paint Azur Timişoara team, Narcis Obeada, tells us about the investments they have made to meet the promises made in their product communications. “We have our own labs where we develop and test Evrika products. We continue to invest in research and development. We made major adjustments in our recipes and use best quality raw materials in order to offer products at the best quality to price ratio.”
What will happen in 2022 to the washable paint market?
“Starting in September 2021 we saw a new increase in the prices of raw materials within the wall paints sector, and as a producer we are currently facing additional pressure. The impact on the product’s final price is high, as raw materials make up more than 60% of its cost, to which we must add energy and fuel which, likewise, have seen record increases,” states Narcis Obeada.
The price increases related to the type of raw materials are:
The study carried out by Wisemetry Research between November 15th and December 10th 2021 sought to investigate usage behaviour and consumers’ and professional painters’ perceptions regarding the most important brands on the market. It interviewed a sample of 500 recent users of washable paint, aged 25 to 60, urban dwellers, and a sample of 300 professional wall painters.
Wisemetry Research: Throughout 2021 people will be cautious with their budgets, but will spend money on those small pleasures that were unattainable the year before, such as travelling
The study was carried out by Wisemetry Research between October 30th and November 4th 2020 and aimed to highlight the challenges caused by the mobility restrictions imposed to prevent the spread of the SARS-COV-2 virus. The study touched on topics such as the organization and challenges of telework, the care and education of children and gender imbalance in household chores and supervising online schooling, mechanisms people adopted to cope with isolation or habits that people wish to adopt or keep for the following 12 months.
What are people’s plans for the coming year?
2021 – the year during which people will balance the need for material security, understood through their intent to save money, increase their income by getting promoted at their jobs or getting a second job, with their desire to increase their comfort and relaxation through doing some home-improvement projects or going on a well-deserved vacation
When asked about their plans for 2021, the largest share of respondents (42%) stated they wish to increase the amount of money they save in the following 12 months. One in three respondents plan on doing some home improvement/ renovation, while 28% want to go on a special vacation or trip, a sign that they missed travelling during the year that just ended. Moreover, one in four respondents plan to supplement their income, possibly by taking a second job or activating other income sources.
The desire for material security and a behaviour oriented towards saving are natural when people are confronted with extreme threats, be it a financial crisis or a health crisis. Among their plans for the new year, we find:
Paying off a loan – 21% of respondents
Taking professional or personal development classes – 20%
Finding a new job – 18%
Making a major change in their career – 17%
Over the next 12 months, most respondents will invest in special holidays, electronics, and home appliances
When it comes to investments people intend to make in 2021, 32% of respondents are certain that sometime over the next 12 months they will go on holiday/ a trip. As 2020 was characterized by lockdowns, it is expected that in the near future, as travel once again becomes safe, people will want to explore new destinations, maybe even exotic countries and stray beyond the traditional tourist sights.
One in four respondents are certain that in the coming year they will invest in electronics or IT&C equipment. Electronic devices became essential goods during 2020 as most activities moved online: job, school, as well as entertainment and keeping in touch with loved ones.
We shouldn’t forget about those intending to invest in home renovation (23%) or buy new furniture (22%). During 2020 people had to redefine the purpose of a home – it developed new meanings and incorporated new functions – workspace, school, space for socialization. In this context a need to adjust the space to its new purposes arose. Undoubtedly, people’s expectations of an ideal home have changed. There is also some interest in acquiring real estate – a home or land.
During 2020 people embraced a digital lifestyle and there are hints that this trend will continue into 2021. Additionally, care for others and for oneself is increasing.
The top 3 habits people wish to adopt or keep in 2021 are related to digitalization and to saving time on administrative tasks:
66% of respondents intend to cut down visits to bank branches as much as possible in 2021
62% plan to use online banking apps
61% want to receive invoices for utilities in a digital format exclusively
Among people's priorities for 2021 are caring for loved ones and caring for their own physical and mental health – 53% plan to visit their parents more frequently, 47% intend to get more involved in their children’s education, 42% want to exercise regularly and another third wish to quit smoking or get medical check-ups more frequently. 8% intend to work less than 8 hours per day. Respondents also wish to be more socially responsible this coming year. 54% of those interviewed intend to make a habit out of recycling (respondents would be happier had public authorities set up the infrastructure for selective recycling), and 41% intend to mainly purchase products made in Romania. Volunteering and donations are on the list for 15% of respondents. A desire to use the car more sparingly showcases people’s care for the environment – something authorities might take into consideration.
The study was conducted online, on a sample of 1000 respondents. The sample is representative for the Romanian population of Internet users with regards to gender, age, and region. The maximum sampling error is ±3.1%. Along with information regarding future plans, the study also contains data on:
Managing online schooling and the challenges encountered by parents in caring for and supervising their children
Work patterns and challenges during telework (remote work)
Behaviours adopted to cope with social isolation
The impact of social isolation on relationships and family life
You can request access to the full report free of charge by sending an email to: office@wisemetry.com.
Wisemetry Research: 76% of parents of 7- to 17-year-old children are concerned about the negative consequences prolonged use of electronic devices might have on their kids
The study was carried out by Wisemetry Research between October 30th and November 4th 2020 and aimed to highlight the challenges caused by the mobility restrictions imposed to prevent the spread of the SARS-COV-2 virus. The study touched on topics such as organising remote work, the care and education of children and gender imbalances in household chores, mechanisms people adopted to cope with isolation or habits that people wish to adopt or keep for the following 12 months.
What are the parents' main concerns during these times and how do they manage their children’s online schooling?
Women are more involved than men in the care and education of children
49% of respondents who have children aged under 18 stated that both parents are equally involved in their children’s care and education, while 46% said that the mother was more involved. Only 5% believed that the father was the one spending more time caring for and looking after the children. However, among those who stated that both parents are equally involved in caring for their children, women estimated an average of 9.5 hours spent with the child(ren) during an average workday, while men only estimated an average of 6.4 hours daily.
63% of respondents believe that during the lockdown they spent more time participating in their children’s formal education, a percentage similar across genders.
Almost two thirds of parents rate the online schooling system/ method below their expectations
34% of parents of school age children believe that online schooling was managed below their expectations, while 26% consider it way below their expectations. Only 16% see it as (way) above their expectations.
76% of parents of 7- to 17-year-olds are concerned about the negative consequences prolonged use of electronic devices might have on children
Almost 8 out of 10 parents say they are concerned about the negative impact long term use of electronics might have on their children. Among their main concerns is also their fear that, while being schooled online, their children will fall behind with regards to their knowledge level – 70% of parents interviewed somewhat or totally agree with that statement.
Over half of the parents interviewed state that the time of day when their child has online school is very stressful for them, women more often than men.
On the other hand, two thirds of parents consider themselves more aware of their children’s education level and of their emotional state regarding school as compared to before the pandemic.
Parents’ main need is reopening schools
When asked what they would need to ease the burden of educating/ raising their children, the largest group of parents, over 40%, stated they wished schools would reopen/ stay open even during crisis situations.
38% said a flexible work schedule would help, men needing it more than women
17% need more understanding and flexibility from their employer/ manager
22% believe less homework for children would be helpful
20% would be glad to have someone in the household take care of the housework
Only 11% of those interviewed wished their partner were more involved in raising the children. The share of women holding this opinion is significantly higher than that of men – 5% of men as compared to 15% of women.
The need to spend more time caring for and supervising children was disproportionately felt by mothers
The study’s data confirms a trend already documented in other countries. Speaking about the United States of America’s situation, economist Martha Gimbel said that women are captive in their need to spend more time involved in household chores and caring for their children. Many end up feeling overwhelmed, tired, stressed and helpless (source: NPR.org).
The study was carried out online on a sample of 1000 respondents, of whom 411 had underage children, and 252 had school age children (7- to 17-year-olds). The sample is representative for the Romanian population of Internet users with regards to gender, age, and region. The maximum sampling error is ±3.1%.
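The ±3.1% figure can be reproduced with the standard formula for the maximum margin of error of a proportion. The sketch below is written in Python and assumes a 95% confidence level (z = 1.96), simple random sampling and the worst case p = 0.5; the study does not state these parameters, so they are assumptions. The same formula suggests how much wider the error becomes for the parent subsamples.

```python
import math


def max_margin_of_error(n, z=1.96, p=0.5):
    """Maximum sampling error for a proportion under simple random
    sampling; z=1.96 (95% confidence) and p=0.5 are assumed here."""
    return z * math.sqrt(p * (1 - p) / n)


for label, n in [("full sample", 1000),
                 ("parents of underage children", 411),
                 ("parents of school age children", 252)]:
    print(f"{label} (n={n}): ±{max_margin_of_error(n) * 100:.1f}%")
```

Under these assumptions the full sample gives ±3.1%, while results based only on the 411 or 252 parents carry roughly ±4.8% and ±6.2% respectively.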