applied business methods report

31
APPLIED BUSINESS METHODS: RESEARCH PROJECT What Determines Hotel Customer- Review Scores? Group 6A Dennis Johannisse (370759) Zhengchen Wei (36907) Mike Lien (356159)

Upload: nubbles

Post on 15-Nov-2015

DESCRIPTION

Report on determinants of hotel prices in the US

TRANSCRIPT

  • Report

    APPLIED BUSINESS METHODS: RESEARCH PROJECT

    What Determines Hotel Customer-Review Scores?

    Group 6A

    Dennis Johannisse (370759)

    Zhengchen Wei (36907)

    Mike Lien (356159)

  • 2

    What Determines Hotel Customer-Review Scores?

    Dennis Johannisse, Zhengchen Wei, Mike Lien

    Table of Contents

    Applied Business Methods: Research Project

    Erasmus University Rotterdam

    TABLE OF CONTENTS

    Table of Contents ............................................................................................. 2

    Table of Figures ............................................................................................... 3

    Chapter 1 Introduction ................................................................................. 4

    1.1 Background information and research question .............. 4

    1.2 Outline of the report ........................................................... 4

    1.3 Research hypothesis and causal relations scheme ......... 5

    Chapter 2 Univariate Analysis....................................................................... 7

    2.1 Dependent Variable ............................................................ 7

    2.2 Independent Variables ....................................................... 8

    2.3 Bonus Variables .................................................................. 9

    Chapter 3 Bivariate Analysis ......................................................................... 9

    3.1 T-Test ................................................................................... 9

    3.2 One-way ANOVA F-Test ..................................................... 10

    3.3 Coefficient of Correlation ................................................. 13

    3.4 Chi-Squared Contingency Table Test ............................... 14

    3.5 Bonus Variable Tests ........................................................ 16

    Chapter 4 Multivariate Analysis ................................................................. 17

    4.1 Two-Factor ANOVA Analysis .............................................. 17

    4.2 Regression Model I (RM1)................................................ 18

    4.3 Regression Model II (RM2)............................................... 21

    4.4 Comparison of Model I and Model II ............................... 22

    Chapter 5 Conclusion and Evaluation ....................................................... 23

    5.1 Interpretation and Significance of Results ..................... 23

    5.2 Recommendations for Hotel Managers and Owners ..... 24

    5.3 Evaluation of Data and Research .................................... 25

    Appendix A: Bivariate Analysis Tests ............................................................ 27

    Appendix B: Two-way ANOVA Tests ............................................................... 28

    Appendix C: Multivariate Analysis Tests ....................................................... 29

    Appendix D: Bonus Variables ........................................................................ 30

  • 3

    TABLE OF FIGURES

    Figure 1 causal relations scheme .................................................................................. 7

    Figure 2 cr_score histogram (Left: Original, Right: Removal of outlier) ....................... 8

    Figure 3 Scatter plot between price and cr_score ...................................................... 14

    Figure 4 Comparison of number of hotels by stars between hotels that advertise

    (left) and hotels that do not (right) ............................................................................... 15

    Figure 5 Scatter plot of dAirport and cr_score ............................................................ 16

    Figure 6 Plot of mean of treatments within luxury versus cr_score ........................... 17

    Figure 7 Graph presenting advertising*star interaction effect .................................. 18

    Figure 8 Table listing chosen international airports and their coordinates............... 31

    The cover page picture shows the seven-star Burj Al Arab hotel, located in Dubai,

    UAE. Courtesy of Ito Joi (Source: http://www.flickr.com/photos/joi/2086020608/)

  • 4

    Chapter 1 INTRODUCTION

    1.1 Background information and research question

    The main goal of this report is to find out which factors influence customer reviews and

    how they influence them, which in turn can affect a hotel's revenue or profitability.

    Customer reviews have become increasingly important after recent developments in the

    travel industry across the United States (US); customer satisfaction is therefore an

    important determinant, which will be researched for this case. For this research, data on

    1562 hotels spread across 6 geographical markets in the US has been gathered. This

    report aims to find out what factors might affect customer reviews for different hotels,

    as this could help managers determine how hotels can increase customer satisfaction,

    which in turn can improve the revenue and profitability of the hotels.

    Thus, the main research question for this report is as follows:

    What are the determinants of hotel customer-review scores on Orbitz for hotels with

    different characteristics?

    1.2 Outline of the report

    This report is made up of five sections and four appendices (A, B, C and D). In section one,

    the introduction and summary of research data and hypotheses, the background

    information and research question for the report will be presented. In addition, our research

    hypothesis and causal relation scheme along with a short introduction of all relevant

    variables will be presented here. The second section, the univariate analysis, will reveal the

    characteristics and properties of the aforementioned chosen variables. In our third section

    we will present the bivariate analysis, which analyses the proposed relationships from the

    causal relation scheme. We will use the t-test, one-way ANOVA, correlation coefficient and

    contingency table to analyse this section. Then, we will use the outcomes of the

    aforementioned tests to evaluate the presence, nature and strength for each of the

    proposed relationships. Relevant 7-step schemes will be presented in appendix A. Our

    fourth section on multivariate analysis consists of a two-factor ANOVA analysis which

    involves the response variable, cr_score, and two other factors from our causal relationship

    scheme. The relevant 7-step scheme for this test will be provided in appendix B. We will

    also formulate a regression model on the basis of our original causal relationship scheme.

    Relevant computations can be found in appendix C. A second regression model will then be

  • 5

    presented, and the results will be discussed. Finally, section five will conclude with the

    results and evaluate the data and research.

    1.3 Research hypothesis and causal relations scheme

    Before we show our established causal relations scheme, all included independent

    variables and their proposed relations to the dependent variable, cr_score, will be

    presented below. In addition, we also present a few additional cross variable relationships.

    Independent variables

    Destination ID

    Each destination has different features and popular places, and customers might have

    different preferences or ideas about what they want to do during their holiday or travel.

    In addition, some

    destinations might be perceived better and have an overall higher, or at least different,

    score as compared to other locations. Therefore, we believe that destination ID has an

    influence on the central cr_score variable.

    Price

    Price is an important aspect of a customer review. Though not a determinant on its own, it

    does play an important role when compared to the quality: is a stay at this hotel worth its

    money or is it way too expensive? Thus, we propose that price has an influence on

    customer reviews; however, we are not sure whether this will be positive or negative.

    Stars

    We suspect that hotels with a higher star ranking will have a higher customer review score

    on average. A star is awarded for good quality and service which, logically speaking, is a

    determinant of positive customer reviews.

    Employee Satisfaction

    If employees like their job, generally the hotel has a more favourable and enjoyable

    atmosphere. We believe that this too can positively influence the customer reviews.

    Therefore, we believe that employee satisfaction is positively related to the dependent

    variable customer review.

    Chain

    Being associated with a certain hotel chain can be viewed as a symbol of prestige. Being

    part of a chain requires hotels to fulfil certain quality conditions in order to be eligible to be

    part of it. Such requirements, for example hygiene standards or certain luxury furnishings

    such as a TV, can also mean that customer reviews tend to be more favourable. We propose

    that there exists a positive relationship between being in a chain and a higher customer

    review score.

  • 6

    Property type

    Different sorts of hotels serve different needs. It is important to remember that, for example,

    a full-service hotel is rated differently than a simple youth hostel. We believe that a

    difference in ratings exists between these property types.

    Advertisement

    A hotel with more online ad impressions is on average more well-known. A hotel advertises

    its main selling points, which in turn attracts customers. If the perceived quality of the hotel

    matches the quality that it has advertised, customers may provide a favourable rating for the

    hotel. The degree of advertisement is ranked into five categories based on the level of ad

    impressions. We believe that hotels which advertise have more favourable ratings on

    average.

    Cross-Variable relationships

    dest_id → price

    Some locations are generally more expensive because of their touristic attractions and

    popularity. We therefore assume that the outcome of the destination_id variable has an

    influence on the price variable. We believe that differences in mean prices exist

    between locations.

    stars → price

    A high star ranking shows that a hotel possesses certain degrees of quality or prestige.

    Accordingly, the hotel has a legitimate reason to ask a higher price for a stay. We therefore

    believe a positive relationship exists between star ranking and price.

    property_type → price

    Different hotel types charge different prices. For example, a full-service hotel would charge

    a higher price than a simple bed & breakfast. We suspect that some difference exists

    between the average prices of property types. Thus, we identified property type as an

    influencing factor of the variable price.

    advertisement → price

    We expect advertisement to have at least some sort of influence on the price of hotels. We

    suspect it is a positive influence.

    advertisement → stars

    An interesting relationship to examine is whether (heavily) advertising hotels have a

    higher star ranking. If a hotel claims to have certain features and they are deemed

    great, perhaps it receives a higher star ranking if the rest is deemed accordingly great as

    well. We therefore suspect a positive relationship between advertisement and the number of

    stars in the star ranking variable.

  • 7

    Bonus variable relationships

    dAirport → cr_score

    We expect this variable to be inversely correlated with customer review score, assuming

    that easier access to an airport yields a higher customer rating. We are interested in this

    relationship as we put a lot of effort into creating this bonus variable.

    luxury → cr_score

    We expect a positive relationship between the two variables, as one would expect a more

    luxurious hotel to receive a better score.

    Figure 1 causal relations scheme

    Chapter 2 UNIVARIATE ANALYSIS

    This chapter analyses the data within each individual variable.

    2.1 Dependent Variable

    cr_score

    The histogram plot of cr_score demonstrates a slightly positively skewed distribution, with a

    mean of 3.58 and standard deviation of 0.702.

    There is also an unusually high frequency of recorded 1.00 and 5.00 scores, which,

    statistically speaking, should be near zero. A z-score test has been utilized to determine

    whether this has an outlier effect on the overall distribution.

  • 8

    The z-score test produces a value of -0.01817, implying that the 1.00 and 5.00

    cr_score values have little impact on the overall distribution. The removal of these

    data points should thus not be necessary.
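
    One common way to run such a check is to standardise each observation and see how far
    the suspected extremes sit from the mean. The sketch below is a minimal illustration using
    only the Python standard library. The scores are made-up values rather than the hotel
    dataset, the |z| > 3 cut-off is a conventional assumption, and the report does not state
    exactly how its own z-score value was computed.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Return the z-score (x - mean) / sd for every value."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation
    if sigma == 0:
        raise ValueError("z-scores are undefined when all values are identical")
    return [(x - mu) / sigma for x in values]


def flag_outliers(values, threshold=3.0):
    """Return the values whose absolute z-score exceeds the threshold."""
    return [x for x, z in zip(values, z_scores(values)) if abs(z) > threshold]


# Hypothetical review scores on a 1-5 scale (illustration only).
scores = [3.5, 3.6, 3.4, 3.7, 3.5, 3.6, 3.8, 3.3, 3.5, 3.6, 3.4, 3.7, 1.0]
print(flag_outliers(scores))
```

    Comparing the mean and standard deviation with and without the flagged values then shows
    how much the suspected outliers move the distribution.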

    Figure 2 cr_score histogram (Left: Original, Right: Removal of outlier)

    2.2 Independent Variables

    price

    The price distribution is strongly positively skewed with a mean of 132.89 and standard

    deviation of 90.696. The most frequently occurring prices are centralized around 90.

    Again, an outlier analysis was performed in an attempt to minimize statistical distortion.

    The chosen cut-off covers data points which lie beyond 400. This resulted in a z-score

    value of -0.08877, which indicates that the removal of such data points has an

    insignificant impact on the price distribution as a whole.

    emp_sat

    On average, emp_sat has a value of 3.85, with a standard deviation of 0.625. The

    distribution is slightly positively skewed, with most data points being centralized around the

    4.2 region. A histogram plot of emp_sat shows no data points that would clearly qualify

    as outliers.

    Destination_id

    The sample distribution of this variable is more or less even; if it were truly evenly

    distributed, each destination would have a frequency of 16.67%. Most destinations are near

    this percentage, with the exception of Las Vegas (8.3%) and Los Angeles (23.5%). Thus,

    more observations of hotels in Los Angeles than Las Vegas are included in the dataset.

    Stars

    The distribution of samples within stars is not optimal either; the data clearly has more

    observations of 2-star and 3-star hotels: 33.4% and 40.7% respectively. The mode of this variable is a

  • 9

    star ranking of 3. The star ranking 0 most likely results from an error in processing the

    data or from missing information, as stars are always awarded on a 1 to 5 scale.

    Chain

    The dataset reveals that a large number of hotels, 1220 out of 1562 (78.1%), are part of a

    chain.

    Property_type

    The frequency table of property_type indicates that 1386 of 1562 hotels are full-service

    hotels, which make up an extremely high proportion (88.7%) of the database. This is followed

    by motels and resorts with 5.6 and 4.4 percent respectively. The remaining specialized hotel

    types account for less than 1.5% of our sample.

    Advertisement

    A frequency table for this variable shows that 1302 out of 1562 hotels advertise very

    little or not at all (part of the bottom 5% of advertisers). The distribution of this

    variable is thus far from desired.

    2.3 Bonus Variables

    dAirport

    A univariate analysis reveals that a majority of hotels (60.2%) are located within a 20 km

    range of an international airport, which suggests that a significant number

    of the sampled hotels are built to serve customers who travel by air.

    The remaining 621 hotels are located outside a 20 km radius.

    Luxury

    The frequency table indicates that most hotels have one or two luxury facilities

    (67.5%), while only a few hotels have a full luxury rating (3%). 219 hotels (14%) have none

    of the aforementioned facilities.

    Chapter 3 BIVARIATE ANALYSIS

    In this section, the relations specified in the causal relation scheme will be analysed using

    various statistical tests. We explain the tests and comment on the outcomes for each test.

    3.1 T-Test

    For examining relationships between one qualitative variable (which has two treatments,

    thus K=2) and one quantitative variable, we use the t-test. This particular technique is used

    to evaluate the difference between the means of two groups. Before that, we need to

    use the F-test to examine whether the variances are equal or not. This is important, as

  • 10

    equal variances lead to a different test outcome than unequal variances. Results of

    such a test can be found in appendix A.
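
    The two-step procedure described above can be sketched with the Python standard library
    alone. This is a minimal illustration, not the SPSS procedure used in the report: the data
    are hypothetical, the variance check is shown as a plain ratio of sample variances (one
    common form of the F-test, assumed here), and the sketch stops at the test statistics
    because p-values require the F and t distributions, which the standard library does not
    provide.

```python
from math import sqrt
from statistics import mean, variance


def variance_ratio(a, b):
    """F statistic for equality of variances: larger sample variance over smaller."""
    va, vb = variance(a), variance(b)
    return max(va, vb) / min(va, vb)


def pooled_t(a, b):
    """Two-sample t statistic assuming equal variances (df = n1 + n2 - 2)."""
    na, nb = len(a), len(b)
    sp2 = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(sp2 * (1 / na + 1 / nb))


def welch_t(a, b):
    """Two-sample t statistic that does not assume equal variances."""
    na, nb = len(a), len(b)
    return (mean(a) - mean(b)) / sqrt(variance(a) / na + variance(b) / nb)


# Hypothetical review scores for chain and independent hotels (illustration only).
chain = [3.6, 3.4, 3.8, 3.5, 3.7]
independent = [3.5, 3.9, 3.2, 3.6, 3.8]

print("variance ratio:", variance_ratio(chain, independent))
print("pooled t:", pooled_t(chain, independent))
print("Welch t:", welch_t(chain, independent))
```

    With equal group sizes the two t statistics coincide; they differ only in their degrees of
    freedom, so the choice matters most when group sizes and variances are both unequal.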

    chain → cr_score

    Due to the rise of the travel industry, many companies have implemented an expansionary

    strategy by founding chain hotels, building a strong brand image among customers. We

    therefore investigate whether a hotel's chain or non-chain status is related to its

    cr_score.

    Before testing the variables, an F-test was conducted to analyse the equality of variances

    for chain and non-chain hotels. The test result gives no evidence of unequal variances

    (F=0.860, sig.=0.9554). Based on this result, a t-test was conducted under the equal-variance

    assumption, and we found no evidence that the chain variable is related to cr_score, as

    the p-value of 0.974 is far larger than 5%. In fact, hotels that

    were part of a chain had a slightly lower mean than independent hotels (3.5827 and

    3.5841 respectively). Nevertheless, this difference is negligible, and we therefore conclude

    that chain is not an influential factor for customer reviews of the hotel.

    3.2 One-way ANOVA F-Test

    Analysis of variance, ANOVA in short, is used to test the relationship between one

    qualitative variable (with more than 2 treatments, thus K>2) and one quantitative variable.

    The advantage of such a test is that it estimates both the variances within and between

    treatments. Again, results of such a test can be found in appendix A.
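
    As a rough sketch of what the one-way ANOVA computes, the F statistic is the mean square
    for treatments (MST, variation between treatment means) divided by the mean square for
    error (MSE, variation within treatments). The function name and the data below are
    illustrative assumptions, not part of the report or of SPSS.

```python
from statistics import mean


def one_way_anova(groups):
    """Return (MST, MSE, F) for a list of samples, one sample per treatment."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    if k < 2 or n <= k:
        raise ValueError("need at least two treatments and more observations than treatments")
    grand_mean = mean([x for g in groups for x in g])
    # Between-treatments sum of squares: group means against the grand mean.
    sst = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-treatments sum of squares: observations against their own group mean.
    sse = sum((x - mean(g)) ** 2 for g in groups for x in g)
    mst = sst / (k - 1)
    mse = sse / (n - k)
    return mst, mse, mst / mse


# Hypothetical scores for three treatments (illustration only).
mst, mse, f_stat = one_way_anova([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(mst, mse, f_stat)
```

    A large F means the treatment means are far apart relative to the spread inside each
    treatment, which is the reading applied to the MST and MSE values quoted in this chapter.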

    stars → cr_score

    Star rating, constructed on the industry's standard, is an essential criterion for many

    customers choosing hotels. Logically, hotels with a high star ranking could offer customers a

    better all-round service during their stay, therefore leading to a higher cr_score. To test the

    assumption of the positive relationship between stars and cr_score, a one-way ANOVA test

    is conducted. We would like to highlight that there are hotels with no star ranking within the

    database (indicated as zero by SPSS). These hotels with missing (zero) star rankings are

    omitted within all further analyses involving the stars variable so as not to distort the test

    analyses.

    The SPSS test result confirms our prediction. At 5% significance level, the mean of

    cr_score significantly differs between different star-ranking hotels (F=154.185 sig=0.000),

    and the descriptive analysis shows the positive relationship between these two variables.

    Specifically, the cr_score increases with the star rating, with the lowest score

    (2.77) for 1-star hotels and the highest score (4.43) for 5-star hotels. It can be concluded

    that customer reviews are positively influenced by the star rating of the hotel.

  • 11

    advertise → cr_score

    The fierce competition in the travel industry forces hotels to promote themselves through

    advertisement to attract more customers. However, it is an illusion to assume that

    advertisement could influence customers' evaluation without them having physically

    experienced the hotel's services. Therefore, we would like to inquire into whether

    advertisement has an effect on cr_score.

    The one-way ANOVA test partially confirmed our hypothesis. At 5% significance level, a

    relationship between these two variables exists (Sig=0.00). The ANOVA table shows that the

    value of MST (2.96) is considerably higher than MSE (0.49), indicating there is a significant

    difference in cr_score between the different advertising-level groups of hotels. As the

    descriptive table shows, non-advertising hotels get the lowest cr_score with 3.54. However,

    instead of hotels which advertise the most (category 1) having the highest cr_score, the

    peak (3.86) appears in the middle-level advertising hotels (category 3). Nevertheless,

    hotels which do not advertise still have the lowest cr_score on average.

    It should also be mentioned that, due to the relatively small number of hotels which

    advertise, the validity of this result could be disputed.

    advertise → price

    Irrespective of the industry, when firms and companies advertise, they often promote

    special offers in the form of discounts to customers in order to make a positive

    advertisement impression. Therefore, we are interested in whether the level of advertising

    has an effect on price. We expected this relationship to be inversely correlated.

    The SPSS test would indicate otherwise, however. At 5% significance level, we cannot

    reject the null hypothesis that the mean prices across the advertise treatments are equal,

    which gives no significant evidence that differences in advertising result in a difference

    in price (F=2.278 sig.=0.059).

    To analyse this result in more depth, Fisher's Least Significant Difference (LSD) post

    hoc analysis was utilised. With the exception of the differences between category 1 and 3

    (sig.=0.027) and between category 3 and 5 (sig.=0.008), the result indicates that most

    pair-wise differences were insignificant at 5% significance level. This pair-wise

    insignificance is consistent with the insignificance of the advertise → price relationship

    overall.

    It can also be determined that hotels which are between the top 25% and bottom 25%

    in terms of ad impressions have the highest mean price (158.56). However, hotels that

    are in the top 5% in ad impressions do indeed have the lowest average price (113.50),

    thus partially confirming our inverse correlation expectation.

  • 12

    While a significance of 5.9% does mean that the differences in price between advertising

    categories are not significant, this is a relatively weak inference considering that the

    p-value is only 0.9 percentage points above the 5% significance level. An increase in the

    sample size should provide a more concrete conclusion in future research.

    Nevertheless, statistically speaking, we find no evidence that price variation is determined by advertise.

    property_type → price

    While we have established under the univariate analysis of property_type that almost 90%

    of the entries are full-service hotels, it is still interesting to determine whether a difference

    in property type has an influence on price. Our prediction is that there is a

    difference in price between each property type.

    Statistical analysis supports our hypothesis (F=10.349 sig.=.000), which gives a

    strong indication that there is a pricing difference between property types.

    This is as expected, as each property type's function already takes into account its target

    group, and in turn that group's income and level of spending (e.g. motels

    in the US mostly serve the trucking and freight handling customers, while youth hostels

    target adolescents and backpackers).

    Further analysis of this relationship indicates that it is difficult to determine whether the

    variances of the treatments are equal to each other, as there are too few samples for some

    of the treatments to draw such conclusions (e.g. there is only one sample within the

    apartment hotel treatment, which has thus been omitted from the ANOVA analysis

    altogether). A descriptive analysis reveals that apartments have the highest average

    price (197.20) and motels the lowest (67.26).

    destination_id → price

    With this ANOVA test, the relationship between the destinations and the price could be

    determined. We expected there to be a difference.

    From the results, an F-test value of 62.701 and a corresponding p-value of .000 were

    derived, indicating that there is overwhelming evidence to reject the null

    hypothesis; the means among the categories do differ. Such a large F-test value implies

    MSE is a lot smaller than MST; the difference in prices between destinations is much larger

    than the difference within a destination.

    Las Vegas had the lowest mean price: 73.69. In addition, New York City had the highest

    mean price: 194.60. This result was quite unexpected. We cannot identify a clear reason

    for it; most observations in Las Vegas were simply very low in comparison to other

    locations. Perhaps Las Vegas really is cheaper than we perceived it to be.

    property_type → cr_score

  • 13

    An ANOVA test was also conducted for the relationship between property_type and cr_score. The

    computed ANOVA shows an F-test value of 26.731 and a p-value of 0.00, meaning that

    there is significant evidence to conclude that there is a difference in means between at

    least 2 groups. With MSE being much smaller than MST, we can therefore conclude that

    the differences of review scores between property types are much greater as compared to

    the differences in scores within one group of hotels.

    There was only one observation of the apartment hotel property type (property type 8), and

    it had a relatively high cr_score of 4 out of 5. We therefore decided to remove this

    treatment, as it distorted the results of the analysis. Upon the removal of this treatment,

    the highest average score is found for the resort type of hotels (3.9088) and the lowest

    for apartments (2.640). This is quite logical, as a resort normally possesses many luxurious

    features and services. Naturally, one would expect such a hotel to have a high score.

    stars → price

    With this ANOVA test, the relationship between the number of stars and the price of hotel

    has been examined. The corresponding F-test shows a value of 304.735 and the p-value is

    0.00. We can thus conclude that there is sufficient evidence to infer that a difference

    between at least two means exists. We would like to point out that the difference between

    star rankings is much higher than the differences within one star ranking, as shown by a

    large MST when compared to MSE. As mentioned previously, hotels with supposed zero

    stars have been removed from the analysis.

    To conclude, the number of stars positively correlates to the average price as expected. A

    strong positive relationship therefore exists between the star ranking and price variables.

    3.3 Coefficient of Correlation

    The coefficient of correlation can be used to measure the strength of a linear relationship

    between two quantitative variables. Results are between -1 and 1, with each extreme

    indicating a perfect negative or positive linear relationship and a result of 0 implying that

    the variables have no linear relationship. Results can be found in appendix A.
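
    For illustration, the Pearson coefficient can be computed directly from its definition, the
    co-variation of the two variables divided by the product of their spreads. The sketch
    below uses hypothetical data and an assumed helper name. It also shows that a coefficient
    of 0 only rules out a linear relationship: a perfectly curved relationship can still
    produce r = 0.

```python
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equally long samples with at least two points")
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return sxy / sqrt(sxx * syy)


# Hypothetical data (illustration only).
print(pearson_r([1, 2, 3], [2, 4, 6]))                 # perfectly linear, increasing
print(pearson_r([1, 2, 3], [3, 2, 1]))                 # perfectly linear, decreasing
print(pearson_r([-2, -1, 0, 1, 2], [4, 1, 0, 1, 4]))   # y = x squared, not linear
```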

    price → cr_score

    In order to analyse the relationship between price and cr_score, a correlation coefficient

    test was conducted. There was some variation in price at both ends of the cr_score variable

    which looked like potential outliers to us, but filtering those observations out gave

    roughly the same correlation coefficient. We therefore decided not to

    exclude those values, as the number of such observations was not large enough to make a

    significant difference.

  • 14

    Both the one-tail and two-tail tests show a p-value of .000 and a Pearson correlation

    coefficient of 0.487. Thus, a positive and moderate relationship between price and

    cr_score exists.

    The scatter plot between the two variables (figure 3) would suggest this relationship is

    non-linear, however. The plot would seem to indicate more of a negative exponential curve,

    suggesting that this variable may be unsuitable for linear regression analysis. Furthermore,

    the variance within this graph also seems to be high.

    Figure 3 Scatter plot between price and cr_score

    emp_sat → cr_score

    Employee satisfaction could be seen as an essential criterion to judge a hotel's overall

    performance. We assume that when employees are working in a better work environment,

    they are more motivated to offer customers the best service, which, in turn, will

    increase cr_score. To test the relation between these two variables, the

    correlation coefficient test was utilised.

    This assumption is supported by the SPSS test result. At the 5% significance level,
    there is a strong positive correlation between these two variables (sig.=.000).
    Further analysis reveals that a one-unit increase in employee satisfaction is
    associated with an increase of 0.821 in cr_score.

    The scatter plot also showed this relationship to be linear, indicating that this
    variable is suitable for further analyses involving linear regression models.

    3.4 Chi-Squared Contingency Table Test

    The chi-squared test is used to infer relationships between qualitative variables. Results

    can again be found in appendix A.

    advertise → stars

    It is hypothesised that the level of advertisement, expressed through the variable
    advertise, is positively associated with the perceived quality of the hotel, stars.
    We assume that the degree of advertisement is proportional to a hotel's physical
    size and to its willingness and need to advertise. The better a hotel fits these
    two criteria, the higher its quality in terms of stars is expected to be, on average.

    The chosen test for this assumption is the chi-squared contingency table test.
    However, because several categories of the advertise variable do not fulfil the
    chi-squared requirement of an expected count of at least 5 per cell, a new
    advertisement variable, advertise_NEW, was used. This variable does not distinguish
    between degrees of advertisement, but classifies hotels as having either very
    little to no advertisement (previously category 5 hotels) or some advertisement
    (all category 1 to 4 hotels). This new variable is then tested against the stars
    variable.

    The contingency table indicates that most hotels that do not advertise have a star
    rating of 2 or 3. In addition, the observed numbers of 4 and 5 star hotels in this
    group are notably lower than the expected values. The converse is true for hotels
    that do advertise: the observed number of 4 and 5 star hotels is almost double the
    expected number. This result is reflected in figure 4, which shows a greater share
    of 4 and 5 star hotels among hotels that advertise (40% in total) than among those
    that do not (17% in total).

    The SPSS output confirms our assumption that advertisement is indeed associated
    with stars (χ²=112.237, sig.=.000). A p-value this close to zero means that the
    evidence for the association is strong; it does not by itself measure how strong
    the association is.
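A minimal sketch of the mechanics of this test, assuming a hypothetical 2x2 table of counts (the actual contingency table is in the SPSS output and is not reproduced here). The helper names, including recode_advertise, are illustrative only.

```python
def recode_advertise(category):
    """Collapse advertise categories 1-4 into 1 (some advertisement), 5 into 0."""
    return 1 if category in (1, 2, 3, 4) else 0

def expected_counts(table):
    """Expected cell counts under independence: row total * column total / n."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[rt * ct / n for ct in col_totals] for rt in row_totals]

def chi_squared(table):
    """Chi-squared statistic with (rows - 1) * (columns - 1) degrees of freedom."""
    expected = expected_counts(table)
    return sum(
        (obs - exp) ** 2 / exp
        for obs_row, exp_row in zip(table, expected)
        for obs, exp in zip(obs_row, exp_row)
    )

# Hypothetical counts, not taken from the report.
toy_table = [[10, 20], [20, 10]]
smallest_expected = min(min(row) for row in expected_counts(toy_table))
toy_chi2 = chi_squared(toy_table)
```

Checking smallest_expected against 5 mirrors the requirement that led to the advertise_NEW recoding.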

    Figure 4 Comparison of number of hotels by stars between hotels that advertise (left) and hotels that do not (right)

    Share of hotels by star rating (1 to 5 stars):
    Hotels that advertise: 1%, 12%, 47%, 32%, 8%
    Hotels that do not advertise: 5%, 38%, 40%, 14%, 3%

    3.5 Bonus Variable Tests

    For our bonus variables, we conducted one-way ANOVA tests for both.

    dAirport → cr_score

    Airports, particularly those that serve international passengers, are often seen as
    symbols of commerce and international trade. Businesses flourish when they have
    easy access to a large airport, and it could be assumed that this business prestige
    carries over to nearby hotels, which in turn reflects positively on cr_score. The
    distance to the nearest international airport (dAirport) is therefore the variable
    we are interested in. If our hypothesis is true, cr_score is inversely related to
    dAirport.

    While the SPSS ANOVA analysis shows that mean cr_score differs significantly
    between distance categories (F=8.134, sig.=.000), the results are slightly
    different from our expectations. Hotels located within the 10-15km band have the
    highest mean cr_score (3.7116), rather than those in the nearest distance category,
    while hotels located more than 20km away have, on average, the lowest cr_score
    (3.4584).

    Figure 5 shows that the relationship between these two variables is suitable for a
    one-way ANOVA test, because the variances of the treatments seem similar.
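The F-statistic behind a one-way ANOVA can be sketched as follows. This is a plain-Python illustration on hypothetical scores for three distance categories, not a reproduction of the SPSS analysis, and the function name is ours.

```python
def one_way_anova_f(groups):
    """F = (between-group mean square) / (within-group mean square)."""
    k = len(groups)                      # number of treatments
    n = sum(len(g) for g in groups)      # total number of observations
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    # Degrees of freedom are (k - 1) and (n - k), matching F(k-1, n-k).
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Hypothetical review scores per distance category.
toy_groups = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
toy_f = one_way_anova_f(toy_groups)
```

The within-group sum of squares in the denominator is where the assumption of similar variances across treatments enters.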

    Figure 5 Scatter plot of dAirport and cr_score

    luxury → cr_score

    It is predicted that there is a positive correlation between the luxury score and
    cr_score, as it is logical to assume that more recreational facilities available to
    the customer lead to greater customer satisfaction.

    This hypothesis is supported, as the SPSS output demonstrates that there is enough
    evidence to indicate that hotels with different luxury scores have different mean
    cr_scores (F=63.636, sig.=.000).

    Further analysis reveals that there is an almost perfectly linear relationship
    between the two variables. Hotels with a luxury score of zero have a mean review
    score of 3.099, while fully luxurious hotels (with a score of four) have a mean of
    4.050.

    In addition, the observations for the highest luxury score lie much higher than
    those for the lower scores, as can be seen below in figure 6. This reinforces our
    view that a higher luxury score results in better review ratings.

    Figure 6 Plot of mean of treatments within luxury versus cr_score

    Chapter 4 MULTIVARIATE ANALYSIS

    The following section analyses the multifactor influences on the central dependent variable.

    4.1 Two-Factor ANOVA Analysis

    advertise*stars → cr_score

    The bivariate analysis determined that stars correlated with the central dependent
    variable, cr_score, while advertise did not. The chi-squared contingency table test
    also established that the advertise and stars variables are dependent on each
    other. This led us to consider whether the interaction of these two variables
    would have an effect on cr_score.

    A two-factor ANOVA test was used for this purpose. The SPSS output reproduced the
    bivariate results for both treatments: stars has a significant effect on cr_score
    (sig.=.000), while advertise does not (sig.=.778).

    The SPSS output further indicates that stars and advertise do not interact to
    affect cr_score (F=.543, sig.=.888). Because the p-value of stars*advertise is much
    higher than the standard 5% significance level, the test fails to reject the null
    hypothesis of no interaction.

    This outcome can also be seen graphically in figure 7. Note the relatively parallel
    increases in the marginal mean of cr_score across the different levels of
    advertising as the number of stars increases. This behaviour is indicative of a
    weak interaction effect of the two variables on cr_score.

    While it was previously shown that there is a significant relationship between
    advertise and stars, this relationship does not seem to influence cr_score.
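The "parallel lines" reading of an interaction plot can be made concrete with difference-in-differences of cell means. The sketch below uses hypothetical cell means (rows are advertise levels, columns are star ratings); it illustrates the idea only and is not the F-test that SPSS performs.

```python
def interaction_contrasts(cell_means):
    """Compare each row's step between adjacent columns with the first row's step.

    cell_means[i][j] is the mean cr_score for advertise level i and stars level j.
    Contrasts near zero mean the lines are parallel, i.e. little interaction.
    """
    base = cell_means[0]
    contrasts = []
    for row in cell_means[1:]:
        for j in range(1, len(row)):
            step_row = row[j] - row[j - 1]
            step_base = base[j] - base[j - 1]
            contrasts.append(step_row - step_base)
    return contrasts

# Hypothetical means: parallel profiles versus crossing profiles.
parallel = [[3.0, 3.4, 3.9], [3.1, 3.5, 4.0]]
crossing = [[3.0, 3.4, 3.9], [3.9, 3.5, 3.0]]

parallel_contrasts = interaction_contrasts(parallel)
crossing_contrasts = interaction_contrasts(crossing)
```

In the parallel case every contrast is zero up to rounding, which is the pattern described for figure 7.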

    Figure 7 Graph presenting advertising*star interaction effect

    4.2 Regression Model I (RM1)

    The previous section analysed the interaction effect between two variables. The
    following two sections analyse the combined influence of multiple variables through
    linear regression analysis.

    Within our model, there are two quantitative variables and six qualitative
    variables, from which 28 dummy variables are formed. Using SPSS linear regression
    analysis, the following model could be determined:

    Validity and Explanatory Power Analysis

    The model as a whole is highly significant (F=110.451, sig.=.000), with the most
    significant variables being price and emp_sat. The model also has a high
    explanatory power, as can be seen from a moderately high coefficient of
    determination (R²=0.696). In addition, a graph of cr_score against the Regression
    Standardised Predicted Value (RSPV) indicates that the relationship seems linear.
    The fact that most data points are concentrated around the regression line
    indicates that the model is a good fit.
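The F-statistic and the coefficient of determination are linked through F = (R²/k) / ((1 - R²)/(n - k - 1)). The sketch below applies this identity, taking n = 1477 and k = 30 from the degrees of freedom listed in appendix C ("T(1477-30-1)" and "F(30, 1446)"); the function name is ours.

```python
def f_from_r_squared(r_squared, n, k):
    """Overall regression F-statistic implied by R-squared.

    n is the number of observations, k the number of predictors.
    """
    return (r_squared / k) / ((1.0 - r_squared) / (n - k - 1))

# Values reported for regression model I.
f_rm1 = f_from_r_squared(0.696, n=1477, k=30)
```

The result is close to the reported F=110.451; the small gap is consistent with R² being rounded to three decimals.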

    Variables Analysis

    It should first be noted that SPSS omitted several variables due to very low
    correlations with cr_score: property_type3, property_type4 and property_type6. The
    SPSS analysis also indicates that three variables are significant at the 5%
    significance level: price, emp_sat and property_type1.

    The model shows that the price variable has a significant relation to cr_score
    (t=4.77, sig.=0.00). Holding all other variables constant, a one-unit increase in
    price adds 0.001 to cr_score, indicating a positive relationship. As tested in the
    bivariate analysis, higher-priced hotels on average also offer better services and
    general hotel quality (see the stars → price bivariate test). The earlier
    price → cr_score bivariate test likewise showed a significant relationship between
    the two variables. Note that although the price → cr_score relationship was
    previously found to be non-linear, the variable was still included to test its
    multivariate characteristics.

    The same goes for the influence of emp_sat on cr_score (t=37.649, sig.=0.00). If
    employee satisfaction increases by one unit, cr_score will be 0.819 higher, holding
    the other variables constant. This is a surprising result, given that cr_score, a
    customer-orientated scoring system, is so strongly influenced by employee
    satisfaction. Nevertheless, we already demonstrated in the emp_sat → cr_score
    bivariate test that these two variables are correlated, which provides further
    evidence that the regression analysis is valid.

    The only property_type variable with a significant p-value is property_type1, the
    full service hotels (t=2.691, sig.=.000). Being a full service hotel increases the
    cr_score of the hotel by 0.151. We already know from the property_type → cr_score
    analysis that these

    two variables are correlated. Furthermore, we have also demonstrated that full
    service hotels receive a higher cr_score rating than most other property types. We
    believe this is because full service hotels are better equipped to handle different
    customer groups (e.g. travellers and businessmen) than other property types. This
    service compatibility thus increases their overall cr_score rating.

    One of the more surprising results within the model concerns the chain variable.
    The chain → cr_score bivariate analysis demonstrated that there is very little
    difference between chained and non-chained hotels (i.e. low significance). In the
    regression analysis, by contrast, the p-value of chain is much lower (t=1.893,
    sig.=.059), although still above the 5% level, and the estimated difference between
    chained and unchained hotels is larger: on average, being in a hotel chain leads to
    a cr_score increase of 0.053, compared with 0.013 in the chain → cr_score test.

    Multicollinearity Analysis

    While most variables within the regression analysis are insignificant, there are no
    signs of serious multicollinearity within our model.

    The price variable has the highest VIF value (3.053), but this value is still low
    enough to rule out problematic multicollinearity. Nevertheless, further analysis of
    this variable reveals that it is highly collinear with the dummy variables luxury1
    and luxury3 (variance proportions are 0.51 and 0.41 respectively). Removing these
    variables leads to a slightly lower VIF value, but price then becomes highly
    collinear with property_type7 and property_type8 instead.

    The conclusion that can be drawn from this is that price is collinear with all
    dummy variables to some degree. This result seems reasonable, as these variables
    and factors all logically have an impact on the price variable to a certain extent
    (as demonstrated in some of the bivariate tests involving price). While we should
    note that such influences exist, price is still a significant and predictive
    variable within our model.

    Another group of variables with higher VIF values on average is the dest_id dummy
    variables, which all have a VIF value of 2.0 to 2.6. Most of this multicollinearity
    exists among the dest_id dummy variables themselves, which indicates that dest_id
    is perhaps not a very suitable variable for our model.

    All other variables have low VIF values of 2 or lower.
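For reference, the VIF of a predictor equals 1 / (1 - R²), where R² comes from regressing that predictor on all the other predictors. The short sketch below inverts this definition to show how much of price's variance the other predictors would explain at the reported VIF; the function names are ours.

```python
def vif_from_r_squared(r_squared):
    """Variance inflation factor for a predictor with auxiliary R-squared."""
    return 1.0 / (1.0 - r_squared)

def r_squared_from_vif(vif):
    """Share of a predictor's variance explained by the other predictors."""
    return 1.0 - 1.0 / vif

# Reported VIF of price in regression model I.
price_shared_variance = r_squared_from_vif(3.053)

# A predictor unrelated to all others has VIF = 1.
uncorrelated_vif = vif_from_r_squared(0.0)
```

On this reading, a VIF of 3.053 corresponds to roughly two thirds of the variance in price being shared with the other predictors.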

    Conclusion

    Overall, model I is valid and has a high explanatory power, but because it has only
    three significant variables, it has a low predictive power.

    4.3 Regression Model II (RM2)

    For our second regression model, we decided to include some variables which did not
    make it through our selection for the proposed causal relationship scheme, such as
    the number of customer and employee satisfaction reviews. In addition, we were
    interested in the qualitative dummy variable green. Finally, we wanted to include
    our other two bonus variables, as we were highly interested in their outcome.

    Validity and Explanatory Power Analysis

    The second model has a similarly high coefficient of determination (R²=0.694), as
    well as a highly significant F-test (F=256.244, sig.=.000). This means that our
    second model also has a high explanatory power and is of relatively good fit.
    Plotting RSPV against cr_score shows signs of linearity, with the majority of the
    data points centred on the regression line.

    Variables Analysis

    Holding the other variables constant, we found four variables that are
    significantly related to the central dependent variable: price, emp_sat, dTallestB4
    and hotrec0.

    Not surprisingly, the positive relationships of price and employee satisfaction
    with cr_score still hold in this model, which reconfirms our conclusions from the
    bivariate tests and the previous regression model. This consistency suggests that
    these two variables are essential when predicting a hotel's cr_score.

    Hotrec0 is the only hotrec variable significantly related to cr_score (t=2.60,
    sig.=0.01). A hotel's cr_score is negatively influenced, decreasing by 0.11, if the
    hotel offers none of the following facilities: restaurant, safe, business centre
    and minibar. It is unexpected, however, that the remaining hotrec variables are not
    significantly related to cr_score.

    DTallestB4 is the last significant variable relating to cr_score (sig.=0.03).
    Specifically, being located 15 to 20km away from the tallest building increases a
    hotel's cr_score by 0.07. This result is counterintuitive: a long distance to the
    city's largest business and retail district should negatively influence the hotel's
    cr_score, because it is inconvenient for travel. An alternative explanation is
    that, by being located 15 to 20km from the central business district, the hotel is
    able to avoid overcrowding and congestion issues,

    thus providing extra comfort for the hotel's customers. The relations between the
    other dTallestB variables and the central variable are not significant.

    The SPSS output does not support our assumption of a relation between cr_score and
    the variables green, employee reviews and customer reviews at the 5% significance
    level.

    Multicollinearity Analysis

    The coefficients table shows that the VIF values of the emp_reviews and cr_reviews
    variables (5.14 and 4.63, respectively) are considerably higher than the rest,
    indicating a multicollinearity problem within this model. To investigate possible
    reasons, the variable with the highest VIF value, emp_reviews, was temporarily
    removed from the list of independent variables. After performing the test again,
    the VIF value of cr_reviews decreased markedly to 1.226. It can therefore be
    concluded that emp_reviews is the cause of the high collinearity of cr_reviews. We
    do not, however, know the reason for this relationship.

    Two other variables with relatively high VIF values, namely hotrec3 (2.19) and
    hotrec4 (2.53), also merit further analysis. It turns out that these variables are
    also influenced by emp_reviews, as their VIF values dropped to 1.393 and 1.583
    respectively upon the removal of emp_reviews from the regression analysis. Again,
    we are not able to explain the cause of this relationship.

    Apart from the aforementioned instances, all other variables within this model have
    acceptable VIF levels. This mirrors the result of the first regression model, where
    most variables have low multicollinearity but are also insignificant.

    Conclusion

    The second regression model yielded similar results to the first in terms of
    validity, explanatory power, variable significance and multicollinearity levels.
    While this regression analysis yielded two additional significant variables, it
    also demonstrates that only a few variables are determinants of cr_score.

    4.4 Comparison of Model I and Model II

    Both models are very similar in nature: both contain few significant variables of low

    multicollinearity, but still have a high explanatory power and validity. Nevertheless, a

    combination of the results of the two models leads to the creation of the final regression

    model which still holds some degree of predictive power.

    Chapter 5 CONCLUSION AND EVALUATION

    Based on the conclusions from the two regression models, we are able to identify
    five significant variables for our final regression model:

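    cr_score = β0 + 0.001·price + 0.841·emp_sat + 0.134·property_type1
               + 0.054·dTallestB4 − 0.142·hotrec0, where β0 is the intercept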

    In answering the focus question "What Determines Hotel Customer-Review Scores?",
    this model states that customer satisfaction is determined by room price, employee
    satisfaction, hotel property type, the number of HOTREC facilities within the
    hotel, and the distance from the tallest building in the city.

    More specifically, for the last three qualitative variables, there is only a
    significant relation with cr_score when the hotel is a full service hotel, has none
    of the HOTREC facilities, or is located 15 to 20km from the tallest building in the
    city.

    5.1 Interpretation and Significance of Results

    A one-unit increase in price or emp_sat, while keeping all other variables
    constant, increases cr_score by 0.001 and 0.841 respectively. If the hotel fulfils
    the criteria of property_type1 and dTallestB4, cr_score increases by 0.134 and
    0.054 respectively. However, should the hotel in question fulfil the hotrec0
    criterion, cr_score decreases by 0.142.
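The coefficients above can be combined into a small calculator. This is a sketch only: the intercept of the final model is not stated in the text, so the function returns the part of the predicted cr_score contributed by the five variables, and the example hotel is hypothetical.

```python
# Coefficients as reported in section 5.1; the intercept is not given in the text.
COEFFICIENTS = {
    "price": 0.001,
    "emp_sat": 0.841,
    "property_type1": 0.134,   # 1 if full service hotel, else 0
    "dTallestB4": 0.054,       # 1 if 15-20km from the tallest building, else 0
    "hotrec0": -0.142,         # 1 if the hotel has none of the HOTREC facilities
}

def score_contribution(hotel):
    """Sum of coefficient * value over the five variables (intercept excluded)."""
    return sum(COEFFICIENTS[name] * hotel.get(name, 0) for name in COEFFICIENTS)

# Hypothetical hotel: price of 100, employee satisfaction of 4, full service.
example_hotel = {"price": 100, "emp_sat": 4, "property_type1": 1}
example_contribution = score_contribution(example_hotel)
```

Comparing two hotels through the difference of their contributions avoids the unknown intercept, since it cancels out.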

    We can conclude from this analysis that emp_sat is the most influential variable
    within this model, while price is the least. These two variables are more
    controllable by hotel managers than the other variables, and it is for this reason
    that managers should place the most emphasis on them when looking to improve
    customer satisfaction. In addition, these variables have more predictive ability
    than the qualitative variables, as they apply to a wider variety of hotel types and
    situations.

    It should be noted, however, that simply increasing the price of a hotel will not
    directly lead to higher levels of customer satisfaction. An increase in price has
    to be justified by an equal increase in the quality and services of the hotel;
    otherwise it could instead have a negative effect on the satisfaction rating.
    Customers are more concerned with the hotel's value for money than with the price
    of its rooms. Managers are advised to avoid using price as a direct predictor of
    cr_score, and should instead look for ways to justify a higher price by increasing
    the service quality of the hotel.

    The reader can gain a rough perspective of price determinants from our bivariate
    analysis, which contains a considerable number of relationships involving price.
    More in-depth research would, however, have to be conducted to accurately find the
    determinants of price, something which goes beyond the scope of this report.

    Caution must also be exercised when using the price variable due to its non-linear
    behaviour in relation to cr_score. Future analysis and research should look into
    non-linear regression models to incorporate this variable more effectively.

    5.2 Recommendations for Hotel Managers and Owners

    Our recommendation for hotel owners and managers is to prioritise improving
    employee satisfaction and the hotel's value for money.

    Increase value for money, invest in HOTREC facilities

    Firstly, as mentioned previously, managers and owners should invest in more
    value-adding services within the hotel to increase value for money for customers,
    which in turn could potentially increase price. In particular, investments in the
    facilities listed under the HOTREC variables are essential, as neglecting these
    facilities significantly lowers the customer review score. The hotel can also
    choose to invest in sports-related luxury facilities and increased customer
    services as a means to increase the perceived value for money.

    Improve employee satisfaction and implement data within operations

    Secondly, hotels should look into ways of improving employee satisfaction and of
    implementing schemes and operations that involve it. Managers should first
    implement greater

    measurability of employee satisfaction through both qualitative means (e.g. more
    frequent meetings and employee feedback) and quantitative means (e.g. a balanced
    scorecard). After establishing these measures, managers should set improvement
    targets based on the points employees perceive as unsatisfactory, and act
    accordingly to address them.

    In addition to this reactive strategy, managers should also implement a proactive
    strategy for improving employee satisfaction. This includes making sure that
    employees are provided with the tools and ability to conduct their work
    effectively, as well as investing more in human resource related areas. The
    establishment of an effective human resource management (HRM) department is an
    important factor to consider.

    Consider geographical position of hotel

    Hotel owners may wish to consider geographical factors relating to the hotel. If a
    relocation of the hotel or the opening of a new hotel is considered, owners are
    advised to look into locations which are not overly congested with offices and
    businesses, but which still contain relevant service facilities (e.g. shopping
    malls) for customers to enjoy. Locations which fulfil these criteria are typically
    located about 15 to 20km away from the tallest building in the city.

    Lastly, hotel owners should look into serving different customer groups so as to
    remain competitive with other hotels. Full service hotels are most capable of
    fulfilling this requirement.

    5.3 Evaluation of Data and Research

    While this research has not yielded many determinants of customer satisfaction scores, it

    does demonstrate the complexity of human nature, something that will always be the basis

    for further research for social scientists.

    The biggest shortfall of the data collected for this report is the uneven
    distribution of samples for certain independent variables such as property_type and
    advertisement; the lack of samples for certain treatments within these variables
    often led to weak or false conclusions within tests. A larger sample size would
    overcome this issue.

    The second shortfall of this research is the high number of qualitative variables,
    which lowers the predictive value of the research. Additional quantitative
    variables, such as the number of operational years, could add further depth to the
    research and could allow us to make more relevant recommendations for hotel owners.

    Another shortfall is that we could not use all variables to test the research
    hypothesis. Including them all would have made the analysis highly complicated, so
    we had to choose which variables to include and which to omit. We could bridge this
    partially by creating the

    bonus variables luxury and hotrec, which were built on some of the variables we
    were not able to include.

    Regardless, we were able to obtain a satisfactory goodness of fit (R²) for both
    regression analyses. We therefore think that the variables which were in fact
    significant did have good explanatory power for the central variable.

    Overall, this field of study is still relatively open, and further research in the future will most

    certainly be welcomed.

    APPENDIX A: BIVARIATE ANALYSIS TESTS

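    Tests and demonstrated examples
      Chi-squared (χ²) test: advertise → stars
      Unequal-variances t-test: chain → score
      Variance F-test: chain → score
      One-way ANOVA F-test: advertise → price
      Correlation coefficient: price → score

    Step 1: Formulate H0 and H1
      Chi-squared test: H0: advertise and stars independent; H1: advertise and stars dependent
      Unequal-variances t-test: H0: μ1 − μ2 = 0; H1: μ1 − μ2 ≠ 0
      Variance F-test: H0: σ1²/σ2² = 1; H1: σ1²/σ2² ≠ 1
      One-way ANOVA F-test: H0: μ1 = μ2 = … = μn; H1: at least two means differ
      Correlation coefficient: H0: ρ = 0; H1: ρ ≠ 0

    Step 2: Determine test statistic
      ( )

    Step 3: Determine test statistic distribution
      Chi-squared test: χ²obs ~ χ²((r-1)(c-1))
      Unequal-variances t-test: t ~ t(ν), with Welch degrees of freedom
      Variance F-test: F ~ F(n1-1, n2-1)
      One-way ANOVA F-test: Fobs ~ F(k-1, n-k)
      Correlation coefficient: T ~ t(n-2)

    Step 4: Assess intuitive rejection area
      Chi-squared test: χ²obs >> 0
      Unequal-variances t-test, variance F-test: tobs >> 0 tobs 1
      One-way ANOVA F-test: Fobs >> 0
      Correlation coefficient: T ≠ 0

    Step 5: Determine significance
      α = 0.05 for all five tests

    Step 6: Look up critical value
      Chi-squared test: χ²(4) = 9.49
      Unequal-variances t-test: t(467.759) ≈ t(468) = 1.96
      Variance F-test: F(341, 1219) = 1.15
      One-way ANOVA F-test: F(4, 1557) = 2.37
      Correlation coefficient: t(1520) = 1.96

    Step 7: Perform the test
      ( )

    Conclusion
      Chi-squared test: because χ²obs is bigger than χ² (112.237>9.49), reject H0
      Unequal-variances t-test: because tobs is smaller than t (0.271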

    APPENDIX B: TWO-WAY ANOVA TESTS

    Treatments and demonstrated examples
      Treatment A: stars
      Treatment B: advertise
      Treatment A*B: stars*advertise

    Step 1: Formulate H0 and H1
      Each treatment: H0: μ1 = μ2 = … = μn; H1: at least two means differ

    Step 2: Determine test statistic
      ( )

    Step 3: Determine test statistic distribution
      Each treatment: Fobs ~ F(k-1, n-k)

    Step 4: Assess intuitive rejection area
      Each treatment: Fobs >> 0

    Step 5: Determine significance
      α = 0.05 for all three tests

    Step 6: Look up critical value
      Treatment A: F(4, 1488) = 2.37
      Treatment B: F(4, 1488) = 2.37
      Treatment A*B: F(12, 1488) = 1.76

    Step 7: Perform the test

    Conclusion
      Because Fobs is bigger than F (25.444

    APPENDIX C: MULTIVARIATE ANALYSIS TESTS

    | | T-Test for Regression Coefficient | F-Test for Regression Model |
    |---|---|---|
    | Demonstrated example | RM1 emp_sat | RM1 |
    | Step 1: Formulate H0 and H1 | H0: β2 = 0; H1: β2 ≠ 0 | H0: β1 = β2 = ... = βn = 0; H1: Not all βi equal to zero |
    | Step 2: Determine test statistic | Tobs | Fobs |
    | Step 3: Determine test stat distribution | Tobs ~ T(n-k-1) | Fobs ~ F(k, n-k-1) |
    | Step 4: Assess intuitive rejection area | Tobs > 0 | Fobs >> 1 |
    | Step 5: Determine significance | α = 0.05 | α = 0.05 |
    | Step 6: Look up critical value | T(1477-30-1) = 1.96 | F(30, 1446) = 1.47 |
    | Step 7: Perform the test | | |
    | Conclusion | Because Tobs is bigger than T (37.649 > 1.96), therefore reject H0 | Because Fobs is bigger than F (110.451 |
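The critical value in the T-test column can be checked with a short calculation. The following is a minimal sketch, assuming that with 1446 degrees of freedom the t distribution is close enough to the standard normal for the two-sided 5% critical value to be taken from the normal quantile. The variable names are illustrative and do not come from the report.

```python
from statistics import NormalDist

alpha = 0.05          # significance level from Step 5
n, k = 1477, 30       # observations and regressors from Step 6
df = n - k - 1        # degrees of freedom of the t-test

# Assumption: for df this large, the t critical value is approximated
# by the standard normal quantile (two-sided test, so alpha / 2 per tail).
z_crit = NormalDist().inv_cdf(1 - alpha / 2)

t_obs = 37.649        # observed statistic for emp_sat in RM1
reject_h0 = abs(t_obs) > z_crit

print(df, round(z_crit, 2), reject_h0)
```

The approximation reproduces the 1.96 used in the table, and the observed statistic lies far inside the rejection area.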


    APPENDIX D: BONUS VARIABLES

    Description of bonus variables

    dAirport

    A city's airport is an important contributor to business and tourism within a city. This infrastructure could therefore have an effect on the variables tested in this report. This variable measures the distance to the nearest international airport, and hotels are subsequently classed into five distance categories: 20km.

    dTallestB

    Distance to the tallest building within the city (dTallestB) is a variable chosen to reflect the hotel's distance from the city's largest business and retail district, in which the city's tallest building is normally located. Originally, we wanted to measure the distance between hotels and city centres, but as there is no universal definition of a city centre, such a variable was difficult to compute. Thus, the distance to the tallest building was chosen. The distance categories are the same as for dAirport.

    luxury

    The luxury scale has been arbitrarily determined by the authors from the data available to them. It covers facilities which provide recreational services to customers but are not a prerequisite for the successful running of daily hotel operations. In this instance, the chosen variables are fitness centre (fit), swimming pool (pool), tennis court (tenn) and sauna (saun). The number of listed facilities that the hotel possesses equals its luxury score.

    hotrec

    The hotrec scale consists of variables which Hotels, Restaurants & Cafés in Europe (HOTREC) deems important to a high-quality hotel. The items within the hotrec score differ from the luxury score by focusing on facilities which do not necessarily provide recreational services for customers, but which provide comfort and increased accessibility. These include restaurant (res), safe (safe), business centre (bus) and minibar (mini).

    Calculation of distances between two coordinates (dAirport, dTallestB)

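    Calculations for these two variables use the Haversine formula:

    $$d = 2R \arcsin\left(\sqrt{\sin^2\left(\frac{\mathrm{lat}_2 - \mathrm{lat}_1}{2}\right) + \cos(\mathrm{lat}_1)\cos(\mathrm{lat}_2)\sin^2\left(\frac{\mathrm{lon}_2 - \mathrm{lon}_1}{2}\right)}\right)$$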


    Where d is the distance between the two coordinates, lat_n is the latitude of coordinate n, lon_n is the longitude of coordinate n, and R is the radius of the sphere in question (6371 km, the average radius of Earth, is used for this report).

    Next, the coordinates of each landmark must be identified. Once determined, these coordinates were first converted into degrees, and then into radians, before being entered into the Haversine formula.

    | Name | City | Latitude | Longitude |
    |---|---|---|---|
    | Hartsfield-Jackson Atlanta International Airport | Atlanta | 33.636667 | -84.428056 |
    | Chicago O'Hare International Airport | Chicago | 41.978611 | -87.904722 |
    | Chicago Midway International Airport | Chicago | 41.786111 | -87.7525 |
    | Los Angeles International Airport | Los Angeles | 33.9425 | -118.408056 |
    | McCarran International Airport | Las Vegas | 36.08 | -115.152222 |
    | John F. Kennedy International Airport | New York City | 40.639722 | -73.778889 |
    | LaGuardia Airport | New York City | 40.777222 | -73.8725 |
    | Orlando International Airport | Orlando | 28.429444 | -81.308889 |

    Figure 8 Table listing chosen international airports and their coordinates

    An example would be calculating the distance between the Inn at the Peachtrees hotel, located in Atlanta, and the Hartsfield-Jackson Atlanta International Airport. The coordinates of the two locations are (33.7636, -84.3878) and (33.6367, -84.4281) respectively. Converted to radians, these coordinates become (0.5893, -1.4728) and (0.5871, -1.4736) respectively. From here, the distance can then be calculated:

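    $$d = 2 \times 6371 \times \arcsin\left(\sqrt{\sin^2\left(\frac{0.5871 - 0.5893}{2}\right) + \cos(0.5893)\cos(0.5871)\sin^2\left(\frac{-1.4736 - (-1.4728)}{2}\right)}\right)$$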
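The calculation above can be sketched in code. This is a minimal sketch, assuming the standard form of the Haversine formula and the 6371 km radius stated in the report. The function name `haversine_km` and the variable names are illustrative and not part of the original analysis.

```python
import math

EARTH_RADIUS_KM = 6371.0  # average Earth radius, as used in the report


def haversine_km(lat1_deg, lon1_deg, lat2_deg, lon2_deg, radius_km=EARTH_RADIUS_KM):
    """Great-circle distance between two (latitude, longitude) points given in degrees."""
    # Convert degrees to radians before applying the trigonometric functions.
    lat1, lon1, lat2, lon2 = map(math.radians, (lat1_deg, lon1_deg, lat2_deg, lon2_deg))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(a))


# Coordinates from the worked example: (latitude, longitude) in degrees.
hotel = (33.7636, -84.3878)     # Inn at the Peachtrees, Atlanta
airport = (33.6367, -84.4281)   # Hartsfield-Jackson Atlanta International Airport

distance_km = haversine_km(*hotel, *airport)
print(f"{distance_km:.1f} km")
```

Under these assumptions the hotel lies roughly 14.6 km from the airport. Passing degrees and converting inside the function avoids rounding the radian values by hand.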

    Determination of luxury and HOTREC scales (luxury, hotrec)

    Both scales range from zero to four and count how many of the listed facilities are present in each hotel.

    In the instance of Sheraton Hotel Atlanta, we observe that this hotel has a business centre, restaurant, safe and minibar. These four facilities count towards the hotrec scale, thus this hotel has a hotrec score of four.

    This hotel, however, only has a fitness centre and swimming pool, thus fulfilling only two criteria of the luxury scale. Thus, Sheraton Hotel Atlanta has a luxury score of two.
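The scoring rule can be sketched as a simple count. This is a minimal sketch, assuming each facility is stored as a 0/1 indicator under the abbreviations given in the variable descriptions (fit, pool, tenn, saun, res, safe, bus, mini). The function `facility_score` and the dictionary layout are illustrative and not taken from the report's data set.

```python
LUXURY_ITEMS = ("fit", "pool", "tenn", "saun")   # recreational facilities
HOTREC_ITEMS = ("res", "safe", "bus", "mini")    # comfort and accessibility facilities


def facility_score(hotel, items):
    """Count how many of the listed facilities the hotel has (0 to len(items))."""
    return sum(1 for item in items if hotel.get(item, 0))


# Hypothetical record for Sheraton Hotel Atlanta, following the example above.
sheraton_atlanta = {
    "fit": 1, "pool": 1, "tenn": 0, "saun": 0,
    "res": 1, "safe": 1, "bus": 1, "mini": 1,
}

luxury = facility_score(sheraton_atlanta, LUXURY_ITEMS)
hotrec = facility_score(sheraton_atlanta, HOTREC_ITEMS)
print(luxury, hotrec)
```

For this record the sketch yields a luxury score of two and a hotrec score of four, matching the example.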