Effects of product network relationships on demand in Russian e-commerce
Customer lifetime value as one of the key business indicators with which we can predict net income, as well as the future relationship between the company and the client. Characteristics of several measurements and evaluations of recommendation systems.
Category | Marketing, advertising and trade |
Type | diploma thesis |
Language | English |
Date added | 18.07.2020 |
File size | 2.7 M |
Although the Recommendation Network yields some significant effects, their statistical significance is not strong. Nonetheless, they are worth analyzing. One of them is the discount on Recommended products: a higher discount tends to decrease the number of Reviews of the main product. This result follows the same logic as with Ratings as the dependent variable. The recommendation network of co-purchased products also enters the equation this time. The price of co-purchased products has a negative influence on Reviews; hence, when a more expensive product is offered as a co-purchase, the reviews are negatively influenced. On the other hand, Ratings of co-purchased products have a positive influence on Reviews, so highly rated products tend to increase the number of Reviews of the main product. Based on this model, the recommendation network suggests that, to increase the number of Reviews of a product, it might be worth testing the display of co-purchased products with a lower price and a high rating.
Figure 24. Linear regression of reviews of main product
Sales is the main factor we seek to analyze in this research. According to previous research, the Recommendation Network drives value and increases demand in different ways. That is why we carried out a deeper analysis of this variable. We performed several regressions: one for the overall data, and one for each cluster with significant results. There are four variables from the control group that affect Sales. The first of them is the Rating: as expected, the higher a product is rated, the more likely it is to be purchased. Following the same trend, the Discount has a positive influence on Sales, so products with a higher discount are also more likely to be purchased. A further control variable that influences Sales is the number of Reviews; interestingly, this variable has a stronger influence than Discount. Thus, products with a higher number of Reviews are more likely to have more Sales, even more so than products with high discounts.
The recommendation network of products influences Sales in several ways depending on the cluster. Nonetheless, there are three variables that help to explain Sales in the overall scenario. The price of Recommended Products has a slight effect on the Sales of the main product: the model suggests that, if products with higher prices are presented in the Recommendation Network, the customer is more likely to buy the main product. Additionally, the variable from the Also Viewed products with the strongest explanatory power is the Number of Reviews. This variable influences Sales positively, meaning that Reviews of similar products drive customers to buy the focal product. If the Recommended products are from the same brand, the reviews may confirm the customer's purchase even more strongly. Finally, along the same lines but with less intensity, the number of Reviews of a co-purchased product has a positive effect on demand. Summing up the variables of the overall model, presenting products with a high number of Reviews can increase the probability that a product is purchased. Moreover, presenting more expensive alternatives might lead the customer to end up buying the focal product.
Figure 25. Linear regression of sales of main product
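The overall regression of Sales on control and recommendation-network variables can be illustrated with a minimal sketch. It assumes ordinary least squares on standardized predictors; the variable names (`av_price`, `av_reviews`, `cp_reviews`), the effect sizes and the data are synthetic and hypothetical, not the Ozon.ru dataset or the authors' actual code.

```python
import numpy as np

def fit_ols(X, y):
    """Return (intercept, coefficients, r_squared) of an ordinary least squares fit."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    design = np.column_stack([np.ones(len(X)), X])  # prepend the intercept column
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ beta
    ss_res = float(residuals @ residuals)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    return beta[0], beta[1:], 1.0 - ss_res / ss_tot

# Synthetic, illustrative data only (assumption: standardized predictors).
rng = np.random.default_rng(0)
n = 200
features = ["rating", "discount", "reviews", "av_price", "av_reviews", "cp_reviews"]
X = rng.normal(size=(n, len(features)))
true_coefs = np.array([0.5, 0.2, 0.8, 0.1, 0.4, 0.2])  # made-up effect sizes
sales = 1.0 + X @ true_coefs + rng.normal(scale=0.1, size=n)

intercept, coefs, r2 = fit_ols(X, sales)
for name, c in zip(features, coefs):
    print(f"{name:>10}: {c:+.3f}")
print(f"R-squared: {r2:.3f}")
```

With standardized predictors, the magnitudes of the coefficients can be compared directly, which is how statements such as "Reviews have a stronger influence than Discount" can be read.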
Cluster analysis is needed in our study to classify the data, to investigate the principle of user grouping, and to test the hypotheses described in the “Research design and methodology” section. Our analysis showed that the k-means (PAM) algorithm is the better and more convenient choice for our study; it requires a data sample, the number of clusters and the distances between the values. Since our dataset contains different recommendation networks, with data on the focal product, also viewed products and co-purchase products, we apply this method three times.
To implement cluster analysis, we divided our dataset into three samples for different product categories: focal product, also viewed products and co-purchase products.
Then we determined the variables by which the objects in the sample will be evaluated:
· Main product - brand, product name, price, rating, discount, number of reviews and sales.
· Also viewed products - brand, product name, average price, average rating, average discount, average number of reviews and sales.
· Co-purchase products - brand, product name, average price, average rating and average number of reviews.
The next step is the calculation of the values of a measure of similarity between objects.
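As an illustration of this step, the sketch below standardizes each numeric variable and computes pairwise Euclidean distances. The choice of Euclidean distance on z-scores is an assumption (the text does not name the measure used), and the three products are hypothetical.

```python
from math import sqrt
from statistics import mean, pstdev

def standardize(rows):
    """Convert every column to z-scores so that price does not dominate rating."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]  # guard against constant columns
    return [[(v - m) / s for v, m, s in zip(r, mus, sds)] for r in rows]

def distance_matrix(rows):
    """Pairwise Euclidean distances between standardized rows."""
    z = standardize(rows)
    return [[sqrt(sum((a - b) ** 2 for a, b in zip(p, q))) for q in z] for p in z]

# Hypothetical products: (price in rubles, rating, discount %, number of reviews)
products = [
    (10000, 4.8, 5, 110),
    (10500, 4.7, 7, 100),
    (30000, 4.1, 10, 6),
]
D = distance_matrix(products)
for row in D:
    print([round(d, 2) for d in row])
```

Categorical variables such as brand and product name would need a different treatment (for example a mixed-type dissimilarity); that is outside this sketch.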
For the first data set (focal product), the graph shows that 7 is the optimal number of clusters when using the k-means (PAM) algorithm.
Figure 26. Calculation of how many clusters need to be applied for the main product
For the second data set (also viewed products), the graph shows that 7 is the optimal number of clusters when using the k-means (PAM) algorithm.
Figure 27. Calculation of how many clusters need to be applied for also viewed products
For the third data set (co-purchase products), the graph shows that 11 is the optimal number of clusters when using the k-means (PAM) algorithm.
Figure 28. Calculation of how many clusters need to be applied for co-purchased products
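One common way to pick the number of clusters is to compare the average silhouette width across candidate values of k. The sketch below is written under that assumption (the text does not state which criterion produced Figures 26 to 28). It uses a simplified k-medoids with alternating assignment and update steps rather than the full PAM swap search, and it runs on three synthetic, well-separated groups.

```python
import numpy as np

def pairwise_dist(X):
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=2))

def k_medoids(D, k, max_iter=100):
    """Simplified k-medoids on a distance matrix (not the full PAM swap phase)."""
    # Deterministic farthest-point initialisation.
    medoids = [int(D.sum(axis=1).argmin())]
    while len(medoids) < k:
        medoids.append(int(D[:, medoids].min(axis=1).argmax()))
    medoids = np.array(medoids)
    for _ in range(max_iter):
        labels = D[:, medoids].argmin(axis=1)
        new_medoids = medoids.copy()
        for j in range(k):
            members = np.flatnonzero(labels == j)
            if members.size:
                within = D[np.ix_(members, members)].sum(axis=1)
                new_medoids[j] = members[within.argmin()]
        if np.array_equal(new_medoids, medoids):
            break
        medoids = new_medoids
    return medoids, D[:, medoids].argmin(axis=1)

def mean_silhouette(D, labels):
    scores = np.zeros(len(labels))
    clusters = np.unique(labels)
    for i in range(len(labels)):
        own = labels == labels[i]
        if own.sum() == 1:
            continue  # singleton cluster: silhouette is 0 by convention
        a = D[i, own].sum() / (own.sum() - 1)  # mean distance within own cluster
        b = min(D[i, labels == c].mean() for c in clusters if c != labels[i])
        scores[i] = (b - a) / max(a, b)
    return float(scores.mean())

# Synthetic data: three well-separated groups (illustrative only).
rng = np.random.default_rng(42)
centers = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
X = np.vstack([c + rng.normal(scale=0.5, size=(30, 2)) for c in centers])
D = pairwise_dist(X)

scores = {k: mean_silhouette(D, k_medoids(D, k)[1]) for k in range(2, 7)}
best_k = max(scores, key=scores.get)
print(scores, "best k:", best_k)
```

Working from a distance matrix keeps the procedure independent of the similarity measure chosen in the previous step.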
Using the optimal number of clusters for each product category, we applied cluster analysis to each category in turn to identify similarities and differences between values. We built a graph with X and Y coordinates, on which we placed the 7 clusters of the first category, “main product”.
Figure 29. Plot of clusters of main products
Clusters 1 and 3 are the most significant for analysis because of the high sales rates (28.4 and 11.3, respectively) shown by their product groups. Consider the reasons for the high sales rate in these clusters. Based on the cluster analysis, we can conclude that demand and sales are greatest for the highest-rated products with the most reviews. At the same time, the size of the discount is less significant and affects sales to a lesser extent. For example, the average rating of the main product in the first cluster is 4.78 out of 5.0.
Figure 30. Results and actual values of clusters of main products
Together, these two clusters account for 27% of the entire television category and for about 70% of its reviews.
Using the optimal number of clusters for each product category, we applied cluster analysis to each category in turn to identify similarities and differences between values. We built a graph with X and Y coordinates, on which we placed the 7 clusters of the second category, “also viewed products”.
Figure 31. Plot of clusters of also viewed products
Cluster 6 is the most significant for analysis because of its high sales rate: average sales of also viewed products are 27.6 (42% of sales in the category of also viewed products). Consider the reasons for the high sales rate in this cluster: 1) the average cost of goods in cluster 6 is 25 958, whereas in cluster 4 it is 33 414; 2) it has the highest average rating, 4.38 out of 5.0; 3) it has the largest number of reviews, an average of 51 reviews per product. Based on these basic statistics, we can conclude that sales are affected primarily by the number of reviews and the rating, and secondarily by discount and cost.
Figure 32. Results and actual values of clusters of also viewed products
Using the optimal number of clusters for each product category, we applied cluster analysis to each category in turn to identify similarities and differences between values. We built a graph with X and Y coordinates, on which we placed the 11 clusters of the third category, “co-purchase products”.
Figure 33. Plot of clusters of co-purchased products
We can conclude that there is much more variation within the clusters in the category of “co-purchase products”. Cluster 10, located in the upper left corner, is characterized by a low average cost of goods (253 rubles) and an average of 13 reviews, while its presence in the recommendation network is the greatest (goods from this cluster appear 49 times).
Figure 34. Results and actual values of clusters of co-purchased products
The same pattern can be identified in cluster 5, which has a low average cost (294 rubles) and an average of 13 reviews, while its presence in the recommendation network is also among the greatest (products from this cluster appear 39 times). Cluster 4 is the most significant in terms of price, rating, and number of reviews, but goods in this cluster appear in the recommendation network no more than 15 times.
The client sees the main product and decides based on what is presented on the page. Therefore, the next question we asked is which quantities have the greatest impact on the sales of the main product. Taking the main product clusters as a basis, we added all the additional indicators of the main product, also viewed products and co-purchased products. For a comprehensive check, we analyzed the results for each of the 7 clusters. Depending on the cluster (that is, on similarities or differences in values), we found a pattern that could be traced in four of the seven clusters. Consider each of the clusters that showed significant results.
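The per-cluster procedure can be sketched as fitting a separate regression for each cluster label and comparing the resulting R-squared values. The cluster labels, predictors and slopes below are made up for illustration, and the p-values reported in the text are not reproduced here.

```python
import numpy as np

def ols_r_squared(X, y):
    """Fit OLS with an intercept and return (coefficients, r_squared)."""
    design = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ beta
    r2 = 1.0 - float(resid @ resid) / float(((y - y.mean()) ** 2).sum())
    return beta, r2

# Synthetic data with hypothetical cluster labels 1..3 (illustrative only).
rng = np.random.default_rng(1)
n = 300
cluster = rng.integers(1, 4, size=n)
X = rng.normal(size=(n, 2))  # e.g. reviews of the main product, reviews of also viewed products
slopes = {1: [1.0, 0.0], 2: [0.2, 0.8], 3: [0.0, 0.0]}  # made-up effects per cluster
coef = np.array([slopes[int(c)] for c in cluster])
sales = (X * coef).sum(axis=1) + rng.normal(scale=0.5, size=n)

results = {}
for c in np.unique(cluster):
    mask = cluster == c
    beta, r2 = ols_r_squared(X[mask], sales[mask])
    results[int(c)] = (beta, r2)
    print(f"cluster {int(c)}: n={mask.sum()}, slopes={np.round(beta[1:], 2)}, R-squared={r2:.2f}")
```

In this toy setup the third cluster has no real relationship, so its R-squared stays close to zero, which mirrors the situation where only some clusters yield a meaningful model.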
Figure 35. Linear regression results of the first cluster
Figure 36. Variables that affect sales in the first cluster
In the first cluster of main products, five variables significantly affect sales. The model is meaningful, with R-squared = 33% and p-value = 0.0016. The first variable affecting sales is the price of the main product. This is explained by the fact that goods in other clusters cost 20,000 and 30,000, whereas in the first cluster the price is 10,000, so this variable plays the greatest role for consumers. The next significant indicators for the first cluster are the ratings and reviews of the main product. The rating of goods in the first cluster is 4.78 out of 5.0, and the first cluster has the largest number of reviews among all product categories. The user is interested in purchasing a product at a reduced price, with a good rating and positive reviews, which is why the TVs in this cluster are in a winning position. The smallest impact on sales of the main product in the first cluster comes from an indicator of the recommendation network: the average cost of also viewed products.
Figure 37. Linear regression results of the second cluster
Figure 38. Variables that affect sales in the second cluster
In the second cluster of main products, three variables significantly affect sales. The model is meaningful, with R-squared = 54% and p-value = 0.0010. The main explanatory variable is the number of reviews of the main product. It is noteworthy that the average number of reviews here is 6, whereas the first cluster has 110. Goods in the second cluster cost much more than those in the first cluster, so reviews matter most when choosing an expensive product. The average number of reviews of also viewed products and the sales of also viewed products had a smaller impact on sales of the main product. The influence of sales and other values from the recommendation network shows that work on the recommendation network is relevant and once again emphasizes its importance.
Figure 39. Linear regression results of the third cluster
Figure 40. Variables that affect sales in the third cluster
In the third cluster of main products, two variables significantly affect sales. The model is meaningful, with R-squared = 67% and p-value = 0.0000001. The main explanatory variable is the number of reviews of the main product. It is noteworthy that the average number of reviews here is 37, whereas the first cluster has 110. Products in the third cluster are the most expensive in the online store, which is why customer reviews play an important role in decision making. The discount on the main product has a smaller effect on its sales; it averages 10%, but in nominal terms the discount on such an expensive product is much larger. In this linear regression, no variables from the recommendation networks “also viewed products” or “co-purchase products” are significant.
Figure 41. Linear regression results of the sixth cluster
Figure 42. Variables that affect sales in the sixth cluster
In the sixth cluster of main products (the fourth cluster with significant results), seven variables significantly affect sales. The model is meaningful, with R-squared = 36% and p-value = 4.287e-11. The main explanatory variables are the reviews and ratings of the main product and the sales of also viewed products. Reviews and ratings of the main product have a strong impact because cluster 6 products belong to the middle price category. The price of the main product and the average rating of co-purchased products have a slightly smaller effect. After the client has decided on the product, the cost and rating of the goods in the recommendation network have a secondary effect: the person begins to compare and look for a better offer in the recommendation network. The weakest explanatory variables are the discount of the main product and the average number of reviews of co-purchased products. The last step in a person's choice is the availability of discounts (benefits for the client).
Figure 43. Variables that affect sales from all clusters
The results of the cluster analysis are significant and even reveal the behavior of the client. Using the data from the Ozon.ru website, we carried out a preparatory analysis and data cleaning and selected the values by which similar product groups could be combined and classified. Then, using the k-means (PAM) algorithm, we clustered the data and obtained separate clusters for each product group. The final calculation was a linear regression on the data grouped in clusters, to identify which values and parameters affect the sale of goods. Of the 7 clusters built, 4 gave us truly meaningful results. The key indicator that determines the sales of each product is the number of reviews of the main product. This variable had the greatest effect on sales in our linear regressions. The rating of the main product and the sales of also viewed products also had a strong influence. Other variables that affect sales are the average rating of co-purchased products and the price of the main product. The weakest influences are the average number of reviews of co-purchased products, the average number of reviews of also viewed products, the average price of also viewed products and the discount of the main product. After analyzing these results, we can conclude that both the characteristics of the main product and the indicators of the goods in the recommendation network influence sales.
Based on the literature review, we formulated nine hypotheses to determine the relationships between key indicators. After conducting cluster analysis and linear regression, we obtained significant results, which are described in this chapter. Consider each hypothesis and its outcome.
H1a: It was revealed that the sales volume of a product is influenced by the number of reviews of neighborhood similar products. The hypothesis was confirmed, and the dependence was found to be strong. A similar hypothesis was confirmed a year ago by Professor Xue Pan in an analysis of American e-commerce sites (Amazon.com and BarnesandNoble.com). The positive impact on sales volume was thus also confirmed in practice for the Russian e-commerce market. It is worth noting that the influence of this variable is the greatest in comparison with the other variables in the recommendation network. If we consider the linear regressions after clustering the data, the effect differs for each cluster. For instance, cluster 3 showed that the positive effect persists, but to a lesser extent. When the manufacturer stimulates the writing of reviews and receives many of them, the number of sales will also increase, as this indicator is one of the key decision-making factors. In conclusion, the number of reviews of also viewed products positively affects sales.
H1b: Our work confirmed that the sales volume of a product is positively related to the number of reviews of neighborhood co-purchased products. The degree of influence depends on the individual type of product. In the general analysis, linear regression showed a positive but not very significant effect of the number of reviews of co-purchased products. In the analysis after clustering, cluster 6 showed a stronger effect of this indicator on sales. This means that the number of reviews shown under the goods matters greatly when choosing goods, so their availability is important. A related analysis was carried out by Professor Xue Pan in 2019, but it examined the influence of the number of reviews on the eWOM indicator, which also affects the effectiveness of recommendation systems.
H1c: The reviews of a focal product are influenced by discount and price (economic indicators). This hypothesis was confirmed. The linear regression revealed that the influence of price on reviews is minimal, because this factor is not key for the client when writing a review. The discount has a greater impact on reviews: the greater the discount on the product, the higher the likelihood that buyers write many reviews. It is interesting that economic indicators can affect customer feedback so strongly. It is worth noting that the quantity of sales also affects the number of reviews, and the impact of this indicator is exceptionally large. The hypothesis is thus confirmed: the number of reviews depends on all economic indicators that are reflected on the e-commerce site.
H2a: Directly connected products have a similar range of sales. Within the TV category there are products with a considerably higher number of sales than the average. On the other hand, there are products with no sales at all. Each product, besides being a main product, also serves as a suggestion in the recommendation network. This assumption was taken into account when we created the clusters and carried out a regression for each of them. Of the 7 clusters created, 4 have a statistically significant explanatory model.
H2b: The focal product has more sales when neighborhood similar products have higher prices. In our study, this hypothesis was not confirmed. We cannot objectively assess the dependence between the sales volume of the main product and the prices of similar products. For an objective assessment and to improve the quality of our model, we removed from the data set observations that greatly exceed the average values. When calculating the linear regressions for rating, reviews and sales, we used the cleaned data. Focusing on the cluster analysis, namely the linear regression of cluster 1, we can say that the price of similar products affects the sales volume of the main product, but this effect is insignificant.
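The removal of observations that greatly exceed the average can be sketched as a simple z-score filter. The threshold of three standard deviations is an assumption for illustration (the text does not state the rule that was applied), and the sales figures are made up.

```python
from statistics import mean, stdev

def drop_high_outliers(rows, key, z_max=3.0):
    """Keep rows whose value for `key` is at most z_max standard deviations above the mean."""
    values = [r[key] for r in rows]
    mu, sd = mean(values), stdev(values)
    if sd == 0:
        return list(rows)  # all values identical: nothing to drop
    return [r for r in rows if (r[key] - mu) / sd <= z_max]

# Hypothetical sales counts with one extreme value.
rows = [{"sales": s} for s in [3, 5, 4, 6, 5, 4, 3, 5, 6, 4] * 2 + [200]]
clean = drop_high_outliers(rows, "sales")
print(len(rows), "->", len(clean))
```

A single pass like this only removes the most extreme values; robust alternatives (for example rules based on the median) behave differently on small samples.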
H3a: The sales volume of a focal product is negatively influenced by a high average discount of similar offered products. In our study, this hypothesis was not confirmed. The linear regression showed that the average discount of similar offered products has no effect on sales. Likewise, the linear regressions for each of the 7 clusters of the main product revealed no dependence. Professor Lin, Z. tested a similar hypothesis in 2016 but did not obtain a significant result. This means that the discount value in the recommendation network should not be used to influence sales, as it will not lead to any significant result.
H3b: The sales volume of a focal product is positively influenced by a high average rating of co-purchased products. Indeed, the average rating of co-purchased products positively affects the sales volume of the main product. This influence is especially pronounced in the cluster analysis, when all products are divided by characteristics. The linear regression on the overall dataset did not show an effect of co-purchased products on sales. The largest influence of these values was identified in clusters 6 and 7, which have the smallest sales volume. Product rating is another factor in the decision-making process. For marketers and researchers, the main implication is to stimulate many ratings. It is most rational to increase the rating of goods that are less in demand among customers.
H4a: The sales volume of a product is influenced by the ratings of neighborhood similar (also viewed) products. In our study, this hypothesis was not confirmed. The linear regression on the overall data set shows no relationship, and neither do the linear regressions for the individual clusters. Professor Lei Hou and colleagues analyzed the impact of the rating of neighborhood similar (also viewed) products on the eWOM score and also found no influence. To increase sales, it is therefore worthwhile to pay attention to other, more significant values of neighborhood similar (also viewed) products. It is interesting that sales volume depends on the rating of the main product, whereas there is no dependence between the sales of the main product and the rating of a similar product in the recommendation network.
H4b: The rating of a focal product is influenced by the average discount of neighborhood similar (also viewed) products. During the analysis, this hypothesis was confirmed.
The regression on the general dataset confirmed that this influence exists and that it is quite significant in comparison with the influence of other factors.
Professor Duan et al. (2008) confirmed that the influence of these factors exists. Our study confirms this result, which carries managerial value for the practical application of this study in the market.
Table 3. Results of hypothesis
Hypothesis | Statement | Status of hypothesis |
H1a | The sales volume of a product is influenced by the number of reviews of neighborhood similar (also viewed) products. | confirmed |
H1b | The sales volume of a product is positively related to the number of reviews of neighborhood co-purchased products. | confirmed |
H1c | The reviews of a focal product are influenced by discount and price (economic indicators). | confirmed |
H2a | Directly connected products have similar range of sales with each other. | confirmed |
H2b | The focal product has more sales when neighborhood similar products have higher prices. | not confirmed |
H3a | The sales volume of a focal product is negatively influenced by the high average of discounted similar offered products. | not confirmed |
H3b | The sales volume of a focal product is positively influenced by the high average of rating co purchased products. | partially confirmed |
H4a | The sales volume of a product is influenced by the ratings of neighborhood similar (also viewed) products. | not confirmed |
H4b | The rating of a focal product is influenced by average discount of neighborhood similar (also viewed) products. | confirmed |
To sum up, of the 9 hypotheses that we formulated, 6 were confirmed; the results are presented in the table. The most important indicator for an e-commerce site is sales volume, which is why we first identified the variables that can positively affect it: the number of reviews of neighborhood similar products and of co-purchased products, and the average rating of co-purchased products. The second important indicator is customer reviews, for which we found that all economic indicators (discount, sales volume, price) have a positive effect. The final indicator is the rating, which was found to be affected by the average discount of neighborhood similar (also viewed) products. We analyzed the metrics of an e-commerce site that drive demand and conclude that marketers and sales managers can use our research in practice.
Conclusion
In our study, we analyzed all the key indicators of the recommendation network and identified the values that are most important for increasing sales volume. Our main model was based on linear regression and cluster analysis, which allowed us to discover what best describes the desires and behavior of clients. We tested 9 hypotheses, which we formulated based on the literature review. It was confirmed that customer reviews have the greatest impact on sales among the Recommendation Network variables. Our empirical experiment on a real-world dataset showed that recommender systems simplify a client's journey to purchase by reducing the time spent searching for the right product.
Key indicators of the recommendation network (the number of reviews, sales and ratings of also viewed and co-purchased products) allow the client to quickly identify and purchase the necessary product, so this work will be useful to e-commerce retailers, to entrepreneurs who want to create their own online store, to researchers of recommendation systems and to company analysts.
Our main findings revealed that the Product Recommendation Network may influence different types of products in different ways, depending on their Price, Reviews, Discounts and Ratings. Overall, only 3 recommendation-network factors have an influence: the Price of Also Viewed products, the Number of Reviews of Also Viewed products, and the Number of Reviews of Co-purchased products. These three factors affect the TV category on ozon.ru in general. Moreover, there are variables from the recommendation network that may influence the customer depending on the type of product, which we have divided into clusters. In the clusters where sales are high, the most important factors that drive demand are the control variables; however, the price of the suggested products has a small effect as well. The clusters with an average amount of sales are influenced by several variables from the recommendation network. These variables include the Number of Reviews of Recommended products, the Number of Reviews of Co-purchased products, the Sales of Recommended products and the Ratings of Co-purchased products.
The main objective of this research is to provide an explanatory model that supports practical decisions by e-commerce owners. By giving an idea of how Recommendation Networks work in Russian e-commerce, it makes it possible to test new actions to drive sales. Moreover, with the regressions we have carried out for each cluster, it can be easier for a retailer to design its Recommendation Network so that the product suggestions help to increase demand for a main product. In general terms, retailers might want to run A/B tests in a certain product category by designing a Product Recommendation Network that presents similar items with a high number of Reviews. In addition, the number of Reviews of the focal product is also a factor that can help lead the customer to a purchase.
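An A/B test of two recommendation layouts could be evaluated, for instance, with a two-proportion z-test on purchase conversion. The sketch below assumes independent visitors and a normal approximation; the visitor and purchase counts are hypothetical and are not results from this study.

```python
from math import erfc, sqrt

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for the difference in conversion rates B - A."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = erfc(abs(z) / sqrt(2))  # two-sided tail of the standard normal
    return z, p_value

# Variant A: current recommendations. Variant B: alternatives with many Reviews.
z, p = two_proportion_z_test(conv_a=120, n_a=4000, conv_b=165, n_b=4000)
print(f"z = {z:.2f}, p = {p:.4f}")
```

Because the models in this work are explanatory rather than predictive, such a test is a way to check a suggestion in practice, not a forecast of its outcome.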
It can be expected that many strong e-commerce companies already have a machine learning algorithm that shows the customer Recommended Products based on their profile and searches. Therefore, it would be useful for retailers to compare historical reports to confirm that such suggestions are the optimal display based on Product Network characteristics in addition to the customer profile.
By dividing the products into clusters based on Sales, Price, Discount, and the like, we obtain a general idea of how Product Recommendation Networks influence each situation. Therefore, retailers should focus their optimization efforts on deciding which variables to pay attention to. It is necessary to clarify that the models created are explanatory and not predictive, which is why we propose such optimization as a way to understand the performance of a product rather than to predict an increase in sales.
Figure 44. Focusing optimization of factors
We see that, as an overall, for the main model an increase of Sales can be driven by an optimization of the control variables followed by focusing on the price of Also Viewed products. Moreover, if it is presented Recommended Products with a high amount of Review then, there might be a probability of a Sales increase. In case the product was purchased with different items, it would be better to present those purchased items that were Reviewed by other customers. To sum up the overall explanatory model, an optimized Recommendation Network to drive sales is the one that presents products with a higher alternative price and with a high amount of Reviews both for Similar Recommended products and Co-purchased products.
Proceeding to dividing the products by similarity and storing them in clusters. It is possible to create a strategy of Recommendation Network to optimize the demand for each category. For those TVs that fell into the first cluster, it is recommended to optimize the control variables to keep the sales pace they are having, which is high. If there is a factor from the Recommendation Network that is worth to pay attention to, that one is the Price, hence by presenting alternatives with higher prices and optimizing control variables that might help to drive more sales to cluster 1.
The second cluster is the one who has less sales, but it has a slightly more influence from the Recommendation Products. To optimize this category, it would be useful to focus on presenting alternatives with a high amount of reviews and a high amount of Sales. In case the recommended products have similarity, this might help the customer to purchase by having alternatives with great acceptance and demand.
The cluster 3 is the second group that contains more sales, for this group it is recommended that the retailer focuses on optimizing the focal factors such as Reviews and Discounts. This group has several sales above average and only its own variables seem to have impact to increase its demand.
The cluster that seems to be more influenced by the Recommendation Network is the cluster 6. Although the amount of sales are not as high as other groups there is a possibility to optimize the Recommended Products to generate demand for it. This group is affected by several variables. All the control variables need to be improved to drive more sales for this group. It is interesting to observe that the most important control variables to focus on are the Number of Reviews and the Rating, these variables are more important than Price and Discount for this group. On the other hand, there are factors from the Recommendation Network that need to be enhanced. Those factors are the number Sales of the alternatives, the higher the number of Sales of presented alternatives might confirm a purchase of the focal product in case the product presents similar characteristics. Moreover, presenting co-purchased products with a high rating might decline the sales decision of the customer. A possible explanation can be that such information might be a distractor to complete a purchase, even more if the price of the offered product is above average. Finally, the number of reviews of the co-purchase product influences positively in the procurement of sales. As a final glance this cluster could drive more sales if there are presented alternative products with a good demand together with co-purchased products with a high amount of reviews but with an average rating.
Building an explanatory model from empirical research revealed limitations that future research should address. We analyzed one of the largest e-commerce stores in Russia; it would be practical to run the same regressions on other Russian e-commerce stores (Beru, Goods, etc.) to find out whether the variables from the recommendation network have the same influence on their products. In addition, further variables from the recommendation network should be considered. In the case of OZON, an unstable variable called “Sponsored suggestions” appeared only for selected products, so it would be useful to find out whether such additional variables influence Demand. We raise this issue because the average R² of our models is around 40%, which means that more variables are needed to strengthen the explanatory model.
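To make the R² figure concrete, the following minimal sketch shows how the coefficient of determination and its adjusted variant are computed for a one-predictor regression. The function names and the tiny data set are hypothetical and are not taken from the OZON data; the thesis does not state whether its reported value is the plain or the adjusted R².

```python
def fit_simple_ols(x, y):
    """Least-squares intercept and slope for a single predictor."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def r_squared(y, y_hat):
    """Share of the variance of y explained by the fitted values y_hat."""
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


def adjusted_r_squared(r2, n_obs, n_predictors):
    """Penalize R^2 for the number of predictors in the model."""
    return 1 - (1 - r2) * (n_obs - 1) / (n_obs - n_predictors - 1)


# Hypothetical illustration: number of reviews vs. units sold
reviews = [1, 2, 3, 4, 5]
sales = [2, 4, 5, 4, 5]

intercept, slope = fit_simple_ols(reviews, sales)
fitted = [intercept + slope * r for r in reviews]
r2 = r_squared(sales, fitted)
adj_r2 = adjusted_r_squared(r2, n_obs=len(sales), n_predictors=1)
print(round(r2, 3), round(adj_r2, 3))
```

An R² of about 40% means that roughly 60% of the variation in Sales is left unexplained by the included variables, which is the motivation for looking for additional predictors.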
Another limitation of our research is that we analyzed Sales from a certain period rather than the overall sales of each product. The reason is the availability of information on the website, which presented “Sold products” only for a short period of time. It would therefore be valuable to run the same regressions on the total sales of a product to strengthen the explanatory model.
An important limitation is that we analyzed only one category from a large marketplace, so the model explains only this type of product. Two additional activities could strengthen our assumptions: (1) running the model for other categories in the same marketplace, and (2) running the model for the TV section of other marketplaces. Additionally, our model is based on customers located in a specific country, where different factors may influence purchasing behavior. Hence, it is necessary to replicate our regressions (under the same conditions) in marketplaces from other countries to find the similarities and differences in how Recommended Products affect the demand for TVs.
Previous research has analyzed how demand is affected in terms of linkage. Research that combines the influence of such links with the influence of the Recommendation Network variables from our model could therefore give a more robust explanation of how Recommended Products influence Sales.
References
1. Alnogaithan O., Algazlan S., Aljuraiban A., Shargabi A. (2019), Tourism Recommendation System Based on User Reviews, International Conference on Innovation and Intelligence for Informatics, Computing, and Technologies, p. 345-350.
2. Bakharev I. (2019). Online trading in 2019: data from Yandex.Market and GfK. https://goo-gl.su/YaTudMB
3. Breugelmans E., Campo K. (2011). Effectiveness of In-Store Displays in a Virtual Store Environment, Journal of Retailing 87. p. 75-89.
4. Cagan Urkup, Burcin Bozkaya, F. Sibel Salman (2018). Customer mobility signatures and financial indicators as predictors in product recommendation. PLoS ONE, p. 1-19.
5. Cenfetelli T., Benbasat I., Al-Natour S. (2008). Addressing the What and How of Online Services: Positioning Supporting-Services Functionality and Service Quality for Business-to-Consumer Success. Information Systems Research Vol. 19, No. 2. p. 161-181.
6. Chen S., Owusu S., Zhou L. (2013). Social Network Based Recommendation Systems: A Short Survey, Conference SocialCom, p. 882-885.
7. Clement J. (2020). Instagram accounts with the most followers worldwide. https://goo-gl.su/m7MM5O
8. Delafrooz N., Rahmati Y., Abdi M. (2019). The influence of electronic word of mouth on Instagram users: An emphasis on consumer socialization framework. Cogent Business Management. p. 1-14.
9. Duan, W., Gu, B. and Whinston, A.B., (2008). Do Online Reviews Matter? An Empirical Investigation of Panel Data. Decision Support Systems, 45(4), p. 1007-1016.
10. Feng S., Zhang H., Cao J., Yao Y. (2018). Merging user social networks into the random walk model for better group recommendation, Applied Intelligence (2019) 49, p. 2046-2058.
11. Gal O., Arun S. (2012). Recommendation networks and the long tail of electronic commerce. MIS Quarterly Vol. 36 No. 1, p. 65-83.
12. Gal O., Barak L., Liron S., Eyal C., Ohad Y. (2013). The Network Value of Products. Journal of Marketing, 77, p. 1-26.
13. Goldenberg J., Oestreicher G., Reichman S. (2012). The Quest for Content: How User-Generated Links Can Facilitate Online Exploration. Journal of Marketing Research, p. 457-466.
14. Gomzin A., Korshunov A. (2012). Recommender systems: a survey of modern approaches. Proceedings of the Institute for System Programming of the RAS (Proceedings of ISP RAS), p. 401-417.
15. Gupta, P., Harris, J. (2010). How e-WOM recommendations influence product consideration and quality of choice: A motivation to process information perspective. Journal of Business Research, 63 (9-10), p. 1041-1049.
16. Hosanagar K., Fleder D., Lee D., Buja A. (2014). Will the Global Village Fracture Into Tribes? Recommender Systems and Their Effects on Consumer Fragmentation, Management Science No 60 (4). p. 805-823.
17. Huang J., Jun Y., Benrong Z. (2019). Demand effects of product similarity network in e-commerce platform. Springer Science, Business Media, LLC, part of Springer Nature 2019, p. 1-31.
18. Jain A., Murty M., Flynn P. (1999). Data Clustering: a review, ACM Computing Surveys. Vol. 31, No. 3. p.23-35.
19. Jannach D., Zanker M., Felfernig A., Gerhard Friedrich (2012). Recommender Systems: An Introduction, Intl. Journal of Human-Computer Interaction, No. 28, p. 72-73.
20. Jasek P. (2017) Impact of Customer Networks on Customer Lifetime Value Models. University of Economics, Prague. p. 759-764.
21. Karthik.R.V, Arputharaj K., Sannasi G. (2018). A Recommendation System for Online Purchase Using Features and Product Ranking. Proceedings of 2018 Eleventh International Conference on Contemporary Computing (IC3), Noida, India. p. 1-6.
22. Kyoung-jae K., Hyunchul A. (2017). Recommender systems using cluster-indexing collaborative filtering and social data analytics, International Journal of Production Research, Vol. 55, No. 17, p. 5037-5049.
23. Lewis R., Nguyen D. (2015). Display advertising's competitive spillovers to consumer search, Quant Mark Econ 13. p. 93-115.
24. Linyuan L., Matus M., Chi Ho Y., Yi-Cheng Z., Zike Z., Tao Z. (2012). Recommender systems. Journal of Physics Reports, 519, p. 1-49.
25. Oestreicher-Singer G., Sundararajan A. (2012). Recommendation networks and the long tail of electronic commerce, MIS Quarterly Vol. 36 No. 1 p. 65-83.
26. Oestreicher-Singer G., Sundararajan A. (2012). The Visible Hand? Demand Effects of Recommendation Networks in Electronic Markets, Management Science 58(11), p. 1963-1981.
27. Oestreicher-Singer G., Sundararajan A., Gerald C. (2017). The Power of Product Recommendation Networks, MIT Sloan management review, p.1-7.
28. Prasad U., Kumari N., Ganguly N., Mukherjee A. (2017). Analysis of the Co-purchase Network of Products to Predict Amazon Sales-Rank, 5th International Conference, BDA 2017. p. 204-234.
29. Rosario A., Valck K., Sotgiu F. (2018). Conceptualizing the electronic word-of-mouth process: What we know and need to know about eWOM creation, exposure, and evaluation, Journal of the Academy of Marketing Science, p. 1-27.
30. SEMrush: traffic-analytics. https://goo-gl.su/QKcH2kZ
31. Senecal S., Nantel J. (2004). The influence of online product recommendations on consumers' online choice, Journal of Retailing 80. p. 159-169.
32. Sheng X., Zolfagharian M. (2014). Consumer participation in online product recommendation services: augmenting the technology acceptance model, Journal of Services Marketing, No 28/6, p. 460-470.
33. Sua J., Changb W., Tsengc V. (2017). Effective social content-based collaborative filtering for music recommendation. Intelligent Data Analysis. p. 195-216.
34. Tryon R.C. (1939). Cluster analysis. London: Ann Arbor Edwards Bros, p. 122-139.
35. Xiao B., Benbasat I. (2007). E-commerce product recommendation agents: use, characteristics, and impact, MIS Quarterly Vol. 31 No. 1. p. 137-209.
36. Xin Z., Sui L., Yulan H., Chang Y., Wen J., Li X. (2016). Connecting Social Media to E-Commerce: Cold-Start Product Recommendation Using Microblogging Information, IEEE, Transactions on knowledge and data engineering, p. 1147-1160.
37. Xue P., Hou L., Kecheng L. (2019). The Effect of Product Distance on the eWOM in Recommendation Network, Powered by Editorial Manager and ProduXion Manager from Aries Systems Corporation, p.1-35.
38. Yochum P., Chang L., Tianlong G., Zhu M. (2019). Linked Open Data in Location-Based Recommendation System on Tourism Domain: A Survey. Natural Science Foundation of Guangxi Province, p. 16409-16439.
39. Zaikina O. (2020). Statistics and information about Ozon.ru, https://docs.ozon.ru/company/.
40. Zhao P., Ma J., Hua Z., Fang S. (2018). Academic Social Network-Based Recommendation Approach for Knowledge Sharing, The data base for Advances in Information Systems, Vol. 49, No 4, p. 78-92.
41. Zhijie L., Khim-Yong G., Heng C. (2017). The demand effects of product recommendation networks: an empirical analysis of network diversity and stability. MIS Quarterly Vol. 41 No. 2, p. 397-426.
Appendix A
Code for scraping the data
import sqlite3
import time
from pprint import pprint

from selenium import webdriver
from selenium.webdriver.firefox.options import Options

from main import parse, get_id, drivers

connect = sqlite3.connect("database.sqlite")  # or ":memory:" to keep the database in RAM
options = Options()
options.headless = True
driver = webdriver.Firefox(options=options)
flag = False  # becomes True once the resume product id is reached


def pre_parse(url, pack):
    # Parse one product page; on any error close every open browser and retry
    try:
        parse(url, pack)
    except Exception:
        print("!!!ERROR!!!")
        # Iterate over a copy, because removing items while looping skips entries
        for open_driver in list(drivers):
            open_driver.close()
            drivers.remove(open_driver)
        pre_parse(url, pack)


# unloaded = [165454370]
# for i in unloaded:
#     pre_parse("https://www.ozon.ru/context/detail/id/" + str(i) + "/", "v1")

for i in range(1, 21):
    if i == 2:
        i = 3  # page 2 is skipped and page 3 is requested instead
    driver.get("https://www.ozon.ru/category/televizory-15528/?page=" + str(i))
    driver.implicitly_wait(3)  # seconds
    # Scroll to the bottom twice so that lazily loaded products appear
    driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
    time.sleep(3)
    driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
    elements = driver.find_elements_by_css_selector(
        'html body div#__nuxt div.layout-page.desktop div.block-vertical div.container.c0x2 div.c1d div.c0u9 div.ce4.c0v0 div div.widget-search-result-container.ap div.ap0>div')
    for element in elements:
        url = element.find_element_by_xpath("div/div/div[1]/a").get_attribute("href")
        id = get_id(url)
        if id == "172197496":
            flag = True
        if not flag:
            # Products before the resume id were already processed
            print(id + " skip")
            continue
        print(url)
        pre_parse(url, "v1")
driver.close()
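The pre_parse helper above retries by calling itself, so a page that keeps failing would be retried without limit. A bounded alternative could look like the following minimal sketch; call_with_retries, its parameters and the cleanup callback are hypothetical names introduced for illustration and are not part of the original scraper.

```python
import time


def call_with_retries(task, attempts=3, delay=0.0, cleanup=None):
    """Run task(); after each failure run cleanup() and retry, at most `attempts` times."""
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return task()
        except Exception as error:
            last_error = error
            print("attempt", attempt, "failed:", error)
            if cleanup is not None:
                cleanup()  # e.g. close browser windows that are still open
            time.sleep(delay)
    # Every attempt failed: report it instead of retrying forever
    raise RuntimeError("gave up after %d attempts" % attempts) from last_error
```

In the scraper, task would wrap the call to parse for one URL, and cleanup would close the browsers collected in drivers.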
# Helper module imported by the script above as "main"
import csv
import re
import sqlite3
import time
from pprint import pprint

from selenium import webdriver
from selenium.webdriver.firefox.options import Options

options = Options()
options.headless = True
profile = webdriver.FirefoxProfile()
profile.native_events_enabled = False
drivers = []  # every open browser, so that the caller can close them after an error
wait_time = 10


def load(url, type):
    # Return one product as a comma-separated string, using the database as a cache
    id = get_id(url)
    print(id)
    connect = sqlite3.connect("database.sqlite")  # or ":memory:" to keep the database in RAM
    data = connect.cursor().execute("""SELECT * FROM items WHERE id=?""", [id]).fetchall()
    connect.close()
    if len(data) > 0:
        print(id + " load from db")
        return data[0][1] + "," + type
    # id = driver.find_element_by_css_selector("[data-widget=detailSKU]").text.split(": ")[1]
    driver = webdriver.Firefox(options=options)
    drivers.append(driver)
    driver.implicitly_wait(wait_time)  # seconds
    driver.get(url)
    item_name = driver.find_elements_by_css_selector('[data-widget="webProductHeading"]>h1')
    if len(item_name) == 0:
        # No product heading on the page: close this browser before giving up
        driver.close()
        drivers.remove(driver)
        return None
    item_name = item_name[0].text
    item_sale = driver.find_elements_by_xpath(
        '/html/body/div[1]/div/div[1]/div[4]/div[2]/div[2]/div/div[1]/div/div/div[1]/div[1]/span[1]')
    if len(item_sale) != 0:
        item_sale = item_sale[0].text
    else:
        item_sale = "-"
    item_score = driver.find_elements_by_css_selector('[data-widget="reviewProductScore"] div[title]')
    if len(item_score) == 0:
        item_score = "-"
    else:
        item_score = item_score[0].get_attribute('title')
    item_price = driver.find_elements_by_css_selector('[data-widget="webSale"]>div>div>div>div>div>span')
    if len(item_price) > 0:
        item_price = item_price[0].text.replace(' ', '')
    else:
        item_price = "-"
    # Badge with the number of purchases; its text may appear with a delay
    sales_xpath = "/html/body/div[1]/div/div[1]/div[4]/div[2]/div[2]/div/div[3]/div[1]/div[2]/div[3]/div/div/span"
    item_salary_name = driver.find_elements_by_xpath(sales_xpath)
    if len(item_salary_name) != 0:
        name = item_salary_name[0].text
        current_sleep = 0
        while name == "":
            current_sleep += 1
            if current_sleep > 2:
                break
            print("nameWhile")
            time.sleep(1)
            item_salary_name = driver.find_elements_by_xpath(sales_xpath)
            if len(item_salary_name) == 0:
                break
            name = item_salary_name[0].text
    reviews = driver.find_elements_by_css_selector('[data-widget="reviewProductScore"] a')
    if len(reviews) != 0:
        reviews = reviews[0].get_property("textContent").split()[0]
    else:
        reviews = "0"
    salary_all_count = "-"
    salary_today_count = "-"
    salary_week_count = "-"
    if len(item_salary_name) != 0:
        name = item_salary_name[0].text
        # Site texts: "Купили более" = bought more than, "за сегодня" = today,
        # "за неделю" = this week, "покуп" = stem of "purchases"
        if "Купили более" in name:
            salary_all_count = name.split()[2]
        elif "за сегодня" in name and "покуп" in name:
            salary_today_count = name.split()[0]
        elif "за неделю" in name and "покуп" in name:
            salary_week_count = name.split()[0]
    print(
        "ID:" + id + " Name:" + item_name + " Price:" + item_price + " Score:" + item_score + " Sale:" + item_sale + " salary all count:" + salary_all_count + " salary today count:" + salary_today_count + " salary week count:" + salary_week_count + " Reviews:" + reviews)
    driver.close()
    drivers.remove(driver)
    data = ",".join(
        [id, item_name, item_price, item_score, item_sale, salary_all_count, salary_today_count, salary_week_count,
         reviews])
    connect = sqlite3.connect("database.sqlite")  # or ":memory:" to keep the database in RAM
    connect.cursor().execute("""INSERT INTO items VALUES (?,?)""", [id, data])
    connect.commit()
    connect.close()
    return data + "," + type
def parse(url, pack, reload=0):
    # Scrape one focal product together with its recommended, sponsored and co-purchased products
    print("---MAIN---")
    driver = webdriver.Firefox(options=options)
    drivers.append(driver)
    driver.set_window_size(1366, 9000)  # tall window, because Firefox does not scroll to the element
    driver.implicitly_wait(wait_time)  # seconds
    driver.get(url)
    main_data = load(url, "main")
    if main_data is None:
        # Nothing to record: close this browser before returning
        driver.close()
        drivers.remove(driver)
        return
    print("---RECOMMENDS---")
    recommends_selector = '[data-widget="skuShelfCompare"]>div>div>div>div>div>div>a'
    recommends = driver.find_elements_by_css_selector(recommends_selector)
    current_sleep = 0
    while len(recommends) == 0:
        current_sleep += 1
        if current_sleep > 2:
            if reload > 1:
                break
            # Reload the page in a fresh browser, at most twice
            driver.close()
            drivers.remove(driver)
            parse(url, pack, reload + 1)
            return
        time.sleep(1)
        print("recWhile")
        # Query again so that the loop can see a widget that loaded late
        recommends = driver.find_elements_by_css_selector(recommends_selector)
    recommends_data = []
    for element in recommends:
        recommends_temp_data = load(element.get_property("href").split("?")[0], "recommends")
        recommends_data.append(recommends_temp_data)
    print("---SPONSORED---")
    # "Спонсорские товары" is the site's title for the sponsored products shelf
    sponsored_selector = '[data-widget="skuShelfGoods"][title="Спонсорские товары"] a'
    sponsored = driver.find_elements_by_css_selector(sponsored_selector)
    current_sleep = 0
    while len(sponsored) == 0:
        current_sleep += 1
        if current_sleep > 2:
            if reload > 1:
                break
            driver.close()
            drivers.remove(driver)
            parse(url, pack, reload + 1)
            return
        driver.save_screenshot("sponsored_screen.png")
        print("sponsoredWhile")
        time.sleep(1)
        sponsored = driver.find_elements_by_css_selector(sponsored_selector)
    sponsored_data = []
    for element in sponsored:
        sponsored_temp_data = load(element.get_property("href").split("?")[0], "sponsored")
        sponsored_data.append(sponsored_temp_data)
    print("---ALSO-BUYED---")
    also_buyed = driver.find_elements_by_css_selector(
        "#__nuxt>div>div.block-vertical>div:nth-child(6)>div>div:nth-child(2)>div>div:nth-child(4) a")
    also_buyed_data = []
    for element in also_buyed:
        also_buyed_data_temp = load(element.get_property("href").split("?")[0], "also_buy")
        also_buyed_data.append(also_buyed_data_temp)
    driver.close()
    drivers.remove(driver)
    # One CSV row per focal product; the with-block closes the file
    with open('data/data' + pack + '.csv', 'a', newline='') as csvfile:
        writer = csv.writer(csvfile, delimiter=';')
        writer.writerow([main_data] + recommends_data + sponsored_data + also_buyed_data)


def get_id(url):
    # The product id is the last non-empty chunk after splitting the URL on "-" and "/"
    return list(filter(lambda e: e != '', re.split(r'[\-/]', url)))[-1]


# parse("https://www.ozon.ru/context/detail/id/154925584/", "test")
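The two pieces of string handling in the listing above (extracting the product id from a URL and reading the purchase counter from the badge text) can be checked without a browser. The sketch below restates get_id and isolates the badge logic in a hypothetical helper, parse_sales_badge. The sample badge texts are assumptions about what the site displayed, chosen only to match the keywords and word positions used in the scraper.

```python
import re


def get_id(url):
    """Last non-empty chunk of the URL after splitting on '-' and '/'."""
    return [part for part in re.split(r"[\-/]", url) if part][-1]


def parse_sales_badge(text):
    """Map a badge text to the three counters used in the scraper ('-' means missing)."""
    counts = {"all": "-", "today": "-", "week": "-"}
    words = text.split()
    if "Купили более" in text:  # "bought more than N"
        counts["all"] = words[2]
    elif "за сегодня" in text and "покуп" in text:  # "N purchases today"
        counts["today"] = words[0]
    elif "за неделю" in text and "покуп" in text:  # "N purchases this week"
        counts["week"] = words[0]
    return counts


# Hypothetical badge texts, for illustration only
print(get_id("https://www.ozon.ru/context/detail/id/154925584/"))
print(parse_sales_badge("Купили более 100 раз"))
print(parse_sales_badge("15 покупок за сегодня"))
```

Keeping this logic separate from the Selenium calls makes it possible to test the parsing rules when the page layout or the badge wording changes.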
Appendix B
Code for clustering the data
ozon_tv <- read.csv("ozon_finalv3.csv", header = TRUE, sep = ",")
# Clustering analysis for main products
library(dplyr)   # for data cleaning
library(ISLR)    # provides the College dataset; not used in the lines shown here
library(cluster) # for Gower distance (daisy) and PAM
library(Rtsne)   # for t-SNE plot
library(ggplot2) # for visualization
# Drop column 1 and columns 9-17 before clustering, then convert text columns to factors
main_ozon_tv <- ozon_tv[, -c(1, 9:17)]
main_ozon_tv$Brand <- as.factor(main_ozon_tv$Brand)
main_ozon_tv$Product.Name <- as.factor(main_ozon_tv$Product.Name)
# Gower distance for mixed data types; column 3 is log-transformed
main_ozon_dist <- daisy(main_ozon_tv, metric = "gower", type = list(logratio = 3))
# Check attributes to ensure the correct methods are being used
summary(main_ozon_dist)
# Create matrix
df_mat <- as.matrix(main_ozon_dist)
# Output a highly dissimilar pair (largest distance below the overall maximum);
# the most similar pair would use min() instead of max()
main_ozon_tv[which(df_mat == max(df_mat[df_mat != max(df_mat)]),
                   arr.ind = TRUE)[1, ], ]
# Choosing a clustering algorithm
# Calculate silhouette width for many k using PAM
sil_width <- c(NA)
for(i in 2:12){
...