Effects of product network relationships on demand in Russian ecommerce
This research analyzes the values contained in the Product Recommendation Network and how they affect demand on an e-commerce site. We carried out an empirical analysis of the TV category of a major Russian e-commerce store.
Category | Management and labor relations |
Type | diploma thesis |
Language | English |
Date added | 27.08.2020 |
File size | 3.4 MB |
One of the key tools for understanding customer preferences in the e-commerce market is an effective recommendation system model. Most researchers study the design of recommendation systems for an individual client only. The development of communication through social networks has changed everyday life, and people now make many decisions in the presence of, or in coordination with, other people, so Professor Shanshan Feng and colleagues (2018) studied the creation of a recommendation system for a group of people.
The authors propose a new approach based on incorporating multifaceted associations. The model takes the products selected and rated by a group of people, and recommendation strategies based on the proposed model are then applied. The resulting group recommendation system assesses consumer preferences considerably better and shows the best average execution time compared to other methods (Feng et al. 2018).
Another study assessing the impact of recommendation systems on product selection on an e-commerce site was conducted by Professor Sylvain Senecal (2004).
For the empirical experiment, 3 e-commerce websites, 2 items, 4 recommendation sources and about 500 potential buyers were randomly selected. The standard site subsection “With this product we recommend” showed particularly good results in comparison with traditional recommendation sources (expert opinion and advice from other users). An additional factor influencing demand is personalization: providing the client with accurate, useful, and necessary information increases loyalty to the company and also increases sales. The analysis revealed a direct relationship: users who consulted the recommendation system selected the recommended items twice as often as users who did not. This study makes a scientific contribution to understanding consumer demand and to assessing the effect on demand of different tools for predicting customer behavior (Senecal et al. 2004).
The appearance of visible links to a recommendation network on e-commerce sites is a relatively recent and interesting phenomenon that requires additional analysis. Professor Gal Oestreicher-Singer, who has written several works on the e-commerce market, studied this phenomenon together with colleagues. The main question discussed in that study is whether a recommendation system can redistribute customer demand from the most popular and well-known products to less popular ones, thereby extending revenue, customer attention and demand to them (Oestreicher-Singer et al. 2012). The analysis confirmed this hypothesis, and the authors identified the key factors of redistribution: attracting and retaining attention with visual tools, expanding the assortment of goods, and reducing the cost of finding the right product. This work broadens the topic of how recommender systems influence demand and illustrates the data processing methods involved (Google PageRank, the Gini coefficient and the category's Lorenz curve).
Recommended information also has an effect when it comes to advertising. Randall Lewis et al. (2015) studied the effect that display advertising has on customer search and what this means for the brand and its competitors. They showed statistically that display advertising (ad recommendation) can lead to an increase in searches for the advertised brand as well as for its alternatives or competitors, and for other similar items or ideas that the ad can evoke. This is an interesting way to see that recommendations on the internet have some influence on users. The main question is what kind of influence these suggestions have, depending on their type and industry.
Further recent studies have predicted sales based on product networks. Utpal Prasad et al. (2017) proposed a model that predicts the sales rank of products on Amazon based on co purchase networks. They identified several factors that can affect the sales rank, among them the volume of reviews, co purchases, and linguistic features. Their findings suggest that these factors can act as strong indicators of the sales rank; moreover, their model performed more accurately than traditional product review analysis.
Their model, based on the co purchases, reviews, and ratings of a product, serves as a baseline because there were no similar earlier studies of the Amazon sales rank. They also showed the importance of network features for such predictions: features such as community membership, closeness and clustering are important in a product network.
Summary of research
The studies we reviewed cover the topic of Online Product Recommendation extensively, from works that define general concepts to those that study deeper effects of Recommendation Networks. We aim to contribute to this line of work by analyzing how the values included in Recommendation Networks affect Sales in the Russian e-commerce environment.
The earliest works we analyzed study Recommendation Models based on customer perceptions and analyze the characteristics of Product Networks. The study of this topic began by analyzing its structure rather than defining a concept (2004). Subsequently, after several studies of the characteristics of such functionality (2007 - 2008), more works started to formalize concepts about Recommendation Networks. These theories were crucial for our own work because we could base our calculations on those concepts.
We found more papers with a deeper analysis of Product Recommendation. These works (2012 - 2014) include clustering customers, merging Social Network data, measuring the value of the networks, etc. Their common objective is to find a way to optimize these networks for better results. The results they aim to optimize range from Sales, WOM, Ratings and Reviews to other factors that affect customers' actions.
The technologies used in these works have evolved considerably. Most early studies relied on descriptions and on customers' perceptions gathered through questionnaires. Recently the methods have changed drastically: big data, data mining and web scraping have enabled research with a broader scope (2015-2020). In addition, it has become possible to analyze other aspects of Product Networks, such as co purchases, ranking factors, and comparisons between Recommendation Algorithms.
Figure 1. Summary of literature review
Recommendation networks have also been analyzed in industries other than e-commerce. We found research that analyzes recommendations in Social Media channels as well as in other markets such as tourism. Recent research measures Product Networks by separating them into categories, which is a relevant step toward discovering differences between types of products. In addition to such categories, many studies analyze e-commerce in different countries, which suggests further research comparing e-commerce environments with similar characteristics in different regions. The works we analyzed are based on general concepts of types of recommendation networks, so our research follows the same line. Our main contribution is an explanatory model of how the values included in the Recommendation Network affect Russian e-commerce, specifically in the TV category.
Research design and methodology
Hypothesis
Reviews:
H1a: The sales volume of a product is influenced by the number of reviews of neighborhood similar (also viewed) products.
One of the key factors influencing the buyer's decision is the availability of reviews. Professor Xue Pan et al. (2019) showed that not only reviews of the main product matter, but also reviews of additionally viewed products in the same recommendation chain. The positive impact on sales volume was confirmed empirically for book sales on Amazon.com and on BarnesandNoble.com. Reading a review by a customer who bought an additional or the main product creates a positive impression of the goods and builds confidence in the purchase intention, which is why sales increase in direct proportion to the number of reviews of additional goods in the recommendation system.
H1b: The sales volume of a product is positively related to the number of reviews of neighborhood co purchased products.
As the number of reviews of co purchased products increases, sales of the main product increase as well. A purchase within the recommendation network launches a virtual wave of sales that reaches nearby products and significantly changes their internal eWOM. After reviews of additional products in the co purchased category are published, the client reads them and then makes a decision about the main product (Xue Pan et al. 2019).
H1c: The reviews of a focal product are influenced by discount and price (economic indicators).
Interestingly, the economic indicators of goods in the recommendation network can positively affect the quantity and quality of reviews of the main product. This issue was studied by Professor Gupta, P. and colleagues, who found a positive impact of discounts and prices on reviews of the main product. Today, the key indicators on which a buyer bases a decision are cost, discount size, rating and reviews. The hypothesis is therefore applicable to our study, given the growing number of reviews on e-commerce sites and their particular weight when a customer chooses a product. We have the opportunity to understand how Russian customers behave when placing an order and how this effect is expressed on the Russian marketplace (Gupta et al. 2010).
Similarity:
H2a: Directly connected products have similar range of sales with each other.
Based on previous studies by Xue Pan and colleagues (2019) on the dependence of demand on recommendation networks, the position of goods in the system affects two factors: 1) intention to purchase; 2) consumer involvement and time spent on the site. As the intention to buy grows, the consumer usually compares goods, characteristics, and cost, which ultimately leads to the final choice. One of Xue Pan's empirical studies of this issue revealed the pattern that directly connected products have similar sales volumes. This hypothesis will allow us to conduct an additional analysis and identify the dependence of demand for goods within the recommendation network. During the study, we will pay separate attention to products from the recommendation network that have different sales volumes.
H2b: The focal product has more sales when neighborhood similar products have higher prices.
In our literature review, we found that many authors sought to understand the economic effect of the position of goods on an e-commerce site. Professor Anderson (2006) reported that recommendation systems help anticipate consumer behavior by showing a wide variety of products, which should lead to increased sales. Mooney (2000) argues that the recommendation system maintains or improves the position in the virtual storefront of popular products only. Professors Fleder and Hosanagar (2009) concluded that product diversity is affected most, because the recommendation network directs consumers to new products. Finally, Professor Zhijie Lin and colleagues found that the main product is in greater demand when products in the recommendation network have high prices. Zhijie Lin's empirical analysis was carried out on the American e-commerce market; we can analyze the Russian market and identify whether the same dependence of the main product's sales volume on its neighbors in the recommendation network exists.
Economic incentives:
H3a: The sales volume of a focal product is negatively influenced by a high average discount of similar offered products.
H3b: The sales volume of a focal product is positively influenced by a high average rating of co purchased products.
E-commerce sites have learned very well how to manage customer attention and behavior using discount systems. It is no accident that events such as Black Friday bring stores much more revenue than the preceding months of work. The sales volume and discount of a product depend directly on the size of the discount on similar products in the recommendation network. This issue was also studied by Professor Xue Pan, who built 5 models that explore the economic effect of interconnected products in the recommendation network. The study found that sales of the main product, which attracts the most customer attention, in certain situations adversely affect similar products sold at a discount, while in other cases the effect is positive. This pattern was also identified by other researchers: Lin, Z., Breugelmans, E., Oestreicher-Singer, G. Our data set contains a large number of main products with sales, as well as products in the recommendation network with discounts, so this hypothesis remains an important issue for understanding product demand in a network of recommendations.
Ratings:
H4a: The sales volume of a product is influenced by the ratings of neighborhood similar (also viewed) products.
A study by Professor Lei Hou and colleagues identified the effects of the eWOM indicator on similar products in the recommendation network. A subsequent detailed analysis of each indicator affecting sales showed that the influence of a similar product's rating is also significant. Oestreicher-Singer, G., analyzing Amazon.com, confirmed the effect of the rating, the number of reviews, and the size of the discount on sales of the main product. Today, buyers prefer to trust the recommendation network rather than spend time searching for products by rating, so the effect of the average rating of recommended goods on sales is a significant hypothesis that we included in our work.
H4b: The rating of a focal product is influenced by average discount of neighborhood similar (also viewed) products.
The placement of goods on the virtual shelf of an e-commerce site was originally designed so that the goods shown together had approximately the same average rating. Before the final purchase, buyers who use the recommendation system still check and compare the rating, discount and other important indicators manually, so we assume that the rating of the main product is significantly affected by the average rating and the average discount of neighboring products (Duan et al. 2008).
Conceptual model
Based on the theories mentioned previously, we propose the following research model. We aim to explain the sales of the TV category with the chosen indicators related to the recommendations shown to the user. We want to quantify the value of showing alternative options to the consumer. Els Breugelmans et al. (2011) stated that not all types of displays are effective in increasing brand sales. In the same work they state that showing a product in first position is a strong indicator of influence on the customer's purchase. For the model we propose, we took into account that the recommended products are shown before some of the main product's attributes, such as reviews and ratings.
Some studies estimate the value of the network in which a product is located. Gal Oestreicher-Singer et al. (2013) carried out research to discover the value products receive from the network and the value they provide to it. They state that, depending on their revenue, products receive more links and are recommended more often in their network. In the same research they point out that best-selling books receive more value than they provide to the network, whereas low sellers provide significantly more value to the network than they receive from it.
Figure 2. Visualization of conceptual model
In our model we aim to explain the actual sales of a product in a certain period. We want to discover whether similar alternatives affect the customer's decision to buy a TV. Moreover, we want to explain whether co purchased products, which have a considerably lower price than the main product, influence the customer. We have proposed several hypotheses to explain each of the main available attributes of the recommended products.
Table 1. Variables of conceptual model.
Type of variable | Name of variable |
Dependent | Sales volume |
Dependent | Ratings of product sold |
Dependent | Reviews of product sold |
Independent | Average number of reviews of “also viewed” products |
Independent | Average number of reviews of co purchased products |
Independent | Average price of “also viewed” products |
Independent | Average price of co purchased products |
Independent | Average amount of discount of “also viewed” products |
Independent | Average sales of “also viewed” products |
Independent | Average rating of “also viewed” products |
Independent | Average rating of co purchased products |
Control | Product price |
Control | Number of own reviews |
Control | Presence of discount |
Control | Average rating |
Choosing a category and a suitable layout of a product
The product recommendation network on ozon.ru has a layout common to most e-commerce sites. Nonetheless, some attributes of its layout allow a deeper understanding of the influence of product recommendation systems on the purchase behavior of a customer. To carry out our analysis we need to parse all the data about the other products that are recommended on the page of the main product. Lin et al. (2017) previously did a similar analysis, collecting data from recommended products to analyze the effect of a recommended product depending on whether it is presented as an option or is itself being influenced.
There are several e-commerce stores in the Russian market that provide different kinds of products to their customers. Some of them offer specific products to a niche market, while others offer a broad range of products to the mass market. This research analyzes one of the biggest e-commerce stores in Russia. It is widely accepted that ozon.ru is one of the leaders in the industry. One of the main advantages of analyzing this site is that it contains the attributes required for our objectives. Most importantly, it contains information about sales for a specific category. Together, these attributes allow us to carry out our research and address our questions and hypotheses.
Figure 3. An average layout of a product page in OZON
This is what an average product page looks like on ozon.ru. It is important to take into consideration that, for some products or seasons, other attributes and recommendations might appear. These can include sponsored products, new discounts, subscription offers, etc. Consumer behavior could therefore be affected by those attributes.
That is why we analyzed a category whose recommendations are likely to be stable over a certain period. According to Google Trends (2020), searches for televisions and smart TVs are stable during most of the year, except for late November and December, when most sales promotions for this product are offered.
Figure 4. Overall interest in searches for TVs in the Internet
We collected the information about TVs during a period when interest is flat, so ozon.ru has no incentive to add recommendations beyond its default recommendation systems. By analyzing this category during this period, we can be reasonably confident that the sales observed were not affected by variables other than the main attributes of the product and its usual product recommendations.
Figure 5. An example of the location of the values and characteristics of the main product
The layout of this category in that period allows us to carry out a cleaner analysis with attributes for every product in the selected category. This category shows sales for a day or for a period within the last 2 months. With this information, we can carry out an empirical analysis of which attributes of product recommendations might affect the demand in this category.
Figure 6. An example of the location of the values and characteristics of the also viewed products and co-purchased products
Alongside the product attributes, alternative options are offered to the customer. These alternative products contain the attributes we want to analyze to discover whether they influence the purchase decision of a customer. At first glance, the recommended alternatives look like the main product. In addition, the co-purchase products are offered to the customer in the same format.
Figure 7. The network of products
We want to see whether, for this category, they influence the customer's decision. It is important to mention that additional attributes of the main product are located below these recommendations. These attributes were therefore used as control variables: if they have an influence, it means that the customer took into account the information below the recommendations.
Collecting the data
We analyzed Russian e-commerce sites and chose one of the largest and most popular ones. OZON.ru is a Russian online store that offers its customers more than 5 million items in 24 categories, including electronics, household appliances, household and garden goods, goods for mothers and children, repair, sports and leisure, beauty and health, clothes and shoes, auto products, pet products, food, books, multimedia and others. In 2019, Ozon.ru took fifth position in the Forbes magazine ranking "20 most expensive companies in Runet-2019." In 2018, the company showed record sales growth over 10 years of existence. According to the company itself, the online store's turnover grew by 73% to 42.5 billion rubles.
We chose this resource because it is one of the most innovative services in the Russian market and implements 3 types of recommendation systems: “We recommend with this product”, “Sponsorship products”, “Buy with this product”. The TV category is the best represented on the site in terms of sales and coverage, so we parsed the site and obtained data on more than 800 products that are connected to the recommendation network (Zaikina, 2020).
Scraping the data
The television category contains many products and recommendations, so web scraping methods are required to get the data we want to analyze. For this purpose, we used Python and the Selenium package, which allowed us to get more than 700 rows of data and more than 8000 attributes of recommendations. The Selenium package was useful for scraping all the desired information from the TV category of ozon.ru and for automating the parsing.
Figure 8. Piece of code for scraping the data from OZON
It is important to mention that OZON does not have a static webpage and that its information changes constantly over time. This means that new ads or new discounts can appear during a special season. Some values of the co purchased products were therefore unstable during the parsing process. In addition, recommended products can keep the same information, but we detected that the order in which they are shown changes. The order in which recommended products are shown can affect the purchase behavior of the customer (Els Breugelmans et al. 2011). We therefore scraped the data in a period with no incentives, such as sales or high seasons, to make significant changes to the algorithm of the recommendation systems.
In addition, we discovered that sales information was not presented in all categories of the store, and that sometimes the information presented was the number of times the product was viewed rather than the number of sales. Consequently, we specified in our code that only actual sales figures should be collected and nothing else.
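The filtering step can be sketched in Python as follows. This is only an illustration: the label formats, the pattern `SALES_PATTERN` and the function `extract_sales` are hypothetical placeholders, since the exact text shown on the product pages is not reproduced here. Only the idea of keeping purchase counts and discarding view counts comes from the text.

```python
import re

# Hypothetical label format: a number (possibly with space-separated thousands)
# followed by the word "purchases". The real page text would need to be checked.
SALES_PATTERN = re.compile(r"(\d[\d\s]*)\s+purchases", re.IGNORECASE)

def extract_sales(label):
    """Return the sales count found in a label, or None if it is not a sales label."""
    match = SALES_PATTERN.search(label)
    if match is None:
        return None  # e.g. a view counter: not a sales figure, so it is skipped
    # Remove the thousands separators before converting to an integer.
    return int(re.sub(r"\s", "", match.group(1)))

# Made-up labels for illustration only.
labels = [
    "152 purchases in 2 months",
    "1 204 views today",
    "1 030 purchases this week",
]
sales = [extract_sales(text) for text in labels]
print(sales)
```

Rows for which the function returns `None` would be dropped before the analysis, so that view counts never enter the Sales variable.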
Operationalizing variables
We analyze the variables of this model in 4 categories: Reviews, Economic Incentives, Ratings and Similarities. Different authors have measured Product Recommendation in a variety of ways. For example, Xue Pan et al. (2019) assessed Product Recommendation Networks by measuring the impact of reviews in such systems. In addition, Els Breugelmans et al. (2011) focused on the effectiveness of in-store displays in a virtual environment. They suggested that products featured in virtual displays on e-commerce sites tend to change the behavior of the consumer. Based on this premise, it is interesting to study which additional variables in the recommendation environment may increase the sales of a certain product. In related work, Zhijie Lin et al. (2018) reveal that there is a relationship between two products inside a recommendation system and that this relationship is affected by the ratings and reviews of those products. Based on this study, it would be useful to analyze which other attributes of the recommended products might affect the demand and performance of a certain item.
How Sales are affected by recommendation systems
The main objective of this research is to analyze how demand is affected by the recommended products and their existing conditions. Previous studies have based their selection criteria on the effects on demand. Zhijie Lin et al. (2017) assessed this variable by taking the Sales of an e-commerce store in a certain period as the variable to be explained by the characteristics of recommended products.
This research will try to explain the Sales variable in a certain period. The difference from previous studies is that we use different attributes that might explain the demand for a product. Moreover, the Sales value we have chosen covers a short period.
Sal_n = |Wi|
Where:
Sal_n = Number of sales of a product
Wi = Number of sales of a product for one week
Factors in the recommendation systems that affect the demand for a product. Reviews.
In most recent studies, eWOM has been measured in different ways. Xue Pan et al. (2019) measured the impact of recommendation systems on the reviews of a product. Their study took as explanatory variables the distance between 2 products and the relationship based on it. In the same study they found that a product's rating is influenced by products up to three clicks away. Based on this, we chose the variables Average number of reviews of “also viewed” products and Average number of reviews of co purchased products.
We chose these variables to see how sensitive the demand for a focal product is to the number of reviews of similar products. In addition, we want to measure whether the same variable, taken from co purchased products, can affect the sales of the main product.
Reviews_main + ANR_av + ANR_cp
Where:
Reviews_main = Number of reviews of the main product.
ANR_av = \frac{1}{n_{av}} \sum_{i=1}^{n_{av}} RAV_i
RAV_i = Number of reviews of the i-th also viewed product; n_{av} = number of also viewed products
ANR_cp = \frac{1}{n_{cp}} \sum_{i=1}^{n_{cp}} RCP_i
RCP_i = Number of reviews of the i-th co purchased product; n_{cp} = number of co purchased products
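A short Python sketch can illustrate how the two averages might be computed. It assumes that ANR_av and ANR_cp are plain arithmetic means over the recommended products shown on the page of the focal product. The function name, the dictionary keys and the sample numbers are hypothetical and are not taken from the OZON data set.

```python
from statistics import mean

def average_neighbor_reviews(review_counts):
    """Arithmetic mean of the review counts of a list of recommended products.

    Returns 0.0 for an empty list. This is an assumption: the text does not
    say how products without recommendations of a given type are handled.
    """
    return float(mean(review_counts)) if review_counts else 0.0

# Hypothetical focal product with its two recommendation lists.
focal = {
    "reviews_main": 12,
    "also_viewed_reviews": [4, 10, 7],   # review counts of also viewed products
    "co_purchased_reviews": [30, 50],    # review counts of co purchased products
}

anr_av = average_neighbor_reviews(focal["also_viewed_reviews"])
anr_cp = average_neighbor_reviews(focal["co_purchased_reviews"])
print(anr_av, anr_cp)
```

The same helper could be reused for the price, discount and rating averages defined in the following subsections, since they have the same form.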
Economic Incentives:
Some recommendation systems might use their algorithm to suggest products that are similar and carry a purchase discount. The same can apply to products that can be purchased together with the focal product. Based on the work of Linyuan Lü et al. (2012), we want to analyze the attributes of recommended products that contain purchase incentives. That is why we added to our model the variables Average amount of discount of “also viewed” products and Average amount of discount of co purchased products.
Discount_main + DA_av
Where:
Discount_main = Discount of the main product
DA_av = \frac{1}{n_{av}} \sum_{i=1}^{n_{av}} DAV_i
DAV_i = Discount of the i-th also viewed product; n_{av} = number of also viewed products
Similarity.
When a user considers a product on an e-commerce site, there is also the possibility to choose between alternatives and to purchase additional products related to the customer's needs. Zhijie Lin et al. (2017) analyzed the implicit demand correlation in their study by defining the level of elasticity for each recommended product. In this research we analyze the TV category, where it is easier to define which products are substitutes (“also viewed”) and which are complementary (“co purchased”). It is necessary to mention that co purchased products have a significantly lower price than the focal product and therefore cannot be considered alternative purchases. Hence, we want to analyze how much impact those prices have on the focal product, using the variables Average price of “also viewed” products and Average price of “co purchased” products. In this way we can measure whether better prices of alternative products decrease the demand for the main one, and whether the prices of co purchased products stimulate demand for it.
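Price_main + AP_av + AP_cp
Where:
Price_main = Price of the main product
AP_av = \frac{1}{n_{av}} \sum_{i=1}^{n_{av}} PAV_i
PAV_i = Price of the i-th also viewed product; n_{av} = number of also viewed products
AP_cp = \frac{1}{n_{cp}} \sum_{i=1}^{n_{cp}} PCP_i
PCP_i = Price of the i-th co purchased product; n_{cp} = number of co purchased products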
Ratings.
Some research analyzes the Rating factor in the recommendation network (Lü et al. 2012). This variable has been assessed as one of the factors that matter in designing recommendation networks (Kim et al. 2017). Ratings are shown on most e-commerce sites in Russia when a product is recommended. That is why we decided to include the variable in our model, to discover whether it influences the purchase decision of a customer in the TV category. One of the main objectives with this variable is to explain to what extent the discount of a recommended product can serve as an incentive for customers to purchase a product.
Rating_main + AR_av + AR_cp
Where:
Rating_main = Rating of the main product.
AR_av = \frac{1}{n_{av}} \sum_{i=1}^{n_{av}} RA\_AV_i
RA_AV_i = Rating of the i-th also viewed product; n_{av} = number of also viewed products
AR_cp = \frac{1}{n_{cp}} \sum_{i=1}^{n_{cp}} RA\_CP_i
RA_CP_i = Rating of the i-th co-purchased product; n_{cp} = number of co-purchased products
Data clustering: An additional process
Nowadays, with the development of digitalization, new mathematical models and methods are used in data processing to structure information effectively and identify patterns in large volumes of data. Among them, methods for identifying clusters (classes) play a significant role. Their relevance is confirmed by the fact that, for the terms "clustering" and "classification analysis" in the Google search engine (as of April 2020), the reported figures are more than 274k for clustering and around 530 for classification analysis (SEMrush: traffic-analytics, 2020).
Figure 9. Keyword overview of clustering and classification analysis
Consider the definition of this concept. Cluster analysis is a form of data analysis based on collecting statistical information about the general structure of the data. In other words, it divides the data sample into subsets (clusters) so that each cluster consists of similar elements, while there are significant and interpretable differences between clusters. The main goal of cluster analysis is to identify similar objects in the studied sample.
There are many methods of cluster analysis, so we carried out a comparative analysis of the generally accepted methods in order to evaluate them and select the one to be used in our study.
Probabilistic approach. It is assumed that each object in question belongs to one of the classes. This approach includes the best-known methods: k-means, k-medians and the EM algorithm.
The most popular clustering method is k-means. It was invented in the 1950s by the mathematician Hugo Dyonizy Steinhaus. The algorithm seeks to minimize the total squared deviation of the points of each cluster from the center of that cluster:
$$V = \sum_{i=1}^{k} \sum_{x_j \in S_i} (x_j - \mu_i)^2$$
Where:
$k$ - number of clusters, $S_i$ - resulting clusters, $i = 1, 2, \dots, k$, $\mu_i$ - center of cluster $S_i$.
The algorithm partitions a set of points of a vector space into a predetermined number of clusters. The key idea is that, at each iteration, the center of each cluster is recalculated, and the points are then reassigned to the cluster whose new center is closest under the selected metric. The algorithm ends when the cluster centers stop changing, which happens after a finite number of iterations because a finite set has a finite number of partitions. At each stage the total squared deviation decreases, so looping is impossible (Jain et al., 1999).
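The iteration described above can be sketched in a few lines of pure Python. This is a minimal illustration using squared Euclidean distance and fixed starting centers; the sample points, the initial centers and the function names are assumptions made for the example and do not reproduce the procedure applied to the Ozon data.

```python
def kmeans(points, centers, max_iter=100):
    """Basic k-means on tuples of floats; `centers` holds the initial centers."""
    centers = [tuple(c) for c in centers]
    clusters = [[] for _ in centers]
    for _ in range(max_iter):
        # Assignment step: each point joins the cluster with the nearest center.
        clusters = [[] for _ in centers]
        for p in points:
            dists = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centers]
            clusters[dists.index(min(dists))].append(p)
        # Update step: each center moves to the mean of its cluster.
        new_centers = []
        for members, old in zip(clusters, centers):
            if members:
                new_centers.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                new_centers.append(old)  # an empty cluster keeps its old center
        if new_centers == centers:  # centers stopped changing: converged
            break
        centers = new_centers
    return centers, clusters

def total_squared_deviation(centers, clusters):
    """The objective V: squared distances of points to their cluster center."""
    return sum(
        sum((a - b) ** 2 for a, b in zip(p, c))
        for c, members in zip(centers, clusters)
        for p in members
    )

# Hypothetical two-dimensional points forming two visible groups.
points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]
centers, clusters = kmeans(points, centers=[(0.0, 0.0), (10.0, 10.0)])
v = total_squared_deviation(centers, clusters)
print(centers, v)
```

In practice the result depends on the starting centers, so the algorithm is usually run several times with different initializations and the solution with the smallest V is kept.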
The second clustering method, k-medians, was created by analogy with k-means: the median, rather than the mean, is used to determine the center of each cluster. It also minimizes the error across all clusters. In practice this method can show better results than k-means in problems where the sum of distances, rather than the sum of squared distances, is minimized. The distance sum criterion is widely used in transportation problems.
The third method of the probabilistic approach is the EM algorithm, which is used in mathematical statistics to find maximum likelihood estimates of the parameters of probabilistic models that depend on hidden variables. Each iteration consists of two steps. In the E-step (expectation), the expected value of the likelihood function is calculated, with the hidden variables treated as observable. In the M-step (maximization), the maximum likelihood estimate is calculated, which increases the expected likelihood computed in the E-step. This estimate is then used in the E-step of the next iteration. The algorithm runs until convergence.
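To make the two steps concrete, the sketch below runs EM on a mixture of two one-dimensional Gaussian distributions, where the hidden variable is the component each observation belongs to. The mixture model, the data, the starting values and the fixed number of iterations are all assumptions chosen for illustration; the text above does not specify a particular model.

```python
import math

def normal_pdf(x, mu, sigma):
    """Density of the normal distribution N(mu, sigma^2) at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

def em_two_gaussians(data, mu, sigma, weight=0.5, n_iter=50):
    """EM for a two-component 1-D Gaussian mixture.

    mu, sigma: two initial means and standard deviations.
    weight: initial mixing weight of component 0.
    """
    mu, sigma = list(mu), list(sigma)
    n = len(data)
    for _ in range(n_iter):
        # E-step: posterior probability that each point came from component 0.
        resp = []
        for x in data:
            p0 = weight * normal_pdf(x, mu[0], sigma[0])
            p1 = (1 - weight) * normal_pdf(x, mu[1], sigma[1])
            resp.append(p0 / (p0 + p1))
        # M-step: re-estimate parameters from the weighted observations.
        n0 = sum(resp)
        n1 = n - n0
        mu[0] = sum(r * x for r, x in zip(resp, data)) / n0
        mu[1] = sum((1 - r) * x for r, x in zip(resp, data)) / n1
        var0 = sum(r * (x - mu[0]) ** 2 for r, x in zip(resp, data)) / n0
        var1 = sum((1 - r) * (x - mu[1]) ** 2 for r, x in zip(resp, data)) / n1
        # Floor on sigma is a safeguard against division by zero (an assumption).
        sigma[0] = max(math.sqrt(var0), 1e-6)
        sigma[1] = max(math.sqrt(var1), 1e-6)
        weight = n0 / n
    return mu, sigma, weight

# Hypothetical observations grouped around 1 and around 5.
data = [1.0, 1.2, 0.8, 1.1, 0.9, 5.0, 5.2, 4.8, 5.1, 4.9]
mu, sigma, weight = em_two_gaussians(data, mu=[0.0, 6.0], sigma=[1.0, 1.0])
print(mu, sigma, weight)
```

A production implementation would normally stop when the log-likelihood changes by less than a tolerance rather than after a fixed number of iterations.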
The second major approach comprises methods based on artificial intelligence systems, namely the Kohonen neural network and the genetic algorithm.
The Kohonen neural network contains a layer of simple linear adders. As a rule, the output signals of the Kohonen layer are processed according to the “winner takes all” rule: the largest signal is turned into one and the rest are set to zero. Rejecting excess or weak signals in this way improves the quality of clustering.
A genetic algorithm is a heuristic search algorithm used to solve optimization and modeling problems by randomly selecting, combining, and varying the desired parameters using mechanisms similar to natural selection. This makes it effective for optimization problems, and researchers often use it in biology and chemistry. The key difference from other methods is the possibility of "crossing" (crossover).
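The “winner takes all” rule can be shown with a small sketch: each neuron of the layer is a linear adder that computes a weighted sum of the inputs, and only the neuron with the largest sum outputs one. The input vector, the weights and the function name are hypothetical, and no training of the weights is shown.

```python
def kohonen_layer_output(inputs, weights):
    """Weighted sums for each neuron, followed by winner-takes-all."""
    # Each row of `weights` belongs to one neuron (a linear adder).
    signals = [sum(w * x for w, x in zip(row, inputs)) for row in weights]
    winner = signals.index(max(signals))  # on a tie the first maximum wins
    return [1 if i == winner else 0 for i in range(len(signals))]

# Hypothetical input vector and weights for three neurons.
inputs = [1.0, 0.5]
weights = [[0.2, 0.1], [0.9, 0.4], [0.3, 0.8]]
output = kohonen_layer_output(inputs, weights)
print(output)  # [0, 1, 0]
```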
The third and final group of approaches, hierarchical clustering, is a set of data-ordering algorithms whose results are visualized with graphs. The presence of nested groups (clusters of various orders) is assumed.
These algorithms build nested groups when clusters of various orders are present. There are two categories: agglomerative (merging close points into larger clusters) and divisive (splitting the dataset along clear boundaries). Monothetic and polythetic classification methods are sometimes distinguished by the number of features used. Like most visual methods for representing dependencies, graphs quickly lose clarity as the number of objects increases. There are a number of specialized programs for drawing such graphs.
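The agglomerative (merging) variant can be sketched as follows. The example uses one-dimensional points and single linkage, meaning that the distance between two clusters is the distance between their closest members; the linkage rule, the data and the function name are assumptions made for illustration.

```python
def agglomerative(points, n_clusters):
    """Merge the two closest clusters until `n_clusters` remain (single linkage)."""
    clusters = [[p] for p in points]
    merges = []  # history of merges, from which a cluster tree could be drawn
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single linkage: distance between the closest pair of members.
                d = min(abs(a - b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((clusters[i], clusters[j], d))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters, merges

# Hypothetical one-dimensional observations.
points = [1.0, 1.5, 5.0, 5.2, 9.0]
clusters, merges = agglomerative(points, n_clusters=3)
print(clusters)  # [[1.0, 1.5], [5.0, 5.2], [9.0]]
```

The recorded merge history is what a dendrogram displays: each merge becomes a node whose height is the distance at which the two groups were joined.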
One of the most popular methods is the Sugar & James algorithm. This is a nonparametric method that transforms the quality function so that the kink or jump becomes clearly visible. The method is based on the concept of “distortion”, an estimate of the variance within a class (cluster). During the implementation of this method, a special “distortion function” is calculated, which is determined by three parameters:
1. Distortion for a given cluster solution.
2. The number of clusters for a given cluster solution.
3. The conversion coefficient.
The distortion is then converted into a transformed distortion using the conversion coefficient. Finally, the behavior of the transformed distortion function is analyzed as a function of the number of clusters, and on this basis the best cluster solution is chosen (Tryon, 1939).
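The three parameters listed above can be combined as in the following sketch of the jump criterion. It assumes the usual form of the transformation, in which the distortion d_k for k clusters is raised to a negative power (the conversion coefficient) and the number of clusters with the largest increase in the transformed value is selected. The distortion values and the power used here are hypothetical.

```python
def jump_method(distortions, power):
    """Pick the number of clusters with the largest jump in transformed distortion.

    distortions[k - 1] is the distortion of the solution with k clusters.
    power is the conversion coefficient applied as d ** (-power).
    """
    # The transformed distortion for zero clusters is taken to be 0.
    transformed = [0.0] + [d ** (-power) for d in distortions]
    jumps = [transformed[k] - transformed[k - 1] for k in range(1, len(transformed))]
    best_k = jumps.index(max(jumps)) + 1
    return best_k, jumps

# Hypothetical distortions for solutions with 1 to 5 clusters.
distortions = [10.0, 4.0, 0.5, 0.4, 0.35]
best_k, jumps = jump_method(distortions, power=1.0)
print(best_k)  # 3
```

Here the distortion drops sharply between two and three clusters and only slightly afterwards, so the largest jump points to three clusters.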
To summarize and select the data clustering method for our study, we have compiled a comparative table of all the data clustering methods described above.
Table 2. Summary of clustering methods and impact on conceptual model.
Clustering algorithm | Cluster shape | Input data | Results |
Hierarchical | Arbitrary | The number of clusters or distance threshold for truncating a hierarchy | Binary cluster tree |
k-means (PAM) | Hypersphere | Number of clusters | Cluster centers |
k-medians | Hypersphere | The number of clusters, the degree of fuzziness | Cluster centers, membership matrix |
EM-algorithm | Arbitrary | Distance threshold R | Membership matrix |
Kohonen neural network | Arbitrary | Number of clusters or distance threshold for edge removal | Tree structure of clusters |
Sugar & James algorithm | Linear | Data sampling | Linear model showing different levels of width between clusters |
Results and findings
An overview about OZON's performance
First, it is important to present the landscape of one of the most important e-commerce companies in Russia. Ozon is a leader in digital sales with a vast variety of products. Most of its traffic comes from search engines and referrals. Although TVs can be a challenging category to purchase online because of their price, there is a considerable amount of traffic to that category. In addition, Ozon.ru is growing in traffic faster than the market average for similar sites.
Figure 10. Traffic journey of customers
We can observe that most potential customers come from Direct channels, which means that they go directly to the site without passing through search engines or social media channels. The most popular search engine driving traffic to the marketplace is Yandex, a Russian company, so it may be important for Ozon to focus its advertising on this platform rather than on Google (SEMrush: traffic-analytics, 2020).
After leaving Ozon, most customers go directly to Yandex, followed by Google. In addition, some of them go directly to other e-commerce stores, probably to check out more options. Therefore, Ozon would be advised to focus remarketing activities on Yandex and Google, targeting potential customers who visited the TV section and did not complete a purchase. Moreover, if there is a strategy for social media channels, VK should be the first platform to focus on for promoting products.
Traffic to Ozon.ru by device is 68% mobile and 32% desktop. Based on this, it is important to optimize selected categories to drive mobile purchases and to keep enhancing the mobile-friendly version of the site. Different strategies are recommended for visitors on mobile devices and visitors on desktop devices (SEMrush: traffic-analytics, 2020).
Figure 11. Traffic share of customers
Figure 12. Traffic by countries of customers
The traffic by country presents a predictable picture: more than 90% of Ozon's traffic comes from Russia, around 75 million visits per month. Nonetheless, a share from Ukraine that cannot be ignored accounts for around 1 million visits per month. Another 1 million visits per month come from the United States and Germany; since this e-commerce store is available only in Russian, this traffic can be interpreted as expats looking for products sold in Russia. This data is important if the company wants to sell specific products to people who live abroad and want to buy something online from Russia (SEMrush: traffic-analytics, 2020).
Figure 13. Entrance sources and type of devices of customers.
The TV category page shows interesting traffic results. Contrary to the general picture for Ozon, most of the traffic to the TV category comes from desktop devices, with a 77% share, followed by mobile with 23%. This means that, when it comes to purchasing TVs, more people use desktop devices. As for the sources of traffic to this page, most of it comes from Direct sources and search engines. This suggests that, in February 2020, most people who bought TVs on Ozon had already planned to buy such products.
Ozon.ru's traffic growth by source shows a positive trend in all categories. As expected, Ozon grows faster, in terms of traffic, than its market average. This can be attributed to the integrated digital marketing strategies the company is carrying out. An important aspect to highlight is that the market in general is growing from social media sources. While the number of visits is not comparable to Direct or Search sources, most e-commerce stores in Russia are investing in social media strategies (SEMrush: traffic-analytics, 2020).
Figure 14. Growth of market and OZON
Figure 15. Growth quadrant of e-commerce companies
The e-commerce landscape in Russia presents some notable results. For this analysis we considered the top 8 e-commerce stores in terms of traffic growth and number of visitors. Ozon.ru is an established player in the market, with a traffic volume far higher than that of its competitors. However, it is not growing as fast as other players. Following the growth trend, two players can be considered game changers in the industry: Beru.com, a Yandex-owned company, and goods.ru are the fastest-growing e-commerce stores in the industry (SEMrush: traffic-analytics, 2020). This means that they may soon overtake Ozon as one of the main e-commerce stores in Russia. Ozon is a well-established marketplace in Russia, but it needs constant innovation and enhancement of its store to keep up with market demands.
Descriptive statistics
After scraping the data from the TV section of Ozon, we proceeded to our main analysis to explain the effect of product recommendation on sales in Russian e-commerce. First, we present basic information about the main variables we aim to analyze. There are five numerical variables for the focal product: Sales, Ratings, Reviews, Price and Discount. In addition, there are five variables for recommended products under the label “Also viewed”, followed by three available variables for co-purchased products. We can therefore group the variables into three main categories: Main, Also viewed and Co-purchased. All the mentioned variables were present for 709 products from OZON.
Figure 16. Descriptive statistics
The main group presents the following basic statistics. For price, the cheapest item in this category costs 4,999 rubles and the most expensive more than 200k. This difference leads to a high coefficient of variation of 1.06, which indicates high dispersion in prices. Some special items are very expensive and can be treated as outliers, so that the effect is analyzed only for the most common products. Ratings present a more stable pattern: there are some non-rated products and some with the highest grade, but the distribution is less dispersed, with a coefficient of variation of 0.65. The same holds for discounts: there are products with up to 42% discount, but most have around 15%. This stability is helpful, because the data was gathered in a non-seasonal period, when discounts are less likely to be unstable. The best-reviewed products have up to 376 reviews, while others have no customer reviews at all. Finally, the most unstable variable is Sales, with a coefficient of variation of 2.60, which indicates high dispersion: best-selling products have up to 100 sales, while other products have not generated any purchase.
The variables for the “Also viewed” products are in general less dispersed. This suggests that outlier products are not presented as recommendations to customers. For example, no overly expensive products are recommended on the main pages, and fewer products without ratings are offered. We can interpret this stability of the coefficient of variation by saying that Ozon's algorithm aims to offer products within a specific range of price, ratings, and discounts. On the other hand, the co-purchased products are the most dispersed group of all: they include products with prices up to 5k rubles, which is high for an add-on to a TV. Such high variation reflects the individual purchase situations of customers.
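The coefficient of variation used in this comparison is the standard deviation divided by the mean, which makes dispersion comparable across variables measured in different units (rubles, percent, counts). The sketch below shows the calculation; the use of the population standard deviation and the sample values are assumptions, since the text does not state which estimator was applied.

```python
from statistics import mean, pstdev

def coefficient_of_variation(values):
    """Standard deviation relative to the mean (population version, an assumption)."""
    return pstdev(values) / mean(values)

# Hypothetical values, for illustration only.
sample = [2, 4, 4, 4, 5, 5, 7, 9]
cv = coefficient_of_variation(sample)
print(round(cv, 2))  # 0.4
```

A value above 1, as reported for Price and Sales, means that the standard deviation exceeds the mean.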
Reviews
Figure 17. Descriptive statistics of reviews of main products
The number of reviews of the main products shows some outliers. Most products have zero reviews, while others have up to almost 400. We can see high variation between products that are reviewed and others that are not considered by customers. Despite this dispersion, the average is 26 reviews per product. We will build a regression to explain which variables of the main product and of the product recommendations might influence this number. We want an overview of this variable because one review means one sale generated.
Discount
Figure 18. Descriptive statistics of discount of main product
As for the discount on the focal products, the curve is flatter. The most common discount for TVs is 15%, although some products have up to 40%. This information is from a period without seasonal incentives to raise or lower discounts. The variable is likely to become unstable in seasonal periods such as sales, Black Friday and the like.
Figure 19. Descriptive statistics of rating of main product
The ratings of the main products present a curve. Most of the purchased products are rated with the highest mark, 5, followed by non-rated products with a rating of 0, which may be due to the lack of sales of those products. Interestingly, there are no low-rated products; it might be necessary to compare this information with product returns to verify whether the purchased products are well received by customers.
Sales
Figure 20. Descriptive statistics of sales of main products
A total of 4309 products were sold. However, sales are concentrated in selected brands. With our regression we aim to explain which attributes of the product recommendation affect the number of sales. This variable is not available for all categories on Ozon; for other categories, the site shows the number of times the product was viewed.
Figure 21. Brand and sales for the main products
Of the more than 30 brands offered by Ozon, only nine drove the total number of sales for the analyzed period. The most popular brand is Thompson, whose products have competitive prices. It is followed by popular brands such as LG, Samsung, and Panasonic, whose prices are slightly higher than the average. From this graph we can conclude that there are two main types of customers: 1. the majority, who prefer a product with a competitive price, and 2. those who prefer to buy high-quality products from a popular brand.
Correlation between variables
We created a correlation matrix to get an overview of the relationships between the variables we are controlling. The main dependent variables we aim to explain in our general model of demand are Ratings, Reviews, and Sales. We want to discover at what stage and to what extent Product Recommendation Networks influence customer demand. Based on our literature review, we know that links to recommended products can, to some degree, influence sales. Here we want to explain whether the values and characteristics of the recommended products have an impact on demand for TVs in a Russian e-commerce store.
Figure 22. Correlation matrix
First, consider the variable Ratings. As expected, it has a negative correlation with price: the lower the price, the higher a product's rating. It also has a stronger, positive correlation with Discount, which means that products purchased with a discount were rated higher. Moreover, the recommended products correlate with this variable: negatively for price and positively for rating, reviews and sales. This might be due to the similarity between the main product and the products recommended alongside it. Correlation with co-purchased products appears to be irrelevant.
The variable Reviews shows the same tendency as Ratings but with less intensity. There is a negative correlation with the main product's price and with the prices of recommended products. There is also a positive correlation with Reviews, both for main and recommended products, and with Sales of the main product. For some variables we can see a stronger correlation with co-purchased products. Even though it is not very relevant, we can observe a positive correlation with the Price, Ratings and Reviews of co-purchased products. This means that the more co-purchased products are present in the recommendation system, the greater the number of reviews the product has.
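For readers who want to reproduce this kind of analysis, the sketch below computes a correlation matrix with the Pearson coefficient. Pearson correlation is an assumption, since the text does not name the coefficient used, and the column names and values are hypothetical rather than taken from the scraped data.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long numeric sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def correlation_matrix(columns):
    """Nested dict with the correlation of every pair of named columns."""
    return {a: {b: pearson(columns[a], columns[b]) for b in columns} for a in columns}

# Hypothetical columns for five products.
columns = {
    "price": [10, 20, 30, 40, 50],
    "rating": [5, 4, 4, 3, 1],
    "discount": [5, 10, 10, 15, 20],
}
matrix = correlation_matrix(columns)
for name, row in matrix.items():
    print(name, {k: round(v, 2) for k, v in row.items()})
```

A coefficient close to -1 or 1 indicates a strong linear relationship, while a value near 0 indicates none; correlation alone does not show which variable influences the other.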