Recommender Model for Optimal Team Composition in Dota2 Professional Matches

The paper addresses cooperation between in-game units, applying real-world techniques to meet the demand for analysis that optimizes the team composition routine. The research creates a system that suggests optimal hero selections.

Category: Management and labor relations
Type: thesis
Language: English
Date added: 25.08.2020
File size: 6.8 M

FEDERAL STATE EDUCATIONAL INSTITUTION

OF HIGHER EDUCATION

NATIONAL RESEARCH UNIVERSITY

HIGHER SCHOOL OF ECONOMICS

Saint Petersburg School of Economics and Management

Department of Management

Recommender Model for Optimal Team Composition in Dota2 Professional Matches

Bachelor's thesis

In the field 38.03.02 'Management'

Educational programme 'Management'

Pavlov Alexander Dmitrievich

Sidorenko Danila Andreevich

Saint Petersburg 2020

Table of contents

Introduction

1 Background: Dota 2

1.1 Game basics

1.2 Competitive nature

2 Related Work

3 Methodology

3.1 Dota 2 match history and sampling

3.2 Data collection

3.3 Descriptive statistics and graphs

3.4 Association rules

3.5 Social Network Analysis

4 Results

4.1 Dataset and formatting

4.2 Results for recommender model

4.3 Application design

4.4 Evaluation

Conclusion, limitations, and future works

List of references

Appendices

Abstract

Computer games have recently attracted rising interest, become a big business, and given rise to an electronic sport. Multiplayer Online Battle Arenas, the most successful of the eSports disciplines, are also the most complex and strategy-reliant games to play. To succeed in this kind of game, teams should master finding and selecting the best combinations of heroes. The current paper addresses cooperation between in-game units, applying real-world techniques to meet the demand for analysis that optimizes the team composition routine. Based on the history of previous games, the current research creates and describes a system that suggests optimal hero selections to consider during ongoing matches. The performance of the system is then evaluated by comparing it to manual game analysis done by tournament-broadcasting studios. The paper also discusses the critical limitations of studying this issue and proposes possible solutions and areas for future research.

Keywords: cooperation study, social network analysis, association rules, team composition, multiplayer online battle arena

Introduction

Team-based competitive online games such as Dota 2, League of Legends, and Counter-Strike: Global Offensive are among the most played and watched today (Summerville et al., 2016). The cybersports industry, which has skyrocketed in recent years, now attracts big companies ready to invest millions of dollars in tournaments, events, teams, and related activities (Bathurst, 2017; Takahashi, 2017). Hardware producers and software developers adjust products and entire line-ups to meet the demand for a specific 'for gamers' tag (Logitech International, 2017). The cybersports branch itself received $4.6 billion of investments in 2018 (Deloitte Corporate Finance LLC & The Esports Observer, 2019) through substantial marketing budgets and facilities development. Many teams reposition themselves as cybersport organizations by attracting sponsors' attention and using the funds to create new brands or strengthen existing ones. Owing to their popularity, competitive nature, and strategic complexity, eSports share many similarities with traditional team sports. With the rise of data analysis in the latter, statistical and machine learning tools are also becoming increasingly important for the development and analytics of computer sports. Moreover, online games have proven to be a good fit for experimental research because the environment yields rich information to work with. Despite all of that, almost every real-world application of in-game analytics and research in eSports is currently performed by hand.

Cooperation is a common practical mechanism found in different areas, spanning from biological laws to human society. Businesses, being on that list, strive to find the best combinations of human capital to make the most of employees and, as a result, to boost productivity and effectiveness. Cyber sports work the same way and sometimes apply combinatorial ideas twice: at both the human and the in-game level. Multiplayer Online Battle Arena (from now on, MOBA) games are a perfect example for team performance analysis and evaluation at the team-roster and gameplay layers. Games of that genre oblige players to coordinate to reach a common goal and, at the same time, to develop themselves as competitive individuals to enhance performance (Morschheuser et al., 2019). Selecting Dota 2 allows the current research to focus specifically on in-game team composition (also known as a draft), which includes both the selection of a team's own heroes (pick) and restrictions for the opponent (ban). The drafting aspect of Dota 2 is the most advanced and tactics-reliant among all existing MOBA games and may be viewed within the combinatorial scope. It is characterized as a two-person zero-sum game (one team always wins when the other loses) with nearly perfect information, sequential actions (teams select heroes in turn in real time, with all available options and each other's picks at hand), and deterministic rewards (win or lose) (Z. Chen et al., 2018).

Despite numerous attempts to create an entirely mathematical algorithm to predict or compose a team's draft (Conley & Perry, 2013; Hanke & Chaimowicz, 2017; Summerville et al., 2016), none was game-relevant in terms of heroes' cooperation to a distinctive degree. At the time of writing, no known topic-related study considers game-specific hero cooperation. Additionally, previous studies have focused on a universal case, which does not account for team preferences and stable in-game trends.

The current paper aims to fill the existing gap by providing all interested parties (professional teams and their managers, casting studios, etc.) with a ready-to-use application that helps them during the drafting stage to find beneficial picks and bans. That is achieved by suggesting various draft options based on the analysis of previous Dota 2 matches performed with Social Network Analysis, association rules, and in-depth statistics. The application's proposed options (outputs) are interactive (updated based on the input passed) and, thus, never final until the drafting phase ends. The development presented in the current paper will help teams and managers to create a systematic approach to cybersports decision making in a variety of scopes, including but not limited to team management, transfer policies, and mergers and acquisitions. Another possible positive effect would be substantial growth of the game-side professional scene due to the skill-sharing principle and an overall increase in competitiveness. A deeper understanding of in-game processes by outsiders might also reduce the uncertainty of the market and, thus, bring new investments.

The current paper consists of 40 pages, excluding appendices. The Introduction is followed by four main sections, which (1) describe the nature of Dota 2, (2) discuss related prior work, (3) define the tools used to create the analysis, and (4) give an overview of the results, including evaluation. The current paper references 42 sources, including prior research in journal articles, conference proceedings, reports, interviews, and book chapters.

1 Background: Dota 2

To understand the context and applications of each model discussed later, one should consider the gameplay basics, specifically the draft stage. The following chapter describes what it is to play Dota 2, defines the professional level, and briefly overviews the eSports scene.

1.1 Game basics

Dota 2 is a MOBA title whose gameplay mostly takes the form of a third-person real-time strategy game, where two teams compete for the ultimate goal of destroying the enemy's base. Many in-game features and mechanics vary from title to title; nonetheless, the main objective remains the same. Dota 2, being one of the most advanced MOBAs, not only fulfills the basics but also sets the trends for the competition in both gameplay and community-management scopes.

A game of Dota 2 is a match between two teams of five players, with two main stages that strictly follow one another. The drafting stage allows each player to choose a unique hero out of the 119 available at the time of writing, with no hero picked more than once. It is followed by the actual game phase, briefly described in the previous paragraph. Since the hero selection process relies heavily on game mechanics and on players' expectations of how the team will play with certain heroes, the in-game stage and its basics should be explained. Teams fight each other to reach and destroy the enemy's Ancient (the main building) as the game progresses while attaining various other objectives, which are rewarded with experience and gold according to set formulas. Gold can be spent on purchasing hero-strengthening items. Most heroes possess four abilities, which define the gameplay and can only be enhanced once the player reaches a new level. During any match of Dota 2, players roam the same map, which receives version-related updates from time to time to bring variability and change the balance. The map is separated into special zones: three lanes that partially define player roles (top, middle, bottom), the river that splits the map into the Dire and Radiant sides with their bases, and four jungles (two for each team on the respective sides). The positions and shapes of all zones might change with game version updates. At the start of the game, teams head to the lanes and gain their first experience and gold from computer-controlled units called creeps, which meet in the middle of each lane. As the game progresses, players continue to gain resources and move around the map to fight enemy heroes, destroy the opposing team's towers, and reach the Ancient.

Figure 1. Dota 2 map as of the 7.22 game update

1.2 Competitive nature

Professional and high-tier Dota 2 is remarkably different from what regular players experience during an ordinary entertainment session in aspects such as game understanding, time dedication, and others. Being focused on tournament success, professional players might practice up to 18 hours a day (Maincast, 2019), which also includes testing tactics and hero combinations. Synergies between the abilities discussed in the previous section create strategies and game styles for teams. Specific combinations of heroes possessing particular abilities might enhance the entire team's line-up. Professional-level Dota 2 extracts the maximum from that, and players might spend hundreds of matches to understand the best possible synergy for a single ability in the game.

On top of that, players classify heroes by roles and use scenarios depending on ability mechanics: damage dealing, healing, movement speed amplification, etc. Dota 2 itself has a basic hero classification, which puts heroes into categories of different gameplay nature, such as ranged, melee, nuker, durable, and escape. An easy-to-understand example is the armor reduction mechanic, which amplifies the physical damage an enemy target (hero, creep, or structure) receives. Since the game mechanics support the stacking of armor reduction, it might be beneficial for a team to get a pair of heroes with armor-reducing abilities and combine them with a hero able to deal a lot of physical damage. Nevertheless, the game has mathematically defined limitations to prevent any abuse of mechanics.

Game modes also differ between casual and professional games. Ordinary matches use All Pick, a system that allows every player to nominate heroes for banning (restriction from selection) within a special time frame and only then to proceed to a sequential picking phase on a per-player basis. League games instead use Captain's Mode, in which a single player, the team captain, picks and bans for the rest of the team against the rival's captain. The two captains participate in the drafting phase; they alternate turns and actions (pick/ban) according to a strictly predetermined order depicted in the table below.

Table 1. Dota 2 draft order in Captain's Mode*,**

Draft turn  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22
Team A      B  -  B  -  B  -  P  -  -  P  B  -  B  -  P  -  -  P  -  B  P  -
Team B      -  B  -  B  -  B  -  P  P  -  -  B  -  B  -  P  P  -  B  -  -  P

* B stands for a hero ban, P stands for a hero pick, and a dash marks a turn that belongs to the other team

** Order as of data sample's game version rules
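
The turn order of Table 1 can also be written down as data, which is convenient for any tool that has to know whose action comes next. The following Python sketch is an illustration only and not part of the described application; the row strings are transcribed from Table 1, and the names `build_draft_order` and `next_turn` are hypothetical.

```python
from typing import List, Optional, Tuple

# Rows transcribed from Table 1 ("B" = ban, "P" = pick, "-" = other team's turn).
# The order is valid only for the game version of the data sample.
TEAM_A_ROW = "B-B-B-P--PB-B-P--P-BP-"
TEAM_B_ROW = "-B-B-B-PP--B-B-PP-B--P"


def build_draft_order(row_a: str, row_b: str) -> List[Tuple[str, str]]:
    """Merge the two table rows into one list of (team, action) turns."""
    if len(row_a) != len(row_b):
        raise ValueError("both rows must cover the same number of turns")
    order = []
    for turn, (a, b) in enumerate(zip(row_a, row_b), start=1):
        # Exactly one team acts on every turn.
        if (a == "-") == (b == "-"):
            raise ValueError(f"turn {turn} must belong to exactly one team")
        order.append(("A", a) if a != "-" else ("B", b))
    return order


DRAFT_ORDER = build_draft_order(TEAM_A_ROW, TEAM_B_ROW)


def next_turn(actions_done: int) -> Optional[Tuple[str, str]]:
    """Return the (team, action) that follows, or None once the draft is over."""
    if actions_done < 0:
        raise ValueError("actions_done cannot be negative")
    if actions_done >= len(DRAFT_ORDER):
        return None
    return DRAFT_ORDER[actions_done]
```

For example, `next_turn(0)` gives the opening ban of Team A, and `next_turn(len(DRAFT_ORDER))` returns `None` because all 22 turns are finished.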

Teams also perform a coin toss before a series starts. It allows teams to determine the draft ordering and assign map sides (Radiant or Dire). The so-called 'firstpick and lastpick' rule is essential in professional drafts since it creates an asymmetry of information. The team with the 'lastpick' gets the final decision of the entire drafting stage, which could potentially flip the game odds by introducing a firm counter or picking the best unrivaled option available. That is traded for the 'firstpick' opportunity of the opposing team, which enables the latter to pick any available hero earlier and sometimes equals an additional starting ban.

The balance system of Dota 2 is built upon the support/core hero role distinction, which defines who gets most of the resources in a team. Core heroes tend to have more power when rich and high-level, whereas support heroes can sacrifice that and still be useful with fewer resources accumulated. Many success factors affect the drafting procedure. Probably the most crucial is to balance a pick with synergizing heroes of both camps, able to bring the entire line-up to victory. Additionally, as heroes are revealed, an opponent might anticipate the combination being built and try to prevent it through bans or 'hero-steals.' That said, revealing heroes in a precise order might delay the opponent's detection of specific combinations and strategies and let the overall line-up benefit from this. These predictions about what might be picked or should be banned next stand at the core of the Dota 2 professional scene. It is crucial for a team captain to master ordering and balance, as it becomes the key to the team's strategic advantage in every other aspect.

The professional-level drafting process is also profoundly affected by the game meta and teams' preferences. Whereas the latter is simply what an individual team likes and uses more in terms of hero choices, the former stands for stable scene- and update-wide trends of increased pick or ban rates for specific heroes that follow game patches. Patches include changes to stats, ability power, and mechanics. Team captains, coaches, and analysts study upcoming opponents scrupulously, trying to note and recognize every possible hero combination in order to prepare countermeasures and determine the best possible draft scenario in terms of the final line-up. Since Dota 2 tournaments feature a short schedule, mostly of up to two weeks, and a participant list of up to sixteen teams, teams should be prepared, flexible, and focused when facing each new opponent. Captains and coaches have to analyze matches, often in real time, and track any changes in other teams' drafting and playstyle.

Making predictions about what comes next in a draft might be fundamental not only to players but also to casting studios, which drive conversations between panel members and, thus, build content upon analyzing Dota 2 drafts when broadcasting a match. Theorizing on future picks or bans, questioning teams' decisions, or discussing the entire meta is only possible with a decent level of expertise and analysis done by hand backstage.

Figure 2. Dota 2 match broadcast during the drafting phase

A diverse field of experimentation with hero drafts in MOBA games enables teams, analysts, and research such as the current paper to investigate the area of cooperation deeply. Dota 2 shows high involvement of heroes in drafts, exceeding 90% for selected tournaments, and thus provides a more interesting and extensive learning problem and places strategy at its core.
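
The involvement figure can be illustrated with a short calculation. The Python sketch below assumes that involvement means the share of the hero pool picked or banned at least once across a tournament's drafts; this reading, the function name `hero_involvement`, and the input layout are assumptions made for illustration.

```python
from typing import Iterable, Set


def hero_involvement(drafts: Iterable[Iterable[str]], hero_pool: Set[str]) -> float:
    """Share of the hero pool that was picked or banned at least once.

    drafts: one iterable of hero names (picks and bans together) per match.
    hero_pool: all heroes available in the game version in question.
    """
    if not hero_pool:
        raise ValueError("hero_pool must not be empty")
    seen = set()
    for draft in drafts:
        # Ignore names that are not in the pool (e.g. typos in the source data).
        seen.update(hero for hero in draft if hero in hero_pool)
    return len(seen) / len(hero_pool)
```

With a pool of four placeholder heroes and drafts that touch three of them, the function returns 0.75.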

2 Related Work

Previously, no scientific papers considered team composition in eSports on the basis of hero cooperation. Hence, analogies might be drawn between team-based cyber disciplines and more traditional real-world sports and managerial practices, including but not limited to staff recruitment in project management and semantic and partner analysis in marketing. The following chapter reviews the prior works related to the current paper.

To better understand the prerequisites, effects, and results of the cooperation of units, be they human beings or game-specific interrelated computer code, one should also explore the roots of relations studies. The first notions of the importance of analyzing groups of people as independent social units, with specific attributes and characteristics and able to assess and react to incoming actions, date back to the early 20th century. The asymmetry of independence of individuals and the imperfection of groups composed of these individuals were first noted and described as social geometry (Simmel, 1902), although many years passed before the book and its ideas became available to the public. The ideas of social geometry, which included both triad/dyad and distance-based concepts, heavily influenced many early sociological books discussing collective decisions and social cooperation models (Park, 1928; Wood, 1934). The booming interest in sociology in the early 1930s quickly brought scientists to new techniques for exploring social groups and connections. During that time, numerous independent sociologists and psychologists started developing the very first systematic studies of network analysis of social interactions. To this day, sociology is proud of these pioneering papers on qualitative approaches to creating and investigating sociometric networks built during interviews (Moreno, 1934) and on descriptions of social relations and origins within selected anthropological and historical communities (Lévi-Strauss, 1947; Radcliffe-Brown, 1923), which studied the interconnections of people of different eras, ages, and races. These articles and books established a research sphere, completely new at the time, focused on understanding how a given person or group of people might interact, or did interact, with other individuals around them.

Rapid technological and scientific developments allowed the broad application of the works discussed above, ranging from police investigations to project planning, sports modeling, and others. Researchers and businesses can now conveniently account for most of the underlying assumptions and focus on the practical side. Cooperation analysis has effectively widened its presence and now helps many spheres in different environments. Microbiology and bioinformatics have seen research and applications on enrichment analysis and network-based gene prioritization (D'Souza et al., 2017) and frameworks to systematically study post-translation protein interaction networks (Woodsmith et al., 2017). The effects of human interactions within corporate structures were studied by Methot et al. (2018) to develop a theory of how structural internalities driven by HR practices affect individual employees' performance. Social network analysis was also applied to inform the management of employee competencies, behaviors, and attitudes (Soltis et al., 2018). Moreover, constant improvement of data culture drives rapid changes in call-to-action response speeds and the overall involvement of automation in other business spheres. That enables supply chain and logistics companies to optimize routes and fleet loads using dense shipment interconnection based on a multi-step genetic algorithm and historical parcel data (Tummel et al., 2013). It also enables companies to determine the cooperation capacity of each employee, based on TOPSIS and network centralities, to find suitable talents for project management (C. T. Chen & Hung, 2012), to find an optimal balance between formal and informal institutions in a business environment (Wang et al., 2018), and to build strategic alliances using collaborative networks to overcome resource limitations, proceed with an open innovation project, and assess its possible success (Nunes & Abreu, 2020).

Widening the context to sports in the digital era, today's professional teams rely heavily on statistics and in-depth analytics for training session design, player development, and scouting networks (Cheng & Li Xia, 2014; Zuccolotto et al., 2019). Interestingly, sports fans are well aware of player drafting, which is closely connected to the drafting phase in Dota 2 discussed earlier. United States professional leagues such as the NBA, NHL, and NFL hold annual drafts of young players to allow teams to pick talents before the start of a season. The procedure is, of course, regulated and mainly built around the pick order and the related analysis. Teams always try to find the perfect opportunity to sign the best possible and best-fitting player, sometimes through transfers or other trades. Apart from that, the entire team sports world can now be classified as a system open for decomposition and research (Davids et al., 2005). Many open-source and proprietary software products offer both free and paid-membership tools for individuals and executives of smaller teams to deliver the very best of data and statistics for the good of team results (Gerrard, 2019). Specialized agencies and clubs' in-house departments do the same job at the professional level with the help of dedicated tools, worldwide staff networks, and club-specific approaches directed at the goals set by the manager. This is possible thanks to advanced data collection and processing methods available during and after training sessions (Link, 2018). Some studies have successfully examined sports dynamics through social network analysis (Korte & Lames, 2019) to understand the nature of interaction behavior in handball teams and players' involvement during matches. Interaction studies even consider team sports in general as a cooperation-opposition game (Manuel et al., 2016) and propose a summary of applicable and sustainable social network analysis measures for it. Managers and players themselves can track weak spots in game style, pace, position, moves, combinations, passes, and anything else connected with the game. Top-tier professional clubs even tend to build a transfer-based business strategy on top of player scouting and development, using advanced modern statistical techniques designed to find young talented players and help them develop their skills faster and better (Burt, 2017).

Considering the popularity and prevalence of video games, it was only a question of time before the scientific community entered the world of gaming entertainment and cybersports in particular. The digital nature of video games offers a nearly perfect platform to test and study new, previously unavailable data and to view known phenomena from an entirely new angle. Researchers have already made many attempts to explore the world of MOBA. For example, one study addressed dynamic difficulty adjustment for such games by introducing a programmed computer opponent, which takes the player's performance on specific in-game mechanics as input and dynamically adapts to provide the user with a better gaming experience (Silva et al., 2017). Another example is the research on data-driven computation of MOBA combat outcomes, which identifies patterns of winning combat tactics through graphs and features and judges which approach to combat tactics contributes the most to a team's success (Yang et al., 2014). Research on gradual game outcome prediction based on game-state data mining (Johansson & Wikström, 2015) concludes that the outcome of an ongoing game of Dota 2 could be predicted in real time with a success rate of more than 80% by applying machine learning techniques. The total number of game-related (especially MOBA) studies and applications of scientific knowledge to games grows every year, which is ultimately beneficial for the entire eSports community and business.

There have already been some attempts closely related to the current research and its branch that studies the drafting stage of Dota 2. Conley & Perry (2013) applied logistic regression and clustering machine learning algorithms to historical match data to predict the odds of winning based on the heroes selected by teams; the final k-nearest neighbors model reached an overall accuracy of 67.43%. Despite that, the work was oriented entirely towards data computations and disregarded in-game parameters and approaches related to unit characteristics. Another paper (Kinkade et al., 2015) also demonstrates a mechanism for predicting winning odds using logistic regression, based separately on hero line-ups and post-game data. The authors conclude that much more input is required to accurately predict the winner using exclusively computational tools. Another study (Eggert et al., 2015) uses supervised machine learning algorithms to define and discuss rich low-level insights and analysis on the classification of Dota 2 players' roles, which is very close to the current paper's topic. After breaking down the in-game classes, the authors reduce the total set and apply it to various ML algorithms to validate and compare their performance on game data, which proves the rich playground nature of video games and Dota 2 specifically. A notable prior example of using association rules to predict drafts (Hanke & Chaimowicz, 2017) mostly focuses on the general procedure and, again, on win rate prediction. The application does not account for any in-game mechanics or draft ordering and relies only on the simultaneous appearance of heroes within a winning team, which nevertheless resulted in a 74.9% win rate across a simulation of 1,000 test matches and an accuracy of 88.63% in predicting the winner. An exceptional example of a tool initially designed to suggest rather than predict (Summerville et al., 2016) draws attention by its focus on the simplification and automation of statistical analysis made for teams. The Bayesian Network and Long Short-Term Memory recurrent neural network algorithms introduced in the paper analyze the Dota 2 drafting phase to suggest new heroes based on those already selected. The authors consider only professional matches, later compare the algorithms' results to human predictions, and provide insights on the future of eSports analysis and its implications.

Nevertheless, the existing web-based draft analysis and hero suggestion tools available to all Dota 2 players mostly use very basic parameters: win rates of hero pairs and sorted hero counts for specific players on both the allied and rival teams. These mostly do the job of giving the right names for the right situations, but that holds only for public matchmaking. Professional matches cannot benefit from such superficial statistics because they do not take into account the many varied team-specific in-game preferences and strategic touches.

Aiming to extend previous research and bring additional constraints, the current paper uses several methods not to decide instead of players but to help them account for the context of a particular match and the in-game situations that may follow from a properly or improperly executed drafting stage. The applications are also of importance for coaches, commentators, and team managers interested in mastering the drafting procedure in Dota 2 for various reasons, not limited to analytics and cybersport team management.

3 Methodology

The following section describes (1) the nature of Dota 2 data and the collection methods, (2) statistical approaches to determine in-game trends later included in the final results, and (3) methods used to build the recommender model. The tools of choice are association rules, social network analysis, and rich descriptive statistics (Sankey diagram, line chart, ridgeline chart, bump graph).

3.1 Dota 2 match history and sampling

The entire match history of Dota 2 starts in early 2010 (closed alpha test launch) and acts as the general population. Due to the enormity and complexity of that body of data and the obsolescence of the older records, the current paper uses sampling. The data selected for the research is split into two independent datasets meant for different computational techniques. The first dataset covers matches held on the professional scene only, within a specific time range equal to one professional Dota 2 season, which lasts from late September to early August; the season selected is 2018/19 because it was the most recent season at the time of the research. This dataset consists of 7,000 matches. Team roster changes and meta-sensitive game updates profoundly affect professional matches. It is pointless to consider more than a single season for the analysis of any team since preferences and trends vary a lot from year to year, even for the most stable teams. The second dataset contains records of high-tier public matches, which are played by the same professionals and by members of the top-5000 ranked leaderboard; this dataset also spans one season, even though public matchmaking is less preference-dependent. The second dataset totals approximately 150,000 entries. Both samples use non-random convenience sampling as the selected data is strictly targeted and defined. Both datasets use the same codification of entries, each of which represents a separate match. Thus, the current paper assumes that the data gathered, observed, and analyzed is panel data.

3.2 Data collection

Since the current paper analyzes an online cybersports discipline, it should be stated that all in-game data is written, collected, and stored raw on the game servers. In the case of Dota 2 specifically, the raw data provider is Valve, the developer of the game. There are three ways to get the required game data. The first and most obvious is to collect it directly, either manually by watching games or automatically using parsing software or scripts, all of which would require downloading replays and, as a result, a great deal of time. The other two ways require the use of an API, an application programming interface that allows communication between the computer program operated by a user and the database with raw data. One option is to go to the Dota 2 developer, Valve; the other is to use third-party data operators. Accessing Valve's API appears to be more complicated since its documentation and server responses are poorly organized and weakly structured. The third-party data operators, by contrast, provide a more user-friendly service enhanced by a wider list of available data outputs. Many active companies on the market provide such services; the current research relies on the two most popular ones, OpenDota.com and stratz.com, because of their convenient data formats and the authors' prior experience. The need for several services is driven by differences in the data each service offers and in its availability. In total, three sources were used to get the data defined in the previous section: Valve's API to gather a large chunk of public matches, since the developer does not limit the number of times a user can access the server per day, week, or month; the OpenDota API to extract the professional match index (ID) and player information; and the Stratz API for detailed match data.
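
As an illustration of the match index step, the Python sketch below pages through a list of professional matches and keeps the IDs that fall inside a season window. It is a minimal sketch and not the collection script used for the current paper: the endpoint `https://api.opendota.com/api/proMatches`, its `less_than_match_id` parameter, and the `match_id` and `start_time` fields are assumptions that should be checked against the current OpenDota documentation, and the function names are hypothetical. The network call is passed in as a parameter so that the paging logic can be tried on canned data.

```python
import json
from typing import Callable, Dict, List, Optional
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint; verify against the OpenDota API documentation.
PRO_MATCHES_URL = "https://api.opendota.com/api/proMatches"


def http_get_json(url: str) -> List[Dict]:
    """Default fetcher: download one page and decode it as JSON."""
    with urlopen(url, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))


def page_url(less_than_match_id: Optional[int] = None) -> str:
    """Build the URL of a page that lists matches older than the given ID."""
    if less_than_match_id is None:
        return PRO_MATCHES_URL
    query = urlencode({"less_than_match_id": less_than_match_id})
    return PRO_MATCHES_URL + "?" + query


def collect_match_ids(
    start_time_from: int,
    start_time_to: int,
    fetch: Callable[[str], List[Dict]] = http_get_json,
    max_pages: int = 100,
) -> List[int]:
    """Collect IDs of matches whose start_time (Unix seconds) is in the window."""
    ids: List[int] = []
    cursor: Optional[int] = None
    for _ in range(max_pages):
        page = fetch(page_url(cursor))
        if not page:
            break  # nothing older is available
        for match in page:
            if start_time_from <= match["start_time"] < start_time_to:
                ids.append(match["match_id"])
        # Continue from the oldest match ID seen on this page.
        cursor = min(match["match_id"] for match in page)
        # Assumes pages arrive newest first: stop when a page predates the window.
        if max(match["start_time"] for match in page) < start_time_from:
            break
    return ids
```

A real run would also need pauses between requests to respect the rate limits of the service, which are left out here for brevity.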

3.3 Descriptive statistics and graphs

Once all data had been collected and processed, the first part of the analysis began: descriptive statistics. The very first step is to figure out which statistics are useful and why they deserve a representation. There are several ways to find this out. The first one is community analysis. On platforms like Reddit, there are plenty of Dota 2 communities that discuss new game features, new heroes, and the new meta. The advantage of this method is the size of the community: many Dota 2 players are willing to express their opinions. The disadvantage is the large number of different, often contradictory, opinions.

The second source of information regarding useful descriptive statistics is professional players. The current paper received some input from one of the top Dota 2 players in St. Petersburg, who shared his thoughts on the in-game data and the state of the meta as a starting point. He also consulted with other players to collect different professional opinions, consolidated this information, and provided a summary of the statistics that professional Dota 2 players consider essential.

After all the needed information had been collected, the next step was to bring the existing data into a readable and visually understandable format. The main format is a graphical representation of the statistics. Several graphs of different forms were built with ggplot2, an external package for the R language that served as the primary tool for building graphs and helps to represent information simply. The first descriptive graph is the ridgeline chart, which shows many density plots at once. A density plot shows the distribution of a numeric variable; stacking several of them along a time axis shows how that distribution shifts, for example how the meta changes across game patches. It can also demonstrate other time-spanning, ever-changing variables that depend on counts and frequencies. Another visual statistic is a simple line chart depicting rapid changes in a variable over a short time frame. It can confirm some basic assumptions and provide additional insights, for example on meta fluctuations. The most complex overview method in the current paper is a bump graph, which depicts how ranked trends move through certain time periods. Lastly, a summary-style showcase of playstyle variety shows how each in-game role draws on the total pool of 119 heroes in a team's matches.

A distinct case of descriptive statistics is the Sankey diagram, which is a type of flow diagram. Even though ggplot2 is the best-known tool for building diagrams in R, the Sankey diagram is built with the networkD3 package, which supports data visualization through various flow charts and interactive graphs. The key to interpreting a Sankey diagram is to understand its sequential nature and the direction of its flows, whose width is proportional to the number of records represented. In the current paper, the Sankey diagram visualizes the draft process of a particular team and shows which heroes the team uses at every stage of the draft. The interactivity mentioned above helps not only to interpret the Sankey diagram as part of the descriptive statistics but also to use it as one of the recommendation tools. In this particular case, interactivity removes the unused draft heroes as soon as the corresponding stages pass, which helps to understand which heroes are most likely to be picked or banned next.

3.4 Association rules

One of the most critical methods of analysis in the current paper is association rules. Association rules are a set of rules that show the relationship between two or more data items. For a better understanding of the underlying concept, it is useful to consider Market Basket Analysis (MBA), a technique that analyzes patterns in consumers' buying habits by finding relations between their purchases. The simplest and most straightforward example of an MBA rule is the following: if a person buys bread and sausage, he or she will also buy butter in 90% of cases.

Before diving into association rules, their base concepts should be presented. Two terms are needed: itemset and transaction. An itemset is a set of items (or just one item) that occur together in one transaction. A transaction, in general, is an entry containing one or more items. It follows that a transaction contains several items, and several transactions form a transaction list. In the Market Basket Analysis example above, the transaction is the total purchase, or the cash receipt; an itemset is the whole range of products the consumer has bought; and all receipts that go through the shop in a day form the transaction list of that day. To analyze customer behavior, it is not enough to take one day or even a week as a transaction list. Much longer time intervals should be considered to conduct a proper analysis and uncover genuine buyer behavior.

Table 2

Association rules' main measures for transactions and itemsets

Measure | Formula | Description
Support | P(AB) | Probability of occurrence of item A and item B together, or the number of occurrences of itemset (A, B) in all the transactions
Confidence | P(AB) / P(A) | Probability of occurrence of item A and item B together among all itemsets where item A occurs
Lift | P(AB) / (P(A) P(B)) | The rise in the probability of occurrence of A given that B is present, relative to the probability of occurrence of A without any knowledge about the presence of B

For an association rule A => B, a lift higher than 1 means that A and B are positively correlated; a lift lower than 1 means that they are negatively correlated; and a lift equal to 1 means that items A and B are independent. With the base notions and measures of association rules established, the main algorithm used in the analysis can be stated. It is called Apriori (Agrawal & Srikant, 1994).
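To make the three measures concrete, the following minimal sketch computes them for the bread-and-sausage rule from the Market Basket Analysis example. It is written in Python for illustration only (the thesis itself relies on R); the function name `rule_measures` and the five toy receipts are assumptions invented for this example and do not come from the research data.

```python
from fractions import Fraction

def rule_measures(transactions, antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent => consequent."""
    a, b = set(antecedent), set(consequent)
    n = len(transactions)
    count_a = sum(1 for t in transactions if a <= set(t))
    count_b = sum(1 for t in transactions if b <= set(t))
    count_ab = sum(1 for t in transactions if (a | b) <= set(t))
    if count_a == 0 or count_b == 0:
        raise ValueError("antecedent and consequent must each occur at least once")
    support = Fraction(count_ab, n)            # P(AB)
    confidence = Fraction(count_ab, count_a)   # P(AB) / P(A)
    lift = confidence / Fraction(count_b, n)   # P(AB) / (P(A) P(B))
    return support, confidence, lift

# Toy receipts, invented for illustration only.
receipts = [
    {"bread", "sausage", "butter"},
    {"bread", "sausage", "butter"},
    {"bread", "butter"},
    {"bread", "sausage"},
    {"milk"},
]

support, confidence, lift = rule_measures(receipts, {"bread", "sausage"}, {"butter"})
print(support, confidence, lift)
```

On these toy receipts the lift is above 1, so buying bread and sausage is positively associated with buying butter.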

The Apriori algorithm works as follows. First, it finds all itemsets consisting of one item that meet the minimum support set by the user. This points to a clear advantage of the algorithm: the user chooses the support threshold and is not tied to a predefined number. Second, the algorithm builds all possible itemsets consisting of pairs of the remaining items and again keeps only those that meet the minimum support. These steps then repeat with progressively longer itemsets until no further itemsets can be found. The user may also set the maximum length of an itemset, which adds further flexibility.
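The level-wise search described above can be sketched in a few lines. This is a simplified Python illustration of the Apriori idea, not the implementation used in the research; the function name `apriori`, its parameters, and the toy drafts are assumptions made for this example.

```python
from itertools import combinations

def apriori(transactions, min_support, max_len=None):
    """Return {frozenset(itemset): support} for every frequent itemset."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        return sum(1 for t in transactions if itemset <= t) / n

    # Step 1: single items that meet the minimum support.
    items = {item for t in transactions for item in t}
    current = {}
    for item in items:
        s = support(frozenset([item]))
        if s >= min_support:
            current[frozenset([item])] = s
    frequent = dict(current)

    k = 2
    while current and (max_len is None or k <= max_len):
        # Step 2: join frequent (k-1)-itemsets into k-item candidates.
        previous = list(current)
        candidates = {a | b for a, b in combinations(previous, 2) if len(a | b) == k}
        current = {}
        for candidate in candidates:
            # Prune candidates that contain an infrequent (k-1)-subset.
            subsets = combinations(candidate, k - 1)
            if all(frozenset(sub) in frequent for sub in subsets):
                s = support(candidate)
                if s >= min_support:
                    current[candidate] = s
        frequent.update(current)
        k += 1
    return frequent

# Toy line-ups, invented for illustration only.
drafts = [
    {"Lich", "Tiny", "Magnus"},
    {"Lich", "Tiny"},
    {"Lich", "Rubick"},
    {"Tiny", "Magnus"},
]
frequent = apriori(drafts, min_support=0.5)
for itemset, s in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), s)
```

The search stops on its own once no longer itemset reaches the threshold, and the optional `max_len` argument mirrors the user-defined maximum itemset length.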

The current paper uses association rules in general, and the Apriori algorithm in particular, to analyze picks and bans of game heroes at the different stages of the draft process. For that purpose, the input table uses the hero name as an item and the match ID as a transaction. The draft order is critical in the game: once a team picks or bans a hero, no one can select it afterwards. Because of that, hero names are joined with the order in which they were picked or banned, so hero_name_1 and hero_name_3 are two different and unique items in the itemset. The minimum length of an itemset is one hero; the maximum length is not set explicitly and is limited automatically by the minimum support level, since setting the length manually would only restrict the work of the algorithm. After the Apriori algorithm has found all the rules and computed support, confidence, and lift for them, the obtained information is post-processed. The resulting dataset consists of many pairs, triplets, and even quadruplets of heroes. However, this data is not yet in its final state. As mentioned above, order is essential in the draft process, and there is no way to make a prediction or give a recommendation if the order is mixed up in the association rules' output. Unfortunately, the Apriori algorithm cannot deal with ordering, even when each hero carries an order number, so the data needs to be filtered. The numbering of heroes makes it possible to filter the rules by the correct order. After filtering, the dataset shrinks significantly but still holds enough information to base recommendations on.
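The order-based filtering can be illustrated with a short Python sketch. It assumes that each item follows the hero_name_order naming mentioned above and that a rule counts as correctly ordered when every hero on its left-hand side was drafted before every hero on its right-hand side; this reading of "the right order", the helper names, and the toy rules are assumptions, not the exact procedure of the research.

```python
def draft_order(item):
    """Extract the draft position from an item such as 'Lich_6'."""
    return int(item.rsplit("_", 1)[1])

def is_chronological(antecedent, consequent):
    """True when all antecedent heroes were drafted before all consequent heroes."""
    latest_known = max(draft_order(i) for i in antecedent)
    earliest_predicted = min(draft_order(i) for i in consequent)
    return latest_known < earliest_predicted

# Toy rules (antecedent, consequent), invented for illustration only.
rules = [
    ({"Tiny_1", "Lich_6"}, {"Rubick_7"}),            # kept: 6 < 7
    ({"Rubick_7"}, {"Lich_6"}),                      # dropped: predicts the past
    ({"Drow Ranger_0"}, {"Magnus_2", "Abaddon_8"}),  # kept: 0 < 2
]
kept = [rule for rule in rules if is_chronological(*rule)]
print(len(kept), "of", len(rules), "rules kept")
```

Only rules that point forward in the draft can be turned into a recommendation for the next pick or ban, which is why the rule set shrinks after this step.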

The second, additional way of using association rules in the current paper is to apply a hierarchy (An et al., 2006). Hierarchical grouping is applied when higher levels of a hierarchy can be distinguished, in order to reveal additional results that would not be seen with a direct application of association rules. There are several ways of grouping heroes in this research, but only one is valid for the recommendation system. The first thing that comes to mind is to group heroes by team. This approach is not applicable for two reasons: such groups would be helpful only for a particular team, not in general, and one team does not play enough matches to build rules on that information. Grouping by players also seems possible, but it is not suitable either, for much the same reasons as team grouping. The final and most appropriate option is to build a hierarchy of heroes based on their roles in the game. The roles themselves were listed above in the last part of the data collection chapter. After such a group filter is applied, there are five big groups of heroes: Mid, Safe, Offlane, Soft Support, and Hard Support. One hero may even belong to several groups. Such a hierarchy shows the general pattern of the game and can be a helpful instrument alongside the other recommendation tools.
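One possible way to apply the role hierarchy is to replace each hero item with its role (or roles) before mining rules. The Python sketch below shows that substitution under stated assumptions: the hero-to-role lookup is hypothetical and invented for illustration, the items follow the hero_name_order naming described earlier, and the thesis does not confirm that the grouping was implemented this way.

```python
# Hypothetical hero-to-role lookup; a hero may belong to several groups.
HERO_ROLES = {
    "Lich": {"Hard Support"},
    "Rubick": {"Soft Support"},
    "Tiny": {"Mid", "Soft Support"},
    "Drow Ranger": {"Safe"},
}

def generalise(transaction, hero_roles):
    """Replace items such as 'Lich_6' with role-level items such as 'Hard Support_6'."""
    generalised = set()
    for item in transaction:
        hero, order = item.rsplit("_", 1)
        # Heroes missing from the lookup are kept under an explicit placeholder.
        for role in hero_roles.get(hero, {"Unknown"}):
            generalised.add(f"{role}_{order}")
    return generalised

role_items = generalise({"Lich_6", "Tiny_1"}, HERO_ROLES)
print(sorted(role_items))
```

The role-level transactions can then be passed to the same rule-mining step, so that the resulting rules describe which roles tend to be drafted at which stage rather than which individual heroes.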

3.5 Social Network Analysis

Social Network Analysis (from now on - SNA) is a method of studying interactions between the members of a particular group. SNA is usually represented graphically, more precisely as either a directed or an undirected graph. The analyzed members are the nodes of the graph, and the interactions between them are the links that connect the nodes. The current paper uses SNA to analyze the interaction between heroes in each team. Therefore, the nodes here are heroes, and the links indicate the mutual appearance of heroes in one team. Of the two possible types of links, directed and undirected, the current research uses the undirected graph to show relationships between heroes. There is no need to specify a direction because that is not how Dota 2 works: there are no leading and no led heroes, hence no need for directed links. This paper uses both links and nodes as indicators for the recommender model.
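The undirected hero graph described above can be assembled by counting how often each pair of heroes appears in the same line-up. The following Python sketch shows one way to do that; the function name `build_cooccurrence_graph` and the toy line-ups are assumptions for illustration, not the code used in the research.

```python
from collections import Counter
from itertools import combinations

def build_cooccurrence_graph(lineups):
    """Return {frozenset({hero_a, hero_b}): weight} for an undirected weighted graph."""
    edges = Counter()
    for lineup in lineups:
        # Every unordered pair of heroes in one team adds 1 to the link weight.
        for a, b in combinations(sorted(set(lineup)), 2):
            edges[frozenset((a, b))] += 1
    return edges

# Toy line-ups, invented for illustration only.
lineups = [
    ["Lich", "Tiny", "Magnus"],
    ["Lich", "Tiny"],
]
graph = build_cooccurrence_graph(lineups)
for pair, weight in graph.items():
    print(sorted(pair), weight)
```

Using unordered pairs as keys reflects the absence of direction: the link between two heroes is the same regardless of which of them is listed first.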

After the main terms and purposes of SNA were stated, the measurements of SNA must be presented and explained. The current paper uses the centrality measurements to calculate the importance of every hero to a team. The three centrality measures used in the research are the following:

· Degree centrality measures how many neighbors a node has. In other words, in this research it counts the number of heroes a given hero is connected to (Wasserman & Faust, 1994). The more matches a hero has played alongside others, the higher its degree. The measure is complemented by the link thickness, which represents the number of mutual appearances of two heroes. Hence, degree centrality helps to find the so-called team-building heroes that form the core of the whole team.

· Betweenness centrality is the second measure used in the research. It is defined as a measure showing how many times a node falls on the shortest paths between other nodes (Uddin, 2017). The more shortest paths pass through the node, the higher its betweenness. Betweenness helps to show how critical a specific hero node is for the heroes closest to it.

· Eigenvector or eigen centrality is an advanced version of degree centrality. Unlike degree centrality, it takes into account not only the number of connections of a node but also the strength of those connections through second- and third-order connections (Laporta et al., 2018). A node with the most connections has the highest degree score, but it may still have a low eigen score if those connections are not meaningful.
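The three measures can be illustrated on a small toy graph. The sketch below is a plain-Python approximation written for this explanation, not the tooling used in the research: betweenness follows Brandes' algorithm on the unweighted graph, eigenvector centrality uses a shifted power iteration on the weighted links, and all function names and link weights are assumptions invented for the example.

```python
from collections import deque

def adjacency(edges):
    """Turn {(hero_a, hero_b): weight} into {hero: {neighbour: weight}}."""
    adj = {}
    for (a, b), weight in edges.items():
        adj.setdefault(a, {})[b] = weight
        adj.setdefault(b, {})[a] = weight
    return adj

def degree_centrality(adj):
    """Number of distinct neighbours of each node."""
    return {v: len(nbrs) for v, nbrs in adj.items()}

def betweenness_centrality(adj):
    """Brandes' algorithm on the unweighted graph (link weights are ignored)."""
    score = {v: 0.0 for v in adj}
    for s in adj:
        visited, preds = [], {v: [] for v in adj}
        sigma = {v: 0 for v in adj}
        dist = {v: -1 for v in adj}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:  # breadth-first search from s
            v = queue.popleft()
            visited.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}
        for w in reversed(visited):  # accumulate dependencies back towards s
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # In an undirected graph every pair is counted from both endpoints.
    return {v: b / 2 for v, b in score.items()}

def eigenvector_centrality(adj, iterations=200):
    """Power iteration on (A + I); the shift avoids oscillation on bipartite graphs."""
    x = {v: 1.0 for v in adj}
    for _ in range(iterations):
        nxt = {v: x[v] + sum(w * x[u] for u, w in adj[v].items()) for v in adj}
        norm = sum(val * val for val in nxt.values()) ** 0.5
        x = {v: val / norm for v, val in nxt.items()}
    return x

# Toy weighted links (number of shared matches), invented for illustration only.
edges = {("Lich", "Tiny"): 2, ("Lich", "Magnus"): 1,
         ("Tiny", "Magnus"): 1, ("Magnus", "Rubick"): 1}
adj = adjacency(edges)
deg = degree_centrality(adj)
btw = betweenness_centrality(adj)
eig = eigenvector_centrality(adj)
print(deg, btw, eig, sep="\n")
```

In this toy graph Magnus has the most neighbors and is the only hero lying on shortest paths between others, yet Lich and Tiny score higher on eigenvector centrality because their mutual link is the heaviest. This is the kind of disagreement between degree and eigen scores described in the last bullet.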

4 Results

The algorithms described above give the best results when combined. For maximum usability and a seamless user experience (player interaction), a standalone application was designed using R Shiny. The following section demonstrates (1) the models' inputs and outputs, (2) examples of the algorithms' performance for several teams, paired with explanations, (3) a demo version of the app UI together with its working principles, and (4) an evaluation of the recommender results.

4.1 Dataset and formatting

For the current research, data collection was done in the R programming language using calls to the APIs of the designated providers, an approach that suits both data collection and analysis. The raw data is received from the servers via special API calls (requests for a particular data structure). One call can return only a limited amount of information. The current paper needs data from many different parts of the Dota 2 game, and each part requires a different API call. The meta info of the match (id, teams, sides, and others) is received from one call, the draft info (order, pick_ban, drafting team, and others) from another, and the hero info (hero name, hero_id) from a third. All of this is collected for professional matches through the third parties, while all info about public matches comes from an entirely different call to Valve's API. The raw data arrives from the server in JSON format, in which human-readable text is stored in a way that a computer can read. It is hard to analyze in its raw state and was therefore converted. For all conversions and the final analysis, this research relies on the R programming language. With the tools of the R language, all data is processed and stored in datasets, first as lists and finally as tables.

Understanding the algorithms' approaches requires a description of the database structure. The raw data gathered through the APIs is completely unsuitable for the algorithms, because the providers mentioned earlier deliver the information as JSON-compatible lists; hence, the data needs to be reorganized to meet the input requirements before the computations can proceed. Several separate databases accept and store all the preprocessed data. Table 3 contains a sample from the post-processed database that holds every drafting record of each match used in the current research. Such finalized databases drop many of the initial variables (mostly in-game stats and non-relevant meta-information), since those carry no value for the draft analysis. The databases are then combined and filtered multiple times to create the new databases that are finally passed to the algorithms.
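The reorganization of a JSON response into table rows can be sketched as follows. The research performed this step in R; the Python sketch below is only an illustration. The shape of the raw JSON, its field names, and the helper `flatten_draft` are hypothetical, while the output columns and the sample values follow the drafts database shown in Table 3.

```python
import json

# Hypothetical raw response; real provider responses carry many more fields.
RAW = """
{"match_id": 5224499148,
 "radiant_name": "Neon Esports",
 "dire_name": "Execration",
 "picks_bans": [
   {"order": 0, "team": 1, "is_pick": false, "hero_id": 6},
   {"order": 1, "team": 0, "is_pick": false, "hero_id": 19}
 ]}
"""

# Lookup built from a separate hero-info call (only two heroes shown here).
HERO_NAMES = {6: "Drow Ranger", 19: "Tiny"}

def flatten_draft(match, hero_names):
    """Turn one nested match record into flat rows, one per draft entry."""
    rows = []
    for entry in match["picks_bans"]:
        is_dire = entry["team"] == 1  # 0 = radiant, 1 = dire
        rows.append({
            "drafter": match["dire_name"] if is_dire else match["radiant_name"],
            "dire": match["dire_name"],
            "rad": match["radiant_name"],
            "m_id": match["match_id"],
            "order": entry["order"],
            "team": entry["team"],
            "is_pick": entry["is_pick"],
            "hero": hero_names.get(entry["hero_id"]),
            "hero_id": entry["hero_id"],
        })
    return rows

rows = flatten_draft(json.loads(RAW), HERO_NAMES)
for row in rows:
    print(row)
```

Each row repeats the match-level fields next to the draft-level ones, which is what allows the later steps to filter and combine the tables by match, team, or draft stage.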

Table 3

Drafts database sampled for the first phase only for a randomly picked match

player | drafter | dire | rad | m_id | order | team | is_pick | hero | hero_id
NA | Execration | Execration | Neon Esports | 5224499148 | 0 | 1 | FALSE | Drow Ranger | 6
NA | Neon Esports | Execration | Neon Esports | 5224499148 | 1 | 0 | FALSE | Tiny | 19
NA | Execration | Execration | Neon Esports | 5224499148 | 2 | 1 | FALSE | Magnus | 97
NA | Neon Esports | Execration | Neon Esports | 5224499148 | 3 | 0 | FALSE | Omniknight | 57
NA | Execration | Execration | Neon Esports | 5224499148 | 4 | 1 | FALSE | Treant Protector | 83
NA | Neon Esports | Execration | Neon Esports | 5224499148 | 5 | 0 | FALSE | Snapfire | 128
91654584 | Execration | Execration | Neon Esports | 5224499148 | 6 | 1 | TRUE | Lich | 31
345803031 | Neon Esports | Execration | Neon Esports | 5224499148 | 7 | 0 | TRUE | Rubick | 86
323792491 | Neon Esports | Execration | Neon Esports | 5224499148 | 8 | 0 | TRUE | Abaddon | 10
389022189 | Execration | Execration | Neon Esports | 5224499148 | 9 | 1 | TRUE | Naga Siren | 89

Variable descriptions:

· player - Account ID for a player

· drafter - Computed for which team the entry is

· dire - Dire team name (each of three spans for 22 entries)

· rad - Radiant team name

· m_id - ID of a match

· order - Draft stage, 0 to 21

· team - Boolean for team, 0 = radiant, 1 = dire

· is_pick - Boolean for draft state

· hero - In-game hero name

· hero_id - Unique ID of a hero

Another API provider made it possible to compute roles for each match. Using the same match identifiers, an additional database containing special role and lane variables was gathered. Through pattern identification and prior experience, the actual in-game roles were computed, as shown in Table 4.

Table 4

Players' sampled database for the Alliance team for a single match

m_id | player | role_temp | lane | team_id | team | role
5055417109 | 86799300 | 2 | 1 | 111474 | Alliance | Hard support
5055417109 | 12231202 | 0 | 2 | 111474 | Alliance | Mid
5055417109 | 412753955 | 0 | 1 | 111474 | Alliance | ...

Variable descriptions:

· m_id - Match unique ID

· player - Player unique ID

· role_temp, lane - API-provider hero role and lane determination

· team_id - Team unique ID

· team - Team actual name

· role - Computed actual role from API columns
